Project Overview
The
proposed project benefits farmers, foresters, environmental
decision-makers, and relief workers worldwide who can access the
Internet, by providing them targeted applications bundled with
relevant, location-specific data within an online workspace they can
share with others. It partners with UN member countries, CGIAR centers,
development agencies, the Global
VSAT forum and others to identify and complement efforts like
India's "e-Choupal",
Benin's remote outreach
centers and World Food Programme's field
communications efforts that connect remote communities and
development projects to the Internet. Modeled on the highly successful "bottom-up" approach taken by the US Globe program for Earth science education, a configured bundle provides users a multilingual portal rich in data about the area where they live. Even at modem levels of connectivity, users gain access to terabytes of terrain, soils, water, temperature, vegetation index, MODIS, TRMM, CBERS, Landsat, and other GIS data. Users can also digitize areas of interest directly within a web application, and annotate, upload, and share georeferenced field notes and measurements with other authorized individuals. Tools for crop and forest modeling and for erosion and water quality management are provided for manipulating this data within a user's workspace. Users
also gain
access to a global network of experts and project managers who can
provide real-time, interactive advice over the Internet, and a
powerful, easy-to-use mechanism for publishing and subscribing to data
generated by peer groups. Identity
management, audit trails, and structured workflow enable
a wide variety of enforceable, sustainable value propositions to be
constructed. A customizable "market module"
template is provided to bootstrap direct market agreements
and rudimentary supply chain certification using web, cellular SMS,
RFID and barcode-based interactions between buyers,
farmers, and other participants in a regional economy, where no
viable alternatives exist. Unlike e-Choupal or the Globe effort, the primarily
open-source, pre-packaged software framework can
be replicated and expanded by participating institutions for the
price of commodity CPU, storage, and bandwidth. Phase one of this project focuses on identity management, data registration, peer-to-peer content discovery, and automated aggregation into a simple "clip, zip, and ship" mechanism for "fat" clients and an OGC web mapping content document for online services. Phase two focuses on enhanced client-server interactions with the portal, batch-oriented applications and analysis that run at the portal itself, and outreach.
Rural Connectivity, Sustainable
Agriculture,
Linking rural areas to the world with cell phones and the Internet is a
powerful
agent of change. A dramatic example is India's "e-Choupal"
system, which reaches millions of farmers in over 11,000 villages.
By directly linking farmers to markets, it benefits local
villages with higher prices by removing middlemen. A growing
number of similar
activities worldwide are realizing the potential of wireline modems
or
inexpensive very small aperture satellite terminals (VSATs) to benefit
rural communities. USAID has published a thorough discussion
of the potential for Information and Communication Technologies (ICTs)
in rural agriculture.
The proposed effort complements this trend by providing these rural
users with powerful tools and data to more effectively
realize sustainable agriculture and share geospatial information
about where they live.
Beyond market prices, the proposed
framework provides data, tools, and IT infrastructure to address
location-specific questions about
planting time, irrigation, tillage, pesticide and fertilizer
application. It also provides a thin-wire framework to rapidly
provision existing assets, order new imagery, and coordinate activities
after
catastrophic events by providing geographically aware newsgroups,
discussion lists, and shared workspaces.
Servicing The "Pixel Inhabitants"
Careful positioning of this project by
the United Nations Food and Agriculture Organization (FAO), the Consultative Group on
International Agricultural Research (CGIAR),
USAID and other development
agencies can establish, at
minimal cost, a critical “neutral ground” to
coordinate and harness relevant remote sensing and information system
development efforts for the benefit of developing countries, at the
individual farm and watershed level. Indeed, that has been the
justification for major expenditures on the part of CEOS members.
Major related efforts include:
USAID's Famine Early Warning System (FEWS)
European Space Agency's Global Monitoring for Environment and Security (GMES)
NASA Research Opportunities in Space and Earth Science (ROSES)
NASA Research, Education, and Applications Solutions Network (REASoN)
European Joint Research Efforts on Environment and Sustainability
The Committee on Earth Observation Satellites (CEOS) GRID testbeds
Local Knowledge and Community
CEOS members operate
several
interesting orbital platforms; FEWS, GMES, and JRC create interesting
synoptic
information products on ever more powerful supercomputing grids; NASA
and ESA fund extremely interesting research prototypes demonstrating
the utility of their Earth Observation products. But the vast majority
of rural communities have had little prospect of benefiting
from these advances until
now, when the Internet has started to reach their towns and villages.
Even then,
bridging the gap between CEOS member prototypes and UN member
populations
will require extensive participation by FAO, CGIAR, WFP, USAID, IBRD
and their close affiliates, such as RCMRD
and RECTAS.
They are the
organizations that understand the regional economics, and have
detailed, structured “local knowledge” about physical processes
observed by the CEOS members. They are the organizations with sustained
field
presence and deep "local knowledge" of issues important to
day-to-day operational decision makers, and of how that knowledge does, and does not,
mesh with information and communication technologies (ICTs). They
are indeed the organizations that
can fully harness CEOS capabilities to
address UN member country issues in sustainable agriculture and poverty
alleviation in a well-coordinated, comprehensive, global fashion, and
effectively coordinate a Global
Land Cover Test Sites Project for "the masses" back to CEOS
members.
Orchestrating International Efforts
To
achieve this goal, FAO, USAID, World Bank, and CGIAR can
orchestrate the creation of baseline,
open-source Internet 'portal' bundles of relevant data
and tools, which can
be freely downloaded, and position the overall effort as their
own interoperability testbed. They can leverage
their strategic
OpenGIS membership to drive standardization toward
trade-offs suitable to their stakeholders. Because this effort
coordinates
agriculture-related IT development and remote sensing, and is aligned
well with activities such as NASA SEEDS and Canada's
GeoConnections, it can provide an umbrella framework for
relevant national
research that benefits all member countries, yet still provide
innovation potential to individual research efforts.
Most importantly, they can focus GIS standardization, and procurement, on achieving their objectives: sustainability and the Millennium Development Goals.
Spatially Explicit Information for Lenders
Because the framework provides a geographically
referenced, semantically rich framework, it provides multilateral
lending institutions a unique opportunity to develop and
evaluate spatially explicit impact assessment models. Locally
maintained household data and interaction models can be contributed
by in-country officers and incorporated into comprehensive frameworks
like FEWS. This will also provide lenders
baseline statistics for compliance with the Pelosi
Act, Equator
Principles, and World
Bank environmental safeguards.
Creating Value Chains
The effort also can take leadership to foster the creation of
value chains between
users within the overall federation, by establishing a simple standard
for tracking usage of geographic content and algorithms, similar to
telecom “call
detail records” (CDRs). These CDRs are the essential ingredients
for creating enforceable "credit" systems between stakeholders: in-kind
bartering and value chaining of a wide variety, complementary to
run-time contract protocols such as the proposed OpenGIS Web Pricing
and Ordering System (WPOS)
or Electronic Business XML (ebXML)
Trading Partner Agreements (TPAs). The
XML-based IP
detail record (IPDR)
specification will be considered as the potential standard
for recording value exchange between authenticated users, content and
service providers.
An example of such a value chain might be different UNEP programs that benefit from accessing high-resolution SPOT imagery directly from the Vito-managed archive. A user might display several "standard image unit" screens of 120,000 pixels each, one of which involved an ESRI ArcGIS Spatial Analyst license for half a CPU hour to estimate Cambodian deforestation processes. Each of these items, the half CPU hour and the SPOT data, could be reconciled against different specific funding programs: the ESRI conservation GIS program, a World Wildlife Fund program for hardwood forest management, and SPOT Image. Once CDR accounting and reconciliation processes are accepted by stakeholders, the path is open to value chaining, credit, settlement protocols, and bartering of geospatial information and services between federation members.
CDRs can be reconciled in much the same way that SWIFTNet and
VISA reconcile overall inter-bank balances of payments between their
members, or telcos reconcile CDRs and share revenues on calls that span
multiple carriers. A simple, standardized CDR format and authenticated
reconciliation process will provide an incentive for high-value online
geographic service providers, such as GlobeXplorer, MapPoint, and ESRI's
Geography
Network to participate.
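The reconciliation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `UsageRecord` fields, user names, and quantities are hypothetical and do not follow the IPDR schema; the funding programs are the ones named in the example above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    """One usage detail record (hypothetical fields, not the IPDR schema)."""
    user: str
    provider: str    # who supplied the content or service
    resource: str    # what was used
    quantity: float  # amount consumed
    unit: str        # e.g. "image_unit" or "cpu_hour"
    program: str     # funding program the usage is charged to


def reconcile(records):
    """Total the usage each funding program owes each provider, per unit."""
    totals = defaultdict(float)
    for r in records:
        totals[(r.program, r.provider, r.unit)] += r.quantity
    return dict(totals)


# Illustrative records only; none of these figures come from a real log.
records = [
    UsageRecord("analyst1", "SPOT Image", "spot_scene", 3, "image_unit",
                "WWF forest program"),
    UsageRecord("analyst1", "ESRI", "spatial_analyst", 0.5, "cpu_hour",
                "ESRI conservation GIS program"),
    UsageRecord("analyst2", "SPOT Image", "spot_scene", 2, "image_unit",
                "WWF forest program"),
]
balances = reconcile(records)
for key, quantity in sorted(balances.items()):
    print(key, quantity)
```

A real settlement process would also need authenticated record sources and agreed unit prices, which this sketch leaves out.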
Customer Intelligence
Billing records will be part of the more comprehensive customer
behavior profiling and data mining system. Detailed "click records" and
overall site statistics will be processed using the popular business
intelligence Weka
suite of machine learning and predictive behaviour algorithms. These
statistics will be kept in strict accordance with the delegated
authority identity schemes, and available to users, or expunged, along
that chain upon request. In particular, extreme care will be
taken to observe privacy, and avoid politically explosive situations
that might arise between sovereign governments and NGOs, or UN
agencies. However, users will be advised that information may be
shared with authorities in abusive or extreme circumstances. Logs
of email threads will also be processed using the NetVis social network visualization
suite, and made available to "community managers" on a per-group basis.
Open Source Portal Base
Software development costs will be minimized by
leveraging a wide array of mainstream open-source software. Major
enhancements or new subsystems of the platform can be undertaken
independently by universities or agencies worldwide. At the portal's
core will be a fully-featured, multilingual enterprise-class framework
capable of content management, community building, and workflow
execution, whose internal object classes have spatial attributes,
filtering, and aggregation capability. The portal will be built on JBoss and the
Hibernate persistent object framework, and rapid
application development (RAD) will be performed within the
Java portlet and companion JavaServer
Faces (JSF) framework. Additionally, all transactions (content uploads,
data provisioning, user sign-ups, subscription service processing, etc.)
will be performed using the JBoss Java Business Process Management (jBPM) workflow engine. The design
philosophy will generally follow the component model interfaces of the "Geospatial One-Stop" effort,
but take liberties in favor of design elegance and "doing the right
thing," particularly given the strong emphasis on multi-lingual
support, taxonomic categorization, and structured workflow.
The arena of full-blown, Java-based,
open-source content and portal offerings using these core packages
(JBoss, jBPM, Hibernate, JSF) is changing extremely rapidly. A
selection process will be undertaken among the
major platforms, JBoss Portal, Alfresco, Jackrabbit, and Nuxeo in
particular, due to their backing by major corporations, RAD GUI design
tools, and design requirements for scalability and mission-critical
qualities of service. All have scriptable content ingest,
metadata extraction, version control, and role-based lifecycle
management. All have discussion lists with delegated administration by
moderators.
The overall project will be developed and maintained as a constantly
evolving, modular 'reference platform', orchestrated by FAO, CGIAR,
national agencies, NGOs, and universities worldwide, relentlessly
driven
by the requirement to be genuinely useful and cost-effective
for agricultural decision-support using appropriate technology.
In its simplest and most important form, the portal will enable users
to search, aggregate, and subscribe to useful data
for areas of
interest from a wide variety of sources. This data will be
made available as online KML and OGC layers, or aggregated into
download chunks via a mechanism similar to the USGS Seamless server.
The baseline viewer for applications will be Google Earth, with
overlays streamed as either KML vectors or ground overlays. Users can
also "clip, zip, and ship" data downloads to use in existing desktop
applications (such as CGIAR tools,
the USDA Forest Service Forest Vegetation
Simulator, FAO's WinDisp,
ADDAPIX,
CROPWAT, CLIMWAT, SIMIS, etc.) or
manipulate it within a "workspace" at the portal itself on a remote
desktop. Aggregation of
content will attempt to optimize overall CPU, bandwidth and storage
utilization by employing batch-oriented subsetting and lossless
compression of the areas of user interest as far "upstream" in the
overall workflows as possible.
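A minimal sketch of the "clip, zip, and ship" idea follows, using only the Python standard library. It assumes point records with hypothetical `name`, `lon`, and `lat` fields and a CSV payload; the portal itself would clip real vector and raster layers.

```python
import csv
import io
import zipfile


def clip_points(rows, bbox):
    """Keep rows whose (lon, lat) lies inside bbox = (min_lon, min_lat, max_lon, max_lat)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return [r for r in rows
            if min_lon <= r["lon"] <= max_lon and min_lat <= r["lat"] <= max_lat]


def zip_rows(rows, member_name="clip.csv"):
    """Serialize rows as CSV and return the bytes of a deflate-compressed zip archive."""
    text = io.StringIO()
    writer = csv.DictWriter(text, fieldnames=["name", "lon", "lat"])
    writer.writeheader()
    writer.writerows(rows)
    payload = io.BytesIO()
    with zipfile.ZipFile(payload, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(member_name, text.getvalue())
    return payload.getvalue()


# Hypothetical station records; coordinates are illustrative.
stations = [
    {"name": "A", "lon": 36.8, "lat": -1.3},
    {"name": "B", "lon": 30.1, "lat": -1.9},
    {"name": "C", "lon": 2.4, "lat": 6.4},
]
clipped = clip_points(stations, bbox=(29.0, -3.0, 40.0, 0.0))
package = zip_rows(clipped)   # bytes ready to "ship" as a download
```

Clipping before compressing is the point of the design: only the user's area of interest is ever serialized or transmitted.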
Robust identity management will enable user groups to purchase and
use their own pools of floating license tokens. In this way, users with
modest connectivity (such as a VSAT terminal) can access fully licensed
ESRI, ERDAS, ENVI, and other products, which have local access to large
data sets. Because nearly all of these products use the Flex-LM
license management scheme, detailed license usage records are
available. A translator from these logs to IPDR format will be
created
to support 'software as a service' style cost-accounting.
Content Sharing, Search and Ranking
It is also generally acknowledged that
the "find, use, share and extend" (FUSE) model of user-contributed
content has tremendous value if properly structured, Wikipedia and Craig's List
being spectacular examples. A primary goal of this effort is to
facilitate FUSE use cases for geographic data with semantic
interoperability. Several of the previously mentioned content
management systems employ a variety of these techniques. The
particulars of these systems will be examined in relationship to the
customized indexes generated for geographic data dictionaries. While
Craig's List has enjoyed success in part due to a fixed taxonomy for
each locale, Wikipedia's support of indexing by different, concurrent
taxonomies has proven effective, giving renewed meaning to the phrase
"a rose is still a rose by any other name."
At the same time, the ability of the big
portals: Google, Yahoo,
Microsoft, AOL, and ASK to offer this mostly government-funded data
uniformly as an easy-to-use, subsidized service with mission-critical
quality of service, has generated exponential growth in actual usage by
end users. Making peer-generated geographic data this easy to use -
while supporting extremely diverse schemas - is a primary goal of this
effort. Right away, users will need to decide whether the data they are
sharing can be "self-hosted" (their institution has the ability
to publish it, and the portal's role is mostly registration,
indexing, publishing and fusion) or should be replicated and
hosted at the site itself. The decision to rehost data will weigh
level of service, content popularity, and "importance."
While some data might not be accessed very often, it may be very
"important" to have it online with a guaranteed quality of service. The
depth of sea ports in South Eastern Africa might not be important until
a famine is imminent, nor air fields in Sumatra until a
tsunami strikes.
Clearly, one factor in ranking is usage
patterns.
Categorizing, uniformly indexing, and searching geographic data from heterogeneous sources has always been difficult. Beyond bounding boxes and simple text descriptions, the long history of the FGDC metadata working groups and clearinghouse efforts is a testament to this difficulty, and the near total obscurity of their results a testament to their overall effectiveness. It is very difficult to create and maintain geographic data that is fully FGDC "compliant", and therefore not much exists. What does exist, however, is well-defined, "actionable" information, by virtue of rigorous control of terminology and metadata. Clearly, the race is on to revisit the basics armed with new search and indexing tools and techniques. Beyond the handful of simple categories enumerated in ISO/TC211 19115 B.5.27, strong emphasis will be placed on rigorous categorization of content, groups, and workflow items according to well-known taxonomies and controlled vocabularies, beginning with land cover data according to the FAO LCCS specification. At the very least, a "minimal set" of the essential FGDC elements and ISO 19115-compliant metadata will be required of any business process attempting to publish spatial collection-level information.
To address these issues, mainstream full-text and next-generation semantic search engine techniques will be used to enhance traditional geographic metadata. In particular, both the descriptions and the data dictionary items of registered data will have full-text search indexes routinely built using standard Lucene, able to cross-reference loosely "tagged" content such as the increasing amount available via GeoRSS feeds. Beyond this, uploaded shapefile text columns will also be crawled and cross-referenced with the eight million geographical entries of GeoNames. Given sufficient bandwidth, registered feature services themselves will also be crawled and indexed in a tiled manner.
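To make the indexing idea concrete, here is a toy inverted index over dataset descriptions and data dictionary (column) names. It is a plain-Python stand-in, not Lucene and not its API; the dataset ids and column names are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case a string and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(datasets):
    """Map each term to the ids of datasets whose description or columns contain it."""
    index = defaultdict(set)
    for dataset_id, meta in datasets.items():
        text = " ".join([meta["description"]] + meta["columns"])
        for term in tokenize(text):
            index[term].add(dataset_id)
    return index


def search(index, query):
    """Return the ids of datasets containing every query term (boolean AND)."""
    terms = tokenize(query)
    if not terms:
        return set()
    hits = set(index.get(terms[0], set()))
    for term in terms[1:]:
        hits &= index.get(term, set())
    return hits


# Hypothetical registered datasets.
datasets = {
    "soils_ke": {"description": "Soil units for Kenya",
                 "columns": ["soil_type", "ph", "drainage"]},
    "roads_pe": {"description": "Road network for Peru",
                 "columns": ["name", "surface", "lanes"]},
}
index = build_index(datasets)
print(search(index, "soil kenya"))
```

Indexing column names alongside descriptions is what lets a query such as "road surface" match a dataset whose description never mentions surfaces.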
Ranked text searching has obviously found
spectacular success with Google's PageRank algorithm, now embodied in
such Lucene subprojects as Nutch
and Hadoop in use at most
of the search portals. The importance of semantic capability is also
becoming clear. The levels of funding going into startups like Radar Networks hint at the
next generation of mainstream search.
Specifically with respect to geographic
data, nobody would dispute that a bounding-box search for "roads" should
yield highways, streets, or unimproved asphalt. Yet such
"common-sense" logic eludes full-text searching. The same is true for
multi-lingual search; an emergency relief effort in Peru looking for
'roads' should be able to find features such as "caminos".
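The kind of expansion such a search needs can be sketched with a small "is a" table. The table below is hand-written and hypothetical; it stands in for an ontology and uses no ontology library.

```python
# Hypothetical "is a" assertions: child term -> parent term.
IS_A = {
    "highway": "road",
    "street": "road",
    "camino": "road",              # Spanish
    "unimproved asphalt": "road",
    "road": "transport feature",
}


def descendants(term, is_a=IS_A):
    """Return term plus every term that is, transitively, a kind of it."""
    found = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in is_a.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


# A search for "road" would be run against all of these terms.
print(sorted(descendants("road")))
```

Expanding the query rather than the index keeps the full-text index unchanged while new "is a" assertions are added.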
To address these issues, support for
semantic indexing of data dictionary items will be provided using
a subset of the OpenCyc common
sense ontology, with additional, specialized ontologies developed
within the OWL-Lite
profile of Jena. Upon registering and/or
uploading their data, users will be provided ranked semantic search
results as
starting candidates within a stripped-down version of IsaViz, and encouraged to
refine their definition within RDF "is a" and "has a" properties for
their tables and columns. A few baseline schemas will be
established early on, starting with NASA's SWEET project, and those
deemed useful after a review of the massive GILS effort. For
completeness, the RDF of GML
itself will be included, as a testament to its user
"friendliness".
Federation
Early attention to RDF and underlying Jena support should make
adherence to any ontologies emerging from the GEOSS catalog
effort straightforward.
Installed instances of the platform are intended to federate
seamlessly, by supporting standard metadata protocols, registration in
appropriate catalogs (ECHO, GCMD, UNEP.net, Geonetwork, ESA assets,
etc) and being structured as
regional content centers with delegated authority over a
certain polygonal area of the Earth's surface.
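One way to picture delegated authority over polygonal areas is a lookup that finds which regional content center is responsible for a given point. The sketch below uses the standard ray-casting test; the node names and polygons are hypothetical, and real boundaries would need a proper geometry library.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices, not closed."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray cast from the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical regional content centers and their areas of authority.
REGIONS = {
    "east-africa-node": [(28.0, -12.0), (42.0, -12.0), (42.0, 5.0), (28.0, 5.0)],
    "west-africa-node": [(-18.0, 4.0), (10.0, 4.0), (10.0, 16.0), (-18.0, 16.0)],
}


def responsible_node(lon, lat, regions=REGIONS):
    """Return the name of the first region containing the point, or None."""
    for name, polygon in regions.items():
        if point_in_polygon(lon, lat, polygon):
            return name
    return None


print(responsible_node(36.8, -1.3))
```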
Mapping and Image Processing
The foundation rendering component for spatial content will be the Minnesota Map Server using
the
Java version of its standard interface
objects. Map customization will be performed according to a user's
profile by creation of “on-the-fly” mapfiles.
Vector content will be stored in the PostGIS
extension to the PostgreSQL
database. Raster content will be stored in lossless
JPEG 2000 format using the Kakadu
software suite. A thin-wire applet base class that supports uploading
ESRI Shapefiles, comma-separated value (CSV) point files, or user-drawn
overlay geometry with attributes will be provided, based on either the ROSA applet or an
equivalent. The GRASS i.* and r.* suite
of programs will be used for server-side image processing and
cell-based raster analysis services in conjunction with a controlling
client-side applet, or data can be downloaded and used locally within
several existing FAO
desktop applications.
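The "on-the-fly" mapfile idea mentioned above can be sketched as plain string assembly. The keywords used (MAP, NAME, EXTENT, SIZE, LAYER, TYPE, STATUS, CONNECTIONTYPE, CONNECTION, DATA, END) are standard mapfile keywords; the user name, table names, and connection string are hypothetical, and a real profile would carry styling and access filters as well.

```python
def layer_block(name, geometry_type, table):
    """Return one LAYER block that reads a PostGIS table (names are illustrative)."""
    return "\n".join([
        "  LAYER",
        f'    NAME "{name}"',
        f"    TYPE {geometry_type}",
        "    STATUS ON",
        "    CONNECTIONTYPE POSTGIS",
        '    CONNECTION "dbname=portal"',   # hypothetical database name
        f'    DATA "geom FROM {table}"',
        "  END",
    ])


def build_mapfile(user, extent, layers):
    """Assemble mapfile text for the layers selected in a user's profile."""
    min_x, min_y, max_x, max_y = extent
    parts = [
        "MAP",
        f'  NAME "{user}_map"',
        f"  EXTENT {min_x} {min_y} {max_x} {max_y}",
        "  SIZE 600 400",
    ]
    parts.extend(layer_block(*layer) for layer in layers)
    parts.append("END")
    return "\n".join(parts) + "\n"


mapfile = build_mapfile(
    user="amina",
    extent=(33.9, -4.7, 41.9, 5.0),
    layers=[("field_notes", "POINT", "field_notes"),
            ("plots", "POLYGON", "plots")],
)
print(mapfile)
```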
Another important
component is support of field mapping and surveys using offline PDAs
and smart-phones. A rudimentary workflow for building a forms-based
field survey unit, similar to ESRI's popular Go! Sync,
will be provided; it will use OpenSync and supporting scripts
to synchronize with a user's shared workspace.
Market Module
Linking growers to markets is a critical element of rural Internet
connectivity. Although it is expected that this function will most
likely be accomplished by pre-existing systems or major complementary
efforts, a simple market bidding and transaction service based upon
jBPM will be
developed to help "bootstrap" those areas where nothing else exists. A
concerted effort will be made to link with emerging ICT-based mechanisms
such as Tradenet, and facilitate
extensions to the World Bank's
dgMarket
module using jBPM workflows, with a gateway to cellular networks
using the JBoss Mobicents
framework.
Commercial Software
Platform development will place a strong emphasis on standard
interfaces. In particular, persistent content stores will be able to accommodate ESRI's spatial data engine (SDE),
or Oracle
spatial cartridge by virtue of support within Minnesota Mapserver and JBoss.
Raster image processing services should also be supportable by
server-side ERDAS,
PCI, ENVI, or ERMapper functionality, if
suitably
wrapped within the Java portlet infrastructure and overall
model-view-controller (MVC) framework. As previously mentioned, most of
these utilize the Flex-LM license manager scheme, whose logs can be
processed into billing records to support "software as a service"
accounting.
Baseline Provisioning
To realize “out of the box” value for local decision-makers, it is
essential that relevant, timely, reliable, free data be available.
Datasets identified as meeting these criteria are the global MERIS
and Landsat mosaics, real-time MODIS vegetation index,
MODIS surface
temperature, TRMM rainfall
measurements, and SRTM
products. Provisioning a user's area of
interest will result in assembling a time-series of subsetted
temperature, vegetation index, and historical
Landsat imagery for that
area in an offline, batch-oriented mode. These datasets were selected
because they are capable of creating the minimal data set
necessary to run the DSSAT CERES
and CROPGRO models. An additional derivative product, bi-weekly
vegetation index changes, has also been deemed of considerable
interest. These will all be made available on-demand via a standard
provisioning jBPM workflow. This baseline can be fully automated, since
all NASA DAAC holdings are registered with ECHO along with associated
access methods.
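The derivative "vegetation index change" product amounts to a per-pixel difference between two composites of the same area. A minimal sketch, assuming small in-memory grids with `None` marking missing pixels (the values are invented):

```python
def index_change(earlier, later):
    """Per-pixel difference (later - earlier) of two equally sized grids.

    A pixel that is None (no data) in either grid stays None in the result.
    """
    if len(earlier) != len(later) or any(
            len(a) != len(b) for a, b in zip(earlier, later)):
        raise ValueError("grids must have the same shape")
    return [
        [None if (a is None or b is None) else round(b - a, 4)
         for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(earlier, later)
    ]


# Invented vegetation index values for two successive composites.
first = [[0.30, 0.42], [0.55, None]]
second = [[0.35, 0.40], [0.50, 0.61]]
change = index_change(first, second)
print(change)
```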
Beyond these free satellite sources, it will be essential to incorporate the scanned topographic holdings of major providers, such as East View Cartographic, Landinfo, and OmniMaps, if truly useful coverage is the goal. As discussed in Value Chains, a critical part of this effort is to provide a sustainable accounting mechanism that lets these private assets be offered online, and lets relief, aid, and development projects fund the ongoing scanning and maintenance of these private maps. Indeed, in many parts of the former Soviet Union and elsewhere, maintaining the mapping data is the agency's charter, and "giving it away" is utterly antithetical to its culture. Nothing will foster the flow of information from those who have valuable information to those who need it like an enforceable accounting system.
A simple tool for discovering and downloading data held at the
portal, that uses OpenGIS wire protocols and
propagates identity via IETF RFC 2617, will
be developed from one of several open-source client efforts already
underway in the OpenGIS community. User accounts created by a delegated
authority will be given a disk quota, and provided an applet-based
drawing tool for uploading ESRI shapefiles or freehand input of
georeferenced, attributed polygons, points, or outlines. Users can
upload field measurements, notes, digital pictures, etc. about that
point
or area. Collections of points with associated scalar values can be
used
to create contoured (isoline) surfaces.
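Before contours can be drawn, scattered point values have to be interpolated onto a grid. The sketch below uses inverse distance weighting, which is one common choice and an assumption here, not a method the proposal specifies; the sample values are invented.

```python
import math


def idw(x, y, samples, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (sx, sy, value) samples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value              # exactly on a sample point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def interpolate_grid(samples, xs, ys):
    """Evaluate the IDW surface at every (x, y) grid node, row by row."""
    return [[idw(x, y, samples) for x in xs] for y in ys]


# Invented field measurements: (x, y, value).
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
grid = interpolate_grid(samples, xs=[0.0, 1.0, 2.0], ys=[0.0])
print(grid)
```

Contour lines would then be traced through the resulting grid by a separate step not shown here.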
Beyond remote sensing data, a concerted
effort to aggregate, load, and link to relevant, useful free datasets
will be ongoing. Obvious sources are FAO soils databases, UNEP.net, AQUASTAT,
Digital
Chart of the World, and Africover data. As
previously mentioned, the mapping server will have the ability to act
as
both an OpenGIS client and server for WMS, WCS, and WFS protocols. Node
managers can choose to download and cache datasets that either require
higher performance or higher qualities of service than provided by
remote access.
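As an illustration of the client side, the following sketch builds, without sending, a WMS 1.1.1 GetMap request that carries HTTP Basic credentials (one of the schemes defined in RFC 2617). The host name, layer name, and credentials are hypothetical.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def getmap_request(base_url, layers, bbox, width, height, user, password):
    """Build (but do not send) a WMS 1.1.1 GetMap request with Basic credentials."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",
        "SRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),   # minx,miny,maxx,maxy
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": "image/png",
    }
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = Request(base_url + "?" + urlencode(params))
    request.add_header("Authorization", "Basic " + token)
    return request


# Hypothetical portal endpoint and account.
request = getmap_request(
    "http://portal.example.org/wms", ["vegetation_index"],
    bbox=(29.0, -3.0, 40.0, 0.0), width=600, height=400,
    user="field1", password="secret",
)
print(request.full_url)
```

Basic credentials are only base64-encoded, so a deployment relying on them would need an encrypted transport.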
Additional Data
Any additional baseline data the site might acquire, such as aerial
photography, etc. can be added to the user workspaces. Because the
underlying Minnesota Map Server can utilize remote data sources using
the OpenGIS Web Mapping protocol, users can also access remote data
products from GlobeXplorer content partners, or other federates. Of
particular interest will be free
CBERS data throughout Africa and Brazil, NASA's OnEarth, as well as data
from the
ESRI geography network. ESA's Vito-managed
archive has also indicated a willingness to participate, as have a few other
Landsat 7 operators.
Customized jBPM workflow efforts will be encouraged to capture
highly specific data, particularly localized meteorological observation
capability.
Installed nodes might also become a
focal point and distribution mechanism for content generated by members
of the International Steering
Committee
on Global Mapping. An identified area for enhancement of the
reference platform is integration with the emerging OpenGIS "SensorML"
specification for in-situ measurements.
Finally, suitable market price information feeds will be explored.
Distributed Workflow
Because of the enormous volumes of several core raster datasets, data
provisioning will necessarily be a carefully orchestrated process,
designed to optimize overall CPU, bandwidth, and storage requirements.
As such, as much "upstream" processing as possible will occur. For
example, MODIS EVI and temperature tiles will have their desired HDF
slices extracted, and then be mosaiced, reprojected, and losslessly
compressed before transmission to a content node. This will reduce
transmission and storage requirements by an order of magnitude. A batch
provisioning server, with multiple T3 connections to the EROS data center, has
been established specifically for this purpose. This may eventually be
co-located at the UNEP/EROS facility.
The embedded workflow engine of the portal will enable modular
addition of a wide variety of useful inputs, particularly local
meteorological observations. Such data might be retrieved over the
Internet, using an automated dial-out modem, or 3270 terminal emulation.
One other important workflow to be supported is remote order
fulfillment to Geonetcast
terminals, such as an
installation in Rwanda. Currently, EUMETCAST
and NOAA broadcast mostly routine meteorological and coarse-resolution
data (AVHRR, etc.). However, both support prioritized "push"
multicasting with subsystems such as Kencast. Both
operators have expressed willingness to accept prioritized "push"
images from this portal effort. Therefore, given a minimal VSAT
'backchannel', this portal can fulfill a vital role in the overall
GEOSS conceptual workflow: aggregating and prioritizing user requests
from Geonetcast stations. Indeed, subsetting and Geonetcast rebroadcast
will be a critical component to the success of free
CBERS data for Africa.
Subsetting
For higher resolution data, especially Landsat imagery, users will be
provisioned with suitable subsetted areas instead of entire scenes.
Subsetting capability will initially be invoked over secure shell, using the GDAL and subset.org suite of tools. These
tools will also be enhanced to support both OpenGIS WCS client and
server capabilities, to complement the Minnesota Map Server's existing
ability to act as both OpenGIS Web Mapping and Web Feature client and
server. A GRID-based execution is
currently being prototyped in conjunction with GlobeXplorer's
GRID efforts and the LAITS OGC-GRID
integration effort. Phase one of this effort would establish such
subsetting capabilities within EROS data center, Goddard Space Flight
Center,
GlobeXplorer, and the Maryland Global
Land Cover Facility. This
basic GRID subsetting is intended to be the first practical,
open-source
realization of the CEOS
GRID effort, and establish technical protocol for interaction with
the service elements
of the European Union's Global
Monitoring for Environment and Security program. Such cooperation
may become increasingly important given the
age of the MODIS, Landsat and TRMM platforms, and the current launch
schedules of Hydros,
NPOESS, and other American
sensors.
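A subsetting call of the kind described above might be assembled as follows. The sketch only builds the argument list for GDAL's `gdal_translate` utility, using its `-projwin` option and a lossless GeoTIFF creation option; the file names are hypothetical, and nothing is executed.

```python
def subset_command(src, dst, bbox):
    """Return a gdal_translate argument list that clips src to bbox.

    bbox is (min_x, min_y, max_x, max_y) in the raster's coordinate system;
    -projwin expects upper-left x, upper-left y, lower-right x, lower-right y.
    """
    min_x, min_y, max_x, max_y = bbox
    return [
        "gdal_translate",
        "-projwin", str(min_x), str(max_y), str(max_x), str(min_y),
        "-co", "COMPRESS=DEFLATE",   # lossless GeoTIFF compression
        src, dst,
    ]


# Hypothetical scene and output names.
command = subset_command("scene.tif", "subset.tif", (36.0, -2.0, 37.5, -0.5))
print(" ".join(command))
```

The list form can be handed to a process launcher on the host that holds the imagery, so that only the clipped, compressed result crosses the network.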
All of these raster datasets will be stored in two forms: a false-color version suitable for web presentation, and a lossless version suitable for scientific simulation. Content management business processes and interactive applications will associate the two atomically in a reusable fashion. An accompanying applet will enable users to interactively "pick" a point on the false-color JPEG rendition in their browser, and retrieve the value of all bands of the multispectral image "backing" the JPEG. In this manner multispectral, multitemporal signatures of a wide variety of phenomena may be collected by users worldwide, easily associated with close-up digital camera images, field notes, or audio recordings, and shared with others.
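The "pick" operation reduces to converting a map coordinate into a row and column and reading every band at that cell. A minimal sketch, assuming a north-up raster described by a GDAL-style geotransform and bands held as nested lists (the pixel values are invented):

```python
import math


def world_to_pixel(geotransform, x, y):
    """Convert map coordinates to (row, col) for a north-up raster.

    geotransform follows the GDAL convention:
    (origin_x, pixel_width, 0, origin_y, 0, -pixel_height).
    """
    origin_x, pixel_width, _, origin_y, _, neg_pixel_height = geotransform
    col = math.floor((x - origin_x) / pixel_width)
    row = math.floor((y - origin_y) / neg_pixel_height)
    return row, col


def pick(bands, geotransform, x, y):
    """Return the value of every band at map position (x, y)."""
    row, col = world_to_pixel(geotransform, x, y)
    if not (0 <= row < len(bands[0]) and 0 <= col < len(bands[0][0])):
        raise IndexError("point lies outside the raster")
    return [band[row][col] for band in bands]


# Two invented 2x2 bands covering longitude 30..31 and latitude 0..-1.
geotransform = (30.0, 0.5, 0, 0.0, 0, -0.5)
bands = [
    [[1, 2], [3, 4]],
    [[10, 20], [30, 40]],
]
signature = pick(bands, geotransform, 30.75, -0.25)
print(signature)
```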
Access Control
Security and privacy are essential to
gaining sufficient trust of users to upload data, and to building
community. For example, in humanitarian relief and food security
efforts, it may not be useful to widely publish metadata about
mortality
statistics.
Uploaded individual feature data will be tagged with user and
group-level information at the table level for uploaded Shapefiles, and
in two database columns in the shared table for interactively drawn
points and polygons within PostGIS, viewable as "virtual" tables with
appropriate SQL "WHERE" clause predicates. In this manner role-driven
security and access-control may be maintained and enforced through all
publishing, workflow, and rendering operations. Metadata about "virtual
tables" will be periodically harvested from the shared tables.
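The "virtual table" idea can be illustrated with two tag columns and a WHERE predicate. The sketch uses SQLite from the Python standard library as a stand-in for PostGIS; the table, column, user, and group names are hypothetical, and the geometry column is omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE features (id INTEGER PRIMARY KEY, owner TEXT, grp TEXT, label TEXT)"
)
conn.executemany(
    "INSERT INTO features (owner, grp, label) VALUES (?, ?, ?)",
    [
        ("amina", "coop_north", "maize plot"),
        ("jorge", "coop_south", "well"),
        ("amina", "relief_team", "clinic"),
    ],
)


def visible_features(conn, user, groups):
    """Return labels the user may see: own rows plus rows tagged with one of their groups."""
    placeholders = ",".join("?" for _ in groups)
    sql = (
        "SELECT label FROM features "
        f"WHERE owner = ? OR grp IN ({placeholders}) ORDER BY id"
    )
    return [row[0] for row in conn.execute(sql, [user, *groups])]


print(visible_features(conn, "jorge", ["coop_north"]))
```

Because the predicate is applied in the query itself, the same filter holds whether the rows are being published, processed by a workflow, or rendered.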
Services
Workflows capable of remote transformations can be established as Apache Axis SOAP
web services agents, or OGSA
GRID agents, the first being subsetting. Such
transformation workflows would be initiated according to the ISO 19119
and the OpenGIS Services Architecture protocols, and discovered
according to the rapidly evolving OpenGIS Web Registry Services. As
previously mentioned, this effort will be closely aligned with GlobeXplorer's
GRID efforts and the LAITS OGC-GRID
integration effort, and in general, along the lines of ebXML
trading partner agreements and the OGC Web services testbed efforts.
Identity and Delegated Authority
Trust
User workspaces need to have suitable authentication and access control
mechanisms in place. Data about one's farmland or watershed,
particularly its yield capability or contaminants, is extremely
sensitive information. It can often be tied directly to loan risk, land
valuation, or environmental regulation noncompliance. Gaining users'
trust that uploaded data will not be used against them is a very long,
difficult process. This is precisely why the initial focus of the
portal
will be downloaded data for existing "fat" clients. The failed VantagePoint
Network had a very difficult time convincing prospective customers
that their data was not going to be used by the US EPA, farm credit
agency, or anyone else. Clearly this will only be as robust as the
practices of the hosting institution, but the overall architecture must
support role-based identity management by delegated authorities, and
access control of all uploaded data. This will be done by "tagging" all
individual uploaded features with identity and access control
information. Beyond trust, this is also essential if value exchange is
to be promoted among users: information known to be held by one party
must be deemed both scarce and useful by another.
Within a single instance of a portal, an overall site administrator will be assigned, who in turn can designate sub-administrators that can create groups and new individual users of their own. A unique feature of the spatial enhancements being made to the Alfresco platform will be its ability to delegate geographic areas to administrators.
Certificates
This effort establishes an X.509 chain of certificate authority with
FAO, using Java
Keytool. New sites will be issued a certificate, from which
they may issue their own login/password credentials. In this manner,
individual countries with sufficient technical capacity can either
support an entire portal node of their own, or simply be delegated all authority for the chains of
identity that fall within their polygonal spatial bounds. Nested
delegates can also be established, that can in turn issue credentials,
scoped to a group level. For example, the Ohio node will issue logins
for the Midwest, and the University of Cincinnati may be a delegate of
that node that issues its own credentials. As previously mentioned,
groups may have spatial attributes, so that one can cleanly define a
delegate group as a particular county or township. The overall
framework
used will be the Java Authentication and Authorization Service (JAAS).
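The nesting of spatially scoped delegates can be sketched as below. This is an assumption-laden illustration rather than the planned X.509 or JAAS implementation: the `Delegate` class is hypothetical, simple bounding boxes stand in for the polygonal bounds described above, and the coordinates are rough illustrative values.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# (min_lon, min_lat, max_lon, max_lat); a simplification of polygons.
BBox = Tuple[float, float, float, float]


def bbox_contains(outer: BBox, inner: BBox) -> bool:
    """True when the inner box lies entirely within the outer box."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


@dataclass
class Delegate:
    name: str
    bounds: BBox
    parent: Optional["Delegate"] = None

    def delegate(self, name: str, bounds: BBox) -> "Delegate":
        """Create a nested delegate whose area must fall inside ours."""
        if not bbox_contains(self.bounds, bounds):
            raise ValueError(f"{name} exceeds the bounds of {self.name}")
        return Delegate(name, bounds, parent=self)

    def chain(self) -> List[str]:
        """Names from the root authority down to this delegate."""
        names, node = [], self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))

    def may_issue_for(self, lon: float, lat: float) -> bool:
        """True when a location falls inside this delegate's area."""
        min_lon, min_lat, max_lon, max_lat = self.bounds
        return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


root = Delegate("FAO", (-180.0, -90.0, 180.0, 90.0))
ohio = root.delegate("Ohio node", (-104.1, 36.0, -80.5, 49.4))
uc = ohio.delegate("University of Cincinnati", (-84.9, 39.0, -84.2, 39.4))
```

Here `uc.chain()` walks the chain of authority back to the root, mirroring how a certificate chain would be verified.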
Authenticated Field Inspectors
Verifiable identity provides a unique opportunity to build
georeferenced ground truth data sets about agricultural practices by
certifiable role players, particularly agricultural inspectors.
Phase one of this effort prototypes a pilot program with Fairtrade Labeling
International and the International
Federation of Organic Agriculture Movements to highlight this
capability, by issuing inspectors digital cameras and GPS units for
uploading georeferenced information about coffee growers in the greater
Caribbean Basin and Colombia. A methodology for systematically making
this trusted information available to wholesalers and consumers to
learn
the certified practices that produced their coffee, and strategies to
integrate the federated network with UN/CEFACT bill of lading location
and function
codes will be explored. Alignment with national customs
protocols,
and liaison with other FAO efforts such as the
Codex Alimentarius Commission, will be examined as ways to use the
field information captured by authenticated inspectors.
Applications, Distance Learning, and Outreach
Fat Clients
A large class of portal users will probably not have the ability or
desire
to stay connected to the Internet all day while they work. Rather, they
will most
likely connect occasionally, get what they need, and work offline in
desktop applications. This class of user will be given the choice to
download a "bundle" of free,
pre-packaged desktop applications and datasets that can
occasionally "synchronize" their workspace with field
measurements
acquired with GPS-capable harvesters or other means. These tools will
include Metalite, CGIAR's tools,
various FAO
applications, CROPWAT, CLIMWAT, DSSAT, SIMIS and other
items deemed useful. The source code for all of these items will be
placed in a centralized CVS
repository, moderated by FAO and CGIAR. Other precision-agriculture
software
from commercial vendors, such as ArcGIS, FarmGIS, and SST
Toolbox, will
no doubt be in wide use as well.
The bi-weekly
updates of MODIS vegetation information, daily TRMM updates, and other
specific newsgroups or discussion forums offer the basic incentive to
participate in the overall portal community. The centralized content
nodes
will also be able to stream
content on demand to OpenGIS
capable clients. Existing "fat client" desktop tools such as
WinDisp will be individually evaluated for the level of effort to
directly use OpenGIS WCS and WMS protocols versus downloaded files.
Financial calculations
In addition to SIMIS, a simplistic web-based spreadsheet will be
provided to take outputs from applications and perform simple cash-flow
calculations for fertilizer, irrigation, pest control, tillage,
harvest
and transport input costs versus crop transport price. The tool's
utility, data inputs, and training requirements need
better definition. Possible liaison with the FEWS "Priceman"
effort, particularly its data feeds, could be beneficial. A simple tool
to outline one's field, calculate the area, and determine potential
costs and market values is a part of this simple calculator.
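The field-outline calculation might look something like the sketch below. It assumes the outline has already been projected to planar coordinates in meters, and the function names, cost categories, and figures are hypothetical illustrations rather than part of any specified tool.

```python
from typing import Dict, List, Tuple


def polygon_area_m2(vertices: List[Tuple[float, float]]) -> float:
    """Area of a simple polygon via the shoelace formula.
    Vertices are planar (x, y) coordinates in meters."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def net_cash_flow(area_ha: float, yield_t_per_ha: float,
                  price_per_t: float,
                  costs_per_ha: Dict[str, float]) -> float:
    """Expected revenue minus summed per-hectare input costs."""
    revenue = area_ha * yield_t_per_ha * price_per_t
    costs = area_ha * sum(costs_per_ha.values())
    return revenue - costs


# Usage: a 100 m x 200 m field outlined by its four corners.
outline = [(0.0, 0.0), (100.0, 0.0), (100.0, 200.0), (0.0, 200.0)]
area_ha = polygon_area_m2(outline) / 10_000.0  # 1 ha = 10,000 m^2
margin = net_cash_flow(area_ha, 3.0, 150.0,
                       {"fertilizer": 100.0, "irrigation": 50.0})
```

Outlines digitized in geographic (longitude/latitude) coordinates would need to be projected first; the shoelace formula is only meaningful on a planar grid.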
Image Processing and Spatial Analysis
Phase 2 of this effort develops powerful image processing and spatial
analysis functionality so users can manipulate gigabytes of
multitemporal remote sensing imagery over modem connections, once their
workspace has been provisioned. This will generally be implemented
with a collection of pre-packaged, HTML POST and applet-based web pages
that run GRASS i.*
and r.*
programs on the server, against local files. One important application
will be an online version of the FAO Land Cover Classification System (LCCS) tools and documentation,
with a supporting Java applet-based drawing tool and
dialog to run a pre-packaged, batch version of the GRASS i.cluster,
i.group,
i.class,
i.gensig,
i.smap,
and i.maxlik
programs. Assistance and partnership with GRASS national user groups
will be established.
A conscious effort will be made to enforce modularity of this type of capability, so that portal installations can choose to install commercial image processing application suites, such as ERDAS, PCI, ENVI, ERMapper, IDL or ION.
As previously mentioned, an accompanying applet will enable users to interactively "pick" a geographic point, and retrieve the value of all bands of one or many multispectral images at that point. In this manner multispectral, multitemporal signatures of a wide variety of phenomena may be collected and shared with associated ground truth data by users worldwide.
Shared Desktops and VOIP
Because VNC sessions execute at portal locations, they enable access to traditional GUI desktop applications, running on high-performance CPU power against content spinning on local disks. If a node operator chooses to purchase their own licenses, a generic "exec" portlet that spawns a new VNC session with the authenticated user's credentials can be used to run ArcGIS, or another suitable desktop application, within a Java-enabled browser applet or a small Windows Active-X component. The "exec" portlet will log appropriate "call detail records" for reconciliation and billing purposes. For high-volume centers supporting a large number of concurrent desktop and/or applet-based sessions, a site can be configured to use a pool of servers, where user sessions are provisioned on demand using GRID Engine.
Run-time Access Control
Because the VNC session will be executed with the credentials of the
authenticated user, the overall session will only be granted access to
content store files for which they appear on the file's access
control list. In this manner, the terms of "shared buy" data
programs,
such as the Multi-Resolution
Land Characteristics Consortium can be enforced, for sites that
gain sufficient trust of data vendors. Another option is, given
sufficient bandwidth, to use such data as a metered remote OpenGIS
WCS from GlobeXplorer or others with CDR logging capability.
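As a rough sketch of the run-time check, the fragment below grants a session access to a content store file only when the authenticated user appears on that file's access control list, and appends a call detail record for every attempt. The `ContentFile` class, the record fields, and the file path are hypothetical; an actual node would rely on the operating system's and content store's own enforcement.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ContentFile:
    path: str
    acl: Set[str] = field(default_factory=set)  # user names allowed


def open_for_session(user: str, content_file: ContentFile,
                     cdr_log: List[Dict]) -> str:
    """Check the ACL for the session's user and log a call detail
    record whether or not access is granted."""
    allowed = user in content_file.acl
    cdr_log.append({
        "user": user,
        "path": content_file.path,
        "allowed": allowed,
        "timestamp": time.time(),
    })
    if not allowed:
        raise PermissionError(f"{user} is not on the ACL for "
                              f"{content_file.path}")
    return content_file.path


cdr_log: List[Dict] = []
shared_buy = ContentFile("/store/landcover/tile_017.tif", {"amina"})
granted_path = open_for_session("amina", shared_buy, cdr_log)
```

Logging denied attempts as well as granted ones keeps the records usable for both billing and audit.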
Distance Learning
Because a major goal of this infrastructure is to
facilitate outreach and distance learning among rural communities,
a strong emphasis will be placed on creating structured rendezvous
processes, using calendars, searchable registries, and group messaging,
to link appropriate domain specialists in agronomy, entomology, soil
science, hydrology, and image processing with users throughout the
federation. Beyond threaded email lists, interactive support via text
"instant messaging", shared workspace, and voice conferencing will be
encouraged. As previously mentioned, a strong effort to establish
acceptance of standard voice-over-IP (VOIP) and shared desktop
conferencing infrastructure will be made early on and encouraged
for all new users. Additionally, because the VNC server enables
multiple simultaneous input streams and remote display buffers, it
is also an ideal tool for distance learning. Experts worldwide can
interactively "take control" of a shared workspace to run training
sessions or address site-specific issues. A directory and calendaring
system, fully integrated with the Alfresco framework, will enable users
to rendezvous with domain specialists in image processing, agronomy,
entomology, soil science, and hydrology. As with all other items,
call detail records will be generated for future reconciliation between
stakeholders. There is even an Active-X component built around VNC (ExpertVNC) that
dynamically
installs a VNC server "on the fly", thereby enabling call-center
experts
to "take control" of a remote user's desktop and local files.
It is believed that
VNC-based training, while never as effective as face-to-face
interaction,
will greatly facilitate outreach and acceptance of the overall network,
and enable highly specialized experts worldwide to provide assistance
during rapidly changing, critical situations during the growing season,
or immediately following "shock" events such as hurricanes or floods.
Simulation and Decision Support
Effective decision-making is greatly enhanced by accurate underlying models of system dynamics. In agriculture and epidemiology, this means effectively modeling crops, soils, water, and disease vectors subject to environmental conditions and management practices.
Therefore, a major goal of the effort is to support useful simulation tools. Temperature and rainfall baseline datasets will be pre-populated with MODIS and TRMM data, and augmented by uploaded data and local data feeds. Users can add extremely detailed site-specific information to calibrate their models using local knowledge, particularly yield maps. An online version of the ICASA/DSSAT models will be available in the user's workspace. For soils data, users will be encouraged to use SDBm Plus.
Specifically
targeted applications are:
Presentation of simulation results will be accomplished using Minnesota Map Server's GDAL capability.
Beyond these, a general-purpose dialog to build reusable libraries of GRASS r.mapcalc, r.cost, r.spread, and other commands for expressing arbitrary spatial processes will be explored, so that important analyses might be easily assembled and reused. Universities worldwide will be invited to engage in collaborative development of new application portlets in agriculture, hydrology, conservation, epidemiology, and other disciplines that can leverage SRTM, TRMM, temperature, Landsat, and user-provided datasets.
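One way such a reusable library could be organized is sketched below for r.mapcalc only. The template names (`ndvi`, `mask_above`) and map names are hypothetical, the command is built but never executed, and the `expression=` form assumes the option syntax of GRASS 7 or later.

```python
from string import Template
from typing import Dict, List

# Hypothetical library of named, parameterized map algebra expressions.
LIBRARY: Dict[str, Template] = {
    "ndvi": Template("$out = float($nir - $red) / ($nir + $red)"),
    "mask_above": Template("$out = if($src > $threshold, $src, null())"),
}


def mapcalc_command(name: str, **params: str) -> List[str]:
    """Fill in a library template and return an r.mapcalc argument
    list suitable for passing to a process launcher."""
    # substitute() raises KeyError if a parameter is missing, so an
    # incomplete expression is never handed to GRASS.
    expression = LIBRARY[name].substitute(**params)
    return ["r.mapcalc", f"expression={expression}"]


cmd = mapcalc_command("ndvi", out="ndvi_2004", nir="b4", red="b3")
```

Keeping expressions as named templates means a domain specialist can contribute a new analysis without writing any portlet code, and the same entry can be rerun against different input maps.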
Integrated Processes
Finally, two highly-instrumented test sites will be selected to
evaluate
the comprehensive “FLORES”
simulation environment. This portion of the effort will drive
consideration of how to reuse detailed field measurements between three
powerful but different environments: GRASS, DSSAT, and Simulistics. It
will also provide extremely useful insight into how to approach
landscape-level, reusable, modular model development and policy
evaluation
efforts.
One of the Simulistics sites will be within the Ohio Little Miami River Basin, one of the US EPA's National Water Quality Assessment verification sites. This site is desirable because it is extremely well instrumented, has corn, soybean, and wheat crops, and has access to free real-time Landsat data through the OhioView program.
Taxonomic Support
Beyond the handful of simple categories enumerated in ISO/TC211 19115
B.5.27, a strong emphasis will be made on rigorous categorization of
content, groups, and workflow items according to well-known taxonomies
and controlled vocabularies, beginning with land cover data according
to the FAO
LCCS
specification. It is felt that adequate investment early on is
essential to address the specialized needs of the agricultural sciences
community. Pre-existing vocabularies and taxonomies are abundant and
extremely important to CGIAR and FAO member countries. An important
standards
group in this arena is the International Working Group on Taxonomic
Databases
(TDWG). In any case, however, local
farmers
will not simply "drop" their practices in favor of a universal system;
indeed, much knowledge is to be captured by adequately structuring indigenous
taxonomies. But beyond these, numerous "standard"
vocabularies and taxonomies already exist: those of SDBm
Plus, GCMD
keywords, SDTS
feature codes, NIMA place
names, FGDC
Biological Profile, etc. Standardization of taxonomies is in its
infancy, and is actively being debated within the OpenGIS Web Registry
group. Another important forum for this topic is the newly-formed
Taxonomy
and Semantics Working group of KM.gov,
formed
by the US government's Federal Chief Information Officers Council.
Again, FAO, CGIAR and USAID are clearly in a leadership position to
establish
a set of reusable standard taxonomies, which would in and of itself be
a major step
towards knowledge management.
Tools
An Alfresco tool is being developed to capture and edit a taxonomy
using
a simplistic recursive, single-inheritance Java Bean structure and an
associated category
class browser. Taxonomies will be captured in a
manner consistent with the “tModel” category mechanism
of the Universal Description, Discovery and Integration (UDDI) specification. Nodes of this
taxonomy will comprise content items themselves, and can contain
multilingual descriptions, photographs, or audio files. Taxonomies
will be maintained by members of a special group, in order to avoid
"stupid user errors." Once a taxonomy is established, it can be
searched using multi-lingual full-text queries, or interactively
with an applet helper. Once a category is found, it can be used to
"tag" other content items. Content can be categorized within multiple
different taxonomies. Additionally, the body of individual .pdf,
.doc, and .html documents can be indexed for full-text queries as
well. Future planned enhancements include the ability to "tag"
individual
sections of documents as well. Similarly, individual database columns
of uploaded ESRI shapefiles, member groups or workflow items can also
be taxonomically and spatially categorized. Queries for content, interest groups,
and events categorized as belonging to "rice in this watershed" can be
formulated and portrayed graphically. Extending this capability to RDF
and
the semantic web is a clearly identified area of collaborative
research;
see the Knowledge Structuring
section of this document.
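The recursive, single-inheritance structure described above might be modeled along these lines. This is a Python sketch rather than the Java Bean under development; the `Category` class, its field names, and the sample categories are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class Category:
    """One taxonomy node with a single parent and multilingual labels."""
    key: str
    labels: Dict[str, str]  # language code -> description
    parent: Optional["Category"] = field(default=None, repr=False)
    children: List["Category"] = field(default_factory=list, repr=False)

    def add_child(self, key: str, labels: Dict[str, str]) -> "Category":
        child = Category(key, labels, parent=self)
        self.children.append(child)
        return child

    def path(self) -> str:
        """Slash-separated keys from the root down to this node."""
        keys, node = [], self
        while node is not None:
            keys.append(node.key)
            node = node.parent
        return "/".join(reversed(keys))

    def search(self, text: str, lang: str) -> List["Category"]:
        """Depth-first, case-insensitive match on one language's label."""
        hits = []
        if text.lower() in self.labels.get(lang, "").lower():
            hits.append(self)
        for child in self.children:
            hits.extend(child.search(text, lang))
        return hits


crops = Category("crops", {"en": "Crops", "fr": "Cultures"})
cereals = crops.add_child("cereals", {"en": "Cereals", "fr": "Céréales"})
rice = cereals.add_child("rice", {"en": "Rice", "fr": "Riz"})

# Tagging: content identifier -> category paths (content may carry many).
content_tags: Dict[str, List[str]] = {}
content_tags.setdefault("field-note-17", []).append(rice.path())
```

Storing tags as category paths lets one content item be categorized within several taxonomies at once, as the text requires.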
Collaborative Development and Research
Positioning
As illustrated throughout this document, the portal effort will be squarely
based upon OpenGIS standards and numerous mainstream, cutting-edge
open-sourced software efforts. As such, it directly benefits from
ongoing, active development of the individual packages. World Bank, USAID, FAO and
CGIAR's
core competences are complementary: domain expertise, and field presence applying
technologies to food security, poverty alleviation,
and
environmental management. This effort provides a unique opportunity
to test theory against the real world, in concert with the majority of
civilian
ground stations and Earth Observation archives.
Member development agencies need to assert these strengths and
diplomatic status to influence ESA-sponsored GMES activities, Canadian
GeoConnections, US NASA and NSF research, and other well-funded
activities in this arena to address the crushing issues humanity faces in
the next few decades.
University and NGO research
Beyond agriculture and food security, it is envisioned that this
framework for application development and knowledge capture, with structured
business process definitions for scalable production of derivative spatial
products based on taxonomic metadata, PKI security, value
chaining, spatial content management, remote dataset discovery, and
GRID-based process execution can offer universities, national agencies,
and NGOs worldwide a wide yet focused spectrum of collaborative research
opportunities. It can also enable FAO/CGIAR to tap external funding sources for
major enhancements to monitoring and hypothesis evaluation of GM crop activities,
vital to feeding a hungry, growing human population. Ongoing effort will be made to facilitate
multi-level aggregation of real data, whereby aggregates suitable for
regional statistics and economic policy can draw directly upon the most
accurate data available - that maintained by "pixel inhabitants."
The initial sites are intentionally located in Africa and South America, where CBERS, Landsat, and other core data sets are free, where there is already a large overlap with UNDP, UNEP, and World Bank field operations, and where FEWS activities can be leveraged. Strategic discussions should commence immediately as to how this effort may be positioned relative to their activities for data collection, processing, and dissemination. Also, NASA's Research, Education, and Applications Solutions Network (REASoN) is just beginning its work, and will no doubt be creating several subsystems, particularly thin-wire scientific applications, directly applicable to FAO and CGIAR's mission. Again, the project outlined in this document provides a coordinated conduit for that technology, as well as similar development efforts in Europe, India, China, and elsewhere, to reach and benefit UN member countries at the individual farm and watershed level.
Site Validation Efforts
As previously stated, it is recommended that a core strategy of the
portal be to recruit and coordinate test agricultural sites among their
field stations worldwide in order to present a single "unified front"
to CEOS members for sensor validation, in conjunction with their
existing Global
Land Cover Test Sites. Additionally, well-established ecological
monitoring sites, particularly TEMS sites,
should be actively recruited to become early adopters of the framework.
Commercial Partnerships
Because the overall framework comprises a global comprehensive
decision-support framework, with usage-tracking capability, a wide
range
of commercial partnerships are possible. First and foremost are
partnerships with commercial operators such as GlobeXplorer, SPOT,
Space
Imaging, and others to enhance the data available for manipulation.
Next, FAO and CGIAR should attempt to coordinate this effort with other
lending institutions' GIS activities, and particularly with major funded GIS
software system integration efforts by contractors such as Chemonics International, IBM
Global Services, and others. Major equipment and
fertilizer manufacturers should also be involved early on for training and outreach.
Crop Models
Systematic collaboration with ongoing CGIAR, USDA, FAO, ICASA, and other crop
modeling efforts will be pursued to share results and calibration data.
Farm-level Economic Models
Integration with a wide range of useful
tools, such as Texas A&M's FLIPSIM, will be explored.
Knowledge Structuring
One identified area of research, consistent with the open model of
the effort, is to employ taxonomic mechanisms such as
the “KAON”
semantic framework,
to adequately structure and cross-reference knowledge about and between
biological, soils-related, and other taxonomic groupings. This area
represents a clear opportunity for university research worldwide, which
might immediately “tap into” the CGIAR user-base worldwide. It is felt
that the examinations of how to adequately categorize DSSAT-capable
measurements within OpenGIS and TC/211 efforts, as well as perform
interoperability between GRASS and FLORES, will quickly require such a
robust framework. It is also felt that this is a long-term strategic
necessity, particularly if CGIAR and FAO are to assist member countries
to keep up with international
biological taxonomy and eco-complexity
efforts, to adequately describe their ecosystems in a time of rapid
species extinction, climate change, introduction of GM organisms, and
position FAO
and CGIAR as leaders in major knowledge sharing efforts such as FIVIMS
and the World
Bank Knowledge Sharing program.
High Performance Computing
Another important area for collaboration is to develop a library of
reusable cellular automata models for frameworks such as the parallel,
open-source Spatial
Modeling Environment, compiled and linked as formal, generalized modular models.
Of particular
interest to IFPRI and other policy
institutions is using spatially explicit
crop production estimates to drive
economic-ecologic models and technology adoption models
with household
sub-models.
In this manner, models created by domain specialists in GRASS mapcalc, Simulistics model files, ArcGIS Spatial Analyst XML model files, or similar tools might be combined, linked to social cellular automata models, and executed on D-GRID, Teragrid or DataTAG.
Incorporation of Earth System Modeling Framework coupling routines might enable multi-level aggregations between regional, high resolution models and coarse-grained, global models within the overall GTOS program.
Because security and accounting will be enforced throughout all aspects of user and group interactions, multi-level, distributed resource gaming might be accomplished. A scenario can be envisioned where localized variables are controlled by appropriate delegated sovereigns, "environmental" variables are set by lending institutions or CEOS member supercomputer runs, and a variety of possible scenarios are evaluated at several levels of hierarchy and linkage. This is precisely the goal of the Earth System Modeling Framework.
It is also conceivable to integrate
with online socio-economic data, and conduct multiplayer "SimCity"-like
games among stakeholders at multiple levels of hierarchy, with
stochastic social
or political
models, for simulating food shortage, monetary crisis, or climate
change
scenarios caused by a wide variety of circumstances.