Integrating Ecoinformatics Resources on the Semantic Web
Cynthia Sims Parr |
Andriy Parafiynyk |
Joel Sachs |
Li Ding |
Sandor Dornbush |
Tim Finin |
csparr@ |
andr1@umbc.edu |
jsachs@umbc.edu |
ding.li@umbc.edu |
sandor1@umbc.edu |
finin@umbc.edu |
Department of Computer Science and Electrical Engineering
University of Maryland Baltimore County
Baltimore, MD 21250 USA
David Wang |
Allan Hollander |
Dept. of Computer Science University of Maryland College Park, MD, 20742 USA Tw7@cs.umd.edu |
Information Center for the Environment University of California Davis Davis, CA, 95616 adhollander@ucdavis.edu
|
ABSTRACT
We describe ELVIS (the Ecosystem Location Visualization and
Information System), a suite of tools for constructing food webs
for a given location. We express both ELVIS input and output data
in OWL, thereby enabling its integration with other semantic web
resources. In particular, we describe using a Triple Shop
application to answer SPARQL queries from a collection of semantic
web documents. This is an end-to-end case study of the semantic
web's utility for ecological and environmental research.
Categories and Subject Descriptors
H. Information Systems
H.4 Information System Applications
H.4.m miscellaneous
General Terms
Algorithms, Design, Experimentation, Human Factors, Standardization
Keywords
Food webs - Ecological forecasting - Semantic web - Ontologies - Invasive Species - Biodiversity - Service Oriented Design
SPIRE (Semantic Prototypes in Research Ecoinformatics,
http://spire.umbc.edu) is a distributed, interdisciplinary
research project tasked with building semantic web prototypes for
invasive species science.
Our main integrating suite of tools is ELVIS (the Ecosystem Location Visualization and Information System). ELVIS is motivated by the belief that food web structure plays a role in the success or failure of potential species invasions. Because very few ecosystems have been the subject of empirical food web studies, response teams are typically unable to get quick answers to questions like "what are likely prey and predator species of the invader in the new environment?"
The core data has been integrated from publicly available sources
and is now available on the semantic web. We have constructed a
platform for investigating multiple algorithms for food web
prediction. Further, by exposing item-level data through several
rich sets of ecological and evolutionary ontologies, and by
providing these tools as web services, we enable integration with
other semantic web/web 2.0 applications, such as Swoogle and
FieldMarking (Parr).
We have developed a "shopping cart" application, Triple Shop,
which allows a user to select semantic web documents, and to issue
SPARQL queries over their union. Thus, we are able to integrate
diverse ecoinformatics data in response to ad-hoc queries.
Previous work on data integration in ecological informatics
includes online data repositories [2] and workflow [4] ontologies.
Metadata allows only the discovery of possibly interesting
datasets and does not provide the means to harvest the data
itself. Individual food web researchers maintain and share their
own digital data archives, in individualized data formats, though
more accessible standardized archives are beginning to emerge [1].
There are good databases on invasive species (Invasive Species
Specialist Group 2006; e.g. http://www.issg.org/database/welcome/),
but they are not automatically integrated with information about
the non-invasive species with which they may interact, nor is
there web-based support for modeling an invasive species anywhere.
The task of providing food web information
for a user-specified location breaks into two distinct
problems: constructing a species list for a given location; and
constructing a food web from a given species list (and habitat
information).
Our goal is to allow a user to input a location and get back a species list for that location. This is a hard problem; existing solutions are typically ad hoc and rely on expert knowledge. There are, in general, three kinds of information that can be used to generate a species list: (i) park inventories; (ii) point locations, e.g. from specimen descriptions in museums and herbariums; and (iii) distribution maps generated by applying statistical techniques to point locations. We are integrating all of the above for California, and expect that the ontologies and synthesis strategies we have developed will apply to other states, enabling ELVIS to spread quickly beyond California.
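As a minimal sketch of how the three kinds of information might be combined, the Python below merges records into one species list for a location while tracking which source kinds support each species. The record format, the source labels, and the placeholder species and location names are assumptions made for illustration; they are not the ELVIS data model.

```python
from collections import defaultdict

# Hypothetical records: (species, location, source kind). The three source
# kinds mirror the text; every value here is a placeholder.
RECORDS = [
    ("Species A", "Location X", "inventory"),
    ("Species A", "Location X", "point_location"),
    ("Species B", "Location X", "distribution_map"),
    ("Species C", "Location Y", "inventory"),
]


def species_list(records, location):
    """Return {species: sorted list of supporting source kinds} for a location."""
    support = defaultdict(set)
    for species, place, source in records:
        if place == location:
            support[species].add(source)
    return {name: sorted(kinds) for name, kinds in sorted(support.items())}


if __name__ == "__main__":
    for name, kinds in species_list(RECORDS, "Location X").items():
        print(name, "<-", ", ".join(kinds))
```

Keeping the supporting sources alongside each species lets a user judge how well attested a species is at the location, rather than receiving a bare list.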
The Food Web Constructor (FWC) uses empirically known food web links to predict food web links not yet recorded.
Copyright is held by IW3C2. WWW 2006, May 22-26, 2006, Edinburgh, UK.
A user can choose which food web studies to use for prediction,
or to exclude, from the 257 datasets we compiled from previously
digitized literature (Cohen 1989; Vazquez 2005; Dunne, Williams,
et al.). Taxa can be entered in several different ways: as simple
text lists, XML files, or a food web number. In the latter case we
seek to reconstruct feeding links based on the rest of the
database and can therefore assess the success rate of the
different algorithms or model parameters.
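One way to quantify the success rate when a known food web is reconstructed is to compare the predicted links with the links recorded in that study. The sketch below computes precision and recall over (predator, prey) pairs. The choice of these two measures and the toy links are assumptions for illustration; the text does not say which statistics ELVIS reports.

```python
def link_scores(predicted, observed):
    """Compare predicted and observed (predator, prey) links.

    Returns (precision, recall); a measure is 0.0 when its denominator is empty.
    """
    predicted, observed = set(predicted), set(observed)
    hits = len(predicted & observed)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(observed) if observed else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy links, invented for illustration only.
    observed = {("heron", "frog"), ("frog", "fly"), ("heron", "fish")}
    predicted = {("heron", "frog"), ("frog", "fly"),
                 ("frog", "fish"), ("fish", "fly")}
    precision, recall = link_scores(predicted, observed)
    print(f"precision={precision:.2f} recall={recall:.2f}")
```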
Each suspected link is reported, together with references to supporting evidence. Summary statistics of the food web are also reported.
Figure 2 shows the evidence for a predicted trophic link - namely, the actual link that was observed, the study in which it was published, and the relationship between the species in the observed link and the predicted link.
The Species List Constructor interacts with web services that
provide a variety of species information. Our current Food Web
Constructor algorithm uses taxonomic distances to weight evidence
supporting or failing to support links between organisms. All data
input, output, and taxonomies for Food Web Constructor and
Evidence Provider are available in OWL on demand, but at present
the calculations are performed on data residing in MySQL
databases. Scripts generate OWL documents from Animal Diversity
Web (Myers 2003; http://www.animaldiversityweb.org) via the ETHAN
ontology. The Triple Shop is currently implemented using Joseki
(http://www.joseki.org/).
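The idea of weighting evidence by taxonomic distance can be sketched as follows. This is not the Food Web Constructor algorithm itself, whose details are not given here. The lineage representation, the distance measure (ranks below the deepest shared rank), the 1/(1+d) weight, and the example evidence are all assumptions made for illustration.

```python
# Illustrative lineages, listed from most general to most specific rank.
LINEAGES = {
    "bullfrog": ["Animalia", "Chordata", "Amphibia", "Anura", "Ranidae"],
    "leopard frog": ["Animalia", "Chordata", "Amphibia", "Anura", "Ranidae"],
    "toad": ["Animalia", "Chordata", "Amphibia", "Anura", "Bufonidae"],
    "housefly": ["Animalia", "Arthropoda", "Insecta", "Diptera", "Muscidae"],
    "fruit fly": ["Animalia", "Arthropoda", "Insecta", "Diptera", "Drosophilidae"],
}


def taxonomic_distance(a, b):
    """Number of ranks below the deepest rank shared by taxa a and b."""
    la, lb = LINEAGES[a], LINEAGES[b]
    shared = 0
    for x, y in zip(la, lb):
        if x != y:
            break
        shared += 1
    return max(len(la), len(lb)) - shared


def link_score(predator, prey, evidence):
    """Sum signed, distance-weighted evidence for a candidate link.

    evidence: iterable of (known_predator, known_prey, observed), where
    observed is True if the link was recorded and False if it was
    examined and found absent.
    """
    score = 0.0
    for known_pred, known_prey, observed in evidence:
        d = (taxonomic_distance(predator, known_pred)
             + taxonomic_distance(prey, known_prey))
        weight = 1.0 / (1.0 + d)  # closer relatives count for more
        score += weight if observed else -weight
    return score


if __name__ == "__main__":
    # Hypothetical evidence: one supporting link, one failing to support.
    evidence = [("bullfrog", "housefly", True), ("toad", "housefly", False)]
    print(link_score("leopard frog", "fruit fly", evidence))
```

A positive score means that the supporting evidence from close relatives outweighs the evidence against the link; where to set a decision threshold is left open in this sketch.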
Swoogle (Google for the Semantic Web) is our semantic web search
engine [3]. It allows users to search for both ontologies and
instance data (collectively referred to as semantic web documents)
along a variety of parameters. Once documents are returned, a user
can select certain of them for inclusion into her "Triple Shop", a
sort of shopping cart for RDF triples. We have built a stand-alone
version of the Triple Shop, which allows a user to specify the
URLs of arbitrary semantic web documents, and to issue SPARQL
queries against the union of those documents.
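The core idea of querying the union of several documents can be illustrated without an RDF toolkit. The sketch below is neither the Triple Shop nor SPARQL: it merges triples from hypothetical in-memory documents and matches a list of triple patterns, with variables written as strings that start with "?". The document contents, predicate names, and species placeholders are invented for illustration.

```python
def union(*documents):
    """Merge several documents (iterables of triples) into one triple set."""
    merged = set()
    for doc in documents:
        merged.update(doc)
    return merged


def match(patterns, triples, binding=None):
    """Yield variable bindings that satisfy every (s, p, o) pattern."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        trial = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or reject if already bound differently.
                if trial.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield from match(rest, triples, trial)


# Two hypothetical documents that only answer the question together.
FOOD_WEB_DOC = [
    ("speciesA", "eats", "speciesB"),
    ("speciesC", "eats", "speciesB"),
]
STATUS_DOC = [
    ("speciesA", "status", "invasive"),
    ("speciesB", "status", "native"),
    ("speciesC", "status", "native"),
]

# Which native species does an invader eat?
QUERY = [
    ("?pred", "status", "invasive"),
    ("?pred", "eats", "?prey"),
    ("?prey", "status", "native"),
]

if __name__ == "__main__":
    for row in match(QUERY, union(FOOD_WEB_DOC, STATUS_DOC)):
        print(row)
```

Neither document alone can answer the query, because the feeding links and the invasive/native status live in different sources; the union makes the join possible.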
ELVIS illustrates the potential of the semantic web to support rapid querying of distributed scientific databases for a variety of scenarios. For example (Figure 3): Determine known predator-prey relationships among an invader and a specific group of native species in a particular habitat, as reported in previous studies.
Figure 3. SPARQL query on ETHAN and SpireEcoConcepts OWL documents.
This research was supported by NSF ITR grant #0326460 and
matching funds received from the USGS National Biological
Information Infrastructure.
[1] Dunne, J. A. The network structure of food webs. In:
Ecological Networks: Linking Structure to Dynamics in Food Webs,
eds. Pascual, M. and Dunne, J. A. Oxford University Press, 2005,
pp. 27-86.
[2] Jones, M. B.; Berkley, C.; Bojilova, J.; Schildhauer, M. P.
2001. Managing scientific metadata. IEEE Internet Computing 5(5),
59-68.
[3] Ding, L. et al. Swoogle: A Search and Metadata Engine for the
Semantic Web. Proceedings of the Thirteenth ACM Conference on
Information and Knowledge Management, November 2004.
[4] Ludaescher, B. et al. 2004. Scientific workflow management
and the Kepler system. Concurrency and Computation: Practice and
Experience.