Integrating Scientific Data through External, Concept-Based Annotations

  • Michael Gertz
  • Kai-Uwe Sattler
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2590)


In several scientific application domains, such as the computational sciences, transparent and integrated access to distributed and heterogeneous data sources is key to leveraging the knowledge and findings of researchers. Standard database integration approaches, however, are either not applicable or insufficient because local and global schema structures are lacking. In these application domains, data integration often occurs manually: researchers collect data and categorize them using “semantic indexing”, in the simplest case through local bookmarking, which leaves them without appropriate mechanisms for querying, sharing, and managing the data.

In this paper, we present a data integration technique suitable for such application domains. This technique is based on the notion of controlled data annotations, resembling the idea of associating semantically rich metadata with diverse types of data, including images and text-based documents. Using concept-like structures defined by scientists, data annotations allow scientists to link such Web-accessible data to concepts at different levels of granularity. Annotated data describing instances of such concepts then provide for sophisticated query schemes that researchers can employ to query the distributed data in an integrated and transparent fashion. We present our data annotation framework in the context of the neurosciences, where researchers employ concepts and annotations to integrate and query diverse types of data managed by and distributed among individual research groups.
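
The idea of linking Web-accessible data to concepts and then querying by concept can be illustrated with a minimal sketch. It is not the framework described in the paper: the `Annotation` record, the `SUBCONCEPTS` table, the sample URLs, and the assumption that concepts form a sub-concept hierarchy traversed by transitive closure are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Links part of a Web-accessible resource to a concept (hypothetical model)."""
    url: str      # location of the annotated data
    region: str   # granularity marker, e.g. a whole document or an image region
    concept: str  # name of the concept the data is an instance of


# Hypothetical concept structure: each concept maps to its direct sub-concepts.
SUBCONCEPTS = {
    "Cell": ["Neuron", "Glia"],
    "Neuron": ["Purkinje cell"],
    "Glia": [],
    "Purkinje cell": [],
}


def descendants(concept, subconcepts):
    """Return the concept plus all transitively reachable sub-concepts."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(subconcepts.get(current, []))
    return seen


def query(annotations, concept, subconcepts):
    """Find annotations whose concept is the given one or any of its sub-concepts."""
    wanted = descendants(concept, subconcepts)
    return [a for a in annotations if a.concept in wanted]


# Sample annotations from three hypothetical research groups.
ANNOTATIONS = [
    Annotation("http://example.org/lab-a/img1.png", "rect(10,10,40,40)", "Purkinje cell"),
    Annotation("http://example.org/lab-b/report.html", "whole-document", "Glia"),
    Annotation("http://example.org/lab-c/notes.txt", "paragraph 3", "Synapse"),
]

if __name__ == "__main__":
    for hit in query(ANNOTATIONS, "Cell", SUBCONCEPTS):
        print(hit.url, hit.region, hit.concept)
```

Because the annotations are stored apart from the data they describe, the sources themselves need no shared schema; in this sketch a query for "Cell" reaches the image region and the report through the concept structure alone.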


Transitive Closure, Annotation Graph, Query Operation, Query Translation, Query Expression
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Michael Gertz (1)
  • Kai-Uwe Sattler (2)
  1. Department of Computer Science, University of California, Davis, USA
  2. Department of Computer Science, University of Magdeburg, Germany
