Designing a Global Information Resource for Molecular Biology (Short Paper)

  • Ulf Leser
Conference paper
Part of the Informatik aktuell book series (INFORMAT)


Research in molecular biology continuously produces an immense amount of data, but this information is spread over numerous heterogeneous data repositories. Integrating these repositories into a federated information system would drastically reduce the time a biologist has to spend browsing different WWW sites or databases in search of a particular piece of information.

In this study we point out the specific problems that molecular biology poses for data integration and present our approach to coping with them. The approach is based on a mediator architecture and uses query correspondence assertions (QCAs) to describe sources in a flexible yet expressive manner. QCAs capture both the content and the query capabilities of arbitrary data sources with respect to a federated schema. Based on such QCAs, a mediator can answer queries against the federated schema by constructing semantically equivalent combinations of source queries.
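The idea of combining source queries can be illustrated with a minimal sketch. It is not the algorithm of the paper: the `QCA` class, the `plan` function, the source names and the relation names are all hypothetical, and simple coverage of relation names stands in for the semantic-equivalence test that a real mediator would have to perform.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class QCA:
    """Hypothetical, heavily simplified query correspondence assertion."""
    source: str          # illustrative name of the data source
    covers: frozenset    # federated-schema relations the source query answers
    source_query: str    # query in the source's own language (opaque here)


def plan(query_relations, qcas):
    """Return minimal combinations of QCAs that jointly cover the query.

    Assumption of this sketch: a combination is acceptable as soon as the
    relations it covers include all relations the query needs.
    """
    needed = frozenset(query_relations)
    plans = []
    for size in range(1, len(qcas) + 1):
        for combo in combinations(qcas, size):
            covered = frozenset().union(*(q.covers for q in combo))
            if not needed <= covered:
                continue
            # Skip combinations that merely extend a smaller plan found earlier.
            if any(set(p) <= set(combo) for p in plans):
                continue
            plans.append(combo)
    return plans


# Illustrative data: three made-up sources over a made-up federated schema.
qcas = [
    QCA("source_a", frozenset({"gene"}), "source_a: list genes"),
    QCA("source_b", frozenset({"clone"}), "source_b: list clones"),
    QCA("source_c", frozenset({"gene", "clone"}), "source_c: genes with clones"),
]

for p in plan({"gene", "clone"}, qcas):
    print([q.source for q in p])
```

For a query over both `gene` and `clone`, the sketch finds two plans: `source_c` alone, and `source_a` combined with `source_b`. A real mediator would additionally have to respect the query capabilities of each source and verify that the combined result is semantically equivalent to the federated query.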





Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Ulf Leser (1)
  1. Fachbereich 13 - CIS, Technische Universität Berlin, Berlin, Germany
