On the Integration of a Large Number of Life Science Web Databases

  • Zina Ben Miled
  • Nianhua Li
  • Yang Liu
  • Yue He
  • Eric Lynch
  • Omran Bukhres
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2994)


The integration of life science web databases is an important research subject that affects the rate at which new biological discoveries are made. However, making life science databases interoperable presents serious challenges, particularly when the databases are accessed through their web interfaces. Two of these challenges are that life science databases are numerous and that their access interfaces may change often. This paper proposes techniques that take these challenges into account and shows how they were implemented in BACIIS, a federation of life science web databases.


Keywords: Domain Ontology, Extraction Rule, Nucleic Acid Research, Ontology Concept, Query Plan
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Zina Ben Miled (1)
  • Nianhua Li (2)
  • Yang Liu (1)
  • Yue He (1)
  • Eric Lynch (2)
  • Omran Bukhres (2)

  1. Department of Electrical and Computer Engineering, Indiana University Purdue University Indianapolis, Indianapolis, USA
  2. Department of Computer and Information Science, Indiana University Purdue University Indianapolis, Indianapolis, USA
