A Survey on Data Integration in Bioinformatics

  • Conference paper
Informatics Engineering and Information Science (ICIEIS 2011)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 254)

Abstract

The need for data integration is widely acknowledged in bioinformatics. Several very large biological databanks are now available across the world, in different formats. Characterizing data, or mapping it between several sources, requires integrating all related data fields. The integration problem may be addressed with a variety of approaches; some are widely used, while others are less so because they fail to meet the basic requirements of data integration. In this paper, we discuss three techniques for data integration: the federated database system approach, the data warehousing approach, and the link-driven approach. Each approach has its strengths and weaknesses, so it is important to identify which is best suited to a given user’s needs. We also discuss some database systems that use these three approaches to solve the problem of data integration.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Thiam Yui, C., Liang, L.J., Jik Soon, W., Husain, W. (2011). A Survey on Data Integration in Bioinformatics. In: Abd Manaf, A., Sahibuddin, S., Ahmad, R., Mohd Daud, S., El-Qawasmeh, E. (eds) Informatics Engineering and Information Science. ICIEIS 2011. Communications in Computer and Information Science, vol 254. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25483-3_2

  • DOI: https://doi.org/10.1007/978-3-642-25483-3_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-25482-6

  • Online ISBN: 978-3-642-25483-3

  • eBook Packages: Computer Science (R0)
