Abstract
The need for data integration is widely acknowledged in bioinformatics. Several very large biological databanks are now available across the world, each in its own format. Characterizing data or mapping it between several sources requires integrating all related data fields. The integration problem can be addressed with a variety of approaches; some are widely used, while others see less use because they fail to meet the basic requirements of data integration. In this paper, we discuss three techniques for data integration: the federated database system approach, the data warehousing approach and the link-driven approach. Each approach has its strengths and weaknesses, so it is important to identify which is best suited to a given user’s needs. We also discuss some database systems that use these three approaches to solve the problem of data integration.
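To make the distinction between the three approaches concrete, the following is a minimal Python sketch. It is a toy illustration based on the usual textbook characterizations of each approach, not a description of any system surveyed in the paper. The two in-memory "sources", their record identifiers, and all function names are hypothetical.

```python
# Toy illustration only: two hypothetical in-memory "sources" with different schemas.
# Identifiers such as "G1" and "P1" are made up for this sketch.
GENE_DB = {
    "G1": {"symbol": "TP53", "protein_ref": "P1"},
    "G2": {"symbol": "BRCA1", "protein_ref": "P2"},
}
PROTEIN_DB = {
    "P1": {"name": "Cellular tumor antigen p53"},
    "P2": {"name": "Breast cancer type 1 susceptibility protein"},
}


def build_warehouse():
    """Data warehousing: copy and merge all sources into one local store up front."""
    return {
        gene_id: {
            "symbol": gene["symbol"],
            "protein": PROTEIN_DB[gene["protein_ref"]]["name"],
        }
        for gene_id, gene in GENE_DB.items()
    }


def federated_query(symbol):
    """Federated: data stays in the sources; a mediator queries them on demand."""
    for gene in GENE_DB.values():
        if gene["symbol"] == symbol:
            protein = PROTEIN_DB[gene["protein_ref"]]
            return {"symbol": symbol, "protein": protein["name"]}
    return None  # no source holds a matching record


def follow_link(gene_id):
    """Link-driven: return a cross-reference that the user follows to another source."""
    return ("PROTEIN_DB", GENE_DB[gene_id]["protein_ref"])


# Warehouse: answer comes from the pre-built local copy.
warehouse = build_warehouse()
warehouse_answer = warehouse["G1"]

# Federated: answer is assembled from the live sources at query time.
federated_answer = federated_query("TP53")

# Link-driven: the user navigates from one record to the linked record.
source_name, record_id = follow_link("G1")
linked_answer = PROTEIN_DB[record_id]
```

In this sketch the warehouse and the federated query return the same merged record, but the warehouse copy goes stale if a source changes after it is built, while the federated query pays the cost of contacting every source on each request. The link-driven version does no merging at all and leaves navigation to the user.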
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Thiam Yui, C., Liang, L.J., Jik Soon, W., Husain, W. (2011). A Survey on Data Integration in Bioinformatics. In: Abd Manaf, A., Sahibuddin, S., Ahmad, R., Mohd Daud, S., El-Qawasmeh, E. (eds) Informatics Engineering and Information Science. ICIEIS 2011. Communications in Computer and Information Science, vol 254. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25483-3_2
DOI: https://doi.org/10.1007/978-3-642-25483-3_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25482-6
Online ISBN: 978-3-642-25483-3
eBook Packages: Computer Science, Computer Science (R0)