Abstract
Integrating data from heterogeneous sources has challenged computer science research for years, if not decades. As a result, many methods, frameworks, and tools have been developed, and are still being developed, each promising to solve the data integration problem. This work describes those we consider most promising and relates them to each other. Because our focus is on scientific applications, we consider properties that are important in this domain, such as data provenance. We also consider more general aspects, such as the extensibility of an approach.
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Roth, B., Volz, B., Hecht, R. (2010). Data Integration Systems for Scientific Applications. In: Meersman, R., Dillon, T., Herrero, P. (eds) On the Move to Meaningful Internet Systems: OTM 2010 Workshops. OTM 2010. Lecture Notes in Computer Science, vol 6428. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16961-8_26
DOI: https://doi.org/10.1007/978-3-642-16961-8_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16960-1
Online ISBN: 978-3-642-16961-8
eBook Packages: Computer Science (R0)