Data Integration Systems for Scientific Applications

  • Conference paper
On the Move to Meaningful Internet Systems: OTM 2010 Workshops (OTM 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6428)

Abstract

The integration of data from heterogeneous sources has challenged computer science research for years, if not decades. As a result, many methods, frameworks, and tools have been developed, and are still being developed, each promising to solve the data integration problem. This work describes those we consider most promising and relates them to each other. Since our focus is on scientific applications, we consider properties that are important in this domain, such as data provenance. We also consider more general aspects, such as the extensibility of an approach.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Roth, B., Volz, B., Hecht, R. (2010). Data Integration Systems for Scientific Applications. In: Meersman, R., Dillon, T., Herrero, P. (eds) On the Move to Meaningful Internet Systems: OTM 2010 Workshops. OTM 2010. Lecture Notes in Computer Science, vol 6428. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16961-8_26

  • DOI: https://doi.org/10.1007/978-3-642-16961-8_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-16960-1

  • Online ISBN: 978-3-642-16961-8

  • eBook Packages: Computer Science, Computer Science (R0)
