Schema Mappings: From Data Translation to Data Cleaning

  • Giansalvatore Mecca
  • Paolo Papotti
  • Donatello Santoro
Chapter
Part of the Studies in Big Data book series (SBD, volume 31)

Abstract

Schema mapping management is an important research area in data transformation, integration, and cleaning systems. Its success stems from the declarative nature of its building block, which enables clean semantics and easy-to-use design tools, paired with efficiency and modularity in the deployment step. In this chapter we cover the evolution of schema mappings through what we identify as three main ages. We start with the first, heroic age, presenting the foundations of schema mapping and the first tools aimed at translating data from a source schema to a target schema. We then discuss the silver age, when schema mapping tools grew into complex systems and were turned into both commercial and open-source tools. Finally, we show how recent results in schema mapping are stimulating a third, golden age, with novel research opportunities and a new generation of systems capable of dealing with a significantly larger class of real-life applications.

Keywords

Schema Mapping, Target Schema, Target Database, Core Solution, Target Instance
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Giansalvatore Mecca (1)
  • Paolo Papotti (2)
  • Donatello Santoro (1)
  1. Università della Basilicata, Potenza, Italy
  2. Arizona State University, Tempe, USA