A New Framework for Designing Schema Mappings

  • Bogdan Alexe
  • Wang-Chiew Tan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8000)

Abstract

One of the fundamental tasks in information integration is to specify the relationships, called schema mappings, between database schemas. Schema mappings specify how data structured under a source schema is to be transformed into data structured under a target schema. Designing schema mappings is usually a non-trivial and time-intensive process, and it is made harder by the fact that real-life schemas tend to be large and heterogeneous. Traditional approaches are either manual or rely on a user interface in which a schema mapping is derived from correspondences between attributes of the source and target schemas. These correspondences are either specified by the user or derived automatically by applying schema matching to the two schemas.
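To make the notion concrete, here is a minimal sketch in Python. It assumes a hypothetical source relation `Emp(name, dept)` and a hypothetical target relation `Staff(name, division)`; neither schema, nor the function name, comes from the paper. The mapping is written as a plain function from a source instance to a target instance, loosely in the spirit of a source-to-target rule.

```python
# Hypothetical schemas, for illustration only (not taken from the paper):
#   source: Emp(name, dept)        target: Staff(name, division)

def emp_to_staff(source):
    """Apply the rule Emp(n, d) -> Staff(n, d) to a source instance.

    An instance is modelled as a dict from relation name to a set of tuples.
    """
    return {"Staff": {(name, dept) for (name, dept) in source.get("Emp", set())}}


source_instance = {"Emp": {("Ann", "Sales"), ("Bob", "R&D")}}
target_instance = emp_to_staff(source_instance)
```

Real schema mappings are usually stated declaratively as logical constraints rather than as code; the function form is used here only because it is easy to run.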

In this paper, we examine an alternative approach that allows a user to follow the “divide-design-merge” paradigm for specifying a schema mapping. The user can choose to independently design schema mappings for smaller portions of the source and target schemas. Afterwards, the user can interact with the system to refine and further design these schema mappings through the use of data examples. Finally, in the merge phase, a global schema mapping is generated by correlating the individual schema mappings.
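The three phases can be sketched roughly as follows. All relation and function names (`Emp`, `Dept`, `Staff`, `Office`, `map_staff`, `map_office`, `satisfies`, `merge`) are hypothetical. The merge step is a plain union of the outputs of the partial mappings, a deliberate simplification that stands in for the correlation step and should not be read as the system's algorithm.

```python
# Hypothetical relations, for illustration only:
#   source: Emp(name, dept), Dept(dept, city)
#   target: Staff(name, dept), Office(dept, city)

# Divide: one small mapping per portion of the schemas.
def map_staff(src):
    return {"Staff": set(src.get("Emp", set()))}

def map_office(src):
    return {"Office": set(src.get("Dept", set()))}

# Design: check a mapping against a data example, given as a pair
# (source instance, target facts the user expects to see).
def satisfies(mapping, example):
    src, expected = example
    out = mapping(src)
    return all(facts <= out.get(rel, set()) for rel, facts in expected.items())

# Merge (simplified): union the target facts produced by each partial mapping.
def merge(mappings):
    def merged(src):
        result = {}
        for m in mappings:
            for rel, facts in m(src).items():
                result.setdefault(rel, set()).update(facts)
        return result
    return merged


example = ({"Emp": {("Ann", "Sales")}}, {"Staff": {("Ann", "Sales")}})
staff_ok = satisfies(map_staff, example)

global_mapping = merge([map_staff, map_office])
result = global_mapping({"Emp": {("Ann", "Sales")}, "Dept": {("Sales", "Paris")}})
```

A union of this kind keeps the partial mappings independent of one another; relating the facts they produce is exactly what the sketch leaves out.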

Keywords

Schema mappings, data examples, merge

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Bogdan Alexe (1)
  • Wang-Chiew Tan (2)
  1. IBM Research - Almaden, San Jose, USA
  2. University of California, Santa Cruz, USA
