The VLDB Journal, Volume 21, Issue 2, pp 191–211

MapMerge: correlating independent schema mappings

  • Bogdan Alexe
  • Mauricio Hernández
  • Lucian Popa
  • Wang-Chiew Tan
Special Issue Paper

Abstract

One of the main steps toward integration or exchange of data is to design the mappings that describe the (often complex) relationships between the source schemas or formats and the desired target schema. In this paper, we introduce a new operator, called MapMerge, that can be used to correlate multiple, independently designed schema mappings of smaller scope into larger schema mappings. This allows a more modular construction of complex mappings from various types of smaller mappings such as schema correspondences produced by a schema matcher or pre-existing mappings that were designed by either a human user or via mapping tools. In particular, the new operator also enables a new “divide-and-merge” paradigm for mapping creation, where the design is divided (on purpose) into smaller components that are easier to create and understand and where MapMerge is used to automatically generate a meaningful overall mapping. We describe our MapMerge algorithm and demonstrate the feasibility of our implementation on several real and synthetic mapping scenarios. In our experiments, we make use of a novel similarity measure between two database instances with different schemas that quantifies the preservation of data associations. We show experimentally that MapMerge improves the quality of the schema mappings, by significantly increasing the similarity between the input source instance and the generated target instance. Finally, we provide a new algorithm that combines MapMerge with schema mapping composition to correlate flows of schema mappings.

Keywords

Schema mappings, Data exchange, Data integration

Copyright information

© Springer-Verlag 2012

Authors and Affiliations

  • Bogdan Alexe (1)
  • Mauricio Hernández (1)
  • Lucian Popa (1)
  • Wang-Chiew Tan (1, 2)

  1. IBM Research-Almaden, San Jose, USA
  2. UC Santa Cruz, Santa Cruz, USA
