Data Exchange: Semantics and Query Answering

  • Ronald Fagin
  • Phokion G. Kolaitis
  • Renée J. Miller
  • Lucian Popa
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2572)

Abstract

Data exchange is the problem of taking data structured under a source schema and creating an instance of a target schema that reflects the source data as accurately as possible. In this paper, we address foundational and algorithmic issues related to the semantics of data exchange and to query answering in the context of data exchange. These issues arise because, given a source instance, there may be many target instances that satisfy the constraints of the data exchange problem. We give an algebraic specification that selects, among all solutions to the data exchange problem, a special class of solutions that we call universal. A universal solution has no more and no less data than required for data exchange and it represents the entire space of possible solutions. We then identify fairly general, and practical, conditions that guarantee the existence of a universal solution and yield algorithms to compute a canonical universal solution efficiently. We adopt the notion of “certain answers” in indefinite databases for the semantics for query answering in data exchange. We investigate the computational complexity of computing the certain answers in this context and also study the problem of computing the certain answers of target queries by simply evaluating them on a canonical universal solution.
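
The abstract does not spell out an algorithm, so the following is only a minimal illustrative sketch and not the authors' implementation. It assumes a hypothetical source relation Emp(name), a hypothetical target relation Mgr(emp, mgr), and a single hypothetical constraint stating that every employee has some manager. Unknown managers are represented by fresh placeholder values (labeled nulls), and a conjunctive target query is answered by evaluating it on the resulting instance and keeping only the tuples that contain no nulls.

```python
from itertools import count


class Null:
    """A labeled null: a placeholder for a value the source does not determine."""

    def __init__(self, label):
        self.label = label

    def __repr__(self):
        return f"N{self.label}"


def canonical_solution(emp):
    """Build a target instance for the assumed rule Emp(x) -> exists y. Mgr(x, y).

    Each source fact yields one target fact whose second column is a fresh null.
    """
    fresh = count(1)
    return [(name, Null(next(fresh))) for name in sorted(emp)]


def certain_answers(rows):
    """Keep only the tuples made entirely of constants (no labeled nulls)."""
    return {row for row in rows if not any(isinstance(v, Null) for v in row)}


emp = {"alice", "bob"}            # hypothetical source instance
mgr = canonical_solution(emp)     # target instance containing labeled nulls

# q1(x) :- Mgr(x, y): which employees have a manager?
q1 = certain_answers({(e,) for (e, m) in mgr})

# q2(x, y) :- Mgr(x, y): which (employee, manager) pairs are known?
q2 = certain_answers({(e, m) for (e, m) in mgr})

print(q1)
print(q2)
```

In this toy setting q1 returns both employees, because every possible solution must give each of them a manager, while q2 returns nothing, because no particular manager is forced by the source data.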

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Ronald Fagin (1)
  • Phokion G. Kolaitis (2)
  • Renée J. Miller (3)
  • Lucian Popa (1)

  1. IBM Almaden Research Center, USA
  2. UC Santa Cruz, USA
  3. University of Toronto, Toronto
