The VLDB Journal, Volume 20, Issue 2, pp 277–302

Normalization and optimization of schema mappings

Special Issue Paper


Schema mappings are high-level specifications that describe the relationship between database schemas. They are an important tool in several areas of database research, notably in data integration and data exchange. However, a concrete theory of schema mapping optimization, including the formulation of optimality criteria and the construction of algorithms for computing optimal schema mappings, is completely lacking to date. The goal of this work is to fill this gap. We start by presenting a system of rewrite rules to minimize sets of source-to-target tuple-generating dependencies. Moreover, we show that the result of this minimization is unique up to variable renaming. Hence, our optimization also yields a schema mapping normalization. By appropriately extending our rewrite rule system, we also provide a normalization of schema mappings containing equality-generating target dependencies. An important application of such a normalization is in defining the semantics of query answering in data exchange, since several definitions in this area depend on the concrete syntactic representation of the mappings. This is, in particular, the case for queries with negated atoms and for aggregate queries. The normalization of schema mappings allows us to eliminate the effect of the concrete syntactic representation of the mapping from the semantics of query answering. We discuss in detail how our results can be fruitfully applied to aggregate queries.


Data integration · Data exchange · Schema mappings optimization





Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  • Georg Gottlob (1)
  • Reinhard Pichler (2)
  • Vadim Savenkov (2)

  1. Computing Laboratory, Oxford University, Oxford, United Kingdom
  2. Database and Artificial Intelligence Group, Institute of Information Systems, Vienna University of Technology, Vienna, Austria
