Schema Mappings: A Case of Logical Dynamics in Database Theory

  • Balder ten Cate
  • Phokion G. Kolaitis
Part of the Outstanding Contributions to Logic book series (OCTR, volume 5)


A schema mapping is a high-level specification of the structural relationships between two database schemas. This specification is expressed in a schema-mapping language, which is typically a fragment of first-order logic or second-order logic. Schema mappings have played an essential role in the study of important data-interoperability tasks, such as data integration and data exchange. In this chapter, we examine schema mappings as a case of logical dynamics in action. We provide a self-contained introduction to this area of research in the context of logic and databases, and focus on some of the concepts and results that may be of particular interest to the readers of this volume. After a basic introduction to schema mappings and schema-mapping languages, we discuss a series of results concerning fundamental structural properties of schema mappings. We then show that these structural properties can be used to obtain characterizations of various schema-mapping languages, in the spirit of abstract model theory. We conclude this chapter by highlighting the surprisingly subtle picture regarding compositions of schema mappings and the languages needed to express them.
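
As an illustration of the kind of specification meant here, schema mappings are often given by source-to-target tuple-generating dependencies, which form a fragment of first-order logic. The following is a minimal sketch and is not taken from the chapter; the relation symbols $E$ (over the source schema) and $F$ (over the target schema) are hypothetical names chosen only for this example:

```latex
\forall x \, \forall y \, \bigl( E(x,y) \rightarrow \exists z \, ( F(x,z) \land F(z,y) ) \bigr)
```

Read informally, the formula requires that every pair in the source relation $E$ be matched in the target relation $F$ by a path of length two through some intermediate element $z$, whose value the mapping leaves unspecified.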


Keywords: Schema mappings, Data interoperability, Structural characterizations, Composition, Logical dynamics



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. UC Santa Cruz, Santa Cruz, USA
  2. UC Santa Cruz and IBM Research, Almaden, USA
