Data Integration under Integrity Constraints

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2348)


Data integration systems provide access to a set of heterogeneous, autonomous data sources through a so-called global schema. There are two basic approaches to designing a data integration system. In the global-centric approach, the elements of the global schema are defined as views over the sources, whereas in the local-centric approach, the sources are characterized as views over the global schema. It is well known that processing queries in the latter approach is similar to query answering with incomplete information and is therefore a complex task. On the other hand, it is commonly believed that query processing is much easier in the former approach. In this paper we show the surprising result that, when the global schema is expressed in the relational model with integrity constraints, even of simple types, the problem of incomplete information implicitly arises, making query processing difficult in the global-centric approach as well. We then focus on global schemas with key and foreign key constraints, a situation that is very common in practice, and we illustrate techniques for effectively answering queries posed to the data integration system in this case.
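To make the abstract's central claim concrete, the following is a minimal Python sketch of how a foreign key on the global schema can introduce incomplete information even in the global-centric approach. Everything in it is a hypothetical illustration and not taken from the paper: the relations `person` and `student`, the source contents, the `Null` placeholder, and the completion step (a simple chase-like repair that assumes the foreign key is the only constraint). It is not the technique the paper develops.

```python
# Hypothetical source extensions (illustrative data, not from the paper).
SOURCE_PERSON = {("p1", 30, "Rome")}                               # (code, age, city)
SOURCE_STUDENT = {("p1", "Sapienza"), ("p2", "Sapienza")}          # (code, university)


class Null:
    """Placeholder for a value that must exist but is not known."""


def retrieve_global():
    """Global-centric mapping: each global relation is a view over the sources.

    Here the views are assumed to be simple copies of one source each.
    """
    return {"person": set(SOURCE_PERSON), "student": set(SOURCE_STUDENT)}


def person_codes(db):
    """Query over the global schema: the codes of all persons."""
    return {code for (code, _age, _city) in db["person"]}


def complete_foreign_key(db):
    """Enforce the assumed foreign key student[code] -> person[code].

    Every student code missing from person gets a person tuple whose
    remaining attributes are unknown (Null). This is a simplified,
    chase-like step under the assumption of a single foreign key.
    """
    person = set(db["person"])
    known = {code for (code, _age, _city) in person}
    for (code, _university) in db["student"]:
        if code not in known:
            person.add((code, Null(), Null()))
            known.add(code)
    return {"person": person, "student": set(db["student"])}


def person_ages(db):
    """Query returning (code, age) pairs, keeping only known ages."""
    return {(c, a) for (c, a, _city) in db["person"] if not isinstance(a, Null)}


retrieved = retrieve_global()
naive = person_codes(retrieved)                          # ignores the constraint
with_constraint = person_codes(complete_foreign_key(retrieved))
```

In this sketch, evaluating the query directly over the retrieved data yields only `p1`, while taking the foreign key into account also yields `p2`: a person with that code must exist, although the age and city are unknown. For the same reason, `person_ages` on the completed database still returns only the pair for `p1`, since an unknown age cannot appear in an answer that holds in every possible database.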


Keywords: Logic Program, Query Processing, Data Integration, Global Schema, Integrity Constraint



Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  1. Dipartimento di Informatica e Sistemistica, Università di Roma “La Sapienza”, Roma, Italy
