Schemas for Integration and Translation of Structured and Semi-structured Data

  • Catriel Beeri
  • Tova Milo
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1540)


With the emergence of the Web as a universal data repository, research has recently focused on data integration and data translation, and a common data model of semistructured data has been established. It is being realized, however, that having a common schema model is also necessary, to support tasks such as query formulation, decomposition and optimization, or declarative specification of data translation. In this paper we elaborate on the theoretical foundations of a middle-ware schema model. We present expressive and flexible schema definition languages, and investigate properties such as expressive power and the complexity of decision problems that are significant in the context of data translation and integration.


Regular Expression, Data Graph, Expressive Power, Parse Tree, Virtual Node
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Catriel Beeri (1)
  • Tova Milo (2)
  1. Hebrew University, Israel
  2. Tel Aviv University, Israel
