Index Structures for Path Expressions

  • Tova Milo
  • Dan Suciu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1540)


In recent years there has been increased interest in managing data that does not conform to traditional data models, such as the relational or object-oriented model. The reasons for this non-conformance are diverse. On the one hand, data may not conform to such models at the physical level: it may be stored in data exchange formats, fetched from the Web, or stored as structured files. On the other hand, it may not conform at the logical level: data may have missing attributes, some attributes may be of different types in different data items, there may be heterogeneous collections, or the schema may be too complex or may change too often. The term semistructured data has been used to refer to such data. The semistructured data model consists of an edge-labeled graph, in which nodes correspond to objects and edges to attributes or values. Figure 1 illustrates a semistructured database providing information about a city.
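
The edge-labeled graph model described above can be illustrated with a minimal sketch. It is not the paper's index structure and does not reproduce Figure 1: the class name `EdgeLabeledGraph`, the method `follow`, and all node identifiers and labels are hypothetical, chosen only to show how objects with missing or differing attributes fit into one graph and how a simple path of labels might be evaluated by plain traversal.

```python
from collections import defaultdict


class EdgeLabeledGraph:
    """Hypothetical semistructured database: nodes are objects,
    labeled edges are attributes or values."""

    def __init__(self):
        # Maps a source node to a list of (label, target) pairs.
        self.edges = defaultdict(list)

    def add_edge(self, source, label, target):
        self.edges[source].append((label, target))

    def follow(self, start, labels):
        """Return the set of nodes reached from `start` by following
        the given sequence of edge labels, one label per step."""
        frontier = {start}
        for label in labels:
            frontier = {
                target
                for node in frontier
                for edge_label, target in self.edges.get(node, [])
                if edge_label == label
            }
        return frontier


# Invented city data (an assumption, not the content of Figure 1).
db = EdgeLabeledGraph()
db.add_edge("root", "restaurant", "r1")
db.add_edge("root", "restaurant", "r2")
db.add_edge("r1", "name", "Chez Anne")
db.add_edge("r1", "phone", "555-0101")
db.add_edge("r2", "name", "Blue Door")
# r2 has no phone and r1 has no address: missing attributes are allowed.
db.add_edge("r2", "address", "a1")
db.add_edge("a1", "street", "Main St")

names = db.follow("root", ["restaurant", "name"])
phones = db.follow("root", ["restaurant", "phone"])
```

Here the two restaurant objects carry different attribute sets, which a fixed relational schema would not accommodate directly. The traversal in `follow` visits edges step by step from the root; it is the naive baseline that an index for path expressions would aim to avoid.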


Keywords: Equivalence Class, Index Structure, Regular Expression, Outgoing Edge, Label Graph
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Tova Milo (1)
  • Dan Suciu (2)
  1. Tel Aviv University, Israel
  2. AT&T Labs, Israel
