A Full and Linear Index of a Tree for Tree Patterns

  • Jan Janoušek
  • Bořivoj Melichar
  • Radomír Polách
  • Martin Poliak
  • Jan Trávníček
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8614)

Abstract

A new and simple method of indexing a tree for tree patterns is presented. A tree pattern is a tree whose leaves can be labelled by a special symbol S, which serves as a placeholder for any subtree. Given a subject tree T with n nodes, the tree is preprocessed and an index is constructed, consisting of a standard compact suffix automaton for strings and a subtree jump table. The number of distinct tree patterns that match the tree is \(\mathcal{O}(2^n)\), while the size of the index is \(\mathcal{O}(n)\). The searching phase uses the index to read an input tree pattern P of size m and compute the list of positions of all occurrences of P in T. For an input tree pattern P with linear prefix notation \(pref(P) = P_1 S P_2 S \ldots S P_k\), \(k \geq 1\), searching takes time \(\mathcal{O}(m + \sum\limits_{i=1}^k |occ(P_i)|)\), where \(occ(P_i)\) is the set of all occurrences of \(P_i\) in \(pref(T)\).
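To make the notions of prefix notation, the placeholder S and the subtree jump table concrete, the following is a minimal sketch in Python. It is not the index described in the abstract: instead of a compact suffix automaton it uses a naive scan over every position of pref(T), so it runs in O(n·m) time rather than the stated bound. The representation of a node as a (label, arity) pair, the use of `None` for S, and the names `subtree_jump_table` and `occurrences` are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Node = Tuple[str, int]   # assumed encoding: (label, arity) of one node
S = None                 # assumed stand-in for the placeholder symbol S


def subtree_jump_table(pref_t: List[Node]) -> List[int]:
    """For each position i in pref(T), return the position just past
    the end of the subtree whose root is at i."""
    n = len(pref_t)
    jump = [0] * n
    # Process right to left so the entries of all children are ready.
    for i in range(n - 1, -1, -1):
        end = i + 1
        for _ in range(pref_t[i][1]):   # skip over each child subtree
            end = jump[end]
        jump[i] = end
    return jump


def occurrences(pref_t: List[Node],
                pref_p: List[Optional[Node]]) -> List[int]:
    """Naive matcher: try the pattern at every position of pref(T).
    Returns 0-based positions of the roots of all matching subtrees."""
    n = len(pref_t)
    jump = subtree_jump_table(pref_t)
    found = []
    for start in range(n):
        pos = start
        for sym in pref_p:
            if pos >= n:                # guard against malformed patterns
                break
            if sym is S:
                pos = jump[pos]         # S matches one whole subtree
            elif pref_t[pos] == sym:
                pos += 1                # same label and arity: advance
            else:
                break
        else:
            found.append(start)
    return found


# Example tree a2(a2(a0, a1(a0)), a1(a0)) in prefix notation.
t = [("a", 2), ("a", 2), ("a", 0), ("a", 1), ("a", 0), ("a", 1), ("a", 0)]
# Pattern a2(S, a1(S)), i.e. pref(P) = a2 S a1 S.
p = [("a", 2), S, ("a", 1), S]
print(subtree_jump_table(t))  # [7, 5, 3, 5, 5, 7, 7]
print(occurrences(t, p))      # [0, 1]
```

The jump table is what lets a single S in the pattern consume an arbitrarily large subtree of T in one step; a real index would use it together with the occurrences of the S-free factors of pref(P) found by the suffix automaton.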

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Jan Janoušek (1)
  • Bořivoj Melichar (1)
  • Radomír Polách (1)
  • Martin Poliak (1)
  • Jan Trávníček (1)

  1. Department of Theoretical Computer Science, Faculty of Information Technology, Czech Technical University in Prague, Prague 6, Czech Republic
