Approximate word sequence matching over Sparse Suffix Trees

  • Knut Magne Risvik
Session II
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1448)

Abstract

In this paper, we discuss word sequence matching and adapt the common edit distance metric for approximate string matching to searching for words and sequences of words. We furthermore create a variant of the Sparse Suffix Tree [3] and adapt algorithms for approximate word and word sequence matching to this variant. The algorithms have been implemented and tested in a WWW information retrieval environment, and performance data is presented.
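
As a rough illustration of the idea of applying edit distance to words as well as characters, the following minimal sketch computes the classic unit-cost edit distance over any pair of sequences. It is not the paper's algorithm: the function name `edit_distance`, the unit costs, and the whitespace tokenization are assumptions made for this example, and the abstract does not specify the cost model or how the Sparse Suffix Tree variant is searched.

```python
def edit_distance(a, b):
    """Unit-cost edit distance between two sequences.

    Works on strings (character level) and on lists of words
    (word level). Hypothetical helper, not taken from the paper.
    """
    # prev[j] holds the distance between the prefix of a processed
    # so far and the first j elements of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty sequence
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x from a
                curr[j - 1] + 1,     # insert y into a
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


# Character level: one deleted letter.
char_dist = edit_distance("suffix", "sufix")

# Word level: each word is a single symbol (assumed whitespace tokenization).
word_dist = edit_distance(
    "sparse suffix tree".split(),
    "sparse sufix trees".split(),
)
```

In this sketch a misspelled word counts as a full word substitution at the word level. Whether and how character-level distances contribute to the word-level cost is a design choice the abstract leaves open.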

Keywords

Leaf Node; Edit Distance; Semantic Interpretation; Suffix Tree; Edit Operation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  [1] Cobbs A.L. (1995) “Fast Approximate Matching using Suffix Trees,” In Proceedings of the Sixth Symposium on Combinatorial Pattern Matching (CPM'95), Springer Verlag, pp. 41–54.
  [2] Gonnet G.H., Baeza-Yates R.A., Snider T. (1991) “Lexicographical indices for text: Inverted files vs. PAT trees,” Technical Report OED-91-10, Center for the New OED, University of Waterloo.
  [3] Kärkkäinen J., Ukkonen E. (1996) “Sparse Suffix Trees,” In Proceedings of the Second Annual International Computing and Combinatorics Conference (COCOON 96), Springer Verlag, pp. 219–230.
  [4] Levenshtein V.I. (1965) “Binary codes capable of correcting deletions, insertions, and reversals,” (Russian) Doklady Akademii nauk SSSR, Vol. 163, No. 4, pp. 845–848 (also Cybernetics and Control Theory, Vol. 10, No. 8, pp. 707–710, 1966).
  [5] Morrison D.R. (1968) “PATRICIA — Practical Algorithm To Retrieve Information Coded in Alphanumeric,” Journal of the ACM, Vol. 15, pp. 514–534.
  [6] Shang H., Merrett T.H. (1996) “Tries for Approximate String Matching,” IEEE Transactions on Knowledge and Data Engineering, Vol. 8, No. 4, pp. 540–547.
  [7] Ukkonen E. (1985) “Finding Approximate Patterns in Strings,” Journal of Algorithms, Vol. 6, pp. 132–137.
  [8] Weiner P. (1973) “Linear pattern matching algorithms,” In Proceedings of the IEEE 14th Annual Symposium on Switching and Automata Theory, pp. 1–11.

Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • Knut Magne Risvik
    Department of Computer and Information Science, the Norwegian University of Science and Technology, Trondheim, Norway
