Noisy subsequence recognition using constrained string editing involving substitutions, insertions, deletions and generalized transpositions

  • Session IA1b — Feature Matching & Detection
  • Conference paper
Image Analysis Applications and Computer Graphics (ICSC 1995)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1024)

Abstract

We consider a problem whose solution can greatly benefit cursive script recognition and the recognition of printed character sequences. The problem involves recognizing words/strings by processing their noisy subsequences. Let X* be any unknown word from a finite dictionary H, and let U be any arbitrary subsequence of X*. We study the problem of estimating X* by processing Y, a noisy version of U. Y contains substitution, insertion, deletion and generalized transposition errors, the latter occurring when transposed characters are themselves subsequently substituted. We solve the noisy subsequence recognition problem by defining and using the constrained edit distance between X ∈ H and Y, subject to any arbitrary edit constraint involving the number and type of edit operations to be performed. An algorithm to compute this constrained edit distance is presented. Using this algorithm we present a syntactic Pattern Recognition (PR) scheme which corrects noisy text containing all these types of errors. Experimental results involving strings of lengths between 40 and 80, with an average of 30.24 deleted characters and an overall average noise of 68.69%, demonstrate the superiority of our system over existing methods.
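The idea of a constrained edit distance can be illustrated with a small dynamic program. The sketch below is not the authors' algorithm, which the abstract does not spell out. It assumes unit costs, a constraint expressed only as a set of permitted insertion counts, and a generalized transposition priced as a swap cost plus the cost of substituting each of the two swapped characters. The names `constrained_distance`, `recognize`, `allowed_insertions` and the cost parameters are hypothetical.

```python
from math import inf


def constrained_distance(x, y, allowed_insertions,
                         sub_cost=1.0, indel_cost=1.0, swap_cost=1.0):
    """Cheapest edit of x into y whose insertion count lies in allowed_insertions.

    Assumed cost model (illustrative only): substitutions, insertions and
    deletions have fixed costs; a generalized transposition of two adjacent
    characters costs swap_cost plus a substitution cost for each character.
    """
    n, m = len(x), len(y)

    def s(a, b):
        # Substituting a character by itself is free.
        return 0.0 if a == b else sub_cost

    # d[i][j][k]: cheapest edit of x[:i] into y[:j] using exactly k insertions.
    d = [[[inf] * (m + 1) for _ in range(m + 1)] for _ in range(n + 1)]
    d[0][0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            for k in range(j + 1):
                if i == 0 and j == 0:
                    continue  # base case already set
                best = inf
                if i > 0:  # delete x[i-1]
                    best = min(best, d[i - 1][j][k] + indel_cost)
                if j > 0 and k > 0:  # insert y[j-1]
                    best = min(best, d[i][j - 1][k - 1] + indel_cost)
                if i > 0 and j > 0:  # substitute x[i-1] by y[j-1]
                    best = min(best, d[i - 1][j - 1][k] + s(x[i - 1], y[j - 1]))
                if i > 1 and j > 1:  # swap x[i-2], x[i-1], then substitute both
                    best = min(best, d[i - 2][j - 2][k] + swap_cost
                               + s(x[i - 1], y[j - 2]) + s(x[i - 2], y[j - 1]))
                d[i][j][k] = best
    return min((d[n][m][k] for k in allowed_insertions if 0 <= k <= m),
               default=inf)


def recognize(y, dictionary, allowed_insertions):
    """Return the dictionary word with the smallest constrained distance to y."""
    return min(dictionary,
               key=lambda x: constrained_distance(x, y, allowed_insertions))


if __name__ == "__main__":
    print(constrained_distance("abcd", "abdc", {0}))
    print(recognize("srting", ["pattern", "string", "subsequence"], {0}))
```

Fixing the number of insertions also fixes the number of deletions, because every character of Y that is not inserted must come from a substituted or transposed character of X. A fuller constraint on the number and type of operations would need additional indices in the table.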

Partially supported by the Natural Sciences and Engineering Research Council of Canada.

Editor information

Roland T. Chin, Horace H. S. Ip, Avi C. Naiman, Ting-Chuen Pong

Copyright information

© 1995 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Oommen, B.J., Loke, R.K.S. (1995). Noisy subsequence recognition using constrained string editing involving substitutions, insertions, deletions and generalized transpositions. In: Chin, R.T., Ip, H.H.S., Naiman, A.C., Pong, TC. (eds) Image Analysis Applications and Computer Graphics. ICSC 1995. Lecture Notes in Computer Science, vol 1024. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60697-1_94

  • DOI: https://doi.org/10.1007/3-540-60697-1_94

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-60697-0

  • Online ISBN: 978-3-540-49298-6

  • eBook Packages: Springer Book Archive
