Spontaneous Handwriting Text Recognition and Classification Using Finite-State Models

  • Alejandro Héctor Toselli
  • Moisés Pastor
  • Alfons Juan
  • Enrique Vidal
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3523)

Abstract

Finite-state models are used to implement a handwritten text recognition and classification system for a real application entailing casual, spontaneous writing with a large vocabulary. Short handwritten phrases, which involve a wide variety of writing styles and contain many non-textual artifacts, are to be classified into a small number of predefined classes. To this end, two different statistical frameworks for phrase recognition and classification are considered, both based on finite-state models. HMMs are used for the text recognition process. Depending on the architecture considered, N-grams are used either for performing text recognition followed by text classification (serial approach) or for performing both simultaneously (integrated approach). A multinomial text classifier is also employed in the classification phase of the serial approach. Experimental results are reported which, given the extreme difficulty of the task, are encouraging.
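As an illustration of the classification phase of the serial approach, the following is a minimal sketch of a multinomial text classifier applied to an already recognized word sequence. It is not the authors' implementation: the abstract does not specify the smoothing method or the class set, so Laplace (add-alpha) smoothing, the class labels, the training phrases, and the function names `train_multinomial` and `classify` are all assumptions made for the example.

```python
import math
from collections import Counter


def train_multinomial(labelled_phrases, alpha=1.0):
    """Estimate class log-priors and add-alpha smoothed log word probabilities.

    labelled_phrases: list of (list_of_words, class_label) pairs.
    alpha: smoothing constant (an assumption; the paper may smooth differently).
    """
    vocab = {w for words, _ in labelled_phrases for w in words}
    class_counts = Counter(label for _, label in labelled_phrases)
    word_counts = {c: Counter() for c in class_counts}
    for words, label in labelled_phrases:
        word_counts[label].update(words)

    total = sum(class_counts.values())
    model = {}
    for c, n_c in class_counts.items():
        # One extra vocabulary slot is reserved for unseen words.
        denom = sum(word_counts[c].values()) + alpha * (len(vocab) + 1)
        log_probs = {w: math.log((word_counts[c][w] + alpha) / denom) for w in vocab}
        log_unk = math.log(alpha / denom)
        model[c] = (math.log(n_c / total), log_probs, log_unk)
    return model


def classify(model, words):
    """Return the class maximising log P(c) + sum over words of log P(w | c)."""
    def score(c):
        log_prior, log_probs, log_unk = model[c]
        return log_prior + sum(log_probs.get(w, log_unk) for w in words)
    return max(model, key=score)


# Hypothetical training phrases and class labels, for illustration only.
TRAIN = [
    ("i want to change my phone number".split(), "change"),
    ("please cancel my contract".split(), "cancel"),
    ("cancel the service".split(), "cancel"),
    ("change of address".split(), "change"),
]
model = train_multinomial(TRAIN)

# A recognizer output containing a misrecognized word ("plese").
print(classify(model, "plese cancel contract".split()))
```

In a serial architecture of this kind, the word sequence passed to `classify` would come from the HMM-based recognizer, so recognition errors propagate into the classifier; the integrated approach mentioned in the abstract avoids this hard hand-off by performing both steps at once.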

Keywords

Blank Space · Word Error Rate · Handwriting Recognition · Recognition Phase · Serial Approach
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Alejandro Héctor Toselli (1)
  • Moisés Pastor (1)
  • Alfons Juan (1)
  • Enrique Vidal (1)
  1. Instituto Tecnológico de Informática and Departamento de Sistemas Informáticos y Computación (ITI/DSIC), Universidad Politécnica de Valencia, Valencia, Spain
