Handwritten word recognition using Web resources and recurrent neural networks

Abstract

Handwriting recognition systems usually rely on static dictionaries and language models. These dictionaries generally cannot cover unrestricted document corpora in full because of Out-Of-Vocabulary (OOV) words. We propose an approach that uses the World Wide Web as a corpus to improve dictionary coverage. We exploit the very large and freely available Wikipedia corpus to build dynamic dictionaries on the fly. We rely on recurrent neural network (RNN) recognizers, with and without linguistic resources, to detect words that are not reliably recognized within a word sequence. Such words are labeled as non-anchor words (NAWs) and include both OOVs and In-Vocabulary words recognized with low confidence. To recognize a NAW, a dynamic dictionary is built by selecting words from the Web resource according to their string similarity with the character sequence recognized from the NAW image and their linguistic relevance in the NAW context. Similarity is evaluated by computing the edit distance between the Wikipedia words and the sequence of characters generated by the RNN recognizer used as a filler model. Linguistic relevance is based on an N-gram language model estimated from the Wikipedia corpus. Experiments conducted on a word-segmented version of the publicly available RIMES database show that the proposed approach can improve recognition accuracy compared to systems based on static dictionaries only. The improvement is more pronounced as the proportion of OOVs increases, in terms of both accuracy and dictionary coverage.
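
The dictionary-building step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain Levenshtein distance with unit costs, a toy vocabulary standing in for the Wikipedia word list, and a hypothetical ranking rule in which bigram counts with the left context word only break ties between candidates at the same edit distance. The names `levenshtein`, `dynamic_dictionary`, `vocabulary` and `bigram_counts` are illustrative.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def dynamic_dictionary(char_sequence, vocabulary, left_word, bigram_counts, size=3):
    """Select the `size` best candidates for a non-anchor word.

    Assumed ranking (illustrative only): smallest edit distance to the
    recognized character sequence first, then highest bigram count with
    the preceding word, then alphabetical order for determinism.
    """
    def rank(word):
        return (levenshtein(char_sequence, word),
                -bigram_counts[(left_word, word)],
                word)
    return sorted(vocabulary, key=rank)[:size]


# Toy data: a stand-in for Wikipedia words and bigram counts (assumed values).
vocabulary = ["bonjour", "bonheur", "toujours", "bonsoir", "jour"]
bigram_counts = Counter({("grand", "bonheur"): 5, ("grand", "jour"): 3})

# "bonjeur" plays the role of the character string output by the filler model.
candidates = dynamic_dictionary("bonjeur", vocabulary, "grand", bigram_counts)
print(candidates)
```

In this sketch the resulting short list would then be passed to a dictionary-driven recognizer as its lexicon. How similarity and linguistic relevance are actually combined is not specified in the abstract, so the tie-breaking rule above should be read as one possible choice.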

Acknowledgments

The authors wish to acknowledge ITESOFT/YOOZ, which supported this work by funding C. Oprean's PhD. We also thank Pascal Vaillant of Paris 13 University for fruitful discussions on NLP approaches, and Alex Graves, Marcus Liwicki, Volkmar Frinken and Andreas Fischer for making the RNN library available.

Author information

Corresponding author

Correspondence to Laurence Likforman-Sulem.

About this article

Cite this article

Oprean, C., Likforman-Sulem, L., Popescu, A. et al. Handwritten word recognition using Web resources and recurrent neural networks. IJDAR 18, 287–301 (2015). https://doi.org/10.1007/s10032-015-0251-1

Keywords

  • Handwritten word recognition
  • Out-Of-Vocabulary word
  • Web resources
  • Dynamic dictionary
  • Recurrent neural networks