A Lexicon and Depth-Wise Separable Convolution Based Handwritten Text Recognition System

  • Conference paper
Image and Vision Computing (IVCNZ 2022)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13836))

Abstract

Cursive handwritten text recognition is a challenging research problem in pattern recognition. Current state-of-the-art approaches include models based on convolutional recurrent neural networks and on multi-dimensional long short-term memory recurrent neural networks. These methods are computationally expensive, and the models are complex at the design level. Recent studies have shown that models combining convolutional neural networks with gated convolutional neural networks need fewer parameters than models based on convolutional recurrent neural networks. To reduce the total number of trainable parameters, in this work we use depthwise separable convolutions in place of standard convolutions, combined with a gated convolutional neural network and a bidirectional gated recurrent unit. We also include a lexicon-based word beam search decoder at the testing step, which helps improve the overall accuracy of the model. We obtain a 3.84% character error rate and 9.40% word error rate on the IAM dataset, a 3.15% character error rate and 11.8% word error rate on the RIMES dataset, and a 4.88% character error rate and 14.56% word error rate on the George Washington dataset.
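The parameter saving that motivates depthwise separable convolution can be illustrated with a short calculation. The sketch below is not the authors' code: the function names and the layer sizes (64 input channels, 128 output channels, a 3 × 3 kernel) are hypothetical, and bias terms are left out for simplicity. It only counts the weights of a standard convolution against those of a depthwise filter followed by a 1 × 1 pointwise convolution.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights of a standard k x k convolution mapping c_in to c_out channels."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights of a depthwise k x k stage plus a 1 x 1 pointwise stage."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes chosen for illustration, not taken from the paper.
    c_in, c_out, k = 64, 128, 3
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.2f}")
```

For these assumed sizes the standard layer has 73,728 weights and the separable one 8,768, roughly an eightfold reduction. The actual saving in any model depends on its channel widths and kernel sizes.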

Notes

  1. https://github.com/arthurflor23/handwritten-text-recognition

  2. https://github.com/githubharald/CTCWordBeamSearch

Acknowledgements

This research is funded by the Government of India, University Grants Commission, under the Senior Research Fellowship scheme.

Author information

Corresponding author

Correspondence to Anuj Sharma.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Kumari, L., Singh, S., Rathore, V.V.S., Sharma, A. (2023). A Lexicon and Depth-Wise Separable Convolution Based Handwritten Text Recognition System. In: Yan, W.Q., Nguyen, M., Stommel, M. (eds) Image and Vision Computing. IVCNZ 2022. Lecture Notes in Computer Science, vol 13836. Springer, Cham. https://doi.org/10.1007/978-3-031-25825-1_32

  • DOI: https://doi.org/10.1007/978-3-031-25825-1_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25824-4

  • Online ISBN: 978-3-031-25825-1

  • eBook Packages: Computer Science, Computer Science (R0)
