Abstract
Cursive handwritten text recognition is a challenging research problem in pattern recognition. Current state-of-the-art approaches include models based on convolutional recurrent neural networks and multi-dimensional long short-term memory recurrent neural networks. These methods are computationally expensive, and the models are complex at the design level. In recent studies, models that combine convolutional neural networks with gated convolutional neural networks have required fewer parameters than models based on convolutional recurrent neural networks. To reduce the total number of trainable parameters further, in this work we use depthwise separable convolutions in place of standard convolutions, in combination with a gated convolutional neural network and a bidirectional gated recurrent unit. Additionally, we include a lexicon-based word beam search decoder at the testing step, which helps improve the overall accuracy of the model. We obtain a 3.84% character error rate and 9.40% word error rate on the IAM dataset, a 3.15% character error rate and 11.8% word error rate on the RIMES dataset, and a 4.88% character error rate and 14.56% word error rate on the George Washington dataset.
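The parameter saving from replacing a standard convolution with a depthwise separable one can be illustrated with a short calculation. The sketch below is not taken from the paper's architecture: the kernel size and channel counts are illustrative assumptions, and the helper names are hypothetical. It only counts weights and biases for a single 2D convolutional layer under the usual definitions (a depthwise convolution with one k x k filter per input channel, followed by a 1 x 1 pointwise convolution).

```python
def standard_conv_params(k: int, c_in: int, c_out: int, bias: bool = True) -> int:
    """Parameters of a standard k x k convolution mapping c_in to c_out channels."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def depthwise_separable_conv_params(k: int, c_in: int, c_out: int, bias: bool = True) -> int:
    """Parameters of a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    # Depthwise step: one k x k filter per input channel.
    depthwise = k * k * c_in + (c_in if bias else 0)
    # Pointwise step: a 1 x 1 convolution that mixes channels.
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


if __name__ == "__main__":
    # Illustrative layer sizes (assumed, not reported in the paper).
    k, c_in, c_out = 3, 64, 128
    std = standard_conv_params(k, c_in, c_out)
    sep = depthwise_separable_conv_params(k, c_in, c_out)
    print(f"standard conv:            {std} parameters")
    print(f"depthwise separable conv: {sep} parameters")
    print(f"reduction factor:         {std / sep:.2f}x")
```

Ignoring biases, the ratio of the two counts is 1/c_out + 1/k², so for 3 x 3 kernels and wide layers the separable form needs roughly one ninth of the parameters of the standard layer.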
Acknowledgements
This research is funded by the Government of India, University Grants Commission, under the Senior Research Fellowship scheme.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kumari, L., Singh, S., Rathore, V.V.S., Sharma, A. (2023). A Lexicon and Depth-Wise Separable Convolution Based Handwritten Text Recognition System. In: Yan, W.Q., Nguyen, M., Stommel, M. (eds) Image and Vision Computing. IVCNZ 2022. Lecture Notes in Computer Science, vol 13836. Springer, Cham. https://doi.org/10.1007/978-3-031-25825-1_32
DOI: https://doi.org/10.1007/978-3-031-25825-1_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25824-4
Online ISBN: 978-3-031-25825-1
eBook Packages: Computer Science, Computer Science (R0)