Abstract
Optical Character Recognition (OCR) has been an active research field since the 1970s, and numerous studies have been conducted on it. The task is to convert images of text into editable text. Recent advances in deep learning have accelerated progress in this field, particularly for languages with large annotated datasets. Bangla, a language with a large number of character classes and complex cursive character shapes, has unfortunately not benefited from these advances because it lacks a large annotated dataset. This work focuses on performing OCR on Bangla text in noisy conditions. We have created a dataset of 5000 noisy Bangla text samples. To augment this small collection, we pre-train our proposed end-to-end model on synthetically generated data and then optionally fine-tune it on part of the collected dataset. Our results indicate that noisy OCR is an extremely challenging task, and that the best results are obtained when models trained on synthetic data are fine-tuned with some real-world data.
Acknowledgements
This project is supported by the Institute for Energy, Environment, Research and Development (IEERD) of the University of Asia Pacific.
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Shopon, M., Diptu, N.A., Nabeel Mohammed (2020). End-to-End Optical Character Recognition Using Sythetic Dataset Generator for Noisy Conditions. In: Uddin, M.S., Bansal, J.C. (eds) Proceedings of International Joint Conference on Computational Intelligence. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-15-3607-6_41
DOI: https://doi.org/10.1007/978-981-15-3607-6_41
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-3606-9
Online ISBN: 978-981-15-3607-6
eBook Packages: Intelligent Technologies and Robotics (R0)