End-to-End Optical Character Recognition Using Sythetic Dataset Generator for Noisy Conditions

Shopon, Md.; Diptu, Nazmul Alam; Nabeel Mohammed

doi:10.1007/978-981-15-3607-6_41

Part of the book series: Algorithms for Intelligent Systems (AIS)

Abstract

Optical Character Recognition (OCR) has been an active research field since the 1970s, and a large body of work addresses it. The task is to convert images of text into editable text. Recent advances in deep learning have accelerated progress in this field, particularly for languages with large annotated datasets. Bangla, a language with a large number of character classes and complex cursive letter shapes, has unfortunately not benefited from these advances because it lacks a large annotated dataset. This work concentrates on OCR for Bangla text in noisy conditions. We have created a dataset of 5000 noisy Bangla text samples. To augment this small collection, we pre-train our proposed end-to-end model on synthetically generated data and then optionally fine-tune it on a part of the collected dataset. Our results indicate that noisy OCR is an extremely challenging task, and that the best results are obtained when models trained on synthetic data are fine-tuned with some real-world data.
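
The data strategy described in the abstract (pre-train on synthetic samples, then optionally fine-tune on a part of the real collection) can be sketched as a simple staging step. This is only an illustrative sketch and not the authors' pipeline: the function names `split_real_data` and `training_stages`, the `fine_tune_fraction` parameter, and the 20% value in the usage lines are all assumptions, since the abstract does not say how large the fine-tuning part is.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_real_data(
    real_samples: Sequence[T], fine_tune_fraction: float, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle the real samples and split them into (fine_tune, held_out)."""
    if not 0.0 <= fine_tune_fraction <= 1.0:
        raise ValueError("fine_tune_fraction must be in [0, 1]")
    shuffled = list(real_samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * fine_tune_fraction)
    return shuffled[:cut], shuffled[cut:]


def training_stages(
    synthetic_samples: Sequence[T],
    real_samples: Sequence[T],
    fine_tune_fraction: float = 0.0,
    seed: int = 0,
) -> Tuple[List[Tuple[str, List[T]]], List[T]]:
    """Return the ordered training stages and the held-out real samples.

    Stage 1 always pre-trains on synthetic data. Stage 2 (fine-tuning on
    real data) is added only when the fine-tuning part is non-empty, which
    mirrors the "optionally fine-tune" wording of the abstract.
    """
    fine_tune, held_out = split_real_data(real_samples, fine_tune_fraction, seed)
    stages: List[Tuple[str, List[T]]] = [("pretrain", list(synthetic_samples))]
    if fine_tune:
        stages.append(("fine_tune", fine_tune))
    return stages, held_out


# Hypothetical usage: integer IDs stand in for the 5000 real noisy samples,
# and the synthetic set size (10000) is an arbitrary placeholder.
real_ids = list(range(5000))
synthetic_ids = [f"syn-{i}" for i in range(10000)]
stages, held_out = training_stages(synthetic_ids, real_ids, fine_tune_fraction=0.2)
```

Keeping the held-out real samples separate from the fine-tuning part means any evaluation on real noisy text uses samples the model has not seen in either stage.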

Acknowledgements

This project is supported by the Institute for Energy, Environment, Research and Development (IEERD) of the University of Asia Pacific.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, University of Asia Pacific, Dhaka, Bangladesh
Md. Shopon
Department of Electrical and Computer Engineering, North South University, Dhaka, Bangladesh
Nazmul Alam Diptu & Nabeel Mohammed
Center for Data Driven Computing (CDDC), North South University, Dhaka, Bangladesh
Nabeel Mohammed

Corresponding author

Correspondence to Md. Shopon.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Jahangirnagar University, Dhaka, Bangladesh
Mohammad Shorif Uddin
Department of Mathematics, South Asian University, New Delhi, India
Jagdish Chand Bansal

About this paper

Cite this paper

Shopon, M., Diptu, N.A., Nabeel Mohammed (2020). End-to-End Optical Character Recognition Using Sythetic Dataset Generator for Noisy Conditions. In: Uddin, M.S., Bansal, J.C. (eds) Proceedings of International Joint Conference on Computational Intelligence. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-15-3607-6_41

DOI: https://doi.org/10.1007/978-981-15-3607-6_41
Published: 23 May 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-3606-9
Online ISBN: 978-981-15-3607-6
eBook Packages: Intelligent Technologies and Robotics (R0)
