End-to-End Optical Character Recognition Using Sythetic Dataset Generator for Noisy Conditions

  • Conference paper
Proceedings of International Joint Conference on Computational Intelligence

Part of the book series: Algorithms for Intelligent Systems (AIS)

Abstract

Optical Character Recognition (OCR) has been an active research field since the 1970s, and numerous studies have been conducted on it. The task is to convert images of text into editable text. Recent advances in deep learning have accelerated progress in this field, particularly for languages with large annotated datasets. Bangla, a language with a large number of character classes and complex cursive alphabet shapes, has unfortunately not benefited from these advances because it lacks a large annotated dataset. This work concentrates on performing OCR on Bangla text in noisy conditions. We have created a dataset of 5000 noisy Bangla text samples. To augment this small collection, we pre-train our proposed end-to-end model on synthetically generated data and then optionally fine-tune it on a part of the collected dataset. Our results indicate that noisy OCR is an extremely challenging task, and that the best results are obtained when models trained on synthetic data are fine-tuned with some real-world data.
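The abstract does not describe how the synthetic samples are produced, so the following is only a minimal sketch of one common way to turn clean text images into noisy training pairs. It assumes salt-and-pepper noise as the corruption model, and the names add_salt_and_pepper, make_synthetic_pairs, flip_prob, and copies are hypothetical, not taken from the paper.

```python
import random


def add_salt_and_pepper(image, flip_prob, rng):
    """Return a noisy copy of a grayscale image (list of rows, values 0-255).

    Each pixel is independently replaced by pure black (0) or pure white (255)
    with probability flip_prob. The noise model is an assumption made for
    illustration; the paper's generator may work differently.
    """
    noisy = []
    for row in image:
        new_row = []
        for pixel in row:
            if rng.random() < flip_prob:
                new_row.append(rng.choice((0, 255)))
            else:
                new_row.append(pixel)
        noisy.append(new_row)
    return noisy


def make_synthetic_pairs(clean_images, labels, flip_prob=0.1, copies=3, seed=0):
    """Build (noisy_image, label) pairs from clean rendered text images.

    Every clean image yields `copies` differently corrupted versions that all
    share the clean image's ground-truth label.
    """
    rng = random.Random(seed)
    pairs = []
    for image, label in zip(clean_images, labels):
        for _ in range(copies):
            pairs.append((add_salt_and_pepper(image, flip_prob, rng), label))
    return pairs


# A blank 4x8 white "page" with a short dark stroke standing in for rendered text.
clean = [[255] * 8 for _ in range(4)]
clean[1][2:6] = [0, 0, 0, 0]

samples = make_synthetic_pairs([clean], ["placeholder-label"], flip_prob=0.2, copies=5)
```

In a setup like the one the abstract outlines, pairs of this kind would feed the pre-training stage, and a held-out part of the 5000 collected real samples would then be used for the optional fine-tuning stage.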

Acknowledgements

This project is supported by The Institute for Energy, Environment, Research and Development (IEERD) of University of Asia Pacific.

Author information

Corresponding author

Correspondence to Md. Shopon.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Shopon, M., Diptu, N.A., Mohammed, N. (2020). End-to-End Optical Character Recognition Using Sythetic Dataset Generator for Noisy Conditions. In: Uddin, M.S., Bansal, J.C. (eds) Proceedings of International Joint Conference on Computational Intelligence. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-15-3607-6_41
