Bilingual Documents Text Lines Extraction Using Conditional GANs

  • Conference paper
Machine Learning and Big Data Analytics (ICMLBDA 2022)

Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 401)

Abstract

Text line extraction is a major step in document recognition. Several classical approaches are available, such as projection profiles and bounding box analysis. These approaches are unable to segment text with large variations in individual handwriting. Furthermore, documents containing multiple scripts are harder to segment because of the different writing styles involved. Deep networks have been less explored in this domain because they need long training times and large amounts of data. In this research, we use conditional generative adversarial networks (GANs) for text line extraction in bilingual documents containing Gurumukhi and Latin scripts. The approach treats text line segmentation as an image-to-image translation task. Two kinds of encoder–decoder networks are compared: one with skip connections and one without. A dataset of 150 bilingual handwritten document images has been designed; it includes large variability in writing style and content. Results on this dataset show that the encoder–decoder network with skip connections extracts text lines effectively.
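
To make the limitation of the classical approaches concrete, the following is a minimal sketch of horizontal projection-profile segmentation on a binary image. It is an illustration only, not the authors' code: the function names `projection_profile` and `segment_lines`, the `min_ink` threshold, and the toy images are all assumptions introduced here.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink pixel, 0 = background


def projection_profile(image: Image) -> List[int]:
    """Count the ink pixels in each row of a binary image."""
    return [sum(row) for row in image]


def segment_lines(image: Image, min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (start_row, end_row) pairs, end inclusive, for each run of
    consecutive rows that contain at least `min_ink` ink pixels."""
    profile = projection_profile(image)
    lines: List[Tuple[int, int]] = []
    start = None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                      # a text line begins here
        elif ink < min_ink and start is not None:
            lines.append((start, y - 1))   # a blank row closes the line
            start = None
    if start is not None:                  # the last line reaches the bottom edge
        lines.append((start, len(profile) - 1))
    return lines


# Toy page (hypothetical): two straight text lines separated by a blank row.
straight = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

# Toy page (hypothetical): two slanted text lines. The first occupies
# row 0 (left half) and row 1 (right half); the second occupies
# row 1 (left half) and row 2 (right half). No row is blank.
slanted = [
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 1],
]

print(segment_lines(straight))  # two separate lines are found
print(segment_lines(slanted))   # the two lines are merged into one
```

On the straight page the blank row separates the two lines, whereas on the slanted page every row contains ink and the two lines come back as a single segment. This is the kind of failure on irregular handwriting that motivates a learned, image-to-image approach.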

Acknowledgements

This research was supported by the Council of Scientific and Industrial Research (CSIR), funded by the Ministry of Science and Technology (09/677(0031)/2018/EMR-I), and by the Government of India.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Kaur, S., Bawa, S., Kumar, R., Kumar, M. (2023). Bilingual Documents Text Lines Extraction Using Conditional GANs. In: Misra, R., Omer, R., Rajarajan, M., Veeravalli, B., Kesswani, N., Mishra, P. (eds) Machine Learning and Big Data Analytics. ICMLBDA 2022. Springer Proceedings in Mathematics & Statistics, vol 401. Springer, Cham. https://doi.org/10.1007/978-3-031-15175-0_5
