Bilingual Documents Text Lines Extraction Using Conditional GANs

Kaur, Sukhandeep; Bawa, Seema; Kumar, Ravinder; Kumar, Munish

doi:10.1007/978-3-031-15175-0_5

Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 401)

Included in the following conference series:

International Conference on Machine Learning and Big Data Analytics

Abstract

Text line extraction is a major step in document recognition. Several classical approaches are available, such as projection profiles and bounding box analysis, but they are unable to segment text with large variations in individual handwriting. Segmenting documents that contain data from multiple scripts is harder still because of the different writing styles involved. Deep networks have been less explored in this domain because they need large amounts of training time and data. In this research, we use conditional generative adversarial networks (GANs) for text line extraction in bilingual documents containing Gurumukhi-Latin scripts. The approach treats text line segmentation as an image-to-image translation task. Two kinds of encoder–decoder networks are compared: one with skip connections and one without. A dataset of 150 bilingual handwritten document images has been designed; it includes large variability in writing style and content. On this dataset, the encoder–decoder network with skip connections extracts text lines effectively.
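
For context, the classical projection-profile baseline mentioned in the abstract can be sketched in a few lines. This is only an illustrative sketch of that baseline, not the chapter's GAN-based method; the function name `projection_profile_lines`, the `min_ink` threshold, and the toy binary pages are assumptions introduced here.

```python
from typing import List, Tuple


def projection_profile_lines(
    image: List[List[int]], min_ink: int = 1
) -> List[Tuple[int, int]]:
    """Return (start_row, end_row) pairs, end exclusive, for runs of inked rows.

    `image` is a binary raster where 1 marks an ink pixel and 0 background.
    `min_ink` is a hypothetical threshold on ink pixels per row.
    """
    # Horizontal projection profile: number of ink pixels in each row.
    profile = [sum(row) for row in image]

    lines: List[Tuple[int, int]] = []
    start = None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y  # a text line begins at this row
        elif ink < min_ink and start is not None:
            lines.append((start, y))  # a blank row closes the current line
            start = None
    if start is not None:
        lines.append((start, len(profile)))  # line runs to the bottom edge
    return lines


# Toy page: two text lines separated by one blank row (row 3).
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
print(projection_profile_lines(page))  # [(1, 3), (4, 5)]

# Same page, but one stroke (e.g. a descender) crosses the gap row.
touching = [row[:] for row in page]
touching[3][2] = 1
print(projection_profile_lines(touching))  # [(1, 5)]
```

In the second call a single stray stroke merges the two lines into one. Touching or overlapping lines of this kind are one plausible way the handwriting variation described in the abstract defeats a profile-based method.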



Acknowledgements

This research was supported by the Council of Scientific and Industrial Research (CSIR), funded by the Ministry of Science and Technology (09/677(0031)/2018/EMR-I), as well as the Government of India.

Author information

Authors and Affiliations

Thapar Institute of Engineering and Technology, Patiala, Punjab, India
Sukhandeep Kaur, Seema Bawa & Ravinder Kumar
Maharaja Ranjit Singh Punjab Technical University, Bathinda, Punjab, India
Munish Kumar

Editor information

Editors and Affiliations

Dept. of Computer Science & Engineering, Indian Institute of Technology Patna, Patna, Bihar, India
Rajiv Misra
Cardiff University, Cardiff, UK
Rana Omer
Dept. of EE Engineering, University of London, London, UK
Muttukrishnan Rajarajan
Dept. of ECE, National University of Singapore, Singapore, Singapore
Bharadwaj Veeravalli
Dept. of Computer Science, Central University of Rajasthan, Tehsil Kishangarh, Rajasthan, India
Nishtha Kesswani
Dept. of CSE, Indian Institute of Information Technology, Kota, Jawahar Lal Nehru Marg, Rajasthan, India
Priyanka Mishra

Cite this paper

Kaur, S., Bawa, S., Kumar, R., Kumar, M. (2023). Bilingual Documents Text Lines Extraction Using Conditional GANs. In: Misra, R., Omer, R., Rajarajan, M., Veeravalli, B., Kesswani, N., Mishra, P. (eds) Machine Learning and Big Data Analytics. ICMLBDA 2022. Springer Proceedings in Mathematics & Statistics, vol 401. Springer, Cham. https://doi.org/10.1007/978-3-031-15175-0_5

DOI: https://doi.org/10.1007/978-3-031-15175-0_5
Published: 10 November 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15174-3
Online ISBN: 978-3-031-15175-0
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
