Abstract
Text line extraction is a major step in document recognition. Several classical approaches are available, such as projection profiles and bounding box analysis, but they cannot reliably segment text when individual handwriting varies widely. Documents that mix multiple scripts pose further hurdles because of the different writing styles involved. Deep networks have been less explored in this domain because they require long training times and large amounts of data. In this research, we use conditional generative adversarial networks (GANs) for text line extraction in bilingual documents containing Gurumukhi and Latin scripts. The approach treats text line segmentation as an image-to-image translation task. Two kinds of encoder–decoder networks are compared: one with skip connections and one without. A dataset of 150 bilingual handwritten document images has been designed, covering large variability in writing style and content. On this dataset, text line extraction is effective with the encoder–decoder network that uses skip connections.
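For context, the projection-profile baseline named in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `projection_profile` and `segment_lines`, the `min_ink` threshold, and the two toy pages are all hypothetical. It assumes a binarized page stored as rows of 0 (background) and 1 (ink).

```python
from typing import List, Tuple

Image = List[List[int]]  # binary page: 1 = ink pixel, 0 = background


def projection_profile(image: Image) -> List[int]:
    """Count the ink pixels in every row of a binary image."""
    return [sum(row) for row in image]


def segment_lines(image: Image, min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (start_row, end_row) pairs, end exclusive, one per text line.

    A text line is a maximal run of consecutive rows whose ink count
    is at least `min_ink` (a hypothetical threshold for this sketch).
    """
    profile = projection_profile(image)
    lines: List[Tuple[int, int]] = []
    start = None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                      # a new text line begins
        elif ink < min_ink and start is not None:
            lines.append((start, y))       # a blank row closes the line
            start = None
    if start is not None:                  # line runs to the bottom edge
        lines.append((start, len(profile)))
    return lines


# Toy page with two lines separated by a blank row.
clean_page = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 1],
    [0, 0, 0, 0, 0, 0],
]

# Toy page with two slanted lines that interleave: no row is blank.
slanted_page = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]

print(segment_lines(clean_page))    # [(1, 3), (4, 5)]
print(segment_lines(slanted_page))  # [(0, 4)]
```

On the clean page the blank row separates the two lines. On the slanted page every row contains ink, so the profile never drops to zero and both lines merge into one segment. This is the kind of failure under handwriting variation that the abstract attributes to classical approaches.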
Acknowledgements
This research was supported by the Council of Scientific and Industrial Research (CSIR), funded by the Ministry of Science and Technology (09/677(0031)/2018/EMR-I), as well as the Government of India.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kaur, S., Bawa, S., Kumar, R., Kumar, M. (2023). Bilingual Documents Text Lines Extraction Using Conditional GANs. In: Misra, R., Omer, R., Rajarajan, M., Veeravalli, B., Kesswani, N., Mishra, P. (eds) Machine Learning and Big Data Analytics. ICMLBDA 2022. Springer Proceedings in Mathematics & Statistics, vol 401. Springer, Cham. https://doi.org/10.1007/978-3-031-15175-0_5
DOI: https://doi.org/10.1007/978-3-031-15175-0_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15174-3
Online ISBN: 978-3-031-15175-0
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)