Text Line Extraction Using Fully Convolutional Network and Energy Minimization

  • Conference paper
Pattern Recognition. ICPR International Workshops and Challenges (ICPR 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12667)

Abstract

Text lines are key components of handwritten document images and are easier for downstream applications to analyze. Despite recent progress in text line detection, text line extraction from handwritten documents remains an unsolved task. This paper proposes a fully convolutional network for text line detection and energy minimization for text line extraction. Detected text lines are represented by blob lines that strike through the text lines, and these blob lines guide an energy function for text line extraction. The detection stage can locate arbitrarily oriented text lines. The extraction stage identifies the pixels of text lines with various heights and interline proximity, independent of their orientation. It can also finely split touching and overlapping text lines without assuming an orientation. We evaluate the proposed method on the VML-AHTE, VML-MOC, and Diva-HisDB datasets. The VML-AHTE dataset contains overlapping, touching, and close text lines with rich diacritics. The VML-MOC dataset is very challenging because of its multiply oriented and skewed text lines. The Diva-HisDB dataset exhibits distinct text line heights and touching text lines. The results demonstrate the effectiveness of the method across these challenges, with the same parameters used in all experiments.
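To make the extraction idea concrete, the following is a minimal sketch of assigning connected components to detected blob lines by minimizing an energy. It is not the paper's formulation: the abstract does not give the energy function or the solver, so the sketch assumes a common unary-plus-pairwise form (distance from a component to a blob line, plus a penalty when nearby components receive different labels) and minimizes it with iterated conditional modes, a simple local solver chosen here only for brevity. The function names, the neighbourhood size `k`, and the decay parameter `beta` are all hypothetical.

```python
import numpy as np

def data_cost(centroids, blob_lines):
    """Unary cost: distance from each component centroid to the
    nearest pixel of each blob line (assumed cost, not the paper's)."""
    cost = np.empty((len(centroids), len(blob_lines)))
    for l, pts in enumerate(blob_lines):
        diff = centroids[:, None, :] - pts[None, :, :]
        cost[:, l] = np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)
    return cost

def neighbor_pairs(centroids, k=2, beta=0.1):
    """Link each component to its k nearest components; closer pairs
    get a larger weight exp(-beta * distance) (hypothetical choice)."""
    d = np.linalg.norm(centroids[:, None] - centroids[None, :], axis=2)
    np.fill_diagonal(d, np.inf)
    pairs = set()
    for i in range(len(centroids)):
        for j in np.argsort(d[i])[:k]:
            pairs.add((min(i, int(j)), max(i, int(j))))
    pairs = sorted(pairs)
    weights = [float(np.exp(-beta * d[i, j])) for i, j in pairs]
    return pairs, weights

def energy(labels, cost, pairs, weights):
    """Total energy: unary costs plus weights of disagreeing pairs."""
    unary = cost[np.arange(len(labels)), labels].sum()
    pairwise = sum(w for (i, j), w in zip(pairs, weights)
                   if labels[i] != labels[j])
    return float(unary + pairwise)

def minimize_icm(cost, pairs, weights, n_iter=10):
    """Iterated conditional modes: greedily relabel one component at a
    time until no label changes. Finds a local minimum only."""
    n, n_labels = cost.shape
    labels = cost.argmin(axis=1)
    nbrs = [[] for _ in range(n)]
    for (i, j), w in zip(pairs, weights):
        nbrs[i].append((j, w))
        nbrs[j].append((i, w))
    for _ in range(n_iter):
        changed = False
        for i in range(n):
            local = cost[i].copy()
            for j, w in nbrs[i]:
                local += w * (np.arange(n_labels) != labels[j])
            best = int(local.argmin())
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels

# Toy page: two horizontal blob lines and seven component centroids (x, y).
xs = np.arange(50, dtype=float)
blob_lines = [np.stack([xs, np.full(50, 10.0)], axis=1),
              np.stack([xs, np.full(50, 30.0)], axis=1)]
centroids = np.array([[5, 9], [15, 11], [25, 12],
                      [10, 29], [20, 31], [30, 28],
                      [15, 18]], dtype=float)  # last one sits between lines

cost = data_cost(centroids, blob_lines)
pairs, weights = neighbor_pairs(centroids)
labels = minimize_icm(cost, pairs, weights)
print(labels.tolist(), energy(labels, cost, pairs, weights))
```

In this toy layout the last centroid lies between the two lines, which is the kind of case (a diacritic or a descender) where the pairwise term matters. A global solver would replace `minimize_icm` without changing the cost construction.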

Notes

  1. https://www.cs.bgu.ac.il/~berat/data/ahte_dataset.

  2. https://www.cs.bgu.ac.il/~berat/data/moc_dataset.

Acknowledgment

The authors would like to thank Gunes Cevik for annotating the ground truth. This work has been partially supported by the Frankel Center for Computer Science.

Corresponding author

Correspondence to Berat Kurar Barakat.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Barakat, B.K., Droby, A., Alaasam, R., Madi, B., Rabaev, I., El-Sana, J. (2021). Text Line Extraction Using Fully Convolutional Network and Energy Minimization. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12667. Springer, Cham. https://doi.org/10.1007/978-3-030-68787-8_9

  • DOI: https://doi.org/10.1007/978-3-030-68787-8_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-68786-1

  • Online ISBN: 978-3-030-68787-8

  • eBook Packages: Computer Science, Computer Science (R0)
