BINYAS: a complex document layout analysis system

Multimedia Tools and Applications

Abstract

Document layout analysis (DLA) is an essential prerequisite for any comprehensive document image processing and analysis system. Its main purpose is to segment an input document image into its constituent, coherent regions and to identify the class of each region. In this paper, we propose a DLA system, named BINYAS, that combines connected component (CC) analysis with pixel analysis. The system first identifies the regions and then classifies them as paragraph, separator, graphic, image, table, chart, inverted text, and so on. The proposed system is evaluated on four publicly available standard datasets, namely the ICDAR 2009, 2015, 2017, and 2019 page segmentation competition datasets, and its performance is compared with many contemporary methods, including some well-known software products and deep learning based methods. Experimental results show that our method performs significantly better than state-of-the-art methods in terms of the evaluation metrics commonly used by the research community in this domain.
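
To make the term "connected component" concrete, the following is a minimal sketch of generic 8-connected component extraction on a binarized page. It is not the BINYAS implementation, which the abstract does not detail. The function name `connected_components`, the list-of-lists image format (1 for foreground, 0 for background), and the bounding-box dictionary are assumptions made purely for illustration.

```python
from collections import deque


def connected_components(binary):
    """Return one bounding box (with pixel area) per 8-connected foreground blob.

    `binary` is a hypothetical input format: a list of rows, where 1 marks a
    foreground (ink) pixel and 0 marks background.
    """
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                # Visit all 8 neighbours (the centre pixel is already marked seen).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append({"top": top, "left": left,
                          "bottom": bottom, "right": right, "area": area})
    return boxes


# Toy 4x5 "page" with two separate blobs.
page = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
boxes = connected_components(page)
```

In a CC-based layout analysis pipeline, properties of such components (for example size, aspect ratio, or pixel density) would typically feed the later grouping and classification stages. Which features BINYAS actually uses is not stated in the abstract.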

Author information

Corresponding author

Correspondence to Showmik Bhowmik.

About this article

Cite this article

Bhowmik, S., Kundu, S. & Sarkar, R. BINYAS: a complex document layout analysis system. Multimed Tools Appl 80, 8471–8504 (2021). https://doi.org/10.1007/s11042-020-09832-3

  • DOI: https://doi.org/10.1007/s11042-020-09832-3
