Page segmentation using minimum homogeneity algorithm and adaptive mathematical morphology

  • Original Paper
  • Published in: International Journal on Document Analysis and Recognition (IJDAR)

Abstract

Document layout analysis, or page segmentation, is the task of decomposing a document image into regions such as text, images, separators, and tables. It remains a challenging problem because of the variety of document layouts. In this paper, we propose a novel hybrid method with three main stages. In the first stage, text and non-text elements are classified using the minimum homogeneity algorithm, which combines connected component analysis with a multilevel homogeneity structure. In the second stage, a new homogeneity structure is combined with adaptive mathematical morphology on the text document to obtain a set of text regions, while the elements of the non-text document are further classified into separator regions, table regions, image regions, and so on. In the final stage, the region refinement and noise detection process, all regions in both the text document and the non-text document are refined to eliminate noise and to obtain the geometric layout of each region. The proposed method was tested on the dataset of the ICDAR2009 page segmentation competition and on many other databases in different languages. In these tests, the proposed method achieved higher accuracy than the other methods, indicating its effectiveness and superiority.
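As a rough illustration of the pipeline shape described in the abstract (connected components, a text/non-text split, then grouping of text components into regions), the following minimal Python sketch may help. It is not the authors' minimum homogeneity algorithm or their adaptive morphology: the size-ratio rule, the greedy box merge used as a stand-in for morphological grouping, and every function name and parameter (`ratio`, `gap`) are assumptions made only for this example.

```python
from collections import deque
from statistics import median


def connected_components(img):
    """Bounding boxes (top, left, bottom, right) of 8-connected foreground blobs."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if not img[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def split_text_non_text(boxes, ratio=3.0):
    """Hypothetical rule: a box much larger than the median height is non-text."""
    if not boxes:
        return [], []
    med_h = median(b[2] - b[0] + 1 for b in boxes)
    text, non_text = [], []
    for b in boxes:
        height, width = b[2] - b[0] + 1, b[3] - b[1] + 1
        (non_text if max(height, width) > ratio * med_h else text).append(b)
    return text, non_text


def merge_into_lines(text_boxes, gap=1):
    """Greedy single-pass stand-in for morphological grouping: merge boxes that
    overlap vertically and are separated by at most `gap` background columns."""
    merged = []
    for b in sorted(text_boxes, key=lambda box: (box[0], box[1])):
        for i, m in enumerate(merged):
            overlaps_vertically = b[0] <= m[2] and m[0] <= b[2]
            near_horizontally = b[1] - m[3] - 1 <= gap and m[1] - b[3] - 1 <= gap
            if overlaps_vertically and near_horizontally:
                merged[i] = (min(m[0], b[0]), min(m[1], b[1]),
                             max(m[2], b[2]), max(m[3], b[3]))
                break
        else:
            merged.append(b)
    return merged


# Toy binary page: three 2x2 "characters" on one line and a full-width separator.
page = [[0] * 12 for _ in range(8)]
for r in (1, 2):
    for c in (1, 2, 4, 5, 7, 8):
        page[r][c] = 1
page[5] = [1] * 12

text, non_text = split_text_non_text(connected_components(page))
text_regions = merge_into_lines(text, gap=1)
print("text regions:", text_regions)
print("non-text elements:", non_text)
```

On a real scan, the fixed `ratio` and `gap` would likely need to adapt to local component statistics, which is the kind of adaptivity the abstract attributes to the proposed method.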

Acknowledgments

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (NRF-2015R1D1A3A01018993) and by the Ministry of Science, ICT & Future Planning (NRF-2015R1C1A1A02036495).

Author information

Corresponding author

Correspondence to Soo Hyung Kim.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Tran, T.A., Na, I.S. & Kim, S.H. Page segmentation using minimum homogeneity algorithm and adaptive mathematical morphology. IJDAR 19, 191–209 (2016). https://doi.org/10.1007/s10032-016-0265-3

  • DOI: https://doi.org/10.1007/s10032-016-0265-3
