Page segmentation using minimum homogeneity algorithm and adaptive mathematical morphology

  • Tuan Anh Tran
  • In Seop Na
  • Soo Hyung Kim
Original Paper

Abstract

Document layout analysis, or page segmentation, is the task of decomposing a document image into distinct regions such as text, images, separators, and tables. It remains a challenging problem because of the variety of document layouts. In this paper, we propose a novel hybrid method with three main stages. In the first stage, text and non-text elements are classified using the minimum homogeneity algorithm, which combines connected component analysis with a multilevel homogeneity structure. In the second stage, a new homogeneity structure is combined with adaptive mathematical morphology on the text document to obtain a set of text regions, while on the non-text document the non-text elements are further classified into separator regions, table regions, image regions, and so on. In the final stage, the region refinement and noise detection process, all regions in both the text document and the non-text document are refined to eliminate noise and obtain the geometric layout of each region. The proposed method was tested on the dataset of the ICDAR2009 page segmentation competition and on several other databases in different languages. In these tests, the proposed method achieved higher accuracy than the other methods compared, which supports its effectiveness.
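
The three-stage idea can be illustrated with a toy sketch. The code below is not the minimum homogeneity algorithm or the adaptive morphology described in the paper, whose details are not given in this abstract. It is a simplified stand-in built on assumptions: connected components are labeled on a binary image, components much taller than the median height are treated as non-text (an assumed heuristic with a hypothetical `ratio` parameter), and text components are merged into regions by horizontal run-length smoothing whose gap size adapts to the median component height. All function names are hypothetical.

```python
from collections import deque
from statistics import median


def connected_components(img):
    """Return bounding boxes (top, left, bottom, right) of 8-connected blobs."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if not img[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:  # breadth-first flood fill of one component
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def split_text_non_text(boxes, ratio=3.0):
    """Assumed heuristic: boxes taller than ratio * median height are non-text."""
    med = median(b[2] - b[0] + 1 for b in boxes)
    text = [b for b in boxes if b[2] - b[0] + 1 <= ratio * med]
    non_text = [b for b in boxes if b[2] - b[0] + 1 > ratio * med]
    return text, non_text, med


def smear_rows(img, max_gap):
    """Fill background runs of at most max_gap pixels between foreground pixels."""
    out = [row[:] for row in img]
    for row in out:
        last = None  # column of the previous foreground pixel in this row
        for c, value in enumerate(row):
            if value:
                if last is not None and 0 < c - last - 1 <= max_gap:
                    for k in range(last + 1, c):
                        row[k] = 1
                last = c
    return out


def segment(img, ratio=3.0):
    """Toy pipeline: classify components, then merge text boxes into regions."""
    boxes = connected_components(img)
    if not boxes:
        return [], []
    text, non_text, med = split_text_non_text(boxes, ratio)
    text_img = [[0] * len(img[0]) for _ in img]
    for top, left, bottom, right in text:  # paint text boxes as filled rectangles
        for r in range(top, bottom + 1):
            for c in range(left, right + 1):
                text_img[r][c] = 1
    # The smoothing gap adapts to the page: it equals the median component height.
    regions = connected_components(smear_rows(text_img, max_gap=med))
    return regions, non_text


# Toy page: three small "characters" on one line and one tall "image" block.
page = [[0] * 32 for _ in range(12)]
for left in (1, 5, 9):
    for r in range(1, 4):
        page[r][left] = page[r][left + 1] = 1
for r in range(12):
    for c in range(20, 30):
        page[r][c] = 1

text_regions, non_text = segment(page)
print("text regions:", text_regions)
print("non-text boxes:", non_text)
```

On this toy page the three character blobs merge into a single text region and the tall block is reported as non-text. A real system would replace the height heuristic with the homogeneity analysis and add the refinement and noise-removal stage, neither of which is attempted here.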

Keywords

  • Page segmentation
  • Document layout analysis
  • Homogeneity structure
  • OCR
  • Mathematical morphology
  • Recursive filter

Notes

Acknowledgments

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (NRF-2015R1D1A3A01018993) and by the Ministry of Science, ICT & Future Planning (NRF-2015R1C1A1A02036495).

Compliance with Ethical Standards

Conflict of interest

The authors declare that they have no conflict of interest.

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  1. School of Electronic and Computer Engineering, Chonnam National University, Gwangju, Republic of Korea
