Abstract
Document layout analysis, or page segmentation, is the task of decomposing a document image into regions such as text, images, separators, and tables. It remains a challenging problem because of the variety of document layouts. In this paper, we propose a novel hybrid method consisting of three main stages. In the first stage, text and non-text elements are classified using the minimum homogeneity algorithm, which combines connected component analysis with a multilevel homogeneity structure. In the second stage, a new homogeneity structure is combined with adaptive mathematical morphology on the text document to obtain a set of text regions, while on the non-text document the non-text elements are further classified into separator regions, table regions, image regions, etc. In the final stage, the region refinement and noise detection process, all regions in both the text document and the non-text document are refined to eliminate noise and obtain the geometric layout of each region. The proposed method was tested on the dataset of the ICDAR2009 page segmentation competition and on many other databases in different languages. The results show that the proposed method achieves higher accuracy than other methods, demonstrating its effectiveness and superiority.
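As a rough illustration of the connected component analysis that the first stage builds on, the sketch below labels 8-connected foreground regions in a small binary image and then splits their bounding boxes into text and non-text candidates. It is not the minimum homogeneity algorithm itself: the height-based rule, the `ratio` parameter, and the function names `connected_components` and `split_text_non_text` are assumptions introduced here purely for illustration.

```python
from collections import deque
from statistics import median


def connected_components(grid):
    """Return bounding boxes (top, left, bottom, right) of 8-connected
    foreground regions in a binary grid (1 = foreground, 0 = background)."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def split_text_non_text(boxes, ratio=3.0):
    """Hypothetical stand-in heuristic (not the paper's classifier):
    a box much taller than the median box height is treated as non-text."""
    heights = [b[2] - b[0] + 1 for b in boxes]
    typical = median(heights)
    text = [b for b, h in zip(boxes, heights) if h <= ratio * typical]
    non_text = [b for b, h in zip(boxes, heights) if h > ratio * typical]
    return text, non_text
```

A single global height threshold like this fails on pages that mix font sizes, which is the kind of case a multilevel, region-by-region analysis is meant to handle.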
Acknowledgments
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (NRF-2015R1D1A3A01018993) and by the Ministry of Science, ICT & Future Planning (NRF-2015R1C1A1A02036495).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Tran, T.A., Na, I.S. & Kim, S.H. Page segmentation using minimum homogeneity algorithm and adaptive mathematical morphology. IJDAR 19, 191–209 (2016). https://doi.org/10.1007/s10032-016-0265-3
DOI: https://doi.org/10.1007/s10032-016-0265-3