A hierarchical and scalable model for contemporary document image segmentation

Ouji, Asma; Leydier, Yann; LeBourgeois, Frank

doi:10.1007/s10044-012-0282-x

Industrial and Commercial Application
Published: 20 July 2012

Volume 16, pages 679–693, (2013)

Pattern Analysis and Applications

Abstract

In this paper, we introduce a novel color segmentation approach that is robust against digitization noise and adapted to contemporary document images. The system is scalable, hierarchical, versatile and completely automated, i.e. user-independent. It performs adaptive binarization/quantization without any detrimental information loss. The model may be used for many purposes. For instance, we rely on it to carry out the first steps leading to advertisement recognition in document images. Furthermore, the color segmentation output is used to localize text areas and enhance optical character recognition (OCR) performance. We ran tests on a variety of magazine images to show how our approach improves the results of the well-known OCR product ABBYY FineReader. We also obtain promising results with our ad detection system on a large set of test images with complex layouts.

Author information

Authors and Affiliations

Université de Lyon, CNRS, INSA-Lyon, LIRIS, UMR5205, 20 av. Albert Einstein, Villeurbanne, 69621, France
Asma Ouji, Yann Leydier & Frank LeBourgeois

Corresponding author

Correspondence to Asma Ouji.

About this article

Cite this article

Ouji, A., Leydier, Y. & LeBourgeois, F. A hierarchical and scalable model for contemporary document image segmentation. Pattern Anal Applic 16, 679–693 (2013). https://doi.org/10.1007/s10044-012-0282-x

Received: 24 June 2011
Accepted: 04 July 2012
Published: 20 July 2012
Issue Date: November 2013
DOI: https://doi.org/10.1007/s10044-012-0282-x
