Abstract
In this paper, we introduce a novel color segmentation approach that is robust against digitization noise and adapted to contemporary document images. The system is scalable, hierarchical, versatile and completely automated, i.e. user independent. It performs an adaptive binarization/quantization without any penalizing loss of information. The model can serve many purposes. For instance, we rely on it to carry out the first steps leading to advertisement recognition in document images. Furthermore, the color segmentation output is used to localize text areas and enhance optical character recognition (OCR) performance. We ran tests on a variety of magazine images to highlight our contribution to the well-known OCR product ABBYY FineReader. We also obtain promising results with our ad detection system on a large set of test images with complex layouts.
Cite this article
Ouji, A., Leydier, Y. & LeBourgeois, F. A hierarchical and scalable model for contemporary document image segmentation. Pattern Anal Applic 16, 679–693 (2013). https://doi.org/10.1007/s10044-012-0282-x
DOI: https://doi.org/10.1007/s10044-012-0282-x