A hierarchical and scalable model for contemporary document image segmentation

  • Industrial and Commercial Application
  • Pattern Analysis and Applications

Abstract

In this paper, we introduce a novel color segmentation approach that is robust against digitization noise and adapted to contemporary document images. The system is scalable, hierarchical, versatile and completely automated, i.e. user-independent. It performs an adaptive binarization/quantization without any penalizing loss of information. The model may serve many purposes. For instance, we rely on it to carry out the first steps leading to advertisement recognition in document images. Furthermore, the color segmentation output is used to localize text areas and to improve optical character recognition (OCR) performance. We ran tests on a variety of magazine images to highlight our contribution to the well-known OCR product ABBYY FineReader. We also obtain promising results with our ad detection system on a large set of test images with complex layouts.



Corresponding author

Correspondence to Asma Ouji.

About this article

Cite this article

Ouji, A., Leydier, Y. & LeBourgeois, F. A hierarchical and scalable model for contemporary document image segmentation. Pattern Anal Applic 16, 679–693 (2013). https://doi.org/10.1007/s10044-012-0282-x

  • DOI: https://doi.org/10.1007/s10044-012-0282-x
