Complex documents images segmentation based on steerable pyramid features

  • Mohamed Benjelil
  • Slim Kanoun
  • Rémy Mullot
  • Adel M. Alimi
Full Paper


Page segmentation and classification are essential steps in a document layout analysis system, since they prepare a page for OCR or any other subsequent processing. In this paper, we propose a system designed for the segmentation of complex documents. The system is based on the steerable pyramid transform. Features extracted from the pyramid sub-bands are used to locate regions and classify them as text (machine-printed or handwritten) or non-text (images, graphics, drawings or paintings) in noisy, deformed, multilingual, multi-script document images. These documents contain tabular structures, logos, stamps, handwritten script blocks, photographs, etc. The results obtained on a data set of 1,000 official complex document images are encouraging. We compare them with those of existing state-of-the-art methods, and the comparison shows that the proposed method performs consistently well on large sets of complex document images.
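As a rough illustration of what sub-band features can look like, the sketch below computes one energy value per scale and orientation. It is not the system described in this paper: it assumes first-derivative-of-Gaussian filters steered over a simple Gaussian pyramid as a stand-in for the steerable pyramid transform, and every function name and parameter (`gaussian_kernels`, `separable_filter`, `oriented_energies`, `n_orient`, `n_levels`) is hypothetical.

```python
import numpy as np

def gaussian_kernels(sigma=1.0, radius=3):
    """Return a 1-D Gaussian and its first derivative on [-radius, radius]."""
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x**2 / (2 * sigma**2))
    g /= g.sum()
    dg = -x / sigma**2 * g
    return g, dg

def separable_filter(img, k_rows, k_cols):
    """Apply k_cols along axis 1 (within each row), then k_rows along axis 0."""
    out = np.apply_along_axis(lambda r: np.convolve(r, k_cols, mode="same"), 1, img)
    out = np.apply_along_axis(lambda c: np.convolve(c, k_rows, mode="same"), 0, out)
    return out

def oriented_energies(img, n_orient=4, n_levels=3):
    """Mean squared response for each (level, orientation) pair.

    Assumption: a derivative-of-Gaussian basis on a Gaussian pyramid,
    used here only to illustrate the idea of oriented sub-band features.
    """
    g, dg = gaussian_kernels()
    feats = []
    level = np.asarray(img, dtype=float)
    for _ in range(n_levels):
        gx = separable_filter(level, g, dg)  # derivative along columns (x)
        gy = separable_filter(level, dg, g)  # derivative along rows (y)
        for k in range(n_orient):
            theta = np.pi * k / n_orient
            # Steering: the response at angle theta is a linear
            # combination of the two basis responses.
            band = np.cos(theta) * gx + np.sin(theta) * gy
            feats.append(np.mean(band**2))
        # Blur, then keep every second sample to form the next level.
        level = separable_filter(level, g, g)[::2, ::2]
    return np.array(feats)

# Usage: one feature vector per image block (here a synthetic striped block).
block = np.tile(np.sin(0.8 * np.arange(64)), (64, 1))
features = oriented_energies(block)
```

A vector of this kind, computed per block, could then be passed to any classifier to separate text from non-text regions. The choice of classifier is left open here.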


Keywords: Steerable pyramid · Complex document segmentation · Multi-resolution analysis · Invariant features





Copyright information

© Springer-Verlag 2010

Authors and Affiliations

  • Mohamed Benjelil (1, 2), corresponding author
  • Slim Kanoun (1)
  • Rémy Mullot (2)
  • Adel M. Alimi (1)
  1. REGIM–ENIS, Sfax, Tunisia
  2. L3I, University of La Rochelle, La Rochelle, France
