Texture sparseness for pixel classification of business document images

  • Melissa Cote
  • Alexandra Branzan AlbuEmail author
Original Paper


Contemporary business documents contain diverse, multi-layered mixtures of textual, graphical, and pictorial elements. Existing methods for document segmentation and classification do not cope well with this complexity and variety of contents, geometric layouts, and element shapes. This paper proposes a novel document image classification approach that uses support vector machines to assign each pixel to one of four fundamental classes (text, image, graphics, and background). The approach relies on a novel low-dimensional feature descriptor based on textural properties. The feature vector is built by measuring the sparseness of the document image's responses to a filter bank, computed at multiple resolutions and over the local context of each pixel. Qualitative and quantitative evaluations on business document images show the benefits of adopting a contextual and multi-resolution approach. The proposed approach achieves excellent results: it handles varied contents and complex document layouts without imposing constraints or making assumptions about the shape and spatial arrangement of document elements.
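As a rough illustration of the kind of descriptor described above, the sketch below computes a sparseness value over windows of several sizes around a pixel, for each filter response map. It is not the authors' implementation: the abstract does not state which sparseness measure, filter bank, or window sizes are used, so the choice of Hoyer's sparseness measure, the function names, and the `half_sizes` values are all assumptions made for this example.

```python
import math


def hoyer_sparseness(values):
    """Hoyer's sparseness: 0 for a uniform vector, 1 for a single nonzero entry.

    Assumption: this measure stands in for whatever sparseness measure
    the paper actually uses.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    l1 = sum(abs(v) for v in values)
    l2 = math.sqrt(sum(v * v for v in values))
    if l2 == 0.0:
        return 0.0  # convention chosen here for an all-zero window
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1.0)


def window_values(response, row, col, half_size):
    """Collect filter responses in a square window, clipped at the image border."""
    rows = range(max(0, row - half_size), min(len(response), row + half_size + 1))
    cols = range(max(0, col - half_size), min(len(response[0]), col + half_size + 1))
    return [response[r][c] for r in rows for c in cols]


def pixel_features(responses, row, col, half_sizes=(1, 2)):
    """One sparseness value per (filter response, window size) pair.

    `responses` is a list of 2D lists, one per filter in a hypothetical
    filter bank; `half_sizes` are hypothetical context sizes.
    """
    return [
        hoyer_sparseness(window_values(resp, row, col, h))
        for resp in responses
        for h in half_sizes
    ]


# A single isolated response is maximally sparse; a flat response is not.
spike = [[0.0] * 5 for _ in range(5)]
spike[2][2] = 3.0
flat = [[1.0] * 5 for _ in range(5)]
features = pixel_features([spike, flat], 2, 2)
```

Feature vectors built this way stay low-dimensional (number of filters times number of window sizes) and could then be passed to any multi-class SVM implementation for the four-class pixel labeling.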


Business documents, Document image segmentation, Pixel classification, Sparseness, Support vector machines, Texture


Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  1. Electrical and Computer Engineering, University of Victoria, Victoria, Canada
