Unsupervised Font Clustering Using Stochastic Version of the EM Algorithm and Global Texture Analysis

  • Carlos Avilés-Cruz
  • Juan Villegas
  • René Arechiga-Martínez
  • Rafael Escarela-Perez
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3287)


An unsupervised font clustering technique is proposed in this work. The approach is based on global texture analysis, using high-order statistical features, a Gaussian classifier, and a stochastic version of the EM algorithm. Font recognition is performed by treating the document as a simple image in which one or several fonts are present; identification is not performed letter by letter as in conventional approaches. A window analysis is employed to obtain the features of the document, using third- and fourth-order moments. The technique does not involve a study of local typography and is therefore content independent. A detailed study was performed with 8 fonts commonly used in the Spanish language. Each font can appear in four styles, giving 8 × 4 = 32 font combinations. With clean images, font recognition is 100% accurate.
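The window analysis mentioned above can be illustrated with a minimal sketch. The abstract does not define the exact features, so this example assumes the simplest reading: the third and fourth central moments of the grey levels inside each non-overlapping window. The function name `window_moment_features`, the window size, and the step are hypothetical choices for illustration, not the authors' settings.

```python
import numpy as np

def window_moment_features(image, win=32, step=32):
    """Return one (third, fourth) central-moment pair per analysis window.

    Assumption: features are central moments of the window's grey levels;
    the paper's actual high-order statistics may be defined differently.
    """
    img = np.asarray(image, dtype=float)
    feats = []
    for r in range(0, img.shape[0] - win + 1, step):
        for c in range(0, img.shape[1] - win + 1, step):
            patch = img[r:r + win, c:c + win]
            centered = patch - patch.mean()
            # Third-order moment (asymmetry) and fourth-order moment (tailedness)
            feats.append([np.mean(centered ** 3), np.mean(centered ** 4)])
    return np.array(feats)

# Synthetic stand-in for a binarised page: about 20% "ink" pixels
rng = np.random.default_rng(0)
page = (rng.random((128, 128)) < 0.2).astype(float)
features = window_moment_features(page)
print(features.shape)
```

Each row of `features` describes one window without reference to individual characters, which is the sense in which such a description is content independent. In a pipeline like the one the abstract outlines, these rows would then be grouped by a Gaussian mixture fitted with a stochastic EM variant; that step is not shown here.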


Keywords: Machine Intelligence, Order Moment, Document Image, Text Line, High Order Statistic
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Carlos Avilés-Cruz (1)
  • Juan Villegas (1)
  • René Arechiga-Martínez (1)
  • Rafael Escarela-Perez (1)
  1. Departamento de Electrónica, Universidad Autónoma Metropolitana – Azcapotzalco
