Maximum-Minimum Similarity Training for Text Extraction

  • Hui Fu
  • Xiabi Liu
  • Yunde Jia
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4234)


In this paper, we apply the maximum-minimum similarity (MMS) discriminative training criterion to improve text extraction based on Gaussian mixture modeling of neighbor characters. MMS training optimizes a recognizer by maximizing the similarities between observations and models of the same class while minimizing the similarities between observations and models of different classes. Based on this idea, we define a corresponding objective function for text extraction and minimize it with the gradient descent method to obtain the parameters of our text extraction method. Compared with maximum likelihood estimation (MLE) of the parameters, MMS training greatly improves the overall performance of text extraction: the precision rate decreases slightly, from 94.59% to 93.56%, while the recall rate increases substantially, from 80.39% to 98.55%.
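The training idea summarized above can be illustrated with a small sketch. It is not the objective function defined in the paper: the squared-error form of the objective, the single one-dimensional Gaussian kernel used as the similarity (in place of the Gaussian mixture model of neighbor characters), the synthetic data, and all function names are assumptions made for illustration only. The sketch starts from an MLE-style estimate and then lowers the objective by plain gradient descent.

```python
import math
import random


def similarity(x, mu, log_sigma):
    """Similarity in (0, 1]: an unnormalised Gaussian kernel (assumed form)."""
    sigma = math.exp(log_sigma)  # optimise log(sigma) so that sigma stays positive
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2)


def mms_objective(params, pos, neg):
    """Assumed MMS-style loss: push similarities of same-class samples towards 1
    and similarities of other-class samples towards 0."""
    mu, log_sigma = params
    pos_term = sum((1.0 - similarity(x, mu, log_sigma)) ** 2 for x in pos) / len(pos)
    neg_term = sum(similarity(x, mu, log_sigma) ** 2 for x in neg) / len(neg)
    return pos_term + neg_term


def numerical_gradient(f, params, eps=1e-6):
    """Central finite differences; keeps the sketch free of hand-derived gradients."""
    grad = []
    for i in range(len(params)):
        up, down = list(params), list(params)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def train(pos, neg, params, lr=0.1, n_iter=300):
    """Plain gradient descent on the assumed objective."""
    params = list(params)
    f = lambda p: mms_objective(p, pos, neg)
    for _ in range(n_iter):
        grad = numerical_gradient(f, params)
        params = [p - lr * g for p, g in zip(params, grad)]
    return params


# Synthetic 1-D features: "text" samples (positive) and "non-text" samples (negative).
random.seed(0)
pos = [random.gauss(0.0, 1.0) for _ in range(200)]
neg = [random.gauss(4.0, 1.0) for _ in range(200)]

# MLE-style starting point: sample mean and standard deviation of the positive class.
mean = sum(pos) / len(pos)
std = math.sqrt(sum((x - mean) ** 2 for x in pos) / len(pos))
init = [mean, math.log(std)]

trained = train(pos, neg, init)
print("objective at MLE-style start:", mms_objective(init, pos, neg))
print("objective after training:   ", mms_objective(trained, pos, neg))
```

In this toy setting the negative samples lie far from the positive ones, so training mainly widens the kernel to cover more positive samples until the negative term starts to grow. That is one plausible reading of how discriminative training could raise recall at a small cost in precision, but the sketch does not reproduce the reported figures.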


Gaussian Mixture Modeling; Recall Rate; Gradient Descent Method; Text Region; Precision Rate
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Hui Fu (1)
  • Xiabi Liu (1)
  • Yunde Jia (1)
  1. School of Computer Science and Technology, Beijing Institute of Technology, Beijing, P.R. China
