Semi-automatic Handwritten Word Segmentation Based on Character Width Approximation Via Maximum Likelihood Method and Regression Model

  • Jerzy Sas
  • Marek Kurzynski
Conference paper
Part of the Advances in Soft Computing book series (AINSC, volume 45)


The paper presents a method for segmenting a word image into images of individual characters. The method is semi-automatic because it requires that the character sequence constituting the word in the image be known. The widths of the characters in the alphabet are assumed to be random variables, with probability distribution parameters specific to each character. In the first stage of the proposed method, the distribution parameters are estimated for all characters of the alphabet. Then, for each word in the corpus being processed, all possible segmentation variants are analyzed, and the probability of each variant is calculated from the width distributions of the corresponding characters. Finally, the segmentation variant with the highest calculated probability is selected.
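The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the distribution family or say how segmentation variants are generated, so the sketch assumes Gaussian character widths, a given list of candidate cut positions, and exhaustive enumeration. The regression model mentioned in the title is not covered. The function names `estimate_width_params`, `log_gauss` and `best_segmentation` are hypothetical.

```python
import itertools
import math
from statistics import mean, pstdev


def estimate_width_params(samples):
    """Maximum likelihood (mean, std) per character, assuming Gaussian widths.

    samples: dict mapping a character to a list of observed widths in pixels.
    """
    params = {}
    for ch, widths in samples.items():
        mu = mean(widths)
        # pstdev is the population (ML) standard deviation; the floor
        # guards against zero variance when all samples are identical.
        sigma = max(pstdev(widths), 1e-3)
        params[ch] = (mu, sigma)
    return params


def log_gauss(x, mu, sigma):
    """Log-density of a Gaussian with mean mu and std sigma at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def best_segmentation(word, image_width, candidate_cuts, params):
    """Return (cuts, log_likelihood) of the most probable segmentation.

    Assumption: a variant is a choice of len(word) - 1 cut positions from
    candidate_cuts; the widths between consecutive cuts are scored
    independently under each character's width distribution.
    """
    best_cuts, best_score = None, -math.inf
    for cuts in itertools.combinations(sorted(candidate_cuts), len(word) - 1):
        bounds = (0, *cuts, image_width)
        score = sum(
            log_gauss(bounds[i + 1] - bounds[i], *params[ch])
            for i, ch in enumerate(word)
        )
        if score > best_score:
            best_cuts, best_score = cuts, score
    return best_cuts, best_score


# Toy data (invented for illustration only).
params = estimate_width_params({"a": [10, 12, 11], "b": [20, 22, 21]})
cuts, score = best_segmentation("ab", 32, [5, 11, 16, 20], params)
print(cuts)  # (11,)
```

Exhaustive enumeration grows combinatorially with word length. It keeps the sketch close to the "all possible segmentation variants" wording of the abstract, but a dynamic programming search over cut positions would be the usual choice for longer words.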





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Jerzy Sas (1)
  • Marek Kurzynski (2)
  1. Applied Informatics Institute, Technical University of Wroclaw, Poland
  2. Faculty of Electronics, Chair of Systems and Computer Networks, Technical University of Wroclaw, Poland
