Character Segmentation of Hindi Unconstrained Handwritten Words

  • Soumen Bag
  • Ankit Krishna
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9448)

Abstract

Proper character-level segmentation of printed or handwritten text is an important preprocessing step for optical character recognition (OCR). Languages with a cursive writing style make the segmentation problem considerably more complicated. Hindi is one of the well-known languages of India that has this cursive writing style. The main challenge in handwritten character segmentation is handling the inherent variability in the writing styles of different individuals. In this paper, we present an efficient character segmentation method for handwritten Hindi words. Segmentation is performed on the basis of structural patterns observed in the writing style of this language. The proposed method can cope with high variation in writing style and with skewed header lines in the input. The method has been tested on our own database of both printed and handwritten words. The average success rate is 96.93%. On this database, the method yields fairly good results compared with other existing methods. We foresee that the proposed character segmentation technique can be used as part of an OCR system for cursive handwritten Hindi.
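The abstract does not spell out the structural patterns the method uses, so the following is only a minimal sketch of a common textbook baseline for header-line scripts, not the authors' algorithm. It assumes a binarized word image stored as a list of rows (1 = ink), takes the row with the most ink as the header line, blanks that band, and splits the remainder at empty columns. The function names `find_header_row` and `segment_columns` and the parameter `header_thickness` are hypothetical. The sketch does not handle skewed header lines, touching characters, or upper and lower modifiers, which are the cases the paper targets.

```python
def find_header_row(img):
    """Return the index of the row with the most ink pixels (1 = ink)."""
    row_sums = [sum(row) for row in img]
    return max(range(len(img)), key=lambda r: row_sums[r])


def segment_columns(img, header_thickness=1):
    """Split a binary word image into (start, end) column ranges.

    The header band is blanked first, then each run of non-empty
    columns is reported as a candidate character (end is exclusive).
    """
    header = find_header_row(img)
    top = max(0, header - header_thickness // 2)
    bottom = min(len(img), top + header_thickness)
    # Copy the image with the assumed header band set to background.
    body = [
        [0] * len(row) if top <= r < bottom else list(row)
        for r, row in enumerate(img)
    ]
    # Vertical projection: ink count per column.
    col_sums = [sum(col) for col in zip(*body)]
    segments, start = [], None
    for c, ink in enumerate(col_sums):
        if ink and start is None:
            start = c
        elif not ink and start is not None:
            segments.append((start, c))
            start = None
    if start is not None:
        segments.append((start, len(col_sums)))
    return segments


# Toy 5x9 "word": a full-width header line joining three strokes.
word = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]
print(find_header_row(word))   # 1
print(segment_columns(word))   # [(1, 3), (4, 5), (6, 8)]
```

On real handwriting the header line is rarely a single horizontal row, which is why a projection-only baseline like this one tends to fail on skewed input and why a structural approach is attractive.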

Keywords

Character segmentation; Handwritten word; Header line detection; Hindi language; Lower modifier; Upper modifier; Structural approach; OCR

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Department of Computer Science and Engineering, ISM Dhanbad, Dhanbad, India
  2. Department of Computer Science and Engineering, IIIT Bhubaneswar, Bhubaneshwar, India
