Segmentation of Printed Devnagari Documents

  • Conference paper
Advances in Computing and Information Technology (ACITY 2011)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 198)

Abstract

Document segmentation is one of the most important phases in machine recognition of text in any language. Correct segmentation of individual symbols determines the success of the character recognition technique. Segmentation decomposes an image of a sequence of characters into sub-images of individual symbols by first segmenting lines and then words. Devnagari is the most popular script in India. It is used for writing the Hindi, Marathi, Sanskrit and Nepali languages. Moreover, Hindi is the third most popular language in the world. Devnagari documents consist of vowels, consonants and various modifiers. Hence, proper segmentation of a Devnagari word is challenging. This paper proposes a simple approach based on bounding boxes to segment Devnagari documents. Various challenges in the segmentation of Devnagari script are also discussed.
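The abstract does not spell out the bounding-box procedure, so the following is only a minimal sketch of one common way to obtain line and word bounding boxes from a binarized page, using horizontal and vertical projection profiles. It is an assumption, not the authors' implementation. The function names (`runs`, `segment_lines`, `segment_words`), the `min_gap` parameter, and the convention that nonzero pixels are ink are all hypothetical choices made for illustration.

```python
import numpy as np

def runs(mask):
    """Return (start, end) pairs, end exclusive, for each run of True in a 1-D mask."""
    padded = np.concatenate(([False], mask, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    return list(zip(changes[0::2].tolist(), changes[1::2].tolist()))

def segment_lines(img):
    """Split a binary page (nonzero = ink) into text lines.

    A row belongs to a line if it contains at least one ink pixel;
    consecutive ink rows form one (top, bottom) range.
    """
    return runs(img.sum(axis=1) > 0)

def segment_words(img, top, bottom, min_gap=2):
    """Return word bounding boxes (top, bottom, left, right) inside one line.

    Column runs separated by fewer than min_gap blank columns are merged,
    on the assumption (hypothetical threshold) that such small gaps fall
    inside a word rather than between words.
    """
    line = img[top:bottom]
    merged = []
    for left, right in runs(line.sum(axis=0) > 0):
        if merged and left - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], right)
        else:
            merged.append((left, right))
    boxes = []
    for left, right in merged:
        # Tighten the box vertically to the rows that actually hold ink.
        rows = np.flatnonzero(line[:, left:right].sum(axis=1) > 0)
        boxes.append((top + int(rows[0]), top + int(rows[-1]) + 1, left, right))
    return boxes
```

A typical call would loop over `segment_lines(img)` and pass each `(top, bottom)` pair to `segment_words`. Splitting a word further into individual symbols is the harder step the abstract alludes to: the header line joins the characters of a Devnagari word, and modifiers sit above and below the main characters, so a plain column projection is generally not enough at that level.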




Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dongre, V.J., Mankar, V.H. (2011). Segmentation of Printed Devnagari Documents. In: Wyld, D.C., Wozniak, M., Chaki, N., Meghanathan, N., Nagamalai, D. (eds) Advances in Computing and Information Technology. ACITY 2011. Communications in Computer and Information Science, vol 198. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22555-0_23

  • DOI: https://doi.org/10.1007/978-3-642-22555-0_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22554-3

  • Online ISBN: 978-3-642-22555-0

  • eBook Packages: Computer Science, Computer Science (R0)
