A Novel Shape-Based Character Segmentation Method for Devanagari Script

  • Research Article - Computer Engineering and Computer Science
Arabian Journal for Science and Engineering

Abstract

This paper presents a new algorithm that extracts shape-oriented feature vectors from pixel intensities in offline printed Devanagari script documents. Almost all characters of the script carry a Shirorekha (header line) across their upper portion, which makes character segmentation difficult. The problem becomes harder when the images have multiple gray levels or are skewed and noisy. The proposed fast and effective algorithm uses gradient structural information, and its performance is evaluated on a challenging dataset of 80 printed documents containing around 87,000 characters. Experimental results show that the proposed algorithm achieves 98.56% accuracy, which is 2.66% higher than that reported in the literature. The proposed algorithm is also more time efficient and less complex than existing methods.
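
To illustrate why the Shirorekha complicates segmentation, the following is a minimal sketch of the classical projection-profile baseline, not the gradient-based method proposed in the paper. It assumes a binarized, deskewed word image stored as a NumPy array with 1 for ink and 0 for background. The function names `find_shirorekha_row` and `segment_characters` and the parameter `header_thickness` are hypothetical and chosen only for this example.

```python
import numpy as np


def find_shirorekha_row(word: np.ndarray) -> int:
    """Return the row with the most ink pixels, taken here as the header line."""
    return int(np.argmax(word.sum(axis=1)))


def segment_characters(word: np.ndarray, header_thickness: int = 1):
    """Split a binary word image (1 = ink) into character column spans.

    The header line joins all characters, so it is blanked out first.
    Columns that then contain no ink are treated as gaps between characters.
    Returns a list of (start, end) column pairs, with end exclusive.
    """
    row = find_shirorekha_row(word)
    body = word.copy()
    # Assumption: the header occupies `header_thickness` rows starting at `row`.
    body[row:row + header_thickness, :] = 0

    has_ink = body.sum(axis=0) > 0
    spans = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                  # a character begins
        elif not ink and start is not None:
            spans.append((start, x))   # a character ends at the gap
            start = None
    if start is not None:
        spans.append((start, len(has_ink)))
    return spans
```

On a synthetic word with a full-width header row and two blocks of ink beneath it, `segment_characters` returns one column span per block. A baseline like this tends to break down on skewed, noisy, or multi-gray-level scans, which are the conditions the abstract identifies as challenging.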

Author information

Corresponding author

Correspondence to Khushneet Jindal.

About this article

Cite this article

Jindal, K., Kumar, R. A Novel Shape-Based Character Segmentation Method for Devanagari Script. Arab J Sci Eng 42, 3221–3228 (2017). https://doi.org/10.1007/s13369-017-2420-7

  • DOI: https://doi.org/10.1007/s13369-017-2420-7
