Script Identification from Offline Handwritten Characters Using Combination of Features

  • Conference paper
Proceedings of Sixth International Conference on Soft Computing for Problem Solving

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 547)

Abstract

Script identification in multilingual text images can improve the efficiency of many real-life applications, such as sorting, transcription of multilingual documents, and OCR. In this paper, we present a technique for identifying three scripts: Devanagari, Gurmukhi, and Roman. The script is identified using statistical features, namely zoning features, diagonal features, intersection and open-end-point features, peak-extent features, and combinations of these. For classification, we use Support Vector Machine (SVM), k-Nearest Neighbour (k-NN), and Convolutional Neural Network (CNN) classifiers. On isolated offline handwritten characters of these three scripts, the proposed strategy using CNN attains an average identification rate of 93.64% with 5-fold cross-validation.
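
To make the zoning features named in the abstract concrete, here is a minimal sketch in Python. It assumes a binarized character image given as rows of 0/1 values (1 = ink) and uses the foreground-pixel density of each zone as the feature. The function name `zoning_features`, the `zones_per_side` parameter, and the grid size are illustrative assumptions; the abstract does not state the zone layout or image size the authors used.

```python
def zoning_features(image, zones_per_side=4):
    """Foreground-pixel density of each zone, in row-major zone order.

    image: list of equal-length rows of 0/1 ints (1 = ink).
    zones_per_side: assumed grid size; not taken from the paper.
    """
    height = len(image)
    width = len(image[0])
    features = []
    for zr in range(zones_per_side):
        # Integer boundaries so zones tile the image without gaps.
        r0 = zr * height // zones_per_side
        r1 = (zr + 1) * height // zones_per_side
        for zc in range(zones_per_side):
            c0 = zc * width // zones_per_side
            c1 = (zc + 1) * width // zones_per_side
            area = (r1 - r0) * (c1 - c0)
            ink = sum(image[r][c] for r in range(r0, r1) for c in range(c0, c1))
            # An empty zone (image smaller than the grid) contributes 0.0.
            features.append(ink / area if area else 0.0)
    return features


# Toy 4x4 glyph split into a 2x2 grid of zones.
glyph = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
print(zoning_features(glyph, zones_per_side=2))  # [1.0, 0.0, 0.0, 0.75]
```

A feature vector of this kind could be concatenated with the other statistical features and passed to a classifier such as SVM or k-NN; that pipeline is only outlined in the abstract, so it is not reproduced here.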

Author information

Corresponding author

Correspondence to Akshi Bhardwaj.

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Bhardwaj, A., Jindal, S.R. (2017). Script Identification from Offline Handwritten Characters Using Combination of Features. In: Deep, K., et al. Proceedings of Sixth International Conference on Soft Computing for Problem Solving. Advances in Intelligent Systems and Computing, vol 547. Springer, Singapore. https://doi.org/10.1007/978-981-10-3325-4_17

  • DOI: https://doi.org/10.1007/978-981-10-3325-4_17

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-3324-7

  • Online ISBN: 978-981-10-3325-4

  • eBook Packages: Engineering, Engineering (R0)
