Text Recognition Using K-means Clustering and Support Vector Machine

  • Conference paper
Proceeding of First Doctoral Symposium on Natural Computing Research

Part of the book series: Lecture Notes in Networks and Systems ((LNNS,volume 169))

Abstract

Extracting text characters and strings from nameplates, signboards, printed names on bank cheques, printed addresses on postal envelopes and office boards can provide useful information for numerous applications. Extracting text from all of these sources is a very complex task because text patterns vary widely, particularly in the fonts used. Text recognition always starts with text detection; the overall process is therefore text detection followed by character identification. This can be done by combining image processing and machine learning. This paper proposes a method for detecting and identifying text, mainly on nameplates in offices and on home doors, written in the English language, which uses the Roman script. The initial step, detection of Roman characters, is carried out using image processing and the K-means clustering algorithm, while the later step, identification, is carried out with a multiclass support vector machine.
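
As a rough illustration of the two stages described in the abstract, the following Python sketch separates dark text pixels from a light background with a two-cluster K-means over grey levels, then identifies the character with a one-vs-rest linear SVM trained by subgradient descent on the hinge loss. It is not the authors' implementation: the function names, the 3x3 toy glyphs, the dark-text-on-light-background assumption, the one-vs-rest scheme and all hyperparameters are assumptions made purely for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def kmeans_two_clusters(values: Sequence[float], iterations: int = 20) -> Tuple[float, float]:
    """1-D K-means with k=2 on grey levels; returns (dark_center, bright_center)."""
    # Assumption: initialise the two centers at the darkest and brightest pixel.
    lo, hi = float(min(values)), float(max(values))
    for _ in range(iterations):
        dark = [v for v in values if abs(v - lo) <= abs(v - hi)]
        bright = [v for v in values if abs(v - lo) > abs(v - hi)]
        new_lo = sum(dark) / len(dark) if dark else lo
        new_hi = sum(bright) / len(bright) if bright else hi
        if (new_lo, new_hi) == (lo, hi):
            break  # assignments no longer change
        lo, hi = new_lo, new_hi
    return lo, hi


def text_mask(image: List[List[int]]) -> List[List[int]]:
    """Detection stage: 1 for pixels in the darker cluster, 0 otherwise.

    Assumption: text is darker than its background.
    """
    pixels = [p for row in image for p in row]
    dark, bright = kmeans_two_clusters(pixels)
    return [[1 if abs(p - dark) <= abs(p - bright) else 0 for p in row] for row in image]


def features(image: List[List[int]]) -> List[float]:
    """Flatten the binary text mask into a feature vector (illustrative choice)."""
    return [float(v) for row in text_mask(image) for v in row]


def train_linear_svm(samples: List[List[float]], labels: List[int],
                     epochs: int = 200, lr: float = 0.1,
                     reg: float = 0.01) -> Tuple[List[float], float]:
    """Binary linear SVM (labels +1/-1) via subgradient descent on the hinge loss."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1.0 - lr * reg) for wi in w]  # L2 regularisation step
            if margin < 1.0:  # sample violates the margin
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def train_one_vs_rest(samples: List[List[float]],
                      labels: List[str]) -> Dict[str, Tuple[List[float], float]]:
    """Multiclass SVM built from one binary SVM per class (one-vs-rest)."""
    return {
        c: train_linear_svm(samples, [1 if label == c else -1 for label in labels])
        for c in sorted(set(labels))
    }


def predict(models: Dict[str, Tuple[List[float], float]], x: List[float]) -> str:
    """Identification stage: pick the class whose SVM gives the highest score."""
    def score(c: str) -> float:
        w, b = models[c]
        return sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(models, key=score)


# Hypothetical 3x3 grey-level glyphs (low values = dark ink).
GLYPH_I = [[220, 30, 210], [215, 25, 225], [230, 40, 205]]
GLYPH_L = [[35, 210, 220], [30, 225, 215], [25, 40, 20]]
GLYPH_T = [[30, 25, 35], [210, 40, 220], [225, 20, 215]]

if __name__ == "__main__":
    models = train_one_vs_rest(
        [features(GLYPH_I), features(GLYPH_L), features(GLYPH_T)], ["I", "L", "T"]
    )
    print(predict(models, features(GLYPH_L)))
```

In practice, a library such as scikit-learn or OpenCV would normally replace both hand-written stages, and real nameplate images would need segmentation into individual characters before classification; those steps are outside the scope of this sketch.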

Corresponding author

Correspondence to Kanchan Varpe.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Cite this paper

Dhore, M., Varpe, K. (2021). Text Recognition Using K-means Clustering and Support Vector Machine. In: Patil, V.H., Dey, N., Mahalle, P.N., Shafi Pathan, M., Kimbahune, V.V. (eds) Proceeding of First Doctoral Symposium on Natural Computing Research. Lecture Notes in Networks and Systems, vol 169. Springer, Singapore. https://doi.org/10.1007/978-981-33-4073-2_10
