An Efficient Continuous Speech Recognition System for Dravidian Languages Using Support Vector Machine

Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 324)


This paper focuses on developing a speech recognition system for Dravidian languages such as Tamil, Malayalam, Telugu, and Kannada. The work aims to provide an effective way for humans to interact with computers, particularly for people with disabilities who face a variety of obstacles when using computers, and it would be helpful to native speakers in various applications. The proposed continuous speech recognition (CSR) system comprises three steps: preprocessing, feature extraction, and classification. In the preprocessing step, the input signal passes through a pre-emphasis filter, framing, windowing, and band-stop filtering in order to remove background noise and enhance the signal. The filtered, enhanced signal from the preprocessing step serves as the input to the subsequent stages of the CSR system. Speech features are the most essential component of a speech recognition system: the widely used short-term energy (STE) and zero-crossing rate (ZCR) are employed for continuous speech segmentation, while Mel-frequency cepstral coefficients (MFCC) and shifted delta cepstra (SDC) are used for the recognition task. The feature vectors are given as input to a support vector machine (SVM) classifier for classifying and recognizing Dravidian-language speech. Experiments are carried out with real-time Dravidian speech signals, and the results reveal that the proposed method is competitive with the existing methods reported in the literature.
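The preprocessing and segmentation front end described above can be sketched as follows. This is a minimal illustrative implementation in NumPy, not the authors' code: the pre-emphasis coefficient, frame length, and hop size are assumed typical values, and a Hamming window stands in for the unspecified windowing step.

```python
import numpy as np

def pre_emphasis(signal, alpha=0.97):
    """Apply the pre-emphasis filter y[n] = x[n] - alpha * x[n-1].

    alpha = 0.97 is a conventional choice, assumed here since the
    paper does not state the coefficient.
    """
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

def frame_signal(signal, frame_len, hop):
    """Split the signal into overlapping frames and apply a Hamming window."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx] * np.hamming(frame_len)

def short_term_energy(frames):
    """STE per frame: sum of squared samples."""
    return np.sum(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """ZCR per frame: fraction of adjacent sample pairs with a sign change."""
    return np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)

if __name__ == "__main__":
    # Toy input: half a second of silence followed by a 200 Hz tone at 8 kHz.
    fs = 8000
    t = np.arange(fs // 2) / fs
    sig = np.concatenate([np.zeros(fs // 2), np.sin(2 * np.pi * 200 * t)])

    frames = frame_signal(pre_emphasis(sig), frame_len=200, hop=80)
    ste = short_term_energy(frames)
    zcr = zero_crossing_rate(frames)
    # Voiced frames (the tone) carry far more energy than the silent frames,
    # which is the property STE-based segmentation exploits.
    print(ste[0] < ste[-1], zcr.min() >= 0.0, zcr.max() <= 1.0)
```

In a full system, thresholds on the STE and ZCR curves would mark speech/non-speech boundaries, and the resulting segments would then be passed to MFCC/SDC extraction and the SVM classifier.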


Keywords: Dravidian languages · CSR system · Support vector machine · Automatic speech recognition · Large-vocabulary speech recognition



Copyright information

© Springer India 2015

Authors and Affiliations

  1. Department of Computer Science and Engineering, Annamalai University, Chidambaram, India
