Speaker Independent Word Recognition Using Cepstral Distance Measurement

  • Arnab Pramanik
  • Rajorshee Raha
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 182)


Speech recognition has developed from theoretical methods into practical systems. Since the 1990s, researchers have shifted their attention to the difficult task of Large Vocabulary Continuous Speech Recognition (LVCSR) and have made considerable progress. Meanwhile, many well-known research and commercial institutions have built their own recognition systems, including the ViaVoice system by IBM and the Whisper system by Microsoft. In this paper we develop a simple and efficient algorithm for speaker-independent isolated word recognition. We use Mel frequency cepstral coefficients (MFCCs) as features of the recorded speech. A decoding algorithm is proposed that recognizes the target speech by computing the cepstral distance between cepstral coefficients. Simulation experiments were carried out in MATLAB, where the method produced relatively good results (85% word recognition accuracy).
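The cepstral-distance decoding mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not spell out. It assumes that MFCC frames have already been extracted for each utterance, that each utterance is summarized by the mean of its frames, and that the cepstral distance is the Euclidean distance between those mean vectors. The function names (`mean_cepstrum`, `cepstral_distance`, `recognize`) and the toy coefficient values are hypothetical.

```python
import math


def mean_cepstrum(frames):
    """Average per-frame cepstral vectors into one vector per utterance.

    Assumption: averaging over frames is used here only to compare
    utterances of different lengths; the paper may align frames differently.
    """
    n = len(frames)
    dim = len(frames[0])
    return [sum(frame[k] for frame in frames) / n for k in range(dim)]


def cepstral_distance(c1, c2):
    """Euclidean distance between two cepstral vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))


def recognize(test_frames, templates):
    """Return the vocabulary word whose template is nearest to the test utterance."""
    test_vec = mean_cepstrum(test_frames)
    return min(
        templates,
        key=lambda word: cepstral_distance(test_vec, mean_cepstrum(templates[word])),
    )


# Toy, made-up 3-coefficient "MFCC" frames for a two-word vocabulary.
templates = {
    "yes": [[1.0, 0.5, -0.2], [1.2, 0.4, -0.1]],
    "no": [[-0.8, 0.1, 0.9], [-1.0, 0.0, 1.1]],
}
test_utterance = [[1.1, 0.5, -0.2], [1.0, 0.3, -0.1], [1.2, 0.6, -0.3]]

print(recognize(test_utterance, templates))
```

In a speaker-independent setting, each template would typically be built from recordings of several speakers, so that the nearest-template decision does not depend on one voice.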


Speech Recognition; Speech Signal; Automatic Speech Recognition; Speech Recognition System; Continuous Speech Recognition
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. G S Sanyal School of Telecommunication, Indian Institute of Technology, Kharagpur, India
