Abstract
Speech interfaces to computers have attracted global attention because of the convenience they offer. Although speech recognition is not new in user-machine interface research, existing work offers promising solutions mainly for English, the most widely accepted language. This paper presents the development of an experimental, speaker-dependent, real-time, isolated-word recognizer for Punjabi, an Indian regional language. The work is further extended to a comparison of speech recognition systems for a small vocabulary of speaker-dependent isolated spoken Punjabi words, using the Hidden Markov Model (HMM) and Dynamic Time Warping (DTW) techniques. Punjabi exhibits large variations between consecutive phonemes, which makes endpoint detection highly difficult. The work emphasizes a template-based recognizer using linear predictive coding (LPC) with dynamic programming, and vector quantization (VQ) with HMM-based recognizers, for isolated word recognition tasks, an approach that also significantly reduces computational cost. Parametric variation enhances the feature vector for recognition of a 500-word isolated-word Punjabi vocabulary: the HMM and DTW techniques achieve 91.3% and 94.0% accuracy, respectively.
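To make the template-based approach concrete, the following is a minimal sketch of isolated-word recognition by dynamic time warping. It is not the paper's implementation: the function names `dtw_distance` and `recognize`, the Euclidean local distance, the unconstrained warping path, and the toy two-dimensional vectors (standing in for LPC-derived features) are all assumptions made for illustration.

```python
import math

def dtw_distance(a, b):
    """Accumulated DTW distance between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local Euclidean distance (assumed)
            # Dynamic programming step: extend the cheapest of the three predecessors.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def recognize(utterance, templates):
    """Return the label of the stored template closest to the utterance."""
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))

# Hypothetical toy templates; real systems would store one or more
# feature sequences per vocabulary word.
templates = {
    "word_a": [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],
    "word_b": [(2.0, 0.0), (1.0, 0.0), (0.0, 0.0)],
}
# A slower rendition of word_a: its first frame is repeated.
utterance = [(0.0, 0.0), (0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
result = recognize(utterance, templates)
```

In this toy case the warping absorbs the repeated frame, so `result` is `"word_a"`. A practical recognizer would typically add endpoint detection, path constraints, and length normalization, none of which are shown here.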
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ravinder, K. (2010). Comparison of HMM and DTW for Isolated Word Recognition System of Punjabi Language. In: Bloch, I., Cesar, R.M. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2010. Lecture Notes in Computer Science, vol 6419. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16687-7_35
DOI: https://doi.org/10.1007/978-3-642-16687-7_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16686-0
Online ISBN: 978-3-642-16687-7
eBook Packages: Computer Science, Computer Science (R0)