IEA/AIE 2006: Advances in Applied Artificial Intelligence, pp. 674-681
Comparative Study: HMM and SVM for Automatic Articulatory Feature Extraction
Abstract
Speech recognition systems generally use acoustic features as the representation of speech for further processing. These acoustic features are usually based on human auditory perception or on signal processing. More recently, Articulatory Feature (AF) based speech representations have been investigated by a number of speech technology researchers. Articulatory features are motivated by linguistic knowledge and hence may better represent speech characteristics. In this paper, we introduce two popular classification models, the Hidden Markov Model (HMM) and the Support Vector Machine (SVM), for automatic articulatory feature extraction. HMM-based systems are found to perform best when the numbers of positive and negative examples in the data are well balanced, while SVM performs better when the data are unbalanced.
Keywords
Support Vector Machine, Hidden Markov Model, Speech Recognition, Speech Signal, Speech Recognition System