Comparative Study: HMM and SVM for Automatic Articulatory Feature Extraction

  • Supphanat Kanokphara
  • Jan Macek
  • Julie Carson-Berndsen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4031)

Abstract

Speech recognition systems generally use acoustic features to represent speech for further processing. These acoustic features are usually based on human auditory perception or signal processing. More recently, Articulatory Feature (AF) based speech representations have been investigated by a number of speech technology researchers. Articulatory features are motivated by linguistic knowledge and hence may better represent speech characteristics. In this paper, we introduce two popular classification models, Hidden Markov Model (HMM) and Support Vector Machine (SVM), for automatic articulatory feature extraction. HMM-based systems are found to perform best when the numbers of positive and negative examples in the data are well balanced, while SVM performs better when the data are unbalanced.
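As a rough illustration of the balance condition mentioned in the abstract, the sketch below computes, for each binary articulatory feature, the fraction of frames that are positive examples. The phone-to-feature map, the function names, and the toy frame sequence are all hypothetical and chosen only for illustration; they are not the feature set or data used in the paper.

```python
# Hypothetical phone-to-feature map, for illustration only
# (not the articulatory feature inventory used in the paper).
PHONE_FEATURES = {
    "m": {"nasal", "voiced"},
    "n": {"nasal", "voiced"},
    "s": {"fricative"},
    "z": {"fricative", "voiced"},
    "t": {"stop"},
    "aa": {"vowel", "voiced"},
    "iy": {"vowel", "voiced"},
}


def positive_fraction(frame_phones, feature):
    """Fraction of frames whose phone carries `feature` (the positive class)."""
    if not frame_phones:
        raise ValueError("no frames given")
    positives = sum(1 for phone in frame_phones if feature in PHONE_FEATURES[phone])
    return positives / len(frame_phones)


def balance_report(frame_phones):
    """Map each feature in PHONE_FEATURES to its positive-example fraction."""
    features = sorted(set().union(*PHONE_FEATURES.values()))
    return {f: positive_fraction(frame_phones, f) for f in features}


# Toy frame-level phone labels (assumed data, one label per frame).
frames = ["s", "aa", "aa", "aa", "m", "iy", "iy", "t", "z", "aa"]

if __name__ == "__main__":
    for feature, fraction in balance_report(frames).items():
        print(f"{feature}: {fraction:.2f} positive")
```

In this toy sequence a feature such as "vowel" is close to balanced, while "nasal" has very few positive frames. The abstract's finding concerns which classifier copes better in each of these two situations.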

Keywords

Support Vector Machine, Hidden Markov Model, Speech Recognition, Speech Signal, Speech Recognition System
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Supphanat Kanokphara (1)
  • Jan Macek (1)
  • Julie Carson-Berndsen (1)
  1. UCD Dublin, UCD School of Computer Science and Informatics, Dublin, Ireland
