Application of Feature Transformation and Learning Methods in Phoneme Classification

  • András Kocsor
  • László Tóth
  • László Felföldi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2070)


This paper examines the applicability of some learning techniques to the classification of phonemes. The methods tested were artificial neural nets (ANN), support vector machines (SVM) and Gaussian mixture modeling. We compare these methods with a traditional hidden Markov phoneme model (HMM) working with the linear prediction-based cepstral coefficient features (LPCC). We also tried to combine the learners with feature transformation methods, like linear discriminant analysis (LDA), principal component analysis (PCA) and independent component analysis (ICA). We found that the discriminative learners can attain the efficiency of the HMM, and after LDA they can attain practically the same score on only 27 features. PCA and ICA proved ineffective, apparently because of the discrete cosine transform inherent in LPCC.
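The idea of applying LDA before a learner can be illustrated with a small Python sketch. It is not the authors' implementation: it assumes a toy two-class problem on synthetic Gaussian data, uses the two-class Fisher discriminant (the simplest special case of LDA) to project 4 features onto 1, and classifies with a plain midpoint threshold instead of an ANN, SVM or Gaussian mixture. All function names, the cluster centres and the sample sizes are hypothetical choices made for illustration.

```python
import random

def mean(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def scatter(rows, mu):
    """Scatter matrix: sum over samples of (x - mu)(x - mu)^T."""
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for r in rows:
        diff = [x - m for x, m in zip(r, mu)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_fisher(class0, class1):
    """Fisher direction w = Sw^{-1} (mu1 - mu0) and a midpoint threshold."""
    mu0, mu1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, mu0), scatter(class1, mu1)
    sw = [[a + b for a, b in zip(r0, r1)] for r0, r1 in zip(s0, s1)]
    w = solve(sw, [b - a for a, b in zip(mu0, mu1)])
    threshold = 0.5 * (dot(w, mu0) + dot(w, mu1))
    return w, threshold

def sample(center, n, rng):
    """Draw n points with independent unit-variance Gaussian features."""
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n)]

rng = random.Random(0)            # fixed seed so the run is repeatable
center0 = [0.0, 0.0, 0.0, 0.0]    # assumed class centres; only the first
center1 = [3.0, 3.0, 0.0, 0.0]    # two features carry class information

w, threshold = fit_fisher(sample(center0, 200, rng), sample(center1, 200, rng))

# Evaluate on fresh samples using only the 1-D projection w . x
test0 = sample(center0, 200, rng)
test1 = sample(center1, 200, rng)
correct = sum(dot(w, x) <= threshold for x in test0)
correct += sum(dot(w, x) > threshold for x in test1)
accuracy = correct / (len(test0) + len(test1))
print(f"held-out accuracy on the 1-D projection: {accuracy:.3f}")
```

In this toy setting the two uninformative features receive weights close to zero, so the single projected value retains most of the class information. The multi-class form of LDA generalizes this by taking leading eigenvectors of the within-class and between-class scatter matrices, yielding several discriminant features rather than one.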





Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • András Kocsor (1)
  • László Tóth (1)
  • László Felföldi (1)
  1. Research Group on Artificial Intelligence of the Hungarian Academy of Sciences and of the University of Szeged, Szeged, Hungary
