Application of Feature Transformation and Learning Methods in Phoneme Classification
This paper examines the applicability of several learning techniques to the classification of phonemes. The methods tested were artificial neural nets (ANN), support vector machines (SVM) and Gaussian mixture modeling (GMM). We compare these methods with a traditional hidden Markov phoneme model (HMM) working with linear prediction-based cepstral coefficient (LPCC) features. We also combined the learners with feature transformation methods, namely linear discriminant analysis (LDA), principal component analysis (PCA) and independent component analysis (ICA). We found that the discriminative learners can match the performance of the HMM, and that after LDA they attain practically the same score using only 27 features. PCA and ICA proved ineffective, apparently because of the discrete cosine transform inherent in LPCC.
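To make the "transform, then classify" pipeline concrete, here is a minimal sketch of LDA as a supervised dimension reduction step placed in front of a classifier. It is not the authors' implementation. The data are synthetic Gaussian clusters rather than LPCC features, the nearest-class-mean rule is only a stand-in for the ANN and SVM learners, and every name (`lda_fit`, `nearest_mean_fit`, `nearest_mean_predict`) and every size is a hypothetical choice made for illustration.

```python
import numpy as np


def lda_fit(X, y, n_components):
    """Return a (d, n_components) LDA projection matrix from labeled data."""
    classes = np.unique(y)
    d = X.shape[1]
    overall_mean = X.mean(axis=0)
    S_w = np.zeros((d, d))  # within-class scatter
    S_b = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        S_w += centered.T @ centered
        diff = (mean_c - overall_mean)[:, None]
        S_b += Xc.shape[0] * (diff @ diff.T)
    # The discriminant directions are the leading eigenvectors of
    # pinv(S_w) @ S_b; at most (number of classes - 1) are informative.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(-eigvals.real)[:n_components]
    return eigvecs[:, order].real


def nearest_mean_fit(Z, y):
    """Stand-in learner (assumption): store one mean vector per class."""
    classes = np.unique(y)
    means = np.stack([Z[y == c].mean(axis=0) for c in classes])
    return classes, means


def nearest_mean_predict(Z, classes, means):
    """Assign each row of Z to the class with the closest mean."""
    dists = ((Z[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)]


# Synthetic stand-in data (assumption): 3 classes in 6 dimensions.
rng = np.random.default_rng(0)
n_classes, n_per_class, dim = 3, 100, 6
centers = 5.0 * np.eye(n_classes, dim)
X = np.concatenate([c + rng.normal(size=(n_per_class, dim)) for c in centers])
y = np.repeat(np.arange(n_classes), n_per_class)

W = lda_fit(X, y, n_components=n_classes - 1)
Z = X @ W  # reduced features handed to the learner
classes, means = nearest_mean_fit(Z, y)
accuracy = (nearest_mean_predict(Z, classes, means) == y).mean()
print("projected shape:", Z.shape, "training accuracy:", accuracy)
```

Unlike PCA and ICA, LDA uses the class labels when choosing its directions, which is one plausible reason it can keep class-relevant information while reducing the feature count. The accuracy printed here is measured on the training data and says nothing about the scores reported in the paper.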