Recognition of Greek Phonemes Using Support Vector Machines

  • Iosif Mporas
  • Todor Ganchev
  • Panagiotis Zervas
  • Nikos Fakotakis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3955)


In the present work we study the applicability of Support Vector Machines (SVMs) to the phoneme recognition task. Specifically, the Least Squares version of the algorithm (LS-SVM) is employed for recognition of the Greek phonemes in the framework of a telephone-driven voice-enabled information service. The N-best candidate phonemes are identified and subsequently fed to the speech and language recognition components. In a comparative evaluation of various classification methods, the SVM-based phoneme recognizer demonstrated superior performance. A recognition rate of 74.2% was achieved from the N-best list, for N=5, prior to applying the language model.
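
The idea of ranking N-best phoneme candidates with LS-SVM classifiers can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the kernel, the multi-class scheme, or the acoustic features, so the RBF kernel, the one-vs-rest decomposition, the hyperparameters `gamma` and `sigma`, and all function names below are assumptions made for illustration. Each binary LS-SVM is trained by solving the standard LS-SVM linear system, and candidates are ranked by decision value.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    # Squared Euclidean distance between every row of A and every row of B.
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-d2 / (2.0 * sigma**2))

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Train a binary LS-SVM with labels y in {-1, +1}; returns (alpha, b)."""
    n = len(y)
    omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    # Linear system [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1].
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[1:], sol[0]

def lssvm_score(X_train, y, alpha, b, X_new, sigma=1.0):
    # Decision value f(x) = sum_i alpha_i * y_i * K(x, x_i) + b.
    return rbf_kernel(X_new, X_train, sigma) @ (alpha * y) + b

def n_best(X_train, labels, X_new, n=5, gamma=10.0, sigma=1.0):
    """Return the n highest-scoring class labels for each row of X_new.

    Assumption: one-vs-rest decomposition, one binary LS-SVM per class.
    """
    classes = np.unique(labels)
    scores = np.empty((len(X_new), len(classes)))
    for k, c in enumerate(classes):
        y = np.where(labels == c, 1.0, -1.0)
        alpha, b = lssvm_train(X_train, y, gamma, sigma)
        scores[:, k] = lssvm_score(X_train, y, alpha, b, X_new, sigma)
    order = np.argsort(-scores, axis=1)[:, :n]  # best candidate first
    return classes[order]
```

Here `X_train` would hold one feature vector per speech frame or segment and `labels` the corresponding phoneme symbols; the returned array has one row of candidates per input, ordered from most to least likely under this scoring.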


Support Vector Machine; Language Model; Independent Component Analysis; Acoustic Model; Phoneme Recognition





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Iosif Mporas (1)
  • Todor Ganchev (1)
  • Panagiotis Zervas (1)
  • Nikos Fakotakis (1)
  1. Wire Communications Laboratory, Dept. of Electrical and Computer Engineering, University of Patras, Rion, Patras, Greece
