SVM Classification Using Sequences of Phonemes and Syllables

  • Gerhard Paaß
  • Edda Leopold
  • Martha Larson
  • Jörg Kindermann
  • Stefan Eickeler
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2431)


In this paper we use support vector machines (SVMs) to classify spoken and written documents. We show that classification accuracy for written material improves when strings of sub-word units are used as features, with dramatic gains for small topic categories. For large categories, the classification of spoken documents using sub-word units is only slightly worse than for written material; the drop is larger for small topic categories. Finally, it is possible, without loss, to train SVMs on syllables generated from written material and use them to classify audio documents. Our results confirm the strong promise that SVMs hold for robust audio document classification, and suggest that SVMs can compensate for speech recognition error to an extent that allows a significant degree of topic independence to be introduced into the system.
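The general idea of classifying documents from strings of sub-word units can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy syllable sequences, the labels, the function names, and the Pegasos-style stochastic subgradient trainer are all assumptions made for illustration. Documents are represented as normalised counts of syllable n-grams, a linear SVM is trained on sequences standing in for syllabified written text, and the model is then applied to sequences standing in for recognizer output that contains a misrecognised syllable.

```python
import random
from collections import Counter

def ngrams(units, n):
    """Return all contiguous n-grams (as tuples) of a sequence of sub-word units."""
    return [tuple(units[i:i + n]) for i in range(len(units) - n + 1)]

def featurize(units, max_n=2):
    """Map a unit sequence to L2-normalised counts of its n-grams, n = 1..max_n."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(units, n))
    norm = sum(v * v for v in counts.values()) ** 0.5 or 1.0
    return {k: v / norm for k, v in counts.items()}

def dot(w, x):
    """Sparse dot product between a weight dict and a feature dict."""
    return sum(w.get(k, 0.0) * v for k, v in x.items())

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style stochastic subgradient descent on the regularised hinge loss."""
    rng = random.Random(seed)
    w, t = {}, 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * dot(w, X[i])   # margin under the current weights
            for k in w:                    # shrink step from the L2 regulariser
                w[k] *= 1.0 - eta * lam
            if margin < 1:                 # hinge loss active: move towards the example
                for k, v in X[i].items():
                    w[k] = w.get(k, 0.0) + eta * y[i] * v
    return w

def predict(w, units):
    """Return +1 or -1 for a sequence of sub-word units."""
    return 1 if dot(w, featurize(units)) > 0 else -1

# Hypothetical syllabified "written" training documents (+1 = sports, -1 = politics).
train_docs = [
    (["tor", "fuss", "ball", "spiel"], 1),
    (["tor", "schuss", "ball", "sieg"], 1),
    (["wahl", "par", "tei", "stim", "me"], -1),
    (["par", "la", "ment", "wahl", "ge", "setz"], -1),
]
X = [featurize(units) for units, _ in train_docs]
y = [label for _, label in train_docs]
w = train_linear_svm(X, y)

# Hypothetical "spoken" documents: each contains one unseen, misrecognised syllable.
spoken_sports = ["tor", "bal", "spiel"]
spoken_politics = ["wahl", "par", "dei"]
print(predict(w, spoken_sports), predict(w, spoken_politics))
```

Because an unseen syllable only removes the few n-grams it touches, the remaining n-grams still carry the topic signal. This is one intuition, under the assumptions above, for why sub-word features could tolerate recognition errors.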


Keywords: Support Vector Machine, Acoustic Model, Speech Recognizer, Human Annotator, String Kernel



Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Gerhard Paaß (1)
  • Edda Leopold (1)
  • Martha Larson (2)
  • Jörg Kindermann (1)
  • Stefan Eickeler (2)

  1. Fraunhofer Institute for Autonomous Intelligent Systems (AIS), St. Augustin, Germany
  2. Fraunhofer Institute for Media Communication (IMK), St. Augustin, Germany
