An Online Algorithm for Hierarchical Phoneme Classification

  • Conference paper
Machine Learning for Multimodal Interaction (MLMI 2004)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3361)

Abstract

We present an algorithmic framework for phoneme classification where the set of phonemes is organized in a predefined hierarchical structure. This structure is encoded via a rooted tree which induces a metric over the set of phonemes. Our approach combines techniques from large margin kernel methods and Bayesian analysis. Extending the notion of large margin to hierarchical classification, we associate a prototype with each individual phoneme and with each phonetic group which corresponds to a node in the tree. We then formulate the learning task as an optimization problem with margin constraints over the phoneme set. In the spirit of Bayesian methods, we impose similarity requirements between the prototypes corresponding to adjacent phonemes in the phonetic hierarchy. We describe a new online algorithm for solving the hierarchical classification problem and provide a worst-case loss analysis for the algorithm. We demonstrate the merits of our approach by applying the algorithm to synthetic data as well as speech data.
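
As a rough illustration of how a rooted tree can induce a metric over phonemes, the sketch below measures the distance between two nodes as the number of edges on the path joining them. Both the choice of path length as the metric and the toy hierarchy (the `PARENT` map and its node names) are assumptions made here for illustration; the abstract does not specify either.

```python
# Toy phonetic hierarchy, given as child -> parent links.
# Hypothetical example; not the tree used in the paper.
PARENT = {
    "root": None,
    "vowels": "root",
    "consonants": "root",
    "iy": "vowels",
    "aa": "vowels",
    "plosives": "consonants",
    "nasals": "consonants",
    "p": "plosives",
    "b": "plosives",
    "m": "nasals",
}


def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def tree_distance(u, v):
    """Number of edges on the path between u and v in the tree.

    Assumed metric: walk up from v until reaching the first ancestor
    shared with u, then add the number of steps u needs to reach it.
    """
    steps_from_u = {n: d for d, n in enumerate(path_to_root(u))}
    for steps_from_v, n in enumerate(path_to_root(v)):
        if n in steps_from_u:
            return steps_from_u[n] + steps_from_v
    raise ValueError("nodes are not in the same tree")


if __name__ == "__main__":
    for a, b in [("p", "b"), ("p", "m"), ("p", "iy")]:
        print(a, b, tree_distance(a, b))
```

Under this assumed metric, confusing two phonemes in the same phonetic group costs less than confusing phonemes from distant branches, which is the kind of graded error a hierarchical classifier can exploit.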

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dekel, O., Keshet, J., Singer, Y. (2005). An Online Algorithm for Hierarchical Phoneme Classification. In: Bengio, S., Bourlard, H. (eds) Machine Learning for Multimodal Interaction. MLMI 2004. Lecture Notes in Computer Science, vol 3361. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30568-2_13

  • DOI: https://doi.org/10.1007/978-3-540-30568-2_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-24509-4

  • Online ISBN: 978-3-540-30568-2

  • eBook Packages: Computer Science, Computer Science (R0)
