An Online Algorithm for Hierarchical Phoneme Classification

Dekel, Ofer; Keshet, Joseph; Singer, Yoram

doi:10.1007/978-3-540-30568-2_13

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3361)

Included in the following conference series:

International Workshop on Machine Learning for Multimodal Interaction

Abstract

We present an algorithmic framework for phoneme classification where the set of phonemes is organized in a predefined hierarchical structure. This structure is encoded via a rooted tree which induces a metric over the set of phonemes. Our approach combines techniques from large margin kernel methods and Bayesian analysis. Extending the notion of large margin to hierarchical classification, we associate a prototype with each individual phoneme and with each phonetic group which corresponds to a node in the tree. We then formulate the learning task as an optimization problem with margin constraints over the phoneme set. In the spirit of Bayesian methods, we impose similarity requirements between the prototypes corresponding to adjacent phonemes in the phonetic hierarchy. We describe a new online algorithm for solving the hierarchical classification problem and provide a worst-case loss analysis for the algorithm. We demonstrate the merits of our approach by applying the algorithm to synthetic data as well as speech data.
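
To make the tree-induced metric concrete, the following minimal sketch computes the distance between two nodes of a rooted tree as the number of edges on the path connecting them. The hierarchy, the `PARENT` mapping, and the function names are hypothetical and chosen for illustration only; they are not the phonetic tree or the code used in the paper, and the sketch does not cover the prototypes or the online update.

```python
from typing import Dict, List, Optional

# Hypothetical toy hierarchy (child -> parent). This is an illustrative
# assumption, not the phoneme tree used by the authors.
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "vowels": "root",
    "consonants": "root",
    "iy": "vowels",
    "aa": "vowels",
    "plosives": "consonants",
    "fricatives": "consonants",
    "p": "plosives",
    "b": "plosives",
    "s": "fricatives",
}


def path_to_root(node: str) -> List[str]:
    """Return the nodes visited when walking from `node` up to the root."""
    path: List[str] = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = PARENT[current]
    return path


def tree_distance(u: str, v: str) -> int:
    """Number of edges on the unique path between u and v in the tree."""
    # Depth of every ancestor of v, measured in steps up from v.
    steps_up_from_v = {n: d for d, n in enumerate(path_to_root(v))}
    # The first ancestor of u that is also an ancestor of v is their
    # lowest common ancestor; the path length is the sum of both climbs.
    for steps_up_from_u, n in enumerate(path_to_root(u)):
        if n in steps_up_from_v:
            return steps_up_from_u + steps_up_from_v[n]
    raise ValueError("nodes do not share a root")
```

In this toy tree, confusing `p` with `b` (siblings under the same group) yields a smaller distance than confusing `p` with `iy`, which is the kind of graded error a hierarchy-aware classifier can exploit.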

Author information

Authors and Affiliations

School of Computer Science and Engineering, The Hebrew University, Jerusalem, 91904, Israel
Ofer Dekel, Joseph Keshet & Yoram Singer

Editor information

Editors and Affiliations

IDIAP Research Institute, Martigny, Switzerland
Samy Bengio
IDIAP Research Institute, CH-1920, Martigny, Switzerland
Hervé Bourlard

About this paper

Cite this paper

Dekel, O., Keshet, J., Singer, Y. (2005). An Online Algorithm for Hierarchical Phoneme Classification. In: Bengio, S., Bourlard, H. (eds) Machine Learning for Multimodal Interaction. MLMI 2004. Lecture Notes in Computer Science, vol 3361. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30568-2_13

DOI: https://doi.org/10.1007/978-3-540-30568-2_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24509-4
Online ISBN: 978-3-540-30568-2
eBook Packages: Computer Science, Computer Science (R0)
