Trajectory Representations and Acoustic Descriptions for a Segment-Modelling Approach to Automatic Speech Recognition

Holmes, Wendy J.

doi:10.1007/978-3-642-60087-6_18

Part of the book series: NATO ASI Series (NATO ASI F, volume 169)

Summary

This paper discusses possibilities for modelling speech-segment trajectories in a domain that is more directly correlated with the mechanisms of speech production than the typical mel-cepstrum representation. It describes initial developments towards using linear dynamic segmental HMMs [12] to model underlying (unobserved) trajectories of features that closely reflect the nature of articulation. So far, this work has involved calculating segment probabilities with an approach that differs from the one used in earlier studies (e.g. [4]) and is more consistent with treating the trajectory as unobserved. In parallel, experiments have demonstrated that formant features can be useful for HMM-based automatic speech recognition [3].
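The summary does not give the chapter's formulation, so the following is only a generic sketch of what "treating the trajectory as unobserved" can mean for a linear trajectory segment model. It assumes a scalar feature, a fixed slope, a segment mid-point drawn from a Gaussian with variance `tau2`, and independent Gaussian frame noise with variance `sigma2`. All function and parameter names are hypothetical. One function integrates the mid-point out; the other evaluates the joint probability at the single best mid-point, which is one alternative way of scoring a segment.

```python
import math


def _residuals(y, mu, slope):
    """Deviation of each frame from the mean linear trajectory.

    Assumed model (illustrative only): y_t = m + slope * (t - tc) + e_t,
    with mid-point m ~ N(mu, tau2) and frame noise e_t ~ N(0, sigma2).
    """
    tc = (len(y) - 1) / 2.0  # segment mid-point, in frames
    return [y_t - mu - slope * (t - tc) for t, y_t in enumerate(y)]


def log_marginal_segment(y, mu, slope, tau2, sigma2):
    """Log probability of the segment with the mid-point integrated out.

    The residuals are jointly Gaussian with covariance
    sigma2 * I + tau2 * 1 1^T, whose determinant and inverse
    have simple closed forms.
    """
    T = len(y)
    r = _residuals(y, mu, slope)
    sum_r = sum(r)
    sum_r2 = sum(v * v for v in r)
    denom = sigma2 + T * tau2
    log_det = (T - 1) * math.log(sigma2) + math.log(denom)
    quad = (sum_r2 - tau2 * sum_r ** 2 / denom) / sigma2
    return -0.5 * (T * math.log(2.0 * math.pi) + log_det + quad)


def log_joint_at_best_offset(y, mu, slope, tau2, sigma2):
    """Joint log probability of frames and mid-point at the best mid-point.

    Returns the joint log probability and the posterior variance of the
    mid-point, so the two scoring methods can be compared.
    """
    T = len(y)
    r = _residuals(y, mu, slope)
    post_var = 1.0 / (1.0 / tau2 + T / sigma2)
    d_hat = post_var * sum(r) / sigma2  # most probable offset of m from mu
    log_prior = -0.5 * (math.log(2.0 * math.pi * tau2) + d_hat ** 2 / tau2)
    log_lik = sum(
        -0.5 * (math.log(2.0 * math.pi * sigma2) + (v - d_hat) ** 2 / sigma2)
        for v in r
    )
    return log_prior + log_lik, post_var
```

Under these assumptions the two scores differ only by the term `0.5 * log(2 * pi * post_var)`, which depends on the segment duration through `post_var`. The choice between them can therefore change how segments of different lengths compete during recognition.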

References

[1] L. Deng, G. Ramsay, and H. Sameti. From modeling surface phenomena to modeling mechanisms: Towards a faithful model of the speech process aiming at speech recognition. In Proc. IEEE Automatic Speech Recognition Workshop, pages 183–184, Snowbird, 1995.
[2] M. J. Gales and S. J. Young. Segmental hidden Markov models. In EUROSPEECH, pages 1611–1614, Berlin, 1993.
[3] J. N. Holmes, W. J. Holmes, and P. N. Garner. Using formant frequencies in speech recognition. In EUROSPEECH, Rhodes, 1997.
[4] W. J. Holmes and M. J. Russell. Linear dynamic segmental HMMs: Variability representation and training procedure. In ICASSP, pages 1399–1402, Munich, 1997.
[5] A. Hu and E. Barnard. Smoothness analysis for trajectory features. In ICASSP, pages 979–982, Munich, 1997.
[6] R. K. Moore. Signal decomposition using Markov modelling techniques. RSRE Memo 3931, RSRE, Malvern, UK, 1986.
[7] R. K. Moore. Twenty things we still don’t know about speech. In Proceedings CRIM/FORWISS Workshop on Progress and Prospects of Speech Research Technology, 1994.
[8] M. Ostendorf, V. V. Digalakis, and O. A. Kimball. From HMM’s to segment models: A unified view of stochastic modeling for speech recognition. IEEE Trans. Speech and Audio Processing, 4(5):360–378, 1996.
[9] G. Ramsay and L. Deng. Maximum-likelihood estimation for articulatory speech recognition using a stochastic target model. In EUROSPEECH, pages 1401–1404, Madrid, 1995.
[10] H. B. Richards, J. S. Bridle, M. J. Hunt, and J. S. Mason. Vocal tract shape trajectory estimation using MLP analysis-by-synthesis. In ICASSP, pages 1287–1290, Munich, 1997.
[11] M. J. Russell. Advances in speech recognition. In Proceedings: Institute of Acoustics, Vol. 18: Part 9, pages 267–274, 1996.
[12] M. J. Russell and W. J. Holmes. Linear trajectory segmental HMM’s. IEEE Signal Processing Letters, 4(3):72–74, 1997.
[13] P. Schmid and E. Barnard. Explicit, N-best formant features for vowel classification. In ICASSP, pages 991–994, Munich, 1997.
[14] M. J. Tomlinson, M. J. Russell, R. K. Moore, A. P. Buckland, and M. A. Fawley. Modelling asynchrony in speech using elementary single-signal decomposition. In ICASSP, pages 1247–1250, Munich, 1997.
[15] L. Welling and H. Ney. A model for efficient formant estimation. In ICASSP, pages 797–800, Atlanta, 1996.

Author information

Authors and Affiliations

Speech Research Unit, DERA Malvern, St. Andrew’s Road, Great Malvern, Worcs, WR14 3PS, UK
Wendy J. Holmes

Editor information

Editors and Affiliations

Speech Research Unit, DERA Malvern, St. Andrew’s Road, Great Malvern, Worcs, WR14 4DT, UK
Keith Ponting

About this chapter

Cite this chapter

Holmes, W.J. (1999). Trajectory Representations and Acoustic Descriptions for a Segment-Modelling Approach to Automatic Speech Recognition. In: Ponting, K. (eds) Computational Models of Speech Pattern Processing. NATO ASI Series, vol 169. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-60087-6_18

DOI: https://doi.org/10.1007/978-3-642-60087-6_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-64250-0
Online ISBN: 978-3-642-60087-6
eBook Packages: Springer Book Archive
