Towards a Better Representation of the Envelope Modulation of Aspiration Noise

Cabral, João P.; Carson-Berndsen, Julie

doi:10.1007/978-3-642-38847-7_9


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7911)

Included in the following conference series:

International Conference on Nonlinear Speech Processing


Abstract

Control over aspects of the glottal source signal is fundamental to correctly modifying relevant voice characteristics, such as breathiness. This voice quality is strongly related to the characteristics of the source signal produced at the glottis, mainly the shape of the glottal pulse and the aspiration noise. Aspiration noise results from the turbulence of air passing through the glottis and can be represented by amplitude-modulated Gaussian noise, whose modulation depends on the glottal volume velocity and the glottal area. However, the dependency between the glottal signal and the noise component is usually not taken into account when transforming breathiness. In this paper, we propose a method for modelling the aspiration noise that allows the noise to be adapted to the glottal pulse shape while still producing high-quality speech. The envelope of the amplitude-modulated noise is estimated pitch-synchronously from the speech signal and then parameterized using a non-linear polynomial fitting algorithm. Finally, an asymmetric triangular window is derived from the non-linear polynomial representation, so that the shape of the energy envelope of the noise is closer to that of the glottal source. In the voice transformation experiments, both the proposed aspiration noise model and an acoustic glottal source model are used to transform a modal voice into a breathy one. Results show that the aspiration noise model improves the voice quality transformation compared with an excitation that uses only the glottal model and with an excitation that combines the glottal source model with a spectral representation of the noise component.
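
To make the idea of an amplitude-modulated Gaussian noise with an asymmetric triangular envelope more concrete, the following Python sketch may help. It is an illustration under stated assumptions, not the authors' implementation: the function names, the peak position of 0.7, the envelope floor of 0.1 and the example pitch-period lengths are all hypothetical, and the envelope estimation and polynomial fitting steps described in the abstract are not reproduced here.

```python
import numpy as np


def asymmetric_triangle(period_len, peak_pos, floor=0.0):
    """Triangular envelope over one pitch period (hypothetical helper).

    peak_pos is the relative position of the peak within the period (0..1);
    floor is the envelope value at both ends of the period.
    """
    if period_len < 3:
        raise ValueError("period_len must be at least 3 samples")
    # Keep the peak strictly inside the period so both slopes exist.
    peak = int(round(peak_pos * (period_len - 1)))
    peak = min(max(peak, 1), period_len - 2)
    rise = np.linspace(floor, 1.0, peak + 1)           # floor -> 1 at the peak
    fall = np.linspace(1.0, floor, period_len - peak)  # 1 -> floor at the end
    return np.concatenate([rise, fall[1:]])            # drop the duplicated peak


def modulated_noise(period_lens, peak_pos=0.7, floor=0.1, seed=0):
    """Gaussian noise modulated pitch-synchronously by a triangular envelope.

    period_lens lists the length in samples of each consecutive pitch period.
    peak_pos and floor are assumed values chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    envelope = np.concatenate(
        [asymmetric_triangle(n, peak_pos, floor) for n in period_lens]
    )
    noise = rng.standard_normal(envelope.size)
    return envelope * noise, envelope


# Three pitch periods of roughly 5 ms at an assumed 16 kHz sampling rate.
signal, envelope = modulated_noise([80, 82, 79])
```

In a complete system the peak position and slopes of the window would presumably be derived per period from the fitted envelope rather than fixed, which is the dependency on the glottal pulse shape that the abstract emphasises.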

Author information

Authors and Affiliations

School of Computer Science and Informatics, University College Dublin, Ireland
João P. Cabral & Julie Carson-Berndsen

Editor information

Editors and Affiliations

TCTS Lab, University of Mons, 31, Boulevard Dolez, 7000, Mons, Belgium
Thomas Drugman
TCTS Lab, University of Mons, 31, Boulevard Dolez, 7000, Mons, Belgium
Thierry Dutoit

About this paper

Cite this paper

Cabral, J.P., Carson-Berndsen, J. (2013). Towards a Better Representation of the Envelope Modulation of Aspiration Noise. In: Drugman, T., Dutoit, T. (eds) Advances in Nonlinear Speech Processing. NOLISP 2013. Lecture Notes in Computer Science, vol 7911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38847-7_9

DOI: https://doi.org/10.1007/978-3-642-38847-7_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38846-0
Online ISBN: 978-3-642-38847-7
eBook Packages: Computer Science, Computer Science (R0)
