Phase-Based Methods for Voice Source Analysis

d’Alessandro, Christophe; Bozkurt, Baris; Doval, Boris; Dutoit, Thierry; Henrich, Nathalie; Tuan, Vu Ngoc; Sturmel, Nicolas

doi:10.1007/978-3-540-77347-4_1

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4885)

Abstract

Voice source analysis is an important but difficult problem in speech processing. This talk discusses three aspects of voice source analysis recently developed at LIMSI (Orsay, France) and FPMs (Mons, Belgium). The first part presents time-domain and spectral-domain modelling of glottal flow signals. It is shown that the glottal flow can be modelled as an anticausal (maximum-phase) filter before the glottal closing and as a causal (minimum-phase) filter after it. The second part takes advantage of this phase structure: the causal and anticausal components of the speech signal are separated according to the location in the Z-plane of the zeros of the Z-transform (ZZT) of the windowed signal. This method is useful for voice source parameter analysis and source-tract deconvolution. Results of a comparative evaluation of the ZZT and linear prediction for source/tract separation are reported. The third part discusses glottal closing instant detection using the phase of the wavelet transform. A method based on the lines of maximum phase in the time-scale plane is proposed and compared to electroglottography (EGG) for robust glottal closing instant analysis.
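The separation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `zzt_split` and the synthetic test frame are assumptions made for illustration, and the choice and placement of the analysis window, which the abstract does not detail, are left out. The sketch only shows the core idea: the zeros of the Z-transform of a finite frame are the roots of the polynomial whose coefficients are the samples, and they can be grouped by whether they lie outside or inside the unit circle.

```python
import numpy as np

def zzt_split(frame):
    """Split the zeros of the Z-transform of a finite frame by modulus.

    Hypothetical helper for illustration only.
    X(z) = sum_n x[n] z^(-n) = z^-(N-1) * (x[0] z^(N-1) + ... + x[N-1]),
    so the zeros of X(z) are the roots of the polynomial whose
    coefficients (highest degree first) are the samples x[0..N-1].
    """
    x = np.asarray(frame, dtype=float)
    zeros = np.roots(x)
    outside = zeros[np.abs(zeros) > 1.0]   # maximum-phase (anticausal) zeros
    inside = zeros[np.abs(zeros) <= 1.0]   # minimum-phase (causal) zeros
    return outside, inside

# Synthetic frame built from known zeros (an assumption, not real speech):
# three zeros inside the unit circle and three outside it.
true_zeros = [0.5, 0.8j, -0.8j, 2.0, 1.5 + 1.5j, 1.5 - 1.5j]
frame = np.real(np.poly(true_zeros))

outside, inside = zzt_split(frame)

# Each group of zeros defines one factor of the frame; the product of the
# two factors, scaled by the first sample, gives back the original frame.
anticausal_factor = np.real(np.poly(outside))
causal_factor = np.real(np.poly(inside))
rebuilt = frame[0] * np.convolve(anticausal_factor, causal_factor)
```

On real speech the frame would be a windowed segment of the signal, and the quality of the separation depends strongly on the window; this sketch makes no claim about which window is appropriate.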

Author information

Authors and Affiliations

LIMSI-CNRS Orsay, France
Christophe d’Alessandro, Boris Doval, Vu Ngoc Tuan & Nicolas Sturmel
Izmir Institute of Technology, Izmir, Turkey
Baris Bozkurt
TCTS-FPMs, Mons, Belgium
Thierry Dutoit
DPC-GIPSA-Lab, Grenoble
Nathalie Henrich

Editor information

Mohamed Chetouani, Amir Hussain, Bruno Gas, Maurice Milgram, Jean-Luc Zarader

Cite this paper

d’Alessandro, C. et al. (2007). Phase-Based Methods for Voice Source Analysis. In: Chetouani, M., Hussain, A., Gas, B., Milgram, M., Zarader, J.-L. (eds) Advances in Nonlinear Speech Processing. NOLISP 2007. Lecture Notes in Computer Science, vol 4885. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77347-4_1

DOI: https://doi.org/10.1007/978-3-540-77347-4_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77346-7
Online ISBN: 978-3-540-77347-4
eBook Packages: Computer Science, Computer Science (R0)
