Phase-Based Methods for Voice Source Analysis

  • Christophe d’Alessandro
  • Baris Bozkurt
  • Boris Doval
  • Thierry Dutoit
  • Nathalie Henrich
  • Vu Ngoc Tuan
  • Nicolas Sturmel
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4885)


Voice source analysis is an important but difficult problem in speech processing. This talk discusses three aspects of voice source analysis recently developed at LIMSI (Orsay, France) and FPMs (Mons, Belgium). The first part presents time-domain and spectral-domain modelling of glottal flow signals. It is shown that the glottal flow can be modelled as an anticausal (maximum-phase) filter before the glottal closing instant and as a causal (minimum-phase) filter after it. The second part takes advantage of this phase structure: the causal and anticausal components of the speech signal are separated according to the location, in the Z-plane, of the zeros of the Z-transform (ZZT) of the windowed signal. This method is useful for voice source parameter analysis and source-tract deconvolution. Results of a comparative evaluation of the ZZT and linear prediction for source/tract separation are reported. The third part discusses glottal closing instant detection using the phase of the wavelet transform. A method based on the lines of maximum phase in the time-scale plane is proposed and compared to electroglottography (EGG) for robust glottal closing instant analysis.
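
The zero-based separation summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `zzt_split` and the toy frame are hypothetical, the frame is assumed to be already windowed, and the choice of window and its placement relative to the glottal closing instant are left out entirely. The sketch only shows the core idea of finding the zeros of the frame's Z-transform and sorting them by their position relative to the unit circle.

```python
import numpy as np


def zzt_split(frame):
    """Split the zeros of a frame's Z-transform by their position
    relative to the unit circle (hypothetical helper, for illustration).

    frame: 1-D sequence of samples, assumed to be already windowed.
    Returns (outside, inside): zeros with |z| > 1 and zeros with |z| <= 1.
    """
    frame = np.asarray(frame, dtype=float)
    # X(z) = sum_n x[n] z^(-n) = z^(-(N-1)) * (x[0] z^(N-1) + ... + x[N-1]),
    # so the zeros of X(z) are the roots of the polynomial whose
    # coefficients, highest power first, are the samples themselves.
    zeros = np.roots(frame)
    radius = np.abs(zeros)
    # Zeros outside the unit circle are attributed to the anticausal
    # (maximum-phase) component, zeros inside to the causal
    # (minimum-phase) component. Assigning zeros that lie exactly on
    # the circle to the causal group is an arbitrary choice made here.
    return zeros[radius > 1.0], zeros[radius <= 1.0]


if __name__ == "__main__":
    # Toy frame: a minimum-phase factor (zero at 0.5) convolved with
    # a maximum-phase factor (zero at 2.0).
    toy = np.convolve([1.0, -0.5], [1.0, -2.0])
    outside, inside = zzt_split(toy)
    print("anticausal (maximum-phase) zeros:", outside)
    print("causal (minimum-phase) zeros:", inside)
```

`np.roots` computes the roots as eigenvalues of a companion matrix, so its cost grows quickly with frame length; a sketch like this is only practical for short frames of a few pitch periods.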


Keywords: Speech Signal; Vocal Tract; Magnitude Spectrum; Inverse Filter; Voice Source





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Christophe d’Alessandro (1)
  • Baris Bozkurt (2)
  • Boris Doval (1)
  • Thierry Dutoit (3)
  • Nathalie Henrich (4)
  • Vu Ngoc Tuan (1)
  • Nicolas Sturmel (1)

  1. LIMSI-CNRS, Orsay, France
  2. Izmir Institute of Technology, Izmir, Turkey
  3. TCTS-FPMs, Mons, Belgium
  4. DPC-GIPSA-Lab, Grenoble