Phase-Based Methods for Voice Source Analysis

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4885)

Abstract

Voice source analysis is an important but difficult problem in speech processing. In this talk, three aspects of voice source analysis recently developed at LIMSI (Orsay, France) and FPMs (Mons, Belgium) are discussed. In the first part, time-domain and spectral-domain modelling of glottal flow signals is presented. It is shown that the glottal flow can be modelled as an anticausal (maximum-phase) filter before the glottal closing, and as a causal (minimum-phase) filter after the glottal closing. In the second part, taking advantage of this phase structure, the causal and anticausal components of the speech signal are separated according to the location in the Z-plane of the zeros of the Z-transform (ZZT) of the windowed signal. This method is useful for voice source parameter analysis and source-tract deconvolution. Results of a comparative evaluation of the ZZT and linear prediction for source/tract separation are reported. In the third part, glottal closing instant detection using the phase of the wavelet transform is discussed. A method based on the lines of maximum phase in the time-scale plane is proposed. This method is compared to electroglottography (EGG) for robust glottal closing instant analysis.
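
The zero-based separation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `zzt_split`, the toy frame, and the use of NumPy are assumptions made here for illustration, and the windowing and glottal-closure synchronisation that the method relies on in practice are not shown. The sketch only demonstrates the core idea, that the zeros of the Z-transform of a frame lying outside the unit circle are assigned to the anticausal (maximum-phase) component and those inside to the causal (minimum-phase) component.

```python
import numpy as np


def zzt_split(frame):
    """Split the zeros of the Z-transform of a frame by their modulus.

    X(z) = sum_n x[n] z^(-n) has the same nonzero zeros as the polynomial
    x[0] z^(N-1) + ... + x[N-1], whose coefficients are the samples.
    """
    frame = np.asarray(frame, dtype=float)
    zeros = np.roots(frame)
    inside = zeros[np.abs(zeros) <= 1.0]   # causal (minimum-phase) part
    outside = zeros[np.abs(zeros) > 1.0]   # anticausal (maximum-phase) part
    return inside, outside


# Toy frame (hypothetical): built from known zeros so the split can be checked.
min_phase = np.poly([0.5, 0.8 * np.exp(1j * 0.6), 0.8 * np.exp(-1j * 0.6)]).real
max_phase = np.poly([1.5, 1.25 * np.exp(1j * 0.3), 1.25 * np.exp(-1j * 0.3)]).real
frame = np.convolve(min_phase, max_phase)

inside, outside = zzt_split(frame)

# Rebuild each component (up to gain and delay) from its own zeros.
causal_part = np.poly(inside).real
anticausal_part = np.poly(outside).real
print(len(inside), "zeros inside,", len(outside), "zeros outside the unit circle")
```

Because both toy factors are monic, convolving `causal_part` with `anticausal_part` recovers `frame` up to numerical error. For real speech frames, polynomial root finding at high order is numerically delicate, which is one reason the window choice matters.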

Editor information

Mohamed Chetouani Amir Hussain Bruno Gas Maurice Milgram Jean-Luc Zarader

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

d’Alessandro, C. et al. (2007). Phase-Based Methods for Voice Source Analysis. In: Chetouani, M., Hussain, A., Gas, B., Milgram, M., Zarader, JL. (eds) Advances in Nonlinear Speech Processing. NOLISP 2007. Lecture Notes in Computer Science, vol 4885. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77347-4_1

  • DOI: https://doi.org/10.1007/978-3-540-77347-4_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-77346-7

  • Online ISBN: 978-3-540-77347-4

  • eBook Packages: Computer Science, Computer Science (R0)
