Modification of the Glottal Voice Characteristics Based on Changing the Maximum-Phase Speech Component

  • Martin Vondra
  • Robert Vích
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6800)

Abstract

Voice characteristics are shaped mainly by the vocal cords and the vocal tract. Characteristics known as voice type (normal, breathy, tense, falsetto, etc.) are attributed to the vocal cords. Emotion affects, among other things, muscle tone and thereby also the behavior of the vocal cords. Previous research confirms a strong dependence of emotional speech on glottal flow characteristics. There are several ways to obtain the glottal flow signal from speech. One of them is to decompose speech into its maximum- and minimum-phase components using the complex cepstrum. In this approach, the maximum-phase component is taken to represent the open phase of the glottal flow signal. In this contribution we present experiments in modifying the maximum-phase component of the speech signal with the aim of obtaining synthetic emotional speech.
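
The decomposition named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes NumPy, an even FFT length, and a frame whose spectrum is positive at zero frequency. The function names `complex_cepstrum` and `split_min_max_phase` are hypothetical. Practical details such as windowing and centering the frame on a glottal closure instant are omitted. The causal part of the complex cepstrum is assigned to the minimum-phase component and the anticausal part to the maximum-phase component.

```python
import numpy as np


def complex_cepstrum(x, nfft=1024):
    """Complex cepstrum of a real frame (sketch; nfft assumed even)."""
    X = np.fft.fft(np.asarray(x, dtype=float), nfft)
    log_mag = np.log(np.abs(X) + 1e-12)      # small offset guards against log(0)
    phase = np.unwrap(np.angle(X))           # continuous phase over all bins
    # Remove the linear-phase term (an integer sample delay) so that the
    # phase is periodic and the inverse transform is well defined.
    half = nfft // 2
    delay = np.round(phase[half] / np.pi)
    phase = phase - np.pi * delay * np.arange(nfft) / half
    return np.real(np.fft.ifft(log_mag + 1j * phase))


def split_min_max_phase(x, nfft=1024):
    """Split a frame into minimum- and maximum-phase components."""
    ceps = complex_cepstrum(x, nfft)
    half = nfft // 2
    c_min = np.zeros(nfft)
    c_max = np.zeros(nfft)
    c_min[:half] = ceps[:half]               # gain term and positive quefrencies
    c_max[half + 1:] = ceps[half + 1:]       # negative quefrencies (wrapped)
    # The ambiguous bin at index `half` is left out of both components.
    x_min = np.real(np.fft.ifft(np.exp(np.fft.fft(c_min))))
    x_max = np.real(np.fft.ifft(np.exp(np.fft.fft(c_max))))
    return x_min, x_max


# Toy mixed-phase frame: zero at z = -0.5 (inside the unit circle) and
# zero at z = 1.25 (outside the unit circle).
frame = np.convolve([1.0, 0.5], [-0.8, 1.0])
x_min, x_max = split_min_max_phase(frame)
```

In this sketch the maximum-phase component is anticausal, so its non-zero samples appear at the end of the returned buffer (negative time indices wrap around). Any modification of the glottal open phase along the lines described in the abstract would operate on `c_max` or `x_max` before recombining with the minimum-phase part.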

Keywords

Fast Fourier Transform, Impulse Response, Vocal Cord, Speech Signal, Vocal Tract
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Martin Vondra (1)
  • Robert Vích (1)
  1. Institute of Photonics and Electronics, Academy of Sciences of the Czech Republic, Prague 8, Czech Republic