
Text-to-Speech Synthesis with Dynamic Control of Source Parameters

  • Chapter
Progress in Speech Synthesis

Abstract

This chapter describes a study of some characteristics of source-parameter dynamics, carried out to derive a preliminary set of rules that were integrated into text-to-speech (TTS) systems. An automated procedure estimated the source parameters of 534 seconds of voiced speech from a set of 300 English sentences spoken by a single female speaker. The results showed a strong correlation between the values of the source parameters at the vowel midpoint and the vowel duration. The same parameters tend to decrease on vowel onsets and to increase on vowel offsets. This suggests that these parameters are prosodic in nature and require special treatment in concatenative TTS systems that use source modification techniques, such as pitch-synchronous overlap-add (PSOLA) and multipulse.
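The abstract reports a strong correlation between midpoint source-parameter values and vowel duration but does not say which correlation measure was used. As a minimal sketch, assuming a Pearson correlation coefficient, the computation could look like the following. The function name `pearson_r`, the variable names, and the numbers are hypothetical placeholders for illustration, not data or code from the chapter.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values for illustration only; NOT measurements from the chapter.
durations_ms = [60.0, 85.0, 110.0, 140.0, 180.0]   # hypothetical vowel durations
midpoint_param = [0.42, 0.47, 0.55, 0.58, 0.66]    # hypothetical midpoint values

r = pearson_r(durations_ms, midpoint_param)
print(f"r = {r:.3f}")
```

A value of `r` near 1 or -1 would indicate a strong linear relationship between the two measurements; the sign and size of the effect in the actual study are not given in the abstract.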




Copyright information

© 1997 Springer Science+Business Media New York

About this chapter

Cite this chapter

Oliveira, L.C. (1997). Text-to-Speech Synthesis with Dynamic Control of Source Parameters. In: van Santen, J.P.H., Olive, J.P., Sproat, R.W., Hirschberg, J. (eds) Progress in Speech Synthesis. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-1894-4_3


  • DOI: https://doi.org/10.1007/978-1-4612-1894-4_3

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4612-7328-8

  • Online ISBN: 978-1-4612-1894-4

  • eBook Packages: Springer Book Archive
