Digital Formant Synthesis

Part of the book series: Text, Speech and Language Technology (TLTB, volume 8)

Abstract

Attempts to build a talking machine have a long history that can be traced back to before the beginning of the Christian era (Linggard, 1985). The first complete talking machine is due to von Kempelen (1791), who described it in a book of over 400 pages that also reports on the twenty or so years of experimentation needed to build the device (for historical accounts of the development of speech synthesis, see Dudley & Tarnóczy, 1950; Flanagan, 1972; Linggard, 1985; see also Klatt, 1987; and Flanagan & Rabiner, 1973). It was not until the 20th century that speech synthesis became a widespread research endeavour. One reason is that the invention of the telephone created an increasing need to reduce the amount of data in speech transmission without significantly degrading its quality. This was one of the principal motivations for the first electronic speech synthesis system capable of synthesising whole utterances, which was demonstrated publicly at the New York World’s Fair in 1939 and in San Francisco in 1940 (Dudley, 1939; Dudley et al., 1939). Another reason is that mechanical devices that model the vocal tract accurately enough to produce intelligible speech are very difficult to construct, and the advent of electronic instrumentation at the beginning of the 20th century provided a way of synthesising speech without having to copy the action of the vocal organs in detail. Landmarks in the development of speech synthesis systems in the 1950s include the Pattern Playback system of the Haskins Laboratories (Cooper, Liberman, & Borst, 1951), the Parametric Artificial Talker (PAT) by Lawrence (1953), and the Orator Verbis Electris (OVE) system developed by Fant (1953).
More recently, major advances in text-to-speech synthesis have come from two systems: the MITalk text-to-speech system, developed over a number of years by Dennis Klatt at MIT (Allen, Hunnicutt, & Klatt, 1987; Klatt, 1980, 1982, 1987; Klatt & Klatt, 1990), which can synthesise intelligible and natural English speech in different voices and from an unrestricted vocabulary; and the KTH synthesis-by-rule system developed at Stockholm (Carlson & Granström, 1975, 1976; Carlson, Granström, & Hunnicutt, 1982).¹

Notes

  1. Many other kinds of speech synthesis system are available. The most important of these are discussed in the review article by Klatt (1987), which also includes recordings of them. Much material is also currently available on the WWW, on the comp.speech web page (http://vii.speech.cs.cmu.edu:80/comp.speech).

  2. These publications provide some of the background to the Haskins Laboratories articulatory synthesis system; an excellent demonstration of this system (and also of the Pattern Playback system) can be found on their WWW site: http://www.haskins.yale.edu.

  3. This introduces a zero at 0 Hz and at the Nyquist frequency.

Copyright information

© 1999 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Harrington, J., Cassidy, S. (1999). Digital Formant Synthesis. In: Techniques in Speech Acoustics. Text, Speech and Language Technology, vol 8. Springer, Dordrecht. https://doi.org/10.1007/978-94-011-4657-9_7

  • DOI: https://doi.org/10.1007/978-94-011-4657-9_7

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-0-7923-5822-0

  • Online ISBN: 978-94-011-4657-9

  • eBook Packages: Springer Book Archive
