Time-Domain Representation of Phones

  • Chapter
Time Domain Representation of Speech Sounds

Abstract

The three basic groups of speech signals (quasi-periodic, quasi-random, and quiescent) are defined by time period, amplitude, and complexity. The time-domain representation of the first two parameters has long been textbook material. For complexity, on the other hand, the spectral-domain representation is so well studied that it, too, has become almost standard textbook knowledge. Traditionally, phones are divided into vowels (which also include glides, trills, and laterals) and consonants (plosives, affricates, and sibilants). The quasi-periodicity of vowels arises from the flapping of the mucosal cover, which introduces nonlinear dynamics. This generates random perturbations, for example in time period (jitter), in amplitude (shimmer), and in complexity (complexity perturbation). Five time-domain parameters are introduced for the classification of vowel sounds. The articulatory mechanisms for generating different Bangla consonants, and the resulting signatures used for labeling them, are discussed in detail. The developed algorithm is tested on a Standard Colloquial Bengali (SCB) database containing speech signals for 850 sentences spoken by 12 native Bangla informants of both sexes, all aged 20–50 years. A detailed scheme for labeling speech signals by manner class, with a 94% recognition rate, is described. The classes are: S (sibilants), P (unaspirated plosives), F (aspirated plosives), A (voiced plosives and affricates), L (laterals and nasal murmurs), and V (vowels and glides). This labeling is used to partition the Bangla pronunciation dictionary into well-defined cohorts. The properties of these cohorts and their usefulness in ASR are discussed. It is shown that expert systems can be generated automatically for each cohort, ultimately leading to a 95% recognition rate in ASR using only vowel recognition. A section is devoted to time-domain features for identifying vowels. These features showed a potential recognition rate of almost 85% in the all-vowel case.
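
As an illustration of the jitter and shimmer perturbations mentioned in the abstract, the following is a minimal sketch in Python. It assumes the widely used "local" definition (mean absolute difference between consecutive cycles, divided by the mean value), which may differ from the definitions used in the chapter. The function name `local_perturbation` and the per-cycle measurements are hypothetical, chosen only for the example. Complexity perturbation is not attempted, since the abstract does not define the complexity measure.

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute cycle-to-cycle difference, relative to the mean value.

    Assumed 'local' definition; not necessarily the chapter's own.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    # Absolute difference between each cycle and the one that follows it
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


# Hypothetical per-cycle measurements for a sustained vowel
periods_ms = [8.0, 8.2, 7.9, 8.1, 8.0]            # pitch period of each cycle
peak_amplitudes = [0.50, 0.52, 0.49, 0.51, 0.50]  # peak amplitude of each cycle

jitter = local_perturbation(periods_ms)        # perturbation in time period
shimmer = local_perturbation(peak_amplitudes)  # perturbation in amplitude

print(f"jitter  = {jitter:.2%}")
print(f"shimmer = {shimmer:.2%}")
```

A perfectly periodic signal gives zero for both measures under this definition; a quasi-periodic vowel gives small nonzero values.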

Author information

Corresponding author

Correspondence to Asoke Kumar Datta.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Datta, A.K. (2018). Time-Domain Representation of Phones. In: Time Domain Representation of Speech Sounds. Springer, Singapore. https://doi.org/10.1007/978-981-13-2303-4_5

  • DOI: https://doi.org/10.1007/978-981-13-2303-4_5

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-2302-7

  • Online ISBN: 978-981-13-2303-4

  • eBook Packages: Computer Science, Computer Science (R0)
