Shimmer, Jitter and Complexity Perturbation

  • Chapter
Epoch Synchronous Overlap Add (ESOLA)

Part of the book series: Signals and Communication Technology (SCT)

Abstract

Normal human voice is not perfectly periodic; it is quasi-periodic. Two successive pitch cycles do not produce exactly the same pressure waves. The variations are random in nature and occur in pitch, amplitude and complexity; they are referred to as jitter, shimmer and complexity perturbation (CP) respectively. Their perceptual manifestation is the quality of the sound: an excess of them makes speech harsh, while their absence produces a mechanical, unnatural timbre. Finding the values of shimmer, jitter and CP in natural speech is therefore necessary for making synthesized speech sound more natural. This chapter includes a comprehensive study of these perturbations carried out for Bangla. The goal is to obtain optimum values of the three parameters so that including them in synthesized speech increases its quality, particularly its naturalness. An adequate number of nonsense utterances in CVC form were collected from the native SCB female speaker whose voice is to be used for speech synthesis.
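As a rough illustration of the cycle-to-cycle variation described above, the sketch below computes jitter and shimmer as the mean absolute difference between consecutive pitch periods (or peak amplitudes), expressed as a percentage of the mean value. This is one commonly used "local" definition and is an assumption here; the chapter may use a different measure. The function name and the sample values are hypothetical, and complexity perturbation is left out because the abstract does not define it.

```python
from statistics import mean


def relative_perturbation(values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean.

    Assumed definition (local jitter/shimmer); not taken from the chapter.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


# Hypothetical per-cycle measurements for five successive pitch cycles.
periods_ms = [5.00, 5.05, 4.95, 5.02, 4.98]   # pitch period of each cycle
amplitudes = [0.80, 0.82, 0.78, 0.81, 0.79]   # peak amplitude of each cycle

jitter_percent = relative_perturbation(periods_ms)    # pitch perturbation
shimmer_percent = relative_perturbation(amplitudes)   # amplitude perturbation

print(f"jitter  = {jitter_percent:.2f} %")
print(f"shimmer = {shimmer_percent:.2f} %")
```

A perfectly periodic input gives zero for both measures, which corresponds to the mechanical timbre mentioned in the abstract; larger values correspond to a rougher voice.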


Copyright information

© 2018 Springer Nature Singapore Pte Ltd

About this chapter

Cite this chapter

Datta, A.K. (2018). Shimmer, Jitter and Complexity Perturbation. In: Epoch Synchronous Overlap Add (ESOLA). Signals and Communication Technology. Springer, Singapore. https://doi.org/10.1007/978-981-10-7016-7_6

  • DOI: https://doi.org/10.1007/978-981-10-7016-7_6

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-7015-0

  • Online ISBN: 978-981-10-7016-7

  • eBook Packages: Engineering, Engineering (R0)
