Abstract
An adaptive formulation of the long-term bi-directional linear predictive analysis is proposed in the context of the acoustic assessment of disordered speech. Vocal dysperiodicities are summarized by means of a signal-to-dysperiodicity ratio (SDR) marker. It is shown that performing an adaptive forward and backward long-term linear prediction of each speech sample and retaining the minimal prediction error energy as a cue of vocal dysperiodicity results in an SDR that correlates with the perceived degree of hoarseness. The coefficients of the time-varying long-term linear predictive model are estimated by means of the recursive least squares algorithm. The corpora comprise sustained vowels and French sentences produced by male and female normophonic and dysphonic speakers. A perceptual assessment of speech samples, which rests on comparative judgments, is used to evaluate the ability of the acoustic marker to predict subjective measures of voice quality. Experimental results show that the adaptive approach gives rise to high correlations for sustained vowels as well as for sentences.
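The procedure summarized above can be illustrated with a minimal sketch. It assumes a multi-tap long-term predictor whose taps span a fixed window of lags, adapted by a standard exponentially weighted recursive least squares (RLS) update. The function names, the lag window (`lag_min`, `n_taps`), the forgetting factor and the initialization constant `delta` are illustrative assumptions, not the settings used by the authors. Backward prediction is obtained here by running the same predictor on the time-reversed signal.

```python
import numpy as np


def rls_long_term_error(x, lag_min, n_taps, forgetting=0.99, delta=100.0):
    """A priori error of an RLS-adapted long-term predictor.

    x[n] is predicted from x[n - lag_min], ..., x[n - lag_min - n_taps + 1].
    Samples without enough history keep an error equal to the sample itself.
    """
    x = np.asarray(x, dtype=float)
    w = np.zeros(n_taps)              # time-varying predictor coefficients
    P = delta * np.eye(n_taps)        # inverse correlation matrix estimate
    err = x.copy()
    start = lag_min + n_taps - 1
    for n in range(start, len(x)):
        # Distant past samples, most recent (lag_min) first.
        u = x[n - lag_min - n_taps + 1:n - lag_min + 1][::-1]
        Pu = P @ u
        gain = Pu / (forgetting + u @ Pu)
        e = x[n] - w @ u              # prediction error before the update
        w = w + gain * e
        P = (P - np.outer(gain, Pu)) / forgetting
        err[n] = e
    return err


def signal_to_dysperiodicity_ratio(x, lag_min, n_taps, forgetting=0.99):
    """SDR in dB, keeping the smaller of the forward and backward
    squared prediction errors at each sample (illustrative definition)."""
    x = np.asarray(x, dtype=float)
    e_fwd = rls_long_term_error(x, lag_min, n_taps, forgetting)
    # Backward prediction: predict each sample from later samples.
    e_bwd = rls_long_term_error(x[::-1], lag_min, n_taps, forgetting)[::-1]
    e_min = np.minimum(e_fwd ** 2, e_bwd ** 2)
    eps = np.finfo(float).tiny        # guards against division by zero
    return 10.0 * np.log10(np.sum(x ** 2) / (np.sum(e_min) + eps))
```

In this sketch the lag window would be chosen to cover the expected glottal cycle length in samples. Under these assumptions, a more regular signal should yield a larger SDR than the same signal with added noise.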
Acknowledgements
The authors would like to thank Prof. J. Schoentgen, National Fund for Scientific Research, Belgium, for useful comments and discussions, and the anonymous reviewers for their helpful advice.
Cite this article
Kacha, A., Bettens, F. & Grenez, F. Vocal dysperiodicities estimation by means of adaptive long-term prediction. Med Bio Eng Comput 44, 61–68 (2006). https://doi.org/10.1007/s11517-005-0003-3