Abstract
Most speech signal analysis procedures for automatic detection of laryngeal pathologies rely on parameters extracted from time-domain processing. Moreover, calculating these parameters often requires prior pitch period estimation, so their validity depends heavily on the robustness of pitch detection. This paper presents an alternative approach based on cepstral-domain processing, which does not require pitch estimation and thus offers a gain in both simplicity and robustness. Although the proposed scheme resembles solutions based on Mel-frequency cepstral parameters already present in the literature, it has an easier physical interpretation and achieves similar performance.
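As a rough illustration of what short-term cepstral processing without pitch estimation can look like, the sketch below computes a plain real cepstrum frame by frame. It is not the parameterisation used in the paper, which the abstract does not specify. The function names, frame length, hop size, number of coefficients, sampling rate, and the synthetic test signal are all assumptions chosen for the example.

```python
import numpy as np


def real_cepstrum(frame: np.ndarray, n_coeffs: int = 20) -> np.ndarray:
    """Return the first n_coeffs real-cepstrum coefficients of one frame."""
    # Taper the frame to reduce spectral leakage (Hamming window is an assumption).
    windowed = frame * np.hamming(len(frame))
    # Log-magnitude spectrum; the small constant avoids log(0).
    log_magnitude = np.log(np.abs(np.fft.rfft(windowed)) + 1e-12)
    # The inverse transform of the log-magnitude spectrum is the real cepstrum.
    cepstrum = np.fft.irfft(log_magnitude, n=len(frame))
    return cepstrum[:n_coeffs]


def short_term_cepstra(signal: np.ndarray, frame_len: int = 1024,
                       hop: int = 512, n_coeffs: int = 20) -> np.ndarray:
    """Slice the signal into overlapping frames and stack their cepstra."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    return np.array([real_cepstrum(signal[s:s + frame_len], n_coeffs)
                     for s in starts])


# Hypothetical input: one second of a synthetic "sustained vowel" built from
# ten harmonics of 120 Hz, sampled at an assumed rate of 25 kHz.
fs = 25000
t = np.arange(fs) / fs
signal = sum(np.sin(2 * np.pi * 120 * k * t) / k for k in range(1, 11))

# One row of cepstral coefficients per frame; no pitch detector is involved.
features = short_term_cepstra(signal)
```

Each row of `features` could then feed a classifier. Every frame is processed identically, with no pitch-synchronous segmentation, which is the property the abstract highlights.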
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fraile, R., Godino-Llorente, J.I., Sáenz-Lechón, N., Osma-Ruiz, V., Gómez-Vilda, P. (2008). Automatic Detection of Laryngeal Pathology on Sustained Vowels Using Short-Term Cepstral Parameters: Analysis of Performance and Theoretical Justification. In: Fred, A., Filipe, J., Gamboa, H. (eds) Biomedical Engineering Systems and Technologies. BIOSTEC 2008. Communications in Computer and Information Science, vol 25. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92219-3_17
DOI: https://doi.org/10.1007/978-3-540-92219-3_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-92218-6
Online ISBN: 978-3-540-92219-3
eBook Packages: Computer Science, Computer Science (R0)