Intelligibility of laryngectomees’ substitute speech: automatic speech recognition and subjective rating

Schuster, Maria; Haderlein, Tino; Nöth, Elmar; Lohscheller, Jörg; Eysholdt, Ulrich; Rosanowski, Frank

doi:10.1007/s00405-005-0974-6

Phoniatrics
Published: 07 July 2005

Volume 263, pages 188–193 (2006)

European Archives of Oto-Rhino-Laryngology and Head & Neck

Maria Schuster¹,
Tino Haderlein²,
Elmar Nöth²,
Jörg Lohscheller¹,
Ulrich Eysholdt¹ &
Frank Rosanowski¹

Abstract

Substitute speech after laryngectomy is characterized by restricted aero-acoustic properties in comparison with laryngeal speech and therefore has lower intelligibility. Until now, no objective means of determining and quantifying intelligibility has existed, although intelligibility can serve as a global outcome parameter of voice restoration after laryngectomy. An automatic speech recognition system was applied to recordings of a standard text read by 18 German male laryngectomees with tracheoesophageal substitute speech. The system was trained with normal laryngeal speakers and not adapted to severely disturbed voices. Substitute speech was compared to laryngeal speech of a control group. Subjective evaluation of intelligibility was performed by a panel of five experts and compared to automatic speech evaluation. Substitute speech showed a lower speaking rate (syllables/s) and lower word accuracy than laryngeal speech. Automatic speech recognition for substitute speech yielded word accuracy between 10.0 and 50% (28.7±12.1%) with sufficient discrimination. The results agreed with the experts’ subjective evaluations of intelligibility. The multi-rater kappa of the experts alone did not differ from the multi-rater kappa of experts and the recognizer. Automatic speech recognition serves as a good means to objectify and quantify global speech outcome of laryngectomees. For clinical use, the speech recognition system will be adapted to disturbed voices and can also be applied in other languages.
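
The two measures named in the abstract, word accuracy and multi-rater kappa, can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the abstract: word accuracy is taken in its common form, 100 × (N − S − D − I) / N over the N reference words, and the multi-rater kappa is computed as Fleiss' kappa. The function names `word_accuracy` and `fleiss_kappa` are hypothetical, and this is not the authors' implementation.

```python
from collections import Counter


def word_accuracy(reference, hypothesis):
    """Word accuracy in percent, assumed as 100 * (N - S - D - I) / N.

    N is the number of reference words; S, D and I are the substitutions,
    deletions and insertions of a minimal word-level alignment.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference text must contain at least one word")
    # Word-level Levenshtein distance equals the minimal S + D + I.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                         # deletion
                curr[j - 1] + 1,                     # insertion
                prev[j - 1] + (ref_word != hyp_word) # substitution or match
            ))
        prev = curr
    return 100.0 * (len(ref) - prev[-1]) / len(ref)


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of subjects, each rated once by every rater.

    `ratings` holds one list per subject with one category label per rater.
    Undefined (division by zero) if all ratings fall into a single category.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    category_totals = Counter()
    agreement_sum = 0.0
    for row in ratings:
        counts = Counter(row)
        category_totals.update(counts)
        # Share of agreeing rater pairs for this subject.
        agreement_sum += sum(c * (c - 1) for c in counts.values()) / (
            n_raters * (n_raters - 1)
        )
    p_observed = agreement_sum / n_subjects
    p_expected = sum(
        (total / (n_subjects * n_raters)) ** 2
        for total in category_totals.values()
    )
    return (p_observed - p_expected) / (1 - p_expected)
```

Because insertions are counted as errors, word accuracy can become negative for very poor recognition results. Adding the recognizer as a further rater would require mapping word accuracy onto the experts' rating categories; the abstract does not describe that mapping, so it is not sketched here.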

Acknowledgments

This work was partially supported by the EU in the project PF-Star under grant IST-2001–37599 and by the DFG (Deutsche Forschungsgemeinschaft, German Research Council), SFB 603, subproject B5, and the Deutsche Krebshilfe (registration no. 106266). The authors are responsible for the content of this article. We thank PD Dr. A. Pfahlberg, Institute for Medical Informatics, Biometry and Epidemiology, University of Erlangen, for the helpful suggestions, and Prof. Dr. H. Iro, Department of ENT, University of Erlangen, for supplying the data of the control group.

Author information

Authors and Affiliations

Department of Phoniatrics and Pedaudiology, University of Erlangen, Bohlenplatz 21, 91054, Erlangen, Germany
Maria Schuster, Jörg Lohscheller, Ulrich Eysholdt & Frank Rosanowski
Department of Pattern Recognition, University of Erlangen, Erlangen, Germany
Tino Haderlein & Elmar Nöth

Corresponding author

Correspondence to Maria Schuster.

About this article

Cite this article

Schuster, M., Haderlein, T., Nöth, E. et al. Intelligibility of laryngectomees’ substitute speech: automatic speech recognition and subjective rating. Eur Arch Otorhinolaryngol 263, 188–193 (2006). https://doi.org/10.1007/s00405-005-0974-6

Received: 27 December 2004
Accepted: 04 April 2005
Published: 07 July 2005
Issue Date: February 2006
DOI: https://doi.org/10.1007/s00405-005-0974-6
