Intelligibility of laryngectomees’ substitute speech: automatic speech recognition and subjective rating

  • Phoniatrics
European Archives of Oto-Rhino-Laryngology and Head & Neck

Abstract

Substitute speech after laryngectomy has restricted aero-acoustic properties compared with laryngeal speech and is therefore less intelligible. Until now, no objective means of determining and quantifying intelligibility has existed, although intelligibility can serve as a global outcome parameter of voice restoration after laryngectomy. An automatic speech recognition system was applied to recordings of a standard text read by 18 German male laryngectomees with tracheoesophageal substitute speech. The system had been trained on normal laryngeal speakers and was not adapted to severely disturbed voices. Substitute speech was compared with the laryngeal speech of a control group. A panel of five experts rated intelligibility subjectively, and their ratings were compared with the automatic evaluation. Substitute speech showed a lower speaking rate (syllables/s) and lower word accuracy than laryngeal speech. Automatic speech recognition of substitute speech yielded word accuracies between 10.0 and 50% (28.7 ± 12.1%), with sufficient discrimination between speakers. These results agreed with the experts’ subjective evaluations of intelligibility: the multi-rater kappa of the experts alone did not differ from the multi-rater kappa of the experts together with the recognizer. Automatic speech recognition is therefore a good means of objectifying and quantifying the global speech outcome of laryngectomees. For clinical use, the speech recognition system will be adapted to disturbed voices; it can also be applied to other languages.
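The two measures named in the abstract, word accuracy and multi-rater kappa, can be illustrated with a minimal Python sketch. The abstract does not give the formulas the authors used, so this assumes the conventional definitions: word accuracy as 100 × (N − S − D − I) / N over a minimum-edit word alignment, and a Fleiss-style kappa for a fixed number of raters per speaker. The function names and the example sentences are hypothetical and are not taken from the study.

```python
from collections import Counter


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy in percent: 100 * (N - S - D - I) / N.

    N is the number of reference words; S, D and I are the substitutions,
    deletions and insertions of a minimum-edit word alignment.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = minimum number of edits turning the processed
    # reference prefix into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,              # deletion of a reference word
                curr[j - 1] + 1,          # insertion of a hypothesis word
                prev[j - 1] + (r != h),   # match (0) or substitution (1)
            ))
        prev = curr
    return 100.0 * (len(ref) - prev[-1]) / len(ref)


def fleiss_kappa(ratings: list[list[int]]) -> float:
    """Fleiss-style kappa; ratings[i] holds every rater's label for speaker i."""
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("each speaker needs the same number (>= 2) of ratings")
    totals = Counter()
    agreement = 0.0
    for row in ratings:
        counts = Counter(row)
        totals.update(counts)
        # Share of rater pairs that agree on this speaker.
        agreement += sum(c * (c - 1) for c in counts.values()) / (
            n_raters * (n_raters - 1)
        )
    p_observed = agreement / n_subjects
    # Chance agreement from the overall category proportions.
    p_expected = sum((t / (n_subjects * n_raters)) ** 2 for t in totals.values())
    # Undefined (division by zero) if every rating falls into one category.
    return (p_observed - p_expected) / (1.0 - p_expected)
```

With this definition, word accuracy can become negative when the recognizer inserts many extra words. To compare the experts alone with the experts plus the recognizer, the recognizer's word accuracies would first have to be mapped onto the experts' rating categories and appended as one more rater; the abstract does not state how that mapping was done.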

Acknowledgments

This work was partially supported by the EU in the project PF-Star under grant IST-2001–37599 and by the DFG (Deutsche Forschungsgemeinschaft, German Research Council), SFB 603, subproject B5, and the Deutsche Krebshilfe (registration no. 106266). The authors are responsible for the content of this article. We thank PD Dr. A. Pfahlberg, Institute for Medical Informatics, Biometry and Epidemiology, University of Erlangen, for the helpful suggestions, and Prof. Dr. H. Iro, Department of ENT, University of Erlangen, for supplying the data of the control group.

Author information

Corresponding author

Correspondence to Maria Schuster.

About this article

Cite this article

Schuster, M., Haderlein, T., Nöth, E. et al. Intelligibility of laryngectomees’ substitute speech: automatic speech recognition and subjective rating. Eur Arch Otorhinolaryngol 263, 188–193 (2006). https://doi.org/10.1007/s00405-005-0974-6

  • DOI: https://doi.org/10.1007/s00405-005-0974-6
