Approach for Spectral Analysis in Detection of Selected Pronunciation Pathologies

  • Conference paper

Innovations in Biomedical Engineering (IBE 2018)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 925)

Abstract

This paper presents a framework for semi-automated detection of selected types of sigmatism. A database of speech recordings was collected containing the sibilant /s/ surrounded by vowels in different articulation phases. Three pronunciation modes were recorded: normal, simulated lateral sigmatism, and simulated interdental sigmatism. The data were collected under the supervision of a speech therapy expert, who also labelled and annotated each database entry. Twenty-eight features of four types were extracted from each time frame within the sibilant: mel-frequency cepstral coefficients, filter bank energies, spectral brightness, and zero-crossing rate. A feature aggregation procedure that weights the influence of time frame location was proposed to describe each phoneme by a single feature vector. At the three-class classification stage, two tools were employed and compared: the random forest and the support vector machine. The latter provides more accurate and repeatable classification results in each articulation phase, with median sensitivity, specificity, and accuracy exceeding 0.71, 0.85, and 0.80, respectively. The results also show that the assessment is generally more effective when the phoneme is located at the beginning or end of the word than in the middle.
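
The idea of describing a phoneme by one location-weighted feature vector can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the weighting function, so the triangular weights that peak at the middle frame are an assumption made only for illustration, and the helper names (`split_frames`, `center_weights`, `aggregate`) are hypothetical. Only the zero-crossing rate is computed per frame here. The other three feature types would enter `aggregate` in the same way.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def split_frames(signal, frame_len, hop):
    """Cut a signal into overlapping frames of equal length."""
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(0, last_start + 1, hop)]


def center_weights(n_frames):
    """Assumed triangular weights (not from the paper), normalised to sum to 1.

    Frames near the middle of the sibilant get the largest weight.
    """
    if n_frames < 1:
        raise ValueError("at least one frame is required")
    mid = (n_frames - 1) / 2
    raw = [1.0 - abs(i - mid) / (mid + 1) for i in range(n_frames)]
    total = sum(raw)
    return [w / total for w in raw]


def aggregate(frame_features):
    """Weighted mean of per-frame feature vectors, giving one vector."""
    weights = center_weights(len(frame_features))
    n_dims = len(frame_features[0])
    return [
        sum(w * feats[d] for w, feats in zip(weights, frame_features))
        for d in range(n_dims)
    ]


if __name__ == "__main__":
    # Toy signal: a slow alternation followed by a fast one.
    signal = [1, 1, -1, -1] * 4 + [1, -1] * 8
    frames = split_frames(signal, frame_len=8, hop=4)
    per_frame = [[zero_crossing_rate(f)] for f in frames]
    print(aggregate(per_frame))
```

The resulting single vector per phoneme is what a classifier such as a random forest or a support vector machine would take as input. A different weighting scheme can be swapped in by replacing `center_weights`.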

Acknowledgements

This research was supported by the Polish Ministry of Science and Silesian University of Technology statutory financial support No. BK-209/RIB1/2018.

Author information

Corresponding author

Correspondence to Michał Kręcichwost.

Copyright information

© 2019 Springer Nature Switzerland AG

Cite this paper

Kręcichwost, M., Rasztabiga, P., Woloshuk, A., Badura, P., Miodońska, Z. (2019). Approach for Spectral Analysis in Detection of Selected Pronunciation Pathologies. In: Tkacz, E., Gzik, M., Paszenda, Z., Piętka, E. (eds) Innovations in Biomedical Engineering. IBE 2018. Advances in Intelligent Systems and Computing, vol 925. Springer, Cham. https://doi.org/10.1007/978-3-030-15472-1_13
