Abstract
This study addresses the speech-related communication barriers faced by individuals with cerebral palsy, with the goal of using technology to assist with or alleviate difficulties in oral communication. To this end, the study plans to analyze and test mainstream speech recognition services and platforms on the market to assess how well they currently recognize the speech of individuals with cerebral palsy, and to explore whether they can help resolve these communication problems, thereby improving quality of life and supporting social interaction. The study is particularly meaningful to the author, who has congenital cerebral palsy: congenital brain damage affecting the nervous system has made his speech unclear and seriously impaired his ability to express himself orally. The author therefore plans to record a dataset of speech samples from individuals with cerebral palsy, covering conversations from various aspects of daily life. The dataset will be tested and analyzed with mainstream speech recognition services such as Google, Microsoft, and YaTing, in order to identify the difficulties current speech recognition technology has with the speech of individuals with cerebral palsy and to propose potential solutions to oral communication barriers. The hope is that this research will contribute to the development of mature assistive technologies for individuals with communication difficulties in the near future.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wu, YR., Hung, J.C., Chang, JW. (2024). A Survey of Speech Recognition for People with Cerebral Palsy. In: Hung, J.C., Yen, N., Chang, JW. (eds) Frontier Computing on Industrial Applications Volume 4. FC 2023. Lecture Notes in Electrical Engineering, vol 1134. Springer, Singapore. https://doi.org/10.1007/978-981-99-9342-0_28
DOI: https://doi.org/10.1007/978-981-99-9342-0_28
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-9341-3
Online ISBN: 978-981-99-9342-0
eBook Packages: Engineering, Engineering (R0)