Abstract
Speech is central to communication because it lets people convey their thoughts, feelings, and ideas in many languages. Multilingual speech is complex, however, and it can be difficult to recognize each word correctly in the language in which it was spoken. Thanks to technological improvements, a range of automatic speech-to-text tools can now convert speech in diverse languages into the required output language, lowering linguistic barriers during communication. This review article gives an overview of the applications, challenges, and methodologies involved in developing multilingual speech-to-text technology. It also considers possible directions for the future advancement of this technology. Overall, the review emphasizes the critical role that multilingual speech-to-text technology may play in breaking down language barriers and facilitating cross-linguistic communication.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Jani, M.M., Panchal, S.R., Patel, H.H., Raiyani, A. (2024). Multilingual Speech Recognition: An In-Depth Review of Applications, Challenges, and Future Directions. In: Sharma, H., Shrivastava, V., Tripathi, A.K., Wang, L. (eds) Communication and Intelligent Systems. ICCIS 2023. Lecture Notes in Networks and Systems, vol 968. Springer, Singapore. https://doi.org/10.1007/978-981-97-2079-8_1
DOI: https://doi.org/10.1007/978-981-97-2079-8_1
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-2078-1
Online ISBN: 978-981-97-2079-8
eBook Packages: Intelligent Technologies and Robotics (R0)