Abstract
We propose to implement automatic speech recognition (ASR) within the framework of the Deep Speech model. The ASR model should be able to detect Indian English speakers and transcribe their speech into text. We then compare the speech-to-text accuracy of the pre-trained model with that of our trained model. We extract features from audio signals and use an acoustic model to produce words. An accent is a characteristic pattern of acoustic features and pronunciation. It can reveal a person's social and linguistic background, and it is a significant source of inter- and intra-speaker variation. An accent-dependent vocabulary or model can therefore be used to improve the accuracy of speech recognition systems. We present an experimental approach to acoustic modeling. As a result, the pre-trained model's learning can be transferred to different accents, allowing the model to learn diverse accents on its own.
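The abstract does not say which metric is used to compare the pre-trained and trained models. A common choice for speech-to-text accuracy is word error rate (WER), so the following is a minimal sketch under that assumption. The function name `word_error_rate` and the example transcripts are hypothetical and are not taken from the chapter.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: WER is the accuracy measure; the source does not name one.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up transcripts for illustration only, not experimental results.
    reference = "please schedule the meeting for tomorrow"
    pretrained_output = "please shed you the meeting tomorrow"
    finetuned_output = "please schedule the meeting for tomorrow"
    print("pre-trained WER:", word_error_rate(reference, pretrained_output))
    print("fine-tuned WER:", word_error_rate(reference, finetuned_output))
```

A lower WER means fewer word-level substitutions, deletions, and insertions relative to the reference transcript, so the two models can be compared on the same set of accented utterances.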
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Joshi, D., Waso, P., Shelke, R., Jadhav, S., Bhale, K., Padalkar, A. (2023). Automatic Speech Recognition Using Acoustic Modeling. In: Chakraborty, B., Biswas, A., Chakrabarti, A. (eds) Advances in Data Science and Computing Technologies. ADSC 2022. Lecture Notes in Electrical Engineering, vol 1056. Springer, Singapore. https://doi.org/10.1007/978-981-99-3656-4_11
DOI: https://doi.org/10.1007/978-981-99-3656-4_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-3655-7
Online ISBN: 978-981-99-3656-4
eBook Packages: Intelligent Technologies and Robotics (R0)