Automatic Speech Recognition Using Acoustic Modeling

Joshi, Deepali; Waso, Pratik; Shelke, Rushikesh; Jadhav, Swapnil; Bhale, Kaustubh; Padalkar, Akshada

doi:10.1007/978-981-99-3656-4_11

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 1056)

Included in the following conference series: International Conference on Advances in Data Science and Computing Technologies

Abstract

We propose to implement a traditional automatic speech recognition (ASR) approach within the framework of the deep speech model. The ASR model should recognize speech from Indian English speakers and transcribe it into text, and we compare the speech-to-text accuracy of the pre-trained model with that of our trained model. We extract features from the audio signal and use an acoustic model to produce words. An accent is a characteristic pattern of acoustic features and pronunciation. It can reveal a person's social and linguistic background, and it is a significant source of inter- and intra-speaker variation. An accent-dependent vocabulary or model can therefore be used to improve the accuracy of speech recognition systems. We present an experimental approach to acoustic modeling. As a result, the pre-trained model's learning can be transferred to different accents, allowing the model to learn diverse accents on its own.
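
The comparison between the pre-trained and the trained model can be illustrated with a small sketch. The abstract does not say which accuracy measure is used, so the word error rate (WER) below is an assumption: it is a common choice for speech-to-text evaluation, not necessarily the authors' metric. The function name `word_error_rate` and the example transcripts are hypothetical and invented purely for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far (one fewer than the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts (assumed data, not taken from the paper).
reference = "please switch on the light"
pretrained_output = "please switch of the night"
finetuned_output = "please switch on the light"

wer_pretrained = word_error_rate(reference, pretrained_output)
wer_finetuned = word_error_rate(reference, finetuned_output)
print(f"pre-trained WER: {wer_pretrained:.2f}, trained WER: {wer_finetuned:.2f}")
```

A lower WER means fewer word-level substitutions, insertions, and deletions relative to the reference transcript, so averaging this value over a test set of accented speech gives one way to compare the two models.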

Author information

Authors and Affiliations

Department of Information Technology, Vishwakarma Institute of Technology, Pune, India
Deepali Joshi, Pratik Waso, Rushikesh Shelke, Swapnil Jadhav, Kaustubh Bhale & Akshada Padalkar

Corresponding author

Correspondence to Deepali Joshi.

Editor information

Editors and Affiliations

School of Computing, Madanapalle Institute of Technology and Science, Angallu, Andhra Pradesh, India
Basabi Chakraborty
Kazi Nazrul University, Asansol, West Bengal, India
Arindam Biswas
A.K. Choudhury School of Information Technology, University of Calcutta, Kolkata, West Bengal, India
Amlan Chakrabarti

About this paper

Cite this paper

Joshi, D., Waso, P., Shelke, R., Jadhav, S., Bhale, K., Padalkar, A. (2023). Automatic Speech Recognition Using Acoustic Modeling. In: Chakraborty, B., Biswas, A., Chakrabarti, A. (eds) Advances in Data Science and Computing Technologies. ADSC 2022. Lecture Notes in Electrical Engineering, vol 1056. Springer, Singapore. https://doi.org/10.1007/978-981-99-3656-4_11

DOI: https://doi.org/10.1007/978-981-99-3656-4_11
Published: 30 September 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-3655-7
Online ISBN: 978-981-99-3656-4
eBook Packages: Intelligent Technologies and Robotics (R0)