Applications of Deep Learning Approaches in Speech Recognition: A Survey

Ali Al-Janabi, Sameer I.; Lateef, Ali Azawii Abdul

doi:10.1007/978-981-19-0604-6_17

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 394)

Abstract

Automatic speech recognition (ASR) has been a driving force behind a variety of machine learning (ML) techniques, including the ubiquitously used hidden Markov model, discriminative learning, Bayesian learning, adaptive learning, and structured sequence learning. ML, in turn, can use ASR as a large-scale, realistic application for thoroughly testing the viability of a given technique and for motivating new problems that arise from the inherently sequential and dynamic nature of speech. Although ASR is commercially available for some applications, limitations and research gaps remain, and researchers are still working to achieve high accuracy in these systems. Advances in new ML techniques show great promise for advancing ASR technology. This study gives the reader an overview of present-day ML methods as they are used in current ASR and as they are relevant to future ASR systems and research. The goal of the study is to promote more cross-pollination between the ML and ASR communities than has hitherto occurred.

Author information

Authors and Affiliations

College of Islamic Science, University of Anbar, Anbar, Iraq
Sameer I. Ali Al-Janabi
Human Resources Department, University of Anbar, Anbar, Iraq
Ali Azawii Abdul Lateef

Corresponding author

Correspondence to Sameer I. Ali Al-Janabi.

Editor information

Editors and Affiliations

Manchester Metropolitan University, Manchester, UK
Ali Kashif Bashir
University of Calabria, Rende, Italy
Giancarlo Fortino
Maharaja Agrasen Institute of Technology, New Delhi, Delhi, India
Ashish Khanna
Maharaja Agrasen Institute of Technology, New Delhi, Delhi, India
Deepak Gupta

Cite this paper

Ali Al-Janabi, S.I., Lateef, A.A.A. (2022). Applications of Deep Learning Approaches in Speech Recognition: A Survey. In: Bashir, A.K., Fortino, G., Khanna, A., Gupta, D. (eds) Proceedings of International Conference on Computing and Communication Networks. Lecture Notes in Networks and Systems, vol 394. Springer, Singapore. https://doi.org/10.1007/978-981-19-0604-6_17

DOI: https://doi.org/10.1007/978-981-19-0604-6_17
Published: 09 July 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-0603-9
Online ISBN: 978-981-19-0604-6
eBook Packages: Engineering, Engineering (R0)
