Abstract
Automated speech recognition (ASR) has been a driving force behind a variety of machine learning (ML) techniques, including the ubiquitously used discriminative learning, Bayesian learning, hidden Markov models, adaptive learning, and structured sequence learning. ML, in turn, can use ASR as a large-scale, realistic application to thoroughly test the viability of a given technique and to motivate new problems arising from the inherently sequential and dynamic nature of speech. Although ASR is commercially available for some applications, limitations and research gaps remain, and researchers are still working toward high accuracy in these systems. Advances in new ML techniques show great promise for advancing ASR technology. This study gives readers an overview of present-day ML methods as used in current ASR and as relevant to future ASR systems and research. The goal is to promote more cross-pollination between the ML and ASR communities than has hitherto occurred.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Ali Al-Janabi, S.I., Lateef, A.A.A. (2022). Applications of Deep Learning Approaches in Speech Recognition: A Survey. In: Bashir, A.K., Fortino, G., Khanna, A., Gupta, D. (eds) Proceedings of International Conference on Computing and Communication Networks. Lecture Notes in Networks and Systems, vol 394. Springer, Singapore. https://doi.org/10.1007/978-981-19-0604-6_17
DOI: https://doi.org/10.1007/978-981-19-0604-6_17
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-0603-9
Online ISBN: 978-981-19-0604-6
eBook Packages: Engineering, Engineering (R0)