Applications of Deep Learning Approaches in Speech Recognition: A Survey

  • Conference paper
Proceedings of International Conference on Computing and Communication Networks

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 394)

Abstract

Automated speech recognition (ASR) has been a driving force for a variety of machine learning (ML) techniques, including the widely used discriminative learning, Bayesian learning, hidden Markov models, adaptive learning, and structured sequence learning. Because ASR is a large-scale, realistic application, ML can use it to test the viability of a given technique thoroughly and to motivate new problems arising from the inherently sequential and dynamic nature of speech. Although ASR is commercially available for some applications, limitations and research gaps remain, and researchers are still working to achieve high accuracy in these systems. Advances in new ML techniques show great promise for advancing ASR technology. This study gives the reader an overview of present-day ML methods that are relevant to, and significant for, current and future ASR systems and research. The goal of the study is to promote more cross-pollination between the ML and ASR communities than has hitherto occurred.

Author information

Corresponding author

Correspondence to Sameer I. Ali Al-Janabi.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Ali Al-Janabi, S.I., Lateef, A.A.A. (2022). Applications of Deep Learning Approaches in Speech Recognition: A Survey. In: Bashir, A.K., Fortino, G., Khanna, A., Gupta, D. (eds) Proceedings of International Conference on Computing and Communication Networks. Lecture Notes in Networks and Systems, vol 394. Springer, Singapore. https://doi.org/10.1007/978-981-19-0604-6_17

  • DOI: https://doi.org/10.1007/978-981-19-0604-6_17

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-0603-9

  • Online ISBN: 978-981-19-0604-6

  • eBook Packages: Engineering, Engineering (R0)
