A Novel Deep Learning Based Nepali Speech Recognition

Joshi, Basanta; Bhatta, Bharat; Panday, Sanjeeb Prasad; Maharjan, Ram Krishna

doi:10.1007/978-981-19-1677-9_39

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 894)

Included in the following conference series:

International Conference on Electrical and Electronics Engineering

Abstract

Automatic speech recognition (ASR) lets people interact with a computer interface by voice. Nepali speech recognition converts spoken Nepali into the corresponding text in Devanagari script (lipi). This work proposes a novel approach to building a Nepali speech recognition model based on a CNN-GRU architecture. The data is collected from OpenSLR. The collected data is pre-processed, and Mel-frequency cepstral coefficients (MFCC) are extracted from it as input features. The CNN-GRU model learns higher-level representations from these features and serves as the acoustic model, while connectionist temporal classification (CTC) is used for decoding. The performance of the developed model is assessed using the word error rate (WER) of the transcribed text.
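The two steps at the end of this pipeline, CTC decoding and Word Error Rate scoring, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the choice of greedy (best-path) decoding, the blank index of 0, and the sample sentences are all assumptions made for illustration.

```python
def ctc_greedy_collapse(frame_ids, blank=0):
    """Collapse per-frame best labels: merge consecutive repeats, then drop blanks.

    `blank=0` is an assumed index for the CTC blank symbol.
    """
    out = []
    prev = None
    for label in frame_ids:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out


def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # At the start of iteration i, prev[j] is the edit distance
    # between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical frame-level label ids; the blank between the two 1s keeps both.
    print(ctc_greedy_collapse([0, 1, 1, 0, 1, 2, 2, 0]))
    # Illustrative Devanagari sentences, one substituted word out of four.
    print(word_error_rate("मेरो नाम राम हो", "मेरो नाम श्याम हो"))
```

In this sketch a blank between two identical labels preserves both, which is how CTC represents repeated characters, and WER can exceed 1.0 when the hypothesis contains many insertions.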

Acknowledgement

This work has been supported by the University Grants Commission, Nepal under a Faculty Research Grant (UGC Award No. FRG-76/77-Engg-1) for the research project “Preparation of Nepali Speech Corpus: Step towards Efficient Nepali Speech Processing”.

Author information

Authors and Affiliations

Pulchowk Campus, Institute of Engineering, Tribhuvan University, Lalitpur, Nepal
Basanta Joshi, Bharat Bhatta, Sanjeeb Prasad Panday & Ram Krishna Maharjan

Corresponding author

Correspondence to Sanjeeb Prasad Panday.

Editor information

Editors and Affiliations

School of Science, Computing and Engineering Technologies, Swinburne University of Technology, Hawthorn, VIC, Australia
Saad Mekhilef
Office of the International Relations, Bharath Institute of Higher Education and Research, Chennai, India
Rabindra Nath Shaw
Department of Management and Innovation Systems, University of Salerno, Fisciano, Salerno, Italy
Pierluigi Siano

About this paper

Cite this paper

Joshi, B., Bhatta, B., Panday, S.P., Maharjan, R.K. (2022). A Novel Deep Learning Based Nepali Speech Recognition. In: Mekhilef, S., Shaw, R.N., Siano, P. (eds) Innovations in Electrical and Electronic Engineering. ICEEE 2022. Lecture Notes in Electrical Engineering, vol 894. Springer, Singapore. https://doi.org/10.1007/978-981-19-1677-9_39

DOI: https://doi.org/10.1007/978-981-19-1677-9_39
Published: 14 April 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-1676-2
Online ISBN: 978-981-19-1677-9
eBook Packages: Energy, Energy (R0)
