Bengali Spoken Numerals Recognition by MFCC and GMM Technique

Paul, Bachchu; Bera, Somnath; Paul, Rakesh; Phadikar, Santanu

doi:10.1007/978-981-15-8752-8_9

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 709)

Included in the following conference series:

International Conference on Emerging Trends and Advances in Electrical Engineering and Renewable Energy

Abstract

Speech is the standard medium of vocal communication and one of the most comfortable ways for humans to communicate with each other. Speech recognition systems are likewise needed so that people can communicate with computers by voice. English speech recognition already lets us operate English voice-command applications. In rural and semi-urban areas of India, however, many people lack knowledge of English, so automatic speech recognition must be implemented in regional languages. Here, we build a Gaussian Mixture Model (GMM)-based recognition system for isolated spoken numerals in Bengali (also called Bangla), using mel-frequency cepstral coefficients (MFCC) for feature extraction. The proposed system achieved 91.7% correct prediction on a Bangla numeral data set of 1000 audio samples across 10 classes, which is satisfactory relative to previous Bangla spoken digit recognition work.
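The general idea of GMM-based isolated-word recognition can be sketched as follows: fit one GMM per numeral on that numeral's feature frames, then label a new utterance with the class whose model gives the highest average log-likelihood. This is only an illustrative sketch, not the chapter's implementation. The synthetic 13-dimensional frames stand in for real MFCC vectors, and the diagonal covariances, the two mixture components, the iteration count, and all function names (`fit_gmm`, `score`, `classify`, `fake_utterance`) are assumptions made for the example.

```python
import numpy as np

def log_gauss_diag(X, means, variances):
    """Log density of every frame under every diagonal Gaussian: (n_frames, n_components)."""
    diff = X[:, None, :] - means[None, :, :]
    return -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                   + np.sum(np.log(2 * np.pi * variances), axis=1))

def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = a.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))).squeeze(axis)

def fit_gmm(X, n_components=2, n_iter=30, seed=0):
    """Fit a diagonal-covariance GMM to frames X (n_frames, n_dims) with plain EM."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    means = X[rng.choice(n, n_components, replace=False)]
    variances = np.tile(X.var(axis=0) + 1e-3, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each frame.
        log_p = np.log(weights) + log_gauss_diag(X, means, variances)
        resp = np.exp(log_p - logsumexp(log_p, axis=1)[:, None])
        # M-step: re-estimate weights, means, and (floored) variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / n
        means = resp.T @ X / nk[:, None]
        variances = resp.T @ X ** 2 / nk[:, None] - means ** 2 + 1e-3
    return weights, means, variances

def score(X, gmm):
    """Average per-frame log-likelihood of an utterance under one GMM."""
    weights, means, variances = gmm
    return logsumexp(np.log(weights) + log_gauss_diag(X, means, variances), axis=1).mean()

def classify(X, models):
    """Return the label whose GMM scores the utterance highest."""
    return max(models, key=lambda label: score(X, models[label]))

# Synthetic stand-in for MFCC features (assumption): each numeral gets its own
# 13-dimensional mean, and an utterance is a set of noisy frames around it.
rng = np.random.default_rng(42)
class_means = {digit: rng.normal(scale=5.0, size=13) for digit in range(10)}

def fake_utterance(digit, n_frames=40):
    return class_means[digit] + rng.normal(size=(n_frames, 13))

# Train one GMM per numeral on five utterances, then classify one new utterance each.
models = {d: fit_gmm(np.vstack([fake_utterance(d) for _ in range(5)])) for d in range(10)}
predictions = [classify(fake_utterance(d), models) for d in range(10)]
print("accuracy on synthetic data:", np.mean(np.array(predictions) == np.arange(10)))
```

With real recordings, `fake_utterance` would be replaced by an MFCC extractor applied to each audio file. The synthetic classes here are far easier to separate than real speech, so the accuracy this sketch prints says nothing about the 91.7% reported above.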

Author information

Authors and Affiliations

Department of Computer Science, Vidyasagar University, Midnapore, West Bengal, 721102, India
Bachchu Paul, Somnath Bera & Rakesh Paul
Department of Computer Science and Engineering, Maulana Abul Kalam Azad University of Technology, Kolkata, 700064, India
Santanu Phadikar

Corresponding author

Correspondence to Bachchu Paul.

Editor information

Editors and Affiliations

School of Computer Engineering, Kalinga Institute of Industrial Technology (KIIT Deemed to be University), Bhubaneswar, Odisha, India
Pradeep Kumar Mallick
Department of Electrical and Electronics Engineering, Sikkim Manipal Institute of Technology, Rangpo, Sikkim, India
Akash Kumar Bhoi
Division of Information and Communication, Baekseok University, Cheonan, Ch’ungch’ong-namdo, Korea (Republic of)
Gyoo-Soo Chae
Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Chennai, Tamil Nadu, India
Kanak Kalita

About this paper

Cite this paper

Paul, B., Bera, S., Paul, R., Phadikar, S. (2021). Bengali Spoken Numerals Recognition by MFCC and GMM Technique. In: Mallick, P.K., Bhoi, A.K., Chae, GS., Kalita, K. (eds) Advances in Electronics, Communication and Computing. ETAEERE 2020. Lecture Notes in Electrical Engineering, vol 709. Springer, Singapore. https://doi.org/10.1007/978-981-15-8752-8_9

DOI: https://doi.org/10.1007/978-981-15-8752-8_9
Published: 29 January 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-8751-1
Online ISBN: 978-981-15-8752-8
eBook Packages: Computer Science, Computer Science (R0)
