Development of Standard Yorùbá speech-to-text system using HTK

Adetunmbi, O. A.; Obe, O. O.; Iyanda, J. N.

doi:10.1007/s10772-016-9380-2

Published: 20 October 2016

Volume 19, pages 929–944, (2016)

International Journal of Speech Technology

O. A. Adetunmbi¹,
O. O. Obe¹ &
J. N. Iyanda²

Abstract

This paper presents the design and implementation of a Standard Yorùbá speech-to-text system that recognizes isolated words spoken by users, based on previously stored data. The system adopts a syllable-based approach: carefully selected words were recorded, analyzed and annotated using the Praat software. The experimental database comprises six native speakers, each speaking 25 bi-syllabic and 25 tri-syllabic words in an acoustically controlled room. Spectral features were extracted as Mel-frequency cepstral coefficients (MFCCs), and the Hidden Markov Model Toolkit (HTK) was used to implement the system. A graphical user interface was also developed to make the system more accessible and interactive. The system was tested and evaluated based on the perception of native speakers of the language. The overall accuracy was 76% for bi-syllabic words and 84% for tri-syllabic words. These results suggest that the approach is promising and could be extended to a continuous speech recognition system for Standard Yorùbá, which would also make the system usable by foreign speakers.
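As a rough illustration of the Mel-frequency cepstral coefficient extraction mentioned in the abstract, the following is a minimal NumPy sketch of a conventional MFCC pipeline (pre-emphasis, framing, Hamming window, power spectrum, mel filterbank, log, DCT). It is not the authors' HTK configuration, which the abstract does not give. The function names and all parameter values (16 kHz sampling, 25 ms frames, 10 ms step, 26 filters, 12 coefficients, 0.97 pre-emphasis) are assumptions chosen as common defaults.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel-scale formula (an assumption; other variants exist).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters with centers spaced evenly on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising slope
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling slope
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def dct2(x, n_out):
    # Unnormalized DCT-II along the last axis, keeping the first n_out terms.
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return x @ basis.T

def mfcc(signal, sample_rate=16000, frame_len=0.025, frame_step=0.010,
         n_fft=512, n_filters=26, n_ceps=12):
    # Assumes the signal is at least one frame long.
    signal = np.asarray(signal, dtype=float)
    # Pre-emphasis boosts high frequencies.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    flen = int(round(frame_len * sample_rate))
    fstep = int(round(frame_step * sample_rate))
    n_frames = 1 + max(0, (len(emphasized) - flen) // fstep)
    # Slice the signal into overlapping frames and apply a Hamming window.
    idx = np.arange(flen)[None, :] + fstep * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(flen)
    # Power spectrum of each frame (zero-padded to n_fft).
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Mel filterbank energies, floored to avoid log(0).
    energies = np.maximum(power @ mel_filterbank(n_filters, n_fft, sample_rate).T, 1e-10)
    # Decorrelate log energies with a DCT and drop the 0th coefficient.
    return dct2(np.log(energies), n_ceps + 1)[:, 1:]
```

Calling `mfcc(x)` on a one-second signal `x` sampled at 16 kHz yields one 12-dimensional feature vector per 10 ms frame. In an HTK-based system such as the one described, feature extraction would normally be performed by the toolkit itself; this sketch only shows the kind of computation involved.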

Funding

Self-sponsored.

Author information

Authors and Affiliations

Department of Computer Science, Federal University of Technology Akure, Akure, Ondo State, Nigeria
O. A. Adetunmbi & O. O. Obe
Department of Computer Science, Joseph Ayo Babalola University, Ikeji-Arakeji, Osun State, Nigeria
J. N. Iyanda

Corresponding author

Correspondence to J. N. Iyanda.

About this article

Cite this article

Adetunmbi, O.A., Obe, O.O. & Iyanda, J.N. Development of Standard Yorùbá speech-to-text system using HTK. Int J Speech Technol 19, 929–944 (2016). https://doi.org/10.1007/s10772-016-9380-2

Received: 27 May 2016
Accepted: 05 October 2016
Published: 20 October 2016
Issue Date: December 2016
DOI: https://doi.org/10.1007/s10772-016-9380-2
