Linear Discriminant Analysis Based Approach for Automatic Speech Recognition of Urdu Isolated Words

Ali, Hazrat; Ahmad, Nasir; Zhou, Xianwei; Ali, Muhammad; Manjotho, Ali Asghar

doi:10.1007/978-3-319-10987-9_3

Part of the book series: Communications in Computer and Information Science (CCIS, volume 414)

Included in the following conference series:

International Multi Topic Conference

Abstract

Urdu is among the five largest languages of the world and is especially important because it shares its vocabulary with several other languages of the Indo-Pak subcontinent. However, there has not been any significant research on Automatic Speech Recognition (ASR) of Urdu. This paper presents a statistics-based classification technique for Automatic Speech Recognition of isolated words in Urdu. For each isolated word, 52 Mel Frequency Cepstral Coefficients have been extracted, and classification based on these coefficients has been performed using Linear Discriminant Analysis. As a prototype, the system has been trained with audio samples of seven speakers, including male and female, native and non-native, and speakers of different ages, while testing has been done using audio samples of three speakers. It was determined that the majority of words exhibit a percentage error of less than 33 %. Words with 100 % error were declared to be bad words. The work reported in this paper may serve as a strong baseline for future research on Urdu ASR, especially for continuous speech recognition of Urdu.
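The per-word evaluation described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `per_word_error` and `bad_words` are hypothetical, the word labels are placeholders rather than entries from the paper's corpus, and the predictions are invented solely to exercise the arithmetic. The sketch assumes that the percentage error of a word is the share of its test utterances that the classifier labelled incorrectly.

```python
from collections import defaultdict


def per_word_error(true_words, predicted_words):
    """Return {word: percentage error} over all test utterances of that word."""
    totals = defaultdict(int)   # number of test utterances per true word
    errors = defaultdict(int)   # number of misclassified utterances per true word
    for truth, guess in zip(true_words, predicted_words):
        totals[truth] += 1
        if guess != truth:
            errors[truth] += 1
    return {word: 100.0 * errors[word] / totals[word] for word in totals}


def bad_words(error_by_word):
    """Words that were misclassified in every test utterance (100 % error)."""
    return sorted(word for word, err in error_by_word.items() if err == 100.0)


# Hypothetical labels: three test utterances per word, one per test speaker.
truth = ["aik", "aik", "aik", "do", "do", "do", "teen", "teen", "teen"]
guess = ["aik", "aik", "aik", "do", "aik", "do", "aik", "do", "do"]

errors = per_word_error(truth, guess)
for word in sorted(errors):
    print(f"{word}: {errors[word]:.1f} % error")
print("bad words:", bad_words(errors))
```

With one utterance per test speaker, a word's error in this sketch can only take the values 0, 33.3, 66.7 or 100 %; how the paper aggregates errors across utterances is not stated in the abstract.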

Acknowledgment

The authors thank the supporting staff of the Department of Electrical Engineering, University of Engineering and Technology, Peshawar, Pakistan, who kept the Computer Lab open for extra hours and gave the authors the opportunity to use it. The authors also extend their gratitude to Engr. Salman Ilahi, Department of Electrical Engineering, and Engr. Irfan Ahmad, Department of Industrial Engineering, UET Peshawar, for their valuable input and suggestions throughout this work.

Author information

Authors and Affiliations

Department of Communication Engineering, School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, 10083, China
Hazrat Ali & Xianwei Zhou
Department of Computer Systems Engineering, University of Engineering and Technology Peshawar, Peshawar, 25120, Pakistan
Nasir Ahmad
Department of Electrical and Computer Engineering, North Dakota State University, Fargo, ND, 58108-6050, USA
Muhammad Ali
Department of Computer Systems Engineering, Mehran University of Engineering and Technology, Jamshoro, Pakistan
Ali Asghar Manjotho
Machine Learning Group, School of Informatics, City University London, London, EC1V 0HB, UK
Hazrat Ali

Corresponding author

Correspondence to Hazrat Ali.

Editor information

Editors and Affiliations

University of Umm Al-Qura, Makkah, Saudi Arabia, Mehran University of Engineering and Technology, Jamshoro, Pakistan
Faisal Karim Shaikh
Faculty of Electrical, Electronics and Computer Engineering, Mehran University of Engineering and Technology, Jamshoro, Pakistan
Bhawani Shankar Chowdhry
College of Communication and Information, University of Kentucky, Lexington, Kentucky, USA
Sherali Zeadally
Department of Energy Technology, Aalborg University, Esbjerg, Denmark
Dil Muhammad Akbar Hussain
Department of Telecommunication Engineering, Mehran University of Engineering and Technology, Jamshoro, Pakistan
Aftab Ahmed Memon
Department of Electrical Engineering, Mehran University of Engineering and Technology, Jamshoro, Pakistan
Muhammad Aslam Uqaili

About this paper

Cite this paper

Ali, H., Ahmad, N., Zhou, X., Ali, M., Manjotho, A.A. (2014). Linear Discriminant Analysis Based Approach for Automatic Speech Recognition of Urdu Isolated Words. In: Shaikh, F., Chowdhry, B., Zeadally, S., Hussain, D., Memon, A., Uqaili, M. (eds) Communication Technologies, Information Security and Sustainable Development. IMTIC 2013. Communications in Computer and Information Science, vol 414. Springer, Cham. https://doi.org/10.1007/978-3-319-10987-9_3

DOI: https://doi.org/10.1007/978-3-319-10987-9_3
Published: 11 September 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10986-2
Online ISBN: 978-3-319-10987-9
eBook Packages: Computer Science, Computer Science (R0)
