Linear Discriminant Analysis Based Approach for Automatic Speech Recognition of Urdu Isolated Words

  • Conference paper
Communication Technologies, Information Security and Sustainable Development (IMTIC 2013)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 414)

Abstract

Urdu is among the five largest languages of the world and is especially important because it shares its vocabulary with several other languages of the Indo-Pak subcontinent. However, there has been no significant research on Automatic Speech Recognition (ASR) of Urdu. This paper presents a statistical classification technique for ASR of isolated words in Urdu. For each isolated word, 52 Mel Frequency Cepstral Coefficients (MFCCs) are extracted, and classification based on these coefficients is performed using Linear Discriminant Analysis. As a prototype, the system was trained with audio samples from seven speakers, including male and female speakers, native and non-native speakers, and speakers of different ages, while testing used audio samples from three speakers. The majority of words exhibited a percentage error of less than 33%. Words with 100% error were declared bad words. The work reported in this paper may serve as a strong baseline for future research on Urdu ASR, especially continuous speech recognition of Urdu.
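The evaluation rule stated in the abstract (a percentage error per word, with 100% error marking a bad word) can be illustrated with a minimal sketch. This is not the authors' code. The word labels, the predictions, and the assumption of one test utterance per word per test speaker are hypothetical and chosen only to show the arithmetic.

```python
from collections import defaultdict


def per_word_error(true_words, predicted_words):
    """Return {word: percentage error} over all test utterances of each word."""
    totals = defaultdict(int)
    wrong = defaultdict(int)
    for truth, guess in zip(true_words, predicted_words):
        totals[truth] += 1
        if guess != truth:
            wrong[truth] += 1
    return {word: 100.0 * wrong[word] / totals[word] for word in totals}


def bad_words(errors):
    """Return the words that were misclassified on every test utterance."""
    return sorted(word for word, err in errors.items() if err == 100.0)


# Hypothetical labels (assumption): three test utterances per word,
# one from each of the three test speakers.
truth = ["aik", "aik", "aik", "do", "do", "do", "teen", "teen", "teen"]
preds = ["aik", "aik", "aik", "do", "teen", "do", "aik", "do", "do"]

errors = per_word_error(truth, preds)
flagged = bad_words(errors)

for word, err in errors.items():
    print(f"{word}: {err:.1f}% error")
print("bad words:", flagged)
```

In this made-up example, "teen" is misclassified by all three hypothetical test speakers and is therefore the only word flagged as bad. In the paper's system the predictions would come from the Linear Discriminant Analysis classifier applied to the 52 MFCCs of each utterance.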



Acknowledgment

The authors are thankful to the supporting staff of the Department of Electrical Engineering, University of Engineering and Technology, Peshawar, Pakistan, whose efforts kept the Computer Lab open for extra hours and gave the authors the opportunity to use it. The authors also extend their gratitude to Engr. Salman Ilahi, Department of Electrical Engineering, and Engr. Irfan Ahmad, Department of Industrial Engineering, UET Peshawar, for their valuable input and suggestions throughout this work.

Corresponding author

Correspondence to Hazrat Ali.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Ali, H., Ahmad, N., Zhou, X., Ali, M., Manjotho, A.A. (2014). Linear Discriminant Analysis Based Approach for Automatic Speech Recognition of Urdu Isolated Words. In: Shaikh, F., Chowdhry, B., Zeadally, S., Hussain, D., Memon, A., Uqaili, M. (eds) Communication Technologies, Information Security and Sustainable Development. IMTIC 2013. Communications in Computer and Information Science, vol 414. Springer, Cham. https://doi.org/10.1007/978-3-319-10987-9_3

  • DOI: https://doi.org/10.1007/978-3-319-10987-9_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-10986-2

  • Online ISBN: 978-3-319-10987-9

  • eBook Packages: Computer Science (R0)
