
International Journal of Speech Technology, Volume 18, Issue 2, pp 271–275

Database development and automatic speech recognition of isolated Pashto spoken digits using MFCC and K-NN

  • Zakir Ali
  • Arbab Waseem Abbas
  • T. M. Thasleema
  • Burhan Uddin
  • Tanzeela Raaz
  • Sahibzada Abdur Rehman Abid

Abstract

Automatic recognition of isolated spoken digits is one of the most challenging tasks in automatic speech recognition (ASR). This paper presents the development of a database and the automatic recognition of isolated Pashto spoken digits from Sefer (0) to Naha (9). Fifty native Pashto speakers (25 male and 25 female), aged 18 to 60 years, each uttered the digits Sefer (0) to Naha (9) separately. A Sony PCM-M10 linear recorder was used for recording in noise-free office and home environments. Adobe Audition version 1.0 was used to split the recordings into individual digits, which were saved in .wav format. Mel frequency cepstral coefficients (MFCC) are used to extract speech features. To the best of the authors' knowledge, a K-nearest neighbor (K-NN) classifier is used here for the first time for the Pashto language to classify the speech features, and its accuracy is compared with that of linear discriminant analysis. The experimental results are evaluated, and an overall average recognition accuracy of 76.8% is obtained.
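The MFCC-plus-K-NN pipeline named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the abstract does not give the frame length, number of mel filters, number of coefficients, or the value of k, so every parameter below is an assumption, and the function names (mfcc_frame, knn_predict) are hypothetical. The sketch computes MFCCs for a single frame only, and the "recordings" are synthetic sine tones rather than Pashto digits.

```python
import math
from collections import Counter

def hz_to_mel(f):
    # Common mel-scale formula; the abstract does not say which variant was used.
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)

def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=12):
    """MFCCs of one frame (all parameter defaults are assumed, not from the paper)."""
    n = len(frame)
    # 1. Hamming window to reduce spectral leakage.
    win = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
           for i, s in enumerate(frame)]
    # 2. Power spectrum via a naive DFT (fine for a short illustrative frame).
    n_bins = n // 2 + 1
    power = []
    for k in range(n_bins):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(win))
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(win))
        power.append((re * re + im * im) / n)
    # 3. Triangular filters spaced evenly on the mel scale (edges in DFT-bin units).
    mel_max = hz_to_mel(sample_rate / 2)
    edges = [mel_to_hz(mel_max * j / (n_filters + 1)) * n / sample_rate
             for j in range(n_filters + 2)]
    log_energies = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        e = 0.0
        for k in range(n_bins):
            if lo < k <= mid:
                e += power[k] * (k - lo) / (mid - lo)
            elif mid < k < hi:
                e += power[k] * (hi - k) / (hi - mid)
        log_energies.append(math.log(e + 1e-10))  # small floor avoids log(0)
    # 4. DCT-II of the log filterbank energies; coefficient 0 is skipped.
    return [sum(log_energies[j] * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j in range(n_filters))
            for c in range(1, n_coeffs + 1)]

def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to query (Euclidean)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def tone(freq, sample_rate=8000, n=256):
    # Synthetic stand-in for a recorded frame; NOT real speech data.
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)]

# Toy training set: two classes of synthetic tones with hypothetical labels.
train = [(mfcc_frame(tone(f), 8000), label)
         for f, label in [(300, "low_tone"), (320, "low_tone"),
                          (1200, "high_tone"), (1250, "high_tone")]]
print(knn_predict(train, mfcc_frame(tone(310), 8000), k=3))
```

A real recognizer would compute MFCCs over every frame of an utterance and combine them into one feature vector per digit before classification; how the paper does this is not stated in the abstract.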

Keywords

KNN, MFCC, Pashto digits

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Zakir Ali (1)
  • Arbab Waseem Abbas (2)
  • T. M. Thasleema (3)
  • Burhan Uddin (1)
  • Tanzeela Raaz (1)
  • Sahibzada Abdur Rehman Abid (2)

  1. Institute of Business and Management Sciences, The University of Agricultural Peshawar, Peshawar, Pakistan
  2. Universities of Engineering and Technology Peshawar, Peshawar, Pakistan
  3. Department of Computer Science, Central University of Kerala, Nileshwar, India
