Mel Frequency Cepstral Coefficients Based Similar Albanian Phonemes Recognition

  • Bertan Karahoda
  • Krenare Pireva
  • Ali Shariq Imran
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9734)


The Albanian language has several pairs of phonemes that are similar in pronunciation, such as /q/ - /ç/, /rr/ - /r/, /th/ - /dh/ and /gj/ - /xh/. These phonemes are difficult to distinguish by ear, even for native Albanian speakers from different regions. The task is more challenging still for automated speech systems that recognize and classify Albanian words, because the phonemes sound so similar. This paper proposes using Mel Frequency Cepstral Coefficient (MFCC) based features to distinguish these phonemes correctly. A three-layer back-propagation neural network is used for classification. The experiments are performed on speech signals collected from different male and female native speakers, and speaker-independent tests are used to analyze classification performance. The results show that serial MFCC features can classify these very similar phonemes with higher accuracy.
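As a rough illustration of the kind of feature the abstract refers to, the sketch below computes MFCCs for a single speech frame using only the Python standard library. It follows the textbook pipeline (Hamming window, power spectrum, triangular mel filterbank, log, DCT-II) and is not the authors' implementation. The function names and every parameter value (8 kHz sample rate, 256-sample frame, 20 filters, 12 coefficients) are assumptions chosen for the example; the paper's "serial" feature arrangement and its neural network classifier are not reproduced here.

```python
import cmath
import math


def hz_to_mel(f):
    """Convert frequency in Hz to the mel scale (common 2595*log10 formula)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window the frame and return |DFT|^2 / N for bins 0..N/2.

    Uses a naive O(N^2) DFT to stay dependency-free; a real system would use an FFT.
    """
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))
        spec.append(abs(s) ** 2 / n)
    return spec


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    hz_points = [mel_to_hz(top * i / (num_filters + 1))
                 for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in hz_points]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):       # rising edge of the triangle
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):       # falling edge of the triangle
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate, num_filters=20, num_coeffs=12):
    """Return num_coeffs MFCCs for one frame (the 0th DCT term is skipped)."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Log filterbank energies, floored to avoid log(0) on silent bands.
    log_energies = [math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10))
                    for row in bank]
    # Unnormalised DCT-II of the log energies.
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / num_filters)
                for j, e in enumerate(log_energies))
            for c in range(1, num_coeffs + 1)]


# Hypothetical input: a synthetic 440 Hz tone standing in for a speech frame.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(256)]
coeffs = mfcc(frame, sample_rate)
print(len(coeffs))
```

In a recognizer of the kind described, one such coefficient vector would be computed per frame of each phoneme recording and passed to the classifier; how the frames are combined into the final feature vector is a design choice this sketch leaves open.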


Albanian phonemes; Similar phonemes classification; Serial MFCC features; Neural network classifier; Back propagation network



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Bertan Karahoda (1)
  • Krenare Pireva (1)
  • Ali Shariq Imran (2)

  1. Faculty of Computer Science and Engineering, UBT, Pristina, Kosovo
  2. Faculty of Computer Science and Media Technology, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
