A New Text-Independent Speaker Identification Using Vector Quantization and Multi-layer Perceptron

  • Ji-Soo Keum
  • Chan-Ho Park
  • Hyon-Soo Lee
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3972)


In this paper, we propose a new text-independent speaker identification method using vector quantization (VQ) and a multi-layer perceptron (MLP). It consists of three parts: feature extraction based on a new spectral peak analysis, speaker clustering and model selection using VQ, and MLP-based speaker identification. The feature vector reflects speaker-specific characteristics and captures long-term features, which makes it text-independent. The proposed method is computationally efficient in both feature extraction and identification. To evaluate the proposed method, we calculated the correct identification ratio (CIR). The average CIR of the proposed method and the Gaussian mixture model (GMM) method was 92.27% and 85.78%, respectively, for 5-second segments in 15-speaker identification. The experimental results show that the proposed method achieves performance comparable to the GMM method.
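The VQ-based model selection and the CIR measure mentioned above can be illustrated with a minimal sketch. The abstract does not give the details, so everything here is an assumption: model selection is taken to mean picking the codebook with the lowest average quantization distortion, CIR is taken to be the percentage of test segments assigned to the correct speaker, and the names `average_distortion`, `select_cluster`, and `correct_identification_ratio` are hypothetical. The toy 2-D vectors stand in for the spectral peak features, and the MLP stage is omitted.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def average_distortion(frames, codebook):
    """Mean distance from each feature vector to its nearest codeword."""
    return sum(min(dist(f, c) for c in codebook) for f in frames) / len(frames)


def select_cluster(frames, codebooks):
    """Return the name of the codebook with the lowest average distortion.

    Assumption: this stands in for the paper's VQ model selection step.
    """
    return min(codebooks, key=lambda name: average_distortion(frames, codebooks[name]))


def correct_identification_ratio(predicted, actual):
    """Percentage of test segments whose predicted speaker matches the true one.

    Assumption: the abstract does not define CIR; this is the usual reading.
    """
    hits = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * hits / len(actual)


# Toy data only; real features would come from the spectral peak analysis.
codebooks = {
    "cluster_a": [(0.0, 0.0), (1.0, 1.0)],
    "cluster_b": [(5.0, 5.0), (6.0, 6.0)],
}
segment = [(0.9, 1.1), (0.1, -0.1), (1.2, 0.8)]

chosen = select_cluster(segment, codebooks)
cir = correct_identification_ratio(
    ["s1", "s2", "s3", "s1"],  # hypothetical predictions
    ["s1", "s2", "s1", "s1"],  # hypothetical ground truth
)
```

In a pipeline of this shape, the selected cluster would presumably determine which MLP is consulted for the final decision, which keeps each network small; that reading is an inference from the abstract, not a confirmed detail of the method.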


Gaussian Mixture Model, Spectral Peak, Vector Quantization, Speaker Recognition, Speaker Identification
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Ji-Soo Keum (1)
  • Chan-Ho Park (2)
  • Hyon-Soo Lee (1)
  1. Dept. of Computer Engineering, Kyung Hee University, Yongin-si, Gyeonggi-do, Korea
  2. Dept. of Internet Information Science, Bucheon College, Bucheon-si, Gyeonggi-do, Korea
