Discriminating Speakers by Their Voices — A Fusion Based Approach

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10458)


The task of Speaker Discrimination (SD) consists in checking whether two speech segments belong to the same speaker. In this research field, it is often difficult to decide which classifier is best in terms of accuracy and robustness. To address this question, we implemented nine classifiers: Support Vector Machines (SVM), Linear Discriminant Analysis, Multi-Layer Perceptron, Generalized Linear Model, Self-Organizing Map, AdaBoost, Second-Order Statistical Measures, Linear Regression and Gaussian Mixture Models. Furthermore, a new fusion approach is proposed and evaluated for speaker discrimination. Several speaker discrimination experiments were conducted on the Hub4 Broadcast-News corpus with relatively short segments. The results show that the SVM is the best individual classifier and that the proposed fusion approach is promising, since it achieved the best performance overall.
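The abstract does not describe the proposed fusion rule, so the following is only a generic illustration of decision-level fusion, not the authors' method. It assumes each classifier outputs either a binary decision (1 = same speaker, 0 = different speakers) or a score in [0, 1]; the function names, the weights and the 0.5 threshold are hypothetical choices made for this sketch.

```python
from typing import Sequence


def fuse_majority(decisions: Sequence[int]) -> int:
    """Fuse binary decisions (1 = same speaker, 0 = different speakers).

    Returns 1 only if a strict majority of the classifiers voted 1;
    a tie is resolved as 0 (assumption made for this sketch).
    """
    if not decisions:
        raise ValueError("need at least one classifier decision")
    return 1 if 2 * sum(decisions) > len(decisions) else 0


def fuse_weighted(scores: Sequence[float],
                  weights: Sequence[float],
                  threshold: float = 0.5) -> int:
    """Fuse per-classifier scores in [0, 1] by a weighted average.

    The weights could, for instance, reflect each classifier's accuracy
    on held-out data; here they are simply supplied by the caller.
    """
    if not scores or len(scores) != len(weights):
        raise ValueError("scores and weights must be non-empty and aligned")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    fused = sum(s * w for s, w in zip(scores, weights)) / total
    return 1 if fused >= threshold else 0
```

For example, `fuse_majority([1, 1, 0])` labels a pair of segments as coming from the same speaker because two of the three classifiers agree, while `fuse_weighted` lets a more reliable classifier outweigh the others.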


Speaker discrimination; Speaker identification; Discriminative classification; Fusion



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Electronics and Computer Engineering Faculty, USTHB University, Bab Ezzouar, Algeria
