Abstract
Speaker Discrimination (SD) is the task of deciding whether two speech segments were uttered by the same speaker. In this research field, it is often difficult to determine which classifier offers the best accuracy and robustness. To address this question, we implemented nine classifiers: Support Vector Machines (SVM), Linear Discriminant Analysis, Multi-Layer Perceptron, Generalized Linear Model, Self-Organizing Map, AdaBoost, Second-Order Statistical Measures, Linear Regression, and Gaussian Mixture Models. Furthermore, a new fusion approach is proposed and evaluated for speaker discrimination. Several speaker discrimination experiments were conducted on the Hub4 Broadcast-News corpus using relatively short segments. The results show that the SVM is the best individual classifier and that the proposed fusion approach is promising, since it yielded the best overall performance.
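The abstract does not specify how the proposed fusion combines the classifiers, so the following is only a generic illustration of score-level fusion and should not be read as the authors' method. It assumes that each classifier outputs a same-speaker score in [0, 1]; the function names `fuse_scores` and `same_speaker`, the weights, and the 0.5 threshold are all hypothetical.

```python
from typing import Dict, Optional


def fuse_scores(scores: Dict[str, float],
                weights: Optional[Dict[str, float]] = None) -> float:
    """Return the weighted average of per-classifier same-speaker scores.

    Assumption: every score lies in [0, 1] and a higher value means the two
    segments are more likely to come from the same speaker. The weights are
    illustrative and are not taken from the paper.
    """
    if not scores:
        raise ValueError("at least one classifier score is required")
    if weights is None:
        # No weights given: fall back to a plain average.
        weights = {name: 1.0 for name in scores}
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * score for name, score in scores.items()) / total


def same_speaker(scores: Dict[str, float],
                 weights: Optional[Dict[str, float]] = None,
                 threshold: float = 0.5) -> bool:
    """Decide 'same speaker' when the fused score reaches the threshold."""
    return fuse_scores(scores, weights) >= threshold


if __name__ == "__main__":
    # Hypothetical scores from three of the classifiers named in the abstract.
    example = {"svm": 0.9, "lda": 0.4, "mlp": 0.7}
    print(same_speaker(example))
    # Giving one classifier a much larger weight can change the decision.
    print(same_speaker(example, weights={"svm": 1.0, "lda": 10.0, "mlp": 1.0}))
```

In a sketch like this, the weights could be tuned on held-out data so that more reliable classifiers contribute more to the fused decision; whether the paper does anything similar is not stated in the abstract.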
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Sayoud, H., Ouamour, S., Hamadache, Z. (2017). Discriminating Speakers by Their Voices — A Fusion Based Approach. In: Karpov, A., Potapova, R., Mporas, I. (eds) Speech and Computer. SPECOM 2017. Lecture Notes in Computer Science, vol 10458. Springer, Cham. https://doi.org/10.1007/978-3-319-66429-3_31
DOI: https://doi.org/10.1007/978-3-319-66429-3_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66428-6
Online ISBN: 978-3-319-66429-3
eBook Packages: Computer Science, Computer Science (R0)