Speaker Verification on Unbalanced Data with Genetic Programming

  • Róisín Loughran (corresponding author)
  • Alexandros Agapitos
  • Ahmed Kattan
  • Anthony Brabazon
  • Michael O’Neill
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9597)


Automatic Speaker Verification (ASV) is a highly unbalanced binary classification problem in which any given speaker must be verified against everyone else. We apply Genetic Programming (GP) to this problem with the aim of both prediction and inference. We examine the generalisation of evolved programs using a variety of fitness functions and data sampling techniques found in the literature. A significant difference between training and test performance, which can indicate overfitting, is found in the evolutionary runs for all to-be-verified speakers. Nevertheless, for every speaker, the best test performance attained is superior to that of merely predicting the majority class. We examine which features are used in individuals that generalise well. The findings can inform future applications of GP or other machine learning techniques to ASV about the suitability of feature-extraction techniques.
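The majority-class baseline mentioned above can be illustrated with a minimal sketch. The class ratio, the labels, and the function names here are hypothetical and chosen only for illustration. Balanced accuracy is used as one common imbalance-aware measure; it is not necessarily one of the fitness functions examined in the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so each class counts equally."""
    pairs = list(zip(y_true, y_pred))
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    true_neg = sum(1 for t, p in pairs if t == 0 and p == 0)
    return 0.5 * (true_pos / n_pos + true_neg / n_neg)


# Hypothetical data: 10 samples from the target speaker (label 1)
# against 90 samples from everyone else (label 0).
y_true = [1] * 10 + [0] * 90

# A trivial classifier that always predicts the majority class.
majority = [0] * len(y_true)

print(accuracy(y_true, majority))           # 0.9
print(balanced_accuracy(y_true, majority))  # 0.5
```

Under these assumed numbers, the trivial classifier reaches 90% plain accuracy while never verifying the target speaker, which is why a comparison against the majority-class baseline is informative on unbalanced data.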


Speaker verification · Unbalanced data · Genetic programming · Feature selection



This work was carried out as a collaboration of projects funded by Science Foundation Ireland under Grant Numbers 08/SRC/FM1389 and 13/IA/1850.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Róisín Loughran (1, corresponding author)
  • Alexandros Agapitos (1)
  • Ahmed Kattan (2)
  • Anthony Brabazon (1)
  • Michael O’Neill (1)
  1. Natural Computing Research and Applications Group, University College Dublin, Dublin, Ireland
  2. Computer Science Department, Um Al-Qura University, Mecca, Saudi Arabia
