Sparsity Analysis and Compensation for i-Vector Based Speaker Verification

  • Wei Li
  • Tian Fan Fu
  • Jie Zhu
  • Ning Chen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9319)


In recent years, the i-vector framework has been shown to provide state-of-the-art performance in speaker verification. Most research focuses on compensating for the channel variability of i-vectors. In this paper we show that when the enrollment or test utterance is limited in duration, an i-vector based system may suffer from biased estimation. To address this problem, we propose an improved i-vector extraction algorithm, which we term Adapted First order Baum-Welch Statistics Analysis (AFSA). The new algorithm suppresses and compensates for the deviation of the first order Baum-Welch statistics caused by phonetic sparsity and phonetic imbalance. Experiments were performed on the NIST 2008 SRE data sets. The results show a 10 %–15 % relative improvement over the baseline of a traditional i-vector based system.
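To make the quantities named in the abstract concrete, the following minimal sketch computes the standard zeroth order and centered first order Baum-Welch statistics of an utterance against a diagonal-covariance GMM. It is not the AFSA algorithm, whose details are not given here. The function names and the toy one-dimensional model are hypothetical and chosen only for illustration.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def baum_welch_stats(frames, weights, means, variances):
    """Return (N, F): zeroth order and centered first order statistics.

    N[c]    = sum over frames of the posterior of component c
    F[c][d] = sum over frames of posterior * (x[d] - means[c][d])
    """
    n_comp, dim = len(weights), len(means[0])
    N = [0.0] * n_comp
    F = [[0.0] * dim for _ in range(n_comp)]
    for x in frames:
        log_p = [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        top = max(log_p)  # subtract the maximum for numerical stability
        unnorm = [math.exp(lp - top) for lp in log_p]
        total = sum(unnorm)
        for c in range(n_comp):
            gamma = unnorm[c] / total  # posterior of component c for this frame
            N[c] += gamma
            for d in range(dim):
                F[c][d] += gamma * (x[d] - means[c][d])
    return N, F


# Toy model (assumed values): three components in one dimension.
weights = [1.0 / 3.0] * 3
means = [[-5.0], [0.0], [5.0]]
variances = [[1.0], [1.0], [1.0]]

# A very short "utterance" whose frames never fall near the third component.
frames = [[-5.1], [-4.9], [0.2]]
N, F = baum_welch_stats(frames, weights, means, variances)
print("occupancy per component:", N)
print("centered first order statistics:", F)
```

In this toy case the third component receives almost no occupancy, so its first order statistic carries almost no information. This is one plausible reading of how phonetic sparsity in short utterances affects the statistics from which an i-vector is extracted.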


Speaker verification, i-vector, Phonetic sparsity, Adapted first order Baum-Welch statistics analysis (AFSA)



This article was supported by the National Natural Science Foundation of China (NSFC) under Grants No. 61271349, 61371147 and 11433002.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai, China
  2. Department of Computer Science and Engineering (CSE), Shanghai Jiao Tong University, Shanghai, China
  3. School of Information Science and Engineering, East China University of S&T, Shanghai, China
