Analysis of the Influence of Speech Corpora in the PLDA Verification in the Task of Speaker Recognition
This paper presents recent methods used in speaker recognition. First, the extraction of so-called i-vectors from GMM-based supervectors is discussed. These i-vectors are low-dimensional and lie in a subspace called the Total Variability Space (TVS). The paper focuses on Probabilistic Linear Discriminant Analysis (PLDA), which is used as a generative model in the TVS. The influence of the development data is analyzed using distinct speech corpora. It is shown that it is preferable to cluster the available speech corpora into classes, train one PLDA model for each class, and fuse the results at the end. Experiments are presented on the NIST Speaker Recognition Evaluation (SRE) 2008 and NIST SRE 2010.
Keywords: PLDA, latent space, fusion, supervector, FA, i-vector
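The abstract does not say how the results of the per-class PLDA models are fused. The following is a minimal sketch, assuming score-level linear fusion whose weights are trained by logistic regression on labeled trials. The function names, the array layout (one column of scores per PLDA model), and the synthetic scores are hypothetical illustrations and are not taken from the paper.

```python
import numpy as np

def train_linear_fusion(scores, labels, lr=0.1, n_iter=2000):
    """Fit weights and bias of a linear score fusion by logistic regression.

    scores: (n_trials, n_systems) array, one column per PLDA model (assumed layout).
    labels: (n_trials,) array, 1 for target trials and 0 for non-target trials.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    n_trials, n_systems = scores.shape
    w = np.zeros(n_systems)
    b = 0.0
    for _ in range(n_iter):
        # Fused score, clipped to keep the sigmoid numerically stable
        z = np.clip(scores @ w + b, -30.0, 30.0)
        p = 1.0 / (1.0 + np.exp(-z))
        err = p - labels
        # Plain gradient descent on the mean cross-entropy loss
        w -= lr * (scores.T @ err) / n_trials
        b -= lr * err.mean()
    return w, b

def fuse(scores, w, b):
    """Apply the trained linear fusion to a matrix of per-system scores."""
    return np.asarray(scores, dtype=float) @ w + b

# Synthetic stand-in for the scores of two per-class PLDA models (not real data):
# target trials score higher on average, with a different margin per system.
rng = np.random.default_rng(0)
n = 500
labels = np.r_[np.ones(n), np.zeros(n)]
scores = np.column_stack([
    labels * 2.0 + rng.normal(0.0, 1.0, 2 * n),
    labels * 1.0 + rng.normal(0.0, 1.0, 2 * n),
])

w, b = train_linear_fusion(scores, labels)
fused = fuse(scores, w, b)
accuracy = np.mean((fused > 0.0) == (labels == 1.0))
```

In practice the fusion weights would be trained on a held-out development set rather than on the trials being scored. Under this assumed scheme, a system with better separation between target and non-target trials receives a larger weight.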