Analysis of the Influence of Speech Corpora in the PLDA Verification in the Task of Speaker Recognition

  • Lukáš Machlica
  • Zbyněk Zajíc
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7499)

Abstract

This paper presents recent methods used in speaker recognition. First, the extraction of so-called i-vectors from GMM-based supervectors is discussed. These i-vectors are low-dimensional and lie in a subspace denoted the Total Variability Space (TVS). The paper focuses on Probabilistic Linear Discriminant Analysis (PLDA), which is used as a generative model in the TVS. The influence of development data is analyzed using distinct speech corpora. It is shown that it is preferable to cluster the available speech corpora into classes, train one PLDA model for each class, and fuse the results at the end. Experiments are presented on the NIST Speaker Recognition Evaluation (SRE) 2008 and NIST SRE 2010.
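The final step named in the abstract, fusing the results of several per-class PLDA models, can be illustrated with a minimal sketch. The abstract does not state which fusion rule is used, so the weighted sum below (equal weights by default) is an assumption, and the function name `fuse_scores` and its arguments are hypothetical.

```python
def fuse_scores(scores_per_model, weights=None):
    """Fuse verification scores from several models with a weighted sum.

    scores_per_model: one list per model, each holding one score per trial.
    weights: one weight per model; equal weights are ASSUMED when omitted
             (the abstract does not specify the fusion rule).
    Returns one fused score per trial.
    """
    n_models = len(scores_per_model)
    if n_models == 0:
        raise ValueError("at least one model is required")

    n_trials = len(scores_per_model[0])
    if any(len(scores) != n_trials for scores in scores_per_model):
        raise ValueError("all models must score the same trials")

    if weights is None:
        weights = [1.0 / n_models] * n_models
    if len(weights) != n_models:
        raise ValueError("one weight per model is required")

    # For each trial, combine the scores assigned by every model.
    return [
        sum(w * scores[t] for w, scores in zip(weights, scores_per_model))
        for t in range(n_trials)
    ]


# Hypothetical scores from two per-class PLDA models on two trials.
fused = fuse_scores([[1.0, -2.0], [3.0, 0.0]])
```

In practice the weights would typically be trained on held-out development trials rather than fixed by hand.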

Keywords

PLDA, latent space, fusion, supervector, FA, i-vector

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Lukáš Machlica (1)
  • Zbyněk Zajíc (1)
  1. Faculty of Applied Sciences, Department of Cybernetics, University of West Bohemia in Pilsen, Pilsen, Czech Republic
