A Comparative Study of Feature and Score Normalization for Speaker Verification

  • Rong Zheng
  • Shuwu Zhang
  • Bo Xu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3832)

Abstract

In speaker verification, the influence of varying environmental conditions must be reduced. This paper examines two stages of normalization, feature normalization and score normalization, for decreasing the mismatch between training and testing acoustic conditions. In the first stage, cepstral mean and variance normalization (CMVN) is modified to normalize the cepstral coefficients with similar segmental parameter statistics. In the second stage, to address score variability between verification trials, test-dependent zero-score normalization (TZnorm) and zero-dependent test-score normalization (ZTnorm) are compared; both transform the output scores so that the speaker-independent decision threshold is more robust under adverse conditions. Experiments on the NIST 2002 SRE corpus show that CMVN in the feature stage combined with ZTnorm in the score stage achieves a 20.3% relative reduction in equal error rate (EER) and an 18.1% relative reduction in the minimum detection cost function (DCF), compared with a baseline system using cepstral mean normalization (CMN) and zero normalization.
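The two normalization stages named in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify the modified, segmental CMVN or the exact TZnorm and ZTnorm formulas, so the code below assumes plain utterance-level CMVN and the standard mean-and-deviation score standardization that underlies zero normalization and test normalization. The function names `cmvn` and `standardize` are hypothetical.

```python
from statistics import fmean, pstdev


def cmvn(frames, eps=1e-8):
    """Utterance-level cepstral mean and variance normalization.

    Assumption: statistics are taken over the whole utterance, not over
    the segmental windows used by the paper's modified CMVN.
    frames: list of feature vectors, one list of floats per frame.
    """
    dims = list(zip(*frames))                    # one tuple per coefficient
    means = [fmean(d) for d in dims]
    stds = [max(pstdev(d), eps) for d in dims]   # guard against zero variance
    return [
        [(x - m) / s for x, m, s in zip(frame, means, stds)]
        for frame in frames
    ]


def standardize(raw_score, reference_scores, eps=1e-8):
    """Shift and scale a verification score by reference statistics.

    Zero normalization: reference_scores come from impostor utterances
    scored against the claimed speaker's model.
    Test normalization: reference_scores come from the test utterance
    scored against a cohort of impostor models.
    """
    mu = fmean(reference_scores)
    sigma = max(pstdev(reference_scores), eps)
    return (raw_score - mu) / sigma


if __name__ == "__main__":
    # Two frames of two cepstral coefficients (toy values).
    print(cmvn([[1.0, 10.0], [3.0, 30.0]]))
    # A raw score normalized against two toy impostor scores.
    print(standardize(3.0, [0.0, 2.0]))
```

Both score normalizations share the same arithmetic and differ only in where the reference scores come from, which is why a single helper is used here. How the two are chained in TZnorm and ZTnorm is left out, since the abstract does not define the order or the statistics involved.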

Keywords

Equal Error Rate, Speaker Verification, Test Segment, Universal Background Model, Target Speaker
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Rong Zheng
    • 1
  • Shuwu Zhang
    • 1
  • Bo Xu
    • 1
  1. Institute of Automation, Chinese Academy of Sciences, Beijing, China