Novel Phase Encoded Mel Cepstral Features for Speaker Verification

  • Apeksha J. Naik
  • Rishabh Tak
  • Hemant A. Patil
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10458)

Abstract

In this paper, we propose novel phase encoded Mel cepstral coefficients (PEMCC) features for the Automatic Speaker Verification (ASV) task. The proposal is motivated by a recently proposed phase encoding scheme that uses the causal delta dominance (CDD) condition. In particular, using the CDD scheme, we obtained an average reduction of 80% in log-spectral distortion (LSD), used here as the measure of reconstruction error, compared to the magnitude spectrum counterpart. This result indicates that the phase encoded magnitude spectrum has better reconstruction capability. Experiments with the proposed PEMCC features are carried out on the standard, statistically meaningful NIST 2002 SRE database, and the performance is compared with baseline MFCC features. Furthermore, score-level fusion of MFCC and PEMCC features gave better results than either MFCC or PEMCC features alone for the GMM-UBM-based system, the i-vector probabilistic linear discriminant analysis (PLDA)-based system, and the i-vector Cosine Distance Scoring (CDS)-based system. This indicates that the proposed PEMCC features capture complementary speaker-specific information.
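The three quantities named in the abstract (log-spectral distortion, cosine distance scoring, and score-level fusion) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the exact LSD formula or the fusion rule. The sketch assumes a common LSD definition (per-frame RMS difference of dB magnitude spectra, averaged over frames) and a simple weighted sum for fusion. The function names and the weight `alpha` are hypothetical.

```python
import numpy as np


def log_spectral_distortion(ref_mag, est_mag, eps=1e-10):
    """LSD in dB between two magnitude spectrograms of shape (frames, bins).

    Assumed definition: RMS difference of the dB spectra per frame,
    then averaged over frames. The paper may use a different variant.
    """
    ref_db = 20.0 * np.log10(np.maximum(np.asarray(ref_mag, dtype=float), eps))
    est_db = 20.0 * np.log10(np.maximum(np.asarray(est_mag, dtype=float), eps))
    per_frame = np.sqrt(np.mean((ref_db - est_db) ** 2, axis=1))
    return float(np.mean(per_frame))


def cosine_score(w_enroll, w_test):
    """Cosine similarity between an enrollment and a test i-vector."""
    w_enroll = np.asarray(w_enroll, dtype=float)
    w_test = np.asarray(w_test, dtype=float)
    denom = np.linalg.norm(w_enroll) * np.linalg.norm(w_test)
    return float(np.dot(w_enroll, w_test) / denom)


def fuse_scores(scores_a, scores_b, alpha=0.5):
    """Weighted-sum score-level fusion; alpha is a hypothetical weight."""
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    return alpha * a + (1.0 - alpha) * b
```

In practice, the scores of the two systems (for example, MFCC-based and PEMCC-based) would usually be normalized to a comparable range before fusion, and the weight would be tuned on a development set; both steps are omitted from this sketch.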

Keywords

Speaker verification; Causal delta dominance; Phase encoding; i-Vector; Cosine distance scoring; Probabilistic linear discriminant analysis

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Apeksha J. Naik (1)
  • Rishabh Tak (1)
  • Hemant A. Patil (1)
  1. Dhirubhai Ambani Institute of Information and Communication Technology (DA-IICT), Gandhinagar, India