Speaker Verification Using Coded Speech

  • Antonio Moreno-Daniel
  • Biing-Hwang Juang
  • Juan A. Nolazco-Flores
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3287)


This paper describes the implementation of a pseudo text-independent speaker verification system designed to use only information extracted directly from the coded parameters embedded in the ITU-T G.729 bit-stream. Experiments were performed on the YOHO database [1]. The feature vector, a short-time representation of speech, consists of 16 LPC-cepstral coefficients, with residual information appended in the form of a pitch estimate and a measure of vocality (voicing) of the speech. The robustness of the verification accuracy is also studied. The results show that although speech coders, and G.729 in particular, introduce coding distortions that degrade verification performance, properly augmenting the features with this unconventional information yields performance on par with that of a well-studied traditional system that involves no signal coding or transmission. This suggests that speaker verification over a cell phone connection remains feasible even when the signal has been encoded at 8 kbit/s.
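
As a rough illustration of the feature vector described above, the sketch below converts a set of LPC predictor coefficients into 16 cepstral coefficients with the standard LPC-to-cepstrum recursion and appends a pitch estimate and a voicing measure. It is not the authors' implementation: the function names, the sign convention (predictor s[n] ≈ Σ a_k s[n−k]), the 10th-order predictor in the usage lines, and the way pitch and voicing are passed in are all assumptions made for the example. Decoding the coefficients from the G.729 bit-stream is not shown.

```python
import numpy as np

def lpc_to_cepstrum(a, n_ceps=16):
    """Convert LPC predictor coefficients to LPC-cepstral coefficients.

    Assumed convention: a[k-1] holds a_k for the all-pole model
    H(z) = 1 / (1 - sum_k a_k z^-k). Returns c_1 .. c_{n_ceps}.
    """
    a = np.asarray(a, dtype=float)
    p = len(a)
    c = np.zeros(n_ceps + 1)  # c[0] (gain term) is left unused
    for n in range(1, n_ceps + 1):
        # Direct term a_n exists only while n <= p
        acc = a[n - 1] if n <= p else 0.0
        # Recursive term: sum over k of (k/n) * c_k * a_{n-k}
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]

def build_feature_vector(a, pitch, voicing, n_ceps=16):
    """Hypothetical helper: cepstra followed by pitch and voicing."""
    return np.concatenate([lpc_to_cepstrum(a, n_ceps), [pitch, voicing]])

# Usage with placeholder values (not real decoder output)
frame_lpc = np.zeros(10)          # assumed 10th-order predictor
features = build_feature_vector(frame_lpc, pitch=100.0, voicing=0.8)
print(features.shape)
```

For a single-pole model with coefficient a, the recursion reproduces the closed form c_n = a^n / n, which is a convenient sanity check.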


Equal Error Rate; Speaker Recognition; Speaker Verification; Universal Background Model; Speaker Verification System
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


  1. Campbell Jr., J.P.: Testing with the YOHO CD-ROM voice verification corpus. In: Proc. ICASSP (1995)
  2. Furui, A.: Recent Advances in Speaker Recognition. In: First Int. Conf. Audio- and Video-based Biometric Person Authentication, Switzerland, pp. 237–252 (1997)
  3. Reynolds, D.A.: An Overview of Automatic Speaker Recognition Technology. In: Proc. ICASSP (2002)
  4. Reynolds, D.A., Rose, R.: Robust Text-Independent Speaker Identification Using Gaussians Mixture Speaker Model. IEEE Transactions on Speech and Audio Processing (1995)
  5. Rosenberg, A.E., Siohan, O., Parthasarathy, S.: Speaker verification using minimum verification error training. In: Proc. ICASSP (1998)
  6. Li, Q., Juang, B.-H., Zhou, Q., Lee, C.-H.: Automatic Verbal Information Verification for User Authentication. IEEE Transactions on Speech and Audio Processing, 585–596 (2000)
  7. Kim, H.K., Cox, R.: Bitstream-based feature extraction for wireless speech recognition. In: Proc. ICASSP (2000)
  8. Zhong, X.: Speech coding and transmission for improved recognition in a communication network. PhD Dissertation, Georgia Institute of Technology (2000)
  9. ITU-T Recommendation G.729, Coding of speech at 8 kbit/s using conjugate-structure algebraic-code-excited linear-prediction (CS-ACELP) (1996)
  10. Young, S., et al.: The HTK Book, Cambridge University, Version 3.2 ed. (2002)
  11. Huang, X., Acero, A., Hon, H.W.: Spoken language processing. Prentice Hall, Englewood Cliffs (2001)
  12. Quatieri, T.F., et al.: Speaker Recognition Using G.729 Speech Codec Parameters. In: Proc. ICASSP (2000)
  13. Rabiner, L., Juang, B.H.: Fundamentals of Speech Recognition. Prentice Hall, Englewood Cliffs (1993)
  14. ITU-T Recommendation G.191, Software tool library 2000 user’s manual (2000)
  15. Yu Eric, W.M., Man-Wai, M., Chin-Hung, S., Sun-Yuan, K.: Speaker verification based on G.729 and G.723.1 coder parameters and handset mismatch compensation. In: Proc. of the 8th European Conference on Speech Communication and Technology (2003)
  16. Besacier, L., Grassi, S., Dufaux, A., Ansorge, M., Pellandini, F.: GSM Speech coding and Speaker Recognition. In: Proc. ICASSP (2000)

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Antonio Moreno-Daniel (1, 2)
  • Biing-Hwang Juang (1)
  • Juan A. Nolazco-Flores (2)
  1. Center for Signal and Image Processing, Georgia Institute of Technology, Atlanta, USA
  2. Departamento de Ciencias Computacionales, Instituto Tecnológico y de Estudios Superiores de Monterrey, Monterrey, México
