Unsupervised Speaker Adaptation for Phonetic Transcription Based Voice Dialing

  • Weon-Goo Kim
  • MinSeok Jang
  • Chin-Hui Lee
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3614)

Abstract

Because a voice dialing system based on speaker-independent phoneme HMMs stores only the phoneme transcription of each input sentence, its storage requirements can be reduced greatly. However, its performance is worse than that of a speaker-dependent system because of the phoneme recognition errors that arise when speaker-independent models are used. To solve this problem, a new method is presented that jointly estimates the transformation vectors (biases) and the transcriptions for speaker adaptation. The biases and transcriptions are estimated iteratively from each user's training data, using a maximum likelihood approach to stochastic matching with speaker-independent phoneme models. Experimental results show that the proposed method is superior to the conventional method that uses transcriptions only.
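The alternation described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: it assumes one diagonal Gaussian per phoneme instead of an HMM, labels each frame independently instead of running a full decoder, and models the speaker mismatch as a single additive bias in the feature space. The model table and the function names (`MODELS`, `decode`, `estimate_bias`, `adapt`) are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy "phoneme models": one diagonal Gaussian per phoneme,
# stored as (mean, variance). A real system would use phoneme HMMs.
MODELS = {
    "a": ([0.0, 0.0], [1.0, 1.0]),
    "b": ([4.0, 0.0], [1.0, 1.0]),
    "c": ([0.0, 4.0], [1.0, 1.0]),
}


def log_likelihood(frame, mean, var):
    """Log density of a frame under a diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(frame, mean, var)
    )


def decode(frames, bias):
    """Transcription step: subtract the bias, then pick the best phoneme per frame."""
    labels = []
    for frame in frames:
        compensated = [y - b for y, b in zip(frame, bias)]
        labels.append(
            max(MODELS, key=lambda p: log_likelihood(compensated, *MODELS[p]))
        )
    return labels


def estimate_bias(frames, labels):
    """Bias step: maximum likelihood additive bias given the current labels.

    For diagonal Gaussians this is the variance-weighted mean of the
    residuals (observed frame minus the mean of its assigned model).
    """
    dim = len(frames[0])
    num = [0.0] * dim
    den = [0.0] * dim
    for frame, label in zip(frames, labels):
        mean, var = MODELS[label]
        for i in range(dim):
            num[i] += (frame[i] - mean[i]) / var[i]
            den[i] += 1.0 / var[i]
    return [n / d for n, d in zip(num, den)]


def adapt(frames, iterations=5):
    """Alternate bias estimation and re-transcription until the labels stop changing."""
    bias = [0.0] * len(frames[0])
    labels = decode(frames, bias)
    for _ in range(iterations):
        bias = estimate_bias(frames, labels)
        new_labels = decode(frames, bias)
        if new_labels == labels:
            break
        labels = new_labels
    return bias, labels


if __name__ == "__main__":
    # Frames for "a", "b", "c" shifted by a speaker-specific bias of (2.5, 0).
    frames = [[2.5, 0.0], [6.5, 0.0], [2.5, 4.0]]
    print(decode(frames, [0.0, 0.0]))  # transcription without adaptation
    print(adapt(frames))               # jointly estimated bias and transcription
```

In this toy example the unadapted models mislabel the first frame, and the alternation recovers both the shift and the intended label sequence. Storing the estimated bias next to the transcription is the point of the sketch; the model form and the stopping rule are simplifying assumptions.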

Keywords

Speech Recognition, Transformation Vector, Word Error Rate, Input Sentence, Speaker Adaptation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Jain, N., Cole, R., Barnard, E.: Creating Speaker-Specific Phonetic Templates with a Speaker-Independent Phonetic Recognizer: Implications for Voice Dialing. In: Proc. of ICASSP 1996, pp. 881–884 (1996)
  2. Fontaine, V., Bourlard, H.: Speaker-Dependent Speech Recognition Based on Phone-Like Units Models-Application to Voice Dialing. In: Proc. of ICASSP 1997, pp. 1527–1530 (1997)
  3. Ramabhadran, B., Bahl, L.R., deSouza, P.V.: Acoustic-Only Based Automatic Phonetic Baseform Generation. In: Proc. of ICASSP 1998, pp. 2275–2278 (1998)
  4. Shozakai, M.: Speech Interface for Car Applications. In: Proc. of ICASSP 1999, pp. 1386–1389 (1999)
  5. Zavaliagkos, G., Schwartz, R., Makhoul, J.: Batch, Incremental and Instantaneous Adaptation Techniques for Speech Recognition. In: Proc. of ICASSP 1995, pp. 676–679 (1995)
  6. Sankar, A., Lee, C.H.: A Maximum-Likelihood Approach to Stochastic Matching for Robust Speech Recognition. IEEE Trans. on Speech and Audio Processing 4, 190–202 (1996)
  7. Sukkar, R.A., Lee, C.H.: Vocabulary independent discriminative utterance verification for non-keyword rejection in subword based speech recognition. IEEE Trans. on Speech and Audio Processing 4, 420–429 (1996)

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Weon-Goo Kim (1)
  • MinSeok Jang (2)
  • Chin-Hui Lee (3)
  1. Biometrics Engineering Research Center, School of Electronic and Information Eng., Kunsan National Univ., Kunsan, Korea
  2. Dept. of Computer Information Science, Kunsan National Univ., Kunsan, Korea
  3. School of Electrical and Computer Eng., Georgia Institute of Technology, Atlanta, USA
