Phonetic Question Generation Using Misrecognition

  • Supphanat Kanokphara
  • Julie Carson-Berndsen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4188)


Most automatic speech recognition systems are currently based on tied-state triphones, with the tied states usually determined by a decision tree. Decision trees automatically cluster triphone states into classes according to the available data, so that each class can be trained efficiently. To achieve higher accuracy, this clustering is constrained by manually generated phonetic questions, and the resulting tree can also be used to synthesize unseen triphones. The quality of a decision tree therefore depends on the quality of its phonetic questions. Unfortunately, creating phonetic questions by hand requires considerable time and resources. To overcome this problem, this paper presents an alternative method that generates phonetic questions automatically from misrecognition items. The questions are tested on the standard TIMIT phone recognition task.
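The abstract does not spell out how misrecognition items become questions, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It assumes that misrecognitions are available as (reference phone, recognized phone) pairs, and that phones which are frequently confused with each other can be grouped into a candidate question set. The function names, the `min_count` threshold, and the grouping by connected components are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def confusion_counts(pairs):
    """Count misrecognitions: how often a reference phone was recognized as another phone."""
    counts = defaultdict(int)
    for ref, hyp in pairs:
        if ref != hyp:  # correct recognitions carry no confusion information
            counts[(ref, hyp)] += 1
    return counts


def questions_from_confusions(pairs, min_count=2):
    """Group phones linked by frequent confusions into candidate question sets.

    Assumption (not from the paper): two phones belong to the same question
    if they are confused at least `min_count` times in either direction, and
    a question is a connected component of that confusion graph.
    """
    # Merge both confusion directions into one symmetric total per phone pair.
    symmetric = defaultdict(int)
    for (ref, hyp), n in confusion_counts(pairs).items():
        symmetric[frozenset((ref, hyp))] += n

    # Keep only sufficiently frequent confusions as graph edges.
    neighbours = defaultdict(set)
    for pair, n in symmetric.items():
        if n >= min_count:
            a, b = tuple(pair)
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Collect connected components with a depth-first search.
    seen = set()
    questions = []
    for phone in sorted(neighbours):
        if phone in seen:
            continue
        group, stack = set(), [phone]
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbours[current] - group)
        seen |= group
        questions.append(sorted(group))
    return questions


# Toy misrecognition list (invented for illustration, not TIMIT results).
toy_pairs = [("m", "n"), ("n", "m"), ("s", "z"), ("z", "s"),
             ("s", "z"), ("p", "t"), ("aa", "aa")]
print(questions_from_confusions(toy_pairs))
```

Each resulting set could then be offered to the tree-building step as a left-context and a right-context question, in the same way a manually written phone class would be. A single-linkage grouping like this can merge large chains of phones on real data, so a practical system would likely need a stricter criterion.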


Hidden Markov Model · Speech Recognition System · Automatic Speech Recognition System · Phone Recognition · Phone Recognizer





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Supphanat Kanokphara (1)
  • Julie Carson-Berndsen (1)
  1. School of Computer Science and Informatics, University College Dublin, Ireland
