Phonetic Question Generation Using Misrecognition
Most automatic speech recognition systems are currently based on tied-state triphones. These tied states are usually determined by a decision tree, which automatically clusters triphone states into classes according to the available data so that each class can be trained efficiently. To achieve higher accuracy, this clustering is constrained by manually generated phonetic questions. Moreover, the tree built from these phonetic questions can be used to synthesize unseen triphones. The quality of a decision tree therefore depends on the quality of the phonetic questions. Unfortunately, creating phonetic questions manually requires considerable time and resources. To overcome this problem, this paper presents an alternative method that generates the phonetic questions automatically from misrecognition items. The resulting questions are tested on the standard TIMIT phone recognition task.
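As a rough illustration of the idea, the sketch below groups phones that a recognizer confuses with one another into candidate question sets, each of which could then be posed as a left-context or right-context question during tree building. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the function name `questions_from_confusions`, the `min_count` threshold, the symmetric treatment of confusions, and the toy counts are all hypothetical.

```python
from collections import defaultdict


def questions_from_confusions(confusions, min_count=5):
    """Build candidate phone sets from misrecognition counts.

    confusions maps (reference_phone, recognized_phone) -> count.
    min_count is an assumed threshold for ignoring rare confusions.
    """
    neighbours = defaultdict(set)
    for (ref, hyp), count in confusions.items():
        if ref != hyp and count >= min_count:
            # Assumption: confusion is treated as symmetric, so each
            # phone is added to the other's neighbour set.
            neighbours[ref].add(hyp)
            neighbours[hyp].add(ref)
    # Each phone together with its confusable partners forms one
    # candidate set; duplicates are collapsed by the set comprehension.
    sets = {frozenset({p} | others) for p, others in neighbours.items()}
    return sorted(sorted(s) for s in sets)


# Toy, made-up counts for illustration only (not TIMIT results).
confusions = {
    ("m", "n"): 12, ("n", "m"): 9, ("n", "ng"): 7,
    ("s", "z"): 15, ("s", "sh"): 2, ("iy", "iy"): 200,
}
questions = questions_from_confusions(confusions)
for phone_set in questions:
    print(phone_set)
```

With these toy counts, the nasals and the pair s/z end up grouped, while the rare s/sh confusion falls below the threshold and correct recognitions are ignored.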
Keywords: Hidden Markov Model, Speech Recognition System, Automatic Speech Recognition System, Phone Recognition, Phone Recognizer