K. Choi and J. Hwang, “Baum-Welch HMM inversion for audio-to-visual conversion”, IEEE International Workshop on Multimedia Signal Processing, pp. 175–180, 1999.
C. Fisher, “Confusions among visually perceived consonants”, Journal of Speech and Hearing Research, vol. 11, pp. 796–804, 1968.
K. Grant, B. Walden, and P. Seitz, “Auditory-visual speech recognition by hearing-impaired subjects: consonant recognition, sentence recognition, and auditory-visual integration”, Journal of the Acoustical Society of America, vol. 103, pp. 2677–2690, 1998.
S. Morishima and H. Harashima, “A media conversion from speech to facial image for intelligent man-machine interface”, IEEE Journal on Selected Areas in Communications, vol. 9, no. 4, pp. 594–600, 1991.
NIST Speech Disc 1-1.1, TIMIT Acoustic-Phonetic Continuous Speech Corpus, October 1990.
L. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition”, Proceedings of the IEEE, vol. 77, no. 2, pp. 257–286, 1989.
R. Rao, T. Chen, and R. Mersereau, “Audio-to-visual conversion for multimedia communication”, IEEE Transactions on Industrial Electronics, vol. 45, no. 1, pp. 15–22, 1998.
A. Rogozan and P. Deléglise, “Adaptive fusion of acoustic and visual sources for automatic speech recognition”, Speech Communication, vol. 26, pp. 149–161, 1998.
S. Tamura and A. Waibel, “Noise reduction using connectionist models”, IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 553–556, 1988.
A. Viterbi, “Error bounds for convolutional codes and an asymptotically optimum decoding algorithm”, IEEE Transactions on Information Theory, vol. IT-13, pp. 260–267, 1967.
A. Waibel, T. Hanazawa, G. Hinton, K. Shikano, and K. Lang, “Phoneme recognition using time-delay neural networks”, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, no. 3, pp. 328–339, 1989.