Choi, K., Hwang, J.: Baum-Welch HMM inversion for audio-to-visual conversion, IEEE International Workshop on Multimedia Signal Processing, pp. 175–180, 1999.
Fisher, C.: Confusions among visually perceived consonants, Journal of Speech and Hearing Research, vol. 11, pp. 796–804, 1968.
Grant, K., Walden, B., Seitz, P.: Auditory-visual speech recognition by hearing-impaired subjects: consonant recognition, sentence recognition, and auditory-visual integration, Journal of the Acoustical Society of America, vol. 103, pp. 2677–2690, 1998.
Morishima, S., Harashima, H.: A media conversion from speech to facial image for intelligent man-machine interface, IEEE Journal on Selected Areas in Communications, vol. 9, no. 4, pp. 594–600, 1991.
Rabiner, L.: A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE, vol. 77, no. 2, pp. 257–286, 1989.
Rao, R., Chen, T., Mersereau, R.: Audio-to-visual conversion for multimedia communication, IEEE Transactions on Industrial Electronics, vol. 45, no. 1, pp. 15–22, 1998.
Rogozan, A., Deléglise, P.: Adaptive fusion of acoustic and visual sources for automatic speech recognition, Speech Communication, vol. 26, pp. 149–161, 1998.
Tamura, S., Waibel, A.: Noise reduction using connectionist models, IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 553–556, 1988.
TIMIT: Acoustic-phonetic continuous speech corpus, NIST Speech Disc 1-1.1, October 1990.
Waibel, A., Hanazawa, T., Hinton, G., Shikano, K., Lang, K.: Phoneme recognition using time-delay neural networks, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, no. 3, pp. 328–339, 1989.