Morphological Representation of Speech Knowledge for Automatic Speech Recognition Systems
This work proposes a technique for capturing the speech knowledge available in spectrograms by treating a spectrogram as a scene. A simple pattern analysis technique applied to these patterns reveals significant properties that are relevant to vocal tract transitions and are speaker independent in nature. The approach is classed under biological vision, since a biological vision system uses a global recognition strategy that considers the image as a “whole”. The recognition processor, the brain, uses symbols and symbolic relationships in the image for image interpretation. Likewise, its knowledge base, held in long-term memory, consists of symbols and the symbolic relationships among objects. To give the machine a capability similar to that of biological vision systems, the pattern of a speech spectrogram is described as a morphology of symbols and symbolic relationships. These symbols are then used for final hypothesis generation by statistical means.
Keywords: Speech Signal; Automatic Speech Recognition; Vocal Tract; Automatic Speech Recognition System; Morphological Representation