Knowledge Representation for Speech Processing
Automatic Speech Processing (ASP) deals not only with speech sounds but also with articulatory parameters (face images, physiological signals) and with phonetic and linguistic symbols. These entities vary widely in nature because they arise from multiple knowledge sources: acoustic, articulatory, perceptual, phonetic, lexical, syntactic, semantic, etc. This multiplicity of sources makes the recognition and synthesis tasks difficult.
Among the most important problems in ASP is the availability of “good knowledge”: knowledge that allows a pertinent and minimal description of the phonetic units, independent of speaker and vocabulary. One of the main difficulties is that the great quantity of information and the variability involved (due to the speaker, environment, etc.) make it impossible to find a single expert with complete knowledge. The knowledge needed by ASP systems can be obtained in several ways: from the knowledge of many experts, by automatic learning, or by mixed methods.
This last alternative seems the richest, since it remains controllable by the expert and makes it feasible to complete the expert's knowledge (derived from experience only) with methods similar to learning. From this perspective, a system embodying knowledge acquisition would provide the expert with tools to assess his knowledge, to quantify certain parameters that he currently uses, and to search for new knowledge more systematically. Such a system should be able to manage pertinent speech data (recorded sounds, articulatory parameters, acoustic spectra, etc.), to produce knowledge from these data as well as from the knowledge of experts, and to manage the knowledge obtained.
The proposed system must have three main components: a speech data and knowledge base (built following an object-oriented representation), an extensive specialized toolbox for speech processing, and a reasoning mechanism that controls the advanced knowledge processing (deduction, learning) and ensures the interfacing of all other system components. The main conceptual problems the system deals with are presented, and some proposed solutions are discussed.
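As a rough illustration of how the three components could fit together, the following Python sketch models an object-oriented data and knowledge base, a toolbox of processing functions, and a reasoning mechanism that mediates between them. All names here (SpeechObject, KnowledgeBase, Reasoner, duration_ms, the attribute keys) are hypothetical and chosen for the example only; the abstract does not specify the system's actual design or implementation language.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SpeechObject:
    """One entry in the data and knowledge base (hypothetical schema)."""
    name: str
    kind: str  # e.g. "sound", "spectrum", "derived"
    attributes: Dict[str, float] = field(default_factory=dict)


class KnowledgeBase:
    """Object-oriented store for speech data and derived knowledge."""

    def __init__(self) -> None:
        self._objects: Dict[str, SpeechObject] = {}

    def add(self, obj: SpeechObject) -> None:
        self._objects[obj.name] = obj

    def get(self, name: str) -> SpeechObject:
        return self._objects[name]

    def of_kind(self, kind: str) -> List[SpeechObject]:
        return [o for o in self._objects.values() if o.kind == kind]


# A toolbox entry maps a stored object to a set of measured values.
Tool = Callable[[SpeechObject], Dict[str, float]]


def duration_ms(obj: SpeechObject) -> Dict[str, float]:
    """Toy tool (assumption): duration from sample count and sampling rate."""
    seconds = obj.attributes["samples"] / obj.attributes["rate_hz"]
    return {"duration_ms": seconds * 1000.0}


class Reasoner:
    """Controls processing and interfaces the base with the toolbox."""

    def __init__(self, base: KnowledgeBase, toolbox: Dict[str, Tool]) -> None:
        self.base = base
        self.toolbox = toolbox

    def derive(self, tool_name: str, source_name: str) -> SpeechObject:
        # Run one tool on one stored object and store the result as new knowledge.
        source = self.base.get(source_name)
        measured = self.toolbox[tool_name](source)
        derived = SpeechObject(
            name=f"{source_name}.{tool_name}", kind="derived", attributes=measured
        )
        self.base.add(derived)
        return derived


base = KnowledgeBase()
base.add(SpeechObject("utt1", "sound", {"samples": 16000.0, "rate_hz": 16000.0}))
reasoner = Reasoner(base, {"duration_ms": duration_ms})
result = reasoner.derive("duration_ms", "utt1")
```

In this sketch the reasoner is the only component that touches both the base and the toolbox, which mirrors the interfacing role the text assigns to the reasoning mechanism; deduction and learning would be further methods on it.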
Keywords: knowledge representation, knowledge acquisition, automatic speech processing, data and knowledge bases