The IntelliMedia WorkBench-An Environment for Building Multimodal Systems
Purchase on Springer.com
$29.95 / €24.95 / £19.95*
* Final gross prices may vary according to local VAT.
Intelligent MultiMedia (IntelliMedia) focuses on the computer processing and understanding of signal and symbol input from at least speech, text and visual images in terms of semantic representations. We have developed a general suite of tools in the form of a software and hardware platform called “Chameleon” that can be tailored to conducting IntelliMedia in various application domains. Chameleon has an open distributed processing architecture and currently includes ten agent modules: blackboard, dialogue manager, domain model, gesture recogniser, laser system, microphone array, speech recogniser, speech synthesiser, natural language processor, and a distributed Topsy learner. Most of the modules are programmed in C and C++ and are glued together using the Dacs communications system. In effect, the blackboard, dialogue manager and Dacs form the kernel of Chameleon. Modules can communicate with each other and the blackboard which keeps a record of interactions over time via semantic representations in frames. Inputs to Chameleon can include synchronised spoken dialogue and images and outputs include synchronised laser pointing and spoken dialogue. An initial prototype application of Chameleon is an IntelliMedia Work-Bench where a user will be able to ask for information about things (e.g. 2D/3D models, pictures, objects, gadgets, people, or whatever) on a physical table. The current domain is a Campus Information System for 2D building plans which provides information about tenants, rooms and routes and can answer questions like Whose office is this? and Show me the route from Paul Mc Kevitt’s office to Paul Dalsgaard’s office. in real time. Chameleon and the IntelliMedia WorkBench are ideal for testing integrated signal and symbol processing of language and vision for the future of SuperinformationhighwayS.
- Bakman, L., M. Blidegn, T.D. Nielsen, and S. Carrasco Gonzalez (1997) NIVICO-Natural Interface for VIdeo COnferencing. Project Report (8th Semester), Department of Communication Technology, Institute for Electronic Systems, Aalborg University, Denmark.
- Bech, A. (1991) Description of the EUROTRA framework. In The Eurotra Formal Specifications, Studies in Machine Translation and Natural Language Processing, C. Copeland, J. Durand, S. Krauwer, and B. Maegaard (Eds), Vol. 2, 7–40. Luxembourg: Office for Official Publications of the Commission of the European Community.
- Br’lndsted, T. (1998) nlparser. http://www.kom.auc.dk/tb/nlparser
- Brøndsted, T., P. Dalsgaard, L.B. Larsen, M. Manthey, P. Mc Kevitt, T.B. Moeslund, and K.G. Olesen (1998) A platform for developing Intelligent MultiMedia applications. Technical Report R-98-1004, Center for PersonKommunikation (CPK), Institute for Electronic Systems (IES), Aalborg University, Denmark, May.
- Christensen, H., B. Lindberg, and P. Steingrimsson (1998) Functional specification of the CPK Spoken LANGuage recognition research system (SLANG). Center for PersonKommunikation, Aalborg University, Denmark, March.
- CPK Annual Report (1998) CPK Annual Report. Center for PersonKommunikation (CPK), Fredrik Bajers Vej 7-A2, Institute for Electronic Systems (IES), Aalborg University, DK-9220, Aalborg, Denmark.
- Denis, M. and M. Carfantan (Eds.) (1993) Images et langages: multimodalité et modelisation cognitive. Actes du Colloque Interdisciplinaire du Comité National de la Recherche Scientifique, Salle des Conférences, Siége du CNRS, Paris, April.
- Fink, G.A., N. Jungclaus, H. Ritter, and G. Sagerer (1995) A communication framework for heterogeneous distributed pattern analysis. In Proc. International Conference on Algorithms and Applications for Parallel Processing, V. L. Narasimhan (Ed.), 881–890. IEEE, Brisbane, Australia. CrossRef
- Fink, G.A., N. Jungclaus, and F. Kummert, H. Ritter, and G. Sagerer (1996) A distributed system for integrated speech and image understanding. In Proceedings of the International Symposium on Artificial Intelligence, Rogelio Soto (Ed.), 117–126. Cancun, Mexico.
- Infovox (1994) INFOVOX: Text-to-speech converter user’s manual (version 3.4). Solna, Sweden: Telia Promotor Infovox AB
- Jensen, F.V. (1996) An introduction to Bayesian Networks. London, England: UCL Press.
- Jensen, F. (1996) Bayesian belief network technology and the HUGIN system. In Proceedings of UNICOM seminar on Intelligent Data Management, Alex Gammerman (Ed.), 240–248. Chelsea Village, London, England, April.
- Kosslyn, S.M. and J.R. Pomerantz (1977) Imagery, propositions and the form of internal representations. In Cognitive Psychology, 9, 52–76. CrossRef
- Leth-Espensen, P. and B. Lindberg (1996) Separation of speech signals using eigen-filtering in a dual beamforming system. In Proc. IEEE Nordic Signal Processing Symposium (NORSIG), Espoo, Finland, September, 235–238.
- Manthey, M.J. (1998) The Phase Web Paradigm. In International Journal of General Systems, special issue on General Physical Systems Theories, K. Bowden (Ed.). in press.
- Mc Kevitt, P. (1994) Visions for language. In Proceedings of the Workshop on Integration of Natural Language and Vision processing, Twelfth American National Conference on Artificial Intelligence (AAAI-94), Seattle, Washington, USA, August, 47–57.
- Mc Kevitt, P. (Ed.) (1995/1996) Integration of Natural Language and Vision Processing (Vols. I-IV). Dordrecht, The Netherlands: Kluwer-Academic Publishers.
- Mc Kevitt, P. (1997) Superinformationhighway S. In “Sprog og Multimedier” (Speech and Multimedia), Tom Brøndsted and Inger Lytje (Eds.), 166–183, April 1997. Aalborg, Denmark: Aalborg Universitetsforlag (Aalborg University Press).
- Mc Kevitt, P. and P. Dalsgaard (1997) A frame semantics for an IntelliMedia Tour-Guide. In Proceedings of the Eighth Ireland Conference on Artificial Intelligence (AI-97), Volume 1, 104–111. University of Uster, Magee College, Derry, Northern Ireland, September.
- Minsky, M. (1975) A framework for representing knowledge. In The Psychology of Computer Vision, P.H. Winston (Ed.), 211–217. New York: McGraw-Hill.
- Nielsen, C., J. Jensen, O. Andersen, and E. Hansen (1997) Speech synthesis based on diphone concatenation. Technical Report, No. CPK971120-JJe (in confidence), Center for PersonKommunikation, Aalborg University, Denmark.
- Okada, N. (1997) Integrating vision, motion and language through mind. In Proceedings of the Eighth Ireland Conference on Artificial Intelligence (AI-97), Volume 1, 7–16. University of Uster, Magee, Derry, Northern Ireland, September.
- Pentland, A. (Ed.) (1993) Looking at people: recognition and interpretation of human action. IJCAI-93 Workshop (W28) at The 13th International Conference on Artificial Intelligence (IJCAI-93), Chambéry, France, August.
- Power, K., C. Matheson, D. Ollason, and R. Morton (1997) The grapHvite book (version 1.0). Cambridge, England: Entropic Cambridge Research Laboratory Ltd.
- Pylyshyn, Z. (1973) What the mind’s eye tells the mind’s brain: a critique of mental imagery. In Psychological Bulletin, 80, 1–24. CrossRef
- Rickheit, G. and I. Wachsmuth (1996) Collaborative Research Centre “Situated Artificial Communicators” at the University of Bielefeld, Germany. In Integration of Natural Language and Vision Processing, Volume IV, Recent Advances, Mc Kevitt, Paul (ed.), 11–16. Dordrecht, The Netherlands: Kluwer Academic Publishers.
- Thórisson, K.R. (1997) Layered action control in communicative humanoids. In Proceedings of Computer Graphics Europe’ 97, June 5–7, Geneva, Switzerland.
- Waibel, A., M.T. Vo, P. Duchnowski, and S. Manke (1996) Multimodal interfaces. In Integration of Natural Language and Vision Processing, Volume IV, Recent Advances, Mc Kevitt, Paul (Ed.), 145–165. Dordrecht, The Netherlands: Kluwer Academic Publishers.
- The IntelliMedia WorkBench-An Environment for Building Multimodal Systems
- Book Title
- Cooperative Multimodal Communication
- Book Subtitle
- Second International Conference, CMC’98 Tilburg, The Netherlands, January 28–30, 1998 Selected Papers
- pp 217-233
- Print ISBN
- Online ISBN
- Series Title
- Lecture Notes in Computer Science
- Series Volume
- Series ISSN
- Springer Berlin Heidelberg
- Copyright Holder
- Springer-Verlag Berlin Heidelberg
- Additional Links
- Industry Sectors
- eBook Packages
- Editor Affiliations
- 1. Computational Linguistics and AI Group, Tilburg University
- 2. Department of Information and Computing Science, Utrecht University
- Author Affiliations
- 5. Institute for Electronic Systems (IES), Aalborg University, Aalborg, Denmark
To view the rest of this content please follow the download PDF link above.