Abstract
Intelligent multimedia (IntelliMedia) involves the computer processing and understanding of perceptual input from at least speech, text and visual images, and then reacting to it. It is complex, drawing on signal and symbol processing techniques from not just engineering and computer science but also artificial intelligence and cognitive science (Mc Kevitt, 1994, 1995/96, 1997). With IntelliMedia systems, people can interact in spoken dialogues with machines, querying what is being presented, and even their gestures and body language can be interpreted.
References
Andersen, Ove, C. Hoequist, C. Nielsen. “Danish Research Ministry’s Initiative on Text-to-Speech Synthesis”. In: Proceedings of Nordic Signal Processing Symposium, Kolmården, Sweden, 2000.
André, Elisabeth, G. Herzog, T. Rist. “On the simultaneous interpretation of real-world image sequences and their natural language description: the system SOCCER”. In: Proceedings of the 8th European Conference on Artificial Intelligence, 449–454, Munich, Germany, 1988.
André, Elisabeth, Thomas Rist. “The design of illustrated documents as a planning task”. In: Intelligent multimedia interfaces. M. Maybury (Ed.), 75–93 Menlo Park, CA: AAAI Press, 1993.
Bakman, Lau, Mads Blidegn, Thomas Dorf Nielsen, Susana Carrasco Gonzalez. NIVICO - Natural Interface for Video COnferencing. Project Report (8th Semester), Department of Communication Technology, Institute for Electronic Systems, Aalborg University, Denmark, 1997.
Bech, A. “Description of the EUROTRA framework”. In: The Eurotra Formal Specifications, Studies in Machine Translation and Natural Language Processing. C. Copeland, J. Durand, S. Krauwer, B. Maegaard (Eds), Vol. 2, 7–40 Luxembourg: Office for Official Publications of the Commission of the European Community, 1991.
Brøndsted, Tom. “The CPK NLP Suite for Spoken Language Understanding”. In: Eurospeech, 6th European Conference on Speech Communication and Technology, Budapest, 1999a.
Brøndsted, Tom. “The Natural Language Processing Modules in REWARD and IntelliMedia 2000+”. In: LAMBDA 25, S. Kirchmeier-Andersen, H. Erdman Thomsen (Eds.). Copenhagen Business School, Dep. of Computational Linguistics, 1999b.
Brøndsted, Tom. “Reference Problems in Chameleon”. In: ESCA Tutorial and Research Workshop: Interactive Dialogue in Multi-Modal Systems. Kloster Irsee, 1999c.
Brøndsted, Tom, P. Dalsgaard, L.B. Larsen, M. Manthey, P. Mc Kevitt, T.B. Moeslund, K.G. Olesen. A platform for developing Intelligent MultiMedia applications. Technical Report R-98–1004, Center for PersonKommunikation (CPK), Institute for Electronic Systems (IES), Aalborg University, Denmark, May, 1998.
Carenini, G., F. Pianesi, M. Ponzi, O. Stock. Natural language generation and hypertext access. IRST Technical Report 9201–06, Istituto per la Ricerca Scientifica e Tecnologica, Loc. Pantè di Povo, I-38100 Trento, Italy, 1992.
Christensen, Heidi, Børge Lindberg, Pall Steingrimsson. Functional specification of the CPK Spoken LANGuage recognition research system (SLANG). Center for PersonKommunikation, Aalborg University, Denmark, March, 1998.
Denis, M., M. Carfantan (Eds.). Images et langages: multimodalité et modélisation cognitive. Actes du Colloque Interdisciplinaire du Comité National de la Recherche Scientifique, Salle des Conférences, Siège du CNRS, Paris, April, 1993.
Dennett, Daniel. Consciousness explained. Harmondsworth: Penguin, 1991.
Fink, G.A., N. Jungclaus, H. Ritter, G. Sagerer. “A communication framework for heterogeneous distributed pattern analysis”. In: Proc. International Conference on Algorithms and Applications for Parallel Processing. V. L. Narasimhan (Ed.), 881–890 IEEE, Brisbane, Australia, 1995.
Fink, Gernot A., Nils Jungclaus, Franz Kummert, Helge Ritter, Gerhard Sagerer. “A distributed system for integrated speech and image understanding”. In: Proceedings of the International Symposium on Artificial Intelligence. Rogelio Soto (Ed.), 117–126 Cancun, Mexico, 1996.
Herzog, G., C.-K. Sung, E. André, W. Enkelmann, H.-H. Nagel, T. Rist, W. Wahlster. “Incremental natural language description of dynamic imagery”. In: Wissensbasierte Systeme. 3. Internationaler GI-Kongress, C. Freksa, W. Brauer (Eds.), 153–162 Berlin: Springer-Verlag, 1989.
Herzog, G., G. Retz-Schmidt. “Das System SOCCER: Simultane Interpretation und natürlich-sprachliche Beschreibung zeitveränderlicher Szenen”. In: Sport und Informatik, J. Perl (Ed.), 95–119 Schorndorf: Hofmann, 1990.
Infovox. INFOVOX: Text-to-speech converter user’s manual (version 3.4). Solna, Sweden: Telia Promotor Infovox AB, 1994.
Jensen, Finn V. An introduction to Bayesian Networks. London, England: UCL Press, 1996.
Jensen, Frank. “Bayesian belief network technology and the HUGIN system”. In: Proceedings of UNICOM seminar on Intelligent Data Management. Alex Gammerman (Ed.), 240–248 Chelsea Village, London, England, April, 1996.
Kosslyn, S.M., J.R. Pomerantz. “Imagery, propositions and the form of internal representations”. In: Cognitive Psychology, 9, 52–76, 1977.
Leth-Espensen, P., B. Lindberg. “Separation of speech signals using eigenfiltering in a dual beamforming system”. In: Proc. IEEE Nordic Signal Processing Symposium (NORSIG). Espoo, Finland, September, 235–238, 1996.
Lindberg, Børge. “The Danish SpeechDat(II) Corpus - a Spoken Language Resource”. In: Datalingvistisk Forenings Årsmøde 1999 i København. Proceedings. CST Working Papers. Report No. 3, B. Maegaard, C. Povlsen, J. Wedekind (Eds), 1999.
Maaß, Wolfgang, Peter Wizinski, Gerd Herzog. VITRA GUIDE: Multimodal route descriptions for computer assisted vehicle navigation. Bereich Nr. 93, Universität des Saarlandes, FB 14 Informatik IV, Im Stadtwald 15, D-6600, Saarbrücken 11, Germany, February, 1993.
Manthey, Michael J. “The Phase Web Paradigm”. In: International Journal of General Systems, special issue on General Physical Systems Theories. K. Bowden (Ed.), 1998.
Maybury, Mark. “Planning multimedia explanations using communicative acts”. In: Proceedings of the Ninth American National Conference on Artificial Intelligence (AAAI-91), Anaheim, CA, July 14–19, 1991.
Maybury, Mark (Ed.). Intelligent multimedia interfaces. Menlo Park, CA: AAAI Press, 1993.
Maybury, Mark, Wolfgang Wahlster (Eds.). Readings in intelligent user interfaces. Los Altos, CA: Morgan Kaufmann Publishers, 1998.
Mc Kevitt, Paul. “Visions for language”. In: Proceedings of the Workshop on Integration of Natural Language and Vision processing. Twelfth American National Conference on Artificial Intelligence (AAAI-94), Seattle, Washington, USA, August, 47–57, 1994.
Mc Kevitt, Paul (Ed.). Integration of Natural Language and Vision Processing (Vols. I-IV). Dordrecht, The Netherlands: Kluwer-Academic Publishers, 1995/1996.
Mc Kevitt, Paul. “SuperinformationhighwayS”. In: Sprog og Multimedier. Tom Brøndsted, Inger Lytje (Eds.), 166–183, Aalborg, Denmark: Aalborg University Press, April, 1997.
Mc Kevitt, Paul, Paul Dalsgaard. “A frame semantics for an IntelliMedia TourGuide”. In: Proceedings of the Eighth Ireland Conference on Artificial Intelligence (AI-97), Volume 1, 104–111, University of Ulster, Magee College, Derry, Northern Ireland, September, 1997.
Minsky, Marvin. “A framework for representing knowledge”. In: The Psychology of Computer Vision. P.H. Winston (Ed.), 211–217 New York: McGraw-Hill, 1975.
Neumann, B., H.-J. Novak. “NAOS: Ein System zur natürlichsprachlichen Beschreibung zeitveränderlicher Szenen”. In: Informatik Forschung und Entwicklung, 1(1): 83–92, 1986.
Okada, Naoyuki. “Integrating vision, motion and language through mind”. In: Integration of Natural Language and Vision Processing, Volume IV, Recent Advances. Mc Kevitt, Paul (Ed.), 55–80 Dordrecht, The Netherlands: Kluwer Academic Publishers, 1996.
Okada, Naoyuki. “Integrating vision, motion and language through mind”. In: Proceedings of the Eighth Ireland Conference on Artificial Intelligence (AI-97), Volume 1, 7–16 University of Ulster, Magee, Derry, Northern Ireland, September, 1997.
Olsen, Jesper. The SLANG Platform: Design and Philosophy, v. 1. Technical Report, Center for PersonKommunikation, Aalborg University, September, 2000.
Partridge, Derek. A new guide to Artificial Intelligence. Norwood, New Jersey: Ablex Publishing Corporation, 1991.
Pentland, Alex (Ed.). Looking at people: recognition and interpretation of human action. IJCAI-93 Workshop (W28) at The 13th International Joint Conference on Artificial Intelligence (IJCAI-93), Chambéry, France, August, 1993.
Power, Kevin, Caroline Matheson, Dave Ollason, Rachel Morton. The grapHvite book (version 1.0), Cambridge, England: Entropic Cambridge Research Laboratory Ltd., 1997.
Pylyshyn, Zenon. “What the mind’s eye tells the mind’s brain: a critique of mental imagery”. In: Psychological Bulletin, 80, 1–24, 1973.
Rich, Elaine, Kevin Knight. Artificial Intelligence. New York: McGraw-Hill, 1991.
Rickheit, Gert, Ipke Wachsmuth. “Collaborative Research Centre ‘Situated Artificial Communicators’ at the University of Bielefeld, Germany”. In: Integration of Natural Language and Vision Processing, Volume IV, Recent Advances. Mc Kevitt, Paul (Ed.), 11–16, Dordrecht, The Netherlands: Kluwer Academic Publishers, 1996.
Retz-Schmidt, Gudula. “Recognizing intentions, interactions, and causes of plan failures”. In: User Modelling and User-Adapted Interaction 1: 173–202, 1991.
Retz-Schmidt, Gudula, Markus Tetzlaff. Methods for the intentional description of image sequences. Bereich Nr. 80, Universität des Saarlandes, FB 14 Informatik IV, Im Stadtwald 15, D-6600, Saarbrücken 11, Germany, August, 1991.
Stock, Oliviero. “Natural language and exploration of an information space: the ALFresco Interactive system”. In: Proceedings of the 12th International Joint Conference on Artificial Intelligence (IJCAI-91), 972–978, Darling Harbour, Sydney, Australia, August, 1991.
Thórisson, Kris R. Communicative humanoids: a computational model of psychosocial dialogue skills. Ph.D. thesis, Massachusetts Institute of Technology, 1996.
Thórisson, Kris R. “Layered action control in communicative humanoids”. In: Proceedings of Computer Graphics Europe ’97, June 5–7, Geneva, Switzerland, 1997.
Thórisson, Kris R. This book, 2001.
Wahlster, Wolfgang. One word says more than a thousand pictures: On the automatic verbalization of the results of image sequence analysis. Bereich Nr. 25, Universität des Saarlandes, FB 14 Informatik IV, Im Stadtwald 15, D-6600, Saarbrücken 11, Germany, February, 1988.
Wahlster, Wolfgang, Elisabeth André, Wolfgang Finkler, Hans-Jürgen Profitlich, Thomas Rist. “Plan-based integration of natural language and graphics generation”. In: Artificial Intelligence, Special issue on natural language generation, 63, 387–427, 1993.
Wahlster, Wolfgang, Norbert Reithinger, Anselm Blocher. “SmartKom: Multimodal Communication with a Life-Like Character”. In: Eurospeech, 7th European Conference on Speech Communication and Technology, Aalborg, 2001.
Waibel, Alex, Minh Tue Vo, Paul Duchnowski, Stefan Manke. “Multimodal interfaces”. In: Integration of Natural Language and Vision Processing, Volume IV, Recent Advances, Mc Kevitt, Paul (Ed.), 145–165, Dordrecht, The Netherlands: Kluwer Academic Publishers, 1996.
Waltz, David. “Understanding line drawings of scenes with shadows”. In: The psychology of computer vision, Winston, P.H. (Ed.), 19–91 New York: McGraw-Hill, 1975.
Copyright information
© 2002 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Brøndsted, T., Larsen, L.B., Manthey, M., Mc Kevitt, P., Moeslund, T.B., Olesen, K.G. (2002). Developing Intelligent Multimedia Applications. In: Granström, B., House, D., Karlsson, I. (eds) Multimodality in Language and Speech Systems. Text, Speech and Language Technology, vol 19. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-2367-1_7
DOI: https://doi.org/10.1007/978-94-017-2367-1_7
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-6024-2
Online ISBN: 978-94-017-2367-1
eBook Packages: Springer Book Archive