Generating coherent presentations employing textual and visual material

André, Elisabeth; Rist, Thomas

doi:10.1007/BF00849177

Published: June 1995

Artificial Intelligence Review, volume 9, pages 147–165 (1995)

Abstract

The objective of the work described in this paper is the development of an intelligent generation system which is able to combine textual and visual material. As coherent presentations cannot be generated by simply merging verbalization and visualization results into multimedia output, the processes for content determination, medium selection and content realization in different media have to be carefully coordinated. We first show that multimedia presentations and pure text follow similar structuring principles. Based on this insight, we sketch how techniques for planning text and discourse can be generalized to allow the structure and contents of multimedia communications to be planned as well. In particular, we explain how our approach handles the crucial task of process coordination.

Author information

Authors and Affiliations

German Research Center for Artificial Intelligence (DFKI), Stuhlsatzenhausweg 3, D-66123, Saarbrücken
Elisabeth André & Thomas Rist

About this article

Cite this article

André, E., Rist, T. Generating coherent presentations employing textual and visual material. Artif Intell Rev 9, 147–165 (1995). https://doi.org/10.1007/BF00849177

Issue Date: June 1995
DOI: https://doi.org/10.1007/BF00849177
