
AI Meets Authoring: User Models for Intelligent Multimedia

  • Chapter
Integration of Natural Language and Vision Processing

Abstract

Authoring is a complex, knowledge-intensive activity which until recently has been performed exclusively by humans. New computer-based techniques have added horsepower rather than intelligence to traditional approaches, and have not addressed their principal limitations, chief of which is the inability to tailor presentations to individual users at run-time.

We believe that supporting this kind of run-time determination of form and content requires a model of the user. We describe our approach to the acquisition, representation and exploitation of user models. The most plausible user model is obtained by an abductive recognition process; it incorporates assumptions about the user, which then constrain the abductive design of the best presentation. Both the recognition and the design processes are performed at run-time. We describe a prototype implementation that demonstrates these ideas in the domain of video authoring.

Our approach to authoring is intended to apply across multiple media. We have demonstrated these ideas with video for two reasons: authoring video with traditional approaches inherits and exacerbates the problems of traditional media, and the popularity of video as an authoring medium continues to grow.
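
The two-stage idea in the abstract, recognising the most plausible user model and then letting its assumptions constrain the presentation, can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the chapter: the hypothesis names, the prior probabilities, the observation labels, the clip catalogue, and the functions `recognise` and `design` are all hypothetical. Recognition is done by brute-force enumeration of assumption sets, and the design step is reduced to a simple filter over candidate clips rather than a full abductive design process.

```python
from itertools import product

# Hypothetical assumptions about the user, with made-up prior probabilities.
HYPOTHESES = {
    "expertise": {"novice": 0.6, "expert": 0.4},
    "interest": {"overview": 0.7, "detail": 0.3},
}

# Hypothetical rules: the observations each assumption would explain.
PREDICTS = {
    ("expertise", "novice"): {"asked_for_help"},
    ("expertise", "expert"): {"used_shortcuts"},
    ("interest", "overview"): {"skipped_ahead"},
    ("interest", "detail"): {"replayed_clip"},
}

# Hypothetical catalogue of annotated video clips.
CLIPS = [
    {"name": "intro", "level": "novice", "kind": "overview"},
    {"name": "walkthrough", "level": "novice", "kind": "detail"},
    {"name": "summary", "level": "expert", "kind": "overview"},
    {"name": "deep_dive", "level": "expert", "kind": "detail"},
]


def recognise(observations):
    """Return (model, probability) for the most probable assumption set
    that explains every observation, or (None, 0.0) if none does."""
    best, best_p = None, 0.0
    variables = sorted(HYPOTHESES)
    for values in product(*(HYPOTHESES[v] for v in variables)):
        model = dict(zip(variables, values))
        # Collect everything this candidate model would explain.
        predicted = set()
        for item in model.items():
            predicted |= PREDICTS[item]
        if not observations <= predicted:
            continue  # this model leaves some observation unexplained
        # Treat the assumptions as independent: multiply their priors.
        p = 1.0
        for var, val in model.items():
            p *= HYPOTHESES[var][val]
        if p > best_p:
            best, best_p = model, p
    return best, best_p


def design(model):
    """Select the clips compatible with the recognised user model."""
    return [
        clip["name"]
        for clip in CLIPS
        if clip["level"] == model["expertise"]
        and clip["kind"] == model["interest"]
    ]


if __name__ == "__main__":
    model, p = recognise({"used_shortcuts"})
    print(model, design(model))
```

Both steps run when the observations arrive, which mirrors the run-time emphasis of the abstract; a realistic system would replace the enumeration with a proper abductive reasoner and the filter with a richer design procedure.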




Copyright information

© 1995 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Csinger, A., Booth, K.S., Poole, D. (1995). AI Meets Authoring: User Models for Intelligent Multimedia. In: Mc Kevitt, P. (eds) Integration of Natural Language and Vision Processing. Springer, Dordrecht. https://doi.org/10.1007/978-94-011-0273-5_16


  • DOI: https://doi.org/10.1007/978-94-011-0273-5_16

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-010-4121-8

  • Online ISBN: 978-94-011-0273-5

  • eBook Packages: Springer Book Archive
