Personalized Multimedia Summarization

Algorithms in Ambient Intelligence

Part of the book series: Philips Research (PRBS, volume 2)

Abstract

Summarization and abstraction are our survival tools in this age of information explosion and availability. The ability to summarize will be seen as an essential part of the intelligent behavior of consumer devices.

We introduce the notion of video summarization and define the different flavors of summaries: video skim, highlights, and structured multimedia summary. We present features and methods for automatic video content analysis for the purpose of summarization: color analysis using Super-histograms, transcript extraction and analysis, superimposed and overlaid text detection, face detection, and commercial detection. All of these types of video content analysis are used to produce multimedia summaries. We use audio, visual, and text information to select the important elements of news stories for a multimedia news summary.
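As a rough illustration of what histogram-based color analysis can look like, the sketch below groups consecutive frames whose quantized color histograms are similar. It is not the Super-histogram method itself, whose details this abstract does not give; the function names, the bin count, the histogram-intersection measure, and the 0.7 threshold are all assumptions chosen for the example.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Normalized histogram of quantized RGB pixels (channel values 0-255)."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_: n / total for bin_, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical color distributions."""
    return sum(min(p, h2.get(bin_, 0.0)) for bin_, p in h1.items())


def group_frames(frames, threshold=0.7):
    """Start a new group whenever a frame differs enough from the previous one.

    `frames` is a list of non-empty pixel lists; the threshold is an
    illustrative assumption, not a value taken from the chapter.
    """
    groups = []
    prev = None
    for index, pixels in enumerate(frames):
        hist = color_histogram(pixels)
        if prev is None or histogram_intersection(prev, hist) < threshold:
            groups.append([])
        groups[-1].append(index)
        prev = hist
    return groups


# Two reddish frames followed by a blue frame.
frames = [[(255, 0, 0)] * 4, [(250, 5, 5)] * 4, [(0, 0, 255)] * 4]
groups = group_frames(frames)
```

Here the two reddish frames fall into one group and the blue frame starts a second, which is the kind of grouping a color-based summary could build on.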

We present a method for surface-level summarization and its applications: a talk show browser and a content-based video recorder called Video Scout. We also discuss a method for news story segmentation and summarization as an example of structured video summarization.

Summaries presented to users should be personalized. Our initial study shows that personal preferences include implicit features, such as viewing history and summary usage history, and explicit preferences, such as topics, location, age, gender, profession, hobbies, allocated time, and consumption preferences (auditory vs. visual). We believe that personalized summarization will provide essential tools for access to relevant information everywhere and at any time in the context of ambient intelligence.
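To make the role of explicit preferences concrete, here is a minimal sketch of how topic preferences and allocated time might drive segment selection. The `Segment` type, the `select_segments` function, the topic weights, and the greedy strategy are hypothetical and are not taken from the chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    title: str
    topics: frozenset
    duration_s: int


def select_segments(segments, topic_weights, allocated_time_s):
    """Greedily pick the best-matching segments that fit the time budget.

    A segment's score is the sum of the user's weights for its topics.
    Segments with no matching topic are never selected.
    """
    def score(seg):
        return sum(topic_weights.get(topic, 0.0) for topic in seg.topics)

    chosen, used = [], 0
    for seg in sorted(segments, key=score, reverse=True):
        if score(seg) > 0 and used + seg.duration_s <= allocated_time_s:
            chosen.append(seg)
            used += seg.duration_s
    return chosen


stories = [
    Segment("Match report", frozenset({"sports"}), 60),
    Segment("Election update", frozenset({"politics"}), 90),
    Segment("Forecast", frozenset({"weather"}), 30),
]
weights = {"politics": 1.0, "weather": 0.5}  # hypothetical user profile

short = [s.title for s in select_segments(stories, weights, 100)]
longer = [s.title for s in select_segments(stories, weights, 120)]
```

With a 100-second budget only the election story fits; raising the budget to 120 seconds also admits the forecast. Implicit features such as viewing history could feed the same weights, but that step is left out of this sketch.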

Copyright information

© 2004 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Agnihotri, L., Dimitrova, N. (2004). Personalized Multimedia Summarization. In: Verhaegh, W.F.J., Aarts, E., Korst, J. (eds) Algorithms in Ambient Intelligence. Philips Research, vol 2. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-0703-9_5

  • DOI: https://doi.org/10.1007/978-94-017-0703-9_5

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-90-481-6490-5

  • Online ISBN: 978-94-017-0703-9

  • eBook Packages: Springer Book Archive
