Abstract
Summarization and abstraction are our survival tools in this age of information explosion and availability. The ability to summarize will be seen as an essential part of the intelligent behavior of consumer devices.
We introduce the notion of video summarization and define the different flavors of summaries: video skims, highlights, and structured multimedia summaries. We present different features and methods for automatic video content analysis for the purpose of summarization: color analysis using super-histograms, transcript extraction and analysis, superimposed and overlaid text detection, face detection, and commercial detection. All these types of video content analysis are used to produce multimedia summaries. We use audio, visual, and text information to select important elements of news stories for a multimedia news summary.
We present a method for surface-level summarization and its applications: a talk show browser and a content-based video recorder called Video Scout. We also discuss a method for news story segmentation and summarization as an example of structured video summarization.
Summaries presented to users should be personalized. Our initial study shows that personal preferences include implicit features, such as viewing history and summary usage history, and explicit preferences, such as topics, location, age, gender, profession, hobbies, allocated time, and consumption preferences (auditory vs. visual). We believe that personalized summarization will provide essential tools for access to relevant information anywhere, at any time, in the context of ambient intelligence.
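As a rough illustration of how the implicit and explicit preferences listed above might be represented and applied, the following Python sketch ranks candidate segments against a user profile and fills the user's allocated time. It is not the chapter's method: the names `UserProfile`, `Segment`, `score`, and `personalize`, the 0.1 weight on viewing history, and the greedy selection are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    """Hypothetical container for some preference types named in the abstract."""
    # Explicit preferences, stated by the user.
    topics: set[str] = field(default_factory=set)
    allocated_seconds: float = 300.0
    prefers_visual: bool = True  # consumption preference: visual vs. auditory
    # Implicit preference, observed over time: how often each topic was viewed.
    viewed_topic_counts: dict[str, int] = field(default_factory=dict)


@dataclass
class Segment:
    """Hypothetical summary candidate, for example one news story."""
    title: str
    topics: set[str]
    duration_seconds: float


def score(segment: Segment, profile: UserProfile) -> float:
    """Assumed scoring rule: explicit topic matches plus a small history bonus."""
    explicit = len(segment.topics & profile.topics)
    implicit = sum(profile.viewed_topic_counts.get(t, 0) for t in segment.topics)
    return explicit + 0.1 * implicit  # 0.1 is an arbitrary illustrative weight


def personalize(segments: list[Segment], profile: UserProfile) -> list[Segment]:
    """Greedily keep the best-scoring segments that fit in the allocated time."""
    ranked = sorted(segments, key=lambda s: score(s, profile), reverse=True)
    chosen: list[Segment] = []
    used = 0.0
    for seg in ranked:
        if score(seg, profile) <= 0:
            continue  # nothing in the profile points to this segment
        if used + seg.duration_seconds <= profile.allocated_seconds:
            chosen.append(seg)
            used += seg.duration_seconds
    return chosen


if __name__ == "__main__":
    profile = UserProfile(
        topics={"sports"},
        allocated_seconds=120,
        viewed_topic_counts={"weather": 5},
    )
    stories = [
        Segment("Match report", {"sports"}, 60),
        Segment("Forecast", {"weather"}, 50),
        Segment("Markets", {"finance"}, 30),
        Segment("Cup final", {"sports", "weather"}, 40),
    ]
    for seg in personalize(stories, profile):
        print(seg.title)
```

A greedy fill is used here only because it is short; a real system could equally treat the time budget as a knapsack problem or weight the modalities according to the auditory vs. visual preference.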
Copyright information
© 2004 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Agnihotri, L., Dimitrova, N. (2004). Personalized Multimedia Summarization. In: Verhaegh, W.F.J., Aarts, E., Korst, J. (eds) Algorithms in Ambient Intelligence. Philips Research, vol 2. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-0703-9_5
DOI: https://doi.org/10.1007/978-94-017-0703-9_5
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-6490-5
Online ISBN: 978-94-017-0703-9
eBook Packages: Springer Book Archive