Abstract
This chapter presents a framework for segmenting and categorizing videos. Rather than matching content directly, the framework exploits the semantic structure of videos and employs domain knowledge. Television and movie directors often follow general rules when presenting their programs, and these rules are used to develop a systematic categorization method that corresponds to human perception. Extensive experiments on a variety of video genres demonstrate the effectiveness of the proposed approach.
Copyright information
© 2003 Springer Science+Business Media New York
About this chapter
Cite this chapter
Rasheed, Z., Shah, M. (2003). Video Categorization Using Semantics and Semiotics. In: Rosenfeld, A., Doermann, D., DeMenthon, D. (eds) Video Mining. The Springer International Series in Video Computing, vol 6. Springer, Boston, MA. https://doi.org/10.1007/978-1-4757-6928-9_7
DOI: https://doi.org/10.1007/978-1-4757-6928-9_7
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-5383-4
Online ISBN: 978-1-4757-6928-9
eBook Packages: Springer Book Archive