Multimedia Tools and Applications, Volume 26, Issue 3, pp 345–363

An Integrated Framework for Semantic Annotation and Adaptation

  • M. Bertini
  • R. Cucchiara
  • A. Del Bimbo
  • A. Prati


Tools for interpreting significant events in video and for adapting video clips can effectively support the automatic extraction and distribution of relevant content from video streams. In particular, adaptation can adjust meaningful content, previously detected and extracted, to the capabilities and requirements of the user or client. Integrating these two functions is increasingly important because of the growing demand for multimedia data from remote clients with limited resources (PDAs, HCCs, smartphones). In this paper we propose a unified framework for event-based and object-based semantic extraction from video and for on-line semantic adaptation. Two application cases are analyzed and discussed: highlight detection and recognition in soccer videos, and detection of people's behavior in domotic applications.


semantic annotation, semantic adaptation, semantic transcoding, video adaptation, event detection, motion segmentation, performance evaluation





Copyright information

© Springer Science + Business Media, Inc. 2005

Authors and Affiliations

  • M. Bertini (1)
  • R. Cucchiara (2)
  • A. Del Bimbo (1)
  • A. Prati (2)

  1. D.S.I., Università di Firenze, Italy
  2. D.I.I., Università di Modena e Reggio Emilia, Italy
