Multimedia Systems, Volume 13, Issue 2, pp 103–118

Integrating semantic analysis and scalable video coding for efficient content-based adaptation

  • Luis Herranz


Scalable video coding has become a key technology for deploying systems in which content is adapted to diverse constrained usage environments (such as PDAs, mobile phones and networks) in a simple and efficient way. Content-based adaptation and summarization aim to provide improved adaptation to the user by optimizing the semantic coverage of the adapted or summarized version. This paper proposes integrating content analysis with the scalable video adaptation paradigm. The two must be combined in such a way that the efficiency of scalable adaptation is not compromised. An integrated framework for semantic video adaptation is proposed, together with an adaptive skimming scheme that can use the results of semantic analysis. Both are described using the MPEG-21 DIA tools so that adaptation is provided within a standard framework. In particular, the case of activity analysis is described to illustrate how semantic analysis is integrated into the framework and how it is used for online content summarization and adaptation. Overall efficiency is achieved by computing activity through compressed-domain analysis, with several metrics evaluated as measures of activity.
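As a rough illustration of activity-driven skimming, the sketch below selects which groups of pictures (GOPs) to keep from a sequence, given one activity score per GOP. It is not the scheme proposed in the paper: the function name `skim_by_activity`, the per-GOP granularity, the `target_ratio` parameter and the "keep the most active GOPs" rule are all assumptions made for the example, and the scores are taken as given (for instance, some motion measure extracted in the compressed domain).

```python
from typing import List, Sequence


def skim_by_activity(activity: Sequence[float], target_ratio: float) -> List[int]:
    """Return the indices of the GOPs to keep, in temporal order.

    `activity` holds one hypothetical activity score per GOP and
    `target_ratio` is the fraction of GOPs the skim may contain.
    Both are assumptions of this sketch, not part of the paper.
    """
    if not 0.0 < target_ratio <= 1.0:
        raise ValueError("target_ratio must be in (0, 1]")
    n = len(activity)
    if n == 0:
        return []
    # Budget: number of GOPs allowed in the skim (always at least one).
    budget = max(1, round(n * target_ratio))
    # Rank GOPs by activity, highest first; on ties the earlier GOP wins.
    ranked = sorted(range(n), key=lambda i: (-activity[i], i))
    # Restore temporal order so the skim plays forward.
    return sorted(ranked[:budget])


if __name__ == "__main__":
    scores = [0.1, 0.9, 0.4, 0.8, 0.2, 0.7]
    print(skim_by_activity(scores, 0.5))
```

Dropping whole GOPs is chosen here because it only requires discarding parts of the stream, which is in the spirit of scalable adaptation without re-encoding; a real system would also need to respect the dependencies of the actual bitstream.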


MPEG-21; Digital item adaptation; Scalable video; Video summarization; Semantic video adaptation





Copyright information

© Springer-Verlag 2007

Authors and Affiliations

  1. Grupo de Tratamiento de Imágenes, Escuela Politécnica Superior, Universidad Autónoma de Madrid, Madrid, Spain
