Multimedia Tools and Applications, Volume 55, Issue 2, pp 307–331

Annotation based personalized adaptation and presentation of videos for mobile applications

  • Sarah De Bruyne
  • Peter Hosten
  • Cyril Concolato
  • Mark Asbach
  • Jan De Cock
  • Michael Unger
  • Jean Le Feuvre
  • Rik Van de Walle


Personalized multimedia content that suits user preferences and the usage environment, and thereby improves the user experience, is becoming increasingly important. In this paper, we describe an architecture for personalized video adaptation and presentation for mobile applications that is guided by automatically generated annotations. With this annotation information, more intelligent adaptation techniques can be realized: when a bit rate reduction is necessary, they primarily reduce the quality of unimportant regions. Furthermore, a presentation layer is added so that advanced multimedia viewers can adequately present the interesting parts of a video when the user wants to zoom in. This architecture is the result of collaborative research done in the EU FP6 IST INTERMEDIA project.


Keywords: Annotation; Adaptation; Rich media presentation; Personalized multimedia



The research activities described in this paper were co-funded by Ghent University, the Interdisciplinary Institute for Broadband Technology (IBBT), the Institute for the Promotion of Innovation by Science and Technology in Flanders (IWT), the Fund for Scientific Research-Flanders (FWO-Flanders), the Belgian Federal Science Policy Office (BFSPO), and the European Union (within the framework of the NoE INTERMEDIA, IST-038419).



Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Sarah De Bruyne (1, corresponding author)
  • Peter Hosten (2)
  • Cyril Concolato (3)
  • Mark Asbach (2)
  • Jan De Cock (1)
  • Michael Unger (2)
  • Jean Le Feuvre (3)
  • Rik Van de Walle (1)

  1. Department of Electronics and Information Systems, Multimedia Lab, Ghent University, IBBT, Ledeberg-Ghent, Belgium
  2. Institute of Communication Engineering, RWTH Aachen University, Aachen, Germany
  3. Multimedia Group, Signal and Image Processing Department, Telecom ParisTech, Paris, France
