Multimedia Tools and Applications, Volume 51, Issue 2, pp 819–861

Algorithms for video retargeting

  • Stephan Kopf
  • Thomas Haenselmann
  • Johannes Kiess
  • Benjamin Guthier
  • Wolfgang Effelsberg

Abstract

The visualization of high-resolution video on small mobile devices remains a great challenge. The most critical constraints are the limited display resolution and the differing aspect ratios of handheld devices. So far, no retargeting algorithm guarantees good results for all videos. We introduce a new video retargeting approach that reduces the resolution while preserving as much of the relevant content as possible. A central component of the system selects the most suitable algorithm to adapt a given shot. We have implemented two retargeting algorithms: a region of interest (ROI) based technique, and a fast implementation of seam carving for size adaptation of videos (FSCAV). The ROI-based retargeting detects important regions such as faces, objects, text, and contrast-based saliency regions. It then selects a rectangular window within the larger frame that defines the visible area of the target video. If several relevant regions are detected, an artificial camera motion (pan, tilt, or zoom) may change the selected view within a shot. For seam carving, we present two extensions: the first reduces the distortion of straight lines (which may otherwise become curved or disconnected); the second avoids jitter in the target video, limits the large memory requirements and computational effort of seam carving, and makes it applicable to video retargeting. In addition, we present a heuristic that estimates the visual quality of the target video. If the quality drops below a threshold, the ROI-based retargeting is used for that shot instead. User evaluations confirm a very high visual quality of our approach.
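
To make the seam-carving idea concrete, the following is a minimal sketch of classic single-image seam carving: a gradient-based energy map, a dynamic-programming search for the cheapest vertical seam, and removal of that seam until the target width is reached. It is not the FSCAV implementation and omits both extensions mentioned above (straight-line preservation and jitter avoidance). The function names, the choice of energy function, and the grayscale-only input are assumptions made for illustration.

```python
import numpy as np


def energy_map(gray):
    """Simple gradient-magnitude energy |dI/dx| + |dI/dy| (an assumed choice)."""
    gy, gx = np.gradient(gray.astype(np.float64))
    return np.abs(gx) + np.abs(gy)


def find_vertical_seam(energy):
    """Return, per row, the column of the minimum-energy 8-connected seam."""
    h, w = energy.shape
    cost = energy.copy()
    # Forward pass: accumulate the cheapest path cost from the top row down.
    for y in range(1, h):
        prev = cost[y - 1]
        left = np.concatenate(([np.inf], prev[:-1]))
        right = np.concatenate((prev[1:], [np.inf]))
        cost[y] += np.minimum(np.minimum(left, prev), right)
    # Backward pass: trace the seam from the cheapest bottom pixel upwards.
    seam = np.empty(h, dtype=np.int64)
    seam[-1] = int(np.argmin(cost[-1]))
    for y in range(h - 2, -1, -1):
        x = int(seam[y + 1])
        lo = max(x - 1, 0)
        hi = min(x + 2, w)
        seam[y] = lo + int(np.argmin(cost[y, lo:hi]))
    return seam


def remove_vertical_seam(gray, seam):
    """Delete one pixel per row, shrinking the image width by one."""
    h, w = gray.shape
    keep = np.ones((h, w), dtype=bool)
    keep[np.arange(h), seam] = False
    return gray[keep].reshape(h, w - 1)


def carve_width(gray, target_width):
    """Remove vertical seams one at a time until the target width is reached."""
    out = gray
    while out.shape[1] > target_width:
        out = remove_vertical_seam(out, find_vertical_seam(energy_map(out)))
    return out
```

Calling `carve_width(frame, new_width)` on a 2D grayscale array returns a narrower array. Applying this independently to every frame of a video would generally produce different seams per frame and therefore visible jitter, which is the problem the second extension described above is meant to address.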

Keywords

Video retargeting, Video adaptation, Seam carving, Region of interest, Contrast-based saliency

Acknowledgements

The authors acknowledge the financial support granted by the Deutsche Forschungsgemeinschaft (DFG). We would like to thank the following flickr.com users for providing their images under a Creative Commons license: teoruiz (bridge.jpg), the_tahoe_guy (road.jpg) and digital_cat (construction_site.jpg). We thank Instituto Luce for providing historical films within the European research project ECHO. Furthermore, we would like to thank Sabine Olawsky for developing the contrast-based saliency detection.

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Stephan Kopf (1)
  • Thomas Haenselmann (1)
  • Johannes Kiess (1)
  • Benjamin Guthier (1)
  • Wolfgang Effelsberg (1)
  1. Department of Computer Science IV, University of Mannheim, Mannheim, Germany
