Algorithms for video retargeting

Abstract

Displaying high-resolution video on small mobile devices remains a major challenge. The most critical constraints are the limited display resolution and the differing aspect ratios of handheld devices. So far, no retargeting algorithm guarantees good results for all videos. We introduce a new video retargeting approach that reduces the resolution while preserving as much of the relevant content as possible. A central component of the system selects the most suitable algorithm to adapt a given shot. We have implemented two retargeting algorithms: a region of interest (ROI) based technique and a fast implementation of seam carving for size adaptation of videos (FSCAV). The ROI-based retargeting detects important regions such as faces, objects, text, and contrast-based saliency regions. It then selects a rectangular window within the larger frame that defines the visible area of the target video. If several relevant regions are detected, an artificial camera motion (pan, tilt, or zoom) may change the selected view within a shot. For seam carving, we present two extensions. The first reduces the distortion of straight lines, which may otherwise become curved or disconnected. The second avoids jitter in the target video, limits the large memory requirements and computational effort of seam carving, and makes it applicable to video retargeting. In addition, we present a heuristic that estimates the visual quality of the target video. If the estimated quality drops below a threshold, the ROI-based retargeting is used for this shot. User evaluations confirm a very high visual quality of our approach.
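
For readers unfamiliar with seam carving, the following is a minimal sketch of the basic single-image technique (Avidan and Shamir 2007) on which FSCAV builds. It is not the FSCAV implementation and contains neither the straight-line extension nor the jitter handling described above. The forward-difference energy function and all function names are assumptions chosen for illustration.

```python
def energy_map(img):
    """Assumed energy: |forward horizontal difference| + |forward vertical difference|."""
    h, w = len(img), len(img[0])
    energy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = img[y][min(x + 1, w - 1)] - img[y][x]
            dy = img[min(y + 1, h - 1)][x] - img[y][x]
            energy[y][x] = abs(dx) + abs(dy)
    return energy


def find_vertical_seam(energy):
    """Return one column index per row for the connected seam of minimal total energy."""
    h, w = len(energy), len(energy[0])
    # cost[y][x] = energy of the cheapest seam that ends at pixel (x, y)
    cost = [row[:] for row in energy]
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backtrack from the cheapest pixel in the last row, moving at most one column per row.
    x = min(range(w), key=lambda i: cost[h - 1][i])
    seam = [x]
    for y in range(h - 2, -1, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y][i])
        seam.append(x)
    seam.reverse()
    return seam


def remove_vertical_seam(img, seam):
    """Delete the seam pixel from every row, narrowing the image by one column."""
    return [row[:x] + row[x + 1:] for row, x in zip(img, seam)]


def retarget_width(img, new_width):
    """Repeatedly remove the lowest-energy vertical seam until the target width is reached."""
    out = [list(row) for row in img]
    while len(out[0]) > new_width:
        out = remove_vertical_seam(out, find_vertical_seam(energy_map(out)))
    return out


# Usage: a 4x6 grayscale image with one bright vertical stripe.
image = [[0, 0, 0, 9, 0, 0] for _ in range(4)]
narrow = retarget_width(image, 3)
```

In this toy input, the seams pass through the uniform background, so the bright stripe survives the width reduction. Applying such a procedure independently to each frame of a video is what produces the jitter and the high computational cost that the second extension addresses.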

Acknowledgements

The authors acknowledge the financial support granted by the Deutsche Forschungsgemeinschaft (DFG). We would like to thank the following flickr.com users for providing their images under a Creative Commons license: teoruiz (bridge.jpg), the_tahoe_guy (road.jpg), and digital_cat (construction_site.jpg). We thank Instituto Luce for providing historical films within the European research project ECHO. Furthermore, we would like to thank Sabine Olawsky for developing the contrast-based saliency detection.

Author information

Corresponding author

Correspondence to Stephan Kopf.

About this article

Cite this article

Kopf, S., Haenselmann, T., Kiess, J. et al. Algorithms for video retargeting. Multimed Tools Appl 51, 819–861 (2011). https://doi.org/10.1007/s11042-010-0717-6

Keywords

  • Video retargeting
  • Video adaptation
  • Seam carving
  • Region of interest
  • Contrast-based saliency