Algorithms for video retargeting
Abstract
Displaying high-resolution video on small mobile devices remains a challenge. The most critical constraints are the limited display resolution and the differing aspect ratios of handheld devices. So far, no retargeting algorithm guarantees good results for all videos. We introduce a new video retargeting approach that reduces the resolution while preserving as much of the relevant content as possible. A central component of the system selects the most suitable algorithm to adapt a given shot. We have implemented two retargeting algorithms: a region of interest (ROI) based technique, and a fast implementation of seam carving for size adaptation of videos (FSCAV). The ROI-based retargeting detects important regions such as faces, objects, text, and contrast-based saliency regions. A rectangular window within the larger frame is then selected that defines the visible area of the target video. If several relevant regions are detected, an artificial camera motion (pan, tilt, or zoom) may change the selected view within a shot. For seam carving, we present two extensions: the first reduces the distortion of straight lines, which may otherwise become curved or disconnected; the second avoids jitter in the target video, limits the large memory requirements and computational effort of seam carving, and makes it applicable to video retargeting. In addition, we present a heuristic that estimates the visual quality of the target video. If the estimated quality of the seam-carved result drops below a threshold, the ROI-based retargeting is used for this shot instead. User evaluations confirm a very high visual quality of our approach.
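For readers unfamiliar with seam carving, the following is a minimal sketch of the basic still-image technique that the abstract builds on. It is not the FSCAV implementation and includes neither the straight-line extension nor the jitter avoidance. The function names, the gradient-magnitude energy, and the representation of a grayscale frame as a list of rows are assumptions made for illustration only.

```python
def energy_map(img):
    """Gradient-magnitude energy |dI/dx| + |dI/dy| with clamped borders.

    The energy function is an assumption; other measures are possible.
    """
    h, w = len(img), len(img[0])
    energy = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            left = img[y][max(x - 1, 0)]
            right = img[y][min(x + 1, w - 1)]
            up = img[max(y - 1, 0)][x]
            down = img[min(y + 1, h - 1)][x]
            energy[y][x] = abs(right - left) + abs(down - up)
    return energy


def find_vertical_seam(energy):
    """Return one x position per row for the connected seam of minimal total energy."""
    h, w = len(energy), len(energy[0])
    # Dynamic programming: cost[y][x] is the cheapest seam ending at (x, y).
    cost = [row[:] for row in energy]
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backtrack from the cheapest pixel in the bottom row.
    x = min(range(w), key=lambda i: cost[h - 1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


def remove_vertical_seam(img, seam):
    """Delete the seam pixel from every row, narrowing the image by one column."""
    return [row[:x] + row[x + 1:] for row, x in zip(img, seam)]


def carve_width(img, target_width):
    """Remove minimal-energy seams until the image has the target width."""
    while len(img[0]) > target_width:
        img = remove_vertical_seam(img, find_vertical_seam(energy_map(img)))
    return img
```

Applied independently to each frame, this sketch would select different seams from frame to frame, which is one source of the jitter that the second extension in the abstract is designed to avoid.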
Keywords
Video retargeting, Video adaptation, Seam carving, Region of interest, Contrast-based saliency
Acknowledgements
The authors acknowledge the financial support granted by the Deutsche Forschungsgemeinschaft (DFG). We would like to thank the following flickr.com users for providing their images under a Creative Commons license: teoruiz (bridge.jpg), the_tahoe_guy (road.jpg) and digital_cat (construction_site.jpg). We thank Instituto Luce for providing historical films within the European research project ECHO. Furthermore, we would like to thank Sabine Olawsky for the development of the contrast-based saliency detection.