Abstract
We introduce a deep learning-driven framework for creating an adaptably applicable importance map (A2R-Map) that can be integrated with existing image and video retargeting operators. Conventional retargeting algorithms heuristically select an off-the-shelf importance estimator and plug it into their retargeting system. The extracted importance map often does not match the characteristics of the input image, which degrades the retargeting results and limits the performance of the retargeting method. Our framework aims to minimize the artifacts and distortions caused by inappropriate energy, e.g., the shrinking phenomenon in warping-based results and carving-through-object distortion in seam carving-based approaches. To this end, it focuses on capturing distortion-sensitive regions and activating their energy. We verify the effectiveness of the proposed scheme by plugging it into three typical retargeting methods: seam carving-based image retargeting, warping-based image retargeting, and video retargeting. Extensive experiments and evaluations are conducted on two widely used databases. On the one hand, A2R-Map reduces the time of importance map generation in retargeting systems by a factor of \(\sim 9\) compared to the baseline saliency map. On the other hand, A2R-Map improves over the baseline methods by an average of 11% and 9% in terms of image and video quality, respectively. The experimental results and evaluations demonstrate that our strategy for A2R-Map substantially outperforms previous works and significantly boosts the visual quality of image and video retargeting.
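To make the "plug-in" idea concrete, the following is a minimal sketch of how an externally supplied importance map can drive width reduction by seam carving, following the standard dynamic-programming formulation of Avidan and Shamir. It is not the authors' implementation: the function names (`find_vertical_seam`, `remove_vertical_seam`, `seam_carve_width`) are hypothetical, and the importance map is assumed to be a 2D array of the same height and width as the image, with larger values marking content to preserve.

```python
import numpy as np


def find_vertical_seam(importance):
    """Return, per row, the column of the minimum-cost 8-connected vertical seam."""
    h, w = importance.shape
    cost = importance.astype(np.float64).copy()
    # Forward pass: accumulate the cheapest path cost from the top row down.
    for y in range(1, h):
        prev = cost[y - 1]
        left = np.concatenate(([np.inf], prev[:-1]))
        right = np.concatenate((prev[1:], [np.inf]))
        cost[y] += np.minimum(np.minimum(left, prev), right)
    # Backward pass: trace the seam up from the cheapest bottom pixel.
    seam = np.empty(h, dtype=np.int64)
    seam[-1] = int(np.argmin(cost[-1]))
    for y in range(h - 2, -1, -1):
        x = seam[y + 1]
        lo, hi = max(x - 1, 0), min(x + 2, w)
        seam[y] = lo + int(np.argmin(cost[y, lo:hi]))
    return seam


def remove_vertical_seam(array, seam):
    """Delete one pixel per row (the seam) from a 2D or 3D array."""
    h, w = array.shape[:2]
    mask = np.ones((h, w), dtype=bool)
    mask[np.arange(h), seam] = False
    return array[mask].reshape(h, w - 1, *array.shape[2:])


def seam_carve_width(image, importance, n_seams):
    """Remove n_seams vertical seams, guided only by the supplied importance map."""
    for _ in range(n_seams):
        seam = find_vertical_seam(importance)
        image = remove_vertical_seam(image, seam)
        importance = remove_vertical_seam(importance, seam)
    return image
```

In this sketch the operator never computes its own energy, so swapping a gradient or saliency map for a learned map only changes the `importance` argument. Carving the map together with the image, rather than recomputing it after each seam, is a simplification chosen here for brevity.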
References
Suh B, Ling H, Bederson BB, Jacobs DW (2003) Automatic thumbnail cropping and its effectiveness. In: Proceedings of the 16th annual ACM symposium on user interface software and technology, pp 95–104
Kopf S, Guthier B, Lemelson H, Effelsberg W (2009) Adaptation of web pages and images for mobile applications. In: Multimedia on Mobile Devices 2009, vol. 7256, p 72560. International Society for Optics and Photonics
Avidan S, Shamir A (2007) Seam carving for content-aware image resizing. In: ACM SIGGRAPH 2007 Papers, p 10
Rubinstein M, Shamir A, Avidan S (2009) Multi-operator media retargeting. ACM Trans Graph (TOG) 28(3):1–11
Pritch Y, Kav-Venaki E, Peleg S (2009) Shift-map image editing. In: 2009 IEEE 12th international conference on computer vision, pp 151–158. IEEE
Lin S-S, Yeh I-C, Lin C-H, Lee T-Y (2012) Patch-based image warping for content-aware retargeting. IEEE Trans Multimed 15(2):359–368
Lin S-S, Lin C-H, Yeh I-C, Chang S-H, Yeh C-K, Lee T-Y (2013) Content-aware video retargeting using object-preserving warping. IEEE Trans Vis Comput Graph 19(10):1677–1686
Asheghi B, Salehpour P, Khiavi AM, Hashemzadeh M (2022) A comprehensive review on content-aware image retargeting: From classical to state-of-the-art methods. Signal Processing 108496
Cho D, Park J, Oh T-H, Tai Y-W, So Kweon I (2017) Weakly-and self-supervised learning for content-aware deep image retargeting. In: Proceedings of the IEEE international conference on computer vision, pp 4558–4567
Tan W, Yan B, Lin C, Niu X (2019) Cycle-ir: Deep cyclic image retargeting. IEEE Trans Multimed
Kajiura N, Kosugi S, Wang X, Yamasaki T (2020) Self-play reinforcement learning for fast image retargeting. In: Proceedings of the 28th ACM international conference on multimedia, pp 1755–1763
Lin J, Zhou T, Chen Z (2019) Deepir: A deep semantics driven framework for image retargeting. In: 2019 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), pp 54–59. IEEE
Kiess J, Kopf S, Guthier B, Effelsberg W (2018) A survey on content-aware image and video retargeting. ACM Trans Multimed Comput Commun Appl (TOMM) 14(3):1–28
Li X, Ling H (2009) Learning based thumbnail cropping. In: 2009 IEEE International Conference on Multimedia and Expo, pp 558–561. IEEE
Santella A, Agrawala M, DeCarlo D, Salesin D, Cohen M (2006) Gaze-based interaction for semi-automatic photo cropping. In: Proceedings of the SIGCHI conference on human factors in computing systems, pp 771–780
Guo D, Ding J, Tang J, Xu M, Zhao C (2015) Nif-based seam carving for image resizing. Multimedia Systems 21(6):603–613
Shen J, Wang D, Li X (2013) Depth-aware image seam carving. IEEE Trans Cybern 43(5):1453–1461
Wu J, Zhou W, Luo T, Yu L, Lei J (2021) Multiscale multilevel context and multimodal fusion for rgb-d salient object detection. Signal Processing 178:107766
Choi J, Kim C (2016) Sparse seam-carving for structure preserving image retargeting. J Signal Process Syst 85(2):275–283
Battiato S, Farinella GM, Puglisi G, Ravi D (2014) Saliency-based selection of gradient vector flow paths for content aware image resizing. IEEE Trans Image Process 23(5):2081–2095
Cui J, Cai Q, Lu H, Jia Z, Tang M (2020) Distortion-aware image retargeting based on continuous seam carving model. Signal Processing 166:107242
Zhang X, Hu Y, Rajan D (2013) Dynamic distortion maps for image retargeting. J Vis Commun Image Represent 24(1):81–92
Guo Y, Liu F, Shi J, Zhou Z-H, Gleicher M (2009) Image retargeting using mesh parametrization. IEEE Trans Multimed 11(5):856–867
Wang Y-S, Tai C-L, Sorkine O, Lee T-Y (2008) Optimized scale-and-stretch for image resizing. In: ACM SIGGRAPH Asia 2008 Papers, pp 1–8
Zhang G-X, Cheng M-M, Hu S-M, Martin RR (2009) A shape-preserving approach to image resizing. In: Computer Graphics Forum, vol 28, pp 1897–1906. Wiley Online Library
Jin Y, Liu L, Wu Q (2010) Nonhomogeneous scaling optimization for realtime image resizing. Vis Comput 26(6):769–778
Niu Y, Liu F, Li X, Gleicher M (2012) Image resizing via non-homogeneous warping. Multimed Tools Appl 56(3):485–508
Hu W, Luo Z, Fan X (2014) Image retargeting via adaptive scaling with geometry preservation. IEEE J Emerg Sel Top Circ Syst 4(1):70–81
Panozzo D, Weber O, Sorkine O (2012) Robust image retargeting via axis-aligned deformation. In: Computer Graphics Forum, vol 31, pp 229–236. Wiley Online Library
Tan W, Yan B, Li K, Tian Q (2015) Image retargeting for preserving robust local feature: Application to mobile visual search. IEEE Trans Multimed 18(1):128–137
Kim Y, Jung S, Jung C, Kim C (2018) A structure-aware axis-aligned grid deformation approach for robust image retargeting. Multimed Tools Appl 77(6):7717–7739
Kim Y, Eun H, Jung C, Kim C (2018) A quad edge-based grid encoding model for content-aware image retargeting. IEEE Trans Vis Comput Graph 25(12):3202–3215
Liu S, Wei Z, Sun Y, Ou X, Lin J, Liu B, Yang M-H (2018) Composing semantic collage for image retargeting. IEEE Trans Image Process 27(10):5032–5043
Guo G, Wang H, Shen C, Yan Y, Liao H-YM (2018) Automatic image cropping for visual aesthetic enhancement using deep neural networks and cascaded regression. IEEE Trans Multimed 20(8):2073–2085
Song E, Lee M, Lee S (2018) Carvingnet: content-guided seam carving using deep convolution neural network. IEEE Access 7:284–292
Wang Z, Zhang W, Zhou H (2019) Perception-guided multi-channel visual feature fusion for image retargeting. Signal Process Image Commun 79:63–70
Ahmadi M, Karimi N, Samavi S (2021) Context-aware saliency detection for image retargeting using convolutional neural networks. Multimed Tools Appl 80(8):11917–11941
Zhou Y, Chen Z, Li W (2020) Weakly supervised reinforced multi-operator image retargeting. IEEE Trans Circ Syst Video Technol 31(1):126–139
Shafieyan F, Karimi N, Mirmahboub B, Samavi S, Shirani S (2017) Image retargeting using depth assisted saliency map. Signal Process Image Commun 50:34–43
Li B, Duan L-Y, Lin C-W, Huang T, Gao W (2015) Depth-preserving warping for stereo image retargeting. IEEE Trans Image Process 24(9):2811–2826
Zhang W, Yao T, Zhu S, Saddik AE (2019) Deep learning-based multimedia analytics: a review. ACM Trans Multimed Comput Commun Appl (TOMM) 15(1s):1–26
Zhang Z, Lin H, Zhao X, Ji R, Gao Y (2018) Inductive multi-hypergraph learning and its application on view-based 3d object classification. IEEE Trans Image Process 27(12):5957–5968
Xu K, Ba J, Kiros R, Cho K, Courville A, Salakhudinov R, Zemel R, Bengio Y (2015) Show, attend and tell: Neural image caption generation with visual attention. In: International Conference on Machine Learning, pp 2048–2057. PMLR
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556
Kanopoulos N, Vasanthavada N, Baker RL (1988) Design of an image edge detection filter using the Sobel operator. IEEE J Solid-State Circuits 23(2):358–367
Liu T, Yuan Z, Sun J, Wang J, Zheng N, Tang X, Shum H-Y (2010) Learning to detect a salient object. IEEE Trans Pattern Anal Mach Intell 33(2):353–367
Goferman S, Zelnik-Manor L, Tal A (2011) Context-aware saliency detection. IEEE Trans Pattern Anal Mach Intell 34(10):1915–1926
Grundmann M, Kwatra V, Han M, Essa I (2010) Efficient hierarchical graph-based video segmentation. In: 2010 IEEE Computer Society conference on computer vision and pattern recognition, pp 2141–2148. IEEE
Patel D, Nagar R, Raman S (2019) Reflection symmetry aware image retargeting. Pattern Recogn Lett 125:179–186
Cheng M-M, Mitra NJ, Huang X, Torr PH, Hu S-M (2014) Global contrast based salient region detection. IEEE Trans Pattern Anal Mach Intell 37(3):569–582
Qin X, Zhang Z, Huang C, Gao C, Dehghan M, Jagersand M (2019) Basnet: Boundary-aware salient object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7479–7489
Qin X, Dai H, Hu X, Fan D-P, Shao L, Van Gool L (2022) Highly accurate dichotomous image segmentation. In: European Conference on Computer Vision, pp 38–56. Springer
Liu J-J, Hou Q, Cheng M-M (2020) Dynamic feature integration for simultaneous detection of salient object, edge, and skeleton. IEEE Trans Image Process 29:8652–8667
Tang F, Dong W, Meng Y, Ma C, Wu F, Li X, Lee T-Y (2019) Image retargetability. IEEE Trans Multimed 22(3):641–654
Zhang Y, Lin W, Zhang X, Fang Y, Li L (2016) Aspect ratio similarity (ars) for image retargeting quality assessment. In: 2016 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 1080–1084. IEEE
Liu C, Yuen J, Torralba A (2010) Sift flow: Dense correspondence across scenes and its applications. IEEE Trans Pattern Anal Mach Intell 33(5):978–994
Rubinstein M, Gutierrez D, Sorkine O, Shamir A (2010) A comparative study of image retargeting. ACM Trans Graph (Proc. SIGGRAPH ASIA) 29(6):160:1–160:10
Simakov D, Caspi Y, Shechtman E, Irani M (2008) Summarizing visual data using bidirectional similarity. In: 2008 IEEE conference on computer vision and pattern recognition, pp 1–8. IEEE
Ma L, Lin W, Deng C, Ngan KN (2012) Image retargeting quality assessment: A study of subjective scores and objective metrics. IEEE J Sel Top Signal Process 6(6):626–639
Zhang L, Li X, Nie L, Yan Y, Zimmermann R (2016) Semantic photo retargeting under noisy image labels. ACM Trans Multimed Comput Commun Appl (TOMM) 12(3):1–22
Rubinstein M, Shamir A, Avidan S (2008) Improved seam carving for video retargeting. ACM Trans Graph (TOG) 27(3):1–9
Acknowledgements
This work was supported in part by the National Science and Technology Council (under nos. 111-2221-E-006-112-MY3, 110-2221-E-006-135-MY3, 112-2221-E-019-063-MY3 and 110-2221-E-019-052-MY3), Republic of China (ROC), Taiwan. This work was also supported by the National Natural Science Foundation of China under nos. U20B2070 and 61832016.
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest. Part of the datasets generated and/or analysed during the current study is available at https://people.csail.mit.edu/mrub/retargetme, https://www.ee.nthu.edu.tw/cwlin/Retargeting_Quality/NRID.html, and https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/.
Appendices
Appendix A: More comparisons
Apart from the comparisons with content-aware retargeting approaches, we also compare our results with a semantic-aware retargeting approach [60] in Fig. 17. In this figure, besides [60] (PM), we show the results of five other retargeting methods: seam carving (SC) [3] and its improved version (ISC) [61], patch-based warping (PW) [6], saliency-based mesh parametrization (SMP) [23], and multi-operator retargeting (MOR) [4]. These results are obtained from [60]. We can observe that our result outperforms the compared results. Carving distortions occur in SC, ISC, and SMP, while a linear-like phenomenon appears in PW, MOR, and PM (e.g., the green door). In contrast, our result shows no such phenomena and retains a balanced structure relative to the input image. Figure 18 exhibits the performance of our A2R-Map in terms of enlarging. In this experiment, we enlarge the images by 25% in width.
Appendix B: List of notations
Symbol | Definition |
---|---|
\(\mathcal {I}\) | Input image |
SC | Seam carving operator |
OMap | The energy map generated by TFS-Net |
BMap | The energy map generated by Eq. (4) |
A2R-Map | The final importance map generated by our model |
SOD | Salient Object Detection |
TFS-Net | The network we proposed to generate OMap |
TFS | Feature Sharing Session module |
AFS | Adjacent-layer Feature Sharing module |
\(\mathcal {X}^i\) | Feature maps at the \(i\)-th layer |
\(\mathcal {X}_u\) | Feature maps at the upper layer |
\(\mathcal {X}_l\) | Feature maps at the lower layer |
\(\mathcal {A}^s\) | The source image/video in general |
\(\mathcal {A}^t\) | The target image/video, i.e., \(\mathcal {A}^s\) after the retargeting process |
\(\mathcal {P}\) | A certain resizing operator |
\(\mathcal {R}\) | A retargeting system using operator \(\mathcal {P}\) to resize \(\mathcal {A}^s\) and output \(\mathcal {A}^t\) |
\(\mathcal {M}\) | An off-the-shelf method that \(\mathcal {R}\) uses to define the importance in the input \(\mathcal {A}^s\) |
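The relationship between \(\mathcal {R}\), \(\mathcal {P}\), and \(\mathcal {M}\) in the table can be sketched as a simple composition in which the importance method is a swappable argument. This is only an illustration under stated assumptions: the names `make_retargeting_system`, `gradient_importance`, and `crop_operator` are hypothetical, the gradient-magnitude map merely stands in for an off-the-shelf \(\mathcal {M}\), and importance-guided cropping stands in for \(\mathcal {P}\); neither is the method proposed in the article.

```python
import numpy as np


def make_retargeting_system(operator, importance_method):
    """Build R from a resizing operator P and an importance method M."""
    def system(source, target_width):
        importance = importance_method(source)           # M(A^s)
        return operator(source, importance, target_width)  # A^t = P(A^s, M(A^s))
    return system


def gradient_importance(image):
    """Stand-in for M: sum of absolute horizontal and vertical gradients."""
    gray = image.astype(np.float64)
    if gray.ndim == 3:
        gray = gray.mean(axis=2)
    gy, gx = np.gradient(gray)
    return np.abs(gx) + np.abs(gy)


def crop_operator(image, importance, target_width):
    """Stand-in for P: keep the window of columns with the highest total importance."""
    column_energy = importance.sum(axis=0)
    window_energy = np.convolve(column_energy, np.ones(target_width), mode="valid")
    start = int(np.argmax(window_energy))
    return image[:, start:start + target_width]


# Usage: swapping the importance method does not require touching the operator.
retarget = make_retargeting_system(crop_operator, gradient_importance)
source = np.zeros((4, 8))
source[:, 5] = 1.0
target = retarget(source, 3)
```

Because the operator only receives a 2D importance array, a learned map such as A2R-Map could in principle be passed in place of `gradient_importance` without changing the rest of the system.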
About this article
Cite this article
Le, TNH., Lee, TY., Lin, SS. et al. Deep learning-based importance map for content-aware media retargeting. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18389-4
DOI: https://doi.org/10.1007/s11042-024-18389-4