Deep learning-based importance map for content-aware media retargeting

Le, Thi-Ngoc-Hanh; Lee, Tong-Yee; Lin, Shih-Syun; Dong, Weiming

doi:10.1007/s11042-024-18389-4

Published: 15 February 2024

Multimedia Tools and Applications (2024)

Thi-Ngoc-Hanh Le¹,
Tong-Yee Lee¹,
Shih-Syun Lin² &
…
Weiming Dong³

Abstract

We introduce a deep learning-driven framework for creating an adaptably applicable importance map (A2R-Map) that can be integrated with existing image and video retargeting operators. Conventional retargeting algorithms heuristically pick an off-the-shelf algorithm to supply the importance map for their retargeting system. The importance map extracted in this way often does not match the characteristics of the input image; it therefore degrades the retargeting results and limits the performance of the retargeting method. Our framework attempts to minimize the artifacts and distortions caused by inappropriate energy, e.g., the shrinking phenomenon in warping-based results and carving-through-object distortion in seam carving-based results. To this end, it focuses on capturing distortion-sensitive regions and activating their energy. We verify the effectiveness of the proposed scheme by plugging it into three typical retargeting methods: seam carving-based image retargeting, warping-based image retargeting, and video retargeting. Extensive experiments and evaluations are conducted on two widely used databases. On the one hand, A2R-Map makes importance map generation in retargeting systems \(\sim 9\) times faster than the baseline saliency map. On the other hand, A2R-Map improves on the baseline methods by an average of 11% in image quality and 9% in video quality. The experimental results and evaluations demonstrate that our strategy for A2R-Map substantially outperforms previous works and significantly boosts the visual quality of image and video retargeting.
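As a rough illustration of what plugging an importance map into a retargeting operator means, the following NumPy sketch carves vertical seams guided by an externally supplied map. It is not the authors' implementation: the function names, the basic dynamic-programming seam search, and the toy map in the usage lines are all assumptions made for illustration. A2R-Map, a saliency map, or any other importance map would simply be passed in as the `importance` array.

```python
import numpy as np

def find_vertical_seam(importance):
    """Return, for each row, the column of the minimum-cost connected vertical seam."""
    h, w = importance.shape
    cost = importance.astype(np.float64).copy()
    # Forward pass: accumulate the cheapest path cost from the top row downward.
    for y in range(1, h):
        prev = cost[y - 1]
        left = np.concatenate(([np.inf], prev[:-1]))
        right = np.concatenate((prev[1:], [np.inf]))
        cost[y] += np.minimum(np.minimum(left, prev), right)
    # Backward pass: trace the cheapest path from the bottom row upward.
    seam = np.empty(h, dtype=np.int64)
    seam[-1] = int(np.argmin(cost[-1]))
    for y in range(h - 2, -1, -1):
        x = seam[y + 1]
        lo, hi = max(x - 1, 0), min(x + 2, w)
        seam[y] = lo + int(np.argmin(cost[y, lo:hi]))
    return seam

def remove_seam(array, seam):
    """Delete one pixel per row (the seam) from an H x W or H x W x C array."""
    h, w = array.shape[:2]
    keep = np.ones((h, w), dtype=bool)
    keep[np.arange(h), seam] = False
    return array[keep].reshape((h, w - 1) + array.shape[2:])

def retarget_width(image, importance, new_width):
    """Shrink `image` to `new_width` columns, carving image and map together."""
    out, imp = image, importance
    while out.shape[1] > new_width:
        seam = find_vertical_seam(imp)
        out = remove_seam(out, seam)
        imp = remove_seam(imp, seam)
    return out

# Toy usage: columns 2 and 3 are marked important, so seams should avoid them.
image = np.arange(24, dtype=np.float64).reshape(4, 6)
importance = np.zeros((4, 6))
importance[:, 2:4] = 10.0
result = retarget_width(image, importance, new_width=4)
print(result.shape)
```

In this sketch the map is computed once and carved alongside the image, so the cost of generating the map is paid a single time per input; swapping in a different map requires no change to the operator itself.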

References

1. Suh B, Ling H, Bederson BB, Jacobs DW (2003) Automatic thumbnail cropping and its effectiveness. In: Proceedings of the 16th annual ACM symposium on user interface software and technology, pp 95–104
2. Kopf S, Guthier B, Lemelson H, Effelsberg W (2009) Adaptation of web pages and images for mobile applications. In: Multimedia on Mobile Devices 2009, vol. 7256, p 72560. International Society for Optics and Photonics
3. Avidan S, Shamir A (2007) Seam carving for content-aware image resizing. In: ACM SIGGRAPH 2007 Papers, p 10
4. Rubinstein M, Shamir A, Avidan S (2009) Multi-operator media retargeting. ACM Trans Graph (TOG) 28(3):1–11
5. Pritch Y, Kav-Venaki E, Peleg S (2009) Shift-map image editing. In: 2009 IEEE 12th international conference on computer vision, pp 151–158. IEEE
6. Lin S-S, Yeh I-C, Lin C-H, Lee T-Y (2012) Patch-based image warping for content-aware retargeting. IEEE Trans Multimed 15(2):359–368
7. Lin S-S, Lin C-H, Yeh I-C, Chang S-H, Yeh C-K, Lee T-Y (2013) Content-aware video retargeting using object-preserving warping. IEEE Trans Vis Comput Graph 19(10):1677–1686
8. Asheghi B, Salehpour P, Khiavi AM, Hashemzadeh M (2022) A comprehensive review on content-aware image retargeting: From classical to state-of-the-art methods. Signal Processing 108496
9. Cho D, Park J, Oh T-H, Tai Y-W, So Kweon I (2017) Weakly-and self-supervised learning for content-aware deep image retargeting. In: Proceedings of the IEEE international conference on computer vision, pp 4558–4567
10. Tan W, Yan B, Lin C, Niu X (2019) Cycle-ir: Deep cyclic image retargeting. IEEE Trans Multimed
11. Kajiura N, Kosugi S, Wang X, Yamasaki T (2020) Self-play reinforcement learning for fast image retargeting. In: Proceedings of the 28th ACM international conference on multimedia, pp 1755–1763
12. Lin J, Zhou T, Chen Z (2019) Deepir: A deep semantics driven framework for image retargeting. In: 2019 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), pp 54–59. IEEE
13. Kiess J, Kopf S, Guthier B, Effelsberg W (2018) A survey on content-aware image and video retargeting. ACM Trans Multimed Comput Commun Appl (TOMM) 14(3):1–28
14. Li X, Ling H (2009) Learning based thumbnail cropping. In: 2009 IEEE International Conference on Multimedia and Expo, pp 558–561. IEEE
15. Santella A, Agrawala M, DeCarlo D, Salesin D, Cohen M (2006) Gaze-based interaction for semi-automatic photo cropping. In: Proceedings of the SIGCHI conference on human factors in computing systems, pp 771–780
16. Guo D, Ding J, Tang J, Xu M, Zhao C (2015) Nif-based seam carving for image resizing. Multimedia Systems 21(6):603–613
17. Shen J, Wang D, Li X (2013) Depth-aware image seam carving. IEEE Trans Cybern 43(5):1453–1461
18. Wu J, Zhou W, Luo T, Yu L, Lei J (2021) Multiscale multilevel context and multimodal fusion for rgb-d salient object detection. Signal Processing 178:107766
19. Choi J, Kim C (2016) Sparse seam-carving for structure preserving image retargeting. J Signal Process Syst 85(2):275–283
20. Battiato S, Farinella GM, Puglisi G, Ravi D (2014) Saliency-based selection of gradient vector flow paths for content aware image resizing. IEEE Trans Image Process 23(5):2081–2095
21. Cui J, Cai Q, Lu H, Jia Z, Tang M (2020) Distortion-aware image retargeting based on continuous seam carving model. Signal Processing 166:107242
22. Zhang X, Hu Y, Rajan D (2013) Dynamic distortion maps for image retargeting. J Vis Commun Image Represent 24(1):81–92
23. Guo Y, Liu F, Shi J, Zhou Z-H, Gleicher M (2009) Image retargeting using mesh parametrization. IEEE Trans Multimed 11(5):856–867
24. Wang Y-S, Tai C-L, Sorkine O, Lee T-Y (2008) Optimized scale-and-stretch for image resizing. In: ACM SIGGRAPH Asia 2008 Papers, pp 1–8
25. Zhang G-X, Cheng M-M, Hu S-M, Martin RR (2009) A shape-preserving approach to image resizing. In: Computer Graphics Forum, vol 28, pp 1897–1906. Wiley Online Library
26. Jin Y, Liu L, Wu Q (2010) Nonhomogeneous scaling optimization for realtime image resizing. Vis Comput 26(6):769–778
27. Niu Y, Liu F, Li X, Gleicher M (2012) Image resizing via non-homogeneous warping. Multimed Tools Appl 56(3):485–508
28. Hu W, Luo Z, Fan X (2014) Image retargeting via adaptive scaling with geometry preservation. IEEE J Emerg Sel Top Circ Syst 4(1):70–81
29. Panozzo D, Weber O, Sorkine O (2012) Robust image retargeting via axis-aligned deformation. In: Computer Graphics Forum, vol 31, pp 229–236. Wiley Online Library
30. Tan W, Yan B, Li K, Tian Q (2015) Image retargeting for preserving robust local feature: Application to mobile visual search. IEEE Trans Multimed 18(1):128–137
31. Kim Y, Jung S, Jung C, Kim C (2018) A structure-aware axis-aligned grid deformation approach for robust image retargeting. Multimed Tools Appl 77(6):7717–7739
32. Kim Y, Eun H, Jung C, Kim C (2018) A quad edge-based grid encoding model for content-aware image retargeting. IEEE Trans Vis Comput Graph 25(12):3202–3215
33. Liu S, Wei Z, Sun Y, Ou X, Lin J, Liu B, Yang M-H (2018) Composing semantic collage for image retargeting. IEEE Trans Image Process 27(10):5032–5043
34. Guo G, Wang H, Shen C, Yan Y, Liao H-YM (2018) Automatic image cropping for visual aesthetic enhancement using deep neural networks and cascaded regression. IEEE Trans Multimed 20(8):2073–2085
35. Song E, Lee M, Lee S (2018) Carvingnet: content-guided seam carving using deep convolution neural network. IEEE Access 7:284–292
36. Wang Z, Zhang W, Zhou H (2019) Perception-guided multi-channel visual feature fusion for image retargeting. Signal Process Image Commun 79:63–70
37. Ahmadi M, Karimi N, Samavi S (2021) Context-aware saliency detection for image retargeting using convolutional neural networks. Multimed Tools Appl 80(8):11917–11941
38. Zhou Y, Chen Z, Li W (2020) Weakly supervised reinforced multi-operator image retargeting. IEEE Trans Circ Syst Video Technol 31(1):126–139
39. Shafieyan F, Karimi N, Mirmahboub B, Samavi S, Shirani S (2017) Image retargeting using depth assisted saliency map. Signal Process Image Commun 50:34–43
40. Li B, Duan L-Y, Lin C-W, Huang T, Gao W (2015) Depth-preserving warping for stereo image retargeting. IEEE Trans Image Process 24(9):2811–2826
41. Zhang W, Yao T, Zhu S, Saddik AE (2019) Deep learning-based multimedia analytics: a review. ACM Trans Multimed Comput Commun Appl (TOMM) 15(1s):1–26
42. Zhang Z, Lin H, Zhao X, Ji R, Gao Y (2018) Inductive multi-hypergraph learning and its application on view-based 3d object classification. IEEE Trans Image Process 27(12):5957–5968
43. Xu K, Ba J, Kiros R, Cho K, Courville A, Salakhudinov R, Zemel R, Bengio Y (2015) Show, attend and tell: Neural image caption generation with visual attention. In: International Conference on Machine Learning, pp 2048–2057. PMLR
44. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556
45. Kanopoulos N, Vasanthavada N, Baker RL (1988) Design of an image edge detection filter using the sobel operator. IEEE Journal of Solid-State Circuits 23(2):358–367
46. Liu T, Yuan Z, Sun J, Wang J, Zheng N, Tang X, Shum H-Y (2010) Learning to detect a salient object. IEEE Trans Pattern Anal Mach Intell 33(2):353–367
47. Goferman S, Zelnik-Manor L, Tal A (2011) Context-aware saliency detection. IEEE Trans Pattern Anal Mach Intell 34(10):1915–1926
48. Grundmann M, Kwatra V, Han M, Essa I (2010) Efficient hierarchical graph-based video segmentation. In: 2010 IEEE computer society conference on computer vision and pattern recognition, pp 2141–2148. IEEE
49. Patel D, Nagar R, Raman S (2019) Reflection symmetry aware image retargeting. Pattern Recogn Lett 125:179–186
50. Cheng M-M, Mitra NJ, Huang X, Torr PH, Hu S-M (2014) Global contrast based salient region detection. IEEE Trans Pattern Anal Mach Intell 37(3):569–582
51. Qin X, Zhang Z, Huang C, Gao C, Dehghan M, Jagersand M (2019) Basnet: Boundary-aware salient object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7479–7489
52. Qin X, Dai H, Hu X, Fan D-P, Shao L, Van Gool L (2022) Highly accurate dichotomous image segmentation. In: European Conference on Computer Vision, pp 38–56. Springer
53. Liu J-J, Hou Q, Cheng M-M (2020) Dynamic feature integration for simultaneous detection of salient object, edge, and skeleton. IEEE Trans Image Process 29:8652–8667
54. Tang F, Dong W, Meng Y, Ma C, Wu F, Li X, Lee T-Y (2019) Image retargetability. IEEE Trans Multimed 22(3):641–654
55. Zhang Y, Lin W, Zhang X, Fang Y, Li L (2016) Aspect ratio similarity (ars) for image retargeting quality assessment. In: 2016 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 1080–1084. IEEE
56. Liu C, Yuen J, Torralba A (2010) Sift flow: Dense correspondence across scenes and its applications. IEEE Trans Pattern Anal Mach Intell 33(5):978–994
57. Rubinstein M, Gutierrez D, Sorkine O, Shamir A (2010) A comparative study of image retargeting. ACM Trans Graph (Proc. SIGGRAPH ASIA) 29(6):160–116010
58. Simakov D, Caspi Y, Shechtman E, Irani M (2008) Summarizing visual data using bidirectional similarity. In: 2008 IEEE conference on computer vision and pattern recognition, pp 1–8. IEEE
59. Ma L, Lin W, Deng C, Ngan KN (2012) Image retargeting quality assessment: A study of subjective scores and objective metrics. IEEE J Sel Top Signal Process 6(6):626–639
60. Zhang L, Li X, Nie L, Yan Y, Zimmermann R (2016) Semantic photo retargeting under noisy image labels. ACM Trans Multimed Comput Commun Appl (TOMM) 12(3):1–22
61. Rubinstein M, Shamir A, Avidan S (2008) Improved seam carving for video retargeting. ACM Trans Graph (TOG) 27(3):1–9

Acknowledgements

This work was supported in part by the National Science and Technology Council (under nos. 111-2221-E-006-112-MY3, 110-2221-E-006-135-MY3, 112-2221-E-019-063-MY3 and 110-2221-E-019-052-MY3), Republic of China (ROC), Taiwan, and by the National Natural Science Foundation of China under nos. U20B2070 and 61832016.

Author information

Authors and Affiliations

National Cheng-Kung University, Tainan, Taiwan, ROC
Thi-Ngoc-Hanh Le & Tong-Yee Lee
National Taiwan Ocean University, Hsinchu, Taiwan, ROC
Shih-Syun Lin
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Weiming Dong

Corresponding author

Correspondence to Tong-Yee Lee.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest. Part of the datasets generated and/or analysed during the current study is available at https://people.csail.mit.edu/mrub/retargetme, https://www.ee.nthu.edu.tw/cwlin/Retargeting_Quality/NRID.html, and https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: More comparisons

Apart from the comparisons with content-aware retargeting approaches, we further compare our results with a semantic-aware retargeting approach [60] in Fig. 17. In this figure, besides [60] (PM), we show the results of five other retargeting methods: seam carving (SC) [3] and its improved version (ISC) [61], patch-based warping (PW) [6], saliency-based mesh parametrization (SMP) [23], and multi-operator retargeting (MOR) [4]. These results are taken from [60]. We can observe that our result outperforms the compared results. Carving distortions occur in SC, ISC, and SMP, while a linear-like phenomenon appears in PW, MOR, and PM (e.g., the green door). In contrast, our result shows neither artifact and preserves a balanced structure relative to the input image. Figure 18 shows the performance of our A2R-Map for enlarging. In this experiment, we enlarge images by 25% of their width.

Appendix B: List of notations

Symbol	Definition
\(\mathcal {I}\)	Input image
SC	Seam carving operator
OMap	The energy map generated by TFS-Net
BMap	The energy map generated by (4)
A2R-Map	The final importance map generated by our model
SOD	Salient Object Detection
TFS-Net	The network we proposed to generate OMap
TFS	Feature Sharing Session module
AFS	Adjacent-layer Feature Sharing module
\(\mathcal {X}^i\)	Feature maps at the \(i\)-th layer
\(\mathcal {X}_u\)	Feature maps at upper layer
\(\mathcal {X}_l\)	Feature maps at lower layer
\(\mathcal {A}^s\)	The source image/video in general
\(\mathcal {A}^t\)	The target image/video, i.e., \(\mathcal {A}^s\) after the retargeting process
\(\mathcal {P}\)	A certain resizing operator
\(\mathcal {R}\)	A retargeting system using operator \(\mathcal {P}\) to resize \(\mathcal {A}^s\) and output \(\mathcal {A}^t\)
\(\mathcal {M}\)	An off-the-shelf method that \(\mathcal {R}\) uses to define the importance in the input \(\mathcal {A}^s\)
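
Read together, the last five symbols suggest the relation below. It is an interpretive sketch inferred from the definitions in this table, not an equation taken from the article, and the two-argument form of \(\mathcal {P}\) is an assumption:

\[
\mathcal {A}^t = \mathcal {R}(\mathcal {A}^s) = \mathcal {P}\bigl(\mathcal {A}^s,\ \mathcal {M}(\mathcal {A}^s)\bigr)
\]

Under this reading, A2R-Map would take the place of the map produced by \(\mathcal {M}\), while the resizing operator \(\mathcal {P}\) stays unchanged.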

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Le, TNH., Lee, TY., Lin, SS. et al. Deep learning-based importance map for content-aware media retargeting. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18389-4

Received: 06 December 2022
Revised: 12 October 2023
Accepted: 19 January 2024
Published: 15 February 2024
DOI: https://doi.org/10.1007/s11042-024-18389-4
