Deep learning-based importance map for content-aware media retargeting

Multimedia Tools and Applications

Abstract

We introduce a deep learning-driven framework for creating an adaptably applicable importance map (A2R-Map) that can be integrated with existing image and video retargeting operators. Conventional retargeting algorithms heuristically pick an off-the-shelf algorithm to compute the importance map used in their retargeting system. The extracted importance map does not match the characteristics of the input image, which degrades the retargeting results and limits the performance of the retargeting method. Our framework aims to minimize the artifacts and distortions caused by inappropriate energy, e.g., the shrinking phenomenon in warping-based results and the carving-through-object distortion in seam carving-based approaches. To address this challenge, the framework focuses on capturing distortion-sensitive regions and activating their energy. We verify the effectiveness of our scheme by plugging it into three typical retargeting methods: seam carving-based image retargeting, warping-based image retargeting, and video retargeting. Extensive experiments and evaluations are conducted on two widely used databases. On the one hand, A2R-Map generates importance maps \(\sim 9\) times faster than the baseline saliency map. On the other hand, A2R-Map improves over the baseline methods by an average of 11% and 9% in image and video quality, respectively. The experimental results and evaluations demonstrate that A2R-Map substantially outperforms previous works and significantly boosts the visual quality of image and video retargeting.
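To make the "plug-in" idea concrete, the following is a minimal sketch of classic dynamic-programming seam carving in the style of [3], driven by an externally supplied importance map. It is not the authors' implementation: the A2R-Map network itself is not reproduced, the `importance` array is a stand-in for whatever map the retargeting system is given, and the function names are hypothetical.

```python
import numpy as np


def find_vertical_seam(importance: np.ndarray) -> np.ndarray:
    """Return, for each row, the column index of the minimum-cost vertical seam."""
    h, w = importance.shape
    cost = importance.astype(np.float64).copy()
    # Accumulate the cheapest path cost from the top row downwards.
    for y in range(1, h):
        prev = cost[y - 1]
        left = np.concatenate(([np.inf], prev[:-1]))
        right = np.concatenate((prev[1:], [np.inf]))
        cost[y] += np.minimum(np.minimum(left, prev), right)
    # Backtrack from the cheapest pixel in the bottom row.
    seam = np.empty(h, dtype=np.int64)
    seam[-1] = int(np.argmin(cost[-1]))
    for y in range(h - 2, -1, -1):
        x = seam[y + 1]
        lo, hi = max(x - 1, 0), min(x + 2, w)
        seam[y] = lo + int(np.argmin(cost[y, lo:hi]))
    return seam


def remove_vertical_seam(array: np.ndarray, seam: np.ndarray) -> np.ndarray:
    """Delete one pixel per row (the seam) from a 2D or 3D array."""
    h, w = array.shape[:2]
    keep = np.ones((h, w), dtype=bool)
    keep[np.arange(h), seam] = False
    return array[keep].reshape((h, w - 1) + array.shape[2:])


def seam_carve(image: np.ndarray, importance: np.ndarray, num_seams: int) -> np.ndarray:
    """Narrow `image` by `num_seams` columns, guided only by `importance`."""
    for _ in range(num_seams):
        seam = find_vertical_seam(importance)
        image = remove_vertical_seam(image, seam)
        # The importance map is carved alongside the image so both stay aligned.
        importance = remove_vertical_seam(importance, seam)
    return image


if __name__ == "__main__":
    # Toy input: columns 2 and 3 are marked important (assumed values).
    toy_image = np.arange(24).reshape(4, 6)
    toy_importance = np.zeros((4, 6))
    toy_importance[:, 2:4] = 1.0
    print(seam_carve(toy_image, toy_importance, 2).shape)
```

Because the operator only reads the importance array, swapping a saliency map for a different map requires no change to the carving code, which is the sense in which an importance map can be treated as a drop-in component.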

Notes

  1. http://graphics.csie.ncku.edu.tw/A2RMap/CompareVids.mp4

  2. http://graphics.csie.ncku.edu.tw/A2RMap

References

  1. Suh B, Ling H, Bederson BB, Jacobs DW (2003) Automatic thumbnail cropping and its effectiveness. In: Proceedings of the 16th annual ACM symposium on user interface software and technology, pp 95–104

  2. Kopf S, Guthier B, Lemelson H, Effelsberg W (2009) Adaptation of web pages and images for mobile applications. In: Multimedia on Mobile Devices 2009, vol. 7256, p 72560. International Society for Optics and Photonics

  3. Avidan S, Shamir A (2007) Seam carving for content-aware image resizing. In: ACM SIGGRAPH 2007 Papers, p 10

  4. Rubinstein M, Shamir A, Avidan S (2009) Multi-operator media retargeting. ACM Trans Graph (TOG) 28(3):1–11

  5. Pritch Y, Kav-Venaki E, Peleg S (2009) Shift-map image editing. In: 2009 IEEE 12th international conference on computer vision, pp 151–158. IEEE

  6. Lin S-S, Yeh I-C, Lin C-H, Lee T-Y (2012) Patch-based image warping for content-aware retargeting. IEEE Trans Multimed 15(2):359–368

  7. Lin S-S, Lin C-H, Yeh I-C, Chang S-H, Yeh C-K, Lee T-Y (2013) Content-aware video retargeting using object-preserving warping. IEEE Trans Vis Comput Graph 19(10):1677–1686

  8. Asheghi B, Salehpour P, Khiavi AM, Hashemzadeh M (2022) A comprehensive review on content-aware image retargeting: From classical to state-of-the-art methods. Signal Processing 108496

  9. Cho D, Park J, Oh T-H, Tai Y-W, So Kweon I (2017) Weakly-and self-supervised learning for content-aware deep image retargeting. In: Proceedings of the IEEE international conference on computer vision, pp 4558–4567

  10. Tan W, Yan B, Lin C, Niu X (2019) Cycle-ir: Deep cyclic image retargeting. IEEE Trans Multimed

  11. Kajiura N, Kosugi S, Wang X, Yamasaki T (2020) Self-play reinforcement learning for fast image retargeting. In: Proceedings of the 28th ACM international conference on multimedia, pp 1755–1763

  12. Lin J, Zhou T, Chen Z (2019) Deepir: A deep semantics driven framework for image retargeting. In: 2019 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), pp 54–59. IEEE

  13. Kiess J, Kopf S, Guthier B, Effelsberg W (2018) A survey on content-aware image and video retargeting. ACM Trans Multimed Comput Commun Appl (TOMM) 14(3):1–28

  14. Li X, Ling H (2009) Learning based thumbnail cropping. In: 2009 IEEE International Conference on Multimedia and Expo, pp 558–561. IEEE

  15. Santella A, Agrawala M, DeCarlo D, Salesin D, Cohen M (2006) Gaze-based interaction for semi-automatic photo cropping. In: Proceedings of the SIGCHI conference on human factors in computing systems, pp 771–780

  16. Guo D, Ding J, Tang J, Xu M, Zhao C (2015) Nif-based seam carving for image resizing. Multimedia Systems 21(6):603–613

  17. Shen J, Wang D, Li X (2013) Depth-aware image seam carving. IEEE Trans Cybern 43(5):1453–1461

  18. Wu J, Zhou W, Luo T, Yu L, Lei J (2021) Multiscale multilevel context and multimodal fusion for rgb-d salient object detection. Signal Processing 178:107766

  19. Choi J, Kim C (2016) Sparse seam-carving for structure preserving image retargeting. J Signal Process Syst 85(2):275–283

  20. Battiato S, Farinella GM, Puglisi G, Ravi D (2014) Saliency-based selection of gradient vector flow paths for content aware image resizing. IEEE Trans Image Process 23(5):2081–2095

  21. Cui J, Cai Q, Lu H, Jia Z, Tang M (2020) Distortion-aware image retargeting based on continuous seam carving model. Signal Processing 166:107242

  22. Zhang X, Hu Y, Rajan D (2013) Dynamic distortion maps for image retargeting. J Vis Commun Image Represent 24(1):81–92

  23. Guo Y, Liu F, Shi J, Zhou Z-H, Gleicher M (2009) Image retargeting using mesh parametrization. IEEE Trans Multimed 11(5):856–867

  24. Wang Y-S, Tai C-L, Sorkine O, Lee T-Y (2008) Optimized scale-and-stretch for image resizing. In: ACM SIGGRAPH Asia 2008 Papers, pp 1–8

  25. Zhang G-X, Cheng M-M, Hu S-M, Martin RR (2009) A shape-preserving approach to image resizing. In: Computer Graphics Forum, vol 28, pp 1897–1906. Wiley Online Library

  26. Jin Y, Liu L, Wu Q (2010) Nonhomogeneous scaling optimization for realtime image resizing. Vis Comput 26(6):769–778

  27. Niu Y, Liu F, Li X, Gleicher M (2012) Image resizing via non-homogeneous warping. Multimed Tools Appl 56(3):485–508

  28. Hu W, Luo Z, Fan X (2014) Image retargeting via adaptive scaling with geometry preservation. IEEE J Emerg Sel Top Circ Syst 4(1):70–81

  29. Panozzo D, Weber O, Sorkine O (2012) Robust image retargeting via axis-aligned deformation. In: Computer Graphics Forum, vol 31, pp 229–236. Wiley Online Library

  30. Tan W, Yan B, Li K, Tian Q (2015) Image retargeting for preserving robust local feature: Application to mobile visual search. IEEE Trans Multimed 18(1):128–137

  31. Kim Y, Jung S, Jung C, Kim C (2018) A structure-aware axis-aligned grid deformation approach for robust image retargeting. Multimed Tools Appl 77(6):7717–7739

  32. Kim Y, Eun H, Jung C, Kim C (2018) A quad edge-based grid encoding model for content-aware image retargeting. IEEE Trans Vis Comput Graph 25(12):3202–3215

  33. Liu S, Wei Z, Sun Y, Ou X, Lin J, Liu B, Yang M-H (2018) Composing semantic collage for image retargeting. IEEE Trans Image Process 27(10):5032–5043

  34. Guo G, Wang H, Shen C, Yan Y, Liao H-YM (2018) Automatic image cropping for visual aesthetic enhancement using deep neural networks and cascaded regression. IEEE Trans Multimed 20(8):2073–2085

  35. Song E, Lee M, Lee S (2018) Carvingnet: content-guided seam carving using deep convolution neural network. IEEE Access 7:284–292

  36. Wang Z, Zhang W, Zhou H (2019) Perception-guided multi-channel visual feature fusion for image retargeting. Signal Process Image Commun 79:63–70

  37. Ahmadi M, Karimi N, Samavi S (2021) Context-aware saliency detection for image retargeting using convolutional neural networks. Multimed Tools Appl 80(8):11917–11941

  38. Zhou Y, Chen Z, Li W (2020) Weakly supervised reinforced multi-operator image retargeting. IEEE Trans Circ Syst Video Technol 31(1):126–139

  39. Shafieyan F, Karimi N, Mirmahboub B, Samavi S, Shirani S (2017) Image retargeting using depth assisted saliency map. Signal Process Image Commun 50:34–43

  40. Li B, Duan L-Y, Lin C-W, Huang T, Gao W (2015) Depth-preserving warping for stereo image retargeting. IEEE Trans Image Process 24(9):2811–2826

  41. Zhang W, Yao T, Zhu S, Saddik AE (2019) Deep learning-based multimedia analytics: a review. ACM Trans Multimed Comput Commun Appl (TOMM) 15(1s):1–26

  42. Zhang Z, Lin H, Zhao X, Ji R, Gao Y (2018) Inductive multi-hypergraph learning and its application on view-based 3d object classification. IEEE Trans Image Process 27(12):5957–5968

  43. Xu K, Ba J, Kiros R, Cho K, Courville A, Salakhudinov R, Zemel R, Bengio Y (2015) Show, attend and tell: Neural image caption generation with visual attention. In: International Conference on Machine Learning, pp 2048–2057. PMLR

  44. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556

  45. Kanopoulos N, Vasanthavada N, Baker RL (1988) Design of an image edge detection filter using the sobel operator. IEEE Journal of solid-state circuits 23(2):358–367

  46. Liu T, Yuan Z, Sun J, Wang J, Zheng N, Tang X, Shum H-Y (2010) Learning to detect a salient object. IEEE Trans Pattern Anal Mach Intell 33(2):353–367

  47. Goferman S, Zelnik-Manor L, Tal A (2011) Context-aware saliency detection. IEEE Trans Pattern Anal Mach Intell 34(10):1915–1926

  48. Grundmann M, Kwatra V, Han M, Essa I (2010) Efficient hierarchical graph-based video segmentation. In: 2010 IEEE Computer Society conference on computer vision and pattern recognition, pp 2141–2148. IEEE

  49. Patel D, Nagar R, Raman S (2019) Reflection symmetry aware image retargeting. Pattern Recogn Lett 125:179–186

  50. Cheng M-M, Mitra NJ, Huang X, Torr PH, Hu S-M (2014) Global contrast based salient region detection. IEEE Trans Pattern Anal Mach Intell 37(3):569–582

  51. Qin X, Zhang Z, Huang C, Gao C, Dehghan M, Jagersand M (2019) Basnet: Boundary-aware salient object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7479–7489

  52. Qin X, Dai H, Hu X, Fan D-P, Shao L, Van Gool L (2022) Highly accurate dichotomous image segmentation. In: European Conference on Computer Vision, pp 38–56. Springer

  53. Liu J-J, Hou Q, Cheng M-M (2020) Dynamic feature integration for simultaneous detection of salient object, edge, and skeleton. IEEE Trans Image Process 29:8652–8667

  54. Tang F, Dong W, Meng Y, Ma C, Wu F, Li X, Lee T-Y (2019) Image retargetability. IEEE Trans Multimed 22(3):641–654

  55. Zhang Y, Lin W, Zhang X, Fang Y, Li L (2016) Aspect ratio similarity (ars) for image retargeting quality assessment. In: 2016 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 1080–1084. IEEE

  56. Liu C, Yuen J, Torralba A (2010) Sift flow: Dense correspondence across scenes and its applications. IEEE Trans Pattern Anal Mach Intell 33(5):978–994

  57. Rubinstein M, Gutierrez D, Sorkine O, Shamir A (2010) A comparative study of image retargeting. ACM Trans Graph (Proc. SIGGRAPH ASIA) 29(6):160:1–160:10

  58. Simakov D, Caspi Y, Shechtman E, Irani M (2008) Summarizing visual data using bidirectional similarity. In: 2008 IEEE conference on computer vision and pattern recognition, pp 1–8. IEEE

  59. Ma L, Lin W, Deng C, Ngan KN (2012) Image retargeting quality assessment: A study of subjective scores and objective metrics. IEEE J Sel Top Signal Process 6(6):626–639

  60. Zhang L, Li X, Nie L, Yan Y, Zimmermann R (2016) Semantic photo retargeting under noisy image labels. ACM Trans Multimed Comput Commun Appl (TOMM) 12(3):1–22

  61. Rubinstein M, Shamir A, Avidan S (2008) Improved seam carving for video retargeting. ACM Trans Graph (TOG) 27(3):1–9

Acknowledgements

This work was supported in part by the National Science and Technology Council (under nos. 111-2221-E-006-112-MY3, 110-2221-E-006-135-MY3, 112-2221-E-019-063-MY3 and 110-2221-E-019-052-MY3), Republic of China (ROC), Taiwan. This work was also supported by the National Natural Science Foundation of China under nos. U20B2070 and 61832016.

Author information

Corresponding author

Correspondence to Tong-Yee Lee.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest. Part of the datasets generated and/or analysed during the current study are available at https://people.csail.mit.edu/mrub/retargetme, https://www.ee.nthu.edu.tw/cwlin/Retargeting_Quality/NRID.html, and https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: More comparisons

Apart from the comparisons with content-aware retargeting approaches, Fig. 17 compares our results with a semantic-aware retargeting approach [60]. Besides [60] (PM), the figure shows results from five other retargeting methods: seam carving (SC) [3] and its improved version (ISC) [61], patch-based warping (PW) [6], saliency-based mesh parametrization (SMP) [23], and multi-operator (MOR) [4]. These results are obtained from [60]. We observe that our result outperforms the compared results. Carving distortions occur in SC, ISC, and SMP, while a linear-like artifact appears in PW, MOR, and PM (e.g., the green door). Our result shows neither artifact and preserves a balanced structure relative to the input image. Figure 18 shows the performance of our A2R-Map for enlarging. In this experiment, we enlarge the images by 25% in width.

Fig. 12: Comparison with Multi-operator and Cui et al. [21]

Fig. 13: Left to right: input image, NIF energy map, SC + NIF, A2R-Map, SC + A2R-Map

Fig. 14: Comparison with CarvingNet

Fig. 15: Comparison with WSSDCNN

Fig. 16: Comparison with Cycle-IR

Fig. 17: Comparisons with content-aware and semantic-aware retargeting approaches

Fig. 18: Enlarging results. In each pair, left: input image, right: enlarged one

Appendix B: List of notations

Symbol               Definition

\(\mathcal {I}\)     Input image
SC                   Seam carving operator
OMap                 The energy map generated by TFS-Net
BMap                 The energy map generated by Eq. (4)
A2R-Map              The final importance map generated by our model
SOD                  Salient Object Detection
TFS-Net              The network we propose to generate OMap
TFS                  Feature Sharing Session module
AFS                  Adjacent-layer Feature Sharing module
\(\mathcal {X}^i\)   Feature maps at the \(i\)-th layer
\(\mathcal {X}_u\)   Feature maps at the upper layer
\(\mathcal {X}_l\)   Feature maps at the lower layer
\(\mathcal {A}^s\)   The source image/video in general
\(\mathcal {A}^t\)   The target image/video, i.e., \(\mathcal {A}^s\) after the retargeting process
\(\mathcal {P}\)     A certain resizing operator
\(\mathcal {R}\)     A retargeting system using operator \(\mathcal {P}\) to resize \(\mathcal {A}^s\) and output \(\mathcal {A}^t\)
\(\mathcal {M}\)     An off-the-shelf method that \(\mathcal {R}\) uses to define the importance in the input \(\mathcal {A}^s\)
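
Read together, the definitions of \(\mathcal {A}^s\), \(\mathcal {A}^t\), \(\mathcal {P}\), \(\mathcal {R}\), and \(\mathcal {M}\) suggest the composition below. This is one reading of the notation list, not a formula quoted from the article; in particular, treating \(\mathcal {P}\) as a function of the source and its importance map is an assumption.

```latex
\[
\mathcal{A}^t \;=\; \mathcal{R}(\mathcal{A}^s)
\;=\; \mathcal{P}\bigl(\mathcal{A}^s,\ \mathcal{M}(\mathcal{A}^s)\bigr)
\]
```

Under this reading, integrating A2R-Map into an existing system would amount to replacing \(\mathcal {M}\) with the proposed model while leaving the operator \(\mathcal {P}\) unchanged.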

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Le, TNH., Lee, TY., Lin, SS. et al. Deep learning-based importance map for content-aware media retargeting. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18389-4
