Joint regression and learning from pairwise rankings for personalized image aesthetic assessment

Abstract

Recent image aesthetic assessment methods have achieved remarkable progress due to the emergence of deep convolutional neural networks (CNNs). However, these methods focus primarily on predicting the generally perceived preference for an image, which limits their practicality, since different users may have completely different preferences for the same image. To address this problem, this paper presents a novel approach for predicting personalized image aesthetics that fit an individual user’s taste. We achieve this in a coarse-to-fine manner, by joint regression and learning from pairwise rankings. Specifically, we first collect a small set of personal images from a user and invite the user to rank some randomly sampled image pairs by preference. We then search for the K-nearest neighbors of the personal images within a large-scale dataset labeled with average human aesthetic scores, and use these images together with their associated scores to train a generic aesthetic assessment model by CNN-based regression. Next, we fine-tune the generic model to accommodate the user’s personal preference by training on the rankings with a pairwise hinge loss. Experiments demonstrate that our method can effectively learn personalized image aesthetic preferences, clearly outperforming state-of-the-art methods. Moreover, we show that the learned personalized image aesthetics benefit a wide variety of applications.
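
The two ingredients named in the abstract, nearest-neighbor selection and fine-tuning with a pairwise hinge loss, can be illustrated with a minimal sketch. This is not the authors' implementation: a linear scorer over hand-made feature vectors stands in for the CNN, and the function names, margin, learning rate, and toy data are all assumptions chosen for illustration.

```python
import math

def knn_indices(query, pool, k):
    """Return indices of the k pool vectors nearest to `query` (Euclidean)."""
    ranked = sorted((math.dist(query, p), i) for i, p in enumerate(pool))
    return [i for _, i in ranked[:k]]

def score(w, x):
    """Linear stand-in for the CNN: aesthetic score = w . x (an assumption)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def pairwise_hinge_loss(w, pairs, margin=1.0):
    """Mean of max(0, margin - (s(preferred) - s(other))) over all pairs."""
    losses = [max(0.0, margin - (score(w, a) - score(w, b))) for a, b in pairs]
    return sum(losses) / len(losses)

def finetune(w, pairs, lr=0.1, epochs=50, margin=1.0):
    """Subgradient descent on the hinge loss, starting from generic weights."""
    w = list(w)  # copy so the generic model is left untouched
    for _ in range(epochs):
        for a, b in pairs:
            if margin - (score(w, a) - score(w, b)) > 0:  # pair violates the margin
                # The subgradient w.r.t. w is -(a - b), so step along (a - b).
                w = [wi + lr * (ai - bi) for wi, ai, bi in zip(w, a, b)]
    return w

# Toy usage with hypothetical 2-D features.
pool = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.2]]
neighbors = knn_indices([1.0, 1.0], pool, k=2)

generic_w = [1.0, 0.0]  # "generic" model that favors feature 0
# Each pair is (preferred, other); this user favors feature 1 instead.
pairs = [([0.0, 1.0], [1.0, 0.0]), ([0.2, 0.9], [0.8, 0.1])]
personal_w = finetune(generic_w, pairs)
```

In this toy setting the generic weights rank both pairs in the wrong order, and fine-tuning moves the weights until every preferred item outscores its partner by at least the margin. Starting from the generic weights rather than from scratch mirrors the coarse-to-fine idea: the rankings only need to correct the generic model, not replace it.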

Acknowledgements

The authors thank the reviewers for their valuable comments. This work was supported partially by the National Key Research and Development Program of China (2018YFB1004903), National Natural Science Foundation of China (61802453, U1911401, U1811461), Fundamental Research Funds for the Central Universities (19lgpy216), and Research Projects of Zhejiang Lab (2019KD0AB03).

Author information

Corresponding author

Correspondence to Qing Zhang.

Additional information

Jin Zhou is a master's student in the School of Electronics and Information Technology, Sun Yat-sen University. His research interests include computer vision and deep learning.

Qing Zhang is a research associate professor in the School of Computer Science and Engineering, Sun Yat-sen University. His research interests include computational photography and computer vision.

Jian-Hao Fan is an undergraduate student in the School of Computer Science and Engineering, Sun Yat-sen University. His research interests are computer vision and deep learning.

Wei Sun received his Ph.D. degree in computer science from Sun Yat-sen University in 2004, where he is currently a professor in the School of Electronics and Information Technology. His research interests include multimedia forensics and signal processing.

Wei-Shi Zheng received his Ph.D. degree in applied mathematics from Sun Yat-sen University in 2008. He is now a full professor in the School of Computer Science and Engineering, Sun Yat-sen University. His research interests include person/object association and activity understanding in visual surveillance, and related large-scale machine learning algorithms. He has more than 90 publications in leading journals (TPAMI, IJCV, TNN/TNNLS, TIP, PR) and conferences (ICCV, CVPR, IJCAI, AAAI). He is an associate editor of the Pattern Recognition Journal. He has joined the Microsoft Research Asia Young Faculty Visiting Programme and is a recipient of the Excellent Young Scientists Fund of the National Natural Science Foundation of China and the Royal Society Newton Advanced Fellowship, UK.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.

About this article

Cite this article

Zhou, J., Zhang, Q., Fan, JH. et al. Joint regression and learning from pairwise rankings for personalized image aesthetic assessment. Comp. Visual Media 7, 241–252 (2021). https://doi.org/10.1007/s41095-021-0207-y

Keywords

  • personalized image aesthetic assessment
  • deep convolutional neural networks
  • pairwise ranking
  • regression