Classical learning or deep learning: a study on food photo aesthetic assessment

  • 1230: Sentient Multimedia Systems and Visual Intelligence
Multimedia Tools and Applications

Abstract

Food photo aesthetic assessment has gained increasing attention in both commercial activity and social life, yet little research has been dedicated to the quality classification of food photos. This paper presents a study on food photo aesthetic evaluation, covering dataset collection and evaluation methods. First, a dataset of food photos was collected with a web crawler from food-sharing websites; appropriate images were then selected and labeled for binary classification using a WeChat applet. Next, food photo aesthetics were assessed with both classical machine learning and deep learning methods. For the classical models, hand-crafted features, including layout, texture, color, local, and deep features, were extracted, and two classifiers, support vector machine and random forest, were trained on them. For comparison, three convolutional neural networks (AlexNet, VGGNet, ResNet) were applied by fine-tuning their model parameters. Four quantitative metrics (accuracy, recall, precision, and F1-score) were used to evaluate performance: classical learning reached an accuracy of 91.09% and deep learning 94.70%. This demonstrates that classical learning with good enough hand-crafted features can achieve performance close to that of CNNs. The dataset can serve as a basis for preliminary exploration of food image aesthetic assessment with both classical learning and deep learning.
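As a rough illustration of the four metrics named in the abstract, the sketch below computes accuracy, precision, recall, and F1-score for a binary classifier from its predictions. It is not the authors' evaluation code: the function name `binary_metrics`, the convention that label 1 marks the positive class, and the sample labels are all assumptions made for this example.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = positive class)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    # Count the four cells of the confusion matrix.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no positive predictions or labels).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels for illustration only; they are not taken from the paper's dataset.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

Accuracy alone can hide an imbalance between the two classes, which is one reason to report precision, recall, and F1-score alongside it when comparing classical and deep models.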

Data Availability

The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Acknowledgements

The authors would like to express sincere thanks to Associate Professor Zhao of the School of Software Engineering, Tongji University for her valuable comments and suggestions.

Funding

This research received no external funding.

Author information

Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Zhaotong Li. The first draft of the manuscript was written by Zhaotong Li and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Zhaotong Li.

Ethics declarations

Conflicts of interest

The authors declare no conflict of interest.

About this article

Cite this article

Li, Z., Zhang, Z. & Gao, S. Classical learning or deep learning: a study on food photo aesthetic assessment. Multimed Tools Appl 83, 36469–36489 (2024). https://doi.org/10.1007/s11042-023-15791-2

  • DOI: https://doi.org/10.1007/s11042-023-15791-2
