Abstract
Automatic image-based food recognition is a particularly challenging task. Traditional image analysis approaches have historically achieved low classification accuracy, whereas deep learning approaches have enabled the identification of food types and their ingredients. Food dishes typically consist of deformable objects with complex semantics, which makes defining their structure very difficult. Deep learning methods have already shown very promising results on such challenges, so this chapter presents some popular approaches and techniques applied in image-based food recognition. The three main lines of solutions, namely design from scratch, transfer learning, and platform-based approaches, are outlined for the task at hand, then tested and compared to reveal their inherent strengths and weaknesses. The chapter is complemented with basic background material, a section devoted to the relevant datasets, which are crucial given the empirical approaches adopted, and concluding remarks that outline future directions.
Notes
- 1. It should be noted that there is an inconsistency across the scientific domains in the usage of the word “tensor”, which has a totally different meaning in physics than in computer science and in particular in machine learning.
- 2. Covariate shift relates to a change in the distribution of the input variables of the training and the test data.
Acknowledgements
This research has been co-financed by the European Regional Development Fund of the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH—CREATE—INNOVATE (project code: T1EDK-02015). We also gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for the experiments of this research.
Copyright information
© 2020 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Kiourt, C., Pavlidis, G., Markantonatou, S. (2020). Deep Learning Approaches in Food Recognition. In: Tsihrintzis, G., Jain, L. (eds) Machine Learning Paradigms. Learning and Analytics in Intelligent Systems, vol 18. Springer, Cham. https://doi.org/10.1007/978-3-030-49724-8_4
DOI: https://doi.org/10.1007/978-3-030-49724-8_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-49723-1
Online ISBN: 978-3-030-49724-8
eBook Packages: Mathematics and Statistics (R0)