Deep Learning Approaches in Food Recognition

Machine Learning Paradigms

Part of the book series: Learning and Analytics in Intelligent Systems (LAIS, volume 18)

Abstract

Automatic image-based food recognition is a particularly challenging task. Traditional image analysis approaches achieved low classification accuracy, whereas deep learning approaches have enabled the identification of food types and their ingredients. The contents of food dishes are typically deformable objects with complex semantics, which makes defining their structure very difficult. Deep learning methods have already shown very promising results on such challenges, so this chapter presents some popular approaches and techniques applied in image-based food recognition. The three main lines of solutions, namely design from scratch, transfer learning, and platform-based approaches, are outlined for the task at hand, then tested and compared to reveal their inherent strengths and weaknesses. The chapter is complemented with basic background material, a section devoted to the datasets that are crucial to the empirical approaches adopted, and concluding remarks that outline future directions.
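
As a rough illustration of the transfer learning line named in the abstract, the sketch below trains only a small classification head on top of a fixed feature extractor. It is not the chapter's code: `frozen_features`, `train_head`, `predict`, and the toy bright/dark "images" are hypothetical stand-ins, and a real system would presumably replace the extractor with a CNN pretrained on a large image dataset.

```python
import math
import random


def frozen_features(pixels):
    # Hypothetical stand-in for a pretrained backbone; it is never updated.
    mean = sum(pixels) / len(pixels)
    spread = max(pixels) - min(pixels)
    return [mean, spread, 1.0]  # the constant 1.0 serves as the bias input


def train_head(samples, labels, epochs=200, lr=0.5):
    # Fit a logistic-regression head with full-batch gradient descent.
    # Only these head weights are learned; the features stay frozen.
    feats = [frozen_features(s) for s in samples]
    weights = [0.0] * len(feats[0])
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(feats, labels):
            z = sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))
            for i, xi in enumerate(x):
                grad[i] += (p - y) * xi / len(feats)
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


def predict(weights, pixels):
    # Class 1 if the head's score on the frozen features is positive.
    z = sum(w * xi for w, xi in zip(weights, frozen_features(pixels)))
    return 1 if z > 0 else 0


# Toy data (assumption): 16-pixel "images", bright ones are class 1.
random.seed(0)
bright = [[random.uniform(0.7, 1.0) for _ in range(16)] for _ in range(20)]
dark = [[random.uniform(0.0, 0.3) for _ in range(16)] for _ in range(20)]
samples = bright + dark
labels = [1] * 20 + [0] * 20

head = train_head(samples, labels)
accuracy = sum(predict(head, s) == y for s, y in zip(samples, labels)) / len(samples)
print(f"training accuracy: {accuracy:.2f}")
```

Designing from scratch would instead learn the feature extractor itself from the food images, which usually needs far more labelled data than fitting a head alone.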

Notes

  1. It should be noted that there is an inconsistency across the scientific domains in the usage of the word “tensor”, which has a totally different meaning in physics than in computer science and in particular in machine learning.
  2. Covariate shift refers to a difference in the distribution of the input variables between the training and the test data.
  3. http://caffe.berkeleyvision.org/.
  4. https://www.tensorflow.org/.
  5. https://keras.io/.
  6. https://pytorch.org/.
  7. http://www.deeplearning.net/software/theano/.
  8. https://mxnet.apache.org/.
  9. http://foodcam.mobi/dataset100.html.
  10. http://www.image-net.org/challenges/LSVRC/2010/.
  11. http://foodcam.mobi/dataset256.html.
  12. https://www.vision.ee.ethz.ch/datasets_extra/food-101/.
  13. http://vireo.cs.cityu.edu.hk/VireoFood172/.
  14. http://www.ivl.disco.unimib.it/activities/food524db/.
  15. https://kuanghuei.github.io/Food-101N.
  16. https://cloud.google.com/vision/.
  17. https://clarifai.com.
  18. https://aws.amazon.com/rekognition/.
  19. https://www.microsoft.com/cognitive-services.
  20. https://github.com/chairiq/FoodCNNs.

Acknowledgements

This research has been co-financed by the European Regional Development Fund of the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH—CREATE—INNOVATE (project code: T1EDK-02015). We also gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for the experiments of this research.

Author information

Corresponding author

Correspondence to Chairi Kiourt.

Copyright information

© 2020 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Kiourt, C., Pavlidis, G., Markantonatou, S. (2020). Deep Learning Approaches in Food Recognition. In: Tsihrintzis, G., Jain, L. (eds) Machine Learning Paradigms. Learning and Analytics in Intelligent Systems, vol 18. Springer, Cham. https://doi.org/10.1007/978-3-030-49724-8_4
