Abstract
An unhealthy diet is one of the most common factors directly linked to chronic disease. An automatic system for food analysis could give a better understanding of the nutritional information associated with the food consumed and thus help people take corrective action on their diet. The Computer Vision community has focused its efforts on several areas of visual food analysis, including food detection, food recognition, food localization, and portion estimation. For food detection, the best state-of-the-art results have been obtained using Convolutional Neural Networks. However, the different approaches were evaluated on different datasets, so their results are not directly comparable. This article presents an overview of the latest advances in food detection and proposes an optimal model based on the GoogLeNet architecture, Principal Component Analysis, and a Support Vector Machine that outperforms the state of the art on two public food/non-food datasets.
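The abstract names three stages: CNN features, Principal Component Analysis, and a Support Vector Machine. The sketch below illustrates only the last two stages and is not the authors' implementation. It assumes the GoogLeNet features have already been extracted and replaces them with synthetic vectors. The feature dimension, the number of retained components, and the training hyperparameters are hypothetical values chosen for illustration. The SVM is a linear one trained by subgradient descent on the hinge loss, used here as a dependency-free stand-in, since the abstract does not specify the kernel or solver.

```python
import numpy as np


def make_features(rng, n_per_class, dim=64, informative=8, offset=1.5):
    """Synthetic stand-in for CNN features (hypothetical data, not real images).

    Labels: +1 = food, -1 = non-food. Only the first `informative`
    dimensions carry class information.
    """
    X = rng.normal(size=(2 * n_per_class, dim))
    y = np.repeat([1.0, -1.0], n_per_class)
    X[:, :informative] += offset * y[:, None]
    return X, y


def fit_pca(X, n_components):
    """Return (mean, components); (X - mean) @ components.T projects X."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def train_linear_svm(Z, y, lam=0.01, epochs=200, lr=0.1):
    """Full-batch subgradient descent on the L2-regularised hinge loss."""
    w = np.zeros(Z.shape[1])
    b = 0.0
    n = len(y)
    for _ in range(epochs):
        active = y * (Z @ w + b) < 1  # samples inside or beyond the margin
        grad_w = lam * w - Z[active].T @ y[active] / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict(X, mean, components, w, b):
    """Project features with the fitted PCA, then apply the linear rule."""
    Z = (X - mean) @ components.T
    return np.where(Z @ w + b >= 0, 1, -1)


rng = np.random.default_rng(0)
X_train, y_train = make_features(rng, 200)
X_test, y_test = make_features(rng, 100)

# PCA is fitted on training features only, then reused for the test set.
mean, components = fit_pca(X_train, n_components=8)
w, b = train_linear_svm((X_train - mean) @ components.T, y_train)

accuracy = float((predict(X_test, mean, components, w, b) == y_test).mean())
print(f"held-out accuracy: {accuracy:.3f}")
```

With real data, `X_train` and `X_test` would hold one feature vector per image taken from a pre-trained network, and a library SVM would normally replace the hand-written trainer.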
Acknowledgement
This work was partially funded by TIN2015-66951-C2, SGR 1219, CERCA, ICREA Academia’2014, CONICYT Becas Chile, FPU15/01347 and Grant 20141510 (Marató TV3). The funders had no role in the study design, data collection, analysis, and preparation of the manuscript. We acknowledge Nvidia Corporation for the donation of a Titan X GPU.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Aguilar, E., Bolaños, M., Radeva, P. (2018). Exploring Food Detection Using CNNs. In: Moreno-Díaz, R., Pichler, F., Quesada-Arencibia, A. (eds) Computer Aided Systems Theory – EUROCAST 2017. Lecture Notes in Computer Science, vol 10672. Springer, Cham. https://doi.org/10.1007/978-3-319-74727-9_40
DOI: https://doi.org/10.1007/978-3-319-74727-9_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-74726-2
Online ISBN: 978-3-319-74727-9
eBook Packages: Computer Science, Computer Science (R0)