FoodCam: A real-time food recognition system on a smartphone

Kawano, Yoshiyuki; Yanai, Keiji

doi:10.1007/s11042-014-2000-8

Published: 12 April 2014

Volume 74, pages 5263–5287, (2015)

Multimedia Tools and Applications

Abstract

We propose FoodCam, a mobile food recognition system whose purposes are to estimate the calories and nutrition of foods and to record a user’s eating habits. In this paper, we propose image recognition methods that are suitable for mobile devices. The proposed method enables real-time food image recognition on a consumer smartphone, which is completely different from existing systems that require sending images to an image recognition server. To recognize food items, a user first draws bounding boxes by touching the screen, and the system then starts food item recognition within the indicated bounding boxes. To recognize them more accurately, we segment each food item region by GrabCut, extract image features, and finally classify the region into one of one hundred food categories with a linear SVM. As image features, we adopt two kinds of features: one is the combination of the standard bag-of-features and color histograms with χ² kernel feature maps, and the other is a HOG patch descriptor and a color patch descriptor with the state-of-the-art Fisher Vector representation. In addition, the system estimates the direction of food regions where a higher SVM output score is expected to be obtained, and it shows the estimated direction as an arrow on the screen to ask the user to move the smartphone camera. This recognition process is performed repeatedly and continuously. We implemented the system as a standalone mobile application for Android smartphones that uses multiple CPU cores effectively for real-time recognition. In the experiments, we achieved a 79.2% classification rate for the top 5 category candidates on a 100-category food dataset with ground-truth bounding boxes when we used HOG and color patches with Fisher Vector coding as image features. In addition, we obtained positive evaluations in a user study compared to a food recording system without object recognition.
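The reported figure is a top-5 classification rate: a sample counts as correct when its ground-truth category is among the five categories with the highest linear SVM output scores. The following is a minimal sketch of that measure, not the authors' implementation. It assumes a one-vs-rest linear SVM in which each category c has a weight vector w_c and bias b_c; the function names, weights, features, and labels are all made up for illustration.

```python
from typing import List, Sequence


def linear_svm_scores(weights: Sequence[Sequence[float]],
                      biases: Sequence[float],
                      feature: Sequence[float]) -> List[float]:
    """One-vs-rest linear SVM: score_c = w_c . x + b_c for each category c."""
    return [sum(w_i * x_i for w_i, x_i in zip(w, feature)) + b
            for w, b in zip(weights, biases)]


def top_k_categories(scores: Sequence[float], k: int = 5) -> List[int]:
    """Indices of the k highest-scoring categories, best first."""
    return sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)[:k]


def top_k_rate(all_scores: Sequence[Sequence[float]],
               labels: Sequence[int],
               k: int = 5) -> float:
    """Fraction of samples whose true label is among the top-k candidates."""
    hits = sum(1 for scores, label in zip(all_scores, labels)
               if label in top_k_categories(scores, k))
    return hits / len(labels)


# Toy data (hypothetical): 3 categories, 2-dimensional features.
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
biases = [0.0, 0.0, 0.5]
features = [[2.0, 1.0], [0.5, 3.0], [-1.0, -2.0], [1.0, 0.9]]
labels = [0, 1, 2, 1]

all_scores = [linear_svm_scores(weights, biases, x) for x in features]
print(top_k_rate(all_scores, labels, k=1))  # top-1 rate on the toy data
print(top_k_rate(all_scores, labels, k=2))  # top-2 rate on the toy data
```

In the toy data the last sample is misclassified at rank 1 but recovered at rank 2, which is why a top-k rate is a natural measure for an interface that shows several category candidates for the user to choose from.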

Author information

Authors and Affiliations

The University of Electro-Communications, Tokyo 1-5-1 Chofugaoka, Chofu-shi, Tokyo, 182-8585, Japan
Yoshiyuki Kawano & Keiji Yanai

Corresponding author

Correspondence to Keiji Yanai.

About this article

Cite this article

Kawano, Y., Yanai, K. FoodCam: A real-time food recognition system on a smartphone. Multimed Tools Appl 74, 5263–5287 (2015). https://doi.org/10.1007/s11042-014-2000-8

Issue Date: July 2015
DOI: https://doi.org/10.1007/s11042-014-2000-8
