Multimedia Tools and Applications, Volume 77, Issue 15, pp 19769–19794

Context based image analysis with application in dietary assessment and evaluation

  • Yu Wang
  • Ye He
  • Carol J. Boushey
  • Fengqing Zhu
  • Edward J. Delp

Abstract

Dietary assessment is essential for understanding the link between diet and health. We develop a context based image analysis system for dietary assessment to automatically segment, identify and quantify food items from images. In this paper, we describe image segmentation and object classification methods used in our system to detect and identify food items. We then use context information to refine the classification results. We define contextual dietary information as the data that is not directly produced by the visual appearance of an object in the image, but yields information about a user’s diet or can be used for diet planning. We integrate contextual dietary information that a user supplies to the system either explicitly or implicitly to correct potential misclassifications. We evaluate our models using food image datasets collected during dietary assessment studies from natural eating events.

Keywords

  • Image analysis
  • Image segmentation
  • Object classification
  • Context information
  • Dietary assessment

Notes

Acknowledgements

This work was sponsored by the US National Institutes of Health under grants NIH/NCI 1U01CA130784-01 and NIH/NIDDK 2R56DK073711-04, 1R01-DK073711-01A1. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the US National Institutes of Health.

Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Purdue University, West Lafayette, USA
  2. Google Inc, Mountain View, USA
  3. University of Hawaii Cancer Center, Hawaii, USA
