
CNN-Based Food Image Segmentation Without Pixel-Wise Annotation

  • Wataru Shimoda
  • Keiji Yanai
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9281)

Abstract

We propose a CNN-based food image segmentation method that requires no pixel-wise annotation. The proposed method consists of food region proposals by selective search and bounding box clustering, back-propagation-based saliency map estimation with a CNN model fine-tuned on the UEC-FOOD100 dataset, GrabCut guided by the estimated saliency maps, and region integration by non-maximum suppression. In the experiments, the proposed method outperformed R-CNN on food region detection as well as on the PASCAL VOC detection task.
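
As an illustration of the pipeline stages described above, the following is a minimal sketch (not the authors' implementation) of two of them: seeding GrabCut from an estimated saliency map and merging overlapping detections by non-maximum suppression. The OpenCV-based seeding, the thresholds (fg_thresh, bg_thresh, iou_thresh), and the function names are assumptions for illustration only.

    # Hypothetical sketch of two pipeline stages described in the abstract:
    # (1) saliency-guided GrabCut and (2) non-maximum suppression.
    # Thresholds and function names are illustrative assumptions, not the authors' code.
    import cv2
    import numpy as np

    def saliency_guided_grabcut(image, saliency, fg_thresh=0.7, bg_thresh=0.3, iters=5):
        """Seed GrabCut with a saliency map (float in [0, 1], same HxW as image)."""
        mask = np.full(saliency.shape, cv2.GC_PR_BGD, dtype=np.uint8)
        mask[saliency > bg_thresh] = cv2.GC_PR_FGD   # probably foreground
        mask[saliency > fg_thresh] = cv2.GC_FGD      # definitely foreground
        bgd_model = np.zeros((1, 65), np.float64)
        fgd_model = np.zeros((1, 65), np.float64)
        # image must be an 8-bit BGR array; mask is refined in place.
        cv2.grabCut(image, mask, None, bgd_model, fgd_model, iters,
                    cv2.GC_INIT_WITH_MASK)
        return np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0).astype(np.uint8)

    def non_max_suppression(boxes, scores, iou_thresh=0.5):
        """Keep the highest-scoring box among heavily overlapping detections.
        boxes: (N, 4) array of (x1, y1, x2, y2); scores: (N,) array."""
        order = np.argsort(scores)[::-1]
        keep = []
        while order.size > 0:
            i = order[0]
            keep.append(i)
            # Intersection-over-union of the top box with the remaining boxes.
            x1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
            y1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
            x2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
            y2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
            inter = np.maximum(0, x2 - x1) * np.maximum(0, y2 - y1)
            area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
            areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                    (boxes[order[1:], 3] - boxes[order[1:], 1])
            iou = inter / (area_i + areas - inter)
            order = order[1:][iou <= iou_thresh]
        return keep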

Keywords

Food segmentation · Convolutional neural network · Deep learning · UEC-FOOD

References

  1. Bosch, M., Zhu, F., Khanna, N., Boushey, C.J., Delp, E.J.: Combining global and local features for food identification in dietary assessment. In: Proc. of IEEE International Conference on Image Processing (2011)
  2. Bossard, L., Guillaumin, M., Van Gool, L.: Food-101 – mining discriminative components with random forests. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014, Part VI. LNCS, vol. 8694, pp. 446–461. Springer, Heidelberg (2014)
  3. Chen, M., Yang, Y., Ho, C., Wang, S., Liu, S., Chang, E., Yeh, C., Ouhyoung, M.: Automatic Chinese food identification and quantity estimation. In: SIGGRAPH Asia (2012)
  4. Deng, Y., Manjunath, B.S.: Unsupervised segmentation of color-texture regions in images and video. IEEE Transactions on Pattern Analysis and Machine Intelligence 23(8), 800–810 (2001)
  5. Felzenszwalb, P.F., Girshick, R.B., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence 32(9), 1627–1645 (2010)
  6. Felzenszwalb, P.F., Huttenlocher, D.P.: Image segmentation using local variation. In: Proc. of IEEE Computer Vision and Pattern Recognition, pp. 98–104 (1998)
  7. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proc. of IEEE Computer Vision and Pattern Recognition, pp. 580–587 (2014)
  8. He, Y., Xu, C., Khanna, N., Boushey, C.J., Delp, E.J.: Food image analysis: segmentation, identification and weight estimation. In: Proc. of IEEE International Conference on Multimedia and Expo, pp. 1–6 (2013)
  9. Kagaya, H., Aizawa, K., Ogawa, M.: Food detection and recognition using convolutional neural network. In: Proc. of ACM International Conference on Multimedia, pp. 1085–1088 (2014)
  10. Kawano, Y., Yanai, K.: Real-time mobile food recognition system. In: Proc. of IEEE CVPR International Workshop on Mobile Vision (IWMV) (2013)
  11. Kawano, Y., Yanai, K.: Food image recognition with deep convolutional features. In: Proc. of ACM UbiComp Workshop on Smart Technology for Cooking and Eating Activities (CEA) (2014)
  12. Kawano, Y., Yanai, K.: FoodCam: a real-time food recognition system on a smartphone. Multimedia Tools and Applications, 1–25 (2014)
  13. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems (2012)
  14. Matsuda, Y., Hoashi, H., Yanai, K.: Recognition of multiple-food images by detecting candidate regions. In: Proc. of IEEE International Conference on Multimedia and Expo, pp. 1554–1564 (2012)
  15. Morikawa, C., Sugiyama, H., Aizawa, K.: Food region segmentation in meal images using touch points. In: Proc. of ACM MM Workshop on Multimedia for Cooking and Eating Activities (CEA), pp. 7–12 (2012)
  16. Rother, C., Kolmogorov, V., Blake, A.: GrabCut: interactive foreground extraction using iterated graph cuts. ACM Transactions on Graphics (TOG) 23(3), 309–314 (2004)
  17. Simonyan, K., Vedaldi, A., Zisserman, A.: Deep inside convolutional networks: visualising image classification models and saliency maps. In: Proc. of International Conference on Learning Representations Workshop Track (2014). http://arxiv.org/abs/1312.6034
  18. Springenberg, J.T., Dosovitskiy, A., Brox, T., Riedmiller, M.: Striving for simplicity: the all convolutional net. In: Proc. of International Conference on Learning Representations Workshop Track (2015). http://arxiv.org/abs/1412.6806
  19. Uijlings, J.R.R., van de Sande, K.E.A., Gevers, T., Smeulders, A.W.M.: Selective search for object recognition. International Journal of Computer Vision 104(2), 154–171 (2013)
  20. Yang, S., Chen, M., Pomerleau, D., Sukthankar, R.: Food recognition using statistics of pairwise local features. In: Proc. of IEEE Computer Vision and Pattern Recognition (2010)
  21. Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014, Part I. LNCS, vol. 8689, pp. 818–833. Springer, Heidelberg (2014)

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Department of Informatics, The University of Electro-Communications, Chofu-shi, Japan