Abstract
In recent years, deep learning has been applied to various tasks in the field of food recognition, and some promising solutions have been proposed. However, owing to the complexity of food backgrounds, pattern recognition on a limited dataset remains challenging. Experiments were conducted on a self-collected dataset of canteen-tray images containing various dishes that change with the day of the week. The main objective of this work is to compare the effectiveness of modern object detection architectures, namely YOLOv5, YOLOv6, and YOLOv7, as well as YOLOv5 combined with a custom classifier. The experimental results showed that the custom classifier was needed to distinguish the dishes reliably and with high performance.
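Comparisons of detection architectures such as those described above are conventionally scored by matching predicted boxes to ground-truth boxes at an intersection-over-union (IoU) threshold. The sketch below is a minimal, illustrative version of that matching step, not the authors' evaluation code; the function names and the 0.5 threshold are assumptions chosen for clarity.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_precision(predictions, ground_truth, iou_thr=0.5):
    """Fraction of predicted boxes that match a distinct ground-truth box
    with IoU >= iou_thr (greedy one-to-one matching)."""
    matched = set()
    true_positives = 0
    for pred in predictions:
        best_idx, best_iou = None, iou_thr
        for i, gt in enumerate(ground_truth):
            if i in matched:
                continue  # each ground-truth box may be matched only once
            overlap = iou(pred, gt)
            if overlap >= best_iou:
                best_idx, best_iou = i, overlap
        if best_idx is not None:
            matched.add(best_idx)
            true_positives += 1
    return true_positives / len(predictions) if predictions else 0.0
```

In a full benchmark this per-image precision would be aggregated over classes and IoU thresholds (as in the COCO mAP protocol); the sketch shows only the core geometric matching on which such metrics rest.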
Funding
This work was supported by ongoing institutional funding. No additional grants to carry out or direct this particular research were obtained.
Ethics declarations
The authors of this work declare that they have no conflicts of interest.
Cite this article
M. Gerasimchuk and A. Uzhinskiy, "Food recognition for smart restaurants and self-service cafes," Phys. Part. Nuclei Lett. 21, 79–83 (2024). https://doi.org/10.1134/S1547477124010059