Abstract
The rapid development of artificial intelligence has brought automated machines into widespread use. In the catering industry, automatic billing has strong development prospects. The recognition methods commonly used at present are food recognition and dish recognition. Because Chinese cuisine includes so many types of food, food recognition is hard to deploy widely, so this article focuses on dish recognition. Faster R-CNN is a model that achieves good results, and many improved networks based on it have been applied to recognition in various settings. This article improves Faster R-CNN by combining it with cross-connected layers and proposes the Cross Faster R-CNN model, which fuses low-level and high-level features and introduces an attention mechanism so that the model highlights the characteristics of the dishes. The experimental results show that, with the cross-connected layers and attention mechanism, Cross Faster R-CNN improves accuracy significantly with no major change in detection speed.
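The two ideas named in the abstract, fusing low-level with high-level features and reweighting the result with attention, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how the cross-connected layer or the attention mechanism is built, so the nearest-neighbour upsampling, the channel concatenation, the sigmoid channel gates, and all names (upsample_nearest, cross_connect, channel_attention, the random weight matrix w) are assumptions chosen only for illustration.

```python
import numpy as np


def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


def cross_connect(low, high):
    """Fuse a low-level and a high-level map (assumed form of a cross connection).

    The coarser high-level map is upsampled to the low-level resolution and
    the two maps are concatenated along the channel axis.
    """
    factor = low.shape[1] // high.shape[1]
    return np.concatenate([low, upsample_nearest(high, factor)], axis=0)


def channel_attention(x, w):
    """Rescale each channel by a sigmoid gate (assumed form of attention).

    The gates are computed from the globally average-pooled channel
    descriptor through a single linear map w of shape (C, C).
    """
    pooled = x.mean(axis=(1, 2))                 # (C,) one value per channel
    gates = 1.0 / (1.0 + np.exp(-(w @ pooled)))  # (C,) each gate in (0, 1)
    return x * gates[:, None, None]


# Hypothetical feature maps: 4 low-level channels at 8x8, 6 high-level at 4x4.
rng = np.random.default_rng(0)
low = rng.standard_normal((4, 8, 8))
high = rng.standard_normal((6, 4, 4))

fused = cross_connect(low, high)                 # shape (10, 8, 8)
w = rng.standard_normal((10, 10)) * 0.1          # untrained, illustrative weights
attended = channel_attention(fused, w)           # shape (10, 8, 8)
```

In a trained detector the gate weights would be learned, so channels that respond to the dishes would receive gates near 1 and the rest would be suppressed; here the weights are random and only the data flow is meaningful.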
Cite this article
Xiong, J., Zhu, L., Ye, L. et al. Attention aware cross faster RCNN model and simulation. Wireless Netw (2021). https://doi.org/10.1007/s11276-021-02645-8
DOI: https://doi.org/10.1007/s11276-021-02645-8