Automatic Training Method of Deep Neural Network for Robot Vision

Wang, Yao; Chen, Zhihong; Wang, Yanbo; Lin, Junqin; Liang, Binyan; Guo, Meishan

doi:10.1007/978-981-16-6320-8_57

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 805)

Abstract

Thanks to breakthroughs of deep learning in machine vision, robots have been widely deployed in industrial and domestic service applications in recent years. Object capture is one of the most important functions of a robot, and two-dimensional object detection is a prerequisite for it. However, training a model with a high recognition rate is costly, because a deep neural network needs a large number of data samples. To address this, this paper proposes an automatic training method for deep neural networks in robot vision. The Tracking-Learning-Detection (TLD) algorithm tracks the object and collects samples of it through online learning. A Single-Shot-Detector (SSD) model is then trained offline on these samples to learn the object's features, which gives the robot its object recognition function. To keep the tracking of the target stable and continuous, sample acquisition is carried out automatically by the robot manipulator. For comparison, the data are annotated both manually and with the TLD algorithm. The results show that automatic annotation with the TLD algorithm takes 77.75% less time than manual annotation, and that the model trained on automatically labeled data reaches a recognition rate of 97.75%, which supports the validity and feasibility of the proposed method.
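The abstract does not give implementation details, so the following is only a minimal sketch of the labeling step it describes: per-frame bounding boxes reported by a tracker are turned into training annotations for a detector without manual work. Everything named here (the `Annotation` record, `box_to_annotation`, `collect_samples`, the pixel box layout `(x, y, width, height)`, and the use of `None` for frames where the tracker lost the target) is a hypothetical choice for illustration, not the authors' code or the API of any TLD or SSD library.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

# Assumed box layout: (x, y, width, height) in pixels, as many trackers report it.
Box = Tuple[int, int, int, int]


@dataclass
class Annotation:
    """One automatically generated label, with corners normalized to [0, 1]."""
    frame_index: int
    label: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def box_to_annotation(frame_index: int, label: str, box: Box,
                      frame_size: Tuple[int, int]) -> Optional[Annotation]:
    """Clip a tracker box to the frame and normalize it; return None if empty."""
    x, y, w, h = box
    frame_w, frame_h = frame_size
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(frame_w, x + w), min(frame_h, y + h)
    if x1 <= x0 or y1 <= y0:
        return None  # box lies entirely outside the frame
    return Annotation(frame_index, label,
                      x0 / frame_w, y0 / frame_h, x1 / frame_w, y1 / frame_h)


def collect_samples(tracked: Sequence[Optional[Box]], label: str,
                    frame_size: Tuple[int, int]) -> List[Annotation]:
    """Build annotations from per-frame tracker output, skipping lost frames."""
    samples = []
    for frame_index, box in enumerate(tracked):
        if box is None:  # assumed convention: tracker lost the target
            continue
        annotation = box_to_annotation(frame_index, label, box, frame_size)
        if annotation is not None:
            samples.append(annotation)
    return samples


if __name__ == "__main__":
    # Hypothetical tracker output for three 640x480 frames.
    tracked_boxes = [(10, 20, 100, 50), None, (600, 440, 100, 100)]
    for sample in collect_samples(tracked_boxes, "cup", (640, 480)):
        print(sample)
```

In a setup like the one the abstract outlines, the resulting records would be written out in whatever annotation format the detector's training pipeline expects; that conversion is omitted here because the source does not specify it.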

Author information

Authors and Affiliations

Beijing Institute of Precision Mechatronics and Controls, Beijing, 100076, China
Yao Wang, Zhihong Chen, Yanbo Wang, Junqin Lin, Binyan Liang & Meishan Guo
Laboratory of Aerospace Servo Actuation and Transmission, Beijing, 100076, China
Yao Wang, Zhihong Chen, Yanbo Wang, Junqin Lin, Binyan Liang & Meishan Guo

Editor information

Editors and Affiliations

School of Automatic Science and Electrical Engineering, Beihang University, Beijing, China
Yingmin Jia
School of Automation and Electrical Engineering, University of Science and Technology, Beijing, China
Weicun Zhang
School of Mechanical Engineering and Automation, Beihang University, Beijing, China
Yongling Fu
Beijing Institute of Precision Mechatronics and Controls, Beijing, China
Zhiyuan Yu
College of Electrical Engineering and Automation, Fuzhou University, Fuzhou, Fujian, China
Song Zheng

About this paper

Cite this paper

Wang, Y., Chen, Z., Wang, Y., Lin, J., Liang, B., Guo, M. (2022). Automatic Training Method of Deep Neural Network for Robot Vision. In: Jia, Y., Zhang, W., Fu, Y., Yu, Z., Zheng, S. (eds) Proceedings of 2021 Chinese Intelligent Systems Conference. Lecture Notes in Electrical Engineering, vol 805. Springer, Singapore. https://doi.org/10.1007/978-981-16-6320-8_57

DOI: https://doi.org/10.1007/978-981-16-6320-8_57
Published: 06 October 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-6319-2
Online ISBN: 978-981-16-6320-8
eBook Packages: Intelligent Technologies and Robotics (R0)
