Abstract
Computer vision has been revolutionised in recent years by increased research into convolutional neural networks (CNNs); however, many challenges remain to be addressed to ensure fast and accurate image processing when applying these techniques to robotics. These challenges include handling extreme changes in scale, illumination, noise, and viewing angle of a moving object. The project's main contribution is to provide insight into how to properly train a CNN, a specific type of deep neural network (DNN), for object tracking in the context of industrial robotics. The proposed solution combines documented approaches to replicate a pick-and-place task with an industrial robot, using computer vision to feed a YOLOv3 CNN. Experimental tests, designed to investigate the requirements of training the CNN in this context, were performed in a controlled environment using a variety of objects that differed in shape and size. The general focus was to detect objects based on their shape, so that a suitable and secure grasp could be selected by the robot. The findings in this article reflect the challenges of training the CNN through brute force. The article also highlights different methods of annotating images and the results obtained after training the neural network.
Notes
1. Max pooling is a technique that extracts the most significant features from the output of a convolutional layer.
2. Retrieved from https://www.syntouchinc.com/en/sensor-technology/, last accessed 2019-06-20.
3. Retrieved from https://www.active8robots.com/robots/ar10-robotic-hand/, last accessed 2019-06-20.
4. Retrieved from https://medium.com/@manivannan_data/how-to-train-YOLOv3-to-detect-custom-objects-ccbcafeb13d2, last accessed 2019-04-25.
5. Available online: https://gitlab.com/CNCR-NTU/CNCR_annotation_tool, last accessed 2019-06-15.
6. Available online: https://www.youtube.com/watch?v=vdDqMtdyUYU, last accessed 2019-04-25.
7. Available online: https://www.youtube.com/watch?v=IzN3kp7eAuY, last accessed 2019-04-25.
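To make note 1 concrete, the following is a minimal NumPy sketch of 2-D max pooling (an illustration only, not the paper's implementation): each non-overlapping window of the feature map is reduced to its largest activation, here with a 2×2 window and stride 2.

```python
import numpy as np

def max_pool2d(feature_map, pool=2, stride=2):
    """2-D max pooling: keep the largest activation in each window."""
    h, w = feature_map.shape
    out_h = (h - pool) // stride + 1
    out_w = (w - pool) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = feature_map[i * stride:i * stride + pool,
                                 j * stride:j * stride + pool]
            out[i, j] = window.max()  # strongest feature in this window
    return out

# A toy 4x4 feature map pooled down to 2x2
fm = np.array([[1, 3, 2, 0],
               [4, 6, 5, 1],
               [7, 2, 8, 3],
               [0, 9, 4, 2]], dtype=float)
print(max_pool2d(fm))  # [[6. 5.]
                       #  [9. 8.]]
```

Pooling halves the spatial resolution while keeping the strongest responses, which is one reason CNNs tolerate small translations of the object in the image.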
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Brandenburg, S., Machado, P., Shinde, P., Ferreira, J.F., McGinnity, T.M. (2020). Object Classification for Robotic Platforms. In: Silva, M., Luís Lima, J., Reis, L., Sanfeliu, A., Tardioli, D. (eds) Robot 2019: Fourth Iberian Robotics Conference. ROBOT 2019. Advances in Intelligent Systems and Computing, vol 1093. Springer, Cham. https://doi.org/10.1007/978-3-030-36150-1_17
Print ISBN: 978-3-030-36149-5
Online ISBN: 978-3-030-36150-1
eBook Packages: Intelligent Technologies and Robotics