Abstract
This paper presents a deep-learning approach based on synthetic data for training the computer vision and motion planning algorithms used in collaborative robotics. The cobot considered here is part of a fully automated packing and cargo loading system that must detect items, estimate their pose in space to grasp them, and generate a collision-free pick-and-place trajectory. Simply recording raw data from sensors is typically insufficient to obtain an object's pose; specialized machine vision algorithms are needed to process the data, usually learning algorithms that depend on extensive, carefully annotated training datasets. However, procuring such datasets can be expensive and time-consuming. To address this problem, we propose using synthetic data to train a neural network that serves as the machine vision component of an automated packing system. We divide the problem into two steps, detection and pose estimation, each performed by a separate convolutional neural network; this decomposition avoids the excessive computational complexity of solving both tasks with a single network. We train and test both networks with synthetic data rendered from a virtual scene of the workstation. For the detection problem, we achieved an accuracy of 99.5%. For the pose estimation problem, a mean centre-of-mass error of 17.78 mm and a mean orientation error of 21.28\(^\circ \) were registered. Testing with real-world data remains pending, as does the evaluation of other network architectures.
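The two-step decomposition described in the abstract (a detection network whose output crops feed a separate pose-estimation network) can be sketched as a plain pipeline. This is only an illustrative skeleton, not the authors' implementation: the `detect`, `estimate_pose`, and `pick_targets` names, the stub return values, and the data classes are all hypothetical placeholders for the two trained CNNs.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Detection:
    label: str                        # object class predicted by the first CNN
    bbox: Tuple[int, int, int, int]   # (x, y, w, h) crop passed to the second CNN

@dataclass
class Pose:
    centre_mm: Tuple[float, float, float]        # estimated centre of mass (mm)
    orientation_deg: Tuple[float, float, float]  # estimated rotation (degrees)

def detect(image) -> List[Detection]:
    """Stage 1 (hypothetical stub): a detection CNN proposes labelled crops."""
    return [Detection(label="box", bbox=(10, 20, 64, 64))]

def estimate_pose(image, det: Detection) -> Pose:
    """Stage 2 (hypothetical stub): a pose CNN runs only on each detected crop."""
    return Pose(centre_mm=(120.0, 45.0, 300.0), orientation_deg=(0.0, 0.0, 15.0))

def pick_targets(image) -> List[Tuple[Detection, Pose]]:
    # Decoupling keeps each network small: the pose network never scans the
    # full frame, only the regions the detector has already selected.
    return [(d, estimate_pose(image, d)) for d in detect(image)]

targets = pick_targets(image=None)
```

The design point the sketch captures is that each stage can be trained and validated independently on the same synthetic renders of the workstation, with the detector's crops defining the pose network's inputs.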
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
Martínez-Franco, J.C., Álvarez-Martínez, D. (2021). Machine Vision for Collaborative Robotics Using Synthetic Data-Driven Learning. In: Trentesaux, D., Borangiu, T., Leitão, P., Jimenez, JF., Montoya-Torres, J.R. (eds) Service Oriented, Holonic and Multi-Agent Manufacturing Systems for Industry of the Future. SOHOMA 2021. Studies in Computational Intelligence, vol 987. Springer, Cham. https://doi.org/10.1007/978-3-030-80906-5_6
Print ISBN: 978-3-030-80905-8
Online ISBN: 978-3-030-80906-5
eBook Packages: Intelligent Technologies and Robotics (R0)