Machine Vision for Collaborative Robotics Using Synthetic Data-Driven Learning

Martínez-Franco, Juan Camilo; Álvarez-Martínez, David

doi:10.1007/978-3-030-80906-5_6

Part of the book series: Studies in Computational Intelligence (SCI, volume 987)

Included in the following conference series:

International Workshop on Service Orientation in Holonic and Multi-Agent Manufacturing

Abstract

This paper presents a deep learning approach, based on synthetic data, for training the computer vision and motion planning algorithms used in collaborative robotics. In this case, the cobot is part of a fully automated packing and cargo loading system that must detect items, estimate their pose in space in order to grasp them, and create a collision-free pick-and-place trajectory. Simply recording raw data from sensors is typically insufficient to obtain an object's pose. Specialized machine vision algorithms are needed to process the data, and these are usually based on learning algorithms that depend on extensive, carefully annotated training datasets. However, procuring these datasets can be expensive and time-consuming. To address this problem, we propose the use of synthetic data to train a neural network that serves as the machine vision component of an automated packing system. We divide the problem into two steps: detection and pose estimation. Each step is performed by a different convolutional neural network, configured to complete its task without the excessive computational complexity that performing both tasks simultaneously would require. We train and test both networks with synthetic data from a virtual scene of the workstation. For the detection problem, we achieved an accuracy of 99.5%. For the pose estimation problem, we recorded a mean error of 17.78 mm for the centre of mass and a mean error of 21.28° for orientation. Testing with real-world data remains pending, as does the evaluation of other network architectures.
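The two pose metrics quoted in the abstract (mean centre-of-mass error in millimetres and mean orientation error in degrees) can be illustrated with a minimal sketch. The abstract does not say how orientation is represented or how the errors are computed, so everything here is an assumption: orientations are taken to be unit quaternions in (w, x, y, z) order, the orientation error is the geodesic angle between them, and the function names and sample values are hypothetical, not taken from the paper.

```python
import math


def position_error_mm(pred, true):
    """Euclidean distance between predicted and true centre of mass (mm)."""
    return math.dist(pred, true)


def orientation_error_deg(q_pred, q_true):
    """Geodesic angle in degrees between two unit quaternions (w, x, y, z).

    Assumption: quaternions are one possible orientation representation;
    the paper's actual representation is not stated in the abstract.
    """
    # abs() treats q and -q as the same rotation.
    dot = abs(sum(a * b for a, b in zip(q_pred, q_true)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return math.degrees(2.0 * math.acos(dot))


def mean_errors(samples):
    """Average both errors over (pred_pos, true_pos, pred_quat, true_quat) tuples."""
    samples = list(samples)
    n = len(samples)
    pos = sum(position_error_mm(p, t) for p, t, _, _ in samples) / n
    rot = sum(orientation_error_deg(qp, qt) for _, _, qp, qt in samples) / n
    return pos, rot


# Hypothetical test samples (not data from the paper).
half = math.sqrt(0.5)  # cos(45 deg) = sin(45 deg): a 90 deg turn about z
identity = (1.0, 0.0, 0.0, 0.0)
samples = [
    ((0.0, 0.0, 0.0), (3.0, 4.0, 0.0), identity, identity),
    ((10.0, 0.0, 0.0), (10.0, 0.0, 15.0), (half, 0.0, 0.0, half), identity),
]

mean_pos, mean_rot = mean_errors(samples)
print(f"mean position error: {mean_pos:.2f} mm")
print(f"mean orientation error: {mean_rot:.2f} deg")
```

In this toy set the first sample is off by 5 mm with a perfect orientation and the second is off by 15 mm with a 90° rotation, so the averages describe the typical error rather than the worst case.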

Author information

Authors and Affiliations

Universidad de los Andes, Bogotá, Colombia
Juan Camilo Martínez-Franco & David Álvarez-Martínez

Corresponding author

Correspondence to Juan Camilo Martínez-Franco.

Editor information

Editors and Affiliations

UPHF, LAMIH UMR CNRS 8201, Polytechnic University Hauts de France, Valenciennes cedex 9, France
Damien Trentesaux
Faculty of Automatic Control and Computer Science, University Politehnica of Bucharest, Bucharest, Romania
Theodor Borangiu
Polytechnic Institute of Bragança, Research Centre in Digitalization and Intelligent Robotics (CeDRI), Bragança, Portugal
Paulo Leitão
Faculty of Engineering, Department of Industrial Engineering, Pontificia Universidad Javeriana, Bogota, Colombia
Jose-Fernando Jimenez
Facultad de Ingeniería, Universidad de La Sabana, Chia, Colombia
Jairo R. Montoya-Torres

Cite this paper

Martínez-Franco, J.C., Álvarez-Martínez, D. (2021). Machine Vision for Collaborative Robotics Using Synthetic Data-Driven Learning. In: Trentesaux, D., Borangiu, T., Leitão, P., Jimenez, JF., Montoya-Torres, J.R. (eds) Service Oriented, Holonic and Multi-Agent Manufacturing Systems for Industry of the Future. SOHOMA 2021. Studies in Computational Intelligence, vol 987. Springer, Cham. https://doi.org/10.1007/978-3-030-80906-5_6

DOI: https://doi.org/10.1007/978-3-030-80906-5_6
Published: 29 July 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-80905-8
Online ISBN: 978-3-030-80906-5
eBook Packages: Intelligent Technologies and Robotics (R0)
