Machine Vision for Collaborative Robotics Using Synthetic Data-Driven Learning

  • Conference paper
Service Oriented, Holonic and Multi-Agent Manufacturing Systems for Industry of the Future (SOHOMA 2021)

Abstract

This paper presents a deep learning approach based on synthetic data for computer vision training and motion planning algorithms used in collaborative robotics. In this case, the cobot is part of fully automated packing and cargo loading systems that must detect items, estimate their pose in space in order to grasp them, and create a collision-free pick-and-place trajectory. Simply recording raw data from sensors is typically insufficient to obtain an object's pose. Specialized machine vision algorithms are needed to process the data, usually based on learning algorithms that depend on extensive, carefully annotated training datasets. However, procuring these datasets can be expensive and time-consuming. To address this problem, we propose the use of synthetic data to train a neural network that will serve as the machine vision component of an automated packing system. We divide the problem into two steps: detection and pose estimation. Each step is performed by a different convolutional neural network, configured to complete its task without the excessive computational complexity that performing both steps simultaneously would require. We train and test both networks with synthetic data from a virtual scene of the workstation. For the detection problem, we achieved an accuracy of 99.5%. For the pose estimation problem, we registered a mean error of 17.78 mm for the centre of mass and a mean error of 21.28° for orientation. Testing with real-world data remains pending, as does the evaluation of other network architectures.
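
The abstract reports a mean centre-of-mass error in millimetres and a mean orientation error in degrees, but does not state how these were computed. As a minimal sketch only, assuming the position error is the Euclidean distance between predicted and true centres and the orientation error is the geodesic angle between predicted and true rotation matrices, the two means could be obtained as follows. All function names and sample values here are hypothetical and are not taken from the paper.

```python
import math

def position_error_mm(pred, truth):
    """Euclidean distance between two 3D points given in millimetres."""
    return math.dist(pred, truth)

def rotation_z(deg):
    """3x3 rotation matrix for a rotation of `deg` degrees about the z axis."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def orientation_error_deg(r_pred, r_truth):
    """Geodesic angle in degrees between two 3x3 rotation matrices."""
    # trace(R_truth^T @ R_pred) equals the sum of element-wise products.
    trace = sum(r_pred[i][j] * r_truth[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] so rounding noise cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

def mean_pose_errors(samples):
    """Return (mean position error in mm, mean orientation error in degrees).

    `samples` is a list of (pred_pos, true_pos, pred_rot, true_rot) tuples.
    """
    pos = [position_error_mm(p, t) for p, t, _, _ in samples]
    rot = [orientation_error_deg(rp, rt) for _, _, rp, rt in samples]
    return sum(pos) / len(pos), sum(rot) / len(rot)

# Made-up test poses, for illustration only (not data from the paper).
samples = [
    ((0.0, 0.0, 0.0), (3.0, 4.0, 0.0), rotation_z(30), rotation_z(10)),
    ((10.0, 0.0, 0.0), (10.0, 0.0, 15.0), rotation_z(0), rotation_z(40)),
]
mean_pos_mm, mean_rot_deg = mean_pose_errors(samples)
```

Other conventions are equally plausible, for example per-axis angle errors or a quaternion distance, so figures computed this way are not necessarily comparable with the ones reported above.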

Author information

Corresponding author

Correspondence to Juan Camilo Martínez-Franco.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Martínez-Franco, J.C., Álvarez-Martínez, D. (2021). Machine Vision for Collaborative Robotics Using Synthetic Data-Driven Learning. In: Trentesaux, D., Borangiu, T., Leitão, P., Jimenez, JF., Montoya-Torres, J.R. (eds) Service Oriented, Holonic and Multi-Agent Manufacturing Systems for Industry of the Future. SOHOMA 2021. Studies in Computational Intelligence, vol 987. Springer, Cham. https://doi.org/10.1007/978-3-030-80906-5_6
