Vision-Based Solutions for Robotic Manipulation and Navigation Applied to Object Picking and Distribution

  • Máximo A. Roa-Garzón
  • Elena F. Gambaro
  • Monika Florek-Jasinska
  • Felix Endres
  • Felix Ruess
  • Raphael Schaller
  • Christian Emmerich
  • Korbinian Muenster
  • Michael Suppa
AI Transfer


This paper presents a robotic demonstrator for manipulation and distribution of objects. The demonstrator relies on robust 3D vision-based solutions for navigation, object detection, and detection of graspable surfaces using the rc_visard, a self-registering stereo vision sensor. Dedicated software modules were developed for SLAM and for model-free suction gripping. Because these modules run onboard the sensor, the demonstrator works as a standalone application that does not require an additional host PC. The modules are interfaced with ROS, which allows quick implementation of a fully functional robotic application.


Keywords: Robotic vision, Object picking, Logistics



This project was partially funded by the European Union’s Horizon 2020 research and innovation programme under the project ROSIN, Grant agreement no. 732287, with the FTP (Focused Technical Project) VISARD4ROS.



Copyright information

© Gesellschaft für Informatik e.V. and Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  • Máximo A. Roa-Garzón (1), corresponding author
  • Elena F. Gambaro (1)
  • Monika Florek-Jasinska (1)
  • Felix Endres (1)
  • Felix Ruess (1)
  • Raphael Schaller (1)
  • Christian Emmerich (1)
  • Korbinian Muenster (1)
  • Michael Suppa (1)

  1. Roboception GmbH, Munich, Germany
