Construction of a 3D object recognition and manipulation database from grasp demonstrations


Object recognition and manipulation are critical for enabling robots to operate in household environments. Many grasp planners can estimate grasps based on object shape, but they ignore key information about non-visual object characteristics. Object model databases can account for this information, but existing methods for database construction are time- and resource-intensive. We present an easy-to-use system for constructing object models for 3D object recognition and manipulation, made possible by advances in web robotics. The database consists of point clouds generated using a novel iterative point cloud registration algorithm. The system requires no additional equipment beyond the robot, and non-expert users can demonstrate grasps through an intuitive web interface. We validate the system with data collected from both a crowdsourcing user study and expert demonstration. We show that the demonstration approach outperforms purely vision-based grasp planning approaches for a wide variety of object classes.


  1. Documentation and source code are available at Tutorials are available at




I would like to thank Professor Odest Chadwicke Jenkins of Brown University for allowing the use of the PR2 robot throughout the user study and validation experiments. Furthermore, I would like to thank fellow graduate students Russell Toris for the development of the RMS, and Morteza Behrooz for his help in conducting the user study. This work was supported by the National Science Foundation Award Number 1149876, CAREER: Towards Robots that Learn from Everyday People, PI Sonia Chernova, and Office of Naval Research Grant N00014-08-1-0910 PECASE: Tracking Human Movement Using Models of Physical Dynamics and Neurobiomechanics with Probabilistic Inference, PI Odest Chadwicke Jenkins.

Author information

Correspondence to David Kent.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (mp4 118021 KB)



Cite this article

Kent, D., Behrooz, M. & Chernova, S. Construction of a 3D object recognition and manipulation database from grasp demonstrations. Auton Robot 40, 175–192 (2016).



  • Object manipulation
  • Grasp demonstration
  • Crowdsourcing