Model Based Training, Detection and Pose Estimation of Texture-Less 3D Objects in Heavily Cluttered Scenes

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7724)


We propose a framework for automatic modeling, detection, and tracking of 3D objects with a Kinect. The detection part is mainly based on the recent template-based LINEMOD approach [1] for object detection. We show how to build the templates automatically from 3D models, and how to estimate the 6 degrees-of-freedom pose accurately and in real time. The pose estimation and the color information allow us to check the detection hypotheses and improve the correct detection rate by 13% with respect to the original LINEMOD. These improvements make our framework suitable for object manipulation in robotics applications. Moreover, we propose a new dataset made of 15 registered video sequences of 1100+ frames each, covering 15 different objects, for the evaluation of future competing methods.
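
The 6 degrees of freedom of a rigid pose are three for rotation and three for translation. As a minimal sketch of what such a pose does to the points of a 3D model, the following maps a model-space point into the camera frame via p' = R p + t. This is not the authors' implementation: the function names `rotation_from_axis_angle` and `apply_pose`, the axis-angle parameterization, and the example values are all assumptions chosen for illustration.

```python
import math


def rotation_from_axis_angle(axis, angle):
    """Rodrigues' formula: 3x3 rotation matrix for `angle` radians about `axis`."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("rotation axis must be non-zero")
    x, y, z = x / norm, y / norm, z / norm
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [t * x * x + c,     t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, t * y * y + c,     t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, t * z * z + c],
    ]


def apply_pose(rotation, translation, point):
    """Map a model-space point into the camera frame: p' = R p + t."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


# Hypothetical pose: rotate 90 degrees about the z-axis, then shift 0.5 along z.
R = rotation_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
t = (0.0, 0.0, 0.5)
p_cam = apply_pose(R, t, (1.0, 0.0, 0.0))
```

With this hypothetical pose, the model point on the x-axis ends up (up to floating-point rounding) on the y-axis, raised by 0.5 along z. A detection hypothesis can be thought of as one such (R, t) pair per detected object.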


Object Detection, Object Projection, Color Gradient, Cluttered Scene, Correct Detection Rate
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




  1. Hinterstoisser, S., Cagniart, C., Holzer, S., Ilic, S., Konolige, K., Navab, N., Lepetit, V.: Multimodal Templates for Real-Time Detection of Texture-Less Objects in Heavily Cluttered Scenes. In: ICCV (2011)
  2. Newcombe, R.A., Izadi, S., Hilliges, O., Molyneaux, D., Kim, D., Davison, A.J., Kohli, P., Shotton, J., Hodges, S., Fitzgibbon, A.: KinectFusion: Real-Time Dense Surface Mapping and Tracking. In: ISMAR (2011)
  3. Pan, Q., Reitmayr, G., Drummond, T.: ProFORMA: Probabilistic Feature-based On-line Rapid Model Acquisition. In: BMVC (2009)
  4. Weise, T., Wismer, T., Leibe, B., Gool, L.V.: In-hand Scanning with Online Loop Closure. In: International Workshop on 3-D Digital Imaging and Modeling (2009)
  5. Newcombe, R.A., Lovegrove, S.J., Davison, A.J.: DTAM: Dense Tracking and Mapping in Real-Time. In: ICCV (2011)
  6. Viola, P., Jones, M.: Fast Multi-View Face Detection. In: CVPR (2003)
  7. Stark, M., Goesele, M., Schiele, B.: Back to the Future: Learning Shape Models from 3D CAD Data. In: BMVC (2010)
  8. Liebelt, J., Schmid, C.: Multi-View Object Class Detection With a 3D Geometric Model. In: CVPR (2010)
  9. Ferrari, V., Jurie, F., Schmid, C.: From Images to Shape Models for Object Detection. In: IJCV (2009)
  10. Payet, N., Todorovic, S.: From Contours to 3D Object Detection and Pose Estimation. In: ICCV, pp. 983–990 (2011)
  11. Gavrila, D., Philomin, V.: Real-Time Object Detection for “Smart” Vehicles. In: ICCV (1999)
  12. Huttenlocher, D., Klanderman, G., Rucklidge, W.: Comparing Images Using the Hausdorff Distance. TPAMI (1993)
  13. Steger, C.: Similarity Measures for Occlusion, Clutter, and Illumination Invariant Object Recognition. In: Radig, B., Florczyk, S. (eds.) DAGM 2001. LNCS, vol. 2191, pp. 148–154. Springer, Heidelberg (2001)
  14. Hinterstoisser, S., Lepetit, V., Ilic, S., Fua, P., Navab, N.: Dominant Orientation Templates for Real-Time Detection of Texture-Less Objects. In: CVPR (2010)
  15. Mian, A.S., Bennamoun, M., Owens, R.A.: Automatic Correspondence for 3D Modeling: an Extensive Review. International Journal of Shape Modeling (2005)
  16. Zhang, Z.: Iterative Point Matching for Registration of Free-Form Curves. In: IJCV (1994)
  17. Johnson, A.E., Hebert, M.: Using Spin Images for Efficient Object Recognition in Cluttered 3D Scenes. TPAMI (1999)
  18. Drost, B., Ulrich, M., Navab, N., Ilic, S.: Model Globally, Match Locally: Efficient and Robust 3D Object Recognition. In: CVPR (2010)
  19. Mian, A.S., Bennamoun, M., Owens, R.: Three-Dimensional Model-Based Object Recognition and Segmentation in Cluttered Scenes. TPAMI (2006)
  20. Rusu, R.B., Blodow, N., Beetz, M.: Fast Point Feature Histograms (FPFH) for 3D Registration. In: International Conference on Robotics and Automation (2009)
  21. Tombari, F., Salti, S., Di Stefano, L.: Unique Signatures of Histograms for Local Surface Description. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part III. LNCS, vol. 6313, pp. 356–369. Springer, Heidelberg (2010)
  22. Sun, M., Bradski, G., Xu, B.-X., Savarese, S.: Depth-Encoded Hough Voting for Joint Object Detection and Shape Recovery. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part V. LNCS, vol. 6315, pp. 658–671. Springer, Heidelberg (2010)
  23. Lai, K., Bo, L., Ren, X., Fox, D.: Sparse Distance Learning for Object Recognition Combining RGB and Depth Information. In: ICRA, pp. 4007–4013 (2011)
  24. Grabner, M., Grabner, H., Bischof, H.: Learning Features for Tracking. In: CVPR (2007)
  25. Ozuysal, M., Calonder, M., Lepetit, V., Fua, P.: Fast Keypoint Online Learning and Recognition. TPAMI (2010)
  26. Kalal, Z., Matas, J., Mikolajczyk, K.: P-N Learning: Bootstrapping Binary Classifiers by Structural Constraints. In: CVPR (2010)
  27. Hinterstoisser, S., Benhimane, S., Lepetit, V., Fua, P., Navab, N.: Simultaneous Recognition and Homography Extraction of Local Patches With a Simple Linear Classifier. In: BMVC (2008)
  28. Fitzgibbon, A.: Robust Registration of 2D and 3D Point Sets. In: BMVC (2001)
  29. Hinterstoisser, S., Ilic, S., Sturm, P., Navab, N., Fua, P., Lepetit, V.: Gradient Response Maps for Real-Time Detection of Texture-Less Objects. TPAMI (2012)
  30. Dalal, N., Triggs, B.: Histograms of Oriented Gradients for Human Detection. In: CVPR (2005)

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. CAMP, Technische Universität München (TUM), Germany
  2. Industrial Perception, Palo Alto, USA
  3. CV-Lab, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland
