Correspondence-free pose estimation for 3D objects from noisy depth data

  • Original Article
The Visual Computer

Abstract

Estimating the pose of objects from depth data is a problem of considerable practical importance for many vision applications. This paper presents an approach for accurate and efficient 3D pose estimation from noisy 2.5D depth images obtained from a consumer depth sensor. Initialized with a coarse pose estimate, the proposed approach applies a hypothesize-and-test scheme that combines stochastic optimization and graphics-based rendering to refine the initial pose so that it accurately accounts for the sensed depth image. Pose refinement employs particle swarm optimization to minimize an objective function that quantifies the misalignment between the acquired depth image and a rendered one, synthesized from a hypothesized pose with the aid of an object mesh model. No explicit correspondences between the depth data and the model need to be established, and pose hypothesis rendering and objective function evaluation are performed efficiently on the GPU. Extensive experimental results demonstrate the superior performance of the proposed approach compared to the ICP algorithm, which is typically used for pose refinement in depth images. Furthermore, the experiments indicate that its performance degrades gracefully under limited computational resources and that it is robust to noisy and reduced-polygon-count models, attesting to its suitability for use with automatically scanned object models and common graphics hardware.
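
The hypothesize-and-test loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the objective function, so the clamped mean absolute depth difference used here is an assumption, as is the use of the 20 mm threshold from the notes as the clamp value. The "renderer" is a hypothetical CPU toy that draws a sphere under orthographic projection, the pose is reduced to a 3D translation rather than a full 6-DoF pose, and the swarm parameters (inertia 0.72, acceleration 1.49) are common textbook choices, not values taken from the paper.

```python
import math
import random

D_T = 20.0  # mm; value from the notes, used here as a clamp (assumption)


def render_depth(pose, size=24, radius=40.0, pixel_mm=5.0):
    """Toy orthographic 'renderer': depth image of a sphere centred at pose (x, y, z), in mm.

    Stands in for GPU rendering of a mesh model; background pixels are None.
    """
    cx, cy, cz = pose
    half = size / 2.0
    image = []
    for row in range(size):
        for col in range(size):
            px = (col - half + 0.5) * pixel_mm
            py = (row - half + 0.5) * pixel_mm
            r2 = radius ** 2 - (px - cx) ** 2 - (py - cy) ** 2
            image.append(cz - math.sqrt(r2) if r2 >= 0.0 else None)
    return image


def misalignment(observed, rendered, d_t=D_T):
    """Mean clamped absolute depth difference; a pixel valid in only one image costs d_t."""
    total, count = 0.0, 0
    for o, r in zip(observed, rendered):
        if o is None and r is None:
            continue  # background in both images carries no information
        count += 1
        total += d_t if (o is None or r is None) else min(abs(o - r), d_t)
    return total / count if count else d_t


def pso_refine(observed, initial_pose, search_radius=15.0,
               particles=30, generations=40, seed=0):
    """Refine initial_pose by particle swarm optimization of the misalignment."""
    rng = random.Random(seed)
    w, c1, c2 = 0.72, 1.49, 1.49  # inertia, cognitive and social weights
    dim = len(initial_pose)

    def cost(pose):
        return misalignment(observed, render_depth(pose))

    # The first particle is the initial pose, so the result is never worse than it.
    positions = [list(initial_pose)] + [
        [v + rng.uniform(-search_radius, search_radius) for v in initial_pose]
        for _ in range(particles - 1)
    ]
    velocities = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in positions]
    best_cost = [cost(p) for p in positions]
    g = min(range(particles), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g][:], best_cost[g]

    for _ in range(generations):
        for i in range(particles):
            for k in range(dim):
                velocities[i][k] = (
                    w * velocities[i][k]
                    + c1 * rng.random() * (best_pos[i][k] - positions[i][k])
                    + c2 * rng.random() * (g_pos[k] - positions[i][k])
                )
                positions[i][k] += velocities[i][k]
            c = cost(positions[i])
            if c < best_cost[i]:
                best_cost[i], best_pos[i] = c, positions[i][:]
                if c < g_cost:
                    g_cost, g_pos = c, positions[i][:]
    return g_pos, g_cost


# Usage: synthesize a noisy observation at a known pose, then refine a coarse guess.
noise = random.Random(1)
true_pose = (5.0, -3.0, 600.0)
observed = [d + noise.gauss(0.0, 2.0) if d is not None else None
            for d in render_depth(true_pose)]
initial_pose = (0.0, 0.0, 610.0)
refined_pose, refined_cost = pso_refine(observed, initial_pose)
print(refined_pose, refined_cost)
```

Because every hypothesis is scored by comparing whole depth images, no point-to-point correspondences are computed at any step. Each cost evaluation is independent of the others within a generation, which is the property that makes the scheme amenable to the GPU evaluation the abstract mentions.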

Notes

  1. In our implementation, \(d_T\) is set to 20 mm, a value determined by considering the size of the search space for candidate poses and the sensor's sensitivity to depth changes.

  2. http://campar.in.tum.de/Main/StefanHinterstoisser.

  3. http://meshlab.sourceforge.net/.

Acknowledgments

This work was partially supported by the European Commission FP7 DARWIN Project, Grant No. 270138 and the Foundation for Research and Technology Hellas-Institute of Computer Science (FORTH-ICS) internal RTD Programme ‘Ambient Intelligence and Smart Environments’.

Author information

Corresponding author

Correspondence to Xenophon Zabulis.

About this article

Cite this article

Zabulis, X., Lourakis, M.I.A. & Koutlemanis, P. Correspondence-free pose estimation for 3D objects from noisy depth data. Vis Comput 34, 193–211 (2018). https://doi.org/10.1007/s00371-016-1326-9

  • DOI: https://doi.org/10.1007/s00371-016-1326-9
