Machine Vision and Applications, Volume 21, Issue 5, pp 749–766

Fast and automatic object pose estimation for range images on the GPU

  • In Kyu Park
  • Marcel Germann
  • Michael D. Breitenstein
  • Hanspeter Pfister
Original Paper

Abstract

We present a pose estimation method for rigid objects from single range images. Using 3D models of the objects, many pose hypotheses are compared in a data-parallel version of the downhill simplex algorithm with an image-based error function. The pose hypothesis with the lowest error value yields the pose estimate (location and orientation), which is then refined using ICP. The algorithm is designed especially for implementation on the GPU. It is completely automatic, fast, robust to occlusion and cluttered scenes, and scales with the number of different object types. We apply the system to bin picking, and evaluate it on cluttered scenes. Comprehensive experiments on challenging synthetic and real-world data demonstrate the effectiveness of our method.
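The hypothesis search summarized above can be illustrated with a small CPU sketch. It is not the authors' GPU implementation: the 2D point-set error `pose_error`, the three-parameter pose `(tx, ty, theta)`, the hypothesis grid, and all function names are assumptions made for illustration, and the ICP refinement step is omitted. It only shows the idea of running downhill simplex (Nelder-Mead) from many starting poses and keeping the result with the lowest error.

```python
import math

def pose_error(pose, model, scene):
    """Toy stand-in for the image-based error (an assumption): sum of squared
    distances between transformed model points and known scene correspondences."""
    tx, ty, theta = pose
    c, s = math.cos(theta), math.sin(theta)
    return sum((c * mx - s * my + tx - sx) ** 2 + (s * mx + c * my + ty - sy) ** 2
               for (mx, my), (sx, sy) in zip(model, scene))

def downhill_simplex(f, start, step=0.5, iterations=500):
    """Simplified Nelder-Mead: reflection, expansion, inside contraction, shrink."""
    n = len(start)
    simplex = [list(start)]
    for i in range(n):
        vertex = list(start)
        vertex[i] += step
        simplex.append(vertex)
    values = [f(v) for v in simplex]
    for _ in range(iterations):
        # Sort vertices from best (lowest error) to worst.
        order = sorted(range(n + 1), key=lambda k: values[k])
        simplex = [simplex[k] for k in order]
        values = [values[k] for k in order]
        centroid = [sum(v[j] for v in simplex[:-1]) / n for j in range(n)]
        worst = simplex[-1]

        def along(t):
            # Point at centroid + t * (centroid - worst).
            return [cj + t * (cj - wj) for cj, wj in zip(centroid, worst)]

        reflected = along(1.0)
        fr = f(reflected)
        if fr < values[0]:
            expanded = along(2.0)
            fe = f(expanded)
            if fe < fr:
                simplex[-1], values[-1] = expanded, fe
            else:
                simplex[-1], values[-1] = reflected, fr
        elif fr < values[-2]:
            simplex[-1], values[-1] = reflected, fr
        else:
            contracted = along(-0.5)
            fc = f(contracted)
            if fc < values[-1]:
                simplex[-1], values[-1] = contracted, fc
            else:
                # Shrink every vertex halfway toward the best one.
                best = simplex[0]
                simplex = [best] + [[b + 0.5 * (x - b) for b, x in zip(best, v)]
                                    for v in simplex[1:]]
                values = [values[0]] + [f(v) for v in simplex[1:]]
    k = min(range(n + 1), key=lambda i: values[i])
    return simplex[k], values[k]

def estimate_pose(model, scene, hypotheses):
    """Optimize every hypothesis independently and keep the lowest-error result."""
    results = [downhill_simplex(lambda p: pose_error(p, model, scene), h)
               for h in hypotheses]
    return min(results, key=lambda r: r[1])

# Synthetic example: a small 2D model placed at a known (hypothetical) pose.
model = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (0.0, 1.0)]
true_pose = (0.7, -0.3, 0.4)
c0, s0 = math.cos(true_pose[2]), math.sin(true_pose[2])
scene = [(c0 * x - s0 * y + true_pose[0], s0 * x + c0 * y + true_pose[1])
         for x, y in model]
hypotheses = [(x, y, th) for x in (-1.0, 1.0) for y in (-1.0, 1.0)
              for th in (0.0, 1.5, 3.0, 4.5)]
best_pose, best_error = estimate_pose(model, scene, hypotheses)
```

Because each hypothesis is optimized independently, the loop in `estimate_pose` is the part that maps naturally onto one GPU thread or thread block per hypothesis in a data-parallel setting.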

Keywords

Object pose estimation; Bin picking; Range image processing; General purpose GPU programming; Iterative closest point; Euclidean distance transform; Downhill simplex; CUDA

Copyright information

© Springer-Verlag 2009

Authors and Affiliations

  • In Kyu Park (1)
  • Marcel Germann (2)
  • Michael D. Breitenstein (3)
  • Hanspeter Pfister (4)

  1. School of Information and Communication Engineering, Inha University, Incheon, Korea
  2. Computer Graphics Lab., Swiss Federal Institute of Technology (ETH), Zurich, Switzerland
  3. Computer Vision Lab., Swiss Federal Institute of Technology (ETH), Zurich, Switzerland
  4. School of Engineering and Applied Sciences, Harvard University, Cambridge, USA
