International Journal of Computer Vision, Volume 46, Issue 1, pp 5–23

3D Object Recognition in Cluttered Environments by Segment-Based Stereo Vision

  • Yasushi Sumi
  • Yoshihiro Kawai
  • Takashi Yoshimi
  • Fumiaki Tomita


We propose a new method for 3D object recognition that uses segment-based stereo vision. An object is identified in a cluttered environment, and its position and orientation (6 degrees of freedom) are determined accurately enough for a robot to pick up the object and manipulate it. The object can be of any shape (planar figures, polyhedra, free-form objects) and may be partially occluded by other objects. Segment-based stereo vision is employed for 3D sensing. Both CAD-based and sensor-based object modeling subsystems are available. Matching is performed in three steps: candidates for the object position and orientation are calculated from local features, each candidate is verified, and the accuracy of the position and orientation is improved by an iterative method. Several experimental results are presented to demonstrate the usefulness of the proposed method.
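The three matching steps named in the abstract (generate pose candidates, verify each one, refine iteratively) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the iteration method, so an ICP-style nearest-point loop with an SVD-based least-squares rigid fit is swapped in as a stand-in, and the candidates are simply given rather than computed from local features. All function names, the thresholds `tol` and `min_score`, and the synthetic point data are assumptions made for illustration.

```python
import numpy as np

def rot_z(a):
    """Rotation by angle a (radians) about the z axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rigid_fit(src, dst):
    """Least-squares R, t such that R @ src[i] + t is close to dst[i] (SVD method)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)           # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T       # d guards against reflections
    return R, dst_c - R @ src_c

def nearest(points, scene):
    """Brute-force nearest scene point, and its distance, for every row of points."""
    d2 = ((points[:, None, :] - scene[None, :, :]) ** 2).sum(axis=2)
    idx = d2.argmin(axis=1)
    return scene[idx], np.sqrt(d2[np.arange(len(points)), idx])

def verify(model, scene, R, t, tol):
    """Fraction of posed model points that lie within tol of some scene point."""
    _, dist = nearest(model @ R.T + t, scene)
    return float((dist < tol).mean())

def refine(model, scene, R, t, tol, iters=20):
    """ICP-style refinement (assumed stand-in): match, keep inliers, refit."""
    for _ in range(iters):
        matched, dist = nearest(model @ R.T + t, scene)
        keep = dist < tol                         # ignore occluded / unmatched points
        if keep.sum() < 3:
            break
        R, t = rigid_fit(model[keep], matched[keep])
    return R, t

def recognize(model, scene, candidates, tol=0.15, min_score=0.5):
    """Verify each candidate pose, refine the survivors, return the best one."""
    best = None
    for R, t in candidates:
        if verify(model, scene, R, t, tol) < min_score:
            continue                              # candidate rejected
        R, t = refine(model, scene, R, t, tol)
        score = verify(model, scene, R, t, tol)
        if best is None or score > best[0]:
            best = (score, R, t)
    return best

# Synthetic example: a box-shaped point model placed in a scene with clutter.
xs, ys, zs = np.meshgrid([0, 0.5, 1.0, 1.5], [0, 0.5, 1.0], [0, 0.5], indexing="ij")
model = np.stack([xs.ravel(), ys.ravel(), zs.ravel()], axis=1)
R_true, t_true = rot_z(0.30), np.array([0.2, -0.1, 0.4])
clutter = np.random.default_rng(0).uniform(5.0, 6.0, size=(10, 3))
scene = np.vstack([model @ R_true.T + t_true, clutter])

candidates = [
    (np.eye(3), np.array([10.0, 10.0, 10.0])),       # wrong hypothesis
    (rot_z(0.25), np.array([0.22, -0.11, 0.41])),    # roughly right hypothesis
]
best = recognize(model, scene, candidates)
```

Here `best` holds a `(score, R, t)` tuple for the surviving candidate. Scoring by the fraction of matched model points, rather than requiring every point to match, is one simple way to tolerate partial occlusion; a real system would also need a faster nearest-neighbour search than the brute-force one used here.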

Keywords: 3D object recognition; free-form objects; segment-based stereo vision; 3D object modeling; 3D shape matching; robot vision





Copyright information

© Kluwer Academic Publishers 2002

Authors and Affiliations

  • Yasushi Sumi (1)
  • Yoshihiro Kawai (1)
  • Takashi Yoshimi (1)
  • Fumiaki Tomita (1)

  1. Intelligent Systems Institute, National Institute of Advanced Industrial Science and Technology (AIST), AIST Tsukuba Central, Ibaraki, Japan
