Basic 3D Solid Recognition in RGB-D Images

  • Tomasz Kornuta
  • Maciej Stefańczyk
  • Włodzimierz Kasprzak
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 267)


The paper deals with the recognition of 3D objects for the purpose of their subsequent grasping and manipulation by a two-handed robot. We describe the idea of a general framework for object recognition rooted in the compositional model of the world. This approach treats complex objects as entities constructed of simpler, elementary ones, termed solids. In particular, we focus on the recognition of two types of such solids: cuboids and generalized cones. We present the details of the recognition procedures, starting from the low-level processing of RGB-D images and ending with the generation of hypotheses regarding the presence and parameters of those types of solids.


RGB-D images, object recognition, recognition-by-parts, object primitives, solids, cuboids, generalized cones





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Tomasz Kornuta (1)
  • Maciej Stefańczyk (1)
  • Włodzimierz Kasprzak (1)
  1. Institute of Control and Computation Engineering, Warsaw University of Technology, Warsaw, Poland
