Autonomous Robots, Volume 40, Issue 5, pp 805–829

Unsupervised segmentation of unknown objects in complex environments

  • Umar Asif
  • Mohammed Bennamoun
  • Ferdous Sohel
Article

Abstract

This paper presents a novel object segmentation approach for highly complex indoor scenes. Our approach starts with a novel algorithm that partitions the scene into distinct regions whose boundaries accurately conform to the physical object boundaries in the scene. Next, we propose a perceptual grouping algorithm based on local cues (e.g., 3D proximity, co-planarity, and shape convexity) to merge these regions into object hypotheses. Our extensive experimental evaluations demonstrate that our object segmentation results are superior to those of state-of-the-art methods.
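
To make the grouping idea concrete, the following is a minimal illustrative sketch, not the authors' algorithm. It assumes each region is summarised by a hypothetical `centroid` and unit `normal`, and it merges two regions when they are close in 3D and either co-planar or joined by a convex connection. The thresholds, the region representation, the function names, and the specific normal-based convexity test are all assumptions made for illustration.

```python
import math


def _sub(a, b):
    """Component-wise difference of two 3D tuples."""
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    """Dot product of two 3D tuples."""
    return sum(x * y for x, y in zip(a, b))


def should_merge(r1, r2, max_dist=0.1, angle_tol_deg=10.0, plane_tol=0.01):
    """Decide whether two regions belong together (hypothetical rule).

    Each region is a dict with a 'centroid' and a unit 'normal'.
    All thresholds are illustrative assumptions, in metres and degrees.
    """
    d = _sub(r1["centroid"], r2["centroid"])

    # Cue 1: 3D proximity between region centroids.
    if math.sqrt(_dot(d, d)) > max_dist:
        return False

    n1, n2 = r1["normal"], r2["normal"]

    # Cue 2: co-planarity (parallel normals, small offset along the normal).
    parallel = abs(_dot(n1, n2)) >= math.cos(math.radians(angle_tol_deg))
    coplanar = parallel and abs(_dot(n1, d)) <= plane_tol

    # Cue 3: convexity (normals open away from each other across the joint).
    convex = _dot(_sub(n1, n2), d) > 0.0

    return coplanar or convex


def group_regions(regions, **thresholds):
    """Merge regions into object hypotheses with a union-find structure.

    Returns one label per region; equal labels mean the same hypothesis.
    """
    parent = list(range(len(regions)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(regions)):
        for j in range(i + 1, len(regions)):
            if should_merge(regions[i], regions[j], **thresholds):
                parent[find(j)] = find(i)

    return [find(i) for i in range(len(regions))]


# Two faces of a small box plus one distant patch.
box_top = {"centroid": (0.0, 0.0, 0.05), "normal": (0.0, 0.0, 1.0)}
box_side = {"centroid": (0.05, 0.0, 0.0), "normal": (1.0, 0.0, 0.0)}
far_patch = {"centroid": (1.0, 0.0, 0.0), "normal": (0.0, 0.0, 1.0)}
labels = group_regions([box_top, box_side, far_patch])
```

In this toy scene the two box faces share a label because their connection is convex, while the distant patch keeps its own label. A pairwise test over all regions is quadratic in the number of regions; a practical system would restrict it to adjacent regions.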

Keywords

3D object segmentation, Object localization, Object detection

Notes

Acknowledgments

This work was supported by Australian Research Council Grants: DP150100294, DP110102166, DE120102960.

Supplementary material

Supplementary material 1 (mp4 37626 KB)

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. School of CSSE, The University of Western Australia, Perth, Australia
  2. School of Engineering and Information Technology, Murdoch University, Perth, Australia
