International Journal of Computer Vision, Volume 113, Issue 3, pp 233–245

Efficient Dense Rigid-Body Motion Segmentation and Estimation in RGB-D Video

Abstract

Motion is a fundamental grouping cue in video. Many current approaches to motion segmentation in monocular or stereo image sequences either rely on sparse interest points or are dense but computationally demanding. We propose an efficient expectation–maximization (EM) framework for dense 3D segmentation of moving rigid parts in RGB-D video. Our approach segments images into pixel regions that undergo coherent 3D rigid-body motion. Our formulation treats background and foreground objects equally and makes no assumptions about the motion of the camera or the objects beyond rigidity. While our EM formulation is not restricted to a specific image representation, we supplement it with an efficient image representation and registration method for rapid segmentation of RGB-D video. In experiments, we demonstrate that our approach recovers segmentation and 3D motion with good precision.
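To make the EM idea in the abstract concrete, the following is a minimal sketch of alternating between soft assignment of points to rigid motions and re-estimation of each motion. It is not the authors' formulation: it assumes that 3D point correspondences between two frames are already given, models residuals with an isotropic Gaussian of hypothetical width `sigma`, and uses no spatial regularization or image representation. The function names `fit_rigid` and `em_rigid_segmentation` and all parameters are illustrative assumptions.

```python
import numpy as np


def fit_rigid(P, Q, w):
    """Weighted Kabsch fit: R, t minimizing sum_i w_i * ||R @ P[i] + t - Q[i]||^2."""
    w = w / w.sum()
    p_bar = w @ P                      # weighted centroid of source points
    q_bar = w @ Q                      # weighted centroid of target points
    H = (P - p_bar).T @ (w[:, None] * (Q - q_bar))  # 3x3 weighted covariance
    U, _, Vt = np.linalg.svd(H)
    # Guard against reflections so that det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_bar - R @ p_bar
    return R, t


def em_rigid_segmentation(P, Q, resp0, sigma=0.05, iters=20):
    """Toy EM over K rigid motions (assumed setup, see lead-in).

    P, Q   : (N, 3) corresponding 3D points in two frames
    resp0  : (N, K) initial soft assignments, rows summing to 1
    sigma  : assumed standard deviation of the residual model
    """
    resp = np.asarray(resp0, dtype=float)
    K = resp.shape[1]
    for _ in range(iters):
        # M-step: one weighted rigid fit per segment
        # (small epsilon keeps the weights strictly positive).
        motions = [fit_rigid(P, Q, resp[:, k] + 1e-12) for k in range(K)]
        # E-step: responsibilities from squared residuals under each motion.
        res2 = np.stack(
            [np.sum((P @ R.T + t - Q) ** 2, axis=1) for R, t in motions],
            axis=1,
        )
        log_r = -0.5 * res2 / sigma**2
        log_r -= log_r.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)
    return resp.argmax(axis=1), motions, resp
```

Like any EM scheme, this sketch only finds a local optimum, so the result depends on the initial assignments `resp0`. The number of segments `K` is fixed by the shape of `resp0` here, which is a simplification made for the example.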

Keywords

Motion segmentation, Rigid multi-body registration, Multibody structure-from-motion

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Computer Science Institute VI, University of Bonn, Bonn, Germany
