Unsupervised Dense Object Discovery, Detection, Tracking and Reconstruction

  • Lu Ma
  • Gabe Sibley
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8690)


In this paper, we present an unsupervised framework for discovering, detecting, tracking, and reconstructing dense objects from a video sequence. The system simultaneously localizes a moving camera and discovers a set of shape and appearance models for multiple objects, including the scene background. Each object model is represented by both a 2D and a 3D level-set. This representation is used to improve detection, 2D tracking, 3D registration and, importantly, subsequent updates to the level-set itself. This single framework performs dense simultaneous localization and mapping as well as unsupervised object discovery. At each iteration, portions of the scene that fail to track, such as bulk outliers on moving rigid bodies, are used either to seed models for new objects or to update models of known objects. For the latter, once an object is successfully tracked in 2D with the aid of a 2D level-set segmentation, the level-set is updated and then used to aid registration and evolution of a 3D level-set that captures shape information. For a known object, whether learned by our system or introduced from a third-party library, our framework can detect similar appearances and geometries in the scene. The system is tested using single- and multiple-object data sets. Results demonstrate an improved method for discovering and reconstructing 2D and 3D object models, which aid tracking even under significant occlusion or rapid motion.
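As a rough illustration of the seed-or-update step described above, the following Python sketch represents a 2D object model as a signed-distance level-set and routes a mask of untracked (outlier) pixels either to an existing model or to a new one. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the sign convention (negative inside the object), and the `min_overlap` threshold are all hypothetical, and the abstract does not specify the actual decision rule.

```python
import numpy as np


def circle_level_set(shape, center, radius):
    """Signed-distance embedding of a circle.

    Assumed convention: negative inside, zero on the contour, positive outside.
    """
    rows, cols = np.indices(shape)
    dist = np.hypot(rows - center[0], cols - center[1])
    return dist - radius


def route_outliers(outlier_mask, object_level_sets, min_overlap=0.5):
    """Decide what to do with pixels that failed to track.

    Returns ("update", i) if at least `min_overlap` of the outlier pixels lie
    inside known object i, ("seed", None) if they should start a new model,
    and ("none", None) if there are no outliers. The threshold rule is a
    hypothetical stand-in for whatever criterion the real system uses.
    """
    n_outliers = int(outlier_mask.sum())
    if n_outliers == 0:
        return ("none", None)

    best_idx, best_frac = None, 0.0
    for idx, phi in enumerate(object_level_sets):
        inside = phi < 0  # interior of the zero level-set
        frac = np.logical_and(outlier_mask, inside).sum() / n_outliers
        if frac > best_frac:
            best_idx, best_frac = idx, frac

    if best_frac >= min_overlap:
        return ("update", best_idx)
    return ("seed", None)


# Example: one known object, two different outlier blobs.
phi = circle_level_set((64, 64), center=(32, 32), radius=10)

on_object = np.zeros((64, 64), dtype=bool)
on_object[30:35, 30:35] = True  # blob inside the known object

off_object = np.zeros((64, 64), dtype=bool)
off_object[0:5, 0:5] = True  # blob far from any known object

decision_on = route_outliers(on_object, [phi])
decision_off = route_outliers(off_object, [phi])
```

Here the first blob falls entirely inside the known contour and is routed to an update of that model, while the second overlaps no known object and seeds a new one. A full system would also evolve the level-set after each update, which this sketch leaves out.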


Structure From Motion, SLAM, 3D Tracking, 3D Reconstruction, Dense Reconstruction, Learning, Level-Set Evolution



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Lu Ma (1)
  • Gabe Sibley (1)
  1. Autonomous Robotics and Perception Group, The George Washington University, Washington DC, USA
