Coarse-to-fine Planar Regularization for Dense Monocular Depth Estimation

  • Stephan Liwicki
  • Christopher Zach
  • Ondrej Miksik
  • Philip H. S. Torr
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9906)


Simultaneous localization and mapping (SLAM) using the whole image data is an appealing framework for addressing the shortcomings of sparse feature-based methods, in particular their frequent failures in textureless environments. Hence, direct methods that bypass the need for feature extraction and matching have recently become popular. Many of these methods alternate between pose estimation and computing (semi-)dense depth maps, and therefore do not fully exploit the advantages of joint optimization with respect to depth and pose. In this work, we propose a framework for monocular SLAM, and its local model in particular, which optimizes simultaneously over depth and pose. In addition to a planarity-enforcing smoothness regularizer for the depth, we also constrain the complexity of depth map updates, which provides a natural way to avoid poor local minima and reduces the number of unknowns in the optimization. Starting from a holistic objective, we develop a method suitable for online and real-time monocular SLAM. We evaluate our method quantitatively in pose and depth on the TUM dataset, and qualitatively on our own video sequences.
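To make the idea of a planarity-enforcing smoothness term concrete, the following is a minimal sketch and not necessarily the regularizer used in the paper. It relies on a standard fact: under a pinhole camera, the inverse depth of a planar surface is an affine function of the pixel coordinates, so its second-order finite differences vanish. The function names `planarity_penalty` and `planar_inverse_depth`, the absolute-value penalty, and the toy map size are illustrative assumptions.

```python
def planarity_penalty(inv_depth):
    """Sum of absolute second-order finite differences of an inverse-depth map.

    inv_depth is a list of rows (list of lists of floats). The penalty is
    zero for any map that is affine in the pixel coordinates (u, v).
    Hypothetical helper for illustration only.
    """
    h = len(inv_depth)
    w = len(inv_depth[0])
    total = 0.0
    for v in range(h):
        for u in range(w):
            # Second difference along u (horizontal direction).
            if 0 < u < w - 1:
                total += abs(inv_depth[v][u - 1] - 2.0 * inv_depth[v][u]
                             + inv_depth[v][u + 1])
            # Second difference along v (vertical direction).
            if 0 < v < h - 1:
                total += abs(inv_depth[v - 1][u] - 2.0 * inv_depth[v][u]
                             + inv_depth[v + 1][u])
            # Mixed difference over the 2x2 cell with top-left corner (u, v).
            if u < w - 1 and v < h - 1:
                total += abs(inv_depth[v][u] - inv_depth[v][u + 1]
                             - inv_depth[v + 1][u] + inv_depth[v + 1][u + 1])
    return total


def planar_inverse_depth(h, w, a, b, c):
    """Inverse depth of a plane seen by a pinhole camera: a*u + b*v + c."""
    return [[a * u + b * v + c for u in range(w)] for v in range(h)]


# A slanted plane (assumed toy coefficients) and a copy with one pixel
# pushed off the plane.
plane = planar_inverse_depth(6, 8, 0.25, -0.125, 0.5)
bumpy = [row[:] for row in plane]
bumpy[3][4] += 0.1

penalty_plane = planarity_penalty(plane)
penalty_bumpy = planarity_penalty(bumpy)
print(penalty_plane, penalty_bumpy)
```

A slanted plane incurs no penalty under this term, whereas a first-order (total variation) smoothness term would penalize it and favor fronto-parallel surfaces.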


SLAM · Monocular odometry · Dense tracking and mapping



O. Miksik is supported by Technicolor. P. Torr wishes to acknowledge the support of ERC grant ERC-2012-AdG 321162-HELIOS, EPSRC/MURI grant ref EP/N019474/1, EPSRC grant EP/M013774/1, EPSRC Programme Grant Seebibyte EP/M013774/1.

Supplementary material

Supplementary material 1 (mp4 26828 KB)



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Stephan Liwicki (1, corresponding author)
  • Christopher Zach (1)
  • Ondrej Miksik (2)
  • Philip H. S. Torr (2)
  1. Toshiba Research Europe, Cambridge, UK
  2. University of Oxford, Oxford, UK
