Improving outdoor plane estimation without manual supervision

  • Original Paper
Signal, Image and Video Processing

Abstract

Recently, great progress has been made in the automatic detection and segmentation of planar regions from monocular images of indoor scenes. This has been achieved thanks to the development of convolutional neural network architectures for the task and the availability of large amounts of training data, usually obtained with the help of active depth sensors. Unfortunately, it is much harder to obtain large image sets outdoors, partly due to the limited range of active sensors. Therefore, there is a need to develop techniques that transfer features learned from indoor datasets to the segmentation of outdoor images. We propose such an approach that does not require manual annotations on the outdoor datasets. Instead, we exploit a network trained on indoor images and an automatically reconstructed point cloud to estimate the training ground truth on the outdoor images in an energy minimization framework. We show that the resulting ground truth estimate is good enough to improve the network weights. Moreover, the process can be repeated multiple times to further improve plane detection and segmentation accuracy on monocular images of outdoor scenes.
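
As a rough illustration of how network predictions and a reconstructed point cloud could be combined in an energy, the sketch below assigns each image region the plane label with the lowest cost. It is not the energy used in the paper, which is not given in this preview. The function names, the `weight` parameter, and the form of the cost (negative log network probability plus mean point-to-plane distance) are all assumptions made for illustration, and the pairwise smoothness term that a full energy minimization framework would normally include is omitted.

```python
import math

def point_plane_distance(point, plane):
    """Distance from a 3D point to a plane (n, d) with unit normal n, where n.x + d = 0."""
    (nx, ny, nz), d = plane
    x, y, z = point
    return abs(nx * x + ny * y + nz * z + d)

def unary_energy(net_prob, points, plane, weight=1.0, eps=1e-6):
    """Hypothetical unary cost: network term plus geometric term (an assumption)."""
    network_term = -math.log(max(net_prob, eps))
    if not points:
        # No reconstructed 3D points in this region: rely on the network alone.
        return network_term
    mean_dist = sum(point_plane_distance(p, plane) for p in points) / len(points)
    return network_term + weight * mean_dist

def estimate_labels(net_probs, region_points, planes, weight=1.0):
    """Pick, for each region, the plane index with the lowest unary energy."""
    labels = []
    for probs, points in zip(net_probs, region_points):
        energies = [unary_energy(probs[k], points, planes[k], weight)
                    for k in range(len(planes))]
        labels.append(min(range(len(planes)), key=energies.__getitem__))
    return labels

# Toy scene: a ground plane z = 0 and a wall x = 5.
planes = [((0.0, 0.0, 1.0), 0.0), ((1.0, 0.0, 0.0), -5.0)]
# Per-region plane probabilities from a (hypothetical) indoor-trained network.
net_probs = [[0.6, 0.4], [0.7, 0.3], [0.2, 0.8]]
# Reconstructed 3D points that project into each region (region 2 has none).
region_points = [
    [(1.0, 2.0, 0.0), (3.0, 1.0, 0.02)],
    [(5.0, 0.0, 1.0), (5.01, 2.0, 3.0)],
    [],
]
labels = estimate_labels(net_probs, region_points, planes)
print(labels)  # [0, 1, 1]
```

In the second region the network favours the ground plane, but the 3D points lie on the wall, so the geometric term overrides the network. This is the kind of correction that could make the estimated labels useful for retraining, and repeating the estimate with the retrained network would correspond to the iterative refinement the abstract describes.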

Author information

Corresponding author

Correspondence to Mustafa Özuysal.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Uzyıldırım, F.E., Özuysal, M. Improving outdoor plane estimation without manual supervision. SIViP 16, 1–9 (2022). https://doi.org/10.1007/s11760-021-01996-1

  • DOI: https://doi.org/10.1007/s11760-021-01996-1
