Skip to main content

Monitored Distillation for Positive Congruent Depth Completion

  • Conference paper
  • First Online:
Computer Vision – ECCV 2022 (ECCV 2022)

Abstract

We propose a method to infer a dense depth map from a single image, its calibration, and the associated sparse point cloud. In order to leverage existing models (teachers) that produce putative depth maps, we propose an adaptive knowledge distillation approach that yields a positive congruent training process, wherein a student model avoids learning the error modes of the teachers. In the absence of ground truth for model selection and training, our method, termed Monitored Distillation, allows a student to exploit a blind ensemble of teachers by selectively learning from predictions that best minimize the reconstruction error for a given image. Monitored Distillation yields a distilled depth map and a confidence map, or “monitor”, for how well a prediction from a particular teacher fits the observed image. The monitor adaptively weights the distilled depth where if all of the teachers exhibit high residuals, the standard unsupervised image reconstruction loss takes over as the supervisory signal. On indoor scenes (VOID), we outperform blind ensembling baselines by 17.53% and unsupervised methods by 24.25%; we boast a 79% model size reduction while maintaining comparable performance to the best supervised method. For outdoors (KITTI), we tie for 5th overall on the benchmark despite not using ground truth. Code available at: https://github.com/alexklwong/mondi-python.

T. Y. Liu, P. Agrawal and A. Chen—denotes equal contribution.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 89.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 119.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

References

  1. Chao, C.H., Cheng, B.W., Lee, C.Y.: Rethinking ensemble-distillation for semantic segmentation based unsupervised domain adaption. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2610–2620 (2021)

    Google Scholar 

  2. Chawla, A., Yin, H., Molchanov, P., Alvarez, J.: Data-free knowledge distillation for object detection. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 3289–3298 (2021)

    Google Scholar 

  3. Chen, G., Choi, W., Yu, X., Han, T., Chandraker, M.: Learning efficient object detection models with knowledge distillation. In: Advances in neural information processing systems, vol. 30 (2017)

    Google Scholar 

  4. Chen, L., Yu, C., Chen, L.: A new knowledge distillation for incremental object detection. In: 2019 International Joint Conference on Neural Networks (IJCNN), pp. 1–7. IEEE (2019)

    Google Scholar 

  5. Chen, Y., Yang, B., Liang, M., Urtasun, R.: Learning joint 2D-3D representations for depth completion. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 10023–10032 (2019)

    Google Scholar 

  6. Cheng, X., Wang, P., Guan, C., Yang, R.: CSPN++: learning context and resource aware convolutional spatial propagation networks for depth completion. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 10615–10622 (2020)

    Google Scholar 

  7. Cheng, X., Wang, P., Yang, R.: Depth estimation via affinity learned with convolutional spatial propagation network. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 103–119 (2018)

    Google Scholar 

  8. Chodosh, N., Wang, C., Lucey, S.: Deep convolutional compressed sensing for LiDAR depth completion. In: Jawahar, C.V., Li, H., Mori, G., Schindler, K. (eds.) ACCV 2018. LNCS, vol. 11361, pp. 499–513. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-20887-5_31

    Chapter  Google Scholar 

  9. Choi, K., Jeong, S., Kim, Y., Sohn, K.: Stereo-augmented depth completion from a single RGB-LiDAR image. In: 2021 IEEE International Conference on Robotics and Automation (ICRA), pp. 13641–13647. IEEE (2021)

    Google Scholar 

  10. Dimitrievski, M., Veelaert, P., Philips, W.: Learning morphological operators for depth completion. In: Blanc-Talon, J., Helbert, D., Philips, W., Popescu, D., Scheunders, P. (eds.) ACIVS 2018. LNCS, vol. 11182, pp. 450–461. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01449-0_38

    Chapter  Google Scholar 

  11. Eldesokey, A., Felsberg, M., Holmquist, K., Persson, M.: Uncertainty-aware CNNs for depth completion: Uncertainty from beginning to end. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12014–12023 (2020)

    Google Scholar 

  12. Eldesokey, A., Felsberg, M., Khan, F.S.: Propagating confidences through CNNs for sparse data regression. In: Proceedings of British Machine Vision Conference (BMVC) (2018)

    Google Scholar 

  13. Fei, X., Wong, A., Soatto, S.: Geo-supervised visual depth prediction. IEEE Robot. Autom. Lett. 4(2), 1661–1668 (2019)

    Article  Google Scholar 

  14. Fukuda, T., Suzuki, M., Kurata, G., Thomas, S., Cui, J., Ramabhadran, B.: Efficient knowledge distillation from an ensemble of teachers. In: Interspeech, pp. 3697–3701 (2017)

    Google Scholar 

  15. Gofer, E., Praisler, S., Gilboa, G.: Adaptive LiDAR sampling and depth completion using ensemble variance. IEEE Trans. Image Process. 30, 8900–8912 (2021)

    Article  Google Scholar 

  16. Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015)

  17. Hong, B.-W., Koo, J.-K., Dirks, H., Burger, M.: Adaptive regularization in convex composite optimization for variational imaging problems. In: Roth, V., Vetter, T. (eds.) GCPR 2017. LNCS, vol. 10496, pp. 268–280. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66709-6_22

    Chapter  Google Scholar 

  18. Hong, B.W., Koo, J., Burger, M., Soatto, S.: Adaptive regularization of some inverse problems in image analysis. IEEE Trans. Image Process. 29, 2507–2521 (2019)

    Article  MathSciNet  MATH  Google Scholar 

  19. Howard, A.G., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017)

  20. Hu, J., et al.: Boosting light-weight depth estimation via knowledge distillation. arXiv preprint arXiv:2105.06143 (2021)

  21. Hu, M., Wang, S., Li, B., Ning, S., Fan, L., Gong, X.: PENet: towards precise and efficient image guided depth completion. arXiv preprint arXiv:2103.00783 (2021)

  22. Huang, Z., Fan, J., Cheng, S., Yi, S., Wang, X., Li, H.: HMS-net: hierarchical multi-scale sparsity-invariant network for sparse depth completion. IEEE Trans. Image Process. 29, 3429–3441 (2019)

    Article  MATH  Google Scholar 

  23. Hwang, S., Lee, J., Kim, W.J., Woo, S., Lee, K., Lee, S.: LiDAR depth completion using color-embedded information via knowledge distillation. IEEE Trans. Intell. Transp. Syst. (2021)

    Google Scholar 

  24. Jaritz, M., De Charette, R., Wirbel, E., Perrotton, X., Nashashibi, F.: Sparse and dense data with CNNs: depth completion and semantic segmentation. In: 2018 International Conference on 3D Vision (3DV), pp. 52–60. IEEE (2018)

    Google Scholar 

  25. Jin, H., Soatto, S., Yezzi, A.J.: Multi-view stereo beyond lambert. In: 2003 Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, p. I. IEEE (2003)

    Google Scholar 

  26. Kang, J., Gwak, J.: Ensemble learning of lightweight deep learning models using knowledge distillation for image classification. Mathematics 8(10), 1652 (2020)

    Article  Google Scholar 

  27. Ku, J., Harakeh, A., Waslander, S.L.: In defense of classical image processing: Fast depth completion on the CPU. In: 2018 15th Conference on Computer and Robot Vision (CRV), pp. 16–22. IEEE (2018)

    Google Scholar 

  28. Lan, X., Zhu, X., Gong, S.: Knowledge distillation by on-the-fly native ensemble. arXiv preprint arXiv:1806.04606 (2018)

  29. Li, A., Yuan, Z., Ling, Y., Chi, W., Zhang, C., et al.: A multi-scale guided cascade hourglass network for depth completion. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 32–40 (2020)

    Google Scholar 

  30. Liu, T.Y., Agrawal, P., Chen, A., Hong, B.W., Wong, A.: Monitored distillation for positive congruent depth completion. arXiv preprint arXiv:2203.16034 (2022)

  31. Liu, Y., Chen, K., Liu, C., Qin, Z., Luo, Z., Wang, J.: Structured knowledge distillation for semantic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2604–2613 (2019)

    Google Scholar 

  32. Liu, Y., Shu, C., Wang, J., Shen, C.: Structured knowledge distillation for dense prediction. IEEE Trans. Pattern Anal. Mach. Intell. (2020)

    Google Scholar 

  33. Liu, Y., Sheng, L., Shao, J., Yan, J., Xiang, S., Pan, C.: Multi-label image classification via knowledge distillation from weakly-supervised detection. In: Proceedings of the 26th ACM International Conference on Multimedia, pp. 700–708 (2018)

    Google Scholar 

  34. Lopez-Rodriguez, A., Busam, B., Mikolajczyk, K.: Project to adapt: domain adaptation for depth completion from noisy and sparse sensor data. In: Proceedings of the Asian Conference on Computer Vision (2020)

    Google Scholar 

  35. Ma, F., Cavalheiro, G.V., Karaman, S.: Self-supervised sparse-to-dense: self-supervised depth completion from lidar and monocular camera. In: International Conference on Robotics and Automation (ICRA), pp. 3288–3295. IEEE (2019)

    Google Scholar 

  36. McCormac, J., Handa, A., Leutenegger, S., Davison, A.J.: SceneNet RGB-D: can 5m synthetic images beat generic ImageNet pre-training on indoor segmentation? In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2678–2687 (2017)

    Google Scholar 

  37. Merrill, N., Geneva, P., Huang, G.: Robust monocular visual-inertial depth completion for embedded systems. In: International Conference on Robotics and Automation (ICRA). IEEE (2021)

    Google Scholar 

  38. Michieli, U., Zanuttigh, P.: Knowledge distillation for incremental learning in semantic segmentation. Comput. Vis. Image Underst. 205, 103167 (2021)

    Google Scholar 

  39. Park, J., Joo, K., Hu, Z., Liu, C.-K., So Kweon, I.: Non-local spatial propagation network for depth completion. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12358, pp. 120–136. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58601-0_8

    Chapter  Google Scholar 

  40. Park, S., Heo, Y.S.: Knowledge distillation for semantic segmentation using channel and spatial correlations and adaptive cross entropy. Sensors 20(16), 4616 (2020)

    Article  Google Scholar 

  41. Pilzer, A., Lathuiliere, S., Sebe, N., Ricci, E.: Refine and distill: exploiting cycle-inconsistency and knowledge distillation for unsupervised monocular depth estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9768–9777 (2019)

    Google Scholar 

  42. Qiu, J., et al.: DeepLiDAR: deep surface normal guided depth prediction for outdoor scene from sparse lidar data and single color image. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3313–3322 (2019)

    Google Scholar 

  43. Qu, C., Liu, W., Taylor, C.J.: Bayesian deep basis fitting for depth completion with uncertainty. arXiv preprint arXiv:2103.15254 (2021)

  44. Qu, C., Nguyen, T., Taylor, C.: Depth completion via deep basis fitting. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 71–80 (2020)

    Google Scholar 

  45. Romero, A., Ballas, N., Kahou, S.E., Chassang, A., Gatta, C., Bengio, Y.: FitNets: hints for thin deep nets. arXiv preprint arXiv:1412.6550 (2014)

  46. Shivakumar, S.S., Nguyen, T., Miller, I.D., Chen, S.W., Kumar, V., Taylor, C.J.: DfuseNet: deep fusion of RGB and sparse depth information for image guided dense depth completion. In: 2019 IEEE Intelligent Transportation Systems Conference (ITSC), pp. 13–20. IEEE (2019)

    Google Scholar 

  47. Silberman, N., Hoiem, D., Kohli, P., Fergus, R.: Indoor segmentation and support inference from RGBD images. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012. LNCS, vol. 7576, pp. 746–760. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-33715-4_54

    Chapter  Google Scholar 

  48. Traganitis, P.A., Giannakis, G.B.: Blind multi-class ensemble learning with dependent classifiers. In: 2018 26th European Signal Processing Conference (EUSIPCO), pp. 2025–2029. IEEE (2018)

    Google Scholar 

  49. Uhrig, J., Schneider, N., Schneider, L., Franke, U., Brox, T., Geiger, A.: Sparsity invariant CNNs. In: 2017 International Conference on 3D Vision (3DV), pp. 11–20. IEEE (2017)

    Google Scholar 

  50. Van Gansbeke, W., Neven, D., De Brabandere, B., Van Gool, L.: Sparse and noisy lidar completion with RGB guidance and uncertainty. In: 2019 16th International Conference on Machine Vision Applications (MVA), pp. 1–6. IEEE (2019)

    Google Scholar 

  51. Walawalkar, D., Shen, Z., Savvides, M.: Online ensemble model compression using knowledge distillation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12364, pp. 18–35. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58529-7_2

    Chapter  Google Scholar 

  52. Wang, Y., Li, X., Shi, M., Xian, K., Cao, Z.: Knowledge distillation for fast and accurate monocular depth estimation on mobile devices. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2457–2465 (2021)

    Google Scholar 

  53. Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004)

    Article  Google Scholar 

  54. Wong, A., Cicek, S., Soatto, S.: Learning topology from synthetic data for unsupervised depth completion. IEEE Robot. Autom. Lett. 6(2), 1495–1502 (2021)

    Article  Google Scholar 

  55. Wong, A., Fei, X., Hong, B.W., Soatto, S.: An adaptive framework for learning unsupervised depth completion. IEEE Robot. Autom. Lett. 6(2), 3120–3127 (2021)

    Article  Google Scholar 

  56. Wong, A., Fei, X., Tsuei, S., Soatto, S.: Unsupervised depth completion from visual inertial odometry. IEEE Robot. Autom. Lett. 5, 1899–1906 (2020)

    Article  Google Scholar 

  57. Wong, A., Soatto, S.: Bilateral cyclic constraint and adaptive regularization for unsupervised monocular depth prediction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5644–5653 (2019)

    Google Scholar 

  58. Wong, A., Soatto, S.: Unsupervised depth completion with calibrated backprojection layers. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 12747–12756 (2021)

    Google Scholar 

  59. Xiang, L., Ding, G., Han, J.: Learning from multiple experts: self-paced knowledge distillation for long-tailed classification. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12350, pp. 247–263. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58558-7_15

    Chapter  Google Scholar 

  60. Xu, Y., Zhu, X., Shi, J., Zhang, G., Bao, H., Li, H.: Depth completion from sparse LiDAR data with depth-normal constraints. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2811–2820 (2019)

    Google Scholar 

  61. Yan, S., et al.: Positive-congruent training: Towards regression-free model updates. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14299–14308 (2021)

    Google Scholar 

  62. Yang, Y., Wong, A., Soatto, S.: Dense depth posterior (DDP) from single image and sparse range. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3353–3362 (2019)

    Google Scholar 

  63. Zhang, Y., Funkhouser, T.: Deep depth completion of a single RGB-D image. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 175–185 (2018)

    Google Scholar 

Download references

Acknowledgements

This work was supported by ARO W911NF-17-1-0304, ONR N00014-22-1-2252, NIH-NEI 1R01EY030595, and IITP-2021-0-01341 (AIGS-CAU). We thank Stefano Soatto for his continued support.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Tian Yu Liu .

Editor information

Editors and Affiliations

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 17522 KB)

Rights and permissions

Reprints and permissions

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Liu, T.Y., Agrawal, P., Chen, A., Hong, BW., Wong, A. (2022). Monitored Distillation for Positive Congruent Depth Completion. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13662. Springer, Cham. https://doi.org/10.1007/978-3-031-20086-1_3

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-20086-1_3

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20085-4

  • Online ISBN: 978-3-031-20086-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics