Abstract
Dense depth estimation is important for robotic systems, for example in mapping, localization, and object recognition. Among common sensors, an active depth sensor provides accurate but sparse measurements of the environment, while a stereo camera pair provides dense but less precise reconstruction results. In this paper, a tightly coupled fusion method for a depth sensor and a stereo camera is proposed to achieve dense depth estimation, combining the advantages of both sensor types. An adaptive dynamic cross-arm algorithm is developed to integrate the sparse depth measurements into camera-dominated semiglobal stereo matching. The shape of each cross arm is variable and computed automatically, yielding the optimal arm length for each measured pixel. The KITTI, Middlebury, and Scene Flow public datasets are used in comparison experiments to evaluate the proposed method, and real-world experiments provide further verification.
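To make the cross-arm notion concrete, here is a minimal sketch (not the paper's implementation) of the classic cross-based support-region idea the abstract builds on: each of the four arms of a pixel's cross grows until the intensity difference to the anchor pixel exceeds a threshold, or a maximum length is reached. The function name, `tau`, and `max_arm` are illustrative assumptions; the paper's adaptive variant additionally adjusts arms around measured depth points.

```python
import numpy as np

def cross_arm_lengths(gray, y, x, tau=10, max_arm=17):
    """Compute adaptive cross-arm lengths (left, right, up, down) for
    pixel (y, x): each arm extends while the intensity difference to the
    anchor pixel stays below tau, up to max_arm pixels."""
    h, w = gray.shape
    anchor = int(gray[y, x])
    arms = []
    for dy, dx in ((0, -1), (0, 1), (-1, 0), (1, 0)):  # left, right, up, down
        length = 0
        cy, cx = y + dy, x + dx
        # Grow the arm while inside the image, under the length cap,
        # and still photometrically similar to the anchor pixel.
        while (0 <= cy < h and 0 <= cx < w and length < max_arm
               and abs(int(gray[cy, cx]) - anchor) < tau):
            length += 1
            cy += dy
            cx += dx
        arms.append(length)
    return tuple(arms)  # (left, right, up, down)

# Example: a uniform 9x9 patch with a vertical intensity edge at column 6.
img = np.full((9, 9), 100, dtype=np.uint8)
img[:, 6:] = 200
print(cross_arm_lengths(img, 4, 4))  # right arm stops at the edge: (4, 1, 4, 4)
```

Because each arm stops at intensity edges, the resulting cross adapts to local structure; a cost-aggregation window built from such crosses follows object boundaries instead of a fixed rectangle.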
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under Grant 61973234, in part by the Tianjin Natural Science Foundation under Grant 18JCZDJC96700 and Grant 20JCYBJC00180, and in part by Tianjin Science Fund for Distinguished Young Scholars under Grant 19JCJQJC62100.
Cite this article
Mo, H., Li, B., Shi, W. et al. Cross-based dense depth estimation by fusing stereo vision with measured sparse depth. Vis Comput 39, 4339–4350 (2023). https://doi.org/10.1007/s00371-022-02594-z