Abstract
Event cameras do not produce images, but rather a continuous flow of events, which encode changes of illumination for each pixel independently and asynchronously. While they output temporally rich information, they lack any depth information that could facilitate their use with other sensors. LiDARs can provide this depth information, but are by nature very sparse, which makes depth-to-event association more complex. Furthermore, as events represent changes of illumination, they may also represent changes of depth; associating each event with a single depth is therefore inadequate. In this work, we address these issues by fusing information from an event camera and a LiDAR with a learning-based approach to estimate accurate dense depth maps. To solve this “potential change of depth” problem, we estimate two depth maps at each step: one “before” the events happen, and one “after”. We further use this pair of depths to compute a depth difference for each event, giving each event more context. We train and evaluate our network, ALED, on both synthetic and real driving sequences, and show that it predicts dense depths with an error reduction of up to 61% compared to the current state of the art. We also demonstrate the quality of our two-depths-to-event association, and the usefulness of the depth-difference information. Finally, we release SLED, a novel synthetic dataset comprising events, LiDAR point clouds, RGB images, and dense depth maps.
Supported in part by the Hauts-de-France Region and in part by the SIVALab Joint Laboratory (Renault Group—Université de technologie de Compiègne (UTC)—Centre National de la Recherche Scientifique (CNRS)).
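The abstract describes associating each event with two depths (one from the map predicted “before” the events, one from the map predicted “after”) and deriving a per-event depth difference. The following is a minimal illustrative sketch of that association step, not the paper's actual implementation: the array layouts, function name, and event format `(x, y, t, polarity)` are assumptions for the example.

```python
import numpy as np

def associate_events_with_depths(events, depth_before, depth_after):
    """Assign each event the depth at its pixel in both predicted maps.

    events:       (N, 4) array of (x, y, t, polarity)  -- hypothetical format
    depth_before: (H, W) dense depth map before the event window
    depth_after:  (H, W) dense depth map after the event window
    Returns per-event depths "before" and "after", and their difference.
    """
    xs = events[:, 0].astype(int)
    ys = events[:, 1].astype(int)
    d_before = depth_before[ys, xs]   # depth at each event pixel, before
    d_after = depth_after[ys, xs]     # depth at each event pixel, after
    depth_diff = d_after - d_before   # per-event change of depth
    return d_before, d_after, depth_diff
```

A large `depth_diff` flags events likely caused by a depth discontinuity (e.g. an object edge moving across the pixel), while a near-zero difference suggests a pure illumination change at constant depth.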
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Brebion, V., Moreau, J., Davoine, F. (2023). Learning to Estimate Two Dense Depths from LiDAR and Event Data. In: Gade, R., Felsberg, M., Kämäräinen, JK. (eds) Image Analysis. SCIA 2023. Lecture Notes in Computer Science, vol 13886. Springer, Cham. https://doi.org/10.1007/978-3-031-31438-4_34
Print ISBN: 978-3-031-31437-7
Online ISBN: 978-3-031-31438-4