Abstract
Event cameras do not produce images, but rather a continuous flow of events, which encode changes of illumination for each pixel independently and asynchronously. While they output temporally rich information, they lack any depth information that could facilitate their use with other sensors. LiDARs can provide this depth information, but are by nature very sparse, which makes depth-to-event association more complex. Furthermore, as events represent changes of illumination, they may also represent changes of depth; associating each event with a single depth is therefore inadequate. In this work, we address these issues by fusing information from an event camera and a LiDAR with a learning-based approach to estimate accurate dense depth maps. To solve this “potential change of depth” problem, we estimate two depth maps at each step: one “before” the events happen, and one “after”. We further use this pair of depths to compute a depth difference for each event, giving each event more context. We train and evaluate our network, ALED, on both synthetic and real driving sequences, and show that it predicts dense depths with an error reduction of up to 61% compared to the current state of the art. We also demonstrate the quality of our two-depths-to-event association, and the usefulness of the depth-difference information. Finally, we release SLED, a novel synthetic dataset comprising events, LiDAR point clouds, RGB images, and dense depth maps.
Supported in part by the Hauts-de-France Region and in part by the SIVALab Joint Laboratory (Renault Group—Université de technologie de Compiègne (UTC)—Centre National de la Recherche Scientifique (CNRS)).
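The abstract describes associating each event with two depths (one from the map predicted “before” the events, one from the map predicted “after”) and deriving a per-event depth difference. The following is a minimal illustrative sketch of that association step, not the paper's actual implementation: the array layouts, function name, and event format `(x, y, t, polarity)` are assumptions for the example.

```python
import numpy as np

def associate_events_with_depths(events, depth_before, depth_after):
    """Assign each event the depth at its pixel in both predicted maps.

    events:       (N, 4) array of (x, y, t, polarity)  -- hypothetical format
    depth_before: (H, W) dense depth map before the event window
    depth_after:  (H, W) dense depth map after the event window
    Returns per-event depths "before" and "after", and their difference.
    """
    xs = events[:, 0].astype(int)
    ys = events[:, 1].astype(int)
    d_before = depth_before[ys, xs]   # depth at each event pixel, before
    d_after = depth_after[ys, xs]     # depth at each event pixel, after
    depth_diff = d_after - d_before   # per-event change of depth
    return d_before, d_after, depth_diff
```

A large `depth_diff` flags events likely caused by a depth discontinuity (e.g. an object edge moving across the pixel), while a near-zero difference suggests a pure illumination change at constant depth.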
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Brebion, V., Moreau, J., Davoine, F. (2023). Learning to Estimate Two Dense Depths from LiDAR and Event Data. In: Gade, R., Felsberg, M., Kämäräinen, JK. (eds) Image Analysis. SCIA 2023. Lecture Notes in Computer Science, vol 13886. Springer, Cham. https://doi.org/10.1007/978-3-031-31438-4_34
Print ISBN: 978-3-031-31437-7
Online ISBN: 978-3-031-31438-4