EMVS: Event-Based Multi-View Stereo—3D Reconstruction with an Event Camera in Real-Time
Event cameras are bio-inspired vision sensors that output pixel-level brightness changes instead of standard intensity frames. They offer significant advantages over standard cameras, namely a very high dynamic range, no motion blur, and a latency on the order of microseconds. However, because the output is composed of a sequence of asynchronous events rather than actual intensity images, traditional vision algorithms cannot be applied, and a paradigm shift is needed. We introduce the problem of event-based multi-view stereo (EMVS) for event cameras and propose a solution to it. Unlike traditional MVS methods, which address the problem of estimating dense 3D structure from a set of known viewpoints, EMVS estimates semi-dense 3D structure from an event camera with a known trajectory. Our EMVS solution elegantly exploits two inherent properties of an event camera: (1) its ability to respond to scene edges, which naturally provide semi-dense geometric information without any pre-processing, and (2) the fact that it provides continuous measurements as the sensor moves. Despite its simplicity (it can be implemented in a few lines of code), our algorithm is able to produce accurate, semi-dense depth maps without requiring any explicit data association or intensity estimation. We successfully validate our method on both synthetic and real data. Our method is computationally very efficient and runs in real-time on a CPU.
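The core idea summarized above, back-projecting each event along its viewing ray and accumulating votes over a set of candidate depth planes (a space-sweep over a Disparity Space Image), can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the function name `emvs_sketch`, the fronto-parallel depth planes, and the single reference view placed at the world origin with intrinsics `K` are all simplifying assumptions for clarity.

```python
import numpy as np

def emvs_sketch(events, poses, K, depth_planes, dsi_shape):
    """Illustrative sketch (hypothetical, simplified) of event-based space-sweep:
    each event casts a ray from its camera pose; points on that ray at a set of
    candidate depths are projected into a reference view and accumulate votes in
    a Disparity Space Image (DSI). Scene edges emerge as vote maxima."""
    H, W = dsi_shape
    dsi = np.zeros((len(depth_planes), H, W))      # one vote grid per depth plane
    K_inv = np.linalg.inv(K)
    for (x, y), (R, t) in zip(events, poses):
        ray = K_inv @ np.array([x, y, 1.0])        # bearing in the event camera frame
        for k, z in enumerate(depth_planes):
            p_ref = R @ (ray * z) + t              # 3D point at candidate depth,
                                                   # expressed in the reference frame
            u, v, w = K @ p_ref                    # project into the reference view
            u, v = int(round(u / w)), int(round(v / w))
            if 0 <= u < W and 0 <= v < H:
                dsi[k, v, u] += 1                  # cast a vote
    depth_idx = dsi.argmax(axis=0)                 # per-pixel best depth plane
    confidence = dsi.max(axis=0)                   # vote count at the best plane
    return depth_idx, confidence
```

Thresholding `confidence` and keeping only pixels with enough votes yields the semi-dense depth map: pixels far from scene edges receive few events and are discarded, which is why no explicit data association is needed.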
Keywords: Multi-view stereo · Event cameras · Event-based vision · 3D reconstruction
This research was funded by the DARPA FLA Program, the National Center of Competence in Research (NCCR) Robotics through the Swiss National Science Foundation and the SNSF-ERC Starting Grant.