A Differentiable Recurrent Surface for Asynchronous Event-Based Data

Cannici, Marco; Ciccone, Marco; Romanoni, Andrea; Matteucci, Matteo

doi:10.1007/978-3-030-58565-5_9

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12365)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Dynamic Vision Sensors (DVSs) asynchronously stream events at pixels that undergo brightness changes. Unlike conventional vision devices, they produce a sparse representation of the scene. Therefore, to apply standard computer vision algorithms, events need to be integrated into a frame or event-surface. This is usually achieved through hand-crafted grids that reconstruct the frame using ad-hoc heuristics. In this paper, we propose Matrix-LSTM, a grid of Long Short-Term Memory (LSTM) cells that efficiently process events and learn end-to-end task-dependent event-surfaces. Compared to existing reconstruction approaches, our learned event-surface shows good flexibility and expressiveness on optical flow estimation on the MVSEC benchmark, and it improves the state of the art in event-based object classification on the N-Cars dataset.
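
To make the idea of a grid of LSTM cells concrete, the following is a minimal toy sketch and not the authors' implementation (their code is linked in the Notes). It assumes that each event is a tuple `(x, y, t, p)`, that every pixel keeps its own scalar LSTM state while all pixels share one set of weights, and that the input features are the polarity and the time elapsed since the previous event at that pixel. The names `lstm_step`, `event_surface`, `PARAMS`, and `DEMO_EVENTS` are hypothetical, and the weights are fixed by hand rather than learned.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(features, h, c, params):
    """One update of an LSTM cell with a scalar hidden state.

    params maps each gate name to (input_weights, recurrent_weight, bias).
    """
    def pre_activation(name):
        w, u, b = params[name]
        return sum(wi * xi for wi, xi in zip(w, features)) + u * h + b

    i = sigmoid(pre_activation("i"))    # input gate
    f = sigmoid(pre_activation("f"))    # forget gate
    o = sigmoid(pre_activation("o"))    # output gate
    g = math.tanh(pre_activation("g"))  # candidate cell value
    c_new = f * c + i * g
    h_new = o * math.tanh(c_new)
    return h_new, c_new


def event_surface(events, height, width, params):
    """Integrate time-sorted events (x, y, t, p) into a height x width surface.

    Each pixel has its own (h, c) state; weights are shared across pixels.
    Pixels that receive no events stay at 0.0.
    """
    state = {}  # (y, x) -> (h, c, time of last event at this pixel)
    for x, y, t, p in events:
        h, c, last_t = state.get((y, x), (0.0, 0.0, t))
        features = [float(p), t - last_t]  # assumed features: polarity, delay
        h, c = lstm_step(features, h, c, params)
        state[(y, x)] = (h, c, t)

    surface = [[0.0] * width for _ in range(height)]
    for (y, x), (h, _, _) in state.items():
        surface[y][x] = h  # final hidden state is the pixel value
    return surface


# Hand-picked, untrained weights (illustrative only).
PARAMS = {
    "i": ([0.5, 0.1], 0.2, 0.0),
    "f": ([0.0, -0.1], 0.1, 1.0),
    "o": ([0.3, 0.0], 0.1, 0.0),
    "g": ([1.0, 0.2], 0.3, 0.0),
}

# Two positive events at pixel (x=1, y=0), one negative event at (x=2, y=1).
DEMO_EVENTS = [(1, 0, 0.00, 1), (2, 1, 0.01, -1), (1, 0, 0.03, 1)]

if __name__ == "__main__":
    for row in event_surface(DEMO_EVENTS, height=2, width=3, params=PARAMS):
        print(["%.3f" % v for v in row])
```

In a learned setting the shared weights would be trained end-to-end together with the downstream network, which is what makes the resulting surface task-dependent; the dictionary-based state here is only a convenient way to skip pixels that never fire.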

A. Romanoni—Work done prior to Amazon involvement of the author and does not reflect views of the Amazon company.

Notes

1. Code available at https://marcocannici.github.io/matrixlstm.

Acknowledgments

We thank Alex Zihao Zhu for his help in replicating EV-FlowNet results and the ISPL group at Politecnico di Milano for GPU support. This research is supported by project TEINVEIN, CUP: E96D17000110009 - Call “Accordi per la Ricerca e l’Innovazione”, cofunded by POR FESR 2014-2020 (Regional Operational Programme, European Regional Development Fund).

Author information

Authors and Affiliations

Politecnico di Milano, Milan, Italy
Marco Cannici, Marco Ciccone, Andrea Romanoni & Matteo Matteucci

Corresponding author

Correspondence to Marco Cannici.

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

About this paper

Cite this paper

Cannici, M., Ciccone, M., Romanoni, A., Matteucci, M. (2020). A Differentiable Recurrent Surface for Asynchronous Event-Based Data. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12365. Springer, Cham. https://doi.org/10.1007/978-3-030-58565-5_9

DOI: https://doi.org/10.1007/978-3-030-58565-5_9
Published: 12 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58564-8
Online ISBN: 978-3-030-58565-5
eBook Packages: Computer Science, Computer Science (R0)
