Abstract
In Human Search and Rescue scenarios, it is useful to be able to distinguish persons in distress from rescuers. Assuming that people requiring help would wave to attract attention, human motion is a significant cue for identifying persons in need. In this paper, we therefore aim to detect and classify human motion at different depths and at low resolution. The task is accomplished with an event-based sensor and a Spiking Neural Network (SNN). The event-based sensor was chosen as a device well suited to registering motion specifically. The SNN is not only appropriate for processing event-based data, but is also well suited to implementation on low-power neuromorphic devices, allowing for a longer operating time. In this study, we gather new data, with classes similar to those of the IBM DVS Gesture dataset, at various distances. We show that we can achieve an accuracy of up to 91.5% on a validation set recorded at depths and under lighting conditions different from those of the training set. We also show that adding Region of Interest (ROI) detection leads to better accuracy than a full-frame model at untrained distances.
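The abstract reports that extracting a Region of Interest before classification improves accuracy at untrained distances. The full text is not included here, so as a minimal illustrative sketch only, the snippet below shows one common way an ROI can be extracted from event-camera data: accumulate event coordinates over a short time window and cluster them by spatial density (here with scikit-learn's DBSCAN), keeping the bounding box of the largest cluster. The function name, parameters, and thresholds are assumptions for illustration, not the paper's actual method.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def extract_roi(events, eps=3.0, min_samples=10):
    """Return (x_min, y_min, x_max, y_max) of the densest event cluster,
    or None if no cluster is found.

    `events` is an (N, 2) array of (x, y) pixel coordinates of events
    accumulated over a short time window.
    """
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(events)
    valid = labels[labels >= 0]          # DBSCAN labels noise events as -1
    if valid.size == 0:
        return None
    largest = np.bincount(valid).argmax()  # keep the most populated cluster
    pts = events[labels == largest]
    x_min, y_min = pts.min(axis=0)
    x_max, y_max = pts.max(axis=0)
    return int(x_min), int(y_min), int(x_max), int(y_max)
```

Since a waving person produces a dense blob of events against sparse background noise, density-based clustering is a natural fit; the resulting crop can then be rescaled and fed to the classifier.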
This research was supported by Programmatic grant no. A1687b0033 from the Singapore government's Research, Innovation and Enterprise 2020 plan (Advanced Manufacturing and Engineering domain).
Acknowledgment
The authors would like to thank Austin Lai Weng Mun for his help in the dataset collection.
Copyright information
© 2023 Springer Nature Switzerland AG
Cite this paper
Colonnier, F., Seeralan, A., Zhu, L. (2023). Event-Based Visual Sensing for Human Motion Detection and Classification at Various Distances. In: Wang, H., et al. Image and Video Technology. PSIVT 2022. Lecture Notes in Computer Science, vol 13763. Springer, Cham. https://doi.org/10.1007/978-3-031-26431-3_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26430-6
Online ISBN: 978-3-031-26431-3