Abstract
We present a novel recurrent-network-based framework for tracking and re-identifying multiple targets in first-person perspective. Although LSTMs can act as sequence classifiers, most previous work in multi-target tracking uses their output with a distance metric for data association. In this work, we employ an LSTM as a classifier and train it on the memory-cell output vectors, corresponding to different targets, obtained from another LSTM. Based on appearance and motion features, this classifier discriminates between targets in two consecutive frames and re-identifies them over a time interval. We integrate this classifier as an additional block in a detection-free tracking architecture, which improves the re-identification of targets and also indicates when targets are absent. We propose a dataset of twenty egocentric videos containing multiple targets to validate our approach.
This publication is an outcome of the R&D work undertaken in the project under the Visvesvaraya PhD Scheme of the Ministry of Electronics and Information Technology, Government of India, being implemented by Digital India Corporation.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Nigam, J., Rameshan, R.M. (2019). TRINet: Tracking and Re-identification Network for Multiple Targets in Egocentric Videos Using LSTMs. In: Vento, M., Percannella, G. (eds) Computer Analysis of Images and Patterns. CAIP 2019. Lecture Notes in Computer Science, vol 11679. Springer, Cham. https://doi.org/10.1007/978-3-030-29891-3_38
DOI: https://doi.org/10.1007/978-3-030-29891-3_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29890-6
Online ISBN: 978-3-030-29891-3
eBook Packages: Computer Science, Computer Science (R0)