Abstract
Video-based person re-identification aims to recognize the same person across different camera views. Most previous methods represent a person using features extracted from the full body. In this paper, we propose a novel Spatial and Temporal Features Mixture Model (STFMM). Unlike previous approaches, our model first splits the human body horizontally into N parts, which capture information about the head, waist, legs and so on. The features of the individual parts are then integrated to obtain a more expressive representation of each person. Experiments on the iLIDS-VID and PRID-2011 datasets demonstrate that our approach outperforms existing video-based person re-identification methods and significantly improves stability. Our model achieves a rank-1 CMC accuracy of 73.6% on the iLIDS-VID dataset and 47.8% in cross-dataset testing.
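To make the part-based idea concrete, the following is a minimal sketch of splitting each frame horizontally into N strips and combining per-part descriptors into one sequence-level vector. It is not the STFMM architecture: the function names (`split_into_parts`, `part_descriptor`, `sequence_representation`), the mean-colour descriptor, and the averaging over frames are all assumptions chosen for illustration, standing in for the learned spatial and temporal features of the actual model.

```python
import numpy as np


def split_into_parts(frame: np.ndarray, n_parts: int) -> list:
    """Split an H x W x C frame into n_parts horizontal strips, top to bottom."""
    if n_parts < 1 or n_parts > frame.shape[0]:
        raise ValueError("n_parts must be between 1 and the frame height")
    # array_split tolerates heights that are not divisible by n_parts.
    return np.array_split(frame, n_parts, axis=0)


def part_descriptor(strip: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a learned feature: per-channel mean of the strip."""
    return strip.reshape(-1, strip.shape[-1]).mean(axis=0)


def sequence_representation(frames: np.ndarray, n_parts: int) -> np.ndarray:
    """Build one vector of length n_parts * C from a T x H x W x C sequence."""
    per_frame = []
    for frame in frames:
        parts = split_into_parts(frame, n_parts)
        # Spatial integration (assumption): concatenate the part descriptors.
        per_frame.append(np.concatenate([part_descriptor(p) for p in parts]))
    # Temporal integration (assumption): average the frame vectors over time.
    return np.mean(per_frame, axis=0)


if __name__ == "__main__":
    # Toy sequence: 4 frames of 128 x 64 RGB, upper half bright, lower half dark.
    frames = np.zeros((4, 128, 64, 3))
    frames[:, :64] = 1.0
    rep = sequence_representation(frames, n_parts=4)
    print(rep.shape)
```

Keeping a separate descriptor per strip preserves where on the body a feature came from, which a single full-body average would discard.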
Acknowledgements
This research was supported by the National Natural Science Foundation of China (NSFC 61572005, 61672086, 61702030, 61771058) and by the Key Projects of Science and Technology Research of Hebei Province Higher Education (ZD2017304).
About this article
Cite this article
Liu, J., Sun, C., Xu, X. et al. A spatial and temporal features mixture model with body parts for video-based person re-identification. Appl Intell 49, 3436–3446 (2019). https://doi.org/10.1007/s10489-019-01459-8
DOI: https://doi.org/10.1007/s10489-019-01459-8