A spatial and temporal features mixture model with body parts for video-based person re-identification

Applied Intelligence

Abstract

The goal of video-based person re-identification is to recognize the same person across different camera views. Most previous methods represent a person using features extracted from the full body. In this paper, we propose a novel Spatial and Temporal Features Mixture Model (STFMM). Unlike previous approaches, our model first splits the human body horizontally into N parts, which capture the head, waist, legs, and so on. The features of the parts are then integrated to obtain a more expressive representation of each person. Experiments conducted on the iLIDS-VID and PRID-2011 datasets demonstrate that our approach outperforms existing video-based person re-identification methods and significantly improves stability. Our model achieves a rank-1 CMC accuracy of 73.6% on the iLIDS-VID dataset and a rank-1 CMC accuracy of 47.8% in cross-dataset testing.
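The horizontal split described in the abstract can be illustrated with a minimal NumPy sketch. It is not the STFMM architecture, whose details are not given here. The function names, the mean-pooled per-part descriptor, and the temporal averaging are all assumptions chosen only to show how N horizontal parts might be extracted and then integrated into one representation.

```python
import numpy as np


def split_into_parts(frame, n_parts):
    """Split an (H, W, C) frame into n_parts horizontal stripes, top to bottom."""
    # np.array_split tolerates heights that are not divisible by n_parts.
    return np.array_split(frame, n_parts, axis=0)


def part_feature(stripe):
    """Hypothetical per-part descriptor: mean over spatial positions, shape (C,)."""
    return stripe.mean(axis=(0, 1))


def sequence_descriptor(frames, n_parts):
    """Integrate part features over a (T, H, W, C) sequence into one vector.

    Per frame, the part descriptors are concatenated (spatial integration).
    Across frames they are averaged, which is only a stand-in for a learned
    temporal model.
    """
    per_frame = np.stack([
        np.concatenate([part_feature(s) for s in split_into_parts(f, n_parts)])
        for f in frames
    ])  # shape (T, n_parts * C)
    return per_frame.mean(axis=0)


# Synthetic 16-frame sequence of 128x64 RGB crops (assumed sizes).
rng = np.random.default_rng(0)
frames = rng.random((16, 128, 64, 3))
descriptor = sequence_descriptor(frames, n_parts=4)
print(descriptor.shape)  # (12,) = 4 parts x 3 channels
```

In a learned model, the mean-pooled colour descriptor would presumably be replaced by convolutional features for each stripe, and the temporal average by a recurrent or attention-based component. The sketch fixes only the data flow: split, describe each part, then integrate.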

Acknowledgements

This research was supported by the National Natural Science Foundation of China (NSFC 61572005, 61672086, 61702030, 61771058), and Key Projects of Science and Technology Research of Hebei Province Higher Education (ZD2017304).

Author information

Corresponding author

Correspondence to Baomin Xu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, J., Sun, C., Xu, X. et al. A spatial and temporal features mixture model with body parts for video-based person re-identification. Appl Intell 49, 3436–3446 (2019). https://doi.org/10.1007/s10489-019-01459-8

  • DOI: https://doi.org/10.1007/s10489-019-01459-8
