Effective multi-shot person re-identification through representative frames selection and temporal feature pooling

  • Thuy-Binh Nguyen
  • Thi-Lan Le (corresponding author)
  • Louis Devillaine
  • Thi Thanh Thuy Pham
  • Nam Pham Ngoc
Article

Abstract

Multi-shot person re-identification (ReID) is a popular case of person ReID in which a set of images is processed for each person. However, using the entire image set, as most existing proposals do, is not always effective because it is costly in time and memory. The main contribution of this work is a set of strategies for (1) choosing representative image frames for each individual instead of the entire set of frames, and (2) temporal feature pooling in multi-shot person ReID. These strategies are integrated into a person ReID framework that uses GoG (Gaussian of Gaussian) for person representation and XQDA (Cross-view Quadratic Discriminant Analysis metric learning) for matching. The effectiveness of the proposed framework is investigated and analyzed on two benchmark datasets (PRID 2011 and iLIDS-VID) in terms of re-identification accuracy, computational time, and storage requirements. The experimental results lead to several recommendations on the use of these schemes, based on the characteristics of the working dataset and the requirements of the application. Furthermore, the study also offers a desktop-based application for person search and ReID. The implementation of the proposed framework will be made publicly available.
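
The abstract names two ideas, representative frame selection and temporal feature pooling, without giving their exact rules. The following is only a minimal sketch of the general idea. It assumes per-frame feature vectors have already been extracted (for example by a descriptor such as GoG), and it uses uniform sampling and average/max pooling as stand-ins. The function names `select_representative_frames` and `temporal_pool` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Vector = List[float]


def select_representative_frames(num_frames: int, k: int) -> List[int]:
    """Pick k frame indices spread evenly over a sequence of num_frames.

    Uniform sampling is a stand-in assumption, not the paper's selection rule.
    """
    if num_frames <= 0 or k <= 0:
        return []
    if k >= num_frames:
        return list(range(num_frames))
    step = num_frames / k
    # Take the middle frame of each of the k equal-length segments.
    return [int(step * i + step / 2) for i in range(k)]


def temporal_pool(features: Sequence[Vector], mode: str = "avg") -> Vector:
    """Collapse per-frame feature vectors into one sequence-level vector."""
    if not features:
        raise ValueError("need at least one frame feature")
    columns = list(zip(*features))  # one tuple per feature dimension
    if mode == "avg":
        return [sum(col) / len(col) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown pooling mode: {mode}")


# Toy usage: 10 frames with 3-dimensional placeholder features (not real data).
frame_features = [[float(t), float(t % 2), 1.0] for t in range(10)]
chosen = select_representative_frames(len(frame_features), k=4)
pooled = temporal_pool([frame_features[i] for i in chosen], mode="avg")
```

In such a setup, only the pooled vector per person would need to be stored and matched, which is where savings in time and storage of the kind the abstract mentions would come from.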

Keywords

Multi-shot person re-identification; Hand-designed features; Multi-shot; Representative frames

Notes

Acknowledgments

This research is funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.01-2017.315.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Computer Vision Department, MICA International Research Institute, Hanoi University of Science and Technology, Hanoi, Vietnam
  2. School of Electronics and Telecommunications, Hanoi University of Science and Technology, Hanoi, Vietnam
  3. Faculty of Electrical and Electronics Engineering, University of Transport and Communications, Hanoi, Vietnam
  4. School of Engineering in Physics, Applied Physics, Electronics & Materials Science, Grenoble Institute of Technology, Grenoble, France
  5. Faculty of Security and Information Technology, Academy of People Security, Hanoi, Vietnam
