Pseudo-positive regularization for deep person re-identification

  • Regular Paper
Multimedia Systems

Abstract

An intrinsic challenge of person re-identification (re-ID) is the difficulty of annotation. This typically means (1) few training samples per identity and, as a result, (2) a lack of diversity among the training samples. Consequently, training a convolutional neural network (CNN), a state-of-the-art method in person re-ID, carries a high risk of over-fitting. To reduce this risk, this paper proposes a Pseudo-Positive Regularization method that enriches the diversity of the training data. Specifically, unlabeled data from an independent pedestrian database are retrieved using the target training data as queries. A small proportion of the retrieved samples are randomly selected as Pseudo-Positive samples and added to the target training set for supervised CNN training. The addition of Pseudo-Positive samples thus acts as a data augmentation method that reduces the risk of over-fitting during CNN training. We implement our idea in identification CNN models (i.e., CaffeNet, VGGNet-16 and ResNet-50). On the CUHK03 and Market-1501 datasets, experimental results demonstrate that the proposed method consistently improves on the baseline and yields performance competitive with state-of-the-art person re-ID methods.
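
The selection step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the features, the similarity measure, how many samples are retrieved per query, or how labels are assigned. Here the function name `select_pseudo_positives`, the parameters `top_k` and `proportion`, the use of cosine similarity on precomputed non-zero feature vectors, and the rule that a retrieved sample inherits the identity label of its query are all hypothetical choices.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def select_pseudo_positives(train_feats, train_labels, pool_feats,
                            top_k=10, proportion=0.1, seed=0):
    """Return a list of (pool_index, assigned_label) pairs.

    train_feats / train_labels: labeled target training set (the queries).
    pool_feats: features of the unlabeled independent database.
    top_k, proportion: hypothetical knobs, not values from the paper.
    """
    rng = random.Random(seed)
    candidates = []
    for feat, label in zip(train_feats, train_labels):
        # Rank the unlabeled pool by similarity to this labeled query.
        ranked = sorted(range(len(pool_feats)),
                        key=lambda j: cosine(feat, pool_feats[j]),
                        reverse=True)
        # Assumption: a retrieved sample inherits the query's identity label.
        candidates.extend((j, label) for j in ranked[:top_k])
    # Keep only a small random proportion of the retrieved candidates.
    n_keep = min(len(candidates), max(1, round(proportion * len(candidates))))
    return rng.sample(candidates, n_keep)
```

The returned pairs would then be appended to the labeled training set before supervised CNN training. Because selection is random and labels are inherited from the query, some pseudo-positives will carry the wrong identity; in this sketch the same pool image can even be retrieved by queries of different identities and so appear with more than one label.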

Notes

  1. The independent database is obtained from our collaboration project with Dr. Liang Zheng at the University of Technology Sydney (homepage: http://www.liangzheng.com.cn/). The pedestrian images are captured from several cameras placed in front of a supermarket at Tsinghua University. The database will be made publicly available together with Dr. Liang Zheng's future publication.

  2. http://image-net.org/challenges/LSVRC/2012/.

  3. Fig. 3 uses CaffeNet [20] only as an example.

  4. The size is 227 \(\times\) 227 for CaffeNet [20] and 224 \(\times\) 224 for VGGNet-16 [38] and ResNet-50 [13].

  5. For CaffeNet [20] and VGGNet-16 [38], the layer used is the penultimate fully connected layer; for ResNet-50 [13], it is the last pooling layer.

  6. The rank-1 accuracy is shown when the CMC curve is absent.

  7. http://www.image-net.org/challenges/LSVRC/.

References

  1. Ahmed, E., Jones, M., Marks, T.K.: An improved deep learning architecture for person re-identification. In: Proc. CVPR, pp. 3908–3916 (2015)

  2. Bazzani, L., Cristani, M., Murino, V.: Sdalf: modeling human appearance with symmetry-driven accumulation of local features. In: Person re-identification, pp. 43–69. Springer London (2014)

  3. Bottou, L.: Large-scale machine learning with stochastic gradient descent. In: Proc. COMPSTAT’2010, pp. 177–186 (2010)

  4. Chang, X., Nie, F., Wang, S., Yang, Y., Zhou, X., Zhang, C.: Compound rank-\(k\) projections for bilinear analysis. IEEE Trans. Neural Netw. Learn. Syst. 27(7), 1502–1513 (2016)

  5. Cheng, D., Gong, Y., Zhou, S., Wang, J., Zheng, N.: Person re-identification by multi-channel parts-based cnn with improved triplet loss function. In: Proc. CVPR, pp. 1335–1344 (2016)

  6. Das, A., Chakraborty, A., Roy-Chowdhury, A.K.: Consistent re-identification in a camera network. In: Proc. ECCV, pp. 330–345 (2014)

  7. Davis, J.V., Kulis, B., Jain, P., Sra, S., Dhillon, I.S.: Information-theoretic metric learning. In: Proc. ICML, pp. 209–216 (2007)

  8. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: A large-scale hierarchical image database. In: Proc. CVPR, pp. 248–255 (2009)

  9. Felzenszwalb, P.F., Girshick, R.B., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627–1645 (2010)

  10. Fu, H., Zhao, H., Kong, X., Zhang, X.: Bhog: binary descriptor for sketch-based image retrieval. Multimed. Syst. 22(1), 127–136 (2016)

  11. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proc. CVPR, pp. 580–587 (2014)

  12. Gray, D., Tao, H.: Viewpoint invariant pedestrian recognition with an ensemble of localized features. In: Proc. ECCV, pp. 262–275 (2008)

  13. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proc. CVPR, pp. 770–778 (2016)

  14. Hinton, G.E., Srivastava, N., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.R.: Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580 (2012)

  15. Hirzer, M., Beleznai, C., Roth, P.M., Bischof, H.: Person re-identification by descriptive and discriminative classification. In: Proc. Scandinavian conference on Image analysis, pp. 91–102 (2011)

  16. Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., Darrell, T.: Caffe: Convolutional architecture for fast feature embedding. In: Proc. ACM international conference on multimedia, pp. 675–678 (2014)

  17. Kodirov, E., Xiang, T., Fu, Z., Gong, S.: Person re-identification by unsupervised l1 graph learning. In: Proc. ECCV, pp. 178–195 (2016)

  18. Koestinger, M., Hirzer, M., Wohlhart, P., Roth, P.M., Bischof, H.: Large scale metric learning from equivalence constraints. In: Proc. CVPR, pp. 2288–2295 (2012)

  19. Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images. University of Toronto, Tech. rep. (2009)

  20. Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Proc. NIPS, pp. 1097–1105 (2012)

  21. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)

  22. Li, W., Wang, X.: Locally aligned feature transforms across views. In: Proc. CVPR, pp. 3594–3601 (2013)

  23. Li, W., Zhao, R., Wang, X.: Human reidentification with transferred metric learning. In: Proc. ACCV, pp. 31–44 (2012)

  24. Li, W., Zhao, R., Xiao, T., Wang, X.: Deepreid: Deep filter pairing neural network for person re-identification. In: Proc. CVPR, pp. 152–159 (2014)

  25. Li, X.: Tag relevance fusion for social image retrieval. Multimed. Syst. 23(1), 29–40 (2017)

  26. Liao, S., Hu, Y., Zhu, X., Li, S.Z.: Person re-identification by local maximal occurrence representation and metric learning. In: Proc. CVPR, pp. 2197–2206 (2015)

  27. Liao, S., Mo, Z., Zhu, J., Hu, Y., Li, S.Z.: Open-set person re-identification. arXiv preprint arXiv:1408.0872 (2014)

  28. Lin, Y., Zheng, L., Zheng, Z., Wu, Y., Yang, Y.: Improving person re-identification by attribute and identity learning. arXiv preprint arXiv:1703.07220 (2017)

  29. Liu, H., Feng, J., Qi, M., Jiang, J., Yan, S.: End-to-end comparative attention networks for person re-identification. arXiv preprint arXiv:1606.04404 (2016)

  30. Liu, J., Li, Z., Lu, H.: Sparse semantic metric learning for image retrieval. Multimed. Syst. 20(6), 635–643 (2014)

  31. Martinel, N., Das, A., Micheloni, C., Roy-Chowdhury, A.K.: Temporal model adaptation for person re-identification. arXiv preprint arXiv:1607.07216 (2016)

  32. Peng, P., Xiang, T., Wang, Y., Pontil, M., Gong, S., Huang, T., Tian, Y.: Unsupervised cross-dataset transfer learning for person re-identification. In: Proc. CVPR, pp. 1306–1315 (2016)

  33. Plaut, D.C., et al.: Experiments on learning by back propagation. Tech. rep., CMU-CS-86-126, CMU (1986)

  34. Radenović, F., Tolias, G., Chum, O.: Cnn image retrieval learns from bow: Unsupervised fine-tuning with hard examples. arXiv preprint arXiv:1604.02426 (2016)

  35. Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434 (2015)

  36. Roth, P.M., Hirzer, M., Koestinger, M., Beleznai, C., Bischof, H.: Mahalanobis distance learning for person re-identification. In: Person re-identification, pp. 247–267. Springer London (2014)

  37. Schroff, F., Kalenichenko, D., Philbin, J.: Facenet: A unified embedding for face recognition and clustering. In: Proc. CVPR, pp. 815–823 (2015)

  38. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)

  39. Su, C., Zhang, S., Xing, J., Gao, W., Tian, Q.: Deep attributes driven multi-camera person re-identification. In: Proc. ECCV, pp. 475–491 (2016)

  40. Sukhbaatar, S., Bruna, J., Paluri, M., Bourdev, L., Fergus, R.: Training convolutional networks with noisy labels. arXiv preprint arXiv:1406.2080 (2014)

  41. Sun, Y., Zheng, L., Deng, W., Wang, S.: Svdnet for pedestrian retrieval. arXiv preprint arXiv:1703.05693 (2017)

  42. Ustinova, E., Ganin, Y., Lempitsky, V.: Multiregion bilinear convolutional neural networks for person re-identification. arXiv preprint arXiv:1512.05300 (2015)

  43. Van Der Maaten, L., Chen, M., Tyree, S., Weinberger, K.Q.: Learning with marginalized corrupted features. In: Proc. ICML, pp. 410–418 (2013)

  44. Van Der Maaten, L., Chen, M., Tyree, S., Weinberger, K.Q.: Marginalizing corrupted features. arXiv preprint arXiv:1402.7001 (2014)

  45. Varior, R.R., Haloi, M., Wang, G.: Gated siamese convolutional neural network architecture for human re-identification. In: Proc. ECCV, pp. 791–808 (2016)

  46. Varior, R.R., Shuai, B., Lu, J., Xu, D., Wang, G.: A siamese long short-term memory architecture for human re-identification. In: Proc. ECCV, pp. 135–153 (2016)

  47. Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., Manzagol, P.A.: Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J. Mach. Learn. Res. 11, 3371–3408 (2010)

  48. Wan, L., Zeiler, M., Zhang, S., Cun, Y.L., Fergus, R.: Regularization of neural networks using dropconnect. In: Proc. ICML, pp. 1058–1066 (2013)

  49. Wang, F., Zuo, W., Lin, L., Zhang, D., Zhang, L.: Joint learning of single-image and cross-image representations for person re-identification. In: Proc. CVPR, pp. 1288–1296 (2016)

  50. Weinberger, K.Q., Blitzer, J., Saul, L.K.: Distance metric learning for large margin nearest neighbor classification. In: Proc. NIPS, pp. 1473–1480 (2005)

  51. Wu, L., Shen, C., Hengel, A.v.d.: Personnet: Person re-identification with deep convolutional neural networks. arXiv preprint arXiv:1601.07255 (2016)

  52. Xiao, T., Li, H., Ouyang, W., Wang, X.: Learning deep feature representations with domain guided dropout for person re-identification. In: Proc. CVPR, pp. 1249–1258 (2016)

  53. Xie, L., Wang, J., Wei, Z., Wang, M., Tian, Q.: Disturblabel: Regularizing cnn on the loss layer. In: Proc. CVPR, pp. 4753–4762 (2016)

  54. Yan, Y., Nie, F., Li, W., Gao, C., Yang, Y., Xu, D.: Image classification by cross-media active learning with privileged information. IEEE Trans. Multimed. 18(12), 2494–2502 (2016)

  55. Yang, X., Zhang, T., Xu, C.: A new discriminative coding method for image classification. Multimed. Syst. 21(2), 133–145 (2015)

  56. Yang, Y., Ma, Z., Hauptmann, A.G., Sebe, N.: Feature selection for multimedia analysis by sharing information among multiple tasks. IEEE Trans. Multimed. 15(3), 661–669 (2013)

  57. Yang, Y., Nie, F., Xu, D., Luo, J., Zhuang, Y., Pan, Y.: A multimedia retrieval framework based on semi-supervised ranking and relevance feedback. IEEE Trans. Pattern Anal. Mach. Intell. 34(4), 723–742 (2012)

  58. Yang, Y., Zhuang, Y.T., Wu, F., Pan, Y.H.: Harmonizing hierarchical manifolds for multimedia document semantics understanding and cross-media retrieval. IEEE Trans. Multimed. 10(3), 437–446 (2008)

  59. Yi, D., Lei, Z., Liao, S., Li, S.Z.: Deep metric learning for person re-identification. In: Proc. ICPR, pp. 34–39 (2014)

  60. Zeiler, M.D., Fergus, R.: Stochastic pooling for regularization of deep convolutional neural networks. arXiv preprint arXiv:1301.3557 (2013)

  61. Zhang, L., Xiang, T., Gong, S.: Learning a discriminative null space for person re-identification. In: Proc. CVPR, pp. 1239–1248 (2016)

  62. Zheng, L., Bie, Z., Sun, Y., Wang, J., Wang, S., Su, C., Tian, Q.: Mars: A video benchmark for large-scale person re-identification. In: Proc. ECCV, pp. 868–884 (2016)

  63. Zheng, L., Shen, L., Tian, L., Wang, S., Wang, J., Tian, Q.: Scalable person re-identification: A benchmark. In: Proc. ICCV, pp. 1116–1124 (2015)

  64. Zheng, L., Wang, S., Guo, P., Liang, H., Tian, Q.: Tensor index for large scale image retrieval. Multimed. Syst. 21(6), 569–579 (2015)

  65. Zheng, L., Wang, S., Liu, Z., Tian, Q.: Packing and padding: Coupled multi-index for accurate image retrieval. In: Proc. CVPR, pp. 1939–1946 (2014)

  66. Zheng, L., Wang, S., Tian, L., He, F., Liu, Z., Tian, Q.: Query-adaptive late fusion for image search and person re-identification. In: Proc. CVPR, pp. 1741–1750 (2015)

  67. Zheng, L., Wang, S., Wang, J., Tian, Q.: Accurate image search with multi-scale contextual evidences. Int. J. Comput. Vis. 120(1), 1–13 (2016)

  68. Zheng, L., Yang, Y., Hauptmann, A.G.: Person re-identification: Past, present and future. arXiv preprint arXiv:1610.02984 (2016)

  69. Zheng, L., Yang, Y., Tian, Q.: Sift meets cnn: A decade survey of instance retrieval. IEEE Trans. Pattern Anal. Mach. Intell. (2017). doi:10.1109/TPAMI.2017.2709749

  70. Zheng, L., Zhang, H., Sun, S., Chandraker, M., Yang, Y., Tian, Q.: Person re-identification in the wild. In: Proc. CVPR (2017)

  71. Zheng, W.S., Gong, S., Xiang, T.: Associating groups of people. In: Proc. BMVC, pp. 23.1–23.11 (2009)

  72. Zheng, Z., Zheng, L., Yang, Y.: Unlabeled samples generated by gan improve the person re-identification baseline in vitro. arXiv preprint arXiv:1701.07717 (2017)

Acknowledgements

This work was supported in part by the Foundation for Innovative Research Groups of the National Natural Science Foundation of China (NSFC) under Grant 71421001, in part by the National Natural Science Foundation of China (NSFC) under Grant 61502073 and Grant 61429201, in part by the Open Projects Program of National Laboratory of Pattern Recognition under Grant 201407349, and in part by support to Dr. Qi Tian from ARO Grant W911NF-15-1-0290 and Faculty Research Gift Awards from NEC Laboratories of America and Blippar.

Author information

Corresponding author

Correspondence to Xiangwei Kong.

Additional information

Communicated by T. Plagemann.

About this article

Cite this article

Zhu, F., Kong, X., Fu, H. et al. Pseudo-positive regularization for deep person re-identification. Multimedia Systems 24, 477–489 (2018). https://doi.org/10.1007/s00530-017-0571-8

  • DOI: https://doi.org/10.1007/s00530-017-0571-8
