Abstract
An intrinsic challenge of person re-identification (re-ID) is the difficulty of annotation. This typically means (1) few training samples per identity and (2) consequently, a lack of diversity among the training samples. As a result, we face a high risk of over-fitting when training a convolutional neural network (CNN), a state-of-the-art method in person re-ID. To reduce this risk, this paper proposes a Pseudo-Positive Regularization method that enriches the diversity of the training data. Specifically, unlabeled data from an independent pedestrian database are retrieved using the target training data as queries. A small proportion of the retrieved samples are randomly selected as Pseudo-Positive samples and added to the target training set for supervised CNN training. The addition of Pseudo-Positive samples therefore acts as a Data Augmentation method that reduces the risk of over-fitting during CNN training. We implement our idea in identification CNN models (i.e., CaffeNet, VGGNet-16 and ResNet-50). On the CUHK03 and Market-1501 datasets, experimental results demonstrate that the proposed method consistently improves the baseline and yields performance competitive with state-of-the-art person re-ID methods.
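The retrieve-then-sample procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the feature extractor, the distance measure, the retrieval depth, or the sampling proportion, so the Euclidean distance, the hypothetical function name `select_pseudo_positives`, and the parameters `top_k` and `keep_ratio` are all assumptions made for illustration. The sketch also assumes that each selected sample inherits the identity label of the query that retrieved it.

```python
import numpy as np


def select_pseudo_positives(train_feats, train_labels, pool_feats,
                            top_k=5, keep_ratio=0.2, seed=0):
    """Pick pseudo-positive samples from an unlabeled pool (illustrative only).

    train_feats:  (N, D) features of labeled target training images (queries).
    train_labels: (N,) identity labels of those images.
    pool_feats:   (M, D) features of the unlabeled independent database.
    Returns (pool_indices, pseudo_labels) for the randomly kept candidates.
    """
    rng = np.random.default_rng(seed)

    # Squared Euclidean distance between every query and every pool image
    # (assumed metric; the abstract does not name one).
    d2 = (
        (train_feats ** 2).sum(axis=1)[:, None]
        - 2.0 * train_feats @ pool_feats.T
        + (pool_feats ** 2).sum(axis=1)[None, :]
    )

    # For each query, keep the indices of its top_k closest pool images.
    nearest = np.argsort(d2, axis=1)[:, :top_k]
    k = nearest.shape[1]  # equals top_k unless the pool is smaller

    # Flatten into one candidate list; each candidate takes its query's label
    # (an assumption about how pseudo-positive labels are assigned).
    cand_idx = nearest.ravel()
    cand_lab = np.repeat(train_labels, k)

    # Randomly keep a small proportion of the retrieved candidates.
    n_keep = max(1, int(round(keep_ratio * cand_idx.size)))
    chosen = rng.choice(cand_idx.size, size=n_keep, replace=False)
    return cand_idx[chosen], cand_lab[chosen]


if __name__ == "__main__":
    demo_rng = np.random.default_rng(1)
    feats = demo_rng.normal(size=(6, 8))      # stand-in for CNN features
    labels = np.array([0, 0, 1, 1, 2, 2])
    pool = demo_rng.normal(size=(50, 8))      # stand-in for the unlabeled database
    idx, lab = select_pseudo_positives(feats, labels, pool)
    print(idx.shape, lab.shape)
```

The returned pool indices and labels would then be appended to the labeled training set before supervised training. In this sketch the same pool image may be retrieved by queries of different identities; how such conflicts are handled is left open here.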
Notes
The independent database is obtained from our collaboration project with Dr. Liang Zheng at the University of Technology Sydney (homepage: http://www.liangzheng.com.cn/). The pedestrian images are captured from several cameras placed in front of a supermarket at Tsinghua University. The database will be made publicly available together with Dr. Liang Zheng's future publication.
The rank-1 accuracy is shown when the CMC curve is absent.
Acknowledgements
This work was supported in part by the Foundation for Innovative Research Groups of the National Natural Science Foundation of China (NSFC) under Grant 71421001, in part by the National Natural Science Foundation of China (NSFC) under Grant 61502073 and Grant 61429201, in part by the Open Projects Program of National Laboratory of Pattern Recognition under Grant 201407349, and in part to Dr. Qi Tian by ARO Grants W911NF-15-1-0290 and Faculty Research Gift Awards by NEC Laboratories of America and Blippar.
Additional information
Communicated by T. Plagemann.
Cite this article
Zhu, F., Kong, X., Fu, H. et al. Pseudo-positive regularization for deep person re-identification. Multimedia Systems 24, 477–489 (2018). https://doi.org/10.1007/s00530-017-0571-8
DOI: https://doi.org/10.1007/s00530-017-0571-8