Robust Anchor Embedding for Unsupervised Video Person re-IDentification in the Wild

  • Mang Ye
  • Xiangyuan Lan
  • Pong C. Yuen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11211)

Abstract

This paper addresses the scalability and robustness issues of estimating labels from imbalanced unlabeled data for unsupervised video-based person re-identification (re-ID). To this end, we propose a novel Robust AnChor Embedding (RACE) framework based on deep feature representation learning for large-scale unsupervised video re-ID. Within this framework, anchor sequences representing different persons are first selected to form an anchor graph, which is also used to initialize the CNN model so that it yields discriminative feature representations for later label estimation. To accurately estimate labels from unlabeled sequences with noisy frames, robust anchor embedding is introduced based on the regularized affine hull. Efficiency is ensured by embedding each sequence with its k-nearest-neighbor (kNN) anchors rather than the whole anchor set, under the manifold assumption. A robust and efficient top-k counts label prediction strategy is then proposed to predict the labels of unlabeled image sequences. With the newly labeled sequences, the unified anchor embedding framework further facilitates the feature learning process. Extensive experimental results on the large-scale dataset show that the proposed method outperforms existing unsupervised video re-ID methods.
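
The embedding and label prediction steps described in the abstract can be illustrated with a small sketch. It is not the paper's formulation: it assumes a ridge-regularized, sum-to-one reconstruction of a feature vector from its k nearest anchors (the closed form familiar from locally linear embedding) and a simple majority vote over the highest-weighted anchors. The function names `knn_anchor_embedding` and `top_k_count_label`, and the parameters `k`, `lam` and `top_k`, are hypothetical.

```python
import math
from collections import Counter


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def knn_anchor_embedding(x, anchors, k=3, lam=0.1):
    """Return {anchor_index: weight} over the k nearest anchors.

    Assumed objective (not taken from the paper):
    minimize ||x - sum_i w_i a_i||^2 + lam * ||w||^2  subject to  sum_i w_i = 1.
    """
    # Restrict the embedding to the k nearest anchors (Euclidean distance).
    nearest = sorted(range(len(anchors)),
                     key=lambda i: math.dist(x, anchors[i]))[:k]
    diffs = [[xi - ai for xi, ai in zip(x, anchors[i])] for i in nearest]
    # Regularized local Gram matrix G + lam * I.
    G = [[sum(p * q for p, q in zip(di, dj)) + (lam if r == c else 0.0)
          for c, dj in enumerate(diffs)]
         for r, di in enumerate(diffs)]
    # Closed form: w is proportional to (G + lam * I)^{-1} 1, then normalized.
    w = solve_linear(G, [1.0] * len(nearest))
    total = sum(w)
    return {i: wi / total for i, wi in zip(nearest, w)}


def top_k_count_label(weights, anchor_labels, top_k=2):
    """Vote among the top_k highest-weighted anchors (assumed voting rule)."""
    top = sorted(weights, key=weights.get, reverse=True)[:top_k]
    return Counter(anchor_labels[i] for i in top).most_common(1)[0][0]


# Toy data: three anchors of person "a" near the origin, two of person "b" far away.
anchors = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
anchor_labels = ["a", "a", "a", "b", "b"]
weights = knn_anchor_embedding([0.2, 0.1], anchors, k=3)
predicted = top_k_count_label(weights, anchor_labels, top_k=2)
```

Restricting the solve to k anchors keeps the linear system at size k regardless of how many anchors exist, which is the efficiency argument the abstract makes for kNN anchors.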

Keywords

Unsupervised person re-id, Robust anchor embedding

Notes

Acknowledgments

This work is partially supported by Hong Kong RGC General Research Fund HKBU (12254316), and National Natural Science Foundation of China (61562048).

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong