SCAN: Spatial and Channel Attention Network for Vehicle Re-Identification

  • Shangzhi Teng
  • Xiaobin Liu
  • Shiliang Zhang
  • Qingming Huang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11166)

Abstract

Most existing methods for vehicle Re-Identification (ReID) extract global features from vehicle images. However, since some vehicles share the same model and color, it is hard to distinguish them based on global appearance alone. Compared with global appearance, certain local regions can be more discriminative. Moreover, for methods based on Deep Convolutional Neural Networks (DCNNs), it is not reasonable to weight all feature map channels equally, as different channels have different discriminative ability. To automatically discover discriminative regions on vehicles and discriminative channels in networks, we propose a Spatial and Channel Attention Network (SCAN) based on a DCNN. Specifically, the attention model contains two branches, i.e., a spatial attention branch and a channel attention branch, which are embedded after convolutional layers to refine the feature maps. The spatial and channel attention branches adjust the weights of outputs at different positions and in different channels to highlight the outputs in discriminative regions and channels, respectively. The feature maps are thus refined by our attention model, and more discriminative features can be extracted automatically. We jointly train the attention branches and convolutional layers with a triplet loss and a cross-entropy loss. We evaluate our methods on two large-scale vehicle ReID datasets, i.e., VehicleID and VeRi-776. Extensive evaluations on the two datasets show that our methods achieve promising results and outperform the state-of-the-art approaches on VeRi-776.
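The paper's exact SCAN architecture is not reproduced here; the following is only a minimal NumPy sketch of the two attention ideas the abstract describes (channel attention via a squeeze-and-excitation-style bottleneck, spatial attention via a per-position sigmoid gate). All weight shapes, the reduction ratio `r`, and the function names are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(fmap, w1, w2):
    # fmap: (C, H, W). Squeeze: global average pool over spatial dims -> (C,)
    squeezed = fmap.mean(axis=(1, 2))
    # Excitation: two-layer bottleneck producing per-channel weights in (0, 1)
    weights = sigmoid(w2 @ np.maximum(w1 @ squeezed, 0.0))
    # Reweight each channel of the feature map
    return fmap * weights[:, None, None]

def spatial_attention(fmap, w):
    # Collapse channels with a 1x1 projection, then gate each spatial position
    mask = sigmoid(np.tensordot(w, fmap, axes=([0], [0])))  # shape (H, W)
    return fmap * mask[None, :, :]

# Toy feature map and randomly initialized attention parameters
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2          # r is the bottleneck reduction ratio
fmap = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C))
w2 = rng.standard_normal((C, C // r))
w = rng.standard_normal(C)

# Apply both branches in sequence; the refined map keeps the input shape
refined = spatial_attention(channel_attention(fmap, w1, w2), w)
print(refined.shape)  # (8, 4, 4)
```

Because both branches emit multiplicative sigmoid weights, the refined map has the same shape as the input and can be dropped after any convolutional layer, which matches the abstract's description of embedding the branches to refine intermediate feature maps.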

Keywords

Vehicle Re-Identification · Deep Convolutional Neural Network · Attention

Notes

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grants 61572050, 91538111, 61429201, 61620106009, 61332016, U1636214, and 61650202, and by the National 1000 Youth Talents Plan; in part by the National Basic Research Program of China (973 Program) under Grant 2015CB351800; and in part by the Key Research Program of Frontier Sciences, CAS, under Grant QYZDJ-SSW-SYS013.


Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Shangzhi Teng (1)
  • Xiaobin Liu (2)
  • Shiliang Zhang (2)
  • Qingming Huang (1)
  1. University of Chinese Academy of Sciences, Beijing, China
  2. Peking University, Beijing, China
