SCAN: Spatial and Channel Attention Network for Vehicle Re-Identification

Part of the Lecture Notes in Computer Science book series (LNISA, volume 11166)

Abstract

Most existing methods for vehicle Re-Identification (ReID) extract global features from vehicles. However, because some vehicles share the same model and color, they are hard to distinguish by global appearance alone. Compared with global appearance, some local regions can be more discriminative. Moreover, for methods based on Deep Convolutional Neural Networks (DCNNs), it is not reasonable to weight all feature-map channels equally, as different channels differ in discriminative ability. To automatically discover discriminative regions on vehicles and discriminative channels in networks, we propose a Spatial and Channel Attention Network (SCAN) based on DCNN. Specifically, the attention model contains two branches, a spatial attention branch and a channel attention branch, which are embedded after convolutional layers to refine the feature maps. The spatial and channel attention branches adjust the weights of outputs at different positions and in different channels to highlight the outputs in discriminative regions and channels, respectively. The feature maps are thus refined by our attention model, and more discriminative features can be extracted automatically. We jointly train the attention branches and convolutional layers with triplet loss and cross-entropy loss. We evaluate our method on two large-scale vehicle ReID datasets, VehicleID and VeRi-776. Extensive evaluations on both datasets show that our method achieves promising results and outperforms the state-of-the-art approaches on VeRi-776.
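The re-weighting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: the abstract does not specify how each branch computes its weights, how the two branches are combined, or how the two losses are balanced. The sketch assumes sigmoid gates driven by simple averages, a multiplicative combination of both branches, a margin of 0.3, and an equal-weight sum of the losses. All function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """One weight per channel, from that channel's global average (assumed gate)."""
    weights = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        weights.append(sigmoid(sum(values) / len(values)))
    return weights


def spatial_attention(fmap):
    """One weight per position, from the mean over channels (assumed gate)."""
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [[sigmoid(sum(fmap[k][i][j] for k in range(c)) / c)
             for j in range(w)] for i in range(h)]


def refine(fmap):
    """Scale every activation by its channel weight and its position weight.

    Multiplying the two branch outputs is an assumption of this sketch.
    """
    ch = channel_attention(fmap)
    sp = spatial_attention(fmap)
    return [[[v * ch[k] * sp[i][j] for j, v in enumerate(row)]
             for i, row in enumerate(channel)]
            for k, channel in enumerate(fmap)]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin."""
    return max(0.0, euclidean(anchor, positive)
               - euclidean(anchor, negative) + margin)


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed stably via log-sum-exp."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def joint_loss(anchor, positive, negative, logits, label):
    """Equal-weight sum of the two losses (the weighting is an assumption)."""
    return triplet_loss(anchor, positive, negative) + cross_entropy(logits, label)


# Toy feature map laid out as [channel][row][column].
feature_map = [[[1.0, 2.0], [3.0, 4.0]],
               [[0.0, 0.0], [0.0, 0.0]]]
refined = refine(feature_map)
```

In this toy input the first channel has the larger mean activation, so it receives the larger channel weight, and positions with stronger responses across channels receive larger spatial weights. In a trained network the gates would be learned jointly with the convolutional layers rather than computed from fixed averages.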

Keywords

  • Vehicle Re-Identification
  • Deep Convolutional Neural Network
  • Attention

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61572050, 91538111, 61429201, 61620106009, 61332016, U1636214, and 61650202, and by the National 1000 Youth Talents Plan; in part by the National Basic Research Program of China (973 Program) under Grant 2015CB351800; and in part by the Key Research Program of Frontier Sciences, CAS, under Grant QYZDJ-SSW-SYS013.

Author information

Corresponding author

Correspondence to Shangzhi Teng.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Teng, S., Liu, X., Zhang, S., Huang, Q. (2018). SCAN: Spatial and Channel Attention Network for Vehicle Re-Identification. In: Hong, R., Cheng, WH., Yamasaki, T., Wang, M., Ngo, CW. (eds) Advances in Multimedia Information Processing – PCM 2018. PCM 2018. Lecture Notes in Computer Science, vol 11166. Springer, Cham. https://doi.org/10.1007/978-3-030-00764-5_32

  • DOI: https://doi.org/10.1007/978-3-030-00764-5_32

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00763-8

  • Online ISBN: 978-3-030-00764-5

  • eBook Packages: Computer Science, Computer Science (R0)