Adaptive Feature Selection Siamese Networks for Visual Tracking

  • Conference paper
Frontiers of Computer Vision (IW-FCV 2020)

Abstract

Recently, template-based discriminative trackers, especially Siamese-network-based trackers, have shown great potential for balancing accuracy and tracking speed. However, it is still difficult for Siamese models to adapt to target variations on the basis of offline learning. In this paper, we introduce an Adaptive Feature Selection Siamese (AFS-Siam) network that learns the most discriminative feature information for better tracking. Features from different layers contain complementary information for discrimination. The proposed adaptive feature selection module selects the most useful feature information from different convolutional layers while suppressing the irrelevant information. The proposed tracking algorithm not only alleviates the over-fitting problem but also increases discriminative ability. The proposed tracking framework is trained end-to-end. Extensive experimental results on OTB50, OTB100, TC-128, and VOT2017 demonstrate that our tracking algorithm performs favorably against other state-of-the-art methods.
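
The abstract does not specify how the adaptive feature selection module is built, so the following is only a minimal sketch of one plausible reading: features from several convolutional layers are stacked, and a learned per-channel gate in (0, 1) scales up useful channels and scales down irrelevant ones. The function name `select_features`, the gating parameters `w` and `b`, and the squeeze-and-gate design are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def select_features(layer_feats, w, b):
    """Channel-wise gating over features stacked from several layers.

    layer_feats: list of arrays shaped (C_i, H, W) with a shared H and W.
    w, b: hypothetical gating parameters with shapes (C, C) and (C,),
          where C is the sum of all C_i. In a real tracker these would be
          learned end-to-end; here they are plain inputs.
    Returns the gated feature stack and the per-channel gates.
    """
    stacked = np.concatenate(layer_feats, axis=0)   # (C, H, W)
    descriptor = stacked.mean(axis=(1, 2))          # global average pool -> (C,)
    gates = sigmoid(w @ descriptor + b)             # one weight in (0, 1) per channel
    return stacked * gates[:, None, None], gates


# Toy usage with random stand-ins for shallow and deep feature maps.
rng = np.random.default_rng(0)
shallow = rng.standard_normal((4, 6, 6))
deep = rng.standard_normal((8, 6, 6))
w = rng.standard_normal((12, 12)) * 0.1
b = np.zeros(12)
selected, gates = select_features([shallow, deep], w, b)
print(selected.shape, gates.shape)
```

Gates close to 1 leave a channel nearly unchanged and gates close to 0 suppress it, which is one simple way to realize "select the useful, suppress the irrelevant" across layers.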

Acknowledgment

This research was supported by the Development Project of Leading Technology for Future Vehicle of the business of Daegu Metropolitan City (No. 20171105).

Author information

Corresponding author

Correspondence to Soon Ki Jung.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Fiaz, M., Rahman, M.M., Mahmood, A., Farooq, S.S., Baek, K.Y., Jung, S.K. (2020). Adaptive Feature Selection Siamese Networks for Visual Tracking. In: Ohyama, W., Jung, S. (eds) Frontiers of Computer Vision. IW-FCV 2020. Communications in Computer and Information Science, vol 1212. Springer, Singapore. https://doi.org/10.1007/978-981-15-4818-5_13

  • DOI: https://doi.org/10.1007/978-981-15-4818-5_13

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-4817-8

  • Online ISBN: 978-981-15-4818-5

  • eBook Packages: Computer Science, Computer Science (R0)
