A strong feature representation for siamese network tracker

Multimedia Tools and Applications

Abstract

Because AlexNet is too shallow to form a strong feature representation, trackers based on the Siamese network lag behind state-of-the-art algorithms in accuracy. Both deep features and appearance features benefit tracking accuracy. To combine these two kinds of features, first, a modified pre-trained VGG16 network is fine-tuned as one branch of the backbone network. Secondly, an AlexNet branch is attached after the third convolutional layer of VGG16, so that the response maps from both branches are merged to form a preliminary strong feature representation with deep features and shallow appearance features. Thirdly, a new mean Peak-to-side ratio (mPSR) loss is designed to help the network learn target features adaptively. A channel attention block and the Average-Peak-to-Correlation Energy (APCE) are introduced to help select contributing features and suppress distractors. The resulting tracker, SiamPF, uses only ILSVRC2015-VID as its training dataset, yet it achieves excellent performance on OTB-2013, OTB-2015, VOT2015, VOT2016 and VOT2017 while maintaining a real-time speed of 41 FPS on a GTX 1080Ti.
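
The abstract names two response-map quality measures without giving formulas. As a rough illustration only, the sketch below computes APCE in its commonly cited form, |F_max − F_min|² divided by the mean of (F − F_min)², together with a peak-to-sidelobe ratio as used in correlation-filter tracking. It is not the paper's mPSR loss, whose definition does not appear in the abstract, and it assumes that mPSR builds on a ratio of this kind. The function names `apce` and `psr` and the `exclude` parameter are assumptions made for this example.

```python
from statistics import mean, pstdev


def apce(response):
    """Average Peak-to-Correlation Energy of a 2-D response map (list of rows)."""
    values = [v for row in response for v in row]
    f_max, f_min = max(values), min(values)
    # Mean squared distance of every response value from the minimum.
    energy = mean((v - f_min) ** 2 for v in values)
    # A perfectly flat map has zero energy; report 0.0 instead of dividing by zero.
    return (f_max - f_min) ** 2 / energy if energy > 0 else 0.0


def psr(response, exclude=1):
    """Peak-to-sidelobe ratio: (peak - sidelobe mean) / sidelobe std.

    `exclude` (hypothetical parameter) is the half-width of the square window
    around the peak that is left out of the sidelobe statistics. The map must
    be large enough that some cells remain outside that window.
    """
    rows, cols = len(response), len(response[0])
    peak, pr, pc = max(
        (response[r][c], r, c) for r in range(rows) for c in range(cols)
    )
    sidelobe = [
        response[r][c]
        for r in range(rows)
        for c in range(cols)
        if abs(r - pr) > exclude or abs(c - pc) > exclude
    ]
    sigma = pstdev(sidelobe)
    # A constant sidelobe means the peak stands out without bound.
    return (peak - mean(sidelobe)) / sigma if sigma > 0 else float("inf")
```

Both scores are higher for a single sharp peak than for a map with a second, competing peak, which is why measures of this kind are useful for judging whether a response is reliable or disturbed by distractors.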

Acknowledgements

This work is supported by the National Natural Science Foundation of China (NSFC) under Grant No. 61671423 and Grant No. 61271403.

Author information

Corresponding author

Correspondence to Dong Yin.

Ethics declarations

Conflict of interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhou, Z., Zhang, R. & Yin, D. A strong feature representation for siamese network tracker. Multimed Tools Appl 79, 25873–25887 (2020). https://doi.org/10.1007/s11042-020-09164-2

  • DOI: https://doi.org/10.1007/s11042-020-09164-2
