
A target response adaptive correlation filter tracker with spatial attention

Multimedia Tools and Applications

Abstract

In recent years, many efficient and accurate algorithms have been proposed for target tracking. Existing correlation filter (CF) based trackers augment the training set with cyclic shifts of a single sample and take the location of the maximum of the response map as the most likely target position. Because cyclically shifted samples are not real samples, a high response may also appear in non-target regions, which causes not only the boundary effect but also localization errors. Training the filter at a wrong position then leads to tracking drift. This paper proposes a target response adaptive correlation filter tracker with spatial attention to address these problems. First, the tracker learns more useful features by making full use of the context around the target area, and it introduces a spatial attention mechanism to suppress background information and reduce the boundary effect. Second, it captures the dynamic change of the target response and selects the most reliable response map when an interfering response map appears, which reduces the probability of mispositioning and yields a more discriminative filter. Extensive experiments on popular benchmarks (OTB2013, OTB100 and VOT2016) demonstrate that the proposed tracker performs favorably against other state-of-the-art tracking methods while running in real time.
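The cyclic-shift training and peak-of-response localization described above can be illustrated with a minimal 1-D sketch in NumPy. This is a plain ridge-regression correlation filter, not the tracker proposed in the paper: it has no context term, no spatial attention and no response adaptation. The function names (`train_filter`, `respond`, `estimate_shift`), the signal length, the Gaussian label width and the regularization weight `lam` are all assumptions chosen for illustration.

```python
import numpy as np


def train_filter(x, y, lam=1e-2):
    """Ridge regression over all cyclic shifts of x, solved per frequency.

    Row i of the implicit data matrix is np.roll(x, i); with that convention
    the solution is w_hat = x_hat * y_hat / (|x_hat|^2 + lam).
    """
    x_hat = np.fft.fft(x)
    y_hat = np.fft.fft(y)
    return x_hat * y_hat / (x_hat * np.conj(x_hat) + lam)


def respond(w_hat, z):
    """Response map: score of every cyclic shift of the search signal z."""
    z_hat = np.fft.fft(z)
    return np.real(np.fft.ifft(np.conj(z_hat) * w_hat))


def estimate_shift(w_hat, z):
    """Translate the peak of the response map into a displacement of z."""
    response = respond(w_hat, z)
    return int((-np.argmax(response)) % len(z))


# Illustrative data: a random 1-D "appearance" and a Gaussian label peaked at 0.
rng = np.random.default_rng(0)
n, lam = 64, 1e-2
x = rng.standard_normal(n)
idx = np.arange(n)
dist = np.minimum(idx, n - idx)            # circular distance to index 0
y = np.exp(-dist ** 2 / (2 * 2.0 ** 2))

w_hat = train_filter(x, y, lam)
true_shift = 5
z = np.roll(x, true_shift)                 # the target moved by 5 samples
shift = estimate_shift(w_hat, z)
print("estimated shift:", shift)
```

Because every training sample is a cyclic shift rather than a real image patch, the same mechanism can also produce strong responses away from the target when the signal wraps around the window boundary, which is the boundary effect the abstract refers to.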



Author information

Corresponding author

Correspondence to Jing Li.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

The derivation of (6) and (7) is shown as follows:

Equation (5) can be rewritten in terms of z with \({z}^{T} = \left[ {w^{T} \;\; y^{T} } \right]\):

$$ \begin{array}{@{}rcl@{}} f(z) &=& \left\| \left[ {\begin{array}{cc} A_{0} & - I \end{array}} \right]z \right\|_{2}^{2} + \lambda_{1} \left\| \left[ {\begin{array}{cc} S & 0 \end{array}} \right]z \right\|_{2}^{2} + \lambda_{2} \left\| \left[ {\begin{array}{cc} 0 & I \end{array}} \right]z - y_{0} \right\|_{2}^{2} \\ &&+ \lambda_{3} \sum\limits_{i = 1}^{k} \left\| A_{i} \left[ {\begin{array}{cc} I & 0 \end{array}} \right]z \right\|_{2}^{2} \end{array} $$
(13)

where \(w \in \mathbb{R}^{n}\), \(y \in \mathbb{R}^{n}\), \(y_{0} \in \mathbb{R}^{n}\) and \(z \in \mathbb{R}^{2n}\), then:

$$ \begin{array}{@{}rcl@{}} \frac{1}{2}\nabla_{z} f(z) &=& \left[ {\begin{array}{cc} {A_{0}^{T} A_{0}} & { - A_{0}^{T} } \\ { - A_{0}} & I \end{array}} \right]z + \lambda_{1} \left[ {\begin{array}{cc} {S^{T} S} & 0 \\ 0 & 0 \end{array}} \right]z + \lambda_{2} \left[ {\begin{array}{cc} 0 & 0 \\ 0 & I \end{array}} \right]z \\ &&- \lambda_{2} \left[ {\begin{array}{c} 0 \\ I \end{array}} \right]y_{0} + \lambda_{3} \sum\limits_{i = 1}^{k} {\left[ {\begin{array}{cc} {A_{i}^{T} A_{i} } & 0 \\ 0 & 0 \end{array}} \right]} z = 0 \end{array} $$

so

$$ \left[ {\begin{array}{cc} {A_{0}^{T} A_{0} + \lambda_{1} S^{T} S + \lambda_{3} \sum\limits_{i = 1}^{k} {A_{i}^{T} A_{i} } } & { - A_{0}^{T} } \\ { - A_{0}} & {(1 + \lambda_{2} )I} \end{array}} \right]z = \lambda_{2} \left[ {\begin{array}{c} 0 \\ I \end{array}} \right]y_{0} $$

Diagonalizing the circulant matrices with the DFT matrix F (for example \(A_{0} = F\,diag(\hat a_{0})F^{H}\)) gives:

$$ \left[ {\begin{array}{cc} F & 0 \\ 0 & F \end{array}} \right]\left[ {\begin{array}{cc} {diag(\hat a_{0} \odot \hat a_{0}^{*} + \lambda_{1} \hat s \odot \hat s^{*} + \lambda_{3} \sum\limits_{i = 1}^{k} {\hat a_{i} \odot \hat a_{i}^{*} } )} & { - diag(\hat a_{0}^{*} )} \\ { - diag(\hat a_{0} )} & {diag(1 + \lambda_{2} )} \end{array}} \right]\left[ {\begin{array}{cc} {F^{H} } & 0 \\ 0 & {F^{H} } \end{array}} \right]z = \lambda_{2} \left[ {\begin{array}{c} 0 \\ I \end{array}} \right]y_{0} $$

Multiplying both sides from the left by the block-diagonal matrix of \(F^{H}\) (F is unitary) and writing \(\hat w^{*} = F^{H} w\), \(\hat y^{*} = F^{H} y\):

$$ \left[ {\begin{array}{cc} {diag(\hat a_{0} \odot \hat a_{0}^{*} + \lambda_{1} \hat s \odot \hat s^{*} + \lambda_{3} \sum\limits_{i = 1}^{k} {\hat a_{i} \odot \hat a_{i}^{*} } )} & { - diag(\hat a_{0}^{*} )} \\ { - diag(\hat a_{0} )} & {diag(1 + \lambda_{2} )} \end{array}} \right]\left[ {\begin{array}{c} {\hat w^{*} } \\ {\hat y^{*} } \end{array}} \right] = \lambda_{2} \left[ {\begin{array}{c} 0 \\ {F^{H} } \end{array}} \right]y_{0} $$

$$ \left[ {\begin{array}{c} {\hat w^{*} } \\ {\hat y^{*} } \end{array}} \right] = \lambda_{2} \left[ {\begin{array}{cc} {diag(\hat a_{0} \odot \hat a_{0}^{*} + \lambda_{1} \hat s \odot \hat s^{*} + \lambda_{3} \sum\limits_{i = 1}^{k} {\hat a_{i} \odot \hat a_{i}^{*} } )} & { - diag(\hat a_{0}^{*} )} \\ { - diag(\hat a_{0} )} & {diag(1 + \lambda_{2} )} \end{array}} \right]^{ - 1} \left[ {\begin{array}{c} 0 \\ {F^{H} } \end{array}} \right]y_{0} $$

Note that the block matrix inversion lemma states:

$$ \left[ {\begin{array}{cc} B & N \\ V & C \end{array}} \right]^{ - 1} = \left[ {\begin{array}{cc} {(B - NC^{ - 1} V)^{ - 1} } & { - (B - NC^{ - 1} V)^{ - 1} NC^{ - 1} } \\ { - C^{ - 1} V(B - NC^{ - 1} V)^{ - 1} } & {C^{ - 1} V(B - NC^{ - 1} V)^{ - 1} NC^{ - 1} + C^{ - 1} } \end{array}} \right] $$
(14)
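As a sanity check on the sign pattern of this lemma, the following NumPy sketch compares the block formula (Schur complement of C, with a minus sign on both off-diagonal blocks) against a direct inverse. The helper name `block_inverse`, the block size and the random test matrices are assumptions made for illustration only.

```python
import numpy as np


def block_inverse(B, N, V, C):
    """Inverse of [[B, N], [V, C]] via the Schur complement of C."""
    C_inv = np.linalg.inv(C)
    S_inv = np.linalg.inv(B - N @ C_inv @ V)   # (B - N C^{-1} V)^{-1}
    top = np.hstack([S_inv, -S_inv @ N @ C_inv])
    bottom = np.hstack([-C_inv @ V @ S_inv,
                        C_inv + C_inv @ V @ S_inv @ N @ C_inv])
    return np.vstack([top, bottom])


# Well-conditioned random blocks (diagonally dominated) so all inverses exist.
rng = np.random.default_rng(0)
n = 4
B = 4.0 * np.eye(n) + 0.3 * rng.standard_normal((n, n))
C = 4.0 * np.eye(n) + 0.3 * rng.standard_normal((n, n))
N = 0.3 * rng.standard_normal((n, n))
V = 0.3 * rng.standard_normal((n, n))

M = np.block([[B, N], [V, C]])
max_err = np.max(np.abs(block_inverse(B, N, V, C) - np.linalg.inv(M)))
print("max abs deviation from direct inverse:", max_err)
```

Only the top-right block, \(-(B - NC^{-1}V)^{-1}NC^{-1}\), is needed in the derivation, because the right-hand side of the linear system has a zero upper half.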

Then:

$$ B - NC^{- 1} V{ = }diag(\hat a_{0} \odot \hat a_{0}^ * + {\lambda_{1} \hat s \odot \hat s^ * } + \lambda_{3} \sum\limits_{i = 1}^{k} {\hat a_{i} \odot \hat a_{i}^ * } ) - diag(\hat a_{0}^ * )diag^{- 1} (1 + \lambda_{2} )diag(\hat a_{0} ) $$
$$ (B - NC^{- 1} V)^{- 1} = diag\left( {\frac{{1 + \lambda_{2} }}{{\lambda_{2} (\hat a_{0} \odot \hat a_{0}^ * ) + (1 + \lambda_{2} )({\lambda_{1} \hat s \odot \hat s^ * } + \lambda_{3} \sum\limits_{i = 1}^{k} {\hat a_{i} \odot \hat a_{i}^ * } )}}} \right) $$
$$ - (B\! -\! NC^{- 1} V)^{- 1} NC^{- 1} = diag\left( {\frac{{\hat a_{0}^ * }}{{\lambda_{2} (\hat a_{0} \odot \hat a_{0}^ * ) + (1 + \lambda_{2} )({\lambda_{1} \hat s \odot \hat s^ * } + \lambda_{3} \sum\limits_{i = 1}^{k} {\hat a_{i} \odot \hat a_{i}^ * } )}}} \right) $$

Since:

$$ \hat w^{*} = - \lambda_{2} (B - NC^{- 1} V)^{- 1} NC^{- 1} F^{H} y_{0} $$
(15)

Then:

$$ \hat w^{*} = \frac{{\lambda_{2} (\hat a_{0}^{*} \odot \hat y_{0}^{*} )}}{{\lambda_{2} (\hat a_{0} \odot \hat a_{0}^{*} ) + (1 + \lambda_{2} )(\lambda_{1} \hat s \odot \hat s^{*} + \lambda_{3} \sum\limits_{i = 1}^{k} {\hat a_{i} \odot \hat a_{i}^{*} } )}} $$

$$ \hat w = \frac{{\lambda_{2} (\hat a_{0} \odot \hat y_{0} )}}{{\lambda_{2} (\hat a_{0} \odot \hat a_{0}^ * ) + (1 + \lambda_{2} )({\lambda_{1} \hat s \odot \hat s^ * } + \lambda_{3} \sum\limits_{i = 1}^{k} {\hat a_{i} \odot \hat a_{i}^ * } )}} $$

Since:

$$ w = {A_{0}^{T}} \alpha $$
(16)

Then:

$$ \hat \alpha = \frac{{\lambda_{2} \hat y_{0} }}{{\lambda_{2} (\hat a_{0} \odot \hat a_{0}^{*} ) + (1 + \lambda_{2} )(\lambda_{1} \hat s \odot \hat s^{*} + \lambda_{3} \sum\limits_{i = 1}^{k} {\hat a_{i} \odot \hat a_{i}^{*} } )}} $$
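The closed forms for \(\hat w\) and \(\hat \alpha\) can be checked numerically on a small 1-D problem. The sketch below builds the block linear system of the derivation with explicit circulant matrices, solves it directly, and compares the result with the per-frequency formula. It rests on several assumptions that are not stated in the source: the helper `circulant` uses rows `np.roll(a, i)` (a convention under which NumPy's FFT reproduces the formula with \(\hat a_{0}\) in the numerator), the attention term S is treated as circulant as the derivation implies, and the sizes, weights and random data are arbitrary.

```python
import numpy as np


def circulant(a):
    """Matrix whose i-th row is a cyclically shifted right by i (assumed convention)."""
    return np.stack([np.roll(a, i) for i in range(len(a))])


rng = np.random.default_rng(0)
n, k = 16, 3
lam1, lam2, lam3 = 0.5, 2.0, 0.25          # illustrative weights

a0 = rng.standard_normal(n)                # target sample
ctx = rng.standard_normal((k, n))          # k context samples a_1..a_k
s = rng.standard_normal(n)                 # generator of the attention term S
idx = np.arange(n)
y0 = np.exp(-np.minimum(idx, n - idx) ** 2 / 2.0)   # desired response

# Direct solution of the 2n x 2n block system for z = [w; y].
A0, S = circulant(a0), circulant(s)
G = A0.T @ A0 + lam1 * S.T @ S + lam3 * sum(circulant(a).T @ circulant(a) for a in ctx)
M = np.block([[G, -A0.T], [-A0, (1 + lam2) * np.eye(n)]])
rhs = lam2 * np.concatenate([np.zeros(n), y0])
w_direct = np.linalg.solve(M, rhs)[:n]

# Closed form, evaluated independently at every frequency.
a0_hat, s_hat, y0_hat = np.fft.fft(a0), np.fft.fft(s), np.fft.fft(y0)
ctx_hat = np.fft.fft(ctx, axis=1)
reg = lam1 * np.abs(s_hat) ** 2 + lam3 * np.sum(np.abs(ctx_hat) ** 2, axis=0)
denom = lam2 * np.abs(a0_hat) ** 2 + (1 + lam2) * reg
w_closed = np.real(np.fft.ifft(lam2 * a0_hat * y0_hat / denom))
alpha = np.real(np.fft.ifft(lam2 * y0_hat / denom))

print("max |w_direct - w_closed|:", np.max(np.abs(w_direct - w_closed)))
print("max |A0^T alpha - w_closed|:", np.max(np.abs(A0.T @ alpha - w_closed)))
```

The direct solve costs on the order of \(n^{3}\) operations, while the closed form needs only a handful of FFTs and element-wise operations, which is what makes the filter update cheap.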

Cite this article

Zhou, Y., Li, J., Du, B. et al. A target response adaptive correlation filter tracker with spatial attention. Multimed Tools Appl 79, 20521–20543 (2020). https://doi.org/10.1007/s11042-020-08839-0
