
Efficient scale estimation methods using lightweight deep convolutional neural networks for visual tracking

  • Original Article

Neural Computing and Applications

Abstract

In recent years, visual tracking methods based on discriminative correlation filters (DCFs) have been very promising. However, most of these methods lack robust scale estimation. Although a wide range of recent DCF-based methods exploit features extracted from deep convolutional neural networks (CNNs) in their translation model, the scale of the visual target is still estimated with hand-crafted features. Because exploiting CNNs imposes a high computational burden, this paper exploits pre-trained lightweight CNN models to propose two efficient scale estimation methods that not only improve visual tracking performance but also provide acceptable tracking speeds. The proposed methods are formulated on either holistic or region representations of convolutional feature maps, which integrate efficiently into DCF formulations to learn a robust scale model in the frequency domain. Moreover, in contrast to conventional scale estimation methods that iteratively extract features from different target regions, the proposed methods use one-pass feature extraction processes that significantly improve computational efficiency. Comprehensive experimental results on the OTB-50, OTB-100, TC-128, and VOT-2018 visual tracking datasets demonstrate that the proposed visual tracking methods effectively outperform state-of-the-art methods.
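The scale model sketched in the abstract follows the general DCF recipe: learn a correlation filter in the frequency domain over features sampled at candidate scales, then read the estimated scale off the peak of the filter response. Below is a minimal one-dimensional illustration of that recipe (a MOSSE-style closed-form ridge regression, not the authors' exact formulation; the function names, the toy feature vector, and the 17-scale setup are all illustrative assumptions):

```python
import numpy as np

def train_scale_filter(features, labels, lam=1e-2):
    """Learn a 1-D correlation filter over scale samples in the frequency
    domain (MOSSE-style ridge regression, solved element-wise per bin)."""
    F = np.fft.fft(features)
    G = np.fft.fft(labels)
    # Closed-form solution; lam regularizes near-zero spectral energy
    return (G * np.conj(F)) / (F * np.conj(F) + lam)

def estimate_scale(H, features):
    """Correlate the learned filter with features from candidate scales;
    the index of the response peak is the estimated scale."""
    response = np.real(np.fft.ifft(H * np.fft.fft(features)))
    return int(np.argmax(response))

# Toy setup: 17 candidate scales, Gaussian label peaked at index 8
n_scales, peak = 17, 8
idx = np.arange(n_scales)
labels = np.exp(-0.5 * (idx - peak) ** 2)

rng = np.random.default_rng(0)
features = rng.standard_normal(n_scales)  # stand-in for CNN feature responses

H = train_scale_filter(features, labels)
print(estimate_scale(H, features))  # response peaks at the labeled scale
```

The element-wise frequency-domain solve is what makes DCF scale estimation cheap; the paper's contribution layers lightweight CNN features and a one-pass extraction scheme on top of this kind of formulation, rather than re-extracting features once per candidate scale.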






Acknowledgements

This work was partly supported by a Grant (No. 96013046) from Iran National Science Foundation (INSF).

Author information


Corresponding author

Correspondence to Hossein Ghanei-Yakhdan.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Marvasti-Zadeh, S.M., Ghanei-Yakhdan, H. & Kasaei, S. Efficient scale estimation methods using lightweight deep convolutional neural networks for visual tracking. Neural Comput & Applic 33, 8319–8334 (2021). https://doi.org/10.1007/s00521-020-05586-z

