Visual object tracking based on residual network and cascaded correlation filters

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

Significant progress has recently been made in object tracking. In particular, trackers based on deep learning and trackers based on correlation filters have both achieved excellent performance. However, object tracking still faces challenges such as deformation and illumination variation, under which the accuracy and precision of tracking algorithms drop sharply. A solution to this problem is urgently needed. In this paper, we propose a tracking algorithm that combines features extracted by a residual network (ResNet features) with cascaded correlation filters to improve precision and accuracy. First, features extracted by a deep residual network trained on other image processing datasets are robust and retain higher resolution; we therefore use a ResNet-101 pretrained offline and take the features from its middle and high layers to represent the target appearance. ResNet-101 is deeper than many other deep neural networks, so its features carry more semantic information. Second, we propose a more effective way of combining correlation filters: cascaded correlation filters learned on handcrafted features and on middle-level and high-level features from the residual network. Handcrafted features localize the target precisely because they contain more spatial detail, while ResNet features are robust to changes in target appearance because they retain more semantic information. Finally, we conduct extensive experiments on the OTB2013 and OTB2015 benchmarks. The experimental results show that our tracker achieves high performance under the various challenges in these benchmarks and performs favorably against other state-of-the-art trackers.
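
The general idea of a correlation filter cascade can be illustrated with a short sketch. The code below is not the authors' implementation: it assumes a single-channel, MOSSE-style filter solved in closed form in the Fourier domain, uses random arrays in place of real ResNet-101 and handcrafted feature maps, and refines the target position from the most semantic response to the most spatially detailed one inside a small search window. The function names, the regularization value `lam`, the label width `sigma`, and the window `radius` are all hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_label(shape, sigma=2.0):
    """Desired filter output: a Gaussian peak at the patch center."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = h // 2, w // 2
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

def train_filter(feat, label, lam=1e-2):
    """Closed-form ridge regression in the Fourier domain (MOSSE-style).
    Returns the conjugate filter H* = G conj(F) / (F conj(F) + lam)."""
    F = np.fft.fft2(feat)
    G = np.fft.fft2(label)
    return G * np.conj(F) / (F * np.conj(F) + lam)

def response(filt, feat):
    """Correlate a trained filter with a new feature map."""
    return np.real(np.fft.ifft2(filt * np.fft.fft2(feat)))

def cascaded_locate(responses, radius=3):
    """Coarse-to-fine search (an assumed cascade rule, not the paper's):
    responses are ordered from most semantic to most spatially detailed.
    The first response gives a global peak; each later response only
    refines that peak inside a (2 * radius + 1)-wide window."""
    h, w = responses[0].shape
    py, px = np.unravel_index(np.argmax(responses[0]), (h, w))
    for r in responses[1:]:
        y0, y1 = max(py - radius, 0), min(py + radius + 1, h)
        x0, x1 = max(px - radius, 0), min(px + radius + 1, w)
        window = r[y0:y1, x0:x1]
        dy, dx = np.unravel_index(np.argmax(window), window.shape)
        py, px = y0 + dy, x0 + dx
    return int(py), int(px)

def demo():
    """Train one filter per (simulated) feature level, then locate a
    target that has moved by 3 rows down and 2 columns left."""
    rng = np.random.default_rng(0)
    label = gaussian_label((32, 32))
    # Stand-ins for high-level, middle-level and handcrafted features.
    feats = [rng.standard_normal((32, 32)) for _ in range(3)]
    filters = [train_filter(f, label) for f in feats]
    shifted = [np.roll(f, (3, -2), axis=(0, 1)) for f in feats]
    responses = [response(h, z) for h, z in zip(filters, shifted)]
    return cascaded_locate(responses)
```

Calling `demo()` returns the row and column of the refined peak; its offset from the patch center `(16, 16)` is the estimated displacement of the target. In a real tracker the random arrays would be replaced by multi-channel feature maps resized to a common resolution, and the filters would be updated over time.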

Acknowledgements

This research was funded in part by the National Natural Science Foundation of China under Grant Nos. 61972056 and 61772454, the "Double First-class" International Cooperation and Development Scientific Research Project of Changsha University of Science and Technology under Grant No. 2019IC34, the Postgraduate Scientific Research Innovation Fund of Hunan Province under Grant Nos. CX20190696 and CX20190695, and the Postgraduate Training Innovation Base Construction Project of Hunan Province under Grant No. 2019-248-51.

Author information

Corresponding author

Correspondence to Jianming Zhang.

About this article

Cite this article

Zhang, J., Sun, J., Wang, J. et al. Visual object tracking based on residual network and cascaded correlation filters. J Ambient Intell Human Comput 12, 8427–8440 (2021). https://doi.org/10.1007/s12652-020-02572-0

  • DOI: https://doi.org/10.1007/s12652-020-02572-0