Abstract
Siamese trackers based on region proposal networks (RPNs) have recently gained great popularity. However, the RPN design requires manual tuning of hyperparameters, such as the object–anchor intersection-over-union (IoU) thresholds and the relative weights of the different tasks, which makes model training difficult and expensive. To address this issue, we propose a novel Siamese adaptive learning network (SiamAda) for visual tracking that allows the model to be trained in a flexible way. Instead of IoU-based anchor assignment, the proposed network uses spatial alignment and the model's learning status as criteria for evaluating anchor quality, and a Gaussian mixture distribution for adaptive assignment. Moreover, to address the inconsistency between classification confidence and localization accuracy, a localization branch is designed to predict the IoU of each candidate anchor box and thereby assess localization quality. Furthermore, to avoid the tricky tuning of the relative weight of each task's loss, multi-task learning with homoscedastic uncertainty is employed to weigh the multiple losses adaptively. Extensive experiments on the challenging OTB2015, VOT2018, DTB70, UAV20L, GOT-10k and LaSOT benchmarks validate the superiority of our tracker, and ablation studies illustrate the advantage of each strategy presented in this paper.
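The homoscedastic-uncertainty weighting mentioned in the abstract can be sketched in a few lines. This is a minimal illustration of the general scheme of Kendall et al. (multi-task learning using uncertainty to weigh losses), not the paper's exact implementation: each task i keeps a learnable scalar s_i = log(sigma_i^2), and the combined objective sums exp(-s_i) * L_i + s_i, so noisier tasks are automatically down-weighted while the + s_i term keeps the s_i from growing without bound. The function name and the example loss values below are illustrative.

```python
import math


def uncertainty_weighted_loss(task_losses, log_variances):
    """Combine per-task losses with homoscedastic-uncertainty weights.

    Each task i contributes exp(-s_i) * L_i + s_i, where
    s_i = log(sigma_i^2) would be a learnable scalar in training.
    With s_i = 0 (sigma_i = 1) this reduces to a plain sum of losses.
    """
    return sum(
        math.exp(-s) * loss + s
        for loss, s in zip(task_losses, log_variances)
    )


# Illustrative values: classification, regression, and localization
# (IoU-prediction) losses combined with different learned uncertainties.
losses = [0.8, 1.5, 0.4]
log_vars = [0.0, math.log(2.0), 0.0]  # task 2 is treated as noisier
total = uncertainty_weighted_loss(losses, log_vars)
```

In practice the s_i would be registered as trainable parameters and optimized jointly with the network, so the relative weights need no manual grid search.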
Data availability and access
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Acknowledgements
This work is supported by the National Natural Science Foundation of China (No. 62075028).
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Xin Lu and Wanqi Yang. The first draft of the manuscript was written by Xin Lu and Fusheng Li. All authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Ethical approval
The data used in this study are public datasets published on official websites and do not involve human participants and/or animals.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lu, X., Li, F. & Yang, W. Siamada: visual tracking based on Siamese adaptive learning network. Neural Comput & Applic 36, 7639–7656 (2024). https://doi.org/10.1007/s00521-024-09481-9