A Novel Matching Operator for Visual Object Tracking

Neural Processing Letters

Abstract

Visual object tracking is one of the most active topics in vision system applications. It remains challenging because the appearance of the target object often changes drastically over time under attributes such as motion blur and fast motion. Over the past few years, Siamese architectures have been widely used for object tracking because of their accuracy and robustness. This paper introduces a new matching operator that measures the similarity between the template image and the search region in order to locate the object. The matching operator is a weighted sum of depth-wise cosine similarity and depth-wise cross-correlation. To compute the cosine similarity in each block, the features extracted by the backbone architecture are divided into column vectors in each channel, and the similarity is calculated between corresponding columns. A max-pooling operation is then applied to the similarity values, and the maximum value is assigned to the center of the block. Finally, the cosine similarity and the cross-correlation are combined in each channel. Results on OTB-100 show that the proposed tracker achieves promising performance compared with state-of-the-art tracking techniques on the motion blur, fast motion, and out-of-view attributes.
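The operator described in the abstract can be illustrated with a minimal sketch in plain Python. This is one possible reading of the abstract, not the authors' implementation: a "block" is assumed to be a template-sized window of the search features, the function names are hypothetical, and the weights `w_cos` and `w_corr` are placeholders because the abstract does not state their values.

```python
import math


def cosine(a, b, eps=1e-8):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / (norm + eps)


def match_channel(z, x, w_cos=0.5, w_corr=0.5):
    """Weighted sum of column-wise cosine similarity and cross-correlation
    for a single channel.

    z: template feature map (list of rows); x: search feature map (list of rows).
    w_cos and w_corr are assumed placeholder weights, not values from the paper.
    """
    hz, wz = len(z), len(z[0])
    hx, wx = len(x), len(x[0])
    # Column vectors of the template, one per column index.
    z_cols = [[z[u][v] for u in range(hz)] for v in range(wz)]
    out = []
    for i in range(hx - hz + 1):
        row = []
        for j in range(wx - wz + 1):
            # Cross-correlation of the template with the block at (i, j).
            corr = sum(z[u][v] * x[i + u][j + v]
                       for u in range(hz) for v in range(wz))
            # Cosine similarity between corresponding columns of the
            # template and the block, followed by max-pooling.
            sims = [cosine(z_cols[v], [x[i + u][j + v] for u in range(hz)])
                    for v in range(wz)]
            # One response value per block (the block's output location).
            row.append(w_cos * max(sims) + w_corr * corr)
        out.append(row)
    return out


def match(z_feat, x_feat, w_cos=0.5, w_corr=0.5):
    """Depth-wise matching: every channel is matched independently."""
    return [match_channel(z, x, w_cos, w_corr) for z, x in zip(z_feat, x_feat)]


# Toy example: one channel, 2x2 template, 3x3 search region.
template = [[[1.0, 0.0], [0.0, 1.0]]]
search = [[[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]]
response = match(template, search)
```

In this toy example the response map has one 2x2 channel, and its largest value lies at the block whose content equals the template. In a real tracker the inputs would be backbone feature maps with many channels, and the per-channel responses would feed the tracker's prediction heads.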

Author information

Contributions

Soolmaz Abbasi and Mehdi Rezaeian conceived the presented idea. Soolmaz Abbasi developed the theory and performed the computations and experiments. Mehdi Rezaeian encouraged Soolmaz Abbasi to investigate the advantage of cosine similarity under the fast motion and motion blur attributes and supervised the findings of this work. All authors discussed the results and contributed to the final manuscript.

Corresponding author

Correspondence to Mehdi Rezaeian.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Abbasi, S., Rezaeian, M. A Novel Matching Operator for Visual Object Tracking. Neural Process Lett 55, 9065–9084 (2023). https://doi.org/10.1007/s11063-023-11192-6

  • DOI: https://doi.org/10.1007/s11063-023-11192-6
