PG-Net: Pixel to Global Matching Network for Visual Tracking

Liao, Bingyan; Wang, Chenye; Wang, Yayun; Wang, Yaonong; Yin, Jun

doi:10.1007/978-3-030-58542-6_26

Bingyan Liao¹²,
Chenye Wang¹²,
Yayun Wang¹²,
Yaonong Wang¹² &
…
Jun Yin¹²

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 12367))

Included in the following conference series:

European Conference on Computer Vision

3333 Accesses
52 Citations

Abstract

Siamese neural network has been well investigated by tracking frameworks due to its fast speed and high accuracy. However, very few efforts were spent on background-extraction by those approaches. In this paper, a Pixel to Global Matching Network (PG-Net) is proposed to suppress t+he influence of background in search image while achieving state-of-the-art tracking performance. To achieve this purpose, each pixel on search feature is utilized to calculate the similarity with global template feature. This calculation method can appropriately reduce the matching area, thus introducing less background interference. In addition, we propose a new tracking framework to perform correlation-shared tracking and multiple losses for training, which not only reduce the computational burden but also improve the performance. We conduct comparison experiments on various public tracking datasets, which obtains state-of-the-art performance while running with fast speed.

B. Liao and C. Wang—Equal contribution.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Bai, S., He, Z., Xu, T.B., Zhu, Z., Dong, Y., Bai, H.: Multi-hierarchical independent correlation filters for visual tracking. arXiv preprint arXiv:1811.10302 (2018)
Bertinetto, L., Valmadre, J., Henriques, J.F., Vedaldi, A., Torr, P.H.S.: Fully-convolutional Siamese networks for object tracking. In: Hua, G., Jégou, H. (eds.) ECCV 2016. LNCS, vol. 9914, pp. 850–865. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-48881-3_56
Chapter Google Scholar
Bhat, G., Johnander, J., Danelljan, M., Shahbaz Khan, F., Felsberg, M.: Unveiling the power of deep tracking. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 483–498 (2018)
Google Scholar
Bolme, D.S., Beveridge, J.R., Draper, B.A., Lui, Y.M.: Visual object tracking using adaptive correlation filters. In: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 2544–2550. IEEE (2010)
Google Scholar
Danelljan, M., Bhat, G., Khan, F.S., Felsberg, M.: Atom: accurate tracking by overlap maximization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4660–4669 (2019)
Google Scholar
Danelljan, M., Häger, G., Khan, F., Felsberg, M.: Accurate scale estimation for robust visual tracking. In: British Machine Vision Conference, Nottingham, 1–5 September 2014. BMVA Press (2014)
Google Scholar
Fan, H., et al.: LaSOT: a high-quality benchmark for large-scale single object tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5374–5383 (2019)
Google Scholar
Fan, H., Ling, H.: Parallel tracking and verifying: a framework for real-time and high accuracy visual tracking. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 5486–5494 (2017)
Google Scholar
Fan, H., Ling, H.: Siamese cascaded region proposal networks for real-time visual tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7952–7961 (2019)
Google Scholar
Fu, J., et al.: Dual attention network for scene segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3146–3154 (2019)
Google Scholar
Guo, Q., Feng, W., Zhou, C., Huang, R., Wan, L., Wang, S.: Learning dynamic Siamese network for visual object tracking. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1763–1771 (2017)
Google Scholar
Gupta, M., Kumar, S., Behera, L., Subramanian, V.K.: A novel vision-based tracking algorithm for a human-following mobile robot. IEEE Trans. Syst. Man Cybern. Syst. 47(7), 1415–1427 (2016)
Article Google Scholar
He, A., Luo, C., Tian, X., Zeng, W.: A twofold Siamese network for real-time object tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4834–4843 (2018)
Google Scholar
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Google Scholar
Henriques, J.F., Caseiro, R., Martins, P., Batista, J.: High-speed tracking with kernelized correlation filters. IEEE Trans. Pattern Anal. Mach. Intell. 37(3), 583–596 (2014)
Article Google Scholar
Kristan, M., et al.: The sixth visual object tracking vot2018 challenge results. In: Proceedings of the European Conference on Computer Vision (ECCV) (2018)
Google Scholar
Li, B., Wu, W., Wang, Q., Zhang, F., Xing, J., Yan, J.: SiamRPN++: evolution of Siamese visual tracking with very deep networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4282–4291 (2019)
Google Scholar
Li, B., Yan, J., Wu, W., Zhu, Z., Hu, X.: High performance visual tracking with Siamese region proposal network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8971–8980 (2018)
Google Scholar
Li, Y., Zhang, X.: SiamVGG: visual tracking using deeper siamese networks. arXiv preprint arXiv:1902.02804 (2019)
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Chapter Google Scholar
Liu, L., Xing, J., Ai, H., Ruan, X.: Hand posture recognition using finger geometric feature. In: Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012), pp. 565–568. IEEE (2012)
Google Scholar
Lukežič, A., Zajc, L.Č., Vojíř, T., Matas, J., Kristan, M.: FuCoLoT – a fully-correlational long-term tracker. In: Jawahar, C.V., Li, H., Mori, G., Schindler, K. (eds.) ACCV 2018. LNCS, vol. 11362, pp. 595–611. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-20890-5_38
Chapter Google Scholar
Real, E., Shlens, J., Mazzocchi, S., Pan, X., Vanhoucke, V.: YouTube-BoundingBoxes: a large high-precision human-annotated data set for object detection in video. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5296–5305 (2017)
Google Scholar
Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 211–252 (2015)
Article MathSciNet Google Scholar
Saunier, N., Sayed, T.: A feature-based tracking algorithm for vehicles in intersections. In: The 3rd Canadian Conference on Computer and Robot Vision (CRV 2006), p. 59. IEEE (2006)
Google Scholar
Tao, R., Gavves, E., Smeulders, A.W.: Siamese instance search for tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1420–1429 (2016)
Google Scholar
Valmadre, J., Bertinetto, L., Henriques, J., Vedaldi, A., Torr, P.H.: End-to-end representation learning for correlation filter based tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2805–2813 (2017)
Google Scholar
Wang, Q., Teng, Z., Xing, J., Gao, J., Hu, W., Maybank, S.: Learning attentions: residual attentional Siamese network for high performance online visual tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4854–4863 (2018)
Google Scholar
Wang, Q., Zhang, L., Bertinetto, L., Hu, W., Torr, P.H.: Fast online object tracking and segmentation: a unifying approach. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1328–1338 (2019)
Google Scholar
Wu, Y., Lim, J., Yang, M.H.: Object tracking benchmark. IEEE Trans. Pattern Anal. Mach. Intell. 37(9), 1834–1848 (2015)
Article Google Scholar
Xing, J., Ai, H., Lao, S.: Multiple human tracking based on multi-view upper-body detection and discriminative learning. In: 2010 20th International Conference on Pattern Recognition, pp. 1698–1701. IEEE (2010)
Google Scholar
Xu, T., Feng, Z.H., Wu, X.J., Kittler, J.: Joint group feature selection and discriminative filter learning for robust visual object tracking. In: The IEEE International Conference on Computer Vision (ICCV), October 2019
Google Scholar
Xu, T., Feng, Z.H., Wu, X.J., Kittler, J.: Learning adaptive discriminative correlation filters via temporal consistency preserving spatial feature selection for robust visual object tracking. IEEE Trans. Image Process. 28, 5596–5609 (2019)
Article MathSciNet Google Scholar
Yan, B., Zhao, H., Wang, D., Lu, H., Yang, X.: ‘Skimming-perusal’ tracking: a framework for real-time and robust long-term tracking. In: The IEEE International Conference on Computer Vision (ICCV), October 2019
Google Scholar
Zhang, Y., Wang, D., Wang, L., Qi, J., Lu, H.: Learning regression and verification networks for long-term visual tracking. arXiv preprint arXiv:1809.04320 (2018)
Zhang, Z., Peng, H.: Deeper and wider Siamese networks for real-time visual tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4591–4600 (2019)
Google Scholar
Zhao, Q., et al.: M2Det: a single-shot object detector based on multi-level feature pyramid network. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 9259–9266 (2019)
Google Scholar
Zhu, Z., Wang, Q., Li, B., Wu, W., Yan, J., Hu, W.: Distractor-aware Siamese networks for visual object tracking. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 101–117 (2018)
Google Scholar

Download references

Acknowledgment

This work is generous supported by DAHUA Advanced Institute and Deep Learning Platform of Jinn.

Author information

Authors and Affiliations

ZheJiang Dahua Technology Co., Ltd., Hangzhou, China
Bingyan Liao, Chenye Wang, Yayun Wang, Yaonong Wang & Jun Yin

Authors

Bingyan Liao
View author publications
You can also search for this author in PubMed Google Scholar
Chenye Wang
View author publications
You can also search for this author in PubMed Google Scholar
Yayun Wang
View author publications
You can also search for this author in PubMed Google Scholar
Yaonong Wang
View author publications
You can also search for this author in PubMed Google Scholar
Jun Yin
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Bingyan Liao .

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 2221 KB)

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Liao, B., Wang, C., Wang, Y., Wang, Y., Yin, J. (2020). PG-Net: Pixel to Global Matching Network for Visual Tracking. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science(), vol 12367. Springer, Cham. https://doi.org/10.1007/978-3-030-58542-6_26

Download citation

DOI: https://doi.org/10.1007/978-3-030-58542-6_26
Published: 17 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58541-9
Online ISBN: 978-3-030-58542-6
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics