Abstract
Fully-Convolutional Siamese Network (SiamFC) is a tracking method based on convolutional neural networks (CNNs). It learns a similarity map from the cross-correlation between the CNN feature representations of the search image and the target image, and tracks the target based on that map. The tracking performance of SiamFC tends to degrade when distractors similar to the target are present or when the target object deforms. On the other hand, recent object tracking methods based on the correlation filter (CF) drift in scenarios such as fast motion and complete occlusion. The analysis showed that, although these two approaches have very different structures, they tend to have complementary characteristics. In this work, we propose a complementary tracking framework that connects SiamFC in parallel with a CF-based tracker. To detect tracking failures, the proposed framework evaluates the response map output by SiamFC using the confidence score defined in this paper. When a tracking failure is detected, the CF-based tracker provides a relative correlated correction. Experiments on OTB2015 show that our tracker obtains relative improvements of up to 5.7%/5.5% (precision score/success score) over the original SiamFC and the CF-based tracker, and performs competitively with advanced trackers.
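The control flow described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the definition of the confidence score, so the peak-to-sidelobe ratio (PSR) is swapped in as a stand-in, and the function names, the `threshold` value, and the use of plain NumPy arrays in place of CNN features are all hypothetical.

```python
import numpy as np


def cross_correlate(search_feat, target_feat):
    """Slide target features (C, h, w) over search features (C, H, W).

    Returns a (H - h + 1, W - w + 1) response map, mimicking the
    similarity map that SiamFC derives from cross-correlation.
    """
    _, h, w = target_feat.shape
    _, H, W = search_feat.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            window = search_feat[:, i:i + h, j:j + w]
            out[i, j] = np.sum(window * target_feat)
    return out


def peak_to_sidelobe_ratio(response, exclude=1):
    """Stand-in confidence score (assumption: NOT the paper's score).

    Compares the peak with the mean and standard deviation of the
    response outside a small window around the peak.
    """
    py, px = np.unravel_index(np.argmax(response), response.shape)
    mask = np.ones(response.shape, dtype=bool)
    mask[max(py - exclude, 0):py + exclude + 1,
         max(px - exclude, 0):px + exclude + 1] = False
    sidelobe = response[mask]
    return (response[py, px] - sidelobe.mean()) / (sidelobe.std() + 1e-8)


def select_position(siam_response, cf_position, threshold=5.0):
    """Use the SiamFC peak when confident, else fall back to the CF tracker.

    `cf_position` is the (row, col) estimate of a hypothetical CF-based
    tracker running in parallel; `threshold` is an illustrative value.
    """
    if peak_to_sidelobe_ratio(siam_response) >= threshold:
        py, px = np.unravel_index(np.argmax(siam_response), siam_response.shape)
        return (int(py), int(px)), "siamfc"
    return cf_position, "cf"
```

A sharply peaked response map keeps the SiamFC estimate, while a flat or ambiguous map hands control to the CF tracker's estimate. Any other confidence measure could replace `peak_to_sidelobe_ratio` without changing the surrounding logic.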
Acknowledgements
This study is supported by JSPS/JAPAN KAKENHI (Grants-in-Aid for Scientific Research) #JP20K11955.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Honda, K., Fujita, H. (2021). Combining Siamese Network and Correlation Filter for Complementary Object Tracking. In: Fujita, H., Selamat, A., Lin, J.CW., Ali, M. (eds) Advances and Trends in Artificial Intelligence. Artificial Intelligence Practices. IEA/AIE 2021. Lecture Notes in Computer Science, vol 12798. Springer, Cham. https://doi.org/10.1007/978-3-030-79457-6_13
DOI: https://doi.org/10.1007/978-3-030-79457-6_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-79456-9
Online ISBN: 978-3-030-79457-6
eBook Packages: Computer Science (R0)