DSNet: Deep and Shallow Feature Learning for Efficient Visual Tracking

  • Qiangqiang Wu
  • Yan Yan
  • Yanjie Liang
  • Yi Liu
  • Hanzi Wang (corresponding author)
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11365)

Abstract

In recent years, Discriminative Correlation Filter (DCF) based methods have achieved great success in visual tracking. However, multi-resolution convolutional feature maps trained on other tasks, such as image classification, cannot be directly used in the conventional DCF formulation. Furthermore, these high-dimensional feature maps significantly increase the tracking complexity and thus limit the tracking speed. In this paper, we present a deep and shallow feature learning network, namely DSNet, to learn multi-level same-resolution compressed (MSC) features for efficient online tracking in an end-to-end offline manner. Specifically, the proposed DSNet compresses multi-level convolutional features to a uniform spatial resolution. The learned MSC features effectively encode both the appearance and the semantic information of objects in same-resolution feature maps, thus enabling an elegant combination of the MSC features with any DCF-based method. Additionally, a channel reliability measurement (CRM) method is presented to further refine the learned MSC features. We demonstrate the effectiveness of the MSC features learned by the proposed DSNet on two DCF tracking frameworks: the basic DCF framework and the continuous convolution operator framework. Extensive experiments show that the learned MSC features have the appealing advantage of allowing the equipped DCF-based tracking methods to perform favorably against state-of-the-art methods while running at high frame rates.
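The abstract describes two ideas that can be illustrated independently of the paper's actual network: (i) resampling shallow (high-resolution) and deep (low-resolution) convolutional features to one shared spatial resolution before channel compression, and (ii) plugging the resulting same-resolution features into a standard closed-form DCF. The sketch below is a minimal, hypothetical stand-in, not the authors' DSNet: the nearest-neighbour resampling, the random linear channel projection, and the single-sample MOSSE-style filter are all simplifying assumptions made here for illustration.

```python
import numpy as np

def compress_to_same_resolution(feats, out_hw, out_channels):
    """Toy stand-in for MSC features: resample each feature map to a
    shared spatial resolution, stack along channels, then compress
    channels with a fixed random linear projection (hypothetical)."""
    h, w = out_hw
    resampled = []
    for f in feats:  # f has shape (channels, H, W)
        _, H, W = f.shape
        rows = np.arange(h) * H // h          # nearest-neighbour indices
        cols = np.arange(w) * W // w
        resampled.append(f[:, rows][:, :, cols])
    stacked = np.concatenate(resampled, axis=0)            # (C_total, h, w)
    proj = np.random.default_rng(0).standard_normal((out_channels, stacked.shape[0]))
    proj /= np.linalg.norm(proj, axis=1, keepdims=True)
    return np.tensordot(proj, stacked, axes=1)             # (out_channels, h, w)

def train_dcf(feat, label, lam=1e-2):
    """Closed-form single-sample DCF (MOSSE-style) in the Fourier domain:
    H = conj(F) * Y / (sum_c |F_c|^2 + lambda)."""
    F = np.fft.fft2(feat, axes=(-2, -1))
    Y = np.fft.fft2(label)
    return np.conj(F) * Y / (np.sum(np.conj(F) * F, axis=0).real + lam)

def detect(filt, feat):
    """Correlation response: sum the per-channel filtered spectra."""
    F = np.fft.fft2(feat, axes=(-2, -1))
    return np.fft.ifft2(np.sum(filt * F, axis=0)).real

# Usage: fuse a shallow high-res map with a deep low-res map, then track.
shallow = np.random.default_rng(1).random((8, 32, 32))   # appearance cues
deep    = np.random.default_rng(2).random((16, 8, 8))    # semantic cues
msc = compress_to_same_resolution([shallow, deep], (16, 16), 12)

label = np.zeros((16, 16))
label[8, 8] = 1.0                 # desired correlation peak at the target
h = train_dcf(msc, label)
resp = detect(h, msc)
peak = np.unravel_index(resp.argmax(), resp.shape)       # recovers (8, 8)
```

Because both feature levels share one spatial resolution after resampling, a single filter bank covers all channels, which is exactly the property that lets same-resolution compressed features drop into existing DCF formulations without per-layer filters.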

Keywords

Visual tracking · Correlation filter · Deep neural network

Acknowledgments

This work is supported by the National Natural Science Foundation of China (Grant Nos. U1605252, 61872307, 61472334 and 61571379) and the National Key Research and Development Program of China under Grant No. 2017YFB1302400.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Qiangqiang Wu (1)
  • Yan Yan (1)
  • Yanjie Liang (1)
  • Yi Liu (1)
  • Hanzi Wang (1) (corresponding author)

  1. Fujian Key Laboratory of Sensing and Computing for Smart City, School of Information Science and Engineering, Xiamen University, Xiamen, China