Siamese network for real-time tracking with action-selection

  • Zhuoyi Zhang
  • Yifeng Zhang
  • Xu Cheng
  • Ke Li
Original Research Paper

Abstract

Most deep learning based trackers achieve accurate target localization at the cost of a long training phase. In this paper we present a new, efficient tracker built on a Siamese network that can be implemented with limited computational resources. The proposed tracker localizes targets accurately using a fine-tuned model that is straightforward to train. During tracking, we apply a new sampling method, independent of training, called action-selection, which performs selective and flexible sampling step by step with a variable stride and thereby produces bounding boxes with varied aspect ratios. Evaluation on online tracking benchmarks shows that our tracker achieves higher accuracy than most traditional trackers while operating at frame rates beyond real time.
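
The sketch below illustrates the kind of step-wise, variable-stride sampling the abstract describes. It is a minimal, hypothetical reconstruction: the action set, the greedy selection rule, the stride-decay schedule, and the placeholder `score_fn` (standing in for the Siamese similarity between the exemplar and a candidate patch) are assumptions made for illustration, not the authors' exact algorithm.

```python
# A minimal, hypothetical sketch of step-wise "action-selection" sampling
# around the previous bounding box. The action set, greedy rule, stride
# schedule, and score_fn placeholder are assumptions for illustration only.
import numpy as np

# Each action shifts or resizes the box (dx, dy, dw, dh) by one stride unit.
ACTIONS = [
    (-1, 0, 0, 0), (1, 0, 0, 0),    # move left / right
    (0, -1, 0, 0), (0, 1, 0, 0),    # move up / down
    (0, 0, -1, 0), (0, 0, 1, 0),    # narrower / wider
    (0, 0, 0, -1), (0, 0, 0, 1),    # shorter / taller
]

def action_selection_track(score_fn, box, stride=8, min_stride=1,
                           decay=0.5, max_steps=20):
    """Greedily move and reshape box = (x, y, w, h) to maximise score_fn(box).

    score_fn stands in for the Siamese similarity between the exemplar
    feature and the candidate patch. Width and height change independently,
    so candidates can take on varied aspect ratios; the stride shrinks only
    when no action improves the score, giving a coarse-to-fine search.
    """
    x, y, w, h = box
    best = score_fn((x, y, w, h))
    for _ in range(max_steps):
        improved = False
        for dx, dy, dw, dh in ACTIONS:
            cand = (x + dx * stride, y + dy * stride,
                    max(w + dw * stride, 4), max(h + dh * stride, 4))
            s = score_fn(cand)
            if s > best:
                best, (x, y, w, h), improved = s, cand, True
        if not improved:
            stride = int(stride * decay)  # variable stride: refine the search
            if stride < min_stride:
                break
    return (x, y, w, h), best

# Toy usage: the search converges toward a synthetic target box whose
# aspect ratio differs from that of the initial box.
if __name__ == "__main__":
    gt = np.array([120.0, 80.0, 48.0, 96.0])
    score = lambda b: -np.abs(np.array(b, dtype=float) - gt).sum()
    box, s = action_selection_track(score, (100, 100, 60, 60))
    print("estimated box:", box, "score:", float(s))
```

Because width and height are adjusted by separate actions, the sampled boxes can take on varied aspect ratios, and shrinking the stride only when no action improves the score keeps the number of candidates evaluated per frame small.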

Keywords

Computer vision · Object tracking · Siamese network

Notes

Funding

This work was supported by the Natural Science Foundation of Jiangsu Province (Grant no. BK20151102), the Natural Science Foundation of China (Grant nos. 61673108 and 61802058), the Ministry of Education Key Laboratory of Machine Perception, Peking University (Grant no. K-2016-03), and the Open Project Program of the Ministry of Education Key Laboratory of Underwater Acoustic Signal Processing, Southeast University (Grant no. UASP1502).

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. School of Information Science and Engineering, Southeast University, Nanjing, China
  2. Nanjing Institute of Communications Technologies, Nanjing, China
  3. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
  4. School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing, China