Fast Visual Tracking via Dense Spatio-temporal Context Learning

  • Kaihua Zhang
  • Lei Zhang
  • Qingshan Liu
  • David Zhang
  • Ming-Hsuan Yang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8693)


In this paper, we present a simple yet fast and robust algorithm that exploits the dense spatio-temporal context for visual tracking. Our approach formulates the spatio-temporal relationships between the object of interest and its locally dense contexts in a Bayesian framework, which models the statistical correlation between simple low-level features (i.e., image intensity and position) from the target and its surrounding regions. Tracking is then posed as computing a confidence map that takes into account the prior information of the target location, thereby effectively alleviating target location ambiguity. We further propose a novel explicit scale adaptation scheme, which handles target scale variations efficiently and effectively. The Fast Fourier Transform (FFT) is adopted for fast learning and detection, requiring only four FFT operations per frame. Implemented in MATLAB without code optimization, the proposed tracker runs at 350 frames per second on an i7 machine. Extensive experimental results show that the proposed algorithm performs favorably against state-of-the-art methods in terms of efficiency, accuracy, and robustness.
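The learn-and-detect loop described in the abstract can be sketched in NumPy. This is a minimal illustration patterned on the paper's formulation, not the authors' reference code: the function names, the Gaussian context weighting, and the parameter values (alpha, beta, sigma, the regularizer eps) are all illustrative assumptions. The key idea is that the spatial context model is learned by a deconvolution in the Fourier domain, and detection is a convolution in the Fourier domain followed by taking the peak of the confidence map.

```python
import numpy as np

def context_prior(patch, center, sigma):
    # Context prior: image intensity weighted by a Gaussian "focus of
    # attention" centered on the current target location.
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    weight = np.exp(-((xs - center[0]) ** 2 + (ys - center[1]) ** 2)
                    / (2.0 * sigma ** 2))
    return patch * weight

def target_confidence(shape, center, alpha=2.25, beta=1.0):
    # Confidence map sharply peaked at the object location; alpha and
    # beta are illustrative shape parameters.
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    dist = np.sqrt((xs - center[0]) ** 2 + (ys - center[1]) ** 2)
    return np.exp(-((dist / alpha) ** beta))

def learn_model(patch, center, sigma=10.0, eps=1e-3):
    # Learning step: solve c = H * prior (circular convolution) for H
    # by element-wise division in the Fourier domain. eps regularizes
    # near-zero spectral values (an assumption, not from the paper).
    c = target_confidence(patch.shape, center)
    p = context_prior(patch, center, sigma)
    return np.fft.fft2(c) / (np.fft.fft2(p) + eps)

def detect(H, patch, prev_center, sigma=10.0):
    # Detection step: convolve the learned model with the new context
    # prior via FFT and take the peak of the resulting confidence map.
    p = context_prior(patch, prev_center, sigma)
    resp = np.real(np.fft.ifft2(H * np.fft.fft2(p)))
    dy, dx = np.unravel_index(np.argmax(resp), resp.shape)
    return (dx, dy)
```

On the frame used for learning, detection recovers the target location, since the response reduces to (approximately) the confidence map itself; across frames, the model is updated over time to capture the spatio-temporal context, and the paper's scale adaptation (not sketched here) rescales the confidence map's peak response.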





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Kaihua Zhang (1)
  • Lei Zhang (2)
  • Qingshan Liu (1)
  • David Zhang (2)
  • Ming-Hsuan Yang (3)

  1. S-mart Group, Nanjing University of Information Science & Technology, China
  2. Dept. of Computing, The Hong Kong Polytechnic University, Hong Kong
  3. Electrical Engineering and Computer Science, University of California at Merced, USA
