Discriminative Tracking by Metric Learning

  • Xiaoyu Wang
  • Gang Hua
  • Tony X. Han
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6313)


We present a discriminative model that casts appearance modeling and visual matching into a single objective for visual tracking. Most previous discriminative models for visual tracking are formulated as supervised learning of binary classifiers, and the continuous output of the classification function is then used as the cost function for tracking. This may be less desirable because the function is optimized for making binary decisions, and such a learning objective may fail to capture the manifold structure of the discriminative appearances. In contrast, our unified formulation is based on a principled metric learning framework, which seeks a discriminative embedding for appearance modeling. In our formulation, both appearance modeling and visual matching are performed online by efficient gradient-based optimization. Our formulation can also handle multiple targets, where the exclusion principle is naturally enforced to handle occlusions. Its efficacy is validated on a wide variety of challenging videos. Our algorithm achieves more persistent results than previous appearance-model-based tracking algorithms.
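The general idea of learning a metric by gradient descent and then matching under it can be illustrated with a small sketch. This is not the formulation of the paper, whose objective is not given in the abstract. It assumes a simple contrastive-style loss chosen purely for illustration: target samples are pulled toward a template, and background samples are pushed until their squared distance exceeds a margin. The names `learn_embedding` and `best_match`, the identity initialization, and the parameters `margin`, `lr`, and `steps` are all hypothetical.

```python
def project(A, x):
    """Apply the linear embedding A (a list of rows) to the vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def sq_dist(A, x, y):
    """Squared distance between x and y under the metric induced by A."""
    diff = [a - b for a, b in zip(x, y)]
    return sum(p * p for p in project(A, diff))


def learn_embedding(template, positives, negatives,
                    margin=1.0, lr=0.05, steps=200):
    """Gradient descent on an illustrative contrastive-style loss (an assumption):
    sum over positives of d^2, plus sum over negatives of max(0, margin - d^2),
    where d^2 = ||A (x - template)||^2."""
    d = len(template)
    # Start from the identity, i.e. the plain Euclidean metric.
    A = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    samples = [(x, True) for x in positives] + [(x, False) for x in negatives]
    for _ in range(steps):
        grad = [[0.0] * d for _ in range(d)]
        for x, is_target in samples:
            diff = [a - b for a, b in zip(x, template)]
            proj = project(A, diff)
            dist = sum(p * p for p in proj)
            if is_target:
                sign = 1.0           # pull target samples toward the template
            elif dist < margin:
                sign = -1.0          # push background samples inside the margin
            else:
                continue             # background already far enough away
            # Gradient of ||A diff||^2 with respect to A is 2 (A diff) diff^T.
            for i in range(d):
                for j in range(d):
                    grad[i][j] += sign * 2.0 * proj[i] * diff[j]
        for i in range(d):
            for j in range(d):
                A[i][j] -= lr * grad[i][j] / len(samples)
    return A


def best_match(A, template, candidates):
    """Index of the candidate closest to the template under the metric A."""
    return min(range(len(candidates)),
               key=lambda i: sq_dist(A, template, candidates[i]))
```

In this toy setting, a feature dimension that varies among target samples is down-weighted by the learned embedding, while a dimension that separates target from background is amplified, so matching under the learned metric can differ from plain Euclidean matching.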


Keywords: Visual Target · Appearance Model · Visual Tracking · Tracking Result · Discriminative Model



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Xiaoyu Wang (1)
  • Gang Hua (2)
  • Tony X. Han (1)
  1. Dept. of ECE, University of Missouri
  2. Nokia Research Center, Hollywood
