
Robust Visual Tracking Via Consistent Low-Rank Sparse Learning

Abstract

Object tracking is the process of determining the states of a target in consecutive video frames based on properties of motion and appearance consistency. In this paper, we propose a consistent low-rank sparse tracker (CLRST) that builds upon the particle filter framework for tracking. By exploiting temporal consistency, the proposed CLRST algorithm adaptively prunes and selects candidate particles. Using sparse linear combinations of dictionary templates, the proposed method jointly learns the sparse representations of the image regions corresponding to candidate particles by exploiting the underlying low-rank constraints. In addition, the proposed CLRST algorithm is computationally attractive: the temporal consistency property helps prune particles, and the low-rank minimization problem for learning joint sparse representations can be solved efficiently by a sequence of closed-form update operations. We evaluate the proposed CLRST algorithm against 14 state-of-the-art tracking methods on a set of 25 challenging image sequences. Experimental results show that the CLRST algorithm performs favorably against state-of-the-art tracking methods in terms of accuracy and execution time.
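The abstract does not spell out which closed-form update operations are used. As an illustration only, one closed-form update that commonly appears in sparse solvers is element-wise soft-thresholding, the proximal operator of the \(l_1\) norm. The sketch below assumes that operator; the function name, the threshold value, and the toy coefficients are hypothetical and are not taken from the paper.

```python
import math


def soft_threshold(values, tau):
    """Element-wise shrinkage: sign(v) * max(|v| - tau, 0).

    Illustrative assumption: this is a generic closed-form update,
    not necessarily the exact update used by CLRST.
    """
    out = []
    for v in values:
        magnitude = max(abs(v) - tau, 0.0)
        # Entries with |v| <= tau are set exactly to zero, which is
        # what produces sparsity in the coefficient vector.
        out.append(math.copysign(magnitude, v) if magnitude > 0 else 0.0)
    return out


# Hypothetical coefficients of one candidate particle over five templates.
coeffs = [0.9, -0.05, 0.3, -0.6, 0.02]
shrunk = soft_threshold(coeffs, 0.1)
nonzero = sum(1 for v in shrunk if v != 0.0)
```

Each call costs a single pass over the coefficients and needs no inner iterative solver, which is the sense in which a sequence of such updates can be cheap.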



Notes

  1. Generally, the matrix of particle representations is not full-rank. It tends to have a low rank that is usually larger than one.

  2. This follows from the linear representation assumption. Since \(\mathbf{X} = \mathbf{D}\mathbf{Z}\) and \(\mathbf{D}\) can be designed to be an overcomplete matrix with full row or column rank, \(\text{rank}(\mathbf{X})=\text{rank}(\mathbf{Z})\). So, if \(\mathbf{X}\) is low-rank, \(\mathbf{Z}\) is also low-rank.

  3. The results of Li et al. (2011) are not included in Table 4 because the source code is not available for evaluation, and an implementation would require technical details and parameter settings that are not discussed.
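The rank relation in Note 2 can be checked numerically on a toy example. The sketch below is illustrative only: the matrices are hypothetical, the helper names are not from the paper, and it assumes the case where \(\mathbf{D}\) has full column rank, which is the case in which \(\text{rank}(\mathbf{D}\mathbf{Z})=\text{rank}(\mathbf{Z})\) is guaranteed.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def rank(m):
    """Rank via Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0  # number of pivots found so far
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


# Hypothetical toy data: D is 4x3 with full column rank,
# Z is 3x5 with rank 2 (its third row is the sum of the first two).
D = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 1],
     [1, 1, 1]]
Z = [[1, 2, 0, 1, 3],
     [0, 1, 1, 2, 1],
     [1, 3, 1, 3, 4]]
X = matmul(D, Z)

rank_D, rank_Z, rank_X = rank(D), rank(Z), rank(X)
```

Here `rank_X` equals `rank_Z`, so the low rank of the representation matrix carries over to the observation matrix and vice versa under the stated assumption.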

References

  • Adam, A., Rivlin, E., & Shimshoni, I. (2006). Robust fragments-based tracking using the integral histogram. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 798–805).

  • Avidan, S. (2005). Ensemble tracking. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 494–501).

  • Babenko, B., Yang, M.-H., & Belongie, S. (2009). Visual tracking with online multiple instance learning. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 983–990).

  • Bao, C., Wu, Y., Ling, H., & Ji, H. (2012). Real time robust \(l_1\) tracker using accelerated proximal gradient approach. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

  • Black, M. J., & Jepson, A. D. (1998). Eigentracking: Robust matching and tracking of articulated objects using a view-based representation. International Journal of Computer Vision, 26(1), 63–84.


  • Boyd, S., Parikh, N., Chu, E., Peleato, B., & Eckstein, J. (2011). Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1), 1–122.


  • Brand, M. (2006). Fast low-rank modifications of the thin singular value decomposition. Linear Algebra and its Applications, 415(1), 20–30.


  • Cai, J., Candes, E., & Shen, Z. (2010). A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4), 1956–1982.


  • Candès, E. J., Li, X., Ma, Y., & Wright, J. (2011). Robust principal component analysis? Journal of the ACM, 58(3), 11:1–11:37.


  • Collins, R. T., & Liu, Y. (2003). On-line selection of discriminative tracking features. In Proceedings of the IEEE International Conference on Computer Vision (pp. 346–352).

  • Comaniciu, D., Ramesh, V., & Meer, P. (2003). Kernel-based object tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(5), 564–575.


  • Dinh, T., Vo, N., & Medioni, G. (2011). Context tracker: Exploring supporters and distracters in unconstrained environments. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1177–1184).

  • Everingham, M., Van Gool, L., Williams, C., Winn, J., & Zisserman, A. (2010). The PASCAL visual object classes (VOC) challenge. International Journal of Computer Vision, 88(2), 303–338.


  • Gabay, D., & Mercier, B. (1976). A dual algorithm for the solution of nonlinear variational problems via finite element approximations. Computers and Mathematics with Applications, 2(1), 17–40.


  • Glowinski, R., & Marrocco, A. (1975). Sur l’approximation, par éléments finis d’ordre un, et la résolution, par pénalisation-dualité, d’une classe de problèmes de Dirichlet non linéaires. Revue Française d’Automatique, Informatique, et Recherche Opérationnelle, 9(1), 41–76.


  • Grabner, H., Grabner, M., & Bischof, H. (2006). Real-time tracking via on-line boosting. In Proceedings of British Machine Vision Conference (pp. 1–10).

  • Hare, S., Saffari, A., & Torr, P. (2011). Struck: Structured output tracking with kernels. In Proceedings of the IEEE International Conference on Computer Vision.

  • Henriques, J., Caseiro, R., Martins, P., & Batista, J. (2012). Exploiting the circulant structure of tracking-by-detection with kernels. In Proceedings of European Conference on Computer Vision.

  • Huang, J., Huang, X., & Metaxas, D. (2009). Learning with dynamic group sparsity. In Proceedings of the IEEE International Conference on Computer Vision.

  • Jepson, A., Fleet, D., & El-Maraghi, T. (2003). Robust on-line appearance models for visual tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(10), 1296–1311.


  • Ji, H., Liu, C., Shen, Z., & Xu, Y. (2010). Robust video denoising using low rank matrix completion. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

  • Jiang, N., Liu, W., & Wu, Y. (2011). Adaptive and discriminative metric differential tracking. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1161–1168).

  • Kalal, Z., Matas, J., & Mikolajczyk, K. (2010). P-N learning: Bootstrapping binary classifiers by structural constraints. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

  • Kaneko, T., & Hori, O. (2003). Feature selection for reliable tracking using template matching. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 796–802).

  • Kristan, M., Cehovin, L., et al. (2013). The visual object tracking VOT2013 challenge results. In ICCV2013 Workshops, Workshop on Visual Object Tracking Challenge.

  • Kwon, J., & Lee, K. M. (2010). Visual tracking decomposition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1269–1276).

  • Li, H., Shen, C., & Shi, Q. (2011). Real-time visual tracking with compressed sensing. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1305–1312).

  • Liu, B., Yang, L., Huang, J., Meer, P., Gong, L., & Kulikowski, C. (2010). Robust and fast collaborative tracking with two stage sparse optimization. In Proceedings of European Conference on Computer Vision (pp. 1–14).

  • Liu, G., Lin, Z., & Yu, Y. (2010). Robust subspace segmentation by low-rank representation. In Proceedings of the International Conference on Machine Learning.

  • Liu, S., Song, Z., Liu, G., Xu, C., Lu, H., & Yan, S. (2012). Street-to-shop: Cross-scenario clothing retrieval via parts alignment and auxiliary set. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

  • Ma, S., Goldfarb, D., & Chen, L. (2011). Fixed point and Bregman iterative methods for matrix rank minimization. Mathematical Programming, 128(1–2), 321–353.

  • Matthews, I., Ishikawa, T., & Baker, S. (2004). The template update problem. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26, 810–815.


  • Mei, X., & Ling, H. (2011). Robust visual tracking and vehicle classification via sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(11), 2259–2272.


  • Mei, X., Ling, H., Wu, Y., Blasch, E., & Bai, L. (2011). Minimum error bounded efficient \(l_1\) tracker with occlusion detection. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1257–1264).

  • Pang, Y., & Ling, H. (2013). Finding the best from the second bests—Inhibiting subjective bias in evaluation of visual tracking algorithms. In Proceedings of the IEEE International Conference on Computer Vision.

  • Peng, Y., Ganesh, A., Wright, J., Xu, W., & Ma, Y. (2011). RASL: Robust alignment by sparse and low-rank decomposition for linearly correlated images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(11), 2233–2246.


  • Recht, B., Fazel, M., & Parrilo, P. A. (2010). Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Review, 52, 471.


  • Ross, D., Lim, J., Lin, R.-S., & Yang, M.-H. (2008). Incremental learning for robust visual tracking. International Journal of Computer Vision, 77(1), 125–141.


  • Salti, S., Cavallaro, A., & Stefano, L. D. (2012). Adaptive appearance modeling for video tracking: Survey and evaluation. IEEE Transactions on Image Processing, 21(10), 4334–4348.


  • Sevilla-Lara, L., & Learned-Miller, E. (2012). Distribution fields for tracking. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1910–1917).

  • Tsaig, Y., & Donoho, D. L. (2006). Compressed sensing. IEEE Transactions on Information Theory, 52, 1289–1306.


  • Wright, J., Yang, A. Y., Ganesh, A., Sastry, S. S., & Ma, Y. (2009). Robust face recognition via sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(2), 210–227.


  • Wu, Y., Lim, J., & Yang, M.-H. (2013). Online object tracking: A benchmark. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

  • Yang, M., Wu, Y., & Hua, G. (2009). Context-aware visual tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(7), 1195–1209.

  • Yilmaz, A., Javed, O., & Shah, M. (2006). Object tracking: A survey. ACM Computing Surveys, 38(4), 13.


  • Yu, Q., Dinh, T. B., & Medioni, G. (2008). Online tracking and reacquisition using co-trained generative and discriminative trackers. In Proceedings of European Conference on Computer Vision (pp. 678–691).

  • Zhang, K., Zhang, L., & Yang, M.-H. (2012). Real-time compressive tracking. In Proceedings of European Conference on Computer Vision.

  • Zhang, T., Ghanem, B., & Ahuja, N. (2012). Robust multi-object tracking via cross-domain contextual information for sports video analysis. In International Conference on Acoustics, Speech and Signal Processing.

  • Zhang, T., Ghanem, B., Liu, S., & Ahuja, N. (2012). Low-rank sparse learning for robust visual tracking. In Proceedings of European Conference on Computer Vision.

  • Zhang, T., Ghanem, B., Liu, S., & Ahuja, N. (2012). Robust visual tracking via multi-task sparse learning. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

  • Zhang, T., Ghanem, B., Liu, S., & Ahuja, N. (2013). Robust visual tracking via structured multi-task sparse learning. International Journal of Computer Vision, 101(2), 367–383.


  • Zhang, T., Ghanem, B., Liu, S., Xu, C., & Ahuja, N. (2013). Low-rank sparse coding for image classification. In Proceedings of the IEEE International Conference on Computer Vision.

  • Zhang, T., Ghanem, B., Xu, C., & Ahuja, N. (2013). Object tracking by occlusion detection via structured sparse learning. In CVsports Workshop, IEEE Conference on Computer Vision and Pattern Recognition.

  • Zhang, T., Jia, C., Xu, C., Ma, Y., & Ahuja, N. (2014). Partial occlusion handling for visual tracking via robust part matching. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

  • Zhong, W., Lu, H., & Yang, M.-H. (2012). Robust object tracking via sparsity-based collaborative model. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.


Acknowledgments

This work is supported in part by the research grant for the Human Sixth Sense Programme at the Advanced Digital Sciences Center from Singapore’s Agency for Science, Technology and Research (A*STAR) and NSF CAREER Grant #1149783.

Author information


Corresponding author

Correspondence to Si Liu.

Additional information

Communicated by M. Hebert.


About this article


Cite this article

Zhang, T., Liu, S., Ahuja, N. et al. Robust Visual Tracking Via Consistent Low-Rank Sparse Learning. Int J Comput Vis 111, 171–190 (2015). https://doi.org/10.1007/s11263-014-0738-0



  • DOI: https://doi.org/10.1007/s11263-014-0738-0

Keywords

  • Visual tracking
  • Temporal consistency
  • Sparse representation
  • Low-rank representation