Visual Tracking via Subspace Learning: A Discriminative Approach

Abstract

Good tracking performance is in general attributed to accurate representation over previously obtained targets and/or reliable discrimination between the target and the surrounding background. In this work, a robust tracker is proposed by integrating the advantages of both approaches. A subspace is constructed to represent the target and the neighboring background, and their class labels are propagated simultaneously via the learned subspace. In addition, a novel criterion is proposed, by taking account of both the reliability of discrimination and the accuracy of representation, to identify the target from numerous target candidates in each frame. Thus, the ambiguity in the class labels of neighboring background samples, which influences the reliability of the discriminative tracking model, is effectively alleviated, while the training set still remains small. Extensive experiments demonstrate that the proposed approach outperforms most state-of-the-art trackers.


References

  • Arulampalam, M., Maskell, S., Gordon, N., & Clapp, T. (2002). A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Transactions on Signal Processing (TSP), 50(2), 174–188.

  • Avidan, S. (2004). Support vector tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 26(8), 1064–1072.

  • Avidan, S. (2007). Ensemble tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 29(2), 261–271.

  • Babenko, B., Yang, M. H., & Belongie, S. (2011). Robust object tracking with online multiple instance learning. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 33(8), 1619–1632.

  • Beck, A., & Teboulle, M. (2009). A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1), 183–202.

  • Belhumeur, P. N., & Kriegman, D. J. (1996). What is the set of images of an object under all possible lighting conditions? In IEEE Computer Society conference on computer vision and pattern recognition (CVPR) (pp. 270–277).

  • Cai, J., Candès, E., & Shen, Z. (2010). A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4), 1956–1982.

  • Candès, E. J., Li, X., Ma, Y., & Wright, J. (2011). Robust principal component analysis? Journal of the ACM, 58(3), 1–37.

  • Danelljan, M., Häger, G., Khan, F. S., & Felsberg, M. (2014). Accurate scale estimation for robust visual tracking. In British machine vision conference (BMVC).

  • Dinh, T. B., Vo, N., & Medioni, G. (2011). Context tracker: Exploring supporters and distracters in unconstrained environments. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR) (pp. 1177–1184).

  • Grabner, H., & Bischof, H. (2006). On-line boosting and vision. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR) (Vol. 1, pp. 260–267).

  • Hager, G. D., & Belhumeur, P. N. (1996). Real-time tracking of image regions with changes in geometry and illumination. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR) (pp. 403–410).

  • Hare, S., Saffari, A., & Torr, P. (2011). Struck: Structured output tracking with kernels. In IEEE international conference on computer vision (ICCV) (pp. 263–270).

  • Henriques, J. F., Caseiro, R., Martins, P., & Batista, J. (2012). Exploiting the circulant structure of tracking-by-detection with kernels. In European conference on computer vision (ECCV) (pp. 702–715).

  • Henriques, J. F., Caseiro, R., Martins, P., & Batista, J. (2015). High-speed tracking with kernelized correlation filters. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 37(3), 583–596.

  • Isard, M., & Blake, A. (1998). CONDENSATION: Conditional density propagation for visual tracking. International Journal of Computer Vision (IJCV), 29(1), 5–28.

  • Jia, X., Lu, H., & Yang, M. H. (2012). Visual tracking via adaptive structural local sparse appearance model. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR) (pp. 1822–1829).

  • Kalal, Z., Matas, J., & Mikolajczyk, K. (2010). P-N learning: Bootstrapping binary classifiers by structural constraints. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR) (pp. 49–56).

  • Kalal, Z., Mikolajczyk, K., & Matas, J. (2012). Tracking-learning-detection. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 34(7), 1409–1422.

  • Kwon, J., & Lee, K. M. (2010). Visual tracking decomposition. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR) (pp. 1269–1276).

  • Kwon, J., & Lee, K. M. (2011). Tracking by sampling trackers. In IEEE international conference on computer vision (ICCV) (pp. 1195–1202).

  • Kwon, J., & Lee, K. M. (2014). Tracking by sampling and integrating multiple trackers. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 36(7), 1428–1441.

  • Lasserre, J. A., Bishop, C. M., & Minka, T. P. (2006). Principled hybrids of generative and discriminative models. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR) (Vol. 6, pp. 87–94).

  • Lin, Z., Chen, M., & Ma, Y. (2010). The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. UIUC Technical Report (pp. 1–23).

  • Liu, B., Huang, J., Yang, L., & Kulikowski, C. (2011). Robust tracking using local sparse appearance model and K-selection. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR) (pp. 1313–1320).

  • Liu, B., Huang, J., Kulikowski, C., & Yang, L. (2013). Robust visual tracking using local sparse appearance model and K-selection. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 35(12), 2968–2981.

  • Liu, S., Zhang, T., Cao, X., & Xu, C. (2016). Structural correlation filter for robust visual tracking. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR).

  • Ma, C., Huang, J. B., Yang, X., & Yang, M. H. (2015a). Hierarchical convolutional features for visual tracking. In IEEE international conference on computer vision (ICCV) (pp. 3074–3082).

  • Ma, C., Yang, X., Zhang, C., & Yang, M. H. (2015b). Long-term correlation tracking. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR) (pp. 5388–5396).

  • Mairal, J., Bach, F., & Ponce, J. (2008). Discriminative learned dictionaries for local image analysis. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR).

  • Mei, X., & Ling, H. (2009). Robust visual tracking using L1 minimization. In IEEE international conference on computer vision (ICCV) (pp. 1436–1443).

  • Mei, X., & Ling, H. (2011). Robust visual tracking and vehicle classification via sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 33(11), 2259–2272.

  • Nam, H., & Han, B. (2016). Learning multi-domain convolutional neural networks for visual tracking. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR).

  • Ng, A. Y., & Jordan, M. I. (2001). On discriminative vs. generative classifiers: A comparison of logistic regression and naive Bayes. In Advances in neural information processing systems (NIPS) (pp. 841–848).

  • Pati, Y., Rezaiifar, R., & Krishnaprasad, P. (1993). Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition. In Asilomar conference on signals, systems and computers (pp. 40–44).

  • Pham, D. S., & Venkatesh, S. (2008). Joint learning and dictionary construction for pattern recognition. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR) (pp. 1–8).

  • Qi, Y., Zhang, S., Qin, L., Yao, H., Huang, Q., Lim, J., & Yang, M. H. (2016). Hedged deep tracking. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR) (pp. 4303–4311).

  • Raina, R., & Ng, A. Y. (2007). Self-taught learning: Transfer learning from unlabeled data. In International conference on machine learning (ICML).

  • Ross, D. A., Lim, J., Lin, R. S., & Yang, M. H. (2007). Incremental learning for robust visual tracking. International Journal of Computer Vision (IJCV), 77(1–3), 125–141.

  • Sevilla-Lara, L., & Learned-Miller, E. (2012). Distribution fields for tracking. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR) (pp. 1910–1917).

  • Smeulders, A. W. M., Chu, D. M., Cucchiara, R., Calderara, S., Dehghan, A., & Shah, M. (2014). Visual tracking: An experimental survey. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 36(7), 1442–1468.

  • Sui, Y., Tang, Y., & Zhang, L. (2015a). Discriminative low-rank tracking. In IEEE international conference on computer vision (ICCV) (pp. 3002–3010).

  • Sui, Y., Wang, G., & Zhang, L. (2017). Correlation filter learning toward peak strength for visual tracking. IEEE Transactions on Cybernetics (TCyb). https://doi.org/10.1109/TCYB.2017.2690860.

  • Sui, Y., Wang, G., Tang, Y., & Zhang, L. (2016a). Tracking completion. In European conference on computer vision (ECCV).

  • Sui, Y., Zhang, Z., Wang, G., Tang, Y., & Zhang, L. (2016b). Real-time visual tracking: Promoting the robustness of correlation filter learning. In European conference on computer vision (ECCV).

  • Sui, Y., & Zhang, L. (2015). Visual tracking via locally structured Gaussian process regression. IEEE Signal Processing Letters, 22(9), 1331–1335.

  • Sui, Y., & Zhang, L. (2016). Robust tracking via locally structured representation. International Journal of Computer Vision (IJCV), 119(2), 110–144.

  • Sui, Y., Zhang, S., & Zhang, L. (2015b). Robust visual tracking via sparsity-induced subspace learning. IEEE Transactions on Image Processing (TIP), 24(12), 4686–4700.

  • Sui, Y., Zhao, X., Zhang, S., Yu, X., Zhao, S., & Zhang, L. (2015c). Self-expressive tracking. Pattern Recognition (PR), 48(9), 2872–2884.

  • Tang, M., & Feng, J. (2015). Multi-kernel correlation filter for visual tracking. In IEEE international conference on computer vision (ICCV) (pp. 3038–3046).

  • Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267–288.

  • Wang, D., & Lu, H. (2012). Object tracking via 2DPCA and L1-regularization. IEEE Signal Processing Letters, 19(11), 711–714.

  • Wang, D., & Lu, H. (2014). Visual tracking via probability continuous outlier model. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR).

  • Wang, D., Lu, H., & Yang, M. H. (2013a). Least soft-threshold squares tracking. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR) (pp. 2371–2378).

  • Wang, D., Lu, H., & Yang, M. H. (2013b). Online object tracking with sparse prototypes. IEEE Transactions on Image Processing (TIP), 22(1), 314–325.

  • Wang, L., Ouyang, W., Wang, X., & Lu, H. (2015). Visual tracking with fully convolutional networks. In IEEE international conference on computer vision (ICCV) (pp. 3119–3127).

  • Wang, L., Ouyang, W., Wang, X., & Lu, H. (2016). STCT: Sequentially training convolutional networks for visual tracking. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR) (pp. 1373–1381).

  • Wang, Q., Chen, F., Xu, W., & Yang, M. (2012). Online discriminative object tracking with local sparse representation. In IEEE winter conference on applications of computer vision (WACV).

  • Wright, J., Ma, Y., Mairal, J., & Sapiro, G. (2010). Sparse representation for computer vision and pattern recognition. Proceedings of the IEEE, 98(6), 1031–1044.

  • Wu, Y., Lim, J., & Yang, M. H. (2013). Online object tracking: A benchmark. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR) (pp. 2411–2418).

  • Wu, Y., Lim, J., & Yang, M. H. (2015). Object tracking benchmark. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 37(9), 1834–1848.

  • Yilmaz, A., Javed, O., & Shah, M. (2006). Object tracking: A survey. ACM Computing Surveys, 38(4), 13–57.

  • Zhang, C., Liu, R., Qiu, T., & Su, Z. (2014a). Robust visual tracking via incremental low-rank features learning. Neurocomputing, 131, 237–247.

  • Zhang, K., Liu, Q., Wu, Y., & Yang, M. H. (2016a). Robust visual tracking via convolutional networks without training. IEEE Transactions on Image Processing (TIP), 25(4), 1779–1792.

  • Zhang, K., Zhang, L., & Yang, M. H. (2012a). Real-time compressive tracking. In European conference on computer vision (ECCV) (pp. 866–879).

  • Zhang, K., Zhang, L., & Yang, M. H. (2013a). Real-time object tracking via online discriminative feature selection. IEEE Transactions on Image Processing (TIP), 22(12), 4664–4677.

  • Zhang, S., Yao, H., Sun, X., & Lu, X. (2013b). Sparse coding based visual tracking: Review and experimental comparison. Pattern Recognition, 46(7), 1772–1788.

  • Zhang, T., Bibi, A., & Ghanem, B. (2016b). In defense of sparse tracking: Circulant sparse tracker. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR).

  • Zhang, T., Ghanem, B., Liu, S., & Ahuja, N. (2012b). Low-rank sparse learning for robust visual tracking. In European conference on computer vision (ECCV) (pp. 470–484).

  • Zhang, T., Liu, S., Xu, C., Yan, S., Ghanem, B., Ahuja, N., & Yang, M. H. (2015). Structural sparse tracking. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR) (pp. 150–158).

  • Zhang, T., Liu, S., Ahuja, N., Yang, M. H., & Ghanem, B. (2014b). Robust visual tracking via consistent low-rank sparse learning. International Journal of Computer Vision (IJCV), 111(2), 171–190.

  • Zhong, W., Lu, H., & Yang, M. H. (2012). Robust object tracking via sparsity-based collaborative model. In IEEE Computer Society conference on computer vision and pattern recognition (CVPR) (pp. 1838–1845).

  • Zhong, W., Lu, H., & Yang, M. H. (2014). Robust object tracking via sparse collaborative appearance model. IEEE Transactions on Image Processing (TIP), 23(5), 2356–2368.

  • Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301–320.

  • Zou, H., Hastie, T., & Tibshirani, R. (2006). Sparse principal component analysis. Journal of Computational and Graphical Statistics, 15(2), 265–286.

Author information

Corresponding author

Correspondence to Yao Sui.

Additional information

This work is supported by the National Natural Science Foundation of China (NSFC) under Grants 61132007 and 61573351, the joint fund of Civil Aviation Research by the National Natural Science Foundation of China (NSFC) and Civil Aviation Administration under Grant U1533132, and the National Aeronautics and Space Administration (NASA) LEARN II Program under Grant No. NNX15AN94N.

Communicated by Josef Sivic.

Appendices

Appendix A: Derivation of the Iterative Algorithm

This section presents the detailed solutions for all variables in Eq. (9) and the derivation of the iterative algorithm that solves the discriminative low-rank learning problem.

Solving \({\mathbf {A}}\)

By fixing other variables, minimizing the IALM function \(\mathcal {L}\left( {\mathbf {A}},{\mathbf {E}},{\mathbf {M}},{\mathbf {w}},{\mathbf {v}},b\right) \) with respect to \({\mathbf {A}}\) is equivalent to

$$\begin{aligned} \min _{\mathbf {A}}\frac{1}{2}\left\| {\mathbf {A}}-\frac{1}{2}\left( {\mathbf {X}}-{\mathbf {E}}+{\mathbf {M}}\!+\!\frac{1}{\tau }\left( {\mathbf {J}}_1-{\mathbf {J}}_3\right) \right) \right\| _F^2 \!\!+\frac{1}{2\tau }\left\| {\mathbf {A}}\right\| _*, \end{aligned}$$
(20)

which is derived from completing the squares. This minimization can be solved using the singular value thresholding method (Cai et al. 2010). Thus, \({\mathbf {A}}\) is found by

$$\begin{aligned} {\mathbf {A}}={\mathbf {U}}\varphi \left( {\mathbf {S}},\frac{1}{2\tau }\right) {\mathbf {V}}^T, \end{aligned}$$
(21)

where \({\mathbf {U}}{\mathbf {S}}{\mathbf {V}}^T=\frac{1}{2}\left( {\mathbf {X}}-{\mathbf {E}}+{\mathbf {M}}+\frac{1}{\tau }\left( {\mathbf {J}}_1-{\mathbf {J}}_3\right) \right) \) and

$$\begin{aligned} \varphi \left( x,\varepsilon \right) =sign\left( x\right) \max \left( 0,\left| x\right| -\varepsilon \right) \end{aligned}$$
(22)

denotes the shrinkage (soft-thresholding) operator, which is applied independently to each entry of its argument.
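As a concrete illustration, the shrinkage operator of Eq. (22) and the singular value thresholding step of Eq. (21) can be sketched in a few lines of NumPy. This is a minimal sketch with our own variable names, not the authors' implementation:

```python
import numpy as np

def shrink(x, eps):
    """Soft-thresholding (shrinkage) operator of Eq. (22),
    applied entry-wise: sign(x) * max(0, |x| - eps)."""
    return np.sign(x) * np.maximum(0.0, np.abs(x) - eps)

def update_A(X, E, M, J1, J3, tau):
    """Singular value thresholding step of Eq. (21): form the intermediate
    matrix of Eq. (20) and shrink its singular values by 1/(2*tau)."""
    B = 0.5 * (X - E + M + (J1 - J3) / tau)
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    return U @ np.diag(shrink(s, 1.0 / (2.0 * tau))) @ Vt
```

Shrinking the singular values rather than the entries is what makes this step promote a low-rank \({\mathbf {A}}\) instead of a sparse one.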

Solving \({\mathbf {M}}\)

Minimizing \(\mathcal {L}\left( {\mathbf {A}},{\mathbf {E}},{\mathbf {M}},{\mathbf {w}},{\mathbf {v}},b\right) \) with respect to \({\mathbf {M}}\) is equivalent to

$$\begin{aligned} \min _{\mathbf {M}}\left\| {\mathbf {z}}^T-{\mathbf {w}}^T{\mathbf {M}}-b\mathbf {1}^T+\frac{1}{\tau }{\mathbf {J}}_2\right\| _2^2 +\left\| {\mathbf {A}}-{\mathbf {M}}+\frac{1}{\tau }{\mathbf {J}}_3\right\| _F^2, \end{aligned}$$
(23)

which is a least squares problem. This minimization has a closed-form solution. Thus, \({\mathbf {M}}\) is found by

$$\begin{aligned} {\mathbf {M}}=\left( {\mathbf {w}}{\mathbf {w}}^T\!+\,\!\mathbf {I}\right) ^{-1}\left( {\mathbf {w}}\left( {\mathbf {z}}^T-b\mathbf {1}^T+\frac{1}{\tau }{\mathbf {J}}_2\right) \!+\!{\mathbf {A}}+\frac{1}{\tau }{\mathbf {J}}_3\right) . \end{aligned}$$
(24)
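The closed-form M-step of Eq. (24) amounts to solving a linear system whose coefficient matrix is a rank-one perturbation of the identity. A minimal NumPy sketch (variable names are ours; \({\mathbf {z}}\) and \({\mathbf {J}}_2\) are treated as length-N vectors and \({\mathbf {w}}\) as a length-d vector):

```python
import numpy as np

def update_M(A, w, z, b, J2, J3, tau):
    """Closed-form M-step of Eq. (24): solve
    (w w^T + I) M = w (z^T - b 1^T + J2/tau) + A + J3/tau."""
    d = A.shape[0]
    rhs = np.outer(w, z - b + J2 / tau) + A + J3 / tau
    return np.linalg.solve(np.outer(w, w) + np.eye(d), rhs)
```

When \({\mathbf {w}}=\mathbf {0}\) (e.g. at initialization), the update reduces to \({\mathbf {M}}={\mathbf {A}}+\frac{1}{\tau }{\mathbf {J}}_3\), which is a convenient sanity check.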

Solving \({\mathbf {E}}\)

Minimizing \(\mathcal {L}\left( {\mathbf {A}},{\mathbf {E}},{\mathbf {M}},{\mathbf {w}},{\mathbf {v}},b\right) \) with respect to \({\mathbf {E}}\) is equivalent to

$$\begin{aligned} \min _{\mathbf {E}}\frac{1}{2}\left\| {\mathbf {X}}-{\mathbf {A}}+\frac{1}{\tau }{\mathbf {J}}_1-{\mathbf {E}}\right\| _F^2 +\frac{\alpha }{\tau }\left\| {\mathbf {E}}\right\| _1, \end{aligned}$$
(25)

which is derived from completing the squares. This minimization can be solved by using the iterative shrinkage thresholding method (Beck and Teboulle 2009). Thus, \({\mathbf {E}}\) is found by

$$\begin{aligned} {\mathbf {E}}=\varphi \left( {\mathbf {X}}-{\mathbf {A}}+\frac{1}{\tau }{\mathbf {J}}_1,\frac{\alpha }{\tau }\right) . \end{aligned}$$
(26)

Solving \({\mathbf {w}}\) and b

Minimizing \(\mathcal {L}\left( {\mathbf {A}},{\mathbf {E}},{\mathbf {M}},{\mathbf {w}},{\mathbf {v}},b\right) \) with respect to \({\mathbf {w}}\) and to b leads, respectively, to

$$\begin{aligned} \begin{aligned}&\min _{\mathbf {w}}\frac{1}{2}\left\| {\mathbf {z}}^T-{\mathbf {w}}^T{\mathbf {M}}-b\mathbf {1}^T+\frac{1}{\tau }{\mathbf {J}}_2\right\| _2^2 +\frac{1}{2}\left\| {\mathbf {v}}-{\mathbf {w}}+\frac{1}{\tau }{\mathbf {J}}_4\right\| _2^2 \\&\quad +\frac{\beta }{\tau }\left\| {\mathbf {w}}\right\| _2^2, \\&\min _b \left\| {\mathbf {z}}^T-{\mathbf {w}}^T{\mathbf {M}}-b\mathbf {1}^T+\frac{1}{\tau }{\mathbf {J}}_2\right\| _2^2, \end{aligned} \end{aligned}$$
(27)

both of which can be solved via least squares with the closed-form solutions

$$\begin{aligned} \begin{aligned} {\mathbf {w}}&=\left( {\mathbf {M}}{\mathbf {M}}^T+\left( 1+\frac{2\beta }{\tau }\right) \mathbf {I}\right) ^{-1} \\&\quad \cdot \left( {\mathbf {M}}\left( {\mathbf {z}}^T-b\mathbf {1}^T+\frac{1}{\tau }{\mathbf {J}}_2\right) ^T+{\mathbf {v}}+\frac{1}{\tau }{\mathbf {J}}_4\right) , \\&b=\frac{1}{N}\left\langle \mathbf {1}^T,{\mathbf {z}}^T-{\mathbf {w}}^T{\mathbf {M}}+\frac{1}{\tau }{\mathbf {J}}_2 \right\rangle , \end{aligned} \end{aligned}$$
(28)

where N denotes the number of the training samples.
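Both closed-form updates of Eq. (28) can be sketched as below: a ridge-regularized least-squares solve for \({\mathbf {w}}\), followed by averaging the residual over the N samples for b. Again a minimal illustration under our naming conventions, not the authors' code:

```python
import numpy as np

def update_w_b(M, z, b, v, J2, J4, tau, beta):
    """Closed-form updates of Eq. (28): w solves a ridge-regularized
    least-squares system; b is the mean residual over the N samples."""
    d = M.shape[0]
    lhs = M @ M.T + (1.0 + 2.0 * beta / tau) * np.eye(d)
    rhs = M @ (z - b + J2 / tau) + v + J4 / tau
    w = np.linalg.solve(lhs, rhs)
    b_new = np.mean(z - w @ M + J2 / tau)
    return w, b_new
```

The ridge term \(\left( 1+\frac{2\beta }{\tau }\right) \mathbf {I}\) keeps the system well conditioned even when \({\mathbf {M}}{\mathbf {M}}^T\) is rank deficient.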

Solving \({\mathbf {v}}\)

Minimizing \(\mathcal {L}\left( {\mathbf {A}},{\mathbf {E}},{\mathbf {M}},{\mathbf {w}},{\mathbf {v}},b\right) \) with respect to \({\mathbf {v}}\) is equivalent to

$$\begin{aligned} \min _{\mathbf {v}}\frac{1}{2}\left\| {\mathbf {v}}-{\mathbf {w}}+\frac{1}{\tau }{\mathbf {J}}_4\right\| _2^2+\frac{\gamma }{\tau }\left\| {\mathbf {v}}\right\| _1, \end{aligned}$$
(29)

which is derived from completing the squares. This minimization can be solved by the iterative shrinkage thresholding method (Beck and Teboulle 2009). Thus, \({\mathbf {v}}\) is found by

$$\begin{aligned} {\mathbf {v}}=\varphi \left( {\mathbf {w}}-\frac{1}{\tau }{\mathbf {J}}_4,\frac{\gamma }{\tau }\right) . \end{aligned}$$
(30)

The main steps of the iterative algorithm are summarized in Algorithm 2. The algorithm stops when the difference between the values of the IALM function \(\mathcal {L}\left( {\mathbf {A}},{\mathbf {E}},{\mathbf {M}},{\mathbf {w}},{\mathbf {v}},b\right) \) in two consecutive iterations is sufficiently small. Note that we set the parameters \(\alpha =\frac{1}{\sqrt{\max \left( d,N\right) }}\), \(\tau =\frac{1.25}{\max \left( svd\left( {\mathbf {X}}\right) \right) }\) and \(\kappa =1.6\) following the recommendations in Lin et al. (2010), and empirically set \(\beta =1-\alpha \) and \(\gamma =\alpha \) in Algorithm 2.
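For concreteness, the whole iterative scheme can be sketched as follows. This is a minimal NumPy illustration rather than the authors' Algorithm 2: the Lagrange multiplier updates are assumed to take the standard ALM form (each multiplier is incremented by \(\tau \) times its constraint residual), and the stopping test is simplified to the reconstruction constraint only.

```python
import numpy as np

def shrink(x, eps):
    # Entry-wise soft-thresholding, Eq. (22)
    return np.sign(x) * np.maximum(0.0, np.abs(x) - eps)

def discriminative_lowrank(X, z, n_iter=100, kappa=1.6, tol=1e-7):
    """Sketch of the IALM iterations of Appendix A for data X (d x N)
    and class labels z (length N)."""
    d, N = X.shape
    alpha = 1.0 / np.sqrt(max(d, N))
    tau = 1.25 / np.linalg.norm(X, 2)   # 1.25 / max singular value of X
    beta, gamma = 1.0 - alpha, alpha
    A = np.zeros((d, N)); E = np.zeros((d, N)); M = np.zeros((d, N))
    w = np.zeros(d); v = np.zeros(d); b = 0.0
    J1 = np.zeros((d, N)); J2 = np.zeros(N)
    J3 = np.zeros((d, N)); J4 = np.zeros(d)
    for _ in range(n_iter):
        # A-step: singular value thresholding, Eq. (21)
        B = 0.5 * (X - E + M + (J1 - J3) / tau)
        U, s, Vt = np.linalg.svd(B, full_matrices=False)
        A = U @ np.diag(shrink(s, 1.0 / (2.0 * tau))) @ Vt
        # M-step: closed-form least squares, Eq. (24)
        M = np.linalg.solve(np.outer(w, w) + np.eye(d),
                            np.outer(w, z - b + J2 / tau) + A + J3 / tau)
        # E-step: entry-wise shrinkage, Eq. (26)
        E = shrink(X - A + J1 / tau, alpha / tau)
        # w,b-step: closed-form updates, Eq. (28)
        w = np.linalg.solve(M @ M.T + (1.0 + 2.0 * beta / tau) * np.eye(d),
                            M @ (z - b + J2 / tau) + v + J4 / tau)
        b = np.mean(z - w @ M + J2 / tau)
        # v-step: entry-wise shrinkage, Eq. (30)
        v = shrink(w - J4 / tau, gamma / tau)
        # multiplier and penalty updates (standard ALM; assumed form)
        J1 += tau * (X - A - E)
        J2 += tau * (z - w @ M - b)
        J3 += tau * (A - M)
        J4 += tau * (v - w)
        tau *= kappa
        if np.linalg.norm(X - A - E) / max(1.0, np.linalg.norm(X)) < tol:
            break
    return A, E, w, b
```

With \(\kappa =1.6\) the penalty \(\tau \) grows geometrically, which forces the constraints \({\mathbf {X}}={\mathbf {A}}+{\mathbf {E}}\), \({\mathbf {M}}={\mathbf {A}}\), \({\mathbf {v}}={\mathbf {w}}\) and \({\mathbf {z}}^T={\mathbf {w}}^T{\mathbf {M}}+b\mathbf {1}^T\) to hold increasingly tightly as the iterations proceed.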

Appendix B: Evaluations on Different Situations

Fig. 21 Performance of our tracker and the top ten ranked trackers in Wu et al. (2013) in different challenging cases. In the caption of each sub-figure, the number in parentheses denotes the number of video sequences in the corresponding case

For a more thorough evaluation, we also analyze the performance of our tracker in different challenging situations, such as illumination variation and occlusion. The results for representative situations are shown in Fig. 21 and discussed below.

Occlusion In this situation, the target is occluded by other objects. Occlusion may easily lead to tracking failure because the target disappears partially or entirely for a period. From the results shown in Fig. 21a, it can be seen that our tracker is robust against occlusion and obtains good tracking results. This robustness stems from two facts: (1) the sparse reconstruction errors absorb the occlusion during our subspace learning, such that the learned subspace acquires only the non-occluded information of the target; and (2) the good discriminative capability of the learned subspace reliably separates the target from the background. The competing trackers that use sparse reconstruction errors for occlusion handling, such as SCM and LSK, and those that use a discriminative tracking model, such as Struck, also achieve good tracking results on some video sequences in this case.

Non-Rigid Deformation The motion of the target may cause non-rigid deformations of its appearance. From the results shown in Fig. 21b, it is evident that our tracker achieves superior performance in this case. This is attributed to two facts: (1) small deformations, which cause small reconstruction errors, are effectively handled by the subspace learning; and (2) large deformations, which cause large reconstruction errors, are compensated by the sparsity constraint on the reconstruction errors.

Illumination Variation In this case, the illumination of the scene changes drastically, leading to significant changes in the appearance of the target. From the results shown in Fig. 21c, it can be seen that our tracker obtains excellent results in this case. This is attributed to the effectiveness of subspace learning in handling illumination changes. Note that the adaptive dimension reduction of our subspace learning also makes our tracker more stable in this case. Other subspace learning based trackers, such as LLR and SSL, also obtain good tracking performance in this case.

Background Clutter In this situation, the tracker is distracted by the cluttered background. Thus, the trackers that exploit the difference between the target and the background may be more effective in this case. From the results shown in Fig. 21d, it can be seen that our tracker performs favorably, which is attributed to its good discriminative capability: it reliably distinguishes the target from the background. As analyzed above, the competing trackers that consider the background, such as SET, also obtain good tracking results in this case.

Out-of-Plane Rotation The motion of either the target or the camera may cause out-of-plane rotations of the target's appearance. From the results shown in Fig. 21e, it is evident that our tracker performs best in this case. On one hand, the temporal locality of our subspace (which uses only the recently obtained targets) effectively describes the appearance changes caused by out-of-plane rotations. On the other hand, the linear classifier successfully separates the rotated target from the background.

Scale Variation In this case, the scale of the target varies over successive frames, which may lead to inaccurate tracking results. Because we take the scale change of the target into account in the motion state, as shown in Eq. (10), it can be seen from the results in Fig. 21f that our tracker is insensitive to scale changes and obtains favorable performance in this case.

Cite this article

Sui, Y., Tang, Y., Zhang, L. et al. Visual Tracking via Subspace Learning: A Discriminative Approach. Int J Comput Vis 126, 515–536 (2018). https://doi.org/10.1007/s11263-017-1049-z


Keywords

  • Visual tracking
  • Discriminative subspace
  • Joint learning
  • Sparse representation
  • Low-rank approximation