Visual tracking using structural local DCT sparse appearance model with occlusion detection

Abstract

In this paper, a structural local DCT sparse appearance model with occlusion detection is proposed for visual tracking in a particle filter framework. The energy compaction property of the 2D-DCT is exploited to reduce the size of both the dictionary and the candidate samples, which lowers the computational cost of l1-minimization. Further, a holistic image reconstruction procedure is proposed for robust occlusion detection and is used in the appearance model update, thus avoiding degradation of the appearance model in the presence of occlusion or outliers. In addition, a patch occlusion ratio is introduced into the confidence score computation to enhance tracking performance. Quantitative and qualitative evaluations on two popular benchmark datasets demonstrate that the proposed tracking algorithm generally outperforms several state-of-the-art methods.
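The energy compaction idea mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the 8x8 patch size, the 4x4 retained block, the synthetic patch, and all function names are assumptions chosen for illustration. It computes an orthonormal 2D DCT of a smooth patch and keeps only the low-frequency coefficients as a shorter feature vector.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

def dct2(patch):
    """2D DCT of a square patch, computed as C * X * C^T."""
    c = dct_matrix(len(patch))
    return matmul(matmul(c, patch), transpose(c))

def compact_feature(patch, keep):
    """Flatten the top-left keep x keep block of low-frequency coefficients."""
    coeffs = dct2(patch)
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]

def energy(values):
    return sum(v * v for v in values)

# Hypothetical smooth 8x8 patch (a brightness ramp), standing in for an image patch.
n = 8
patch = [[100.0 + 5.0 * r + 3.0 * c for c in range(n)] for r in range(n)]

# Keep 16 of the 64 coefficients (an assumed setting, not taken from the paper).
feature = compact_feature(patch, keep=4)

# The transform is orthonormal, so patch energy equals total coefficient energy.
retained = energy(feature) / energy(v for row in patch for v in row)
print(len(feature), round(retained, 4))
```

For smooth patches such as this one, nearly all of the signal energy sits in the retained low-frequency block, so a sparse coding step that operates on the shortened vectors works with smaller dictionary atoms and candidates. Real image patches with strong texture would retain less energy at the same truncation.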

References

  1. Adam A, Rivlin E, Shimshoni I (2006) Robust fragments-based tracking using the integral histogram. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 798–805
  2. Babenko B, Yang MH, Belongie S (2009) Visual tracking with online multiple instance learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 983–990
  3. Bao C, Wu Y, Ling H, Ji H (2012) Real time robust L1 tracker using accelerated proximal gradient approach. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 1830–1837
  4. Chen D, Liu Q, Sun M, Yang J (2008) Mining appearance models directly from compressed video. IEEE Trans Multimed 10(2):268–276
  5. Chen H, Zhang W, Zhao X, Tan M (2014) DCT representations based appearance model for visual tracking. In: Proceedings of the IEEE international conference on robotics and biomimetics (ROBIO), pp 1614–1619
  6. Comaniciu D, Ramesh V, Meer P (2003) Kernel-based object tracking. IEEE Trans Pattern Anal Mach Intell (PAMI) 25(5):564–577
  7. Dai P, Luo Y, Liu W, Li C, Xie Y (2013) Robust visual tracking via part-based sparsity model. In: Proceedings of the IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 1803–1806
  8. Danelljan M, Häger G, Khan FS, Felsberg M (2015) Learning spatially regularized correlation filters for visual tracking. In: Proceedings of the IEEE international conference on computer vision (ICCV), pp 4310–4318
  9. Elad M, Aharon M (2006) Image denoising via sparse and redundant representations over learned dictionaries. IEEE Trans Image Process 15(12):3736–3745
  10. Gao J, Zhang T, Yang X, Xu C (2017) Deep relative tracking. IEEE Trans Image Process 26(4):1845–1858
  11. Gao J, Zhang T, Yang X, Xu C (2018) P2T: Part-to-target tracking via deep regression learning. IEEE Trans Image Process 27(6):3074–3086
  12. Grabner H, Leistner C, Bischof H (2008) Semi-supervised on-line boosting for robust tracking. In: Proceedings of European conference on computer vision (ECCV), pp 234–247
  13. Hafed ZM, Levine MD (2001) Face recognition using the discrete cosine transform. Int J Comput Vis 43(3):167–188
  14. He D, Gu Z, Cercone N (2009) Efficient image retrieval in DCT domain by hypothesis testing. In: Proceedings of the IEEE international conference on image processing (ICIP), pp 225–228
  15. Henriques JF, Caseiro R, Martins P, Batista J (2015) High-speed tracking with kernelized correlation filters. IEEE Trans Pattern Anal Mach Intell (PAMI) 37(3):583–596
  16. Isard M, Blake A (1998) Condensation: Conditional density propagation for visual tracking. Int J Comput Vis 29(1):5–28
  17. Jia X, Lu H, Yang MH (2012) Visual tracking via adaptive structural local sparse appearance model. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 1822–1829
  18. Kristan M, Leonardis A, Matas J, Felsberg M, Pflugfelder R (2016) The visual object tracking VOT2016 challenge results. In: Proceedings of European conference on computer vision (ECCV), pp 1–45
  19. Li Y, Ai H, Yamashita T, Lao S, Kawade M (2008) Tracking in low frame rate video: a cascade particle filter with discriminative observers of different life spans. IEEE Trans Pattern Anal Mach Intell (PAMI) 30(10):1728–1740
  20. Li X, Dick A, Shen C, Hengel A, Wang H (2013) Incremental learning of 3d-DCT compact representations for robust visual tracking. IEEE Trans Pattern Anal Mach Intell (PAMI) 35(4):863–881
  21. Li H, Li Y, Porikli F (2016) Deeptrack: Learning discriminative feature representations online for robust visual tracking. IEEE Trans Image Process 25(4):1834–1848
  22. Lin C, Pun CM (2013) Tracking object using particle filter and DCT features. In: Proceedings of international conference on advances in computer science and engineering, pp 167–169
  23. Mairal J, Bach F, Ponce J, Sapiro G (2010) Online learning for matrix factorization and sparse coding. J Mach Learn Res 11:19–60
  24. Mei X, Ling H (2009) Robust visual tracking using L1 minimization. In: Proceedings of the IEEE international conference on computer vision (ICCV), pp 1436–1443
  25. Mei X, Ling H, Wu Y, Blasch E, Bai L (2011) Minimum error bounded efficient L1 tracker with occlusion detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 1257–1264
  26. Ou W, Yuan D, Liu Q, Cao Y (2018) Object tracking based on online representative sample selection via non-negative least square. Multimed Tools Appl 77(9):10569–10587
  27. Pennebaker W, Mitchell J (1992) JPEG: Still image data compression standard. Springer Science & Business Media, Berlin
  28. Qu P (2014) Visual tracking with fragments-based PCA sparse representation. Int J Signal Process, Image Process Pattern Recogn 7(2):23–34
  29. Ross DA, Lim J, Lin RS, Yang MH (2008) Incremental learning for robust visual tracking. Int J Comput Vis 77:125–141
  30. Shreyamsha Kumar BK, Swamy MNS, Omair Ahmad M (2013) Multiresolution DCT decomposition for multifocus image fusion. In: Proceedings of the IEEE Canadian conference on electrical and computer engineering (CCECE), pp 1–4. https://doi.org/10.1109/CCECE.2013.6567721
  31. Shreyamsha Kumar BK, Swamy MNS, Omair Ahmad M (2015) Structural local DCT sparse appearance model for visual tracking. In: Proceedings of the IEEE international symposium on circuits and systems (ISCAS), pp 1194–1197. https://doi.org/10.1109/ISCAS.2015.7168853
  32. Shreyamsha Kumar BK, Swamy MNS, Omair Ahmad M (2016) Visual tracking via bilateral 2DPCA and robust coding. In: Proceedings of the IEEE Canadian conference on electrical and computer engineering (CCECE), pp 1–4. https://doi.org/10.1109/CCECE.2016.7726647
  33. Shreyamsha Kumar BK, Swamy MNS, Omair Ahmad M (2016) Weighted residual minimization in PCA subspace for visual tracking. In: Proceedings of the IEEE international symposium on circuits and systems (ISCAS), pp 986–989. https://doi.org/10.1109/ISCAS.2016.7527408
  34. Uzair M, Mahmood A, Mian AS (2013) Hyperspectral face recognition using 3d-DCT and partial least squares. In: Proceedings of British machine vision conference (BMVC), pp 1–10
  35. Wang D, Lu H (2012) Object tracking via 2DPCA and l1-regularization. Signal Process Lett 19(11):711–714
  36. Wang D, Lu H, Bo C (2015) Fast and robust object tracking via probability continuous outlier model. IEEE Trans Image Process 24(12):5166–5176
  37. Wang D, Lu H, Bo C (2015) Visual tracking via weighted local cosine similarity. IEEE Trans Cybern 45(9):1838–1850
  38. Wang D, Lu H, Yang MH (2013) Online object tracking with sparse prototypes. IEEE Trans Image Process 22(1):314–325
  39. Wang N, Yeung DY (2013) Learning a deep compact image representation for visual tracking. In: Proceedings of advances in neural information processing systems (NIPS), pp 809–817
  40. Wang F, Zhang J, Guo Q, Liu P, Tu D (2015) Robust visual tracking via discriminative structural sparse feature. In: Proceedings of the Chinese conference on image and graphics technologies, pp 438–446
  41. Wang D, Lu H, Xiao Z, Yang MH (2015) Inverse sparse tracker with a locally weighted distance metric. IEEE Trans Image Process 24(9):2646–2657
  42. Wang D, Lu H, Yang MH (2016) Robust visual tracking via least soft-threshold squares. IEEE Trans Circ Syst Video Technol 26(9):1709–1721
  43. Wright J, Yang A, Ganesh A, Sastry S, Ma Y (2009) Robust face recognition via sparse representation. IEEE Trans Pattern Anal Mach Intell (PAMI) 31(2):210–227
  44. Wu Y, Lim J, Yang MH (2013) Online object tracking: a benchmark. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 2411–2418
  45. Yang J, Yu K, Gong Y, Huang T (2009) Linear spatial pyramid matching using sparse coding for image classification. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 1794–1801
  46. Yang H, Shao L, Zheng F, Wang L, Song Z (2011) Recent advances and trends in visual tracking: a review. Neurocomputing 74(18):3823–3831
  47. You X, Li X, He Z, Zhang X (2015) A robust local sparse tracker with global consistency constraint. Signal Process 111:308–318
  48. Zhang T, Ghanem B, Liu S, Ahuja N (2012) Robust visual tracking via multi-task sparse learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 2042–2049
  49. Zhang H, Tao F, Yang G (2015) Robust visual tracking based on structured sparse representation model. Multimed Tools Appl 74(3):1021–1043
  50. Zhang T, Bibi A, Ghanem B (2016) In defense of sparse tracking: Circulant sparse tracker. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 3880–3888
  51. Zhang T, Xu C, Yang MH (2017) Multi-task correlation particle filter for robust object tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 4819–4827
  52. Zhang T, Liu S, Xu C, Liu B, Yang MH (2018) Correlation particle filter for visual tracking. IEEE Trans Image Process 27(6):2676–2687
  53. Zhang T, Xu C, Yang MH (2018) Learning multi-task correlation particle filters for visual tracking. IEEE Trans Pattern Anal Mach Intell (PAMI):1–14. https://doi.org/10.1109/TPAMI.2018.2797062
  54. Zhong Y, Zhang H, Jain AK (2000) Automatic caption localization in compressed video. IEEE Trans Pattern Anal Mach Intell (PAMI) 22(4):385–392
  55. Zhuang B, Wang L, Lu H (2016) Visual tracking via shallow and deep collaborative model. Neurocomputing 218:61–71

Acknowledgments

This work was supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada, the Regroupement Stratégique en Microsystèmes du Québec (ReSMiQ), and Ministère de l’Éducation, de l’Enseignement Supérieur et de la Recherche (MEESR) du Québec.

The authors would like to thank the authors of [3, 17, 20, 29, 36–39, 41, 42], who made their code available for comparison with the proposed method.

Author information

Correspondence to M. Omair Ahmad.

Cite this article

Kumar, B.K.S., Swamy, M.N.S. & Ahmad, M.O. Visual tracking using structural local DCT sparse appearance model with occlusion detection. Multimed Tools Appl 78, 7243–7266 (2019). https://doi.org/10.1007/s11042-018-6453-z

Keywords

  • Visual tracking
  • Local DCT sparse appearance model
  • Holistic image reconstruction
  • Reconstruction error
  • Occlusion map
  • Observation model update