Comprehensive study on visual tracking methods and its application on aerial videos

  • Research Article
  • Journal of Optics

Abstract

Visual tracking in aerial videos is the process of automatically following an object of interest in video captured from an aerial platform such as a drone or aircraft. A key challenge is handling appearance variations caused by changes in illumination and viewpoint, occlusion, and background clutter. Although several review papers have surveyed earlier work on visual tracking in aerial videos, a comprehensive review of existing methods is still needed. This paper presents such a study of existing methods and previous work on visual target tracking. The visual tracking framework is organized into four modules: target region extraction, visual target representation, statistical measuring, and target motion representation and localization. The main contributions of this study are to review current achievements, to highlight the strengths and weaknesses of existing methods in each module, and to identify open research issues and challenging tasks in each module. The literature review focuses on articles published in highly cited, peer-reviewed journals, with particular emphasis on high-quality techniques. The findings cover the advantages and disadvantages of existing visual tracking methods, along with challenges and potential solutions in visual representation and appearance modeling. Finally, the visual tracking methods are examined on aerial videos, and their key points are summarized to guide future researchers.
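To make the four-module framework concrete, the following is a minimal sketch of how the modules might fit together in a toy single-target tracker. It is not the method of the paper or of any surveyed work: the function names (`extract_region`, `represent`, `similarity`, `localize`), the intensity-histogram representation, the Bhattacharyya coefficient as the statistical measure, and the exhaustive translation search are all illustrative assumptions chosen for brevity.

```python
import numpy as np

def extract_region(frame, box):
    """Module 1 (target region extraction): crop the (x, y, w, h) box."""
    x, y, w, h = box
    return frame[y:y + h, x:x + w]

def represent(patch, bins=16):
    """Module 2 (visual target representation): normalized intensity histogram.
    Assumption: grayscale 8-bit input; real trackers use richer features."""
    hist, _ = np.histogram(patch, bins=bins, range=(0, 256))
    return hist / max(hist.sum(), 1)

def similarity(p, q):
    """Module 3 (statistical measuring): Bhattacharyya coefficient in [0, 1]."""
    return float(np.sum(np.sqrt(p * q)))

def localize(frame, prev_box, model, radius=8, step=2):
    """Module 4 (motion representation and localization): exhaustive search
    over translations within `radius` pixels of the previous box.
    Assumption: the target only translates and keeps its size."""
    x0, y0, w, h = prev_box
    height, width = frame.shape[:2]
    best_box, best_score = prev_box, -1.0
    for dy in range(-radius, radius + 1, step):
        for dx in range(-radius, radius + 1, step):
            x, y = x0 + dx, y0 + dy
            # Skip candidate boxes that fall outside the frame.
            if x < 0 or y < 0 or x + w > width or y + h > height:
                continue
            candidate = represent(extract_region(frame, (x, y, w, h)))
            score = similarity(model, candidate)
            if score > best_score:
                best_box, best_score = (x, y, w, h), score
    return best_box, best_score

if __name__ == "__main__":
    # Synthetic two-frame example: a bright 10x10 square on a dark background
    # that moves 4 pixels right and 2 pixels up between frames.
    first = np.full((80, 80), 20, dtype=np.uint8)
    first[40:50, 30:40] = 200
    second = np.full((80, 80), 20, dtype=np.uint8)
    second[38:48, 34:44] = 200

    target_model = represent(extract_region(first, (30, 40, 10, 10)))
    print(localize(second, (30, 40, 10, 10), target_model))
```

In this sketch the appearance model is built once from the first frame and never updated, so it would drift or fail under the illumination, viewpoint, and occlusion changes named above; handling those variations is where the methods reviewed for each module differ.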

Funding

This work was supported by “A Bite of Xing Anmeng” Documentary creation.

Author information

Corresponding author

Correspondence to Shuai Gao.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Cite this article

Gao, S. Comprehensive study on visual tracking methods and its application on aerial videos. J Opt 53, 981–996 (2024). https://doi.org/10.1007/s12596-023-01229-3

  • DOI: https://doi.org/10.1007/s12596-023-01229-3
