Combining Visual Tracking and Person Detection for Long Term Tracking on a UAV

  • Gustav Häger
  • Goutam Bhat
  • Martin Danelljan
  • Fahad Shahbaz Khan
  • Michael Felsberg
  • Piotr Rudol
  • Patrick Doherty
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10072)


Visual object tracking performance has improved significantly in recent years. Most trackers follow one of two paradigms: online learning of an appearance model, or the use of a pre-trained object detector. Methods based on online learning provide high accuracy but are prone to model drift, which occurs when the tracker fails to correctly estimate the tracked object’s position, so that subsequent model updates are learned from misaligned samples. Detector-based methods, on the other hand, typically have good long-term robustness but lower accuracy than online methods.

Despite the complementarity of these approaches, the problem of fusing them into a single framework is largely unexplored. In this paper, we propose a novel fusion of an online tracker and a pre-trained detector for tracking humans from a UAV. The system operates in real time on a UAV platform. In addition, we present a novel dataset for long-term tracking in a UAV setting that includes scenarios typically not well represented in standard visual tracking datasets.



This work has been supported by SSF (CUAS, SymbiCloud), Wallenberg Autonomy and Software Programme (WASP) and ELLIIT.



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Gustav Häger (1)
  • Goutam Bhat (1)
  • Martin Danelljan (1)
  • Fahad Shahbaz Khan (1)
  • Michael Felsberg (1)
  • Piotr Rudol (2)
  • Patrick Doherty (2)

  1. Computer Vision Laboratory, Linköping University, Linköping, Sweden
  2. Artificial Intelligence and Integrated Computer Systems, Linköping University, Linköping, Sweden
