Object Tracking Using Deep Convolutional Neural Networks and Visual Appearance Models

  • Bogdan Mocanu
  • Ruxandra Tapu
  • Titus Zaharia
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10617)


In this paper we introduce a novel single object tracking method that extends the traditional GOTURN algorithm with a visual attention model. The proposed approach returns accurate object tracks and handles sudden camera and background movement, long-term occlusions, and multiple moving objects evolving simultaneously in the same neighborhood. Occlusions are identified using image quad-tree decomposition and patch matching, based on a convolutional neural network trained offline. The object appearance model is adapted over time based on both visual similarity constraints and trajectory verification tests. The experimental evaluation performed on the VOT 2016 dataset demonstrates the efficiency of our method, which returns high accuracy scores regardless of the scene dynamics or object shape.
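
The occlusion test mentioned above (quad-tree decomposition followed by patch matching) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper matches patches with a convolutional neural network trained offline, whereas the sketch substitutes plain normalized cross-correlation so that it stays self-contained. The function names, the tree depth and the similarity threshold are all assumptions made for illustration.

```python
from math import sqrt


def quadtree_leaves(x, y, w, h, depth):
    """Recursively split the box (x, y, w, h) into quadrants; return the leaf boxes."""
    if depth == 0 or w < 2 or h < 2:
        return [(x, y, w, h)]
    hw, hh = w // 2, h // 2
    quadrants = [
        (x, y, hw, hh),                    # top-left
        (x + hw, y, w - hw, hh),           # top-right
        (x, y + hh, hw, h - hh),           # bottom-left
        (x + hw, y + hh, w - hw, h - hh),  # bottom-right
    ]
    leaves = []
    for quadrant in quadrants:
        leaves.extend(quadtree_leaves(*quadrant, depth - 1))
    return leaves


def crop(image, box):
    """Flatten the pixels of `image` (a list of rows) that fall inside `box`."""
    x, y, w, h = box
    return [value for row in image[y:y + h] for value in row[x:x + w]]


def ncc(a, b):
    """Normalized cross-correlation in [-1, 1].

    Assumption: used here only as a stand-in for the CNN patch similarity.
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    den = sqrt(sum(p * p for p in da) * sum(q * q for q in db))
    if den == 0:
        # At least one patch is constant: treat identical patches as a match.
        return 1.0 if a == b else 0.0
    return sum(p * q for p, q in zip(da, db)) / den


def occluded_leaves(template, candidate, depth=2, threshold=0.5):
    """Return the quad-tree leaves whose similarity falls below `threshold`."""
    height, width = len(template), len(template[0])
    return [
        box
        for box in quadtree_leaves(0, 0, width, height, depth)
        if ncc(crop(template, box), crop(candidate, box)) < threshold
    ]
```

Given a template patch and a candidate patch of the same size, `occluded_leaves` returns the sub-regions that no longer match. A tracker could, for example, compare the fraction of flagged leaves against a limit before deciding whether to update its appearance model; that decision rule is likewise an assumption, not a detail taken from the abstract.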


Single object tracking; Adaptive object appearance model; Occlusion detection; Patch matching; Convolution neural networks



This work was supported by a grant of the Romanian National Authority for Scientific Research and Innovation, CNCS - UEFISCDI, project number: PN-II-RU-TE-2014-4-0202.

Part of this work has been funded by University Politehnica of Bucharest, through the “Excellence Research Grants” Program, UPB – GEX. Identifier: UPB–EXCELENȚĂ–2016, No. 97/26.09.2016 and UPB–EXCELENȚĂ–2017, project entitled: “Autonomous obstacle detection and recognitions system based on deep convolutional neural networks dedicated to visually impaired people”.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. ARTEMIS, Institut Mines-Télécom/Télécom SudParis, Evry, France
  2. Telecommunication, Faculty of ETTI, University “Politehnica” of Bucharest, Bucharest, Romania
