Object Tracking Using Deep Convolutional Neural Networks and Visual Appearance Models
In this paper we introduce a novel single object tracking method that extends the GOTURN algorithm with a visual attention model. The proposed approach returns accurate object tracks and handles sudden camera and background movement, long-term occlusions, and multiple moving objects that evolve simultaneously in the same neighborhood. Occlusions are identified using image quad-tree decomposition and patch matching, based on a convolutional neural network trained offline. The object appearance model is adapted over time based on both visual similarity constraints and trajectory verification tests. The experimental evaluation performed on the VOT 2016 dataset demonstrates the efficiency of our method, which returns high accuracy scores regardless of the scene dynamics or object shape.
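As a rough illustration of how quad-tree decomposition and patch matching can flag an occlusion, the sketch below splits an object window into quadrants recursively and compares each leaf patch of a stored template against the same region of a candidate window. It is a minimal sketch under stated assumptions, not the method evaluated in the paper: the function names, the tree depth, and the 0.5 threshold are all hypothetical, and normalized cross-correlation stands in for the offline-trained CNN similarity score.

```python
import numpy as np

def quadtree_patches(box, depth):
    """Recursively split box (x, y, w, h) into four quadrants; return the leaves."""
    x, y, w, h = box
    if depth == 0 or w < 2 or h < 2:
        return [box]
    hw, hh = w // 2, h // 2
    quads = [
        (x, y, hw, hh),                      # top-left
        (x + hw, y, w - hw, hh),             # top-right
        (x, y + hh, hw, h - hh),             # bottom-left
        (x + hw, y + hh, w - hw, h - hh),    # bottom-right
    ]
    leaves = []
    for q in quads:
        leaves.extend(quadtree_patches(q, depth - 1))
    return leaves

def ncc(a, b):
    """Normalized cross-correlation (assumed stand-in for a CNN patch similarity)."""
    a = a.astype(np.float64) - a.mean()
    b = b.astype(np.float64) - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    # Constant patches have zero variance; treat them as dissimilar.
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def occluded_fraction(template, candidate, depth=2, threshold=0.5):
    """Fraction of leaf patches whose similarity falls below the (hypothetical) threshold."""
    h, w = template.shape[:2]
    leaves = quadtree_patches((0, 0, w, h), depth)
    low = 0
    for x, y, pw, ph in leaves:
        score = ncc(template[y:y + ph, x:x + pw], candidate[y:y + ph, x:x + pw])
        if score < threshold:
            low += 1
    return low / len(leaves)
```

A tracker could treat a large occluded fraction as a cue to stop updating the appearance model, although the decision rule actually used is not described in this abstract.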
Keywords: Single object tracking; Adaptive object appearance model; Occlusion detection; Patch matching; Convolutional neural networks
This work was supported by a grant of the Romanian National Authority for Scientific Research and Innovation, CNCS - UEFISCDI, project number: PN-II-RU-TE-2014-4-0202.
Part of this work has been funded by University Politehnica of Bucharest, through the “Excellence Research Grants” Program, UPB – GEX. Identifier: UPB–EXCELENȚĂ–2016, No. 97/26.09.2016 and UPB–EXCELENȚĂ–2017, project entitled: “Autonomous obstacle detection and recognitions system based on deep convolutional neural networks dedicated to visually impaired people”.