Abstract
Object tracking remains challenging in computer vision because of severe object variation, e.g., deformation, occlusion, and rotation. To handle this variation and achieve robust tracking performance, we propose a novel relationship-based tracking algorithm using neural networks. In contrast to existing approaches in the literature, our method assumes that the target object consists of several parts and considers the evolution of the topology structure among these parts. We first train a candidate neural network to predict the probable areas in which each part may be located in the successive frame. We then design a novel collaboration neural network that determines the precise area of each part while taking into account the topology structure among the individual parts, which is learned from their historical physical locations during the online tracking process. Experimental results show that the proposed method outperforms state-of-the-art trackers on a benchmark dataset, yielding significant improvements in accuracy on highly distorted sequences.
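The two-stage idea in the abstract (propose candidate areas per part, then choose among them using the topology among parts) can be illustrated with a minimal sketch. This is not the authors' method: both neural networks are replaced here by plain arithmetic, the candidate scores are assumed to be given, and the "topology" is approximated by the mean pairwise offset between part centers over past frames. The names `mean_offsets`, `select_parts`, and the `weight` parameter are hypothetical.

```python
from itertools import product


def mean_offsets(history):
    """Average pairwise offsets between part centers over past frames.

    history: list of frames, each a list of (x, y) part centers.
    Returns {(i, j): (dx, dy)} for every pair i < j.
    """
    n_parts = len(history[0])
    offsets = {}
    for i in range(n_parts):
        for j in range(i + 1, n_parts):
            dx = sum(f[j][0] - f[i][0] for f in history) / len(history)
            dy = sum(f[j][1] - f[i][1] for f in history) / len(history)
            offsets[(i, j)] = (dx, dy)
    return offsets


def select_parts(candidates, history, weight=0.01):
    """Pick one candidate per part, trading candidate score against topology.

    candidates: for each part, a list of ((x, y), score) pairs
                (assumed to come from some candidate-proposal stage).
    weight:     hypothetical penalty on squared deviation from the
                historical pairwise offsets.
    """
    offsets = mean_offsets(history)
    best, best_value = None, float("-inf")
    # Exhaustive search over all combinations; only viable for a few parts.
    for combo in product(*candidates):
        value = sum(score for _, score in combo)
        for (i, j), (dx, dy) in offsets.items():
            (xi, yi), (xj, yj) = combo[i][0], combo[j][0]
            value -= weight * ((xj - xi - dx) ** 2 + (yj - yi - dy) ** 2)
        if value > best_value:
            best, best_value = [pos for pos, _ in combo], value
    return best
```

With `weight=0` the selection reduces to taking the highest-scoring candidate for each part independently; a positive `weight` lets a slightly lower-scoring candidate win when it better preserves the historical layout of the parts. The exhaustive search grows exponentially with the number of parts, so it serves only as an illustration.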
Acknowledgements
This work was supported by the Key Program of the National Natural Science Foundation of China (61432012, U1435213). The authors would like to thank the anonymous reviewers for their valuable comments and suggestions, which significantly improved the quality of this paper.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest related to this work.
About this article
Cite this article
Shi, X., Chen, G., Heng, P.A. et al. Tracking topology structure adaptively with deep neural networks. Neural Comput & Applic 30, 3317–3326 (2018). https://doi.org/10.1007/s00521-017-2906-y
DOI: https://doi.org/10.1007/s00521-017-2906-y