Abstract
In the big data age, learning with deep models has proven highly effective in a variety of vision tasks. Unfortunately, the need for enormous numbers of training samples and the high computational cost still limit its practicality in low-resource media computing applications such as online object tracking. More recently, CNN-based feature extraction has helped tracking-by-learning strategies make significant progress, although the coarse-resolution outputs of the last layer still substantially limit further improvement in tracking performance. When the hierarchy of convolutional layers is exploited as an image-pyramid representation, the earlier convolutional layers of a hierarchical CNN improve spatial localization to some extent but are less invariant to changes in target appearance, which inevitably leads to an inaccurate sampling region when non-rigid objects undergo intrinsic motion. To guarantee qualified sampling for tracking-by-learning with a hierarchical CNN, in this paper we combine inter-frame motion guidance with intra-frame appearance correlations by formulating different energy optimization processes in the spatial and temporal domains. With an optional function for combining the extracted regions, the proposed algorithm achieves more precise target localization for qualified sampling. Experiments on a challenging non-rigid tracking benchmark dataset demonstrate the superior performance of the proposed tracker in comparison with other state-of-the-art trackers.
Acknowledgments
This research is supported by the National Natural Science Foundation of China (grants 61571362 and 61601505) and by the National Research Foundation, Prime Minister's Office, Singapore, under its International Research Centre in Singapore Funding Initiative.
Cite this article
Zhang, P., Zhuo, T., Huang, H. et al. Robust tracking based on H-CNN with low-resource sampling and scaling by frame-wise motion localization. Multimed Tools Appl 77, 18781–18800 (2018). https://doi.org/10.1007/s11042-017-4493-4
DOI: https://doi.org/10.1007/s11042-017-4493-4