Robust tracking based on H-CNN with low-resource sampling and scaling by frame-wise motion localization

Abstract

In the big data age, deep models have shown outstanding effectiveness in a variety of vision tasks. Unfortunately, their demand for enormous training samples and computational resources still limits their practicality in low-resource media computing applications such as online object tracking. More recently, CNN-based feature extraction has helped tracking-by-learning strategies make significant progress, although the coarse-resolution outputs of the last layer still substantially limit further improvement of tracking performance. By exploiting the hierarchy of convolutional layers as an image pyramid representation, the earlier layers of a hierarchical CNN improve spatial localization but are less invariant to target appearance changes, which inevitably leads to an inaccurate sampling region when non-rigid objects exhibit intrinsic motion. To guarantee qualified sampling for tracking-by-learning with a hierarchical CNN, this paper incorporates inter-frame motion guidance with intra-frame appearance correlations by formulating energy optimization processes in both the spatial and temporal domains. With an optional mechanism for combining the extracted regions, the proposed algorithm achieves more precise target localization for qualified sampling. Experiments on a challenging non-rigid tracking benchmark demonstrate superior performance of the proposed tracker in comparison with other state-of-the-art trackers.
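The localization pipeline sketched in the abstract can be pictured with a short example. The following is a minimal NumPy sketch, not the authors' implementation: placeholder arrays stand in for H-CNN layer outputs, a coarse-to-fine weighted fusion combines per-layer correlation responses, and the response peak is blended with an inter-frame motion prediction as a crude stand-in for the spatio-temporal energy optimization described above. All function names, layer weights, and the blending factor alpha are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of hierarchical response fusion
# plus motion-guided localization for sampling. Feature maps are random
# placeholders; weights and alpha are illustrative assumptions.
import numpy as np

def correlation_response(feat, template):
    """Cross-correlate a single-channel feature map with a target template."""
    th, tw = template.shape
    resp = np.zeros((feat.shape[0] - th + 1, feat.shape[1] - tw + 1))
    for y in range(resp.shape[0]):
        for x in range(resp.shape[1]):
            resp[y, x] = np.sum(feat[y:y + th, x:x + tw] * template)
    return resp

def fuse_hierarchical_responses(responses, weights):
    """Resize every layer's response to the finest one and take a weighted sum
    (deeper layers weighted more for semantics, shallower layers refine location)."""
    target_shape = responses[0].shape  # finest (earliest-layer) response
    fused = np.zeros(target_shape)
    for resp, w in zip(responses, weights):
        # nearest-neighbour resize via index mapping (keeps the sketch NumPy-only)
        ys = np.linspace(0, resp.shape[0] - 1, target_shape[0]).astype(int)
        xs = np.linspace(0, resp.shape[1] - 1, target_shape[1]).astype(int)
        fused += w * resp[np.ix_(ys, xs)]
    return fused

def motion_guided_location(fused, prev_loc, motion, alpha=0.3):
    """Blend the fused-response peak with the motion-predicted position,
    a crude stand-in for the spatio-temporal energy minimization."""
    peak = np.array(np.unravel_index(np.argmax(fused), fused.shape), dtype=float)
    predicted = np.asarray(prev_loc, dtype=float) + np.asarray(motion, dtype=float)
    return (1 - alpha) * peak + alpha * predicted

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Placeholder "conv" feature maps at three resolutions and their templates.
    feats = [rng.random((64, 64)), rng.random((32, 32)), rng.random((16, 16))]
    templates = [rng.random((8, 8)), rng.random((4, 4)), rng.random((2, 2))]
    responses = [correlation_response(f, t) for f, t in zip(feats, templates)]
    fused = fuse_hierarchical_responses(responses, weights=[0.25, 0.5, 1.0])
    loc = motion_guided_location(fused, prev_loc=(28, 30), motion=(1.5, -0.5))
    print("estimated target centre for sampling:", loc)
```

In practice the estimated centre would define the region from which positive and negative samples are drawn for the next learning update, which is the "qualified sampling" role the abstract assigns to motion-guided localization.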

Acknowledgments

This research is supported by the National Natural Science Foundation of China (grants 61571362 and 61601505) and by the National Research Foundation, Prime Minister's Office, Singapore, under its International Research Centre in Singapore Funding Initiative.

Author information

Correspondence to Peng Zhang or Tao Zhuo.

About this article

Cite this article

Zhang, P., Zhuo, T., Huang, H. et al. Robust tracking based on H-CNN with low-resource sampling and scaling by frame-wise motion localization. Multimed Tools Appl 77, 18781–18800 (2018). https://doi.org/10.1007/s11042-017-4493-4
