Abstract
Object tracking remains a challenging problem in computer vision. In this paper, we present a novel method that models target appearance and combines it with structured output learning for robust online tracking within a tracking-by-detection framework. We use both convolutional features and handcrafted features to robustly encode the target appearance. First, we extract convolutional features of the target using kernels generated from the initial annotated frame. To capture appearance variation during tracking, we propose a new strategy for updating the pools of target and background kernels. Second, we employ a structured output SVM to refine the target's location, mitigating the uncertainty inherent in labeling samples as positive or negative. Compared with existing state-of-the-art trackers, our method not only enhances the robustness of the feature representation but also uses structured output prediction to avoid relying on heuristic intermediate steps to produce labeled binary samples. Extensive evaluation on the challenging OTB-50 video sequences shows competitive results in terms of both success rate and precision, demonstrating the merits of the proposed tracking method.
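The feature-extraction step described above (convolving the target region with kernels sampled from the annotated first frame, in the spirit of training-free convolutional trackers) can be illustrated with a minimal sketch. All names, patch sizes, and kernel counts here are illustrative assumptions, not the paper's actual parameters:

```python
import numpy as np

def extract_conv_features(patch, kernels):
    """Cross-correlate a target patch with a bank of kernels and
    concatenate the response maps into one feature vector.
    Hypothetical sketch of the convolutional-feature step."""
    H, W = patch.shape
    k = kernels.shape[1]  # kernels are k x k sub-patches
    feats = []
    for kern in kernels:
        # valid cross-correlation: slide the kernel over the patch
        out = np.zeros((H - k + 1, W - k + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(patch[i:i + k, j:j + k] * kern)
        feats.append(out.ravel())
    return np.concatenate(feats)

# Kernels are sampled as small sub-patches of the annotated target
# in the first frame (random data stands in for real pixels here).
rng = np.random.default_rng(0)
first_frame_target = rng.standard_normal((32, 32))
ys = rng.integers(0, 32 - 6, size=8)
xs = rng.integers(0, 32 - 6, size=8)
kernels = np.stack([first_frame_target[y:y + 6, x:x + 6]
                    for y, x in zip(ys, xs)])

feat = extract_conv_features(first_frame_target, kernels)
print(feat.shape)  # 8 kernels x 27x27 response maps -> (5832,)
```

During tracking, the kernel pool would be refreshed with new target and background sub-patches as appearance changes, which is the update strategy the abstract refers to.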
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Nos. 61403342, 61273286, U1509207, 61325019), and Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering (No. 2014KLA09).
Additional information
This article is published with open access at Springerlink.com
Junwei Li, Ph.D., is with the College of Computer Science and Technology, Zhejiang University of Technology. He is a member of the China Computer Federation. His main research interests include object tracking, machine learning, convolutional neural networks, and object detection.
Xiaolong Zhou, Ph.D., associate professor, is with the College of Computer Science and Technology, Zhejiang University of Technology. He is a member of the China Computer Federation, IEEE, and ACM. His main research interests are in visual tracking, gaze estimation, and pattern recognition.
Sixian Chan, Ph.D., is with the College of Computer Science and Technology, Zhejiang University of Technology. His main research interests include visual tracking, image processing, pattern recognition, robotics, and image understanding.
Shengyong Chen, Ph.D., professor, is an IET fellow, an IEEE senior member, and a senior member of the China Computer Federation. His main research interests include computer vision, pattern recognition, and robotics.
Rights and permissions
Open Access The articles published in this journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Li, J., Zhou, X., Chan, S. et al. Object tracking using a convolutional network and a structured output SVM. Comp. Visual Media 3, 325–335 (2017). https://doi.org/10.1007/s41095-017-0087-3