Abstract
It is challenging to track a target continuously in videos with long-term occlusion, or in which objects leave and then re-enter the scene. Existing tracking algorithms combined with online-trained object detectors perform unreliably in complex conditions, and can only provide discontinuous trajectories with jumps in position when the object is occluded. This paper proposes a novel tracking-by-detection framework that uses selection and completion to solve these problems. It has two components: tracking and trajectory completion. In the tracking component, an offline-trained object detector, based on a highly accurate deep learning model, localizes objects in the same category as the object being tracked. An object selector then determines which object should be used to re-initialize a traditional tracker. As the object selector is trained online, it allows the framework to be adaptable. During completion, a predictive non-linear autoregressive neural network completes any discontinuous trajectory. The tracking component is an online real-time algorithm, and the completion component is an after-the-event mechanism. Quantitative experiments show a significant improvement in robustness over prior state-of-the-art methods.
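The control flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the representation of the target as a centre point, and the confidence threshold used to decide that the tracker has failed are all assumptions made for illustration. In particular, the completion step below uses a fixed constant-velocity rule (a linear second-order autoregression) as a simple stand-in for the non-linear autoregressive neural network used in the paper.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Point = Tuple[float, float]  # hypothetical representation: target centre (x, y)


def track_with_selection(
    frames: Sequence[object],
    tracker_step: Callable[[object, Point], Tuple[Point, float]],
    detect: Callable[[object], List[Point]],
    select: Callable[[object, List[Point]], Optional[Point]],
    init: Point,
    min_confidence: float = 0.5,  # assumed threshold, not taken from the paper
) -> List[Optional[Point]]:
    """Online stage: follow the target, re-initializing from a selected detection."""
    trajectory: List[Optional[Point]] = []
    current: Optional[Point] = init
    for frame in frames:
        if current is not None:
            current, confidence = tracker_step(frame, current)
            if confidence < min_confidence:
                current = None  # tracker output is unreliable: treat target as lost
        if current is None:
            # The detector proposes same-category objects; the selector
            # returns the one judged to be the target, or None if none match.
            current = select(frame, detect(frame))
        trajectory.append(current)
    return trajectory


def complete_trajectory(trajectory: List[Optional[Point]]) -> List[Optional[Point]]:
    """After-the-event stage: fill gaps by constant-velocity extrapolation.

    Stand-in only: x[t] = 2*x[t-1] - x[t-2], applied per coordinate.
    """
    filled = list(trajectory)
    for t in range(2, len(filled)):
        if filled[t] is None and filled[t - 1] is not None and filled[t - 2] is not None:
            (x1, y1), (x2, y2) = filled[t - 1], filled[t - 2]
            filled[t] = (2 * x1 - x2, 2 * y1 - y2)
    return filled
```

The split into two functions mirrors the abstract's distinction between the online tracking component, which may leave gaps while the target is occluded, and the completion component, which runs afterwards on the whole trajectory.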
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Project No. 61521002), the General Financial Grant from the China Postdoctoral Science Foundation (Grant No. 2015M580100), a Research Grant of Beijing Higher Institution Engineering Research Center, and an EPSRC Travel Grant.
Author information
Additional information
This article is published with open access at Springerlink.com
Ruochen Fan is a master's candidate in the Department of Computer Science and Technology, Tsinghua University. He received his bachelor's degree from Beijing University of Posts and Telecommunications in 2016. His research interest is computer vision.
Fang-Lue Zhang is a lecturer at Victoria University of Wellington. He received his doctoral degree from Tsinghua University in 2015 and his bachelor's degree from Zhejiang University in 2009. His research interests include image and video editing, computer vision, and computer graphics.
Min Zhang is a postdoctoral fellow in the Center of Mathematical Sciences and Applications, Harvard University. She received her Ph.D. degree in computer science from Stony Brook University and another Ph.D. degree in mathematics from Zhejiang University. She is an expert in the fields of geometric modeling, medical imaging, graphics, visualization, machine learning, 3D technologies, etc.
Ralph R. Martin is a professor at Cardiff University. He obtained his Ph.D. degree from Cambridge University in 1983. He has published more than 300 papers and 15 books, covering such topics as solid and surface modeling, intelligent sketch input, geometric reasoning, reverse engineering, and computer graphics. He is a Fellow of the Learned Society of Wales, the Institute of Mathematics and its Applications, and the British Computer Society. He has served on the editorial boards of Computer-Aided Design, Computer Aided Geometric Design, and Geometric Models. He was recently awarded a Friendship Award, China’s highest honor for foreigners.
Rights and permissions
Open Access The articles published in this journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
About this article
Cite this article
Fan, R., Zhang, FL., Zhang, M. et al. Robust tracking-by-detection using a selection and completion mechanism. Comp. Visual Media 3, 285–294 (2017). https://doi.org/10.1007/s41095-017-0083-7
DOI: https://doi.org/10.1007/s41095-017-0083-7