Deep Learning and Preference Learning for Object Tracking: A Combined Approach
Abstract
Object tracking is one of the most important processes for object recognition in the field of computer vision. The aim is to accurately locate a target object in every frame of a video sequence. In this paper we propose a technique that combines two approaches well known among machine learning practitioners. First, we use deep learning to automatically extract the features that represent the original images; deep learning has been applied successfully in many computer vision applications. Second, object tracking can be seen as a ranking problem, since the regions of an image can be ranked according to their degree of overlap with the target object (the ground truth in each video frame). During tracking, the position and size of the target can change, so the algorithm has to propose several candidate regions in which the target may be found. We use a preference learning approach to build a ranking function that selects the highest-ranked bounding box, i.e., the one most likely to enclose the target object. The experimental results obtained by our method, called \( DPL^{2} \) (Deep and Preference Learning), are competitive with those of other algorithms.
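The ranking view described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the feature vectors are hypothetical two-dimensional stand-ins for the deep features, the box format (x, y, width, height) is assumed, and the ranker is a generic linear model trained by subgradient descent on a pairwise hinge loss, in the spirit of ranking SVMs. The function names `iou`, `preference_pairs`, `train_ranker` and `select_best` are illustrative only.

```python
from itertools import combinations


def iou(box_a, box_b):
    """Overlap (intersection over union) of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def preference_pairs(features, overlaps):
    """Build (preferred, other) feature pairs: higher overlap is preferred."""
    pairs = []
    for i, j in combinations(range(len(features)), 2):
        if overlaps[i] > overlaps[j]:
            pairs.append((features[i], features[j]))
        elif overlaps[j] > overlaps[i]:
            pairs.append((features[j], features[i]))
    return pairs


def train_ranker(pairs, dim, epochs=100, lr=0.1, reg=0.01):
    """Linear ranking function w, fitted on a regularized pairwise hinge loss."""
    w = [0.0] * dim
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [b - c for b, c in zip(better, worse)]
            violated = dot(w, diff) < 1.0  # margin not yet satisfied
            for k in range(dim):
                grad = reg * w[k] - (diff[k] if violated else 0.0)
                w[k] -= lr * grad
    return w


def select_best(w, features):
    """Index of the candidate region that the ranking function scores highest."""
    return max(range(len(features)), key=lambda i: dot(w, features[i]))


# Toy training frame: one ground-truth box and four candidate regions.
ground_truth = (10, 10, 20, 20)
candidates = [(10, 10, 20, 20), (14, 10, 20, 20), (20, 10, 20, 20), (10, 25, 20, 20)]
# Hypothetical feature vectors (assumption); a deep network would supply these.
features = [(0.0, 0.0), (0.2, 0.0), (0.5, 0.0), (0.0, 0.75)]

overlaps = [iou(c, ground_truth) for c in candidates]
w = train_ranker(preference_pairs(features, overlaps), dim=2)
print("selected candidate:", candidates[select_best(w, features)])
```

In a tracker built this way, the learned function would score the candidate regions proposed in each new frame, and the top-ranked box would be reported as the target location.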
Keywords
Deep learning, Preference learning, Object tracking
Notes
Acknowledgements
This work was funded by Ministerio de Economía y Competitividad de España (Grant TIN2015-65069-C2-2-R), Specialized Research Fund for the Doctoral Program of Higher Education of China (Grant 20120061110045) and the Project of Science and Technology Development Plan of Jilin Province, China (Grant 20150204007GX). The paper was written while Shuchao Pang was visiting the University of Oviedo at Gijón.