Abstract
In this paper, we introduce a new adaptive feature weighting framework for multi-modal tracking. Our proposed tracker can compactly and efficiently handle multiple sources of tracking data, such as colour, brightness, gradient orientation and thermal infrared, and adaptively weight the sources based on their reliability for tracking. The adaptive weight selection mechanism is inspired by the state-of-the-art Collins tracker, but instead of treating the tracked object as a bag of features, it takes advantage of the spatial information using a global object model. Additionally, our tracker incorporates scale into the weight selection process and is shown to outperform the Collins tracker in an extensive video evaluation.
References
Loy G, Fletcher L, Apostoloff N, Zelinsky A (2002) An adaptive fusion architecture for target tracking. In: IEEE international conference on automatic face and gesture recognition (FGR)
Spengler M, Schiele B (2003) Towards robust multi-cue integration for visual tracking. Machine Vis Appl 14(1):50–58
She K, Bebis G, Gu H, Miller R (2004) Vehicle tracking using on-line fusion of color and shape features. In: IEEE international conference on intelligent transportation systems
Smith K, Gatica-Perez D, Odobez JM (2005) Using particles to track varying numbers of interacting people. In: IEEE Conference on computer vision and pattern recognition (CVPR)
Avidan S (2005) Ensemble tracking. In: IEEE conference on computer vision and pattern recognition (CVPR)
Collins RT, Liu Y, Leordeanu M (2005) Online selection of discriminative tracking features. IEEE Trans Pattern Anal Machine Intell 27(10)
Stern H, Efros B (2002) Adaptive color space switching for face tracking in multi-colored lighting environment. In: Proceedings of IEEE international conference on automatic face and gesture recognition, Washington DC, USA, pp 249–254
Wang J, Chen X, Gao W (2005) Online selecting discriminative tracking features using particle filter. In: IEEE computer society conference on computer vision and pattern recognition (CVPR), pp 1037–1042
Grabner H, Bischof H (2006) On-line boosting and vision. In: 2006 IEEE computer society conference on computer vision and pattern recognition, vol 1, pp 260–267
Ó Conaire C, O’Connor NE, Smeaton A (2007) Thermo-visual feature fusion for object tracking using multiple spatiogram trackers. J Machine Vis Appl
Collins RT (2003) Mean-shift blob tracking through scale space. In: IEEE conference on computer vision and pattern recognition, vol 2, pp 234–240
Yilmaz A, Javed O, Shah M (2006) Object tracking: a survey. ACM Comput Surv 38(4):13
Triesch J, von der Malsburg C (2001) Democratic integration: self-organized integration of adaptive cues. Neural Comput 13(9):2049–2074
Siebel NT, Maybank S (2002) Fusion of multiple tracking algorithms for robust people tracking. In: Proceedings of ECCV02, pp 373–387
Viola P, Jones M (2001) Rapid object detection using a boosted cascade of simple features. In: IEEE computer society conference on computer vision and pattern recognition (CVPR), vol 1, pp 511–518
Wang J, Yagi Y (2008) Integrating color and shape-texture features for adaptive real-time object tracking. IEEE Trans Image Process 17(2):235–240
Babenko B, Yang MH, Belongie S (2011) Robust object tracking with online multiple instance learning. IEEE Trans Pattern Anal Machine Intell 33(8)
Kim TK, Woodley T, Stenger B, Cipolla R (2010) Online multiple classifier boosting for object tracking. In: Workshop on online learning for computer vision, San Francisco
Birchfield ST, Rangarajan S (2005) Spatiograms versus histograms for region-based tracking. In: IEEE computer society conference on computer vision and pattern recognition (CVPR), pp 1158–1163
Ó Conaire C, O’Connor NE, Smeaton AF (2007) An improved spatiogram similarity measure for robust object localisation. In: IEEE international conference on acoustics, speech, and signal processing (ICASSP)
Yilmaz A, Shafique K, Shah M (2003) Target tracking in airborne forward looking infrared imagery. Image Vision Comput 21(7):623–635
Matthews I, Ishikawa T, Baker S (2004) The template update problem. IEEE Trans Pattern Anal Machine Intell 26(6)
Jepson A, Fleet D, El-Maraghi TF (2003) Robust online appearance models for visual tracking. IEEE Trans Pattern Anal Machine Intell 25(10)
Zhou S, Chellappa R, Moghaddam B (2004) Appearance tracking using adaptive models in a particle filter. In: Proceedings of 6th Asian Conference on Computer Vision (ACCV)
Davis J, Sharma V (2005) Fusion-based background-subtraction using contour saliency. In: Proceedings of IEEE international workshop on object tracking and classification beyond the visible spectrum
Kasturi R (2006) Performance evaluation protocol for face, person, and vehicle detection & tracking analysis and content extraction (VACE-II). Technical report, Computer Science & Engineering, University of South Florida, Tampa
Stauffer C, Grimson W (1999) Adaptive background mixture models for real-time tracking. In: Proceedings of CVPR99, pp II:246–252
Van Rijsbergen CJ (1979) Information retrieval, 2nd edn. Dept. of Computer Science, University of Glasgow, Butterworths, London
Comaniciu D, Meer P (2002) Mean shift: a robust approach toward feature space analysis. IEEE Trans Pattern Anal Machine Intell 24(5):603–619
Elgammal A, Duraiswami R, Davis LS (2003) Probabilistic tracking in joint feature-spatial spaces. In: IEEE conference on computer vision and pattern recognition (CVPR)
Acknowledgments
This material is based on works supported by Science Foundation Ireland under Grant No. 03/IN.3/I361.
Appendix: Derivation of gradient of OD ratio
Before computing the object-to-distractor ratio (OD ratio), \(D\) in Eq. (6), all scores are replaced by their \(\log\) values, which makes the derivation more straightforward. Given a set of \(K\) log-similarity surfaces, \(s_k(p) = \log \rho _k(p), k \in \{1,2,\ldots ,K\},\) with \(p\) being the candidate position and \(K\) the number of features, the fused surface is given by

\[ s(p) = \sum _{k=1}^{K} w_k s_k(p) \]
where the weights sum to one, \(\sum _{i=1}^{K}w_i = 1.\) The weights, \(w_k,\) are computed from \(K\) independent variables, denoted by \(\{W_1,W_2,\ldots ,W_K\},\) as follows:

\[ w_k = \frac{e^{W_k}}{\sum _{i=1}^{K} e^{W_i}} \]
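A minimal sketch of this weight computation, assuming the standard softmax (normalised-exponential) mapping from the independent variables \(W_k\) to weights \(w_k\) that sum to one (the function name is ours, for illustration):

```python
import math

def softmax_weights(W):
    """Map K unconstrained variables W_1..W_K to weights that sum to one."""
    m = max(W)                              # subtract max for numerical stability
    e = [math.exp(x - m) for x in W]
    t = sum(e)
    return [x / t for x in e]
```

This parameterisation lets the weight selection run as unconstrained optimisation over the \(W_k\) while keeping every \(w_k\) positive and the sum fixed at one.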
Since subtraction in \(\log\) space is equivalent to division, the object-to-distractor ratio (OD ratio) is computed as:

\[ g = s(p_0) - s(p_1) \]
where \(p_0\) is the object position and \(p_1\) is the position of the strongest distractor. The goal is to adjust the independent \(W_k\) variables so as to maximise the OD ratio, \(g.\) We can see how the \(W_k\) variables affect the weights as follows:

\[ \frac{\partial w_k}{\partial W_k} = w_k \left( 1 - w_k \right) \]
and if \(a \ne k,\)

\[ \frac{\partial w_a}{\partial W_k} = -w_a w_k \]
Since the weights sum to one, they are not independent and influence each other. For \(a \ne k,\) this gives:

\[ \frac{\partial w_a}{\partial w_k} = \frac{\partial w_a / \partial W_k}{\partial w_k / \partial W_k} = -\frac{w_a}{1 - w_k} \]

Using this result, the partial derivative of the fused score with respect to the weights is given by

\[ \frac{\partial s(p)}{\partial w_k} = s_k(p) + \sum _{a \ne k} \frac{\partial w_a}{\partial w_k}\, s_a(p) = \frac{s_k(p) - s(p)}{1 - w_k} \]
The partial derivative of the fused score with respect to the independent \(W_k\) terms is then given by:

\[ \frac{\partial s(p)}{\partial W_k} = \frac{\partial s(p)}{\partial w_k} \frac{\partial w_k}{\partial W_k} = w_k \left( s_k(p) - s(p) \right) \]
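As a sanity check on this derivative, the closed form \(w_k(s_k(p) - s(p))\) can be compared against a central finite difference. The sketch below assumes the standard softmax (normalised-exponential) mapping from \(W_k\) to \(w_k\); the helper names are hypothetical, not from the paper.

```python
import math

def softmax_weights(W):
    """Weights w_k = exp(W_k) / sum_i exp(W_i); they sum to one."""
    m = max(W)                              # subtract max for numerical stability
    e = [math.exp(x - m) for x in W]
    t = sum(e)
    return [x / t for x in e]

def fused(W, s_feat):
    """Fused log-score s(p) = sum_k w_k * s_k(p) at a single position p."""
    w = softmax_weights(W)
    return sum(wk * sk for wk, sk in zip(w, s_feat))

def grad_fused(W, s_feat):
    """Analytic derivative ds(p)/dW_k = w_k * (s_k(p) - s(p))."""
    w = softmax_weights(W)
    s = sum(wk * sk for wk, sk in zip(w, s_feat))
    return [wk * (sk - s) for wk, sk in zip(w, s_feat)]
```

A quick numerical check: perturbing each \(W_k\) by a small \(\epsilon\) and differencing `fused` should reproduce `grad_fused` to within the finite-difference error.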
If we now define

\[ f_k(p) = w_k \left( s_k(p) - s(p) \right), \]

then we can write \(\partial g / \partial W_k\) as

\[ \frac{\partial g}{\partial W_k} = f_k(p_0) - f_k(p_1). \]
This represents the effect each of the independent \(W_k\) variables has on the OD ratio, \(g.\) The gradient vector is then simply

\[ \nabla g = \left( \frac{\partial g}{\partial W_1}, \frac{\partial g}{\partial W_2}, \ldots , \frac{\partial g}{\partial W_K} \right)^{T} \]
This gradient is used in a gradient ascent procedure, recomputing the distractor position, \(p_1,\) at each step by choosing the background sample with the largest fused score under the current weights.
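The gradient ascent loop described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: weights come from a softmax over the \(W_k\), the log-similarity surfaces are reduced to per-feature score vectors at the object position and at a set of background sample positions, and the function names, step count, and learning rate are all hypothetical.

```python
import math

def softmax_weights(W):
    """Weights w_k = exp(W_k) / sum_i exp(W_i); they sum to one."""
    m = max(W)
    e = [math.exp(x - m) for x in W]
    t = sum(e)
    return [x / t for x in e]

def ascend_od_ratio(W, s_obj, s_bg, steps=100, rate=0.5):
    """Gradient ascent on the OD ratio g = s(p0) - s(p1).

    W      -- independent variables controlling the feature weights
    s_obj  -- per-feature log-scores s_k(p0) at the object position p0
    s_bg   -- list of per-feature log-score vectors at background samples
    """
    for _ in range(steps):
        w = softmax_weights(W)
        fuse = lambda feats: sum(wk * sk for wk, sk in zip(w, feats))
        # Recompute the strongest distractor p1 under the current weights.
        s_dist = max(s_bg, key=fuse)
        s0, s1 = fuse(s_obj), fuse(s_dist)
        # dg/dW_k = w_k * [(s_k(p0) - s(p0)) - (s_k(p1) - s(p1))]
        grad = [wk * ((a - s0) - (b - s1))
                for wk, a, b in zip(w, s_obj, s_dist)]
        W = [Wk + rate * gk for Wk, gk in zip(W, grad)]
    return W
```

On a toy example where one feature separates the object from all background samples and the other does not, the loop drives the weight of the discriminative feature toward one, which is the intended behaviour of the weight selection.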
Ó Conaire, C., O’Connor, N.E. & Smeaton, A.F. Online adaptive feature weighting for spatiogram-bank tracking. Pattern Anal Applic 15, 367–377 (2012). https://doi.org/10.1007/s10044-012-0271-0