Abstract
We present a framework that jointly learns and then uses multiple image windows for improved classification. In addition to the entire image, which provides global context, the framework uses class-specific windows and windows that target class pairs. The location and extent of each window are set automatically by treating the window parameters as latent variables. The framework makes the following contributions: (a) localized information from the class-specific windows improves classification; (b) windows introduced to discriminate between class pairs further improve the results; (c) the window and classification parameters can be learnt effectively using a discriminative max-margin approach with latent variables; and (d) the same framework suits multiple visual tasks, such as classifying objects, scenes and actions. Experiments support each of these claims.
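To make the idea of a latent window concrete, the following is a minimal inference-time sketch and not the authors' implementation. It assumes a linear model over bag-of-visual-words histograms, an exhaustive search over a small set of candidate windows, and a class score that adds a whole-image term to the best class-specific window term. All names (`histogram`, `candidate_windows`, `score_class`, `predict`) and the toy data are hypothetical. The class-pair windows and the max-margin training step are omitted.

```python
from itertools import product


def histogram(grid, window, vocab_size):
    """L1-normalised visual-word histogram inside window (x0, y0, x1, y1), end-exclusive."""
    x0, y0, x1, y1 = window
    counts = [0.0] * vocab_size
    for row in grid[y0:y1]:
        for word in row[x0:x1]:
            counts[word] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def candidate_windows(width, height, sizes):
    """Enumerate every axis-aligned window of the given (w, h) sizes that fits in the image."""
    for w, h in sizes:
        for x0, y0 in product(range(width - w + 1), range(height - h + 1)):
            yield (x0, y0, x0 + w, y0 + h)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def score_class(grid, w_global, w_local, sizes, vocab_size):
    """Global score plus the best local score; the maximising window is the latent variable."""
    height, width = len(grid), len(grid[0])
    global_score = dot(w_global, histogram(grid, (0, 0, width, height), vocab_size))
    # Assumes at least one candidate window fits; max() raises ValueError otherwise.
    best_window, best_local = max(
        ((win, dot(w_local, histogram(grid, win, vocab_size)))
         for win in candidate_windows(width, height, sizes)),
        key=lambda pair: pair[1],
    )
    return global_score + best_local, best_window


def predict(grid, model, sizes, vocab_size):
    """Return the highest-scoring class and the window selected for it."""
    scored = {c: score_class(grid, wg, wl, sizes, vocab_size)
              for c, (wg, wl) in model.items()}
    label = max(scored, key=lambda c: scored[c][0])
    return label, scored[label][1]


# Toy image: a 4x4 grid of visual-word indices; word 1 fills the top-left corner.
grid = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
# Hypothetical weights per class: (global weights, class-specific window weights).
model = {
    "A": ([0.0, 1.0], [0.0, 1.0]),
    "B": ([0.0, 0.0], [1.0, 0.0]),
}
print(predict(grid, model, sizes=[(2, 2)], vocab_size=2))  # ('A', (0, 0, 2, 2))
```

In this toy case class A wins because its window term locks onto the 2x2 block of word 1, which the whole-image histogram dilutes. In a latent max-margin setting, the same maximisation over windows would also be run during training to fill in the unobserved window for each example.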
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bilen, H., Namboodiri, V.P., Van Gool, L.J. (2012). Classification with Global, Local and Shared Features. In: Pinz, A., Pock, T., Bischof, H., Leberl, F. (eds) Pattern Recognition. DAGM/OAGM 2012. Lecture Notes in Computer Science, vol 7476. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32717-9_14
DOI: https://doi.org/10.1007/978-3-642-32717-9_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32716-2
Online ISBN: 978-3-642-32717-9
eBook Packages: Computer Science, Computer Science (R0)