Categorical Perception

Fritz, Mario; Andriluka, Mykhaylo; Fidler, Sanja; Stark, Michael; Leonardis, Aleš; Schiele, Bernt

doi:10.1007/978-3-642-11694-0_4

Part of the book series: Cognitive Systems Monographs (COSMOS, volume 8)

Abstract

The ability to recognize and categorize entities in its environment is a vital competence of any cognitive system. Reasoning about the current state of the world, assessing the consequences of possible actions, and planning future episodes all require a concept of the roles that objects and places may play. For example, objects afford specific uses, and places are usually devoted to certain activities. The ability to represent these roles, or, more generally, categories, and to infer them from sensory observations of the world is an important constituent of a cognitive system’s perceptual processing (Section 1.3 elaborates on this with a very visual example).

Author information

Authors and Affiliations

Technische Universität Darmstadt, Darmstadt, Germany
Mario Fritz, Mykhaylo Andriluka, Michael Stark & Bernt Schiele
VICOS Lab, University of Ljubljana, Slovenia
Sanja Fidler & Aleš Leonardis

Editor information

Editors and Affiliations

Georgia Tech, RIM@GT, 30332, Atlanta, GA, USA
Henrik Iskov Christensen
DFKI GmbH, Stuhlsatzenhausweg 3, 66123, Saarbrücken, Germany
Geert-Jan M. Kruijff
School of Computer Science, University of Birmingham, B15 2TT, Birmingham, UK
Jeremy L. Wyatt

About this chapter

Cite this chapter

Fritz, M., Andriluka, M., Fidler, S., Stark, M., Leonardis, A., Schiele, B. (2010). Categorical Perception. In: Christensen, H.I., Kruijff, GJ.M., Wyatt, J.L. (eds) Cognitive Systems. Cognitive Systems Monographs, vol 8. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11694-0_4

DOI: https://doi.org/10.1007/978-3-642-11694-0_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11693-3
Online ISBN: 978-3-642-11694-0
eBook Packages: Engineering, Engineering (R0)
