The Role of Shape in Visual Recognition

Abstract

Visual recognition requires a robust representation of typical object characteristics. Among all visual characteristics, shape plays a special role: it exhibits crucial invariance properties and captures the holistic structure of objects. However, shape is an emergent property and cannot be extracted directly from an image. Representing shape is therefore challenging, since it is tied to several key problems of computer vision, such as grouping, segmentation, and correspondence. This paper reviews how the use of shape in object recognition has developed so far, discusses the reasons behind the underlying trends, and presents some promising recent contributions that point towards more accurate models of object structure.

Copyright information

© Springer-Verlag London 2013

Authors and Affiliations

  1. Heidelberg Collaboratory for Image Processing (HCI) & Interdisciplinary Center for Scientific Computing (IWR), University of Heidelberg, Heidelberg, Germany
