Object Recognition in the Geometric Era: A Retrospective

  • Joseph L. Mundy
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4170)


Recent advances in object recognition have emphasized the integration of intensity-derived features, such as affine patches, with associated geometric constraints, leading to impressive performance in complex scenes. Over the four previous decades, the central paradigm of recognition was based on formal geometric object descriptions, with a focus on the properties of such descriptions under perspective image formation. This paper will review the key advances of the geometric era and investigate the underlying causes of the movement away from formal geometry and prior models towards the use of statistical learning methods based on appearance features.


Computer Vision, Object Recognition, Recognition System, Machine Intelligence, Geometric Invariance




Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Joseph L. Mundy (1)
  1. Division of Engineering, Brown University, Providence
