Robot Vision: A Holistic View

  • Ming Xie
Conference paper


It is well understood that artificial vision enables a wide range of applications, from visual inspection, visual measurement, visual recognition and visual surveillance to the visual guidance of robot systems in real time and in real environments. However, the literature offers no definite answer as to what artificial (or robot) vision should be, or how it should be. This dilemma arises largely because artificial (or robot) vision is actively pursued by scientists from various backgrounds in both the social and natural sciences. This paper presents a new way of reorganizing the various concepts, principles and algorithms of artificial vision. In particular, we propose a function-centric view comprising five coherent categories: (a) instrumental vision, (b) behavior-based vision, (c) reconstructive vision, (d) model-based vision, and (e) cognitive vision. This function-centric view abandons the long-standing notions of low-, intermediate- and high-level vision, as they are more illusory than insightful. The contribution of this paper is twofold. First, it assesses and consolidates the current achievements in artificial (or robot) vision. Second, it objectively states the remaining challenges in order to guide future investigations in artificial (or robot) vision.

Key words

Instrumental Vision, Behavior-based Vision, Reconstructive Vision, Model-based Vision, Cognitive Vision





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Ming Xie
    • 1
  1. School of Mechanical & Aerospace Engineering, Nanyang Technological University, Singapore
