A Computer Vision System for Visual Perception in Unknown Environments

  • Xiaochun Wang
  • Xiali Wang
  • Don Mitchell Wilkes


The goal of machine learning research is to equip robots with human-like perception capabilities so that they can sense their working environment, understand the collected data, take appropriate actions, and learn from experience to improve future performance. Acquiring knowledge about its environment is therefore one of the most important tasks of any autonomous system. This is done by taking measurements with various sensors and then extracting meaningful information from those measurements. In this chapter, we present a vision-based machine perception model from a system design perspective and discuss strategies for extracting information from vision sensor data for mobile robot localization tasks. More specifically, we apply several important machine learning techniques to vision-based mobile robot navigation by discussing three issues, namely information acquisition, environmental representation, and reasoning, which together lead to a general high-level model of the problem. The model is intended to be generic enough to allow a wide variety of tasks to be performed using a single set of sensory data. We argue that the model corresponds directly to some recent biological evidence and can be applied to real-world problems, specifically those of an autonomous system operating in unknown outdoor environments.


Computer vision, Machine perception, Mobile robot localization, Information acquisition, Environmental representation, Reasoning, Unsupervised learning, Supervised learning, Reinforcement learning



Copyright information

© Xi'an Jiaotong University Press 2020

Authors and Affiliations

  • Xiaochun Wang (1)
  • Xiali Wang (2)
  • Don Mitchell Wilkes (3)

  1. School of Software Engineering, Xi’an Jiaotong University, Xi’an, China
  2. School of Information Engineering, Chang’an University, Xi’an, China
  3. Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, USA
