Autonomous Robots, Volume 42, Issue 2, pp 459–476

On the advantages of foveal mechanisms for active stereo systems in visual search tasks

  • Rui Pimentel de Figueiredo
  • Alexandre Bernardino
  • José Santos-Victor
  • Helder Araújo
Part of the following topical collections:
  1. Active Perception


In this work we study how information provided by foveated images, sampled according to the log-polar transformation, can be integrated over time to build accurate world representations and to accomplish visual search tasks efficiently. We focus on a specific visual information modality, depth, and on how to store it in a flexible memory structure. We propose a probabilistic observational model for a stereo system that relies on the Unscented Transform to propagate the uncertainty in stereo matching, which arises from spatial quantization in the retina, to the 3D Cartesian domain. Probabilistic depth measurements are integrated in a novel Sensory Ego-Sphere whose topology can be biased with foveal-like distributions, according to the autonomous agent's short-term tasks and goals. Furthermore, we investigate an Upper Confidence Bound algorithm for the task of simultaneously finding the object closest to the observer (visual search) and learning the 3D map of the surrounding environment (mapping). Task performance is assessed with both a foveated log-polar sensor and a classical uniform one. The advantages of foveal vision and custom ego-sphere representations are illustrated in a series of experiments with a realistic simulator.
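
As a rough illustration of how an unscented transform can carry stereo-matching uncertainty into depth, the sketch below treats a one-dimensional simplification and is not the authors' model. It assumes a parallel-axis pinhole stereo pair with depth Z = f·b/d, a Gaussian disparity estimate, and the common choice n + κ = 3. The function name `unscented_depth` and all parameter values are hypothetical.

```python
import math


def unscented_depth(d_mean, d_var, focal_px, baseline_m, kappa=2.0):
    """Propagate a Gaussian disparity estimate through Z = f * b / d.

    Assumed simplification: the only uncertain input is the disparity d
    (in pixels), so the input dimension is n = 1.
    """
    n = 1
    spread = math.sqrt((n + kappa) * d_var)
    # Sigma points: the mean plus one symmetric pair around it.
    sigma_points = [d_mean, d_mean + spread, d_mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    if min(sigma_points) <= 0:
        raise ValueError("all sigma points need a positive disparity")
    # Push every sigma point through the nonlinear triangulation.
    depths = [focal_px * baseline_m / d for d in sigma_points]
    # Recover the weighted mean and variance of the depth.
    z_mean = sum(w * z for w, z in zip(weights, depths))
    z_var = sum(w * (z - z_mean) ** 2 for w, z in zip(weights, depths))
    return z_mean, z_var


# Hypothetical values: 500 px focal length, 0.1 m baseline, 0.25 px^2 variance.
near_mean, near_var = unscented_depth(40.0, 0.25, 500.0, 0.1)
far_mean, far_var = unscented_depth(5.0, 0.25, 500.0, 0.1)
print(near_mean, near_var)
print(far_mean, far_var)
```

With the same disparity variance, the far point ends up with a much larger depth variance than the near one. This is why coarse quantization in the periphery of a foveated sensor matters more for distant structure.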


Stereoscopic vision, Foveal vision, Active vision, Sensory ego-sphere



This work has been partially supported by the Portuguese Foundation for Science and Technology (FCT) Project [UID/EEA/50009/2013]. Rui Figueiredo is funded by FCT Ph.D. Grant PD/BD/105779/2014. Helder Araújo would like to thank FCT (Portuguese Foundation for Science and Technology) grant UID-EEA-0048-2013.



Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  1. Institute for Systems and Robotics (ISR/IST), LARSyS, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal
