On the advantages of foveal mechanisms for active stereo systems in visual search tasks

Autonomous Robots

Abstract

In this work we study how information provided by foveated images, sampled according to the log-polar transformation, can be integrated over time to build accurate world representations and accomplish visual search tasks efficiently. We focus on a specific visual information modality, depth, and on how to store it in a flexible memory structure. We propose a probabilistic observational model for a stereo system that relies on the Unscented Transform to propagate uncertainty in stereo matching, due to spatial quantization in the retina, to the 3D Cartesian domain. Probabilistic depth measurements are integrated in a novel Sensory Ego-Sphere whose topology can be biased with foveal-like distributions, according to the autonomous agent's short-term tasks and goals. Furthermore, we investigate an Upper Confidence Bound algorithm for the task of simultaneously finding the closest object to the observer (visual search) and learning a 3D map of the surrounding environment (mapping). The performance of task execution is assessed both with a foveated log-polar sensor and with a classical uniform one. The advantages of foveal vision and custom ego-sphere representations are illustrated in a series of experiments with a realistic simulator.
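To make the uncertainty propagation step concrete, the following is a minimal sketch of a scalar Unscented Transform. It assumes a rectified pinhole stereo pair where depth is z = f·b/d, which is a simplification and not the log-polar observational model of the paper. The function names, the focal length, the baseline, and the choice kappa = 2 are illustrative assumptions.

```python
import math


def unscented_transform_1d(mean, var, fn, kappa=2.0):
    """Propagate a scalar Gaussian (mean, var) through fn with 3 sigma points.

    kappa is a tuning parameter; n + kappa = 3 is a common heuristic choice
    for Gaussian inputs (an assumption here, not taken from the paper).
    """
    n = 1  # dimension of the input variable
    spread = math.sqrt((n + kappa) * var)
    sigma_points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]

    # Push each sigma point through the nonlinear function.
    ys = [fn(x) for x in sigma_points]

    # Recover the weighted mean and variance of the transformed points.
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var


def depth_from_disparity(disparity_px, focal_px=500.0, baseline_m=0.1):
    """Hypothetical rectified-stereo depth model: z = f * b / d."""
    return focal_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Same disparity noise (variance 1 px^2) at a small and a large disparity.
    for d in (10.0, 50.0):
        z_mean, z_var = unscented_transform_1d(d, 1.0, depth_from_disparity)
        print(f"disparity={d:5.1f} px  depth mean={z_mean:.3f} m  var={z_var:.5f} m^2")
```

Under these assumptions, the same disparity noise yields a much larger depth variance at small disparities (distant points) than at large ones, which is the kind of depth-dependent uncertainty a probabilistic map has to account for.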

Notes

  1. Receptive fields are the fundamental visual processing units. Each corresponds to a specific region in the retina (image) and is represented by the average value of the photo-receptors (pixels) within it (e.g. average color). For more details, we refer the interested reader to Edelman (1995).
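The averaging described in this note can be sketched as follows. This is a simplified illustration that assumes hard-edged, non-overlapping receptive fields laid out on a log-polar grid; the function name and all parameters (`n_rings`, `n_wedges`, `rho_min`, `rho_max`) are hypothetical and not taken from the paper.

```python
import math


def logpolar_receptive_fields(image, n_rings, n_wedges, rho_min, rho_max):
    """Average pixel values inside each log-polar receptive field.

    image is a list of rows of grayscale values. Pixels closer than rho_min
    to the image center, or at rho_max and beyond, are ignored.
    Returns (means, counts); means[i][j] is None for an empty field.
    """
    height, width = len(image), len(image[0])
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    sums = [[0.0] * n_wedges for _ in range(n_rings)]
    counts = [[0] * n_wedges for _ in range(n_rings)]
    log_span = math.log(rho_max / rho_min)

    for y in range(height):
        for x in range(width):
            dx, dy = x - cx, y - cy
            r = math.hypot(dx, dy)
            if r < rho_min or r >= rho_max:
                continue
            # Ring index grows with log(r): fields get larger with eccentricity.
            ring = int(n_rings * math.log(r / rho_min) / log_span)
            ring = min(ring, n_rings - 1)  # guard against float rounding
            theta = math.atan2(dy, dx) % (2.0 * math.pi)
            wedge = int(n_wedges * theta / (2.0 * math.pi)) % n_wedges
            sums[ring][wedge] += image[y][x]
            counts[ring][wedge] += 1

    means = [
        [sums[i][j] / counts[i][j] if counts[i][j] else None
         for j in range(n_wedges)]
        for i in range(n_rings)
    ]
    return means, counts


if __name__ == "__main__":
    flat = [[7.0] * 33 for _ in range(33)]
    means, counts = logpolar_receptive_fields(flat, 4, 8, 1.0, 16.0)
    print([sum(ring) for ring in counts])  # pixels pooled per ring
```

Because ring boundaries are spaced logarithmically, outer fields pool many more pixels than inner ones, which is the space-variant resolution that foveated sensors trade for a wide field of view.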

References

  • Agarwal, A., & Blake, A. (2010). Dense stereo matching over the panum band. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(3), 416–430.

  • Agrawal, R. (1995). Sample mean based index policies with O(log n) regret for the multi-armed bandit problem. Advances in Applied Probability, 27, 1054–1078.

  • Ahmad, S., & Yu, A. J. (2013). Active sensing as Bayes-optimal sequential decision making. CoRR, abs/1305.6650. http://arxiv.org/abs/1305.6650.

  • Audibert, J.-Y., & Bubeck, S. (2010). Best arm identification in multi-armed bandits. In COLT-23th conference on learning theory-2010 (pp. 13-p).

  • Auer, P., Cesa-Bianchi, N., & Fischer, P. (2002). Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2–3), 235–256.

  • Avelino, J. A., Figueiredo, R., Moreno, P., & Bernardino, A. (2016). On the perceptual advantages of visual suppression mechanisms for dynamic robot systems. In International conference on biologically inspired cognitive architectures (BICA).

  • Begum, M., & Karray, F. (2011). Visual attention for robotic cognition: A survey. IEEE Transactions on Autonomous Mental Development, 3(1), 92–105.

  • Bernardino, A., & Santos-Victor, J. (2002). A binocular stereo algorithm for log-polar foveated systems. In H. Bülthoff, C. Wallraven, S.-W. Lee, & T. Poggio (Eds.), Biologically motivated computer vision, ser. Lecture notes in computer science (Vol. 2525, pp. 127–136). Berlin: Springer.

  • Borji, A., & Itti, L. (2013). State-of-the-art in visual attention modeling. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1), 185–207.

  • Butko, N. J., & Movellan, J. R. (2010). Infomax control of eye movements. IEEE Transactions on Autonomous Mental Development, 2(2), 91–107.

  • Carrasco, M. (2011). Visual attention: The past 25 years. Vision Research, 51(13), 1484–1525. (Vision Research 50th Anniversary Issue: Part 2). http://www.sciencedirect.com/science/article/pii/S0042698911001544.

  • Colombo, C., Rucci, M., & Dario, P. (1996). Integrating selective attention and space-variant sensing in machine vision. In J. L. C. Sanz (Ed.), Image technology: Advances in image processing, multimedia and machine vision (pp. 109–127). Springer Berlin Heidelberg.

  • Cox, D. D., & John, S. (1992). SDO: A statistical method for global optimization. In IEEE international conference on systems, man and cybernetics (pp. 1241–1246). IEEE.

  • Crawford, L. E., Landy, D., & Presson, A. N. (2014). Bias in spatial memory: Prototypes or relational categories. In Poster presented at the 36th annual conference of the cognitive science Society, Quebec.

  • Edelman, S. (1995). Receptive fields for vision: From hyperacuity to object recognition. http://cogprints.org/570/.

  • Ferreira, J., Bessière, P., Mekhnacha, K., Lobo, J., Dias, J., & Laugier, C. (2008). Bayesian models for multimodal perception of 3D structure and motion. In International conference on cognitive systems (CogSys 2008), Karlsruhe, Germany. https://hal.archives-ouvertes.fr/hal-00338800.

  • Fleming, K. A., Peters, R. A., & Bodenheimer, R. E. (2006). Image mapping and visual attention on a sensory ego-sphere. In 2006 IEEE/RSJ international conference on intelligent robots and systems, IROS 2006, Beijing, China (pp. 241–246). October 9-15, 2006. doi:10.1109/IROS.2006.281688.

  • Friston, K., Adams, R., & Montague, R. (2012). What is value accumulated reward or evidence? Frontiers in Neurorobotics. doi:10.3389/fnbot.2012.00011.

  • Hirose, M., Furuhashi, H., Miyasaka, T., & Araki, K. (2002). Reconstruction of range data by means of geodesic dome type data structure. The Journal of the Institute of Image Electronics Engineers of Japan, 31(3), 388–395.

  • Hirschmuller, H. (2008). Stereo processing by semiglobal matching and mutual information. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2), 328–341.

  • Hoffman, M. D., Brochu, E., & de Freitas, N. (2011). Portfolio allocation for Bayesian optimization. Citeseer.

  • Hornung, A., Wurm, K. M., Bennewitz, M., Stachniss, C., & Burgard, W. (2013). OctoMap: An efficient probabilistic 3D mapping framework based on octrees. Autonomous Robots. http://octomap.github.com.

  • Huang, D., Allen, T. T., Notz, W. I., & Zeng, N. (2006). Global optimization of stochastic black-box systems via sequential kriging meta-models. Journal of Global Optimization, 34(3), 441–466.

  • Itti, L., & Baldi, P. F. (2006). Bayesian surprise attracts human attention. In Advances in neural information processing systems (NIPS*2005) (Vol. 19, pp. 547–554). Cambridge, MA: MIT Press.

  • Itti, L., Koch, C., & Niebur, E. (1998). A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(11), 1254–1259.

  • Julier, S., & Uhlmann, J. (2004). Unscented filtering and nonlinear estimation. Proceedings of the IEEE, 92(3), 401–422.

  • Koch, C., & Ullman, S. (1987). Shifts in selective visual attention: Towards the underlying neural circuitry. In L. M. Vaina (Ed.), Matters of intelligence: Conceptual structures in cognitive neuroscience (pp. 115–141). Dordrecht: Springer Netherlands.

  • Kriegman, D. J., Triendl, E., & Binford, T. O. (1989). Stereo vision and navigation in buildings for mobile robots. IEEE Transactions on Robotics and Automation, 5(6), 792–803.

  • Kushner, H. J. (1964). A new method of locating the maximum point of an arbitrary multipeak curve in the presence of noise. Journal of Fluids Engineering, 86(1), 97–106.

  • Lai, T. L., & Robbins, H. (1985). Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6(1), 4–22.

  • Lizotte, D., Wang, T., Bowling, M., & Schuurmans, D. (2007). Automatic gait optimization with Gaussian process regression. In Proceedings of the IJCAI (pp. 944–949).

  • Mockus, J. (1974). On Bayesian methods for seeking the extremum. In Proceedings of the IFIP technical conference (pp. 400–404). London, UK: Springer. http://dl.acm.org/citation.cfm?id=646296.687872.

  • Moreno, P., Nunes, R., Figueiredo, R., Ferreira, R., Bernardino, A., Santos-Victor, J., Beira, R., Vargas, L., Aragão, D., & Aragão, M. (2015). Vizzy: A humanoid on wheels for assistive robotics. In Robot 2015: Second Iberian robotics conference (pp. 17–28). Springer International Publishing 2016.

  • Muller, M. E. (1959). A note on a method for generating points uniformly on n-dimensional spheres. Communications of the ACM, 2(4), 19–20. doi:10.1145/377939.377946.

  • Najemnik, J., & Geisler, W. S. (2005). Optimal eye movement strategies in visual search. Nature, 434(7031), 387–391.

  • Pamplona, D., & Bernardino, A. (2009). Smooth foveal vision with Gaussian receptive fields. In 9th IEEE-RAS international conference on humanoid robots, humanoids 2009, Paris, France (pp. 223–229). December 7–10, 2009. http://dx.doi.org/10.1109/ICHR.2009.5379575.

  • Perrollaz, M., Spalanzani, A., & Aubert, D. (2010). Probabilistic representation of the uncertainty of stereo-vision and application to obstacle detection. In Intelligent vehicles symposium (IV), 2010 IEEE (pp. 313–318). June 2010.

  • Peters, R. A., Hambuchen, K. A., & Bodenheimer, R. E. (2009). The sensory ego-sphere: A mediating interface between sensors and cognition. Autonomous Robots, 26(1), 1–19. doi:10.1007/s10514-008-9098-3.

  • Posner, M. (2012). Cognitive neuroscience of attention. Guilford Press. http://books.google.pt/books?id=8yjEjoS7EQsC.

  • Robbins, H., et al. (1952). Some aspects of the sequential design of experiments. Bulletin of the American Mathematical Society, 58(5), 527–535.

  • Ruesch, J., Lopes, M., Bernardino, A., Hornstein, J., Santos-Victor, J., & Pfeifer, R. (2008). Multimodal saliency-based bottom-up attention: A framework for the humanoid robot iCub. In IEEE international conference on robotics and automation, 2008. ICRA 2008 (pp. 962–967). May 2008.

  • Sutton, R. S., & Barto, A. G. (1998). Introduction to reinforcement learning (1st ed.). Cambridge, MA: MIT Press.

  • Tippetts, B., Lee, D. J., Lillywhite, K., & Archibald, J. (2016). Review of stereo vision algorithms and their suitability for resource-limited systems. Journal of Real-Time Image Processing, 11(1), 5–25.

  • Vijayakumar, S., Conradt, J., Shibata, T., & Schaal, S. (2001). Overt visual attention for a humanoid robot. In Proceedings of the IEEE/RSJ international conference on intelligent robots and systems, 2001 (Vol. 4, pp. 2332–2337). IEEE.

  • von Helmholtz, H. & König, A. (1896). Handbuch der physiologischen Optik (Vol. 1). L. Voss. https://books.google.pt/books?id=Lb4KAAAAIAAJ.

  • Wang, J., & Liu, Y. (2007). A closed-form solution of reconstruction from nonparallel stereo geometry used in image guided system for surgery. In N. Sebe, Y. Liu, Y. Zhuang, & T. Huang (Eds.), Multimedia content analysis and mining, ser. Lecture notes in computer science (Vol. 4577, pp. 371–380). Berlin Heidelberg: Springer.

  • Weiman, C. F. R. (1995). Binocular stereo via log-polar retinas. In SPIE, Ed.

Acknowledgements

This work has been partially supported by the Portuguese Foundation for Science and Technology (FCT) Project [UID/EEA/50009/2013]. Rui Figueiredo is funded by FCT Ph.D. Grant PD/BD/105779/2014. Helder Araújo would like to thank FCT (Portuguese Foundation for Science and Technology) grant UID-EEA-0048-2013.

Author information

Corresponding author

Correspondence to Rui Pimentel de Figueiredo.

Additional information

This is one of several papers published in Autonomous Robots comprising the Special Issue on Active Perception.

About this article

Cite this article

de Figueiredo, R.P., Bernardino, A., Santos-Victor, J. et al. On the advantages of foveal mechanisms for active stereo systems in visual search tasks. Auton Robot 42, 459–476 (2018). https://doi.org/10.1007/s10514-017-9617-1

  • DOI: https://doi.org/10.1007/s10514-017-9617-1
