Abstract
A recent trend in several robotics tasks is to consider vision as the primary sense for perceiving the environment and interacting with humans. Vision processing therefore becomes a central and challenging issue in the design of real-time control architectures. Following a biological inspiration, we propose in this paper a real-time, embedded control system that relies on visual attention to learn specific actions in each place recognized by our robot. Faced with this performance challenge, the attentional model reduces vision processing to a few regions of the visual field. However, the computational complexity of the visual chain remains an issue for a processing system embedded on an indoor robot. As the first part of our system, we therefore propose a full-hardware architecture, prototyped on reconfigurable devices, that detects salient features at the camera frame rate. The second part continuously learns these features in order to implement specific robotic tasks. This neural control layer is implemented as embedded software, making the robot fully autonomous from a computational point of view. Integrating such a system onto the robot not only accelerates the frame rate of the visual processing and relieves the control architecture, but also compresses the data flow at the output of the camera, thus reducing communication and energy consumption. We present the complete embedded sensorimotor architecture and the experimental setup. The presented results demonstrate its real-time behavior in vision-based navigation tasks.
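The attentional mechanism described above can be illustrated by a minimal software sketch: saliency is approximated here by a difference-of-Gaussians band-pass filter, and a winner-take-all loop with inhibition of return picks the few most salient locations of the visual field. This is an illustrative simplification written for this summary, not the hardware architecture of the paper; the function name, parameters, and the choice of DoG filtering are assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def salient_points(image, n_points=4, sigma1=1.0, sigma2=3.0, inhibit_radius=5):
    """Select the n most salient locations of a grayscale image.

    Saliency is approximated by a difference-of-Gaussians (DoG) band-pass
    filter; points are picked one by one with a winner-take-all rule and
    inhibition of return, so each new winner lies outside the neighborhoods
    of previously selected points.
    """
    img = image.astype(float)
    # Band-pass saliency map: difference of two Gaussian-blurred copies.
    dog = np.abs(gaussian_filter(img, sigma1) - gaussian_filter(img, sigma2))
    points = []
    for _ in range(n_points):
        # Winner-take-all: current global maximum of the saliency map.
        y, x = np.unravel_index(np.argmax(dog), dog.shape)
        points.append((int(y), int(x)))
        # Inhibition of return: suppress a window around the winner.
        y0, y1 = max(0, y - inhibit_radius), y + inhibit_radius + 1
        x0, x1 = max(0, x - inhibit_radius), x + inhibit_radius + 1
        dog[y0:y1, x0:x1] = -np.inf
    return points
```

Restricting the downstream learning to only these few points is what compresses the camera data flow: instead of full frames, only small local views around each winner need to leave the vision front-end.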
Notes
A place field is the projection onto the environment of the locations where a given place cell (PC) fires.
A Robulab from the Robosoft company, equipped with an additional computer based on an Intel i5 processor.
A neuron of this layer is connected to all the pixels of a small local image.
The merge may be performed in the superficial layer of the entorhinal cortex or in the postrhinal cortex.
The tracking system used to plot trajectories is subject to local errors represented in figures by small jumps and discontinuities. Some videos of the experiments are available at the following address: http://www-etis.ensea.fr/robotsoc
Appendix
1.1 Open access design files
The FPGA-based vision architecture can be freely downloaded for several platforms:Footnote 7
- Altera DE2-115 board equipped with the D5M camera,
- Xilinx Zynq ZC702 board equipped with the On-semi camera.
The design files contain:
- the configuration file of the target FPGA (respectively, Altera Cyclone IV 115 kLE and Zynq 7000),
- the flash image for the embedded processor (respectively, Nios-II and dual-core Cortex-A9). This image contains the executable files that read back the features.
The extracted features can then be sent through an Ethernet link to a distant computer or processed locally.
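On the host side, reading the features back over Ethernet amounts to receiving and decoding a stream of fixed-size records. The sketch below assumes a hypothetical record layout (16-bit coordinates followed by an 8-byte descriptor) and port number; the actual wire format is defined by the executable files on the Nios-II or Cortex-A9 and is not specified here.

```python
import socket
import struct

# Hypothetical record layout: x and y coordinates as little-endian 16-bit
# integers, followed by an 8-byte feature descriptor. The real format is
# defined by the embedded read-back software.
FEATURE_FMT = "<HH8s"
FEATURE_SIZE = struct.calcsize(FEATURE_FMT)  # 12 bytes per record

def parse_features(payload):
    """Decode a buffer of concatenated fixed-size feature records."""
    n = len(payload) // FEATURE_SIZE
    return [struct.unpack_from(FEATURE_FMT, payload, i * FEATURE_SIZE)
            for i in range(n)]

def read_features(host, port=5000, count=10):
    """Fetch `count` feature records from the board over TCP."""
    with socket.create_connection((host, port)) as sock:
        payload = b""
        while len(payload) < count * FEATURE_SIZE:
            chunk = sock.recv(4096)
            if not chunk:  # connection closed by the board
                break
            payload += chunk
    return parse_features(payload)
```

A fixed-size binary record keeps the parsing logic identical whether the features are consumed locally on the embedded processor or forwarded to a distant computer.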
1.2 Additional figures
Cite this article
Fiack, L., Cuperlier, N. & Miramond, B. Embedded and real-time architecture for bio-inspired vision-based robot navigation. J Real-Time Image Proc 10, 699–722 (2015). https://doi.org/10.1007/s11554-013-0391-9