Abstract
In this text we present the real-time implementation of a Bayesian framework for robotic multisensory perception on a graphics processing unit (GPU) using the Compute Unified Device Architecture (CUDA). As an additional objective, we intend to show the benefits of parallel computing for similar problems (i.e. probabilistic grid-based frameworks) and the user-friendly nature of CUDA as a programming tool. Inspired by the study of biological systems, several Bayesian inference algorithms for artificial perception have been proposed. Their high computational cost has been a prohibitive factor for real-time implementations. However, in some cases the bottleneck lies in the large data structures involved rather than in the Bayesian inference per se. We will demonstrate that the SIMD (single-instruction, multiple-data) features of GPUs make it possible to take a complicated framework of relatively simple and highly parallelisable algorithms operating on large data structures, which might take up to several minutes to execute in a regular CPU implementation, and arrive at an implementation that executes on the order of tenths of a second. The implemented multimodal perception module (including stereovision, binaural sensing and inertial sensing) builds an egocentric representation of occupancy and local motion, the Bayesian Volumetric Map (BVM), on the basis of which gaze shift decisions are made to perform active exploration and reduce the entropy of the BVM. Experimental results show that the real-time implementation successfully drives the robotic system to explore areas of the environment mapped with high uncertainty.
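As a rough illustration of what entropy-driven exploration of an occupancy grid can look like, the sketch below scores each line of sight by the summed Shannon entropy of its cells and picks the most uncertain one. It is a simplified stand-in, not the decision rule used in the paper: the dictionary layout of the grid, the function names `cell_entropy` and `most_uncertain_direction`, and the use of a plain sum as the score are all assumptions made for this example.

```python
import math

def cell_entropy(p_occ):
    """Shannon entropy, in bits, of a binary occupancy cell."""
    # A cell known to be empty (0) or occupied (1) carries no uncertainty.
    if p_occ <= 0.0 or p_occ >= 1.0:
        return 0.0
    p_emp = 1.0 - p_occ
    return -(p_occ * math.log2(p_occ) + p_emp * math.log2(p_emp))

def most_uncertain_direction(grid):
    """Return the (theta, phi) key whose cells have the highest total entropy.

    `grid` is a hypothetical layout: a dict mapping a line of sight
    (theta, phi) to a list of occupancy probabilities ordered by depth.
    """
    return max(grid, key=lambda d: sum(cell_entropy(p) for p in grid[d]))

# Toy grid: the first direction is entirely unknown (p = 0.5),
# the second is already mapped with fairly confident values.
toy_grid = {
    (0, 0): [0.5, 0.5],
    (10, 0): [0.9, 0.1],
}
target = most_uncertain_direction(toy_grid)
```

Each cell's entropy depends only on that cell's own probability, which is the kind of independent per-cell work that maps naturally onto one GPU thread per cell.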
Notes
However, the new Fermi GPUs from NVIDIA will have configurable L1 and unified L2 caches [32]. Refer to the concluding section for further discussion on the subject.
These sensor models are, in fact, Bayesian subprograms of the BVM.
Set of cells, each belonging to a particular line-of-sight (θ_C, ϕ_C) in the BVM, just preceding the first occupied cell in that direction.
CUDA streams are concurrent lanes of execution that allow parallel execution of multiple kernels on the GPU.
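The set of cells described in the third note above (the cell just preceding the first occupied cell along each line of sight) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the grid is represented as a dictionary from a direction (θ_C, ϕ_C) to a list of occupancy probabilities ordered by increasing depth, and a cell counts as occupied when its probability exceeds a hypothetical threshold of 0.5.

```python
def frontier_cells(grid, occupied_threshold=0.5):
    """Map each line of sight to the depth index just before its first occupied cell.

    `grid` is a hypothetical layout: a dict mapping (theta, phi) to a list of
    occupancy probabilities ordered from nearest to farthest cell.
    The value is None when no cell along that direction is occupied, or when
    the very first cell is occupied and so has no predecessor.
    """
    frontier = {}
    for direction, cells in grid.items():
        frontier[direction] = None
        for depth, p_occ in enumerate(cells):
            if p_occ > occupied_threshold:
                if depth > 0:
                    frontier[direction] = depth - 1
                break  # only the first occupied cell matters
    return frontier

toy_grid = {
    (0, 0): [0.1, 0.2, 0.9, 0.95],  # first occupied cell at depth 2
    (5, 0): [0.1, 0.1],             # nothing occupied along this direction
    (10, 0): [0.8, 0.1],            # occupied at depth 0, no preceding cell
}
result = frontier_cells(toy_grid)
```

Because each line of sight is scanned independently of the others, this search could in principle be assigned one thread per direction on a GPU.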
References
GPU4Vision—Accelerating Computer Vision. http://gpu4vision.icg.tugraz.at/ (2009)
Aloimonos, J., Weiss, I., Bandyopadhyay, A.: Active vision. Int. J. Comput. Vis. 1, 333–356 (1987)
Bajcsy, R.: Active perception vs passive perception. In: Third IEEE Workshop on Computer Vision, Bellair, Michigan, pp 55–59 (1985)
Barber, M.J., Clark, J.W., Anderson, C.H.: Neural representation of probabilistic information. Neural Comput. 15(8), 1843–1864 (2003). doi:10.1162/08997660360675062
Bessière, P., Laugier, C., Siegwart, R. (eds.): Probabilistic reasoning and decision making in sensory-motor systems. In: Springer Tracts in Advanced Robotics, vol 46. Springer. ISBN: 978-3-540-79006-8 (2008)
Box, G.E.P., Muller, M.: A note on the generation of random normal deviates. Ann. Math. Stat. 29(2), 610–611 (1958)
Carpenter, R.H.S.: The saccadic system: a neurological microcosm. Adv. Clin. Neurosci. Rehabil. 4, 6–8 (2004)
Caspi, A., Beutter, B.R., Eckstein, M.P.: The time course of visual information accrual guiding eye movement decisions. Proc. Natl. Acad. Sci. USA 101(35), 13086–13090 (2004)
Cutting, J.E., Vishton, P.M.: Perceiving layout and knowing distances: the integration, relative potency, and contextual use of different information about depth. In: Epstein, W., Rogers, S. (eds.) Handbook of Perception and Cognition, vol 5; Perception of Space and Motion. Academic Press, New York (1995)
Denève, S., Latham, P.E., Pouget, A.: Reading population codes: a neural implementation of ideal observers. Nat. Neurosci. 2(8), 740–745 (1999). doi:10.1038/11205
Elfes, A.: Using occupancy grids for mobile robot perception and navigation. IEEE Comput. 22(6), 46–57 (1989)
Farrugia, J.P., Horain, P., Guehenneux, E., Alusse, Y.: GpuCV: a framework for image processing acceleration with graphics processors. In: 2006 IEEE International Conference on Multimedia and Expo, pp 585–588 (2006)
Ferreira, J.F., Bessière, P., Mekhnacha, K., Lobo, J., Dias, J., Laugier, C.: Bayesian models for multimodal perception of 3D structure and motion. In: International Conference on Cognitive Systems (CogSys 2008), University of Karlsruhe, Karlsruhe, Germany, pp 103–108 (2008)
Ferreira, J.F., Pinho, C., Dias, J.: Active exploration using Bayesian models for multimodal perception. In: Campilho, A., Kamel, M. (eds.) Image Analysis and Recognition. Lecture Notes in Computer Science Series (Springer LNCS), International Conference ICIAR 2008, pp 369–378 (2008)
Ferreira, J.F., Pinho, C., Dias, J.: Bayesian sensor model for egocentric stereovision. In: 14a Conferência Portuguesa de Reconhecimento de Padrões Coimbra (RECPAD 2008) (2008)
Ferreira, J.F., Pinho, C., Dias, J.: Implementation and calibration of a Bayesian binaural system for 3D localisation. In: 2008 IEEE International Conference on Robotics and Biomimetics (ROBIO 2008), Bangkok, Thailand (2009)
Fung, J., Mann, S.: Using graphics devices in reverse: GPU-based image processing and computer vision. In: IEEE Int’l Conference on Multimedia and Expo, Hannover, Germany (2008)
Fung, J., Mann, S., Aimone, C.: OpenVIDIA: parallel GPU computer vision. In: ACM Multimedia 2005, Singapore, pp 849–852 (2005)
Hussein, M., Varshney, A., Davis, L.: On implementing graph cuts on CUDA. In: First Workshop on General Purpose Processing on Graphics Processing Units, Boston, MA (2007)
Hwu, W.M., Rodrigues, C., Ryoo, S., Stratton, J.: Compute unified device architecture application suitability. Comput. Sci. Eng. 11(3), 16–26 (2009)
Jang, H., Park, A., Jung, K.: Neural network implementation using CUDA and OpenMP. In: Proceedings of the 2008 Digital Image Computing: Techniques and Applications, pp 155–161. IEEE Computer Society, Washington, DC, USA (2008)
Knill, D.C., Pouget, A.: The Bayesian brain: the role of uncertainty in neural coding and computation. Trends Neurosci. 27(12), 712–719 (2004)
Koene, A., Morén, J., Trifa, V., Cheng, G.: Gaze shift reflex in a humanoid active vision system. In: 5th International Conference on Computer Vision Systems (ICVS 2007), Applied Computer Science Group, Bielefeld University, Germany. ISBN:978-3-00-020933-8 (2007)
Laurens, J., Droulez, J.: Bayesian processing of vestibular information. Biol. Cybernet. 96(4), 389–404 (2007)
Lebeltel, O.: Programmation Bayésienne des robots. PhD thesis, Institut National Polytechnique de Grenoble, Grenoble, France (1999)
L’Ecuyer, P.: Maximally equidistributed combined Tausworthe generators. Math. Comput. 65(213), 202–213 (1996)
Lobo, J., Ferreira, J.F., Dias, J.: Bioinspired visuovestibular artificial perception system for independent motion segmentation. In: Second International Cognitive Vision Workshop, ECCV 9th European Conference on Computer Vision, Graz, Austria. http://dib.joanneum.at/icvw2006/ (2006)
Lobo, J., Ferreira, J.F., Dias, J.: Robotic implementation of biological Bayesian models towards visuo-inertial image stabilization and gaze control. In: 2008 IEEE International Conference on Robotics and Biomimetics (ROBIO 2008), Bangkok, Thailand (2009)
Lu, Y.C., Christensen, H., Cooke, M.: Active binaural distance estimation for dynamic sources. In: Interspeech 2007, Antwerp, Belgium (2007)
Mekhnacha, K., Ahuactzin, J.M., Bessière, P., Mazer, E., Smail, L.: Exact and approximate inference in ProBT. Revue d’Intelligence Artificielle 21(3), 295–332 (2007)
NVIDIA (2007) CUDA Programming Guide ver 1.2
NVIDIA (2009) NVIDIA’s Next Generation CUDA™ Compute Architecture: Fermi™. Whitepaper, NVIDIA. http://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf
Owens, J.D., Luebke, D., Govindaraju, N., Harris, M., Krüger, J., Lefohn, A.E., Purcell, T.: A survey of general-purpose computation on graphics hardware. Comput. Graph. Forum 26(5), 80–113 (2007)
Owens, J.D., Houston, M., Luebke, D., Green, S., Stone, J.E., Phillips, J.C.: GPU computing. In: Proceedings of the IEEE, vol 96, pp 879–899 (2008)
Pinho, C., Ferreira, J.F., Bessière, P., Dias, J.: A Bayesian binaural system for 3D sound-source localisation. In: International Conference on Cognitive Systems (CogSys 2008), pp 109–114. University of Karlsruhe, Karlsruhe, Germany (2008)
Pouget, A., Dayan, P., Zemel, R.: Information processing with population codes. Nat. Rev. Neurosci. 1, 125–132 (2000)
Reinbothe, C., Boubekeur, T., Alexa, M.: Hybrid ambient occlusion. In: Proceedings of the Eurographics Symposium on Rendering (2009)
Tay, C., Mekhnacha, K., Chen, C., Yguel, M., Laugier, C.: An efficient formulation of the Bayesian occupation filter for target tracking in dynamic environments. Int. J. Veh. Auton. Syst. 6, 155–171 (2008)
Yguel, M., Aycard, O., Laugier, C.: Efficient GPU-based construction of occupancy grids using several laser range-finders. Int. J. Veh. Auton. Syst. 6(1–2), 48–83 (2007)
Acknowledgments
The authors would like to thank José Marinho for his invaluable assistance with the implementation of the inertial sensor model particle filter. This publication has been supported by EC-contract number FP6-IST-027140, Action line: Cognitive Systems. The contents of this text reflect only the authors’ views. The European Community is not liable for any use that may be made of the information contained herein.
Cite this article
Ferreira, J.F., Lobo, J. & Dias, J. Bayesian real-time perception algorithms on GPU. J Real-Time Image Proc 6, 171–186 (2011). https://doi.org/10.1007/s11554-010-0156-7
DOI: https://doi.org/10.1007/s11554-010-0156-7