Journal of Real-Time Image Processing, Volume 6, Issue 3, pp 171–186

Bayesian real-time perception algorithms on GPU

Real-time implementation of Bayesian models for multimodal perception using CUDA
  • João Filipe Ferreira
  • Jorge Lobo
  • Jorge Dias
Special Issue


In this text we present the real-time implementation of a Bayesian framework for robotic multisensory perception on a graphics processing unit (GPU) using the Compute Unified Device Architecture (CUDA). As an additional objective, we intend to show the benefits of parallel computing for similar problems (i.e. probabilistic grid-based frameworks), and the user-friendly nature of CUDA as a programming tool. Inspired by the study of biological systems, several Bayesian inference algorithms for artificial perception have been proposed. Their high computational cost has been a prohibitive factor for real-time implementations. However, in some cases the bottleneck lies in the large data structures involved rather than in the Bayesian inference per se. We will demonstrate that the SIMD (single-instruction, multiple-data) features of GPUs make it possible to take a complicated framework of relatively simple and highly parallelisable algorithms operating on large data structures, which might take up to several minutes to execute in a regular CPU implementation, and arrive at an implementation that executes on the order of tenths of a second. The implemented multimodal perception module (including stereovision, binaural sensing and inertial sensing) builds an egocentric representation of occupancy and local motion, the Bayesian Volumetric Map (BVM). Gaze shift decisions are then made on the basis of the BVM to perform active exploration and reduce its entropy. Experimental results show that the real-time implementation successfully drives the robotic system to explore areas of the environment mapped with high uncertainty.
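The idea of an occupancy grid whose entropy guides exploration can be illustrated with a minimal sketch. This is not the BVM formulation from the paper: it assumes a flat list of cells, a plain binary Bayes update per cell, and a hypothetical rule that picks the highest-entropy cell as the next gaze target. All function names and numeric values below are illustrative assumptions. The point it shows is that each cell's update and entropy depend only on that cell, which is what makes this kind of grid computation map naturally onto one GPU thread per cell.

```python
import math


def bayes_update(prior, likelihood_occ, likelihood_empty):
    """Posterior P(occupied | z) for a single cell via Bayes' rule.

    likelihood_occ   = P(z | occupied)   (assumed sensor model value)
    likelihood_empty = P(z | empty)      (assumed sensor model value)
    """
    num = likelihood_occ * prior
    den = num + likelihood_empty * (1.0 - prior)
    # Fall back to the prior if the observation has zero probability.
    return num / den if den > 0.0 else prior


def cell_entropy(p):
    """Binary Shannon entropy (in bits) of an occupancy probability."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def most_uncertain_cell(grid):
    """Index of the highest-entropy cell (hypothetical gaze target rule)."""
    entropies = [cell_entropy(p) for p in grid]
    return max(range(len(grid)), key=entropies.__getitem__)


# Four cells with an uninformative prior and made-up sensor likelihoods.
grid = [0.5, 0.5, 0.5, 0.5]
observations = [(0.9, 0.1), (0.1, 0.9), (0.5, 0.5), (0.7, 0.3)]

# Every iteration is independent of the others: on a GPU, each one
# could be handled by its own thread.
posterior = [
    bayes_update(p, l_occ, l_empty)
    for p, (l_occ, l_empty) in zip(grid, observations)
]
target = most_uncertain_cell(posterior)
```

In this toy run the third cell receives an uninformative observation, so its occupancy probability stays at 0.5 and it remains the most uncertain cell, which the selection rule then picks.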


GPU NVIDIA CUDA Multimodal Bayesian perception 



The authors would like to thank José Marinho for his invaluable assistance on the implementation of the inertial sensor model particle filter. This publication has been supported by EC-contract number FP6-IST-027140, Action line: Cognitive Systems. The contents of this text reflect only the authors' views. The European Community is not liable for any use that may be made of the information contained herein.



Copyright information

© Springer-Verlag 2010

Authors and Affiliations

  • João Filipe Ferreira (1)
  • Jorge Lobo (1)
  • Jorge Dias (1)
  1. ISR-University of Coimbra, Coimbra, Portugal
