Volume 95, Issue 4, pp 309–326

An efficient implementation of Bailey and Borwein’s algorithm for parallel random number generation on graphics processing units

  • Gleb Beliakov
  • Michael Johnstone
  • Doug Creighton
  • Tim Wilkin


Pseudorandom number generators are required for many computational tasks, such as stochastic modelling and simulation. This paper investigates the serial and parallel implementation of a Linear Congruential Generator for Graphics Processing Units (GPUs) based on the binary representation of the normal number \(\alpha _{2,3}\). We adapted two methods of modular reduction, which allowed us to perform most operations in 64-bit integer arithmetic. This improves on the original implementation, which is based on 106-bit double-double operations, and results in a four-fold increase in efficiency. We found that our implementation is faster than existing methods in the literature, and that its generation rate is close to the limiting rate imposed by the efficiency of writing to a GPU’s global memory.
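To illustrate why 64-bit integer arithmetic can suffice for such a generator, the following minimal sketch iterates a linear congruential recurrence of the form x ← 2^53 · x mod 3^33. The modulus 3^33, the multiplier 2^53, the seed and the function names are assumptions chosen for illustration; they are not stated in the abstract. The reduction used here is a plain shift-and-subtract loop, which is not one of the two modular reduction methods adapted in the paper. It only shows that every intermediate value can stay below 2^64.

```python
# Illustrative sketch only: modulus, multiplier and names are assumptions.
M = 3**33      # assumed modulus, 5559060566555523 (below 2**53)
SHIFT = 53     # assumed multiplier is 2**SHIFT


def mul_pow2_mod(x, shift=SHIFT, m=M):
    """Return (x * 2**shift) % m without any intermediate value reaching 2**64."""
    assert 0 <= x < m < 2**63
    for _ in range(shift):
        x <<= 1            # x < 2*m < 2**64, so a 64-bit register would suffice
        if x >= m:
            x -= m         # bring x back into [0, m)
    return x


def stream(seed, count):
    """Yield `count` values in [0, 1) from the assumed recurrence."""
    x = seed % M           # a seed divisible by 3 would shorten the period
    for _ in range(count):
        x = mul_pow2_mod(x)
        yield x / M        # scale the integer state to a float in [0, 1)


if __name__ == "__main__":
    for u in stream(seed=123457, count=5):
        print(u)
```

The loop performs 53 doublings per output, so it is far slower than a single multiplication followed by a fast reduction. It is meant as a readable reference against which a faster reduction could be checked, not as a practical GPU kernel.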


Keywords

GPU, Random number generation, Normal numbers

Mathematics Subject Classification

11K45 65C10 68W10 65Y05 


Copyright information

© Springer-Verlag Wien 2012

Authors and Affiliations

  • Gleb Beliakov (1)
  • Michael Johnstone (2)
  • Doug Creighton (2)
  • Tim Wilkin (1)

  1. School of Information Technology, Deakin University, Burwood, Australia
  2. Centre for Intelligent Systems Research, Deakin University, Geelong, Australia
