Parallel Fully Vectorized Marsa-LFIB4: Algorithmic and Language-Based Optimization of Recursive Computations

  • Przemysław Stpiczyński
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12044)

Abstract

This paper presents a new high-performance implementation of Marsa-LFIB4, an example of a high-quality multiple recursive pseudorandom number generator. We propose a new algorithmic approach that combines language-based vectorization techniques with a new divide-and-conquer method that exploits the special sparse structure of the matrix obtained from the recursive formula defining the generator. We also show how intrinsics for the Intel AVX2 and AVX512 vector extensions can improve performance. Our new implementation achieves good performance on several multicore architectures and is much more energy-efficient than simple SIMD-optimized implementations.
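To make the recursive formula concrete, the following is a minimal sketch of the scalar recurrence. It assumes the lags 55, 119, 179 and 256 with arithmetic modulo 2^32, as in Marsaglia's description of LFIB4; the abstract itself does not state them. The helper `seed_table` is a hypothetical seeding routine used only for illustration. The second function shows a simple blocked evaluation: because the smallest lag is 55, up to 55 consecutive values depend only on earlier ones and can be computed together. This is an illustration of why the recurrence lends itself to SIMD, not the divide-and-conquer method proposed in the paper.

```python
MASK32 = 0xFFFFFFFF  # all arithmetic is modulo 2**32

def seed_table(seed=12345):
    """Fill 256 seed words with a simple LCG (hypothetical seeding, illustration only)."""
    state = []
    x = seed & MASK32
    for _ in range(256):
        x = (69069 * x + 1234567) & MASK32
        state.append(x)
    return state

def lfib4(state, count):
    """Scalar recurrence: x[n] = x[n-55] + x[n-119] + x[n-179] + x[n-256] mod 2**32."""
    x = list(state)  # x[0..255] hold the seeds
    for n in range(256, 256 + count):
        x.append((x[n - 55] + x[n - 119] + x[n - 179] + x[n - 256]) & MASK32)
    return x[256:]

def lfib4_blocked(state, count):
    """Same sequence, produced in blocks of at most 55 mutually independent values."""
    x = list(state)
    produced = 0
    while produced < count:
        n = len(x)
        width = min(55, count - produced)
        # Every index read below is smaller than n, so the block has no internal dependencies.
        block = [(x[i - 55] + x[i - 119] + x[i - 179] + x[i - 256]) & MASK32
                 for i in range(n, n + width)]
        x.extend(block)
        produced += width
    return x[256:]
```

Calling `lfib4_blocked(seed_table(), 1000)` yields the same 1000 values as `lfib4(seed_table(), 1000)`; in a compiled language, each block would map onto a vectorizable loop.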

Keywords

Pseudorandom numbers, Recursive generators, Language-based vectorization, Intrinsics, Algorithmic approach, OpenMP

Notes

Acknowledgements

The use of computer resources installed at Maria Curie-Skłodowska University in Lublin and Czestochowa University of Technology is kindly acknowledged.


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Institute of Computer Science, Maria Curie-Skłodowska University, Lublin, Poland
