Abstract
Many complex real-world systems can be represented as correlated high-dimensional vectors (up to 20,501 dimensions in this paper). While univariate analysis is simpler, it does not account for correlations between variables. This omission often misleads researchers by producing results based on unrealistic assumptions. Because generating large correlated data sets is time-consuming and resource-intensive, we propose a graphical processing unit (GPU) accelerated version of the established NORmal To Anything (NORTA) algorithm. NORTA involves many independent and parallelizable operations, which motivated a Compute Unified Device Architecture (CUDA) implementation for use on Nvidia GPUs. NORTA begins by simulating independent standard normal vectors and transforms them into correlated vectors with arbitrary marginal distributions (heterogeneous random variables). In our benchmark studies using an Nvidia Tesla card, the speedup of GPU-NORTA over a sequential NORTA coded in R (R-NORTA) peaks at 19.6× for 2000 simulated random vectors of dimension 5000. Moreover, the speedup of GPU-NORTA over a commonly used R package for multivariate simulation (the copula package) was 2093× for 2000 simulated random vectors of dimension 20,501. Our study serves as a preliminary proof of concept, with opportunities for further optimization, implementation, and additional features.
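To make the transformation described in the abstract concrete, the following is a minimal sequential sketch of the NORTA idea in plain Python. It is not the authors' R or CUDA implementation. The function names (`cholesky`, `norta_sample`), the two example marginals, and the 0.6 correlation are all assumptions chosen for illustration. The sketch also skips the step in which NORTA adjusts the input normal correlation matrix so that the output correlations match a target, so the correlation of the generated vectors will generally differ somewhat from the input value.

```python
import math
import random
from statistics import NormalDist


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def norta_sample(corr, inverse_cdfs, n_samples, seed=0):
    """Draw n_samples vectors whose i-th component follows inverse_cdfs[i].

    corr is the correlation matrix of the intermediate normal vector
    (assumed here to be given; the correlation-matching step is omitted).
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    lower = cholesky(corr)
    dim = len(corr)
    eps = 1e-12  # keep probabilities strictly inside (0, 1)
    samples = []
    for _ in range(n_samples):
        # Step 1: independent standard normal components.
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Step 2: impose correlation with the Cholesky factor (y = L z).
        y = [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(dim)]
        # Step 3: map each component to a uniform via the normal CDF.
        u = [min(max(std_normal.cdf(v), eps), 1.0 - eps) for v in y]
        # Step 4: apply each marginal's inverse CDF.
        samples.append([inv(p) for inv, p in zip(inverse_cdfs, u)])
    return samples


# Hypothetical marginals for illustration only.
def exponential_inv(p, rate=2.0):
    return -math.log1p(-p) / rate


def uniform_inv(p, low=-1.0, high=1.0):
    return low + (high - low) * p


corr = [[1.0, 0.6], [0.6, 1.0]]
draws = norta_sample(corr, [exponential_inv, uniform_inv], n_samples=2000)
```

Each simulated vector is computed independently of the others, and steps 3 and 4 act on each component separately. That independence is what makes the algorithm a natural fit for parallel execution on a GPU.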
Acknowledgements
This material is based upon work supported by the National Science Foundation under grant number IIA1301726. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, X., Schissler, A.G., Wu, R., Barford, L., Harris, F.C. (2019). A Graphical Processing Unit Accelerated NORmal to Anything Algorithm for High Dimensional Multivariate Simulation. In: Latifi, S. (eds) 16th International Conference on Information Technology-New Generations (ITNG 2019). Advances in Intelligent Systems and Computing, vol 800. Springer, Cham. https://doi.org/10.1007/978-3-030-14070-0_46
DOI: https://doi.org/10.1007/978-3-030-14070-0_46
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14069-4
Online ISBN: 978-3-030-14070-0
eBook Packages: Engineering, Engineering (R0)