Abstract
With powerful hardware resources increasingly available, parallel computing is becoming an important issue, as it can yield a noticeable speedup. The statistical programming language R supports parallel computing on computer clusters as well as on multicore systems through several packages. This tutorial gives a short, practical overview of four packages for parallel computing in R that the authors consider important: multicore, snow, snowfall and nws. First, the general principle of parallelizing simple tasks is briefly illustrated using a statistical cross-validation example. Afterwards, the usage of each of the four packages is demonstrated on this example. Furthermore, we address some specific features of the packages and provide guidance for selecting an adequate package for the computing environment at hand.
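To make the general principle concrete, the following is a minimal sketch of a parallelized k-fold cross-validation; it is not the tutorial's own example. It assumes the base `parallel` package, whose `makeCluster()` and `parLapply()` functions follow the snow interface, rather than any of the four packages named above. The simulated data, the linear model, and the helper `cv_fold()` are hypothetical and chosen only for illustration.

```r
library(parallel)  # ships with R; used here as a stand-in for snow (assumption)

# Simulated data (hypothetical): a noisy linear relationship
set.seed(1)
n <- 100
dat <- data.frame(x = rnorm(n))
dat$y <- 2 * dat$x + rnorm(n)

# Randomly assign each observation to one of k folds
k <- 5
folds <- sample(rep(seq_len(k), length.out = n))

# Hypothetical helper: fit on all folds except `i`,
# return the mean squared prediction error on fold `i`
cv_fold <- function(i, dat, folds) {
  fit <- lm(y ~ x, data = dat[folds != i, ])
  pred <- predict(fit, newdata = dat[folds == i, ])
  mean((dat$y[folds == i] - pred)^2)
}

# Sequential version, kept as a reference
mse_seq <- sapply(seq_len(k), cv_fold, dat = dat, folds = folds)

# Parallel version: the folds are independent, so each one can be
# evaluated on a separate worker process
cl <- makeCluster(2)  # two local workers (assumption)
mse_par <- unlist(parLapply(cl, seq_len(k), cv_fold, dat = dat, folds = folds))
stopCluster(cl)

# Cross-validated estimate of the prediction error
cv_error <- mean(mse_par)
```

Because the folds do not depend on one another, replacing the sequential apply call with its cluster counterpart is the only structural change. For a task this small, the cost of starting workers and transferring data may outweigh any gain.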
Additional information
Manuel J. A. Eugster, Jochen Knaus, Christine Porzelius, Markus Schmidberger, and Esmeralda Vicedo contributed equally to this work.
Cite this article
Eugster, M.J.A., Knaus, J., Porzelius, C. et al. Hands-on tutorial for parallel computing with R. Comput Stat 26, 219–239 (2011). https://doi.org/10.1007/s00180-010-0206-4
DOI: https://doi.org/10.1007/s00180-010-0206-4