International Journal of Parallel Programming, Volume 37, Issue 1, pp 78–90

Snow: A Parallel Computing Framework for the R System

  • Luke Tierney
  • A. J. Rossini
  • Na Li

Abstract

This paper presents a simple parallel computing framework for the statistical programming language R. The system focuses on parallelizing familiar higher-level mapping functions and emphasizes ease of use in order to encourage adoption by a wide range of R users. The paper describes the design and implementation of the system, outlines examples of its use, and presents some possible directions for future development.
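
The idea of parallelizing a familiar mapping function can be illustrated with a small sketch. It is written in Python with the standard library only and is not snow's R interface: the names `DATA`, `bootstrap_mean`, and `parallel_map` are hypothetical, and a thread pool stands in for the worker processes that a distributed-memory framework would use. The sketch applies a bootstrap replicate function once through an ordinary map and once through a pool-backed map, which is the kind of drop-in substitution the abstract describes.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

# Hypothetical toy data set; not taken from the paper.
DATA = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9]


def bootstrap_mean(seed):
    """One bootstrap replicate: resample DATA with replacement, return the mean."""
    # A private generator per replicate keeps results independent of scheduling.
    rng = random.Random(seed)
    sample = [rng.choice(DATA) for _ in DATA]
    return mean(sample)


def parallel_map(fn, items, workers=4):
    """Counterpart of list(map(fn, items)) that spreads the calls over a pool."""
    # Assumption: threads stand in for the separate worker processes
    # that a distributed-memory system would start.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fn, items))  # pool.map preserves input order


seeds = range(200)
sequential = list(map(bootstrap_mean, seeds))   # ordinary mapping function
parallel = parallel_map(bootstrap_mean, seeds)  # pool-backed replacement
```

Because each replicate carries its own seed, the two calls return the same list. This makes the parallel version easy to check against the sequential one before scaling up.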

Keywords

Distributed memory · Message passing · PVM · MPI · Sockets · Bootstrap · Cross-validation

Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  1. Department of Statistics and Actuarial Science, University of Iowa, Iowa City, USA
  2. Modeling and Simulation, Novartis Pharma AG, Basel, Switzerland
  3. Division of Biostatistics, University of Minnesota, Minneapolis, USA