Caching and Visualizing Statistical Analyses
We present the cacher and CodeDepends packages for R, which provide tools for (1) caching and analyzing the code for statistical analyses and (2) distributing these analyses to others efficiently over the Web. The cacher package takes objects created by evaluating R expressions and stores them in key-value databases. These databases of cached objects can subsequently be assembled into “cache packages” for distribution over the Web. The cacher package also provides tools that help readers examine the data and code in a statistical analysis and reproduce, modify, or improve upon the results. Readers can also easily conduct alternative analyses of the data. The CodeDepends package provides complementary tools for analyzing and visualizing the code for a statistical analysis, and this functionality has been integrated into the cacher package. In this chapter, we describe the cacher and CodeDepends packages and provide examples of how they can be used for reproducible research.
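The caching idea described above (evaluate an expression once, store the resulting object in a key-value database, and reuse it later) can be illustrated with a minimal sketch. This is written in Python purely for illustration and is not the cacher API; the names `ExpressionCache`, `key_for`, `evaluate`, and `demo` are hypothetical, and keying the store on a SHA-1 hash of the expression text is an assumption made for this example.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


class ExpressionCache:
    """Toy key-value store: one file per expression, keyed by a hash of its text."""

    def __init__(self, cache_dir):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.evaluations = 0  # number of real evaluations (cache misses)

    def key_for(self, source):
        # Assumption: the key depends only on the expression text,
        # so editing the expression produces a new key.
        return hashlib.sha1(source.encode("utf-8")).hexdigest()

    def evaluate(self, source, env):
        path = self.cache_dir / self.key_for(source)
        if path.exists():
            # Cache hit: load the stored object instead of re-evaluating.
            with path.open("rb") as fh:
                return pickle.load(fh)
        # Cache miss: evaluate the expression, then store the result.
        value = eval(source, {}, env)
        self.evaluations += 1
        with path.open("wb") as fh:
            pickle.dump(value, fh)
        return value


def demo():
    # Evaluate the same expression twice; only the first call computes it.
    with tempfile.TemporaryDirectory() as tmp:
        cache = ExpressionCache(tmp)
        env = {"data": [2.0, 4.0, 6.0]}
        first = cache.evaluate("sum(data) / len(data)", env)
        second = cache.evaluate("sum(data) / len(data)", env)
        return first, second, cache.evaluations
```

Because the key in this sketch ignores the values of the variables the expression uses, a change to `data` alone would not invalidate the stored result; a real tool needs some way to track such dependencies between expressions.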
Keywords: Source File Metadata File Reproducible Research Statistical Analysis Code Cache Directory