Caching and Visualizing Statistical Analyses

Chapter in: Biomedical Informatics for Cancer Research

Abstract

We present the cacher and CodeDepends packages for R, which provide tools for (1) caching and analyzing the code for statistical analyses and (2) distributing these analyses to others efficiently over the Web. The cacher package takes objects created by evaluating R expressions and stores them in key-value databases. These databases of cached objects can subsequently be assembled into “cache packages” for distribution over the Web. The cacher package also provides tools to help readers examine the data and code in a statistical analysis and reproduce, modify, or improve upon the results; readers can also easily conduct alternative analyses of the data. The CodeDepends package provides complementary tools for analyzing and visualizing the code for a statistical analysis, and this functionality has been integrated into the cacher package. In this chapter, we describe the cacher and CodeDepends packages and provide examples of how they can be used for reproducible research.
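
The idea of storing the objects created by each evaluated expression in a key-value database can be illustrated with a minimal sketch. It is written in Python rather than R, and it is not the cacher API: the function names, the use of a SHA-1 digest of the source text as the key, and the one-file-per-key layout are all assumptions made for illustration only.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def expression_key(source: str) -> str:
    """Key for one statement: SHA-1 digest of its source text (assumed scheme)."""
    return hashlib.sha1(source.encode("utf-8")).hexdigest()


def cached_exec(source: str, env: dict, cache_dir: Path) -> bool:
    """Run one statement, or reload the objects it created from the cache.

    Returns True if the objects were loaded from the cache and False if the
    statement was actually executed. Only newly created names are stored;
    a statement that modifies an existing object is not handled here.
    """
    entry = cache_dir / expression_key(source)  # hypothetical one-file-per-key store
    if entry.exists():
        env.update(pickle.loads(entry.read_bytes()))
        return True
    before = set(env)
    exec(source, env)
    created = {
        name: env[name]
        for name in set(env) - before
        if name != "__builtins__"  # exec() adds this entry; it is not a result
    }
    entry.write_bytes(pickle.dumps(created))
    return False


# A two-statement "analysis", evaluated twice against the same cache.
cache_dir = Path(tempfile.mkdtemp())
analysis = ["x = list(range(10))", "total = sum(x)"]

first_env: dict = {}
first_run = [cached_exec(stmt, first_env, cache_dir) for stmt in analysis]

second_env: dict = {}
second_run = [cached_exec(stmt, second_env, cache_dir) for stmt in analysis]
```

On the first pass every statement is executed and its results are written to the cache directory. On the second pass, in a fresh environment, the same objects are restored without re-running any code, which is what lets a reader explore a distributed analysis without repeating expensive computations. A key based only on the source text is a simplification: it does not notice when an upstream statement changes.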

Author information

Correspondence to Roger D. Peng.

Copyright information

© 2010 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Peng, R.D., Lang, D.T. (2010). Caching and Visualizing Statistical Analyses. In: Ochs, M., Casagrande, J., Davuluri, R. (eds) Biomedical Informatics for Cancer Research. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-5714-6_17

  • DOI: https://doi.org/10.1007/978-1-4419-5714-6_17

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4419-5712-2

  • Online ISBN: 978-1-4419-5714-6

  • eBook Packages: Medicine, Medicine (R0)
