Analyzing Biological Data Using R: Methods for Graphs and Networks

  • Protocol
Bacterial Molecular Networks

Part of the book series: Methods in Molecular Biology (MIMB, volume 804)

Abstract

R is a powerful language and widely used software tool for the analysis and visualization of data. Its core capabilities can be extended through many add-on packages, several of which offer a broad range of facilities for analyzing the statistical properties of graphs. This chapter provides a practical tutorial on using R methods for graphs and networks to examine biological data and analyze their topological and statistical properties.

Acknowledgments

Funding for this research was provided by grants from La Ligue Contre Le Cancer (RAB08008NSA; project R08010NS). We also gratefully acknowledge the Institut national de la santé et de la recherche médicale (INSERM) for supporting this project. Funding for RG was provided by NIH grant P41 HG004059.

The authors would like to thank Wolfgang Huber, Tony Chiang, Shailesh Date and Denise Scholtens for helpful comments and for providing a stimulating research environment that has led to the many tools used here and to the ideas that underlie this research.

Author information

Corresponding author

Correspondence to Nolwenn Le Meur.

Copyright information

© 2012 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Le Meur, N., Gentleman, R. (2012). Analyzing Biological Data Using R: Methods for Graphs and Networks. In: van Helden, J., Toussaint, A., Thieffry, D. (eds) Bacterial Molecular Networks. Methods in Molecular Biology, vol 804. Springer, New York, NY. https://doi.org/10.1007/978-1-61779-361-5_19

  • DOI: https://doi.org/10.1007/978-1-61779-361-5_19

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-61779-360-8

  • Online ISBN: 978-1-61779-361-5

  • eBook Packages: Springer Protocols
