Web-Based Analysis of (Epi-) Genome Data Using EpiGRAPH and Galaxy
Modern life sciences are becoming increasingly data intensive, posing a significant challenge for most researchers and shifting the bottleneck of scientific discovery from data generation to data analysis. As a result, progress in genome research is increasingly impeded by bioinformatic hurdles. A new generation of powerful and easy-to-use genome analysis tools has been developed to address this issue, enabling biologists to perform complex bioinformatic analyses online - without having to learn a programming language or downloading and manually processing large datasets. In this tutorial paper, we describe the use of EpiGRAPH (http://epigraph.mpi-inf.mpg.de/) and Galaxy (http://galaxyproject.org/) for genome and epigenome analysis, and we illustrate how these two web services work together to identify epigenetic modifications that are characteristics of highly polymorphic (SNP-rich) promoters. This paper is supplemented with video tutorials (http://tinyurl.com/yc5xkqq), which provide a step-by-step guide through each example analysis.
Key wordsBioinformatics Genome analysis Statistics Machine learning Computational epigenetics Single nucleotide polymorphisms (SNPs) Evolutionary constraint
We would like to thank Joachim Büch for maintaining the IT infrastructure of EpiGRAPH, Yoichi Yamada and Sascha Tierling for providing DNA methylation data, and Martina Paulsen as well as Jörn Walter for helpful discussions. EpiGRAPH is partially funded by the European Union through the CANCERDIP project (HEALTH-F2–2007-200620; http://www.cancerdip.eu/). Galaxy is supported by NSF Grant DBI-0543285 and NIH Grant 5R01HG003646–02 as well as by funds from the Huck Institutes for Life Sciences at Penn State University and Pennsylvania Department of Health.
- 3.Zhang, M.Q. (2005) In: Pal, S. K. (ed.), PReMI. Springer-Verlag Berlin Heidelberg, Vol. 3776, pp. 31–38.Google Scholar
- 14.Moser, D., Ekawardhani, S., Kumsta, R., Palmason, H., Bock, C., Athanassiadou, Z., et al.(2008) Functional analysis of a potassium-chloride co-transporter 3 (SLC12A6) promoter polymorphism leading to an additional DNA methylation site. Neuropsychopharmacology, 34, 458–467.PubMedCrossRefGoogle Scholar
- 33.Zhang, Y., C. Rohde, S. Tierling, T.P. Jurkowski, C. Bock, D. Santacruz, S. Ragozin, R. Reinhardt, M. Groth, J. Walter, and A. Jeltsch. 2009. DNA methylation analysis of chromosome 21 gene promoters at single base pair and single allele resolution. PLoS Genet 5: e1000438.Google Scholar
- 37.Witten, I.H. and Frank, E. (2000) Data mining : practical machine learning tools and techniques with Java implementations. Morgan Kaufmann, San Francisco, Calif.Google Scholar
- 38.Hastie, T., Tibshirani, R. and Friedman, J.H. (2001) The elements of statistical learning : data mining, inference, and prediction. Springer, New York.Google Scholar