EpiSIM: simulation of multiple epistasis, linkage disequilibrium patterns and haplotype blocks for genome-wide interaction analysis
Epistasis is a ubiquitous phenomenon in genetics, and is considered to be one of the main factors in current efforts to detect missing heritability for complex diseases. Simulation is a critical tool in developing methodologies that can more effectively detect and study epistasis. Here we present a simulator, epiSIM (epistasis SIMulator), that can simulate some of the statistical properties of genetic data. EpiSIM expands the range of epistasis models that current simulators offer, including epistasis models that display marginal effects and those that display no marginal effects. One or more of these epistasis models can be embedded simultaneously into a single simulation data set, jointly determining the phenotype. In addition, epiSIM is independent of any outside data source in generating linkage disequilibrium patterns and haplotype blocks. We demonstrate the wide applicability of epiSIM by performing several data simulations, and examine its properties by comparing it with current representative simulators and by comparing the data that it generates with real data. Our experiments demonstrate that epiSIM is a valuable complement to the existing epistasis simulators. The software package is available online at https://sourceforge.net/projects/episimsimulator/files/.
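To make the notion of an epistasis model that displays no marginal effects concrete, the following is a minimal Python sketch and not epiSIM's own implementation. It assumes a hypothetical two-locus penetrance table, two independent SNPs in Hardy-Weinberg equilibrium with minor allele frequency 0.5, and a binary phenotype. The table values and the names `PENETRANCE`, `genotype_freqs`, `marginal_penetrance`, and `simulate` are illustrative assumptions. Linkage disequilibrium, haplotype blocks, and the joint embedding of several models are not modelled here.

```python
import random

# Hypothetical penetrance table (an assumption for illustration only):
# PENETRANCE[a][b] = P(case | genotype a at SNP A, genotype b at SNP B),
# with genotypes coded as minor-allele counts 0, 1, 2.
PENETRANCE = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]


def genotype_freqs(maf):
    """Hardy-Weinberg genotype frequencies for minor-allele counts 0, 1, 2."""
    q = maf
    p = 1.0 - q
    return [p * p, 2 * p * q, q * q]


def marginal_penetrance(table, maf_a, maf_b):
    """Average the table over the other locus to get per-SNP penetrances."""
    fa, fb = genotype_freqs(maf_a), genotype_freqs(maf_b)
    marg_a = [sum(table[a][b] * fb[b] for b in range(3)) for a in range(3)]
    marg_b = [sum(table[a][b] * fa[a] for a in range(3)) for b in range(3)]
    return marg_a, marg_b


def simulate(n, table, maf_a, maf_b, rng):
    """Draw n individuals as (genotype A, genotype B, phenotype) tuples."""
    fa, fb = genotype_freqs(maf_a), genotype_freqs(maf_b)
    rows = []
    for _ in range(n):
        a = rng.choices([0, 1, 2], weights=fa)[0]
        b = rng.choices([0, 1, 2], weights=fb)[0]
        # Phenotype depends only on the genotype combination.
        y = 1 if rng.random() < table[a][b] else 0
        rows.append((a, b, y))
    return rows


marg_a, marg_b = marginal_penetrance(PENETRANCE, 0.5, 0.5)
data = simulate(1000, PENETRANCE, 0.5, 0.5, random.Random(0))
```

Under these assumptions every entry of `marg_a` and `marg_b` is the same, so neither SNP is associated with the phenotype when examined alone, while the genotype combination still determines disease risk. Changing either minor allele frequency or any table entry generally breaks this balance and yields a model that does display marginal effects.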
Keywords: Epistasis simulator; Genome-wide interaction analysis; Single nucleotide polymorphisms
Acknowledgments
We are grateful to the anonymous reviewers, whose suggestions and comments significantly improved this paper. This work was supported by the Fundamental Research Funds for the Central Universities (Research on Pathogenic Patterns of Complex Diseases Based on DNA Methylation and SNP); the National Natural Science Foundation of China (Grant Nos. 61070137, 61070143); the Major Research Plan of the National Natural Science Foundation of China (Grant No. 91130006); the Key Program of the National Natural Science Foundation of China (Grant No. 60933009); and the Young Scientists Fund of the National Natural Science Foundation of China (Grant No. 61100164).
Conflicts of interest
The authors have declared that no competing interests exist.