Genes & Genomics, Volume 35, Issue 3, pp 305–316

EpiSIM: simulation of multiple epistasis, linkage disequilibrium patterns and haplotype blocks for genome-wide interaction analysis

  • Junliang Shang
  • Junying Zhang
  • Xiujuan Lei
  • Wenying Zhao
  • Yafei Dong
Research Article

Abstract

Epistasis is a ubiquitous phenomenon in genetics, and is considered to be one of the main factors in current efforts to detect missing heritability for complex diseases. Simulation is a critical tool in developing methodologies that can more effectively detect and study epistasis. Here we present a simulator, epiSIM (epistasis SIMulator), that can simulate some of the statistical properties of genetic data. EpiSIM expands the range of epistasis models that current simulators offer, including models that display marginal effects and models that display no marginal effects. One or more of these epistasis models can be embedded simultaneously into a single simulation data set, jointly determining the phenotype. In addition, epiSIM does not depend on any outside data source to generate linkage disequilibrium patterns and haplotype blocks. We demonstrate the wide applicability of epiSIM by performing several data simulations, and examine its properties by comparing it with current representative simulators and by comparing the data that it generates with real data. Our experiments demonstrate that epiSIM is a valuable complement to the existing epistasis simulators. The software package is available online at https://sourceforge.net/projects/episimsimulator/files/.
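To make the notion of an epistasis model that displays no marginal effects concrete, the following is a minimal Python sketch. It is not epiSIM's implementation or interface: the penetrance table, the function names, the minor allele frequency of 0.5, and the assumption that the two SNPs are independent and in Hardy-Weinberg equilibrium are all illustrative choices. The sketch checks that each SNP, taken alone, carries the same disease risk for every genotype, and then draws case/control labels from the joint two-locus table.

```python
import random

# Hypothetical two-locus penetrance table (an assumption for illustration):
# PENETRANCE[a][b] is the probability of disease given minor-allele counts
# a at SNP A and b at SNP B (each 0, 1, or 2).
PENETRANCE = [
    [0.0, 0.0, 0.2],
    [0.0, 0.1, 0.0],
    [0.2, 0.0, 0.0],
]


def genotype_freqs(maf):
    """Hardy-Weinberg genotype frequencies for minor-allele counts 0, 1, 2."""
    q = maf
    p = 1.0 - q
    return [p * p, 2 * p * q, q * q]


def marginal_penetrance(table, freqs_other, axis):
    """Disease risk per genotype at one SNP, averaged over the other SNP.

    axis=0 gives the risk per genotype of SNP A, axis=1 per genotype of SNP B.
    Assumes the two SNPs are independent (no linkage disequilibrium).
    """
    if axis == 0:
        return [sum(table[a][b] * freqs_other[b] for b in range(3))
                for a in range(3)]
    return [sum(table[a][b] * freqs_other[a] for a in range(3))
            for b in range(3)]


def simulate(n, maf_a, maf_b, table, seed=0):
    """Draw n individuals as (genotype A, genotype B, affected) tuples."""
    rng = random.Random(seed)
    freqs_a = genotype_freqs(maf_a)
    freqs_b = genotype_freqs(maf_b)
    samples = []
    for _ in range(n):
        a = rng.choices([0, 1, 2], weights=freqs_a)[0]
        b = rng.choices([0, 1, 2], weights=freqs_b)[0]
        # The phenotype depends only on the joint genotype of the two SNPs.
        affected = int(rng.random() < table[a][b])
        samples.append((a, b, affected))
    return samples


freqs = genotype_freqs(0.5)
marg_a = marginal_penetrance(PENETRANCE, freqs, axis=0)
marg_b = marginal_penetrance(PENETRANCE, freqs, axis=1)
samples = simulate(1000, 0.5, 0.5, PENETRANCE)
```

Under these assumptions every entry of `marg_a` and `marg_b` is equal, so a single-SNP association test has nothing to detect, while the joint table still determines the phenotype. A model with marginal effects would correspond to a table whose marginal risks differ across genotypes.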

Keywords

Epistasis simulator; Genome-wide interaction analysis; Single nucleotide polymorphisms

Notes

Acknowledgments

We are grateful to the anonymous reviewers, whose suggestions and comments significantly improved this paper. This work was supported by the Fundamental Research Funds for the Central Universities (Research on Pathogenic Patterns of Complex Diseases Based on DNA Methylation and SNP); the National Natural Science Foundation of China (Grant Nos. 61070137, 61070143); the Major Research Plan of the National Natural Science Foundation of China (Grant No. 91130006); the Key Program of the National Natural Science Foundation of China (Grant No. 60933009); and the Young Scientists Fund of the National Natural Science Foundation of China (Grant No. 61100164).

Conflicts of interest

The authors have declared that no competing interests exist.

Supplementary material

13258_2013_81_MOESM1_ESM.rar (17.8 MB)
Additional file 1 Title: current version of epiSIM (version 1.0). Description: the archive includes the current version of epiSIM, a detailed manual of its usage, and the simulation data sets used in this study. (RAR 18,266 kb)

Copyright information

© The Genetics Society of Korea 2013

Authors and Affiliations

  1. School of Computer Science and Technology, Xidian University, Xi’an, China
  2. College of Computer Science, Shaanxi Normal University, Xi’an, China
  3. College of Life Science, Shaanxi Normal University, Xi’an, China
