Human Genetics, Volume 134, Issue 7, pp 761–773

A cautionary note on the impact of protocol changes for genome-wide association SNP × SNP interaction studies: an example on ankylosing spondylitis

  • Kyrylo Bessonov
  • Elena S. Gusareva
  • Kristel Van Steen
Original Investigation


Genome-wide association interaction (GWAI) studies have increased in popularity, yet to date no standard protocol exists. In practice, any GWAI workflow involves choices about the quality control strategy, SNP filtering, linkage disequilibrium (LD) pruning, and the analytic tool used to model or test for genetic interactions. Each of these choices can affect the final epistasis findings and their reproducibility in follow-up analyses. Choosing an analytic tool is not straightforward: many tools exist, and current understanding of their performance often rests on very particular simulation settings. In the present study, we wish to raise awareness of the impact that (minor) changes in a GWAI analysis protocol can have on final epistasis findings. In particular, we investigate the influence of marker selection and marker prioritization strategies, LD pruning and the choice of epistasis detection analytics on study results, giving rise to 8 GWAI protocols. The discussion is set in the context of the ankylosing spondylitis (AS) data obtained via the Wellcome Trust Case Control Consortium (WTCCC2). As expected, the largest impact on AS epistasis findings is caused by the choice of marker selection criterion, followed by marker coding and LD pruning. In MB-MDR, co-dominant coding of main effects is more robust to the effects of LD pruning than additive coding. We were able to reproduce the previously reported epistatic involvement of HLA-B and ERAP1 in AS pathology. In addition, our results suggest involvement of MAGI3 and PARK2, which are responsible for cell adhesion and cellular trafficking. Gene ontology biological function enrichment analysis across the 8 considered GWAI protocols also suggested that AS could be associated with central nervous system malfunctions, specifically in nerve impulse propagation and in neurotransmitter metabolic processes.
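Two of the protocol choices named above, marker coding and LD pruning, can be illustrated with a minimal Python sketch. It is not the MB-MDR implementation or the pruning routine used in the study. The function names, the toy genotypes, the use of squared genotype correlation as an LD measure, and the 0.75 threshold are all assumptions made for illustration only.

```python
def additive_coding(genotypes):
    """Additive coding: one column holding the minor-allele count (0, 1 or 2)."""
    return [[g] for g in genotypes]


def codominant_coding(genotypes):
    """Co-dominant coding: two indicator columns (heterozygote, minor homozygote)."""
    return [[int(g == 1), int(g == 2)] for g in genotypes]


def r_squared(x, y):
    """Squared Pearson correlation of two genotype vectors (a simple LD proxy)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic marker: correlation undefined, treat as unlinked
    return sxy * sxy / (sxx * syy)


def ld_prune(snps, threshold=0.75):
    """Greedy pruning: keep a SNP only if its r^2 with every kept SNP is <= threshold.

    The threshold value is a hypothetical default, not taken from the study.
    """
    kept = []
    for name, geno in snps.items():
        if all(r_squared(geno, snps[k]) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy genotypes (minor-allele counts per individual); invented for this example.
snps = {
    "snp_a": [0, 1, 2, 0, 1, 2, 0, 1],
    "snp_b": [0, 1, 2, 0, 1, 2, 0, 2],  # differs from snp_a in one individual
    "snp_c": [2, 0, 1, 1, 0, 0, 2, 1],
}
kept = ld_prune(snps, threshold=0.75)
print(kept)
print(codominant_coding(snps["snp_a"])[:3])
```

In this toy example the second marker is dropped because it is strongly correlated with the first, so the set of SNP pairs passed to any downstream interaction test changes with the pruning threshold. The coding step changes what each retained marker contributes: additive coding assumes a linear allele-dose effect, whereas co-dominant coding lets each genotype have its own effect.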





The research was funded by the Fonds de la Recherche Scientifique (FNRS) (incl. FNRS F.R.F.C. project convention no. 2.4609.11). We thank François Van Lishout and Elena Gusareva from the Systems and Modeling Unit, Montefiore Institute, University of Liege, Belgium for their support and advice. This study makes use of data generated by the Wellcome Trust Case Control Consortium. A full list of the investigators who contributed to the generation of the data is available from Funding for the project was provided by the Wellcome Trust under award no. 076113.

Conflict of interest

The authors declare that they have no competing interests.

Supplementary material

  • Supplementary material 1: 439_2015_1560_MOESM1_ESM.pdf (PDF, 211 kb)
  • Supplementary material 2: 439_2015_1560_MOESM2_ESM.pdf (PDF, 204 kb)
  • Supplementary material 3: 439_2015_1560_MOESM3_ESM.xlsx (XLSX, 7062 kb)
  • Supplementary material 4: 439_2015_1560_MOESM4_ESM.xls (XLS, 296 kb)
  • Supplementary material 5: 439_2015_1560_MOESM5_ESM.xls (XLS, 33 kb)
  • Supplementary material 6: 439_2015_1560_MOESM6_ESM.xls (XLS, 149 kb)



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • Kyrylo Bessonov (1, 2)
  • Elena S. Gusareva (1, 2)
  • Kristel Van Steen (1, 2)

  1. Systems and Modeling Unit, Montefiore Institute, University of Liège, Liège, Belgium
  2. Systems Biology and Chemical Biology, GIGA-R, University of Liège, Liège, Belgium
