Abstract
Contrary to the earlier belief that epistasis is rare, advances in genomics suggest otherwise. Current practice often filters out genomic loci with low variant counts before detecting epistasis. We argue that this practice is far from optimal because it can discard strong epistatic patterns. Instead, we present the compensated Sharma–Song test, which infers genetic epistasis in genome-wide association studies by differential departure from independence. The test does not require a minimum number of replicates for each variant. We also introduce algorithms to simulate epistatic patterns that differentially depart from independence. In evaluations using two simulators, the test performed comparably to the original Sharma–Song test when variant frequencies at a locus were marginally uniform; encouragingly, it had a marked advantage over alternatives when variant frequencies were marginally nonuniform. The test further revealed uniquely clean epistatic variants associated with chicken abdominal fat content that are not prioritized by other methods. The genes involved in the largest numbers of inferred epistatic interactions between single nucleotide polymorphisms (SNPs) belong to pathways known for obesity regulation; many top SNPs are located on chromosome 20 and in intergenic regions. By measuring differential departure from independence, the compensated Sharma–Song test offers a practical choice for studying epistasis that is robust to nonuniform genetic variant frequencies.
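To make the phrase "differential departure from independence" concrete, the following minimal Python sketch computes, for two contingency tables of genotype-pair counts, how far each table departs from row-column independence and how those departures differ. It is an illustration only and not the Sharma–Song statistic or its compensated variant, which are implemented in the 'DiffXTables' R package. The function name `pearson_residuals` and the counts in `cases` and `controls` are hypothetical, chosen for the example.

```python
import numpy as np


def pearson_residuals(table):
    """Return (observed - expected) / sqrt(expected) under independence.

    Assumes every row and column total is positive, so that no expected
    count is zero.
    """
    observed = np.asarray(table, dtype=float)
    total = observed.sum()
    # Expected counts if the row and column variables were independent.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / total
    return (observed - expected) / np.sqrt(expected)


# Hypothetical 3x3 genotype-pair counts (rows: SNP A, columns: SNP B).
cases = np.array([[30, 5, 5],
                  [5, 30, 5],
                  [5, 5, 30]])
controls = np.array([[14, 13, 13],
                     [13, 14, 13],
                     [13, 13, 14]])

res_cases = pearson_residuals(cases)
res_controls = pearson_residuals(controls)

# Cell-wise difference in departure from independence between the two groups.
diff = res_cases - res_controls
print(np.round(diff, 3))
```

In this made-up example both tables have identical, uniform marginal totals, so any difference between them lies entirely in the joint pattern: the case table departs strongly from independence along the diagonal, while the control table is close to independent.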
Data and code availability
The compensated Sharma–Song test and the differential table simulation algorithms are implemented in the open-source R package ‘DiffXTables’ (Sharma and Song 2021) freely available from https://cran.r-project.org/package=DiffXTables. Data and other source code are available at Code Ocean doi: https://doi.org/10.24433/CO.7661508.v1.
References
Abu-Remaileh M, Abu-Remaileh M, Akkawi R, Knani I, Udi S, Pacold ME, Tam J, Aqeilan RI (2019) WWOX somatic ablation in skeletal muscles alters glucose metabolism. Mol Metabol 22:132–140. https://doi.org/10.1016/j.molmet.2019.01.010
Alonso L, Fuchs E (2003) Stem cells in the skin: waste not. Wnt not. Genes Dev 17(10):1189–1200. https://doi.org/10.1101/gad.1086903
Andreasen NC, Wilcox MA, Ho BC, Epping E, Ziebell S, Zeien E, Weiss B, Wassink T (2012) Statistical epistasis and progressive brain change in schizophrenia: an approach for examining the relationships between multiple genes. Mol Psychiatry 17(11):1093–1102. https://doi.org/10.1038/mp.2011.108
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B (Methodol) 57(1):289–300. https://doi.org/10.2307/2346101
Bonferroni CE (1935) Il calcolo delle assicurazioni su gruppi di teste. Studi in Onore del Professore Salvatore Ortu Carboni. Italy, Rome, pp 13–60
Brouwers B, de Oliveira EM, Marti-Solano M, Monteiro FB, Laurin SA, Keogh JM, Henning E, Bounds R, Daly CA, Houston S, Ayinampudi V, Wasiluk N, Clarke D, Plouffe B, Bouvier M, Babu MM, Farooqi IS, Mokrosiński J (2021) Human MC4R variants affect endocytosis, trafficking and dimerization revealing multiple cellular mechanisms involved in weight regulation. Cell Rep 34(12):108862. https://doi.org/10.1016/j.celrep.2021.108862
Chen N, Wang J (2018) Wnt/β-catenin signaling and obesity. Front Physiol 9:792. https://doi.org/10.3389/fphys.2018.00792
Cho Y, Ritchie M, Moore J, Park J, Lee KU, Shin H, Lee H, Park K (2004) Multifactor-dimensionality reduction shows a two-locus interaction associated with Type 2 diabetes mellitus. Diabetologia 47(3):549–554. https://doi.org/10.1007/s00125-003-1321-3
Cochran WG (1952) The \(\chi^2\) test of goodness of fit. Ann Math Stat 23(3):315–345. https://doi.org/10.1214/aoms/1177729380
Cochran WG (1954) Some methods for strengthening the common \(\chi^2\) tests. Biometrics 10(4):417–451. https://doi.org/10.2307/3001616
Cordell HJ (2002) Epistasis: what it means, what it doesn’t mean, and statistical methods to detect it in humans. Hum Mol Genet 11(20):2463–2468. https://doi.org/10.1093/hmg/11.20.2463
Domingo J, Baeza-Centurion P, Lehner B (2019) The causes and consequences of genetic interactions (epistasis). Annu Rev Genom Hum Genet 20:433–460. https://doi.org/10.1146/annurev-genom-083118-014857
Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96(456):1348–1360. https://doi.org/10.1198/016214501753382273
Foroushani AB, Brinkman FS, Lynn DJ (2013) Pathway-GPS and SIGORA: identifying relevant pathways based on the over-representation of their gene-pair signatures. PeerJ 1:e229. https://doi.org/10.7717/peerj.229
Gilleron J, Gerdes JM, Zeigerer A (2019) Metabolic regulation through the endosomal system. Traffic 20(8):552–570. https://doi.org/10.1111/tra.12670
Guo L, Sun B, Shang Z, Leng L, Wang Y, Wang N, Li H (2011) Comparison of adipose tissue cellularity in chicken lines divergently selected for fatness. Poult Sci 90(9):2024–2034. https://doi.org/10.3382/ps.2010-00863
He S, Tao YX (2014) Defect in MAPK signaling as a cause for monogenic obesity caused by inactivating mutations in the melanocortin-4 receptor gene. Int J Biol Sci 10(10):1128. https://doi.org/10.7150/ijbs.10359
Howard TD, Koppelman GH, Xu J, Zheng SL, Postma DS, Meyers DA, Bleecker ER (2002) Gene-gene interaction in asthma: IL4RA and IL13 in a Dutch population with asthma. Am J Hum Genet 70(1):230–236. https://doi.org/10.1086/338242
Huang H, Liu L, Li C, Liang Z, Huang Z, Wang Q, Li S, Zhao Z (2020) Fat mass-and obesity-associated (FTO) gene promoted myoblast differentiation through the focal adhesion pathway in chicken. 3 Biotech 10(9):1–10. https://doi.org/10.1007/s13205-020-02386-z
Jing PJ, Shen HB (2015) MACOED: a multi-objective ant colony optimization algorithm for SNP epistasis detection in genome-wide association studies. Bioinformatics 31(5):634–641. https://doi.org/10.1093/bioinformatics/btu702
Kempthorne O (1957) An introduction to genetic statistics. Wiley, New York
Kido T, Sikora-Wohlfeld W, Kawashima M, Kikuchi S, Kamatani N, Patwardhan A, Chen R, Sirota M, Kodama K, Hadley D et al (2018) Are minor alleles more likely to be risk alleles? BMC Med Genom 11(1):1–11. https://doi.org/10.1186/s12920-018-0322-5
Li F, Hu G, Zhang H, Wang S, Wang Z, Li H (2013) Epistatic effects on abdominal fat content in chickens: results from a genome-wide SNP-SNP interaction analysis. PLoS ONE 8(12):e81520. https://doi.org/10.1371/journal.pone.0081520
Lin D, Chun TH, Kang L (2016) Adipose extracellular matrix remodelling in obesity and insulin resistance. Biochem Pharmacol 119:8–16. https://doi.org/10.1016/j.bcp.2016.05.005
Loh NY, Neville MJ, Marinou K, Hardcastle SA, Fielding BA, Duncan EL, McCarthy MI, Tobias JH, Gregson CL, Karpe F et al (2015) LRP5 regulates human body fat distribution by modulating adipose progenitor biology in a dose-and depot-specific fashion. Cell Metab 21(2):262–273. https://doi.org/10.1016/j.cmet.2015.01.009
Luk CT, Shi SY, Cai EP, Sivasubramaniyam T, Krishnamurthy M, Brunt JJ, Schroer SA, Winer DA, Woo M (2017) FAK signalling controls insulin sensitivity through regulation of adipocyte survival. Nat Commun 8(1):1–13. https://doi.org/10.1038/ncomms14360
Ma L, Runesha HB, Dvorkin D, Garbe JR, Da Y (2008) Parallel and serial computing tools for testing single-locus and epistatic SNP effects of quantitative traits in genome-wide association studies. BMC Bioinform 9(1):1–9. https://doi.org/10.1186/1471-2105-9-315
Niel C, Sinoquet C, Dina C, Rocheleau G (2015) A survey about methods dedicated to epistasis detection. Front Genet 6:285. https://doi.org/10.3389/fgene.2015.00285
Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, De Bakker PI, Daly MJ et al (2007) Plink: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81(3):559–575. https://doi.org/10.1086/519795
Shang J, Zhang J, Sun Y, Liu D, Ye D, Yin Y (2011) Performance analysis of novel methods for detecting epistasis. BMC Bioinform 12(1):1–17. https://doi.org/10.1186/1471-2105-12-475
Sharma R, Song M (2021) ‘DiffXTables’: pattern analysis across contingency tables. https://CRAN.R-project.org/package=DiffXTables, R package version 0.1.3
Sharma R, Luo X, Kumar S, Song M (2020) Three co-expression pattern types across microbial transcriptional networks of plankton in two oceanic waters. In: Proceedings of the 11th ACM international conference on bioinformatics, computational biology and health informatics, Association for Computing Machinery, New York, NY, USA, BCB ’20. https://doi.org/10.1145/3388440.3412485
Sharma R, Kumar S, Song M (2021) Fundamental gene network rewiring at the second order within and across mammalian systems. Bioinformatics 37(19):3293–3301. https://doi.org/10.1093/bioinformatics/btab240
Song M, Zhong H (2020) Efficient weighted univariate clustering maps outstanding dysregulated genomic zones in human cancers. Bioinformatics 36(20):5027–5036. https://doi.org/10.1093/bioinformatics/btaa613
Song J, Zhong H, Wang H (2022) ‘Ckmeans.1d.dp’: optimal, fast, and reproducible univariate clustering. https://CRAN.R-project.org/package=Ckmeans.1d.dp, R package version 4.3.4
Tenenbaum D, Bioconductor Package Maintainer (2021) KEGGREST: Client-side REST access to the Kyoto Encyclopedia of Genes and Genomes (KEGG). R package version 1.32.0
Tuo S, Zhang J, Yuan X, He Z, Liu Y, Liu Z (2017) Niche harmony search algorithm for detecting complex disease associated high-order SNP combinations. Sci Rep 7(1):1–18. https://doi.org/10.1038/s41598-017-11064-9
Ueki M, Cordell HJ (2012) Improved statistics for genome-wide interaction analysis. PLoS Genet 8(4):e1002625. https://doi.org/10.1371/journal.pgen.1002625
Urbanowicz RJ, Kiralis J, Sinnott-Armstrong NA, Heberling T, Fisher JM, Moore JH (2012) GAMETES: a fast, direct algorithm for generating pure, strict, epistatic models with random architectures. BioData Min 5(1):1–14. https://doi.org/10.1186/1756-0381-5-16
Wan X, Yang C, Yang Q, Xue H, Fan X, Tang NL, Yu W (2010) BOOST: a fast approach to detecting gene-gene interactions in genome-wide case-control studies. Am J Hum Genet 87(3):325–340. https://doi.org/10.1016/j.ajhg.2010.07.021
Wang J, Kumar S, Song M (2020) Joint grid discretization for biological pattern discovery. In: Proceedings of the 11th ACM international conference on bioinformatics, computational biology and health informatics, BCB ’20. https://doi.org/10.1145/3388440.3412415
Wang J, Kumar S, Song J (2022) ‘GridOnClusters’: cluster-preserving multivariate joint grid discretization. https://CRAN.R-project.org/package=GridOnClusters, R package version 0.1.0
Winham S, Wang C, Motsinger-Reif AA (2011) A comparison of multifactor dimensionality reduction and L1-penalized regression to identify gene-gene interactions in genetic association studies. Stat Appl Genet Mol Biol. https://doi.org/10.2202/1544-6115.1613
Xie M, Li J, Jiang T (2012) Detecting genome-wide epistases based on the clustering of relatively frequent items. Bioinformatics 28(1):5–12. https://doi.org/10.1093/bioinformatics/btr603
Yang C, He Z, Wan X, Yang Q, Xue H, Yu W (2009) SNPHarvester: a filtering-based approach for detecting epistatic interactions in genome-wide association studies. Bioinformatics 25(4):504–511. https://doi.org/10.1093/bioinformatics/btn652
Zhang Y, Liu JS (2007) Bayesian inference of epistatic interactions in case-control studies. Nat Genet 39(9):1167–1173. https://doi.org/10.1038/ng2110
Zhang H, Wang SZ, Wang ZP, Da Y, Wang N, Hu XX, Zhang YD, Wang YX, Leng L, Tang ZQ et al (2012) A genome-wide scan of selective sweeps in two broiler chicken lines divergently selected for abdominal fat content. BMC Genom 13(1):1–16. https://doi.org/10.1186/1471-2164-13-704
Acknowledgements
The work reported here was partially supported by National Science Foundation grant 1661331 and USDA grant 2016-51181-25408.
Additional information
Communicated by Joan Cerdá.
About this article
Cite this article
Sharma, R., Sadeghian Tehrani, Z., Kumar, S. et al. Detecting genetic epistasis by differential departure from independence. Mol Genet Genomics 297, 911–924 (2022). https://doi.org/10.1007/s00438-022-01893-3
DOI: https://doi.org/10.1007/s00438-022-01893-3