Context-Dependent Mutation Effects in Proteins

  • Frank J. Poelwijk
Part of the Methods in Molecular Biology book series (MIMB, volume 1851)


Defining the extent of epistasis (the nonindependence of the effects of mutations) is essential for understanding the relationship among genotype, phenotype, and fitness in biological systems. The applications cover many areas of biological research, including biochemistry, genomics, protein and systems engineering, medicine, and evolutionary biology. However, the quantitative definitions of epistasis vary among fields, and analysis beyond pairwise effects can be problematic. Here, we demonstrate the application of a particular mathematical formalism, the weighted Walsh-Hadamard transform, which unifies a number of different definitions of epistasis. We provide a computational implementation of this analysis using a computer-generated higher-order mutational dataset. We discuss general considerations regarding the null hypothesis for independent mutational effects, which then allows a quantitative identification of epistasis in an experimental dataset.
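To make the idea of a weighted Walsh-Hadamard transform concrete, the following is a minimal sketch in Python, not the chapter's own MATLAB implementation. It assumes a complete binary dataset of 2^n phenotypes, a genotype index whose bit k is set when site k is mutated, and a weighting of (-1)^q / 2^(n-q) for a term of order q, which yields background-averaged effects. The function names `popcount` and `background_averaged_epistasis` are hypothetical and chosen only for illustration.

```python
def popcount(x: int) -> int:
    """Number of set bits in x, i.e. the number of mutated sites."""
    return bin(x).count("1")


def background_averaged_epistasis(y):
    """Weighted Walsh-Hadamard transform of a complete phenotype vector.

    y[j] is the phenotype of genotype j; bit k of j set means site k is
    mutated (assumed indexing), so y[0] is the unmutated reference.
    Returns one term per subset of sites i: index 0 is the mean phenotype,
    single-bit indices are averaged single-mutation effects, and indices
    with q bits set are q-th order epistatic terms.
    """
    size = len(y)
    n = size.bit_length() - 1
    if size == 0 or size != 1 << n:
        raise ValueError("y must contain exactly 2**n phenotypes")

    terms = []
    for i in range(size):
        q = popcount(i)
        # Row i of the Hadamard matrix: H[i][j] = (-1)**popcount(i & j)
        total = sum((-1) ** popcount(i & j) * y[j] for j in range(size))
        # Assumed weighting: sign (-1)**q, averaged over 2**(n - q) backgrounds
        terms.append((-1) ** q * total / 2 ** (n - q))
    return terms


# Two sites with purely additive effects (1 and 2): no pairwise term.
print(background_averaged_epistasis([0, 1, 2, 3]))  # [1.5, 1.0, 2.0, 0.0]

# Same sites, but the double mutant exceeds the additive expectation by 1.
print(background_averaged_epistasis([0, 1, 2, 4]))  # [1.75, 1.5, 2.5, 1.0]
```

In the second example the last term equals y11 - y10 - y01 + y00 = 1, the familiar pairwise definition of epistasis, while the single-site terms are averaged over both backgrounds of the other site. A different choice of weights would give a different definition of epistasis from the same Hadamard sums.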

Key words

Epistasis; Higher-order epistasis; Context-dependent mutations; Amino acid interactions; Evolutionary biology; Fitness; Combinatorial mutagenesis



I thank Michael A. Stiffler and DerZen Fan for critical reading of the manuscript.

Supplementary material (22 kb)
Data 1 Computational Scripts in MATLAB (ZIP 23 KB)



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Frank J. Poelwijk (1)
  1. cBio Center, Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, USA
