Universality Classes of Interaction Structures for NK Fitness Landscapes

Abstract

Kauffman’s NK-model is a paradigmatic example of a class of stochastic models of genotypic fitness landscapes that aim to capture generic features of epistatic interactions in multilocus systems. Genotypes are represented as sequences of L binary loci. The fitness assigned to a genotype is a sum of contributions, each of which is a random function defined on a subset of \(k \le L\) loci. These subsets or neighborhoods determine the genetic interactions of the model. Whereas earlier work on the NK model suggested that most of its properties are robust with regard to the choice of neighborhoods, recent work has revealed an important and sometimes counter-intuitive influence of the interaction structure on the properties of NK fitness landscapes. Here we review these developments and present new results concerning the number of local fitness maxima and the statistics of selectively accessible (that is, fitness-monotonic) mutational pathways. In particular, we develop a unified framework for computing the exponential growth rate of the expected number of local fitness maxima as a function of L, and identify two different universality classes of interaction structures that display different asymptotics of this quantity for large k. Moreover, we show that the probability that the fitness landscape can be traversed along an accessible path decreases exponentially in L for a large class of interaction structures that we characterize as locally bounded. Finally, we discuss the impact of the NK interaction structures on the dynamics of evolution using adaptive walk models.



Notes

  1. Note that incorrect expressions for \(\rho (d)\) appear in some of the literature preceding [8].

References

  1. Aita, T., Uchiyama, H., Inaoka, T., Nakajima, M., Kokubo, T., Husimi, Y.: Analysis of a local fitness landscape with a model of the rough Mt. Fuji-type landscape: application to prolyl endopeptidase and thermolysin. Biopolymers 54(1), 64–79 (2000)
  2. Altenberg, L.: NK fitness landscapes. In: Bäck, T., Fogel, D.B., Michalewicz, Z. (eds.) Handbook of Evolutionary Computation. IOP Publishing Ltd and Oxford University Press, Oxford (1997)
  3. Bank, C., Matuszewski, S., Hietpas, R.T., Jensen, J.D.: On the (un)predictability of a large intragenic fitness landscape. Proc. Natl. Acad. Sci. USA 113, 14085–14090 (2016)
  4. Berestycki, J., Brunet, É., Shi, Z.: The number of accessible paths in the hypercube. Bernoulli 22, 653–680 (2016)
  5. Berestycki, J., Brunet, É., Shi, Z.: Accessibility percolation with backsteps. ALEA, Lat. Am. J. Probab. Math. Stat. 14, 45–62 (2017)
  6. Buzas, J., Dinitz, J.: An analysis of NK landscapes: interaction structure, statistical properties and expected number of local optima. IEEE Trans. Evolut. Comput. 18(6), 807–818 (2014)
  7. Campos, P.R.A., Adami, C., Wilke, C.O.: Optimal adaptive performance and delocalization in NK fitness landscapes. Phys. A: Stat. Mech. Appl. 304, 495–506 (2002)
  8. Campos, P.R.A., Adami, C., Wilke, C.O.: Optimal adaptive performance and delocalization in NK fitness landscapes (Erratum). Phys. A: Stat. Mech. Appl. 318, 637 (2003)
  9. Carneiro, M., Hartl, D.L.: Adaptive landscapes and protein evolution. Proc. Natl. Acad. Sci. USA 107, 1747–1751 (2010)
  10. Crona, K., Greene, D., Barlow, M.: The peaks and geometry of fitness landscapes. J. Theor. Biol. 318, 1–10 (2013)
  11. Crona, K., Gavryushkin, A., Greene, D., Beerenwinkel, N.: Inferring genetic interactions from comparative fitness data. eLife 6, e28629 (2017)
  12. de Oliveira, V.M., Fontanari, J.F., Stadler, P.F.: Metastable states in short-ranged \(p\)-spin glasses. J. Phys. A 32, 8793–8802 (1999)
  13. de Visser, J.A.G.M., Krug, J.: Empirical fitness landscapes and the predictability of evolution. Nat. Rev. Genet. 15, 480–490 (2014)
  14. de Visser, J.A.G.M., Park, S.C., Krug, J.: Exploring the effect of sex on empirical fitness landscapes. Am. Nat. 174, S15–S30 (2009)
  15. de Visser, J.A.G.M., Cooper, T.F., Elena, S.F.: The causes of epistasis. Proc. R. Soc. Lond. Ser. B 278, 3617–3624 (2011)
  16. de Haan, L., Ferreira, A.: Extreme Value Theory: An Introduction. Springer Series in Operations Research. Springer, Berlin (2006)
  17. Dean, D.S.: Metastable states of spin glasses on random thin graphs. Eur. Phys. J. B 15, 493–498 (2000)
  18. DePristo, M.A., Hartl, D.L., Weinreich, D.M.: Mutational reversions during adaptive protein evolution. Mol. Biol. Evol. 24, 1608–1610 (2007)
  19. Durrett, R., Limic, V.: Rigorous results for the NK model. Ann. Prob. 31, 1713–1753 (2003)
  20. Evans, S.N., Steinsaltz, D.: Estimating some features of NK fitness landscapes. Ann. Appl. Probab. 12, 1299–1321 (2002)
  21. Ferretti, L., Schmiegelt, B., Weinreich, D., Yamauchi, A., Kobayashi, Y., Tajima, F., Achaz, G.: Measuring epistasis in fitness landscapes: the correlation of fitness effects of mutations. J. Theor. Biol. 396, 132–143 (2016)
  22. Fiocco, D., Foffi, G., Sastry, S.: Encoding of memory in sheared amorphous solids. Phys. Rev. Lett. 112, 025702 (2014)
  23. Flyvbjerg, H., Lautrup, B.: Evolution in a rugged fitness landscape. Phys. Rev. A 46, 6714–6723 (1992)
  24. Franke, J., Krug, J.: Evolutionary accessibility in tunably rugged fitness landscapes. J. Stat. Phys. 148, 705–722 (2012)
  25. Franke, J., Klözer, A., de Visser, J.A.G.M., Krug, J.: Evolutionary accessibility of mutational pathways. PLoS Comput. Biol. 7(8), e1002134 (2011)
  26. Gavrilets, S.: Fitness Landscapes and the Origin of Species. Princeton University Press, Princeton (2004)
  27. Genz, A., Bretz, F., Miwa, T., Mi, X., Leisch, F., Scheipl, F., Hothorn, T.: mvtnorm: Multivariate Normal and t Distributions. R package version 1.0-6 (2017)
  28. Genz, A.: Numerical computation of multivariate normal probabilities. J. Comput. Gr. Stat. 1(2), 141–149 (1992)
  29. Gillespie, J.H.: A simple stochastic gene substitution model. Theor. Popul. Biol. 23, 202–215 (1983)
  30. Gillespie, J.H.: Molecular evolution over the mutational landscape. Evolution 38, 1116–1129 (1984)
  31. Haldane, J.B.S.: A mathematical theory of natural selection, Part VIII: metastable populations. Proc. Camb. Philos. Soc. 27, 137–142 (1931)
  32. Hartl, D.L.: What can we learn from fitness landscapes? Curr. Opin. Microbiol. 21, 51–57 (2014)
  33. Hegarty, P., Martinsson, A.: On the existence of accessible paths in various models of fitness landscapes. Ann. Appl. Probab. 24, 1375–1395 (2014)
  34. Hwang, S., Park, S.C., Krug, J.: Genotypic complexity of Fisher’s geometric model. Genetics 206, 1049–1079 (2017)
  35. Isner, B.A., Lacks, D.J.: Generic rugged landscapes under strain and the possibility of rejuvenation in glasses. Phys. Rev. Lett. 96, 025506 (2006)
  36. Jain, K.: Number of adaptive steps to a local fitness peak. Europhys. Lett. 96, 58006 (2011)
  37. Jain, K., Seetharaman, S.: Multiple adaptive substitutions during evolution in novel environments. Genetics 189, 1029–1043 (2011)
  38. Kanwal, R.P.: Linear Integral Equations: Theory & Technique. Modern Birkhäuser Classics. Birkhäuser, Basel (2012)
  39. Kauffman, S.A.: The Origins of Order. Oxford University Press, Oxford (1993)
  40. Kauffman, S., Levin, S.: Towards a general theory of adaptive walks on rugged landscapes. J. Theor. Biol. 128(1), 11–45 (1987)
  41. Kauffman, S.A., Weinberger, E.D.: The NK model of rugged fitness landscapes and its application to maturation of the immune response. J. Theor. Biol. 141, 211–245 (1989)
  42. Kimura, M.: On the probability of fixation of mutant genes in a population. Genetics 47, 713–719 (1962)
  43. Kingman, J.F.C.: A simple model for the balance between selection and mutation. J. Appl. Probab. 15(1), 1–12 (1978)
  44. Kondrashov, D.A., Kondrashov, F.A.: Topological features of rugged fitness landscapes in sequence space. Trends Genet. 31, 24–33 (2015)
  45. Kouyos, R.D., Leventhal, G.E., Hinkley, T., Haddad, M., Whitcomb, J.M., Petropoulos, C.J., Bonhoeffer, S.: Exploring the complexity of the HIV-1 fitness landscape. PLoS Genet. 8, e100255151 (2012)
  46. Levinthal, D.A.: Adaptation on rugged landscapes. Manag. Sci. 43, 934–950 (1997)
  47. Limic, V., Pemantle, R.: More rigorous results on the Kauffman-Levin model of evolution. Ann. Prob. 32, 2149–2178 (2004)
  48. Macken, C.A., Perelson, A.S.: Protein evolution on rugged landscapes. Proc. Natl. Acad. Sci. USA 86, 6191–6195 (1989)
  49. Macken, C.A., Hagan, P.S., Perelson, A.S.: Evolutionary walks on rugged landscapes. SIAM J. Appl. Math. 51(3), 799–827 (1991)
  50. Manukyan, N., Eppstein, M.J., Buzas, J.S.: Tunably rugged landscapes with known maximum and minimum. IEEE Trans. Evolut. Comput. 20, 263–274 (2016)
  51. Martinsson, A.: Accessibility percolation and first-passage site percolation on the unoriented binary hypercube. Preprint arXiv:1501.02206 (2015)
  52. Mustonen, V., Lässig, M.: From fitness landscapes to seascapes: non-equilibrium dynamics of selection and adaptation. Trends Genet. 25, 111–119 (2009)
  53. Neidhart, J., Krug, J.: Adaptive walks and extreme value theory. Phys. Rev. Lett. 107, 178102 (2011)
  54. Neidhart, J., Szendro, I.G., Krug, J.: Exact results for amplitude spectra of fitness landscapes. J. Theor. Biol. 332, 218–227 (2013)
  55. Neidhart, J., Szendro, I.G., Krug, J.: Adaptation in tunably rugged fitness landscapes: the rough Mount Fuji Model. Genetics 198, 699–721 (2014)
  56. Nowak, S., Krug, J.: Analysis of adaptive walks on NK fitness landscapes with different interaction schemes. J. Stat. Mech.: Theory Exp. 2015, P06014 (2015)
  57. Nowak, S.: Properties of Random Fitness Landscapes and Their Influence on Evolutionary Dynamics. A Journey through the Hypercube. PhD dissertation, Cologne (2015)
  58. Nowak, S., Krug, J.: Accessibility percolation on \(n\)-trees. Europhys. Lett. 101, 66004 (2013)
  59. Nowak, S., Neidhart, J., Szendro, I.G., Krug, J.: Multidimensional epistasis and the transitory advantage of sex. PLoS Comput. Biol. 10, e1003836 (2014)
  60. Ohta, T.: The meaning of near-neutrality at coding and non-coding regions. Gene 205, 261–267 (1997)
  61. Orr, H.A.: The population genetics of adaptation: the adaptation of DNA sequences. Evolution 56, 1317–1330 (2002)
  62. Orr, H.A.: A minimum on the mean number of steps taken in adaptive walks. J. Theor. Biol. 220, 241–247 (2003)
  63. Orr, H.A.: The population genetics of adaptation on correlated fitness landscapes: the block model. Evolution 60, 1113–1124 (2006)
  64. Østman, B., Hintze, A., Adami, C.: Impact of epistasis and pleiotropy on evolutionary adaptation. Proc. R. Soc. Lond. Ser. B 279, 247–256 (2012)
  65. Park, S.C., Krug, J.: \(\delta \)-exceedance records and random adaptive walks. J. Phys. A 49, 315601 (2016)
  66. Park, S.C., Simon, D., Krug, J.: The speed of evolution in large asexual populations. J. Stat. Phys. 138, 381–410 (2010)
  67. Park, S.C., Szendro, I.G., Neidhart, J., Krug, J.: Phase transition in random adaptive walks on correlated fitness landscapes. Phys. Rev. E 91, 042707 (2015)
  68. Park, S.C., Neidhart, J., Krug, J.: Greedy adaptive walks on a correlated fitness landscape. J. Theor. Biol. 397, 89–102 (2016)
  69. Perelson, A.S., Macken, C.A.: Protein evolution on partially correlated landscapes. Proc. Natl. Acad. Sci. USA 92(21), 9657–9661 (1995)
  70. Phillips, P.C.: Epistasis—the essential role of gene interactions in the structure and evolution of genetic systems. Nat. Rev. Genet. 9, 855–867 (2008)
  71. Poelwijk, F.J., Kiviet, D.J., Weinreich, D.M., Tans, S.J.: Empirical fitness landscapes reveal accessible evolutionary paths. Nature 445, 383–386 (2007)
  72. Poelwijk, F.J., Tănase-Nicola, S., Kiviet, D.J., Tans, S.J.: Reciprocal sign epistasis is a necessary condition for multi-peaked fitness landscapes. J. Theor. Biol. 272, 141–144 (2011)
  73. Poelwijk, F.J., Krishna, V., Ranganathan, R.: The context-dependence of mutations: a linkage of formalisms. PLoS Comput. Biol. 12, e1004771 (2016)
  74. Pokusaeva, V.O., Usmanova, D.R., Putintseva, E.V., Espinar, L., Sarkisyan, K.S., Mishin, A.S., Bogatyreva, N.S., Ivankov, D.N., Povolotskaya, I.S., Filion, G.J., Carey, L.B., Kondrashov, F.A.: Experimental assay of a fitness landscape on a macroevolutionary scale. Preprint bioRxiv 222778 (2017)
  75. Provine, W.B.: Sewall Wright and Evolutionary Biology. University of Chicago Press, Chicago (1986)
  76. Reidys, C.M., Stadler, P.F.: Combinatorial landscapes. SIAM Rev. 44, 3–54 (2002)
  77. Richter, H., Engelbrecht, A. (eds.): Recent Advances in the Theory and Application of Fitness Landscapes. Springer, Berlin (2014)
  78. Rowe, W., Platt, M., Wedge, D.C., Day, P.J., Kell, D.B., Knowles, J.: Analysis of a complete DNA-protein affinity landscape. J. R. Soc. Interface 7, 397–408 (2010)
  79. Sailer, Z.R., Harms, M.J.: High-order epistasis shapes evolutionary trajectories. PLoS Comput. Biol. 13, e1005541 (2017)
  80. Schmiegelt, B.: Sign epistasis networks. Master thesis, Cologne (2016)
  81. Schmiegelt, B., Krug, J.: Evolutionary accessibility of modular fitness landscapes. J. Stat. Phys. 154(1), 334–355 (2014)
  82. Seetharaman, S., Jain, K.: Length of adaptive walk on uncorrelated and correlated fitness landscapes. Phys. Rev. E 90, 032703 (2014)
  83. Stadler, P.F.: Landscapes and their correlation functions. J. Math. Chem. 20, 1–45 (1996)
  84. Stadler, P.F., Happel, R.: Random field models for fitness landscapes. J. Math. Biol. 38, 435–478 (1999)
  85. Stein, D.L. (ed.): Spin Glasses and Biology. World Scientific, Singapore (1992)
  86. Svensson, E.I., Calsbeek, R. (eds.): The Adaptive Landscape in Evolutionary Biology. Oxford University Press, Oxford (2012)
  87. Szendro, I.G., Schenk, M.F., Franke, J., Krug, J., de Visser, J.A.G.M.: Quantitative analyses of empirical fitness landscapes. J. Stat. Mech.: Theory Exp. 2013, P01005 (2013)
  88. Tomassini, M., Vérel, S., Ochoa, G.: Complex-network analysis of combinatorial spaces: the NK landscape case. Phys. Rev. E 78, 066114 (2008)
  89. Touchette, H.: The large deviation approach to statistical mechanics. Phys. Rep. 478(1), 1–69 (2009)
  90. Valente, M.: An NK-like model for complexity. J. Evolut. Econ. 24, 107–134 (2014)
  91. Weinberger, E.D.: Fourier and Taylor series on fitness landscapes. Biol. Cybern. 65, 321–330 (1991)
  92. Weinberger, E.D.: Local properties of Kauffman’s N-k model: a tunably rugged energy landscape. Phys. Rev. A 44, 6399–6413 (1991)
  93. Weinreich, D.M., Watson, R.A., Chao, L.: Sign epistasis and genetic constraint on evolutionary trajectories. Evolution 59, 1165–1174 (2005)
  94. Weinreich, D.M., Delaney, N.F., DePristo, M.A., Hartl, D.L.: Darwinian evolution can follow only very few mutational paths to fitter proteins. Science 312, 111–114 (2006)
  95. Weinreich, D.M., Lan, Y., Wylie, C.S., Heckendorn, R.B.: Should evolutionary geneticists worry about higher-order epistasis? Curr. Op. Genet. Dev. 23, 700–707 (2013)
  96. Welch, J.J., Waxman, D.: The NK model and population genetics. J. Theor. Biol. 234, 329–340 (2005)
  97. Whitlock, M.C., Phillips, P.C., Moore, F.B.G., Tonsor, S.J.: Multiple fitness peaks and epistasis. Annu. Rev. Ecol. Systemat. 26, 601–629 (1995)
  98. Wilke, C.O., Martinetz, T.: Adaptive walks on time-dependent fitness landscapes. Phys. Rev. E 60, 2154–2159 (1999)
  99. Wright, S.: The roles of mutation, inbreeding, crossbreeding and selection in evolution. In: Proceedings of the 6th International Congress of Genetics, vol. 1, pp. 356–366 (1932)
  100. Wright, A.H., Thompson, R.K., Zhang, J.: The computational complexity of N-K fitness functions. IEEE Trans. Evolut. Comput. 4, 373–379 (2000)
  101. Wu, N.C., Dai, L., Olson, C.A., Lloyd-Smith, J.O., Sun, R.: Adaptation in protein fitness landscapes is facilitated by indirect paths. eLife 5, 16965 (2016)
  102. Zagorski, M., Burda, Z., Waclaw, B.: Beyond the hypercube: evolutionary accessibility of fitness landscapes with realistic mutational networks. PLoS Comput. Biol. 12(12), e1005218 (2016)

Acknowledgements

We thank David Dean for useful discussions, and an anonymous reviewer for constructive remarks on the manuscript. JK acknowledges the kind hospitality of the MPI for Physics of Complex Systems (Dresden) and the Kavli Institute for Theoretical Physics (Santa Barbara) during the completion of the paper. This research was supported by DFG within SFB 680 Molecular basis of evolutionary innovations and SPP1590 Probabilistic structures in evolution, and in part by the National Science Foundation Grant No. NSF PHY-1125915, NIH Grant No. R25GM067110, and the Gordon and Betty Moore Foundation Grant No. 2919.01.

Author information

Corresponding author

Correspondence to Joachim Krug.

Appendices

A Asymptotics of \( \pi _\mathrm {max}^{\mathrm {MF}} \) in the Joint Limit \(k, L \rightarrow \infty \)

We start from Eq. (34). Rescaling \(y \rightarrow \frac{\eta y}{\sqrt{2}}\), we rewrite the equation in terms of the CDF of a standard Gaussian distribution \( \varPhi (y) \) as

$$\begin{aligned} \pi _\mathrm {max}^{\mathrm {MF}}&= \sqrt{\frac{L \eta ^2}{4\pi } } \int dy e^{-L \eta ^2 y^2/4} \left[ \frac{1}{2} \left( \text {erf}\left( \frac{y}{\sqrt{2}}\right) +1\right) \right] ^L \nonumber \\&= \sqrt{\frac{\mu }{2\pi } } \int dy e^{-\mu y^2/2} \varPhi (y)^L, \end{aligned}$$
(99)

where \(\mu \equiv \frac{L \eta ^2}{2}\), which converges to \(\frac{(2-\alpha )}{\alpha } \) in the joint limit, as can be seen from Eq. (32).

Interestingly, the only explicit L-dependence in the above equation appears as the L-th power of the CDF \(\varPhi (y)\), which increases monotonically to unity as \( y \rightarrow \infty \). This implies that the conventional saddle point method cannot be applied here, because this factor has no maximum. Instead, we can rely on extreme value theory by interpreting the term \(\varPhi (y)^L\) as the probability that L independently sampled standard Gaussian random variables are all less than y. This leads immediately to the limit relation [16]

$$\begin{aligned} \varPhi \left( \frac{x}{a_L} + b_L\right) ^L \rightarrow G(x)( 1 + o(1)), \end{aligned}$$
(100)

where G(x) is the Gumbel CDF defined by \(G(x) = e^{- e^{-x}}\), and the two scaling factors are given by \(a_L = \sqrt{2 \ln L}\) and

$$\begin{aligned} b_L = \sqrt{2 \ln L} - \frac{\ln \ln L + \ln 4\pi }{2 \sqrt{2 \ln L}}. \end{aligned}$$
(101)

After making the change of variable \(y = \frac{x}{a_L} + b_L\), the integral is now of the form

$$\begin{aligned} \pi _\mathrm {max}^{\mathrm {MF}}&= \frac{1}{a_L} \sqrt{\frac{\mu }{2\pi } } \int dx e^{-\mu \left( \frac{x}{a_L} + b_L \right) ^2/2} \varPhi \left( \frac{x}{a_L} + b_L\right) ^L \nonumber \\&= \frac{1}{a_L} \sqrt{\frac{\mu }{2\pi } } \int dx e^{- \mu \left( \frac{x}{a_L} + b_L \right) ^2/2} G(x) \left( 1 + o(1)\right) . \end{aligned}$$
(102)

The evaluation of the integral with respect to x is greatly simplified once one notices that the term \(\frac{x^2}{a_L^2}\) in the exponent is sub-leading in L. Ignoring this term gives

$$\begin{aligned} \pi _\mathrm {max}^{\mathrm {MF}}&=\frac{1}{a_L} \sqrt{\frac{\mu }{2\pi } } \int dx e^{- \mu \left( b_L^2 + 2 \frac{b_L x}{a_L} \right) /2} G(x) \left( 1 + o(1)\right) \nonumber \\&= \frac{1}{a_L} \sqrt{\frac{\mu }{2\pi } } e^{-\mu b_L^2/2} \varGamma ( \mu )\left( 1 + o(1)\right) , \end{aligned}$$
(103)

where we have used the identity

$$\begin{aligned} \int _{-\infty }^{\infty } G(x) \exp (-M x) \, dx = \varGamma (M) \end{aligned}$$
(104)

for positive M. Next, expanding \(a_L\) and \(b_L\) and rearranging the terms gives

$$\begin{aligned} \pi _\mathrm {max}^{\mathrm {MF}}&= \frac{1}{a_L} \sqrt{\frac{ \mu }{2\pi } } e^{- \frac{\mu }{2} \left[ 2 \ln L - \left( \ln \ln L + \ln 4 \pi + o(1)\right) \right] } \varGamma (\mu )\left( 1 + o(1)\right) \nonumber \\&= \sqrt{\mu } {\frac{ \left( 4\pi \ln L\right) ^{\mu /2} }{\left( 4\pi \ln L\right) ^{1/2}}} \varGamma ( \mu ) L^{-\mu } \left( 1 + o(1)\right) . \end{aligned}$$
(105)
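For completeness, the two ingredients used between Eqs. (103) and (105) can be spelled out. The lines below are a sketch of the intermediate algebra and are not taken from the original derivation: the identity (104) follows from the substitution \(t = e^{-x}\), and the exponent in Eq. (105) follows from squaring Eq. (101), where the last term vanishes for large L and is absorbed into the \(o(1)\) correction.

$$\begin{aligned} \int _{-\infty }^{\infty } e^{-e^{-x}} e^{-Mx} \, dx&= \int _{0}^{\infty } e^{-t} t^{M-1} \, dt = \varGamma (M), \\ b_L^2&= 2 \ln L - \left( \ln \ln L + \ln 4\pi \right) + \frac{\left( \ln \ln L + \ln 4\pi \right) ^2}{8 \ln L}. \end{aligned}$$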

As expected from the formal analysis in Sect. 3.2.2, the leading order behavior is given by a power law with exponent \(\mu = (2-\alpha )/\alpha \). By contrast, the existence of a non-trivial logarithmic correction is unexpected, in particular since such a correction does not appear in the exact result \( \pi _\mathrm {max}^{\mathrm {HoC}} = (L+1)^{-1} \) for the HoC model (\(\alpha = \mu = 1\)). Remarkably, the logarithmic factors precisely cancel in this particular case.
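The cancellation for \(\mu = 1\) can be checked numerically. The following is a minimal sketch, not part of the original analysis: it evaluates the integral in Eq. (99) with a plain trapezoidal rule and compares it with the leading-order expression of Eq. (105). The function names, the integration window and the number of grid points are illustrative assumptions; for small \(\mu \) the window would have to be widened.

```python
import math

def std_normal_cdf(y):
    """CDF of the standard Gaussian, written via the error function as in Eq. (99)."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

def pi_max_mf_integral(mu, L, y_min=-12.0, y_max=12.0, n=48000):
    """Trapezoidal estimate of sqrt(mu/2pi) * int dy exp(-mu y^2/2) Phi(y)^L.

    The window [y_min, y_max] and the grid size n are ad hoc choices that are
    adequate for mu of order one; they are assumptions of this sketch.
    """
    h = (y_max - y_min) / n
    total = 0.0
    for i in range(n + 1):
        y = y_min + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(-0.5 * mu * y * y) * std_normal_cdf(y) ** L
    return math.sqrt(mu / (2.0 * math.pi)) * h * total

def pi_max_mf_asymptotic(mu, L):
    """Leading-order expression of Eq. (105), dropping the (1 + o(1)) factor."""
    log_term = 4.0 * math.pi * math.log(L)
    return math.sqrt(mu) * log_term ** (0.5 * (mu - 1.0)) * math.gamma(mu) * L ** (-mu)

if __name__ == "__main__":
    for L in (10, 100, 1000):
        # columns: L, numerical integral, exact HoC value, leading-order asymptotics
        print(L, pi_max_mf_integral(1.0, L), 1.0 / (L + 1), pi_max_mf_asymptotic(1.0, L))
```

For \(\mu = 1\) the integrand is the derivative of \(\varPhi (y)^{L+1}/(L+1)\), so the numerical integral should reproduce \((L+1)^{-1}\), while the asymptotic expression reduces to \(L^{-1}\).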

B Variational Analysis at the Maximum of \(\lambda _k^\mathrm {AN}\)

In Fig. 4, we observed that \(\lambda _2^\mathrm {AN}\) for the negative gamma distribution with shape parameter s is maximized at \(s=1/2\). Furthermore, we claimed that this can be naturally generalized to arbitrary values of k if we replace the shape parameter by \(1/k\). One might further ask whether \(\lambda _k^\mathrm {AN}\) is also an extremum with respect to arbitrary variations in the space of base fitness distributions \(p_f\). Here, we prove that this is indeed the case for distributions with support limited to the negative real axis.

Let us first evaluate the k-fold convolution of the gamma distribution needed to compute Eq. (42). This is easily achieved using the property that the gamma distribution is closed under the convolution operation, i.e., the k-fold convolution of the gamma distribution with shape parameter s is the gamma distribution with shape parameter sk. If we choose as our base distribution the negative gamma distribution with shape parameter \(s=1/k\),

$$\begin{aligned} p_f(x) = p_{1/k}(x) \equiv g_{1/k}(-x), \end{aligned}$$
(106)

the k-fold convolution yields the gamma distribution with unit shape parameter, i.e., a (negative) exponential distribution, characterized by the CDF \( \tilde{F}_{1/k}^{(k)}(z) = e^z \) for \(z<0\). Since \( \tilde{F}_{1/k}^{(k)}(y_1 + y_2 + \cdots ) = e^{y_1} e^{y_2} \cdots \), Eq. (42) is fully factorized as

$$\begin{aligned} \pi _\mathrm {max}^{\mathrm {AN}} = \left( \int dy \, g_{1/k}(-y) e^{ky} \right) ^L = (k+1)^{-L/k}, \end{aligned}$$
(107)

which is exactly the result for the block model obtained in Eq. (26).
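The single-site integral in Eq. (107) can be made explicit. The step below is a sketch that assumes the unit-scale gamma density \(g_s(x) = x^{s-1} e^{-x}/\varGamma (s)\) for \(x > 0\); this definition is not restated in this appendix and is an assumption here. The factor \(e^{ky}\) arises because each variable enters exactly k of the L factorized CDF terms.

$$\begin{aligned} \int _{-\infty }^{0} dy \, g_{1/k}(-y) e^{ky} = \frac{1}{\varGamma (1/k)} \int _{0}^{\infty } dx \, x^{\frac{1}{k}-1} e^{-(k+1)x} = (k+1)^{-1/k}. \end{aligned}$$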

Next, let us derive a useful general formula for \( \tilde{F}^{(k)}(z) \). Using the convolution theorem, it satisfies

$$\begin{aligned} \tilde{F}^{(k)}(z)&= \int _{-\infty }^{z} dz' \int _{z'}^{\infty } dy\, p_f(y) \, p^{(k-1)}_f(z'-y) \end{aligned}$$
(108)

where \( p^{(k-1)}_f(z) \) is the PDF of the \(k-1\) fold convolution of \( p_f(z) \). It will later be convenient to exchange the order of integrals:

$$\begin{aligned} \tilde{F}^{(k)}(z)&= \int _{-\infty }^{z} dy\, p_f(y) \int _{-\infty }^{y} dz' \, p^{(k-1)}_f(z'-y) + \int _{z}^{\infty } dy\, p_f(y) \int _{-\infty }^{z} dz' \, p^{(k-1)}_f(z'-y) \nonumber \\&= \int _{-\infty }^{z} dy\, p_f(y) + \int _{z}^{\infty } dy\, p_f(y) \tilde{F}^{(k-1)}(z-y). \end{aligned}$$
(109)

In the first equality, we split the integral into two pieces to accommodate the condition \(p^{(k-1)}_f(z) =0\) for positive z. In the next equality, we have used the fact that \(\tilde{F}^{(k-1)}(0) =1\).
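As a quick consistency check of Eq. (109), consider an illustrative choice that is not used in the text: the negative exponential base distribution \(p_f(y) = e^{y}\) for \(y<0\) with \(k=2\). Then \(\tilde{F}^{(1)}(z) = e^{z}\) for \(z<0\), and Eq. (109) gives

$$\begin{aligned} \tilde{F}^{(2)}(z) = \int _{-\infty }^{z} dy\, e^{y} + \int _{z}^{0} dy\, e^{y} e^{z-y} = (1 - z)\, e^{z}, \qquad z < 0, \end{aligned}$$

which is the CDF of the negative of a gamma variable with shape parameter 2, in agreement with the closure of the gamma family under convolution.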

Now, we want to show that \(\pi _\mathrm {max}^{\mathrm {AN}}\) is maximized when the base fitness distribution is given by Eq. (106). To this end, let us introduce a small perturbation \(p_f(y) = p_{1/k}(y) + \epsilon \eta (y)\), with the properties that \(\int dy \, \eta (y) = 0\) and \(\eta (y) = 0\) for \(y > 0\). Since the probability Eq. (42) is given by the product of 2L terms, there are 2L contributions at linear order in \(\epsilon \), i.e. \(\pi _\mathrm {max}^{\mathrm {AN}}\) changes by

$$\begin{aligned} \delta \pi _\mathrm {max}^{\mathrm {AN}}&= \epsilon L \int dy\, \eta (y) \int \left( \prod _{r=2}^{L} dy_r p_{1/k}(y_r) \right) \prod _{l=0}^{L-1} \tilde{F}_{1/k}^{(k)}\left( \sum _{m=1 }^{k} y_{(l + m) \, \text {mod} \,L} \right) \nonumber \\&\quad \ + L \int \left( \prod _{r=1}^{L} dy_r p_{1/k}(y_r) \right) \delta \tilde{F}^{(k)}\left( \sum _{m=1 }^{k}y_{m} \right) \prod _{l=1}^{L-1} \tilde{F}_{1/k}^{(k)}\left( \sum _{m=1 }^{k} y_{(l + m) \, \text {mod} \,L} \right) \nonumber \\&\equiv L (J_1 + J_2). \end{aligned}$$
(110)

The first term is straightforward to evaluate. Since \( \tilde{F}^{(k)}_{1/k}\left( \sum _{m=1}^{k} y_{(l + m) \, \text {mod} \,L} \right) \) is factorized, it readily follows that

$$\begin{aligned} J_1&= \epsilon \int dy\, \eta (y) \int \left( \prod _{r=2}^{L} dy_r p_{1/k}(y_r) \right) \prod _{l=0}^{L-1} \tilde{F}^{(k)}_{1/k}\left( \sum _{m=1 }^{k} y_{(l + m) \, \text {mod} \,L} \right) \nonumber \\&= \epsilon \int dy\, \eta (y) e^{ky} (k+1)^{-(L-1)/k}. \end{aligned}$$
(111)

To evaluate \(J_2\), let us rewrite it in the following way:

$$\begin{aligned} J_2&= \int \left( \prod _{r=1}^{L} dy_r p_{1/k}(y_r) \right) \delta \tilde{F}^{(k)}\left( \sum _{m=1 }^{k}y_{m} \right) \prod _{l=1}^{L-1} \tilde{F}^{(k)}_{1/k}\left( \sum _{m=1 }^{k} y_{(l + m) \, \text {mod} \,L} \right) \nonumber \\&=(k+1)^{-(L-k)/k} \int \left( \prod _{r=1}^{k} dy_r p_{1/k}(y_r) e^{(k-1) y_r}\right) \delta \tilde{F}^{(k)}\left( \sum _{m=1 }^{k}y_{m} \right) . \end{aligned}$$
(112)

The argument of \(\delta \tilde{F}^{(k)}\) is the sum of the variables \(y_r\) that remain to be integrated over. To make them independent, let us introduce a delta function through the identity

$$\begin{aligned} 1 = \int dY \delta \left( \sum _{m=1 }^{k}y_{m} -Y\right) \varTheta (-Y) \end{aligned}$$
(113)

or, in the Fourier representation,

$$\begin{aligned} 1 = \int \frac{dY dZ}{2\pi } e^{ -i Z (\sum _{m=1 }^{k}y_{m} - Y) } \varTheta (-Y), \end{aligned}$$
(114)

where we impose the negativity of Y by inserting an additional theta function. Using the property \(\int dx \delta (x-a) f(x) = \int dx \delta (x-a) f(a)\), we may now complete the integrations over the \(y_r\) as

$$\begin{aligned}&\int \frac{dY dZ}{2\pi } \varTheta (-Y) \int \left( \prod _{r=1}^{k} dy_r p_{1/k}(y_r) e^{(k-1) y_r}\right) e^{ iZ (Y - \sum _{m=1}^{k} y_m )} \delta \tilde{F}^{(k)}(Y) \nonumber \\&\quad = \int \frac{dY dZ}{2\pi } \varTheta (-Y) (k-i Z)^{-1} e^{ iZ Y} \delta \tilde{F}^{(k)}(Y) = \int dY \varTheta (-Y) e^{ k Y} \delta \tilde{F}^{(k)}(Y), \end{aligned}$$
(115)

where we used Jordan’s lemma to evaluate the integral with respect to Z. With this result, \(J_2\) is of the relatively simple form

$$\begin{aligned} J_2 = (k+1)^{-(L-k)/k} \int dY \varTheta (-Y) e^{ k Y} \delta \tilde{F}^{(k)}\left( Y \right) . \end{aligned}$$
(116)
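The two integrations behind Eqs. (115) and (116) can be sketched as follows, assuming the unit-scale gamma density \(g_s(x) = x^{s-1}e^{-x}/\varGamma (s)\) and \(p_{1/k}(y) = g_{1/k}(-y)\) as in Eq. (106). Each of the k integrals over \(y_r\) is a Laplace transform of the gamma density, and the remaining Z-integral has a single pole at \(Z = -ik\), which is enclosed when the contour is closed in the lower half-plane for \(Y<0\):

$$\begin{aligned} \int _{-\infty }^{0} dy \, p_{1/k}(y) e^{(k-1) y} e^{-iZy}&= \int _{0}^{\infty } dx \, g_{1/k}(x) e^{-(k-1-iZ) x} = (k - iZ)^{-1/k}, \\ \int _{-\infty }^{\infty } \frac{dZ}{2\pi } \, \frac{e^{iZY}}{k - iZ}&= e^{kY} \qquad (Y < 0). \end{aligned}$$

The product of the k identical factors in the first line gives \((k-iZ)^{-1}\), which is the expression that appears in Eq. (115).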

Next, let us evaluate \(\delta \tilde{F}^{(k)}(z)\). Using Eq. (109), we find that

$$\begin{aligned} \delta \tilde{F}^{(k)}(z)&= \epsilon k \left[ \int _{-\infty }^{z} dy\, \eta (y) + \int _{z}^{\infty } dy\, \eta (y) \tilde{F}^{(k-1)}_{1/k}(z-y) \right] \nonumber \\&= \epsilon k \left[ \int _{-\infty }^{\infty } dy\, \eta (y) + \int _{z}^{\infty } dy\, \eta (y) \left( \tilde{F}^{(k-1)}_{1/k}(z-y) -1\right) \right] \nonumber \\&= \epsilon k \int _{z}^{\infty } dy\, \eta (y) \left( \tilde{F}^{(k-1)}_{1/k}(z-y) -1\right) \nonumber \\&= \epsilon k \int _{-\infty }^{\infty } dy\, \eta (y) \left( \tilde{F}^{(k-1)}_{1/k}(z-y) -1\right) \varTheta (y-z), \end{aligned}$$
(117)

where the factor k comes from the k different choices of \(p_f(y)\) in the variation of \(\tilde{F}^{(k)}\), and the fact that \(\int dy \, \eta (y) =0\) is used to eliminate the first term in the second line. As expected, this implies that any perturbation made in the range \((-\infty ,z)\) does not change the behavior of \( \tilde{F}^{(k)}(z) \). Inserting this result into \(J_2\) gives

$$\begin{aligned} J_2&= (k+1)^{-(L-k)/k} \int dY \varTheta (-Y) e^{ k Y} \int dy\, \eta (y) \nonumber \\&\quad \ \times \epsilon k \left( \tilde{F}_{1/k}^{(k-1)}(Y-y) -1\right) \varTheta (y-Y). \end{aligned}$$
(118)

Now, the only technical point left is the integration with respect to Y. The integration domain is determined by the two theta functions \(\varTheta (-Y)\) and \(\varTheta (y- Y)\). Since \(\eta (y)\) is assumed to be supported only on the negative real axis, \(\varTheta (y-Y)\) already forces \(Y< y < 0\), so the condition imposed by \(\varTheta (-Y)\) is redundant. Finally, using the identity

$$\begin{aligned} \int _{-\infty }^{0}dY\, k e^{k Y} \left( 1-\frac{\varGamma \left( \frac{k-1}{k},-Y\right) }{\varGamma \left( \frac{k-1}{k}\right) }\right) = (k+1)^{\frac{1}{k}-1}, \end{aligned}$$
(119)

we find

$$\begin{aligned} J_2 = - \epsilon \int dy\, \eta (y) e^{ky}(k+1)^{-(L-1)/k}. \end{aligned}$$
(120)

Thus, the two terms in Eq. (110) perfectly cancel, which completes the proof that \( \delta \pi _\mathrm {max}^{\mathrm {AN}} = 0 \).
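The identity in Eq. (119) has a simple probabilistic reading, which is not spelled out above and is offered here only as a consistency check. Assuming \(\varGamma (a,x)\) denotes the upper incomplete gamma function and writing \(t=-Y\), the bracket is the CDF of a Gamma-distributed variable G with shape \((k-1)/k\) and unit scale, while \(k e^{-kt}\) is the density of an exponential variable T with rate k. The left-hand side is then \(P(G\le T) = \mathbb {E}[e^{-kG}] = (k+1)^{-(k-1)/k}\). The Python sketch below checks this by sampling; the function names, sample size and seed are illustrative choices.

```python
import random

def identity_lhs_mc(k, n_samples=100_000, seed=1):
    """Monte Carlo estimate of the left-hand side of Eq. (119), for k >= 2."""
    rng = random.Random(seed)
    shape = (k - 1) / k
    hits = 0
    for _ in range(n_samples):
        g = rng.gammavariate(shape, 1.0)  # G ~ Gamma(shape, scale 1)
        t = rng.expovariate(k)            # T ~ Exponential(rate k)
        hits += g <= t                    # event {G <= T}
    return hits / n_samples

def identity_rhs(k):
    """Right-hand side of Eq. (119)."""
    return (k + 1) ** (1.0 / k - 1.0)

for k in (2, 3, 5):
    print(k, identity_lhs_mc(k), identity_rhs(k))
```

The two printed columns should agree up to the statistical error of the sampling, which is of order \(10^{-3}\) for the sample size chosen here.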

C General Bounds on \(\beta \) for Uniform and Regular Structures with Gaussian Fitness

In this appendix we derive some general upper and lower bounds on the coefficient \(\beta \), defined in Eq. (86), for NK structures that are both uniform and regular. For this purpose we write the probability of \(\sigma \) being a local optimum as

$$\begin{aligned} \pi _\text {max} = {\mathbb {E}\left[ {\prod _{l=1}^{L} \varTheta \left( -\varDelta _lF(\sigma )\right) }\right] } = {\mathbb {E}\left[ {\prod _{l=1}^L \varTheta \left( -\sum _{r=1}^{|{\mathcal {B}}|} \left( f_r\left( {\downarrow _{B_r}}\varDelta _l \sigma \right) -f_r\left( {\downarrow _{B_r}}\sigma \right) \right) \right) }\right] }.\nonumber \\ \end{aligned}$$
(121)

All fitness values of the partial landscapes \(f_r\) are i.i.d. random variables. If \(l\in B_r\), then \(f_r\left( {\downarrow _{B_r}}\varDelta _l\sigma \right) \) and \(f_r\left( {\downarrow _{B_r}}\sigma \right) \) are independent. Otherwise they are identical and cancel. Thus effectively only the sum over r with \(l\in B_r\) remains. Due to regularity there are \(\tilde{k} = \frac{Nk}{L}\) such elements for each l. For different r, the terms are always independent. The left-hand terms \(f_r\left( {\downarrow _{B_r}}\varDelta _l\sigma \right) \) are also independent for different l. However, the right-hand terms \(f_r\left( {\downarrow _{B_r}}\sigma \right) \) are shared between all l with the same r and are therefore correlated, resulting in a non-trivial problem. Using these observations we can directly integrate out all terms \(f_r\left( {\downarrow _{B_r}}\varDelta _l\sigma \right) \) and arrive at

$$\begin{aligned} \pi _\text {max} = {\mathbb {E}\left[ {\prod _{l=1}^{L} \varPhi _{\tilde{k}}\left( \sum _{r\;|\;l\in B_r} f_r\left( {\downarrow _{B_r}}\sigma \right) \right) }\right] }, \end{aligned}$$
(122)

where \(\varPhi _{\tilde{k}}\) is the cumulative distribution function of the sum of \(\tilde{k}\) i.i.d. fitness values. Introducing the short-hand notation \(x_r = f_r\left( {\downarrow _{B_r}}\sigma \right) \), we can write the sum as a matrix product

$$\begin{aligned} \pi _\text {max} = {\mathbb {E}\left[ {\prod _{l=1}^{L} \varPhi _{\tilde{k}}\left( ({\mathbf {B}}x)_l\right) }\right] } \end{aligned}$$
(123)

where \({\mathbf {B}}\) is the incidence matrix of the NK structure, i.e. \({\mathbf {B}}_{lr} = b_{l,r} = 1\) if \(l\in B_r\) and 0 otherwise.
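The independence structure described above can be turned into a direct sampling estimate of \(\pi _\text {max}\). The following Python sketch is a minimal illustration under stated assumptions: it uses a standard normal base distribution, and the helper functions for adjacent neighborhoods (with periodic boundaries) and disjoint blocks are hypothetical constructions written for this example. For every neighborhood r it draws one value \(x_r\) for the focal genotype, and for every pair (l, r) with \(l\in B_r\) a fresh value for the mutated partial genotype.

```python
import random

def adjacent_neighborhoods(L, k):
    """Hypothetical helper: B_r = {r, r+1, ..., r+k-1} mod L for r = 0..L-1."""
    return [[(r + i) % L for i in range(k)] for r in range(L)]

def block_neighborhoods(L, k):
    """Hypothetical helper: L/k disjoint blocks of k consecutive loci."""
    assert L % k == 0
    return [list(range(b * k, (b + 1) * k)) for b in range(L // k)]

def pi_max_direct(neighborhoods, L, n_samples=40_000, seed=2):
    """Fraction of samples in which the focal genotype is a local maximum."""
    rng = random.Random(seed)
    # blocks_of[l] lists the indices r of the neighborhoods containing locus l
    blocks_of = [[r for r, B in enumerate(neighborhoods) if l in B]
                 for l in range(L)]
    count = 0
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0) for _ in neighborhoods]  # x_r = f_r(sigma)
        is_max = True
        for l in range(L):
            # Flipping locus l gives a fresh i.i.d. value in every block
            # containing l; all other contributions cancel.
            delta = sum(rng.gauss(0.0, 1.0) - x[r] for r in blocks_of[l])
            if delta >= 0.0:
                is_max = False
                break
        count += is_max
    return count / n_samples

print(pi_max_direct(adjacent_neighborhoods(8, 2), 8))
print(pi_max_direct(block_neighborhoods(4, 2), 4), 1.0 / 9.0)
```

For disjoint blocks the estimate can be compared with \((k+1)^{-L/k}\), since each block is then an independent set of \(k+1\) i.i.d. values of which the focal one must be the largest.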

If the base fitness distribution is a standard normal distribution, then the sum of \(\tilde{k}\) i.i.d. fitness values is also normally distributed with variance \(\tilde{k}\). Consequently we can simplify as

$$\begin{aligned} \pi _\text {max} = {\mathbb {E}\left[ {\prod _{l=1}^{L} \varPhi \left( \frac{1}{\sqrt{\tilde{k}}}({\mathbf {B}}x)_l\right) }\right] }. \end{aligned}$$
(124)

The random vector \(y = \frac{1}{\sqrt{\tilde{k}}}{\mathbf {B}}x\) is then jointly normally distributed with zero mean and covariance matrix \(\mathbf {C} = \frac{1}{\tilde{k}}{\mathbf {B}}{\mathbf {B}}^T\). This matrix is positive-semidefinite; provided it is also invertible, we can write

$$\begin{aligned} \pi _\text {max} = \int _{\mathbb {R}^L} \frac{\mathrm {d}y}{\sqrt{(2\pi )^L\det \mathbf {C}}}\exp \left( -\frac{1}{2}y^T\mathbf {C}^{-1}y + \sum _{l=1}^L \ln \varPhi (y_l)\right) . \end{aligned}$$
(125)

We can shift the integration variable, \(y \rightarrow y + z\), by a yet-to-be-specified vector z, which yields

$$\begin{aligned} \pi _\text {max}&= \int _{\mathbb {R}^L} \frac{\mathrm {d}y}{\sqrt{(2\pi )^L\det \mathbf {C}}} \nonumber \\&\quad \ \times \exp \left( -\frac{1}{2}y^T\mathbf {C}^{-1}y -\frac{1}{2}z^T\mathbf {C}^{-1}z -z^T\mathbf {C}^{-1}y + \sum _{l=1}^L\ln \varPhi (y_l+z_l)\right) . \end{aligned}$$
(126)

Absorbing the first term in the exponent into the Gaussian probability measure, we again obtain an expectation,

$$\begin{aligned} \pi _\text {max} = e^{-\frac{1}{2}z^T\mathbf {C}^{-1}z}{\mathbb {E}\left[ {\exp \left( -z^T\mathbf {C}^{-1}y +\sum _{l=1}^L \ln \varPhi (y_l+z_l)\right) }\right] } \end{aligned}$$
(127)

where y is still jointly normally distributed with covariance matrix \(\mathbf {C}\).

Notice that the all-ones vector \(\bar{1}\) is an eigenvector of \(\mathbf {C}\) with the eigenvalue k. This can be seen through the relations \({\mathbf {B}}\bar{1} = \tilde{k}\bar{1}\) and \({\mathbf {B}}^T\bar{1} = k\bar{1}\), as there are exactly \(\tilde{k}\) ones in each row of \({\mathbf {B}}\) and k ones in each column. We therefore choose all components equal, \(z_l = \bar{z}\) for all l, so that \(\mathbf {C}^{-1}z = \frac{\bar{z}}{k}\bar{1}\). Then

$$\begin{aligned} \pi _\text {max} = e^{-L\frac{\bar{z}^2}{2k}}{\mathbb {E}\left[ {\exp \left( \sum _{l=1}^L\left( \ln \varPhi (y_l+\bar{z}) - \frac{\bar{z}}{k} y_l\right) \right) }\right] }. \end{aligned}$$
(128)
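The matrix properties used here are easy to check numerically for a concrete structure. The Python sketch below is an illustration only: it assumes adjacent neighborhoods with periodic boundaries as the example structure, and the function names are hypothetical. It builds the incidence matrix \({\mathbf {B}}\), forms \(\mathbf {C} = \frac{1}{\tilde{k}}{\mathbf {B}}{\mathbf {B}}^T\), and evaluates the row sums, column sums and \(\mathbf {C}\bar{1}\).

```python
def incidence_matrix(neighborhoods, L):
    """B[l][r] = 1 if locus l belongs to neighborhood r, else 0."""
    return [[1 if l in B_r else 0 for B_r in neighborhoods] for l in range(L)]

def covariance(B, k_tilde):
    """C = B B^T / k_tilde, returned as a list of rows."""
    L = len(B)
    return [[sum(a * b for a, b in zip(B[j], B[l])) / k_tilde
             for l in range(L)] for j in range(L)]

L, k = 6, 3
# Assumed example structure: adjacent neighborhoods with periodic boundaries.
neighborhoods = [[(r + i) % L for i in range(k)] for r in range(L)]
B = incidence_matrix(neighborhoods, L)
k_tilde = len(neighborhoods) * k // L  # neighborhoods per locus

row_sums = [sum(row) for row in B]                         # expected: k_tilde
col_sums = [sum(B[l][r] for l in range(L))
            for r in range(len(neighborhoods))]            # expected: k
C = covariance(B, k_tilde)
C_times_ones = [sum(row) for row in C]                     # expected: k

print(row_sums, col_sums, C_times_ones)
```

For this example each entry of \(\mathbf {C}\bar{1}\) equals k up to rounding, and the diagonal of \(\mathbf {C}\) is 1, in line with the unit variance of \(y_l\).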

C.1 Lower Bound

By Jensen’s inequality we have

$$\begin{aligned} \pi _\text {max} \ge e^{-L\frac{\bar{z}^2}{2k}}\prod _{l=1}^L\exp \left( {\mathbb {E}\left[ {\ln \varPhi (y_l+\bar{z}) - \frac{\bar{z}}{k} y_l}\right] }\right) . \end{aligned}$$
(129)

Because \(y_l\) has a symmetric distribution, the mean of \(\bar{z} y_l\) vanishes. The variance of \(y_l\) is always 1, because by regularity and uniformity the diagonal elements of \({\mathbf {B}}{\mathbf {B}}^T\) are \(\tilde{k}\), which the prefactor \(1/\tilde{k}\) in \(\mathbf {C}\) normalizes to 1. If we then assume that \(\bar{z}\) increases in our limit of interest, and note that the Gaussian tail falls to zero much faster than \(\ln \varPhi (x)\) falls to \(-\infty \) as \(x\rightarrow -\infty \), we can use \(\ln \varPhi \approx \varPhi -1\) for large arguments to establish the bound

$$\begin{aligned} \pi _\text {max} \ge e^{-L\frac{\bar{z}^2}{2k}}\prod _{l=1}^L\exp \left( {\mathbb {E}\left[ {\varPhi (y_l+\bar{z})-1}\right] }(1+o(1))\right) \end{aligned}$$
(130)

which can be evaluated to

$$\begin{aligned} \pi _\text {max} \ge \exp \left( -L\frac{\bar{z}^2}{2k} + L\left( \varPhi \left( \frac{\bar{z}}{\sqrt{2}}\right) -1\right) (1+o(1))\right) . \end{aligned}$$
(131)
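The step from Eq. (130) to Eq. (131) uses \(\mathbb {E}[\varPhi (y_l+\bar{z})] = \varPhi (\bar{z}/\sqrt{2})\) for a standard normal \(y_l\). This follows from writing \(\varPhi (y_l+\bar{z}) = P(\xi \le y_l + \bar{z})\) with an independent standard normal \(\xi \), so that \(\xi - y_l\) is normal with variance 2. A short numerical check in Python, using a plain trapezoidal rule (grid parameters are arbitrary choices for this sketch):

```python
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def expected_Phi(z, half_width=12.0, n=24_000):
    """Trapezoidal estimate of E[Phi(y + z)] for y ~ N(0, 1)."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n + 1):
        y = -half_width + i * h
        w = 0.5 if i in (0, n) else 1.0  # end points get half weight
        total += w * phi(y) * Phi(y + z)
    return total * h

for z in (0.0, 1.0, 2.5):
    print(z, expected_Phi(z), Phi(z / math.sqrt(2.0)))
```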

If we choose \(\bar{z} = 2\sqrt{\ln k}\), then asymptotically for large k

$$\begin{aligned} \pi _\text {max} \ge \exp \left( -L\left( \frac{2\ln k}{k} + \mathcal {O}\left( \frac{1}{k \sqrt{\ln k}}\right) \right) \right) . \end{aligned}$$
(132)

Note that choosing \(\bar{z} = \tilde{z} \sqrt{\ln k}\) with \(\tilde{z} < 2\) will not give a better bound, as the right-hand term in the exponent in Eq. (131) would then dominate and approach zero more slowly than \(\frac{\ln k}{k}\). This shows that \(\beta \le 2\) for uniform and regular structures. The MF model, which is uniform and regular, realizes \(\beta = 2\), so the bound is tight.
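The statement about \(\tilde{z} < 2\) is asymptotic, and the competition between the two terms in Eq. (131) can be inspected numerically. The Python sketch below evaluates the per-locus exponent of Eq. (131) with the \(o(1)\) correction dropped, which is an approximation made for this illustration; the function name is hypothetical. A smaller exponent corresponds to a better lower bound.

```python
import math

def lower_bound_exponent(k, z_tilde):
    """Per-locus exponent of Eq. (131), o(1) dropped, z_bar = z_tilde*sqrt(ln k)."""
    z_bar = z_tilde * math.sqrt(math.log(k))
    gaussian_cost = z_bar ** 2 / (2.0 * k)
    tail = 0.5 * math.erfc(z_bar / 2.0)  # equals 1 - Phi(z_bar / sqrt(2))
    return gaussian_cost + tail

for k in (1e6, 1e100):
    for z_tilde in (1.5, 1.9, 2.0):
        scaled = lower_bound_exponent(k, z_tilde) * k / math.log(k)
        print(k, z_tilde, scaled)
```

In this truncated form, the tail term decays like \(k^{-\tilde{z}^2/4}\) and eventually dominates for any fixed \(\tilde{z}<2\), but for \(\tilde{z}\) close to 2 this only becomes visible at extremely large k, so the leading-order statement should not be read as a statement about moderate k.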

C.2 Upper Bound

Starting from Eq. (128) we can find an upper bound by simply maximizing each term in the sum over \(y_l\). The resulting sum is then an upper bound on the exponent of the integrand, and because the expectation is taken with respect to a probability measure, the expectation is bounded by the same value. If \(0< \frac{\bar{z}}{k} < \frac{1}{\sqrt{2\pi }}\), the optimum must be at \(y_l^\star +\bar{z} > 0\). Then by using the simplification \(\ln \varPhi (y_l+\bar{z}) \le \varPhi (y_l+\bar{z}) -1\), the optimum is found to be at

$$\begin{aligned} y_l^\star = \sqrt{2\ln \left( \frac{k}{\sqrt{2\pi }\bar{z}}\right) }-\bar{z}. \end{aligned}$$
(133)

Inserting \(y_l^\star \) back into the simplified argument of the expectation and assuming \(\bar{z}\rightarrow \infty \) in the limit of interest we find

$$\begin{aligned} \pi _\text {max} \le \exp \left( -L\frac{\bar{z}^2}{2k} -L\left( \frac{\bar{z}}{k\sqrt{2\ln \left( \frac{k}{\sqrt{2\pi }\bar{z}}\right) }}(1+o(1)) + \frac{\bar{z}}{k}\sqrt{2\ln \left( \frac{k}{\sqrt{2\pi }\bar{z}}\right) } - \frac{\bar{z}^2}{k}\right) \right) . \end{aligned}$$
(134)

The left-most and right-most terms are of equal order, but the second one from the left is always of less significant order than the second from the right, as long as \(\bar{z} = o(k)\).

The second term from the right becomes equal in order to the other two if \(\bar{z} = \tilde{z}\sqrt{2\ln k}\) with a positive constant \(\tilde{z}\). This satisfies the condition \(\bar{z} = o(k)\) while still \(\bar{z} \rightarrow \infty \), as required by previous assumptions (given that \(k\rightarrow \infty \) in the limit of interest). With this we have

$$\begin{aligned} \pi _\text {max} \le \exp \left( -L\left( \frac{\ln k}{k}(2\tilde{z} - \tilde{z}^2) + \mathcal {O}\left( \frac{\ln \ln k}{k}\right) \right) \right) . \end{aligned}$$
(135)

The bound is tightest for \(\tilde{z} = 1\), which maximizes \(2\tilde{z} - \tilde{z}^2\), and so

$$\begin{aligned} \pi _\text {max} \le \exp \left( -L\left( \frac{\ln k}{k} + \mathcal {O}\left( \frac{\ln \ln k}{k}\right) \right) \right) \end{aligned}$$
(136)

showing that \(\beta \ge 1\) for regular and uniform NK structures with Gaussian fitness. This bound is realized by the AN and BN structures, for example, and thus it is tight.
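The passage from Eq. (134) to Eq. (135) can be checked numerically. The following Python sketch evaluates the per-locus exponent of Eq. (134) with the \(o(1)\) factor dropped (an approximation made for this illustration; function names are hypothetical) and compares it, in units of \(\frac{\ln k}{k}\), with the leading coefficient \(2\tilde{z}-\tilde{z}^2\).

```python
import math

def upper_bound_exponent(k, z_tilde):
    """Per-locus exponent of Eq. (134), o(1) dropped, z_bar = z_tilde*sqrt(2 ln k)."""
    z_bar = z_tilde * math.sqrt(2.0 * math.log(k))
    x_star = math.sqrt(2.0 * math.log(k / (math.sqrt(2.0 * math.pi) * z_bar)))
    return (z_bar ** 2 / (2.0 * k)
            + z_bar / (k * x_star)
            + z_bar * x_star / k
            - z_bar ** 2 / k)

def leading_coefficient(z_tilde):
    """Coefficient of ln(k)/k in Eq. (135)."""
    return 2.0 * z_tilde - z_tilde ** 2

k = 1e100
for z_tilde in (0.5, 1.0, 1.5):
    scaled = upper_bound_exponent(k, z_tilde) * k / math.log(k)
    print(z_tilde, scaled, leading_coefficient(z_tilde))
```

Because the correction is of relative order \(\frac{\ln \ln k}{\ln k}\), agreement with the leading coefficient is only approximate even for very large k.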

D Simulation of the Number of Local Maxima

As first realized in [6], the choice of a Gaussian base fitness distribution greatly simplifies the computation of \(\pi _\mathrm {max}\) through the numerical evaluation of Eq. (25), as it allows us to take advantage of an efficient algorithm. With this choice, the integrals over \(\mathbf {q}\) and \(\mathbf {y}\) can be cast into the form of multi-dimensional Gaussian integrals which may be evaluated for generally defined NK structures. Once these integrals are evaluated, we may construct a covariance matrix \(\varSigma \) that satisfies the relation

$$\begin{aligned} \pi _\mathrm {max}= \int \mathcal {D} \mathbf {u} \exp \left( -\frac{1}{2}\sum _{j l} u_j \varSigma ^{-1}_{j l} u_l \right) , \end{aligned}$$
(137)

where \( \int \mathcal {D} \mathbf {u} = \frac{1}{\sqrt{(2\pi )^L \det \varSigma }} \int _{0}^{\infty } \prod _j du_j \) and the matrix elements of \(\varSigma \) are given by

$$\begin{aligned} \varSigma _{j l} = {\left\{ \begin{array}{ll} 2 \sum _{r} b_{l,r} &{} j = l \\ \sum _{r} b_{j,r} b_{l,r} &{} j \ne l. \end{array}\right. } \end{aligned}$$
(138)

Thus, the problem reduces to determining the probability that all entries of a zero-mean Gaussian random vector with covariance matrix \(\varSigma \) are positive. Computing the probability of a multivariate Gaussian distribution over a rectangular domain is a well-studied problem; an efficient algorithm is given in [28], and an implementation has been provided by its authors as an R library [27].
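As an illustration of Eq. (138), the following Python sketch builds \(\varSigma \) from a list of neighborhoods. The function name and the two example structures are assumptions made for this sketch and are not taken from the implementation cited above.

```python
def sigma_matrix(neighborhoods, L):
    """Covariance matrix of Eq. (138) for neighborhoods B_r over L loci."""
    # b[l][r] = 1 if locus l belongs to neighborhood r, else 0
    b = [[1 if l in B_r else 0 for B_r in neighborhoods] for l in range(L)]
    sigma = [[0] * L for _ in range(L)]
    for j in range(L):
        for l in range(L):
            shared = sum(bj * bl for bj, bl in zip(b[j], b[l]))
            # Diagonal: twice the number of neighborhoods containing l.
            # Off-diagonal: number of neighborhoods containing both j and l.
            sigma[j][l] = 2 * shared if j == l else shared
    return sigma

# Assumed examples: one block containing all loci, and adjacent
# neighborhoods of size 2 with periodic boundaries.
print(sigma_matrix([[0, 1, 2]], 3))
print(sigma_matrix([[0, 1], [1, 2], [2, 3], [3, 0]], 4))
```

The factor 2 on the diagonal reflects that each fitness difference involves two independent values per neighborhood, whereas two different loci share only the focal value of each common neighborhood.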

Roughly speaking, this algorithm consists of two steps: i) transforming to an integral over a unit rectangular domain such that a rejection-free Monte-Carlo simulation is possible and ii) finding an ordering of loci that minimizes the variance of the Monte-Carlo step. However, since the loci in the NK models we consider in this review are statistically identical, the second step is irrelevant in this particular case. Thus, here we describe briefly how the transformation can be achieved from Eq. (137).

Since \(\varSigma \) is positive-definite, the Cholesky decomposition ensures that there exists a lower-triangular matrix C such that \(\varSigma = C C^T\). The substitution \(\mathbf {u} = C \mathbf {x}\) then diagonalizes the integral at the cost of a nontrivial integration domain,

$$\begin{aligned} \pi _\mathrm {max}= \frac{1}{(2\pi )^{L/2}}\int _{ \mathbf {x} \in \mathcal {R}}\prod _{j=1}^{L} dx_j \exp \left( -\frac{1}{2}\sum _{j =1} ^L x_j^2 \right) , \end{aligned}$$
(139)

where the domain \(\mathcal {R} = (a_1, \infty ) \times (a_2, \infty ) \times \cdots \times (a_L, \infty )\) and \(a_j = - \sum _{l=1}^{j-1} x_l C_{j l } / C_{j j } \), so that each lower limit depends only on the preceding variables. Next, performing the canonical transformation to a standard uniform distribution \( z_j = \varPhi (x_j) \), where \(\varPhi (x)\) is the CDF of the standard Gaussian distribution, the integral becomes

$$\begin{aligned} \pi _\mathrm {max}= \int _{ \mathbf {z} \in \mathcal {R'}} \prod _{j=1}^{L} dz_j, \end{aligned}$$
(140)

where \( \mathcal {R'} = (d_1, 1) \times (d_2, 1) \times \cdots \times (d_L, 1) \) and \(d_j = \varPhi ( - \sum _{l=1}^{j-1} \varPhi ^{-1}(z_l) C_{j l} / C_{j j })\). Finally, another linear transformation \(z_j = d_j + w_j (1- d_j)\) brings the integral into the form

$$\begin{aligned} \pi _\mathrm {max}= \int _{\mathbf {w} \in \mathcal {R''} } \prod _{j=1}^{L} (1- d_j) dw_j, \end{aligned}$$
(141)

where \(\mathcal {R''} = (0,1)^{L}\). Now that the integration domain is the L-dimensional unit hypercube, the integral can be estimated by sampling the L variables \(w_j\) independently from a uniform distribution on (0, 1), computing the \(d_j\) sequentially for each sample, and averaging the resulting weight \(\prod _{j}(1-d_j)\) over samples.
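The chain of transformations in Eqs. (139) to (141) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the implementation of [27]; the function names, the clamping constant and the sample size are choices made for this sketch, and the variable reordering step of the full algorithm is omitted.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal, provides cdf and inv_cdf

def cholesky(sigma):
    """Lower-triangular C with sigma = C C^T (sigma positive-definite)."""
    n = len(sigma)
    c = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for l in range(j + 1):
            s = sum(c[j][m] * c[l][m] for m in range(l))
            if j == l:
                c[j][j] = math.sqrt(sigma[j][j] - s)
            else:
                c[j][l] = (sigma[j][l] - s) / c[l][l]
    return c

def estimate_pi_max(sigma, n_samples=20_000, seed=0):
    """Monte Carlo average of prod_j (1 - d_j) over uniform w, Eq. (141)."""
    rng = random.Random(seed)
    c = cholesky(sigma)
    n = len(sigma)
    total = 0.0
    for _ in range(n_samples):
        x = []
        weight = 1.0
        for j in range(n):
            a_j = -sum(c[j][l] * x[l] for l in range(j)) / c[j][j]
            d_j = _STD.cdf(a_j)
            weight *= 1.0 - d_j
            if j < n - 1:  # x_j is only needed for later limits
                z = d_j + rng.random() * (1.0 - d_j)
                z = min(max(z, 1e-12), 1.0 - 1e-12)  # keep inv_cdf defined
                x.append(_STD.inv_cdf(z))
        total += weight
    return total / n_samples

# Assumed test case: a single neighborhood containing all L = 3 loci,
# for which the focal genotype is the largest of 4 i.i.d. values.
print(estimate_pi_max([[2, 1, 1], [1, 2, 1], [1, 1, 2]]), 0.25)
```

Every sample contributes a weight, so no draws are rejected; for a diagonal \(\varSigma \) all weights equal \(2^{-L}\) and the estimate is exact.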

Cite this article

Hwang, S., Schmiegelt, B., Ferretti, L. et al. Universality Classes of Interaction Structures for NK Fitness Landscapes. J Stat Phys 172, 226–278 (2018). https://doi.org/10.1007/s10955-018-1979-z

Keywords

  • Evolution
  • Fitness landscapes
  • Epistasis
  • Adaptive walks