Modeling and analysis of the dynamics of communities of microbial DNA sequences in environments

  • Original Paper
Nonlinear Dynamics

Abstract

The dynamics of biological communities in environments have long been a fundamental topic in dynamical systems. At the most fundamental level, the dynamics of a biological community is the time evolution of the community of DNA sequences that it contains. A community of DNA sequences is represented as a probability function defined on the set of strings composed of the four letters \(\texttt{A}\), \(\texttt{C}\), \(\texttt{G}\), and \(\texttt{T}\). In this study, we first construct a model of the dynamics of a community of DNA sequences using a partial differential equation defined on the set of strings. We present a numerical solution to the model and demonstrate the validity of the model by reproducing, in simulations based on the numerical solution, the observed dynamics of microbial DNA sequence communities in environments. Next, we define a metric on the set of probability functions defined on the set of strings that is appropriate for measuring the dissimilarities between communities of DNA sequences, and, based on this metric, we introduce physical quantities for sequence community dynamics, such as the variational speed and directional persistence. We also provide a method for calculating these quantities and establish its accuracy theoretically. We calculate the quantities for the dynamics of the above-mentioned microbial sequence communities and show that they quantify the characteristics of these dynamics. The equation and physical quantities studied here can be used to control the dynamics of communities of microbial sequences in environments in environmental and agricultural engineering tasks, such as environmental pollution cleanup and soil enrichment.

Data availability

The datasets generated during this study are available from the Nucleotide database of the National Center for Biotechnology Information. See Sect. S1 of the Supplementary Information.

References

  1. Becerra-Bonache, L., De La Higuera, C., Janodet, J.-C., Tantini, F.: Learning balls of strings from edit corrections. J. Mach. Learn. Res. 9, 1841–1870 (2008)

  2. Bergroth, L., Hakonen, H., Raita, T.: A survey of longest common subsequence algorithms. In: String Processing and Information Retrieval (SPIRE 2000): 7th International Symposium, pp. 39–48. IEEE (2000)

  3. Bray, J.R., Curtis, J.T.: An ordination of the upland forest communities of southern Wisconsin. Ecol. Monogr. 27(4), 325–349 (1957)

  4. Chao, A., Chazdon, R.L., Colwell, R.K., Shen, T.-J.: A new statistical approach for assessing similarity of species composition with incidence and abundance data. Ecol. Lett. 8(2), 148–159 (2005)

  5. Cohen, E., Kessler, D.A., Levine, H.: Recombination dramatically speeds up evolution of finite populations. Phys. Rev. Lett. 94, 098102 (2005)

  6. Czekanowski, J.: Zur Differentialdiagnose der Neandertalgruppe. Korrespbl. dt. Ges. Anthrop. 40, 44–47 (1909)

  7. Damerau, F.J.: A technique for computer detection and correction of spelling errors. Commun. ACM 7(3), 171–176 (1964)

  8. Darwin, C.: On the Origin of Species. J. Murray, London (1859)

  9. Derrida, B., Peliti, L.: Evolution in a flat fitness landscape. Bull. Math. Biol. 53, 355–382 (1991)

  10. Desai, M.M., Fisher, D.S.: Beneficial mutation-selection balance and the effect of linkage on positive selection. Genetics 176, 1759–1798 (2007)

  11. Desai, M.M., Fisher, D.S.: The balance between mutators and nonmutators in asexual populations. Genetics 188, 997–1014 (2011)

  12. Desai, M.M., Fisher, D.S., Murray, A.W.: The speed of evolution and maintenance of variation in asexual populations. Curr. Biol. 17, 385–394 (2007)

  13. Desai, M.M., Walczak, A.M., Fisher, D.S.: Genetic diversity and the structure of genealogies in rapidly adapting populations. Genetics 193, 565–585 (2013)

  14. Deza, M.M., Laurent, M.: Geometry of Cuts and Metrics. Springer, Berlin (1997)

  15. Dirichlet, P.G.L.: Sur une nouvelle méthode pour la détermination des intégrales multiples. J. Math. Pures Appl. 4, 164–168 (1839)

  16. Eigen, M., McCaskill, J., Schuster, P.: The molecular quasi-species. Adv. Chem. Phys. 75, 149–263 (1989)

  17. Eigen, M., Schuster, P.: The Hypercycle: A Principle of Natural Self-Organization. Springer, Berlin (1979)

  18. Eiter, T., Mannila, H.: Distance measures for point sets and their computation. Acta Inform. 34(2), 109–133 (1997)

  19. Endler, J.A.: Natural Selection in the Wild. Princeton University Press, Princeton (1986)

  20. Fisher, R.A.: The Genetical Theory of Natural Selection. Oxford University Press, Oxford (1930)

  21. Flyvbjerg, H., Lautrup, B.: Evolution in a rugged fitness landscape. Phys. Rev. A 46, 6714 (1992)

  22. Fontana, W., Stadler, P.F., Bornberg-Bauer, E.G., Griesmacher, T., Hofacker, I.L., Tacker, M., Tarazona, P., Weinberger, E.D., Schuster, P.: RNA folding and combinatory landscapes. Phys. Rev. E 47, 2083 (1993)

  23. Goldman, N., Yang, Z.: A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol. Biol. Evol. 11(5), 725–736 (1994)

  24. Good, B.H., Desai, M.M.: Fluctuations in fitness distributions and the effects of weak linked selection on sequence evolution. Theor. Popul. Biol. 85, 86–102 (2013)

  25. Good, B.H., Rouzine, I.M., Balick, D.J., Hallatschek, O., Desai, M.M.: Distribution of fixed beneficial mutations and the rate of adaptation in asexual populations. Proc. Natl. Acad. Sci. USA 109(13), 4950–4955 (2012)

  26. Haldane, J.B.S.: The Causes of Evolution. Longmans, Green, London (1932)

  27. Hallatschek, O., Nelson, D.R.: Gene surfing in expanding populations. Theor. Popul. Biol. 73, 158–170 (2008)

  28. Hallatschek, O., Nelson, D.R.: Life at the front of an expanding population. Evolution 64, 193–206 (2010)

  29. Higgs, P.G.: Error thresholds and stationary mutant distributions in multi-locus diploid genetics models. Genet. Res. 63, 63–78 (1994)

  30. Higgs, P.G., Derrida, B.: Stochastic models for species formation in evolving populations. J. Phys. A 24, L985 (1991)

  31. Higgs, P.G., Derrida, B.: Genetic distance and species formation in evolving populations. J. Mol. Evol. 35, 454–465 (1992)

  32. Hofbauer, J., Schuster, P., Sigmund, K.: A note on evolutionary stable strategies and game dynamics. J. Theor. Biol. 81, 609–6 (1979)

  33. Hofbauer, J., Sigmund, K.: Evolutionary Games and Population Dynamics. Cambridge University Press, Cambridge (1998)

  34. Hutchinson, G.E.: Circular causal systems in ecology. Ann. NY Acad. Sci. 50(4), 221–246 (1948)

  35. Izsak, C., Price, A.R.G.: Measuring \(\beta \)-diversity using a taxonomic similarity index, and its relation to spatial scale. Mar. Ecol. Prog. Ser. 215, 69–77 (2001)

  36. Jaccard, P.: Contribution au problème de l’immigration post-glaciaire de la flore alpine. Bull. Soc. Vaudoise Sci. Nat. 36, 87–130 (1900)

  37. Johnston, M.O.: Natural selection on floral traits in two species of Lobelia with different pollinators. Evolution 45(6), 1468–1479 (1991)

  38. Kauffman, S.A.: The Origins of Order: Self Organization and Selection in Evolution. Oxford University Press, New York (1993)

  39. Kauffman, S.A., Levin, S.: Towards a general theory of adaptive walks on rugged landscapes. J. Theor. Biol. 128, 11–45 (1987)

  40. Kauffman, S.A., Weinberger, E.D.: The NK model of rugged fitness landscapes and its application to maturation of the immune response. J. Theor. Biol. 141, 211–245 (1989)

  41. Kessler, D.A., Levine, H.: Mutator dynamics on a smooth evolutionary landscape. Phys. Rev. Lett. 80, 2012 (1998)

  42. Kessler, D.A., Levine, H., Ridgway, D., Tsimring, L.S.: Evolution on a smooth landscape. J. Stat. Phys. 87, 519–544 (1997)

  43. Kimura, M.: The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge (1983)

  44. Kingsolver, J.G., Hoekstra, H.E., Hoekstra, J.M., Berrigan, D., Vignieri, S.N., Hill, C.E., Hoang, A., Gibert, P., Beerli, P.: The strength of phenotypic selection in natural populations. Am. Nat. 157(3), 245–261 (2001)

  45. Koleff, P., Gaston, K.J., Lennon, J.J.: Measuring beta diversity for presence-absence data. J. Anim. Ecol. 72(3), 367–382 (2003)

  46. Kolmogorov, A.N.: Sulla teoria di Volterra della lotta per l’esistenza. Giorn. Inst. Ital. Attuari 7, 74–80 (1936)

  47. Koyano, H., Hayashida, M.: Volume formula and growth rates of the balls of strings under the edit distances. Submitted

  48. Koyano, H., Hayashida, M., Akutsu, T.: Maximum margin classifier working in a set of strings. Proc. R. Soc. A 472(2187), 20150551 (2016)

  49. Koyano, H., Hayashida, M., Akutsu, T.: Optimal string clustering based on a Laplace-like mixture and EM algorithm on a set of strings. J. Comput. Syst. Sci. 106, 94–128 (2019)

  50. Koyano, H., Kishino, H.: Quantifying biodiversity and asymptotics for a sequence of random strings. Phys. Rev. E 81(6), 061912 (2010)

  51. Koyano, H., Tsubouchi, T., Kishino, H., Akutsu, T.: Archaeal \(\beta \) diversity patterns under the seafloor along geochemical gradients. J. Geophys. Res. Biogeosci. 119(9), 1770–1788 (2014)

  52. Lamarck, J.B.P.A.: Philosophie zoologique. Dentu et l’Auteur, Paris (1809)

  53. Lande, R., Arnold, S.J.: The measurement of selection on correlated characters. Evolution 37(6), 1210–1226 (1983)

  54. Larkin, M.A., Blackshields, G., Brown, N.P., Chenna, R., McGettigan, P.A., McWilliam, H., Valentin, F., Wallace, I.M., Wilm, A., Lopez, R., Thompson, J.D., Gibson, T.J., Higgins, D.G.: Clustal W and Clustal X version 2.0. Bioinformatics 23, 2947–2948 (2007)

  55. Levenshtein, V.I.: Binary codes capable of correcting deletions, insertions and reversals. Dokl. Akad. Nauk SSSR 163(4), 845–848 (1965)

  56. Li, W.-H., Wu, C.-I., Luo, C.-C.: A new method for estimating synonymous and nonsynonymous rates of nucleotide substitution considering the relative likelihood of nucleotide and codon changes. Mol. Biol. Evol. 2(2), 150–174 (1985)

  57. Lotka, A.J.: Elements of Physical Biology. Williams and Wilkins, Baltimore (1925)

  58. Lozupone, C., Knight, R.: UniFrac: A new phylogenetic method for comparing microbial communities. Appl. Environ. Microbiol. 71(12), 8228–8235 (2005)

  59. Lozupone, C.A., Hamady, M., Kelley, S.T., Knight, R.: Quantitative and qualitative \(\beta \) diversity measures lead to different insights into factors that structure microbial communities. Appl. Environ. Microbiol. 73(5), 1576–1585 (2007)

  60. Lozupone, C.A., Knight, R.: Species divergence and the measurement of microbial diversity. FEMS Microbiol. Rev. 32(4), 557 (2008)

  61. Macken, C.A., Hagan, P.S., Perelson, A.S.: Evolutionary walks on rugged landscapes. SIAM J. Appl. Math. 51, 799–827 (1991)

  62. Mann, H.B., Wald, A.: On stochastic limit and order relationships. Ann. Math. Stat. 14, 217–226 (1943)

  63. May, R.M.: Stability and Complexity in Model Ecosystems. Princeton University Press, Princeton (2001)

  64. Maynard Smith, J.: Evolutionary Genetics. Oxford University Press, Oxford (1989)

  65. Meijering, E., Dzyubachyk, O., Smal, I.: Methods for cell and particle tracking. In: Conn, P.M. (ed.) Methods in Enzymology, vol. 504, pp. 183–200. Elsevier, Amsterdam (2012)

  66. Navarro, G.: A guided tour to approximate string matching. ACM Comput. Surv. 33(1), 31–88 (2001)

  67. Nei, M.: Molecular Evolutionary Genetics. Columbia University Press, New York (1987)

  68. Nielsen, R., Yang, Z.: Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics 148(3), 929–936 (1998)

  69. Pavoine, S., Dufour, A.B., Chessel, D.: From dissimilarities among species to dissimilarities among communities: a double principal coordinate analysis. J. Theor. Biol. 228(4), 523–537 (2004)

  70. Plotkin, J.B., Muller-Landau, H.C.: Sampling the species composition of a landscape. Ecology 83(12), 3344–3356 (2002)

  71. Price, T., Kirkpatrick, M., Arnold, S.J.: Directional selection and the evolution of breeding date in birds. Science 240(4853), 798–799 (1988)

  72. Rausher, M.D.: The measurement of selection on quantitative traits: biases due to environmental covariances between traits and fitness. Evolution 46(3), 616–626 (1992)

  73. Ridgway, D., Levine, H., Kessler, D.A.: Evolution on a smooth landscape: the role of bias. J. Stat. Phys. 90, 191–210 (1998)

  74. Rouzine, I.M., Coffin, J.M.: Evolution of human immunodeficiency virus under selection and weak recombination. Genetics 170, 7–18 (2005)

  75. Rouzine, I.M., Coffin, J.M.: Highly fit ancestors of a partly sexual haploid population. Theor. Popul. Biol. 71, 239–250 (2007)

  76. Rouzine, I.M., Coffin, J.M.: Multi-site adaptation in the presence of infrequent recombination. Theor. Popul. Biol. 77, 189–204 (2010)

  77. Rouzine, I.M., Wakeley, J., Coffin, J.M.: The solitary wave of asexual evolution. Proc. Natl. Acad. Sci. USA 100, 587–592 (2003)

  78. Schuster, P.: Dynamics of molecular evolution. Physica D 22, 100–119 (1986)

  79. Smith, W., Solow, A.R., Preston, P.E.: An estimator of species overlap using a modified beta-binomial model. Biometrics 52, 1472–1477 (1996)

  80. Sørensen, T.A.: A method of establishing groups of equal amplitude in plant sociology based on similarity of species and its application to analyses of the vegetation on Danish commons. Kongel. Danske Vidensk. Selsk. Skr. 5, 1–34 (1948)

  81. Suzuki, Y.: New methods for detecting positive selection at single amino acid sites. J. Mol. Evol. 59(1), 11–19 (2004)

  82. Suzuki, Y., Gojobori, T.: A method for detecting positive selection at single amino acid sites. Mol. Biol. Evol. 16(10), 1315–1328 (1999)

  83. Tarazona, P.: Error thresholds for molecular quasispecies as phase transitions: From simple landscapes to spin-glass models. Phys. Rev. A 45, 6038 (1992)

  84. Torgerson, W.S.: Multidimensional scaling: I. Theory and method. Psychometrika 17(4), 401–419 (1952)

  85. Tsimring, L.S., Levine, H., Kessler, D.A.: RNA virus evolution via a fitness-space model. Phys. Rev. Lett. 76, 4440 (1996)

  86. Tuomisto, H.: A diversity of beta diversities: straightening up a concept gone awry. Part 1. Defining beta diversity as a function of alpha and gamma diversity. Ecography 33(1), 2–22 (2010)

  87. Tuomisto, H.: A diversity of beta diversities: straightening up a concept gone awry. Part 2. Quantifying beta diversity and related phenomena. Ecography 33(1), 23–45 (2010)

  88. Verhulst, P.-F.: Notice sur la loi que la population suit dans son accroissement. Corresp. Math. Phys. 10, 113–121 (1838)

  89. Volterra, V.: Variazioni e fluttuazioni del numero d’individui in specie animali conviventi. Memoria della Reale Accademia Nazionale dei Lincei 2, 31–113 (1926)

  90. Weinberger, E.D.: Local properties of Kauffman’s NK model: a tunably rugged energy landscape. Phys. Rev. A 44, 6399 (1991)

  91. Weinberger, E.D., Stadler, P.F.: Why some fitness landscapes are fractal. J. Theor. Biol. 163, 255–275 (1993)

  92. Whittaker, R.H.: Vegetation of the Siskiyou Mountains, Oregon and California. Ecol. Monogr. 30(3), 279–338 (1960)

  93. Whittaker, R.H.: Evolution and measurement of species diversity. Taxon 21, 213–251 (1972)

  94. Wiehe, T., Baake, E., Schuster, P.: Error propagation in reproduction of diploid organisms: a case study on single peaked landscapes. J. Theor. Biol. 177, 1–15 (1995)

  95. Woodcock, G., Higgs, P.G.: Population evolution on a multiplicative single-peak fitness landscape. J. Theor. Biol. 179(1), 61–73 (1996)

  96. Wright, S.: Evolution in Mendelian populations. Genetics 16(2), 97–159 (1931)

  97. Wright, S.: Evolution and the Genetics of Populations Volume 1 Genetic and Biometric Foundations. University of Chicago Press, Chicago (1968)

  98. Wright, S.: Evolution and the Genetics of Populations Volume 2 Theory of Gene Frequencies. University of Chicago Press, Chicago (1969)

  99. Yue, J.C., Clayton, M.K.: A similarity measure based on species proportions. Commun. Stat. Theory Methods 34(11), 2123–2131 (2005)

Funding

This work was supported by Grant-in-Aid for Scientific Research (B) from the Japan Society for the Promotion of Science (16KT0020) and Gurunavi, Inc., Tokyo, Japan.

Author information

Contributions

HK constructed the mathematical model, developed the computational methods, performed the numerical simulations, and wrote the manuscript. KS, NY and TY performed the experiments.

Corresponding author

Correspondence to Hitoshi Koyano.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 37198 KB)

Supplementary file 2 (mp4 2167 KB)

Supplementary file 3 (mp4 3379 KB)

Appendix A: Proofs of the theorems

In this appendix, we provide proofs of Theorems 1 to 3.

Proof of Theorem 1

For any \(s \in A^{*}\) and \(t \in [0, \infty )\), the relative frequency \(p(s, t + \Delta t)\) of sequence \(s\) in community \(S(t + \Delta t)\) at time \(t + \Delta t\) is composed of the following three components: (i) the relative frequency in \(S(t + \Delta t)\) of sequences that are equal to \(s\) and that belong to \(S(t) \smallsetminus {\tilde{S}}(t)\); (ii) the relative frequency in \(S(t + \Delta t)\) of those sequences that are equal to \(s\) and that are produced by sequences in \({\tilde{S}}(t)\) that are equal to \(s\) (i.e., sequences with no mutations); and (iii) the relative frequency in \(S(t + \Delta t)\) of those sequences that are equal to \(s\) and that are produced by sequences in \({\tilde{S}}(t)\) that are different from \(s\) (i.e., sequences with mutations). First, (i) is represented as

$$\begin{aligned} \frac{x(s, t) - {\tilde{x}}(s, t)}{n(t + \Delta t)}. \end{aligned}$$

Next, using the strong law of large numbers, we can represent (ii) as

$$\begin{aligned} \frac{o(s, t) (1 - \alpha _{L})^{\ell (s)}}{n(t + \Delta t)}. \end{aligned}$$

Lastly, from the strong law of large numbers, (iii) is represented as

$$\begin{aligned} \frac{1}{n(t + \Delta t)} \sum _{1 \le r < \infty } \sum _{s^{\prime } \in V_{d_{L}}(s, r)} \frac{o(s^{\prime }, t) \, {}_{\ell (s^{\prime })}C_{r} \alpha _{L}^{r} (1 - \alpha _{L})^{\ell (s^{\prime }) - r}}{|V_{d_{L}}(s^{\prime }, r)|}. \end{aligned}$$

Therefore, we have

$$\begin{aligned}&p(s, t + \Delta t) = \frac{x(s, t) - {\tilde{x}}(s, t)}{n(t + \Delta t)} \\&\quad + \frac{o(s, t) (1 - \alpha _{L})^{\ell (s)}}{n(t + \Delta t)} + \frac{1}{n(t + \Delta t)} \\&\quad \sum _{1 \le r < \infty }\sum _{s^{\prime } \in V_{d_{L}}(s, r)} \frac{o(s^{\prime }, t) {}_{\ell (s^{\prime })}C_{r} \alpha _{L}^{r} (1 - \alpha _{L})^{\ell (s^{\prime }) - r}}{|V_{d_{L}}(s^{\prime }, r)|}. \end{aligned}$$

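Expanding the left-hand side of the above equation in a Taylor series with respect to \(t\) to first order, \(p(s, t + \Delta t) \approx p(s, t) + \Delta t \, \partial p(s, t)/\partial t\), and rearranging the equation provides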

$$\begin{aligned} \frac{\partial p(s, t)}{\partial t} ={}& \frac{1}{\Delta t} \left( - p(s, t) + \frac{x(s, t) - {\tilde{x}}(s, t)}{n(t + \Delta t)} \right) + \frac{1}{\Delta t} \frac{o(s, t) (1 - \alpha _{L})^{\ell (s)}}{n(t + \Delta t)} \\ &+ \frac{1}{\Delta t} \frac{1}{n(t + \Delta t)} \sum _{1 \le r < \infty } \sum _{s^{\prime } \in V_{d_{L}}(s, r)} \frac{o(s^{\prime }, t) \, {}_{\ell (s^{\prime })}C_{r} \alpha _{L}^{r} (1 - \alpha _{L})^{\ell (s^{\prime }) - r}}{|V_{d_{L}}(s^{\prime }, r)|}. \end{aligned} \tag{28}$$

First, we consider the second term of the right-hand side of Eq. (28). Roughly, we have \(n(t + \Delta t) \rightarrow n(t)\) and \(o(s, t) \rightarrow 0\) as \(\Delta t \rightarrow 0\), and therefore, if \(n(t) > 0\) holds, we have

$$\begin{aligned} \frac{o(s, t)}{n(t + \Delta t)} \rightarrow 0 \end{aligned}$$

as \(\Delta t \rightarrow 0\). Thus, we suppose that for any \(s \in A^{*}\) and \(t \in [0, \infty )\), there exists the limit of

$$\begin{aligned} \frac{1}{\Delta t} \frac{o(s, t)}{n(t + \Delta t)} \end{aligned}$$

as \(\Delta t \rightarrow 0\) (i.e., it is possible to let \(\Delta t\) approach zero, keeping the ratio \((o(s, t)/n(t + \Delta t)) /\Delta t\) constant) and denote the limit by \(b(s, t)\). The quantity \(b(s, t)\) can be interpreted as a relative frequency in \(S(t)\) of offspring sequences produced by sequences that are equal to \(s\) and that produce offspring sequences and die at time \(t\). Hence, the limit \(b(s, t) (1 - \alpha _{L})^{\ell (s)}\) of the second term of the right-hand side of Eq. (28) as \(\Delta t \rightarrow 0\) represents a relative frequency in \(S(t)\) of those offspring sequences that have no mutations (i.e., are equal to \(s\)) and that are produced by sequences that are equal to \(s\) and that produce offspring sequences and die at time \(t\). Applying a similar argument to the third term of the right-hand side of Eq. (28), we see that the limit of the third term as \(\Delta t \rightarrow 0\) is

$$\begin{aligned} \sum _{1 \le r < \infty } \sum _{s^{\prime } \in V_{d_{L}}(s, r)} \frac{b(s^{\prime }, t) {}_{\ell (s^{\prime })}C_{r} \alpha _{L}^{r} (1 - \alpha _{L})^{\ell (s^{\prime }) - r}}{|V_{d_{L}}(s^{\prime }, r)|}. \end{aligned}$$

The above expression represents a relative frequency in \(S(t)\) of those offspring sequences that are equal to \(s\) and that are produced by sequences that are different from \(s\) and that produce offspring and die at time \(t\).
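To make the mutation weights in (ii) and (iii) concrete, the following minimal Python sketch evaluates the binomial factor \({}_{\ell }C_{r} \alpha _{L}^{r} (1 - \alpha _{L})^{\ell - r}\). It assumes, as the form of the factor suggests, that \(\alpha _{L}\) acts as a per-site mutation probability; the function name `mutation_weight` and the numerical values are hypothetical and chosen only for illustration.

```python
from math import comb, isclose


def mutation_weight(length: int, r: int, alpha: float) -> float:
    """Binomial weight C(length, r) * alpha**r * (1 - alpha)**(length - r).

    Assumption: alpha plays the role of alpha_L, read here as a per-site
    mutation probability, and length plays the role of ell(s').
    """
    return comb(length, r) * alpha**r * (1.0 - alpha) ** (length - r)


# Illustrative values only (not taken from the paper).
length, alpha = 10, 0.01
weights = [mutation_weight(length, r, alpha) for r in range(length + 1)]

# The r = 0 weight is the no-mutation factor (1 - alpha)**length of term (ii).
assert isclose(weights[0], (1.0 - alpha) ** length)
# Over r = 0, ..., length the weights sum to one (binomial theorem).
assert isclose(sum(weights), 1.0)
```

In term (iii), each weight with \(r \ge 1\) is additionally divided by \(|V_{d_{L}}(s^{\prime }, r)|\), which appears to spread it evenly over the strings in \(V_{d_{L}}(s^{\prime }, r)\).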

Lastly, we consider the first term of the right-hand side of Eq. (28). We rewrite it as

$$\begin{aligned} & \frac{1}{\Delta t} \left( - p(s, t) + \frac{x(s, t) - {\tilde{x}}(s, t)}{n(t + \Delta t)} \right) \\ & \quad = \frac{1}{\Delta t} \left( - \frac{x(s, t)}{n(t)} + \frac{x(s, t)}{n(t + \Delta t)} - \frac{{\tilde{x}}(s, t)}{n(t + \Delta t)} \right) . \end{aligned}$$

Roughly, we have \(n(t + \Delta t) \rightarrow n(t)\) and \({\tilde{x}}(s, t) \rightarrow 0\) as \(\Delta t \rightarrow 0\), and therefore, if \(n(t) > 0\) holds, we have

$$\begin{aligned} -\frac{x(s, t)}{n(t)} + \frac{x(s, t)}{n(t + \Delta t)} \rightarrow 0, \quad \frac{{\tilde{x}}(s, t)}{n(t + \Delta t)} \rightarrow 0 \end{aligned}$$

as \(\Delta t \rightarrow 0\). Thus, we suppose that for any \(s \in A^{*}\) and \(t \in [0, \infty )\), there exists the limit of

$$\begin{aligned} \frac{1}{\Delta t} \left( - \frac{x(s, t)}{n(t)} + \frac{x(s, t)}{n(t + \Delta t)} - \frac{{\tilde{x}}(s, t)}{n(t + \Delta t)} \right) \end{aligned}$$

as \(\Delta t \rightarrow 0\) and denote the limit by \(c(s, t)\). In the above expression, if \(\Delta t\) is sufficiently small, \(x(s, t)/n(t + \Delta t) - x(s, t)/n(t)\) is approximately equal to the variation in the relative frequency of sequences equal to \(s\) in the unit time interval \([t, t + \Delta t]\), and \({\tilde{x}}(s, t)/n(t + \Delta t)\) is approximately equal to the relative frequency of sequences that are equal to \(s\) and that produce offspring sequences and die in \([t, t + \Delta t]\). Hence, \(c(s, t)\) is interpreted as the contribution of sequences that are equal to \(s\) and that do not produce offspring sequences and die at time \(t\) to the variation in the relative frequency of sequences equal to \(s\) at \(t\). Consequently, we obtain the partial differential equation

$$\begin{aligned} \frac{\partial p(s, t)}{\partial t} ={}& c(s, t) + b(s, t) (1 - \alpha _{L})^{\ell (s)} \\ &+ \sum _{1 \le r < \infty } \sum _{s^{\prime } \in V_{d_{L}}(s, r)} b(s^{\prime }, t) \frac{{}_{\ell (s^{\prime })}C_{r} \alpha _{L}^{r} (1 - \alpha _{L})^{\ell (s^{\prime }) - r}}{|V_{d_{L}}(s^{\prime }, r)|}. \end{aligned}$$

\(\square \)
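The sets \(V_{d_{L}}(s, r)\) that appear in the equation are not defined in this appendix. Assuming that \(d_{L}\) denotes the Levenshtein distance [55] and that \(V_{d_{L}}(s, r)\) is the set of strings at distance exactly \(r\) from \(s\), the following Python sketch enumerates such a set by brute force for a short sequence. The function names `levenshtein` and `sphere` are hypothetical, and the exhaustive enumeration is practical only for very short strings.

```python
from itertools import product

ALPHABET = "ACGT"


def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]


def sphere(s: str, r: int) -> set:
    """Strings over ALPHABET at distance exactly r from s.

    Assumption: this is what V_{d_L}(s, r) denotes. Only lengths between
    len(s) - r and len(s) + r can occur, so the search space is finite.
    """
    lo, hi = max(0, len(s) - r), len(s) + r
    candidates = ("".join(w) for n in range(lo, hi + 1)
                  for w in product(ALPHABET, repeat=n))
    return {w for w in candidates if levenshtein(s, w) == r}


print(len(sphere("AC", 1)))
```

For `s = "AC"` and `r = 1` the enumeration yields 18 strings: 2 single deletions, 6 single substitutions, and 10 distinct single insertions.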

Proof of Theorem 2

We first show that \(d_{B}\) is nonnegative and vanishes only for identical arguments. Noting Eqs. (25) and (22), we have \(d_{\textrm{min}}(p_{1}, p_{2}) \ge 0\) for any \(p_{1}, p_{2} \in {\mathcal {P}}\). Therefore, \(d_{B}(p_{1}, p_{2}) \ge 0\) holds from Eq. (24). Moreover, if \(p_{1} = p_{2}\) holds for \(p_{1}, p_{2} \in {\mathcal {P}}\), we have \(d_{B}(p_{1}, p_{2}) = 0\) from Eqs. (24) and (22). On the other hand, supposing \(p_{1} \ne p_{2}\) provides \(d_{\textrm{min}}(p_{1}, p_{2}) > 0\) from Eqs. (25) and (22). Hence, \(d_{B}(p_{1}, p_{2}) > 0\) holds from Eqs. (24) and (22).

Next, we show the symmetry of \(d_{B}\). From Eqs. (24) and (23), it suffices to demonstrate that \(d_{\textrm{min}}(p_{1}, p_{2}) = d_{\textrm{min}}(p_{2}, p_{1})\) holds for any \(p_{1}, p_{2} \in {\mathcal {P}}\). We choose \(p_{1}, p_{2} \in {\mathcal {P}}\) arbitrarily. Noting Eq. (23) provides

$$\begin{aligned}&d_{\beta }(p_{1}, q_{1}) + d_{\beta }(q_{1}, q_{2}) + \cdots \\&\qquad + d_{\beta }(q_{\ell - 1}, q_{\ell }) + d_{\beta }(q_{\ell }, p_{2})\\&\quad = \ d_{\beta }(p_{2}, q_{\ell }) + d_{\beta }(q_{\ell }, q_{\ell - 1})\\&\qquad + \cdots + d_{\beta }(q_{2}, q_{1}) + d_{\beta }(q_{1}, p_{1}) \end{aligned}$$

for any \(\ell \in {\mathbb {Z}}^{+}\) and \(q_{1}, \ldots , q_{\ell } \in {\mathcal {P}}\). Therefore, if

$$\begin{aligned}&d_{\beta }(p_{1}, q^{*}_{1}) + d_{\beta }(q^{*}_{1}, q^{*}_{2}) \\&\qquad + \cdots + d_{\beta }(q^{*}_{\ell ^{*} - 1}, q^{*}_{\ell ^{*}}) + d_{\beta }(q^{*}_{\ell ^{*}}, p_{2})\\&\quad = \min \big \{ d_{\beta }(p_{1}, q_{1}) + d_{\beta }(q_{1}, q_{2}) \\&\qquad + \cdots + d_{\beta }(q_{\ell - 1}, q_{\ell }) + d_{\beta }(q_{\ell }, p_{2}): \ell \in {\mathbb {Z}}^{+}, q_{1}, \\&\qquad \ldots , q_{\ell } \in {\mathcal {P}} \big \} \end{aligned}$$

holds for \(\ell ^{*} \in {\mathbb {Z}}^{+}\) and \(q^{*}_{1}, \ldots , q^{*}_{\ell ^{*}} \in {\mathcal {P}}\), we have

$$\begin{aligned}&\min \left\{ d_{\beta }(p_{2}, q_{1}) + d_{\beta }(q_{1}, q_{2}) + \cdots + d_{\beta }(q_{\ell - 1}, q_{\ell })\right. \\&\qquad \left. + d_{\beta }(q_{\ell }, p_{1}): \ell \in {\mathbb {Z}}^{+}, q_{1}, \ldots , q_{\ell } \in {\mathcal {P}} \right\} \\&\quad = d_{\beta }(p_{2}, q^{*}_{\ell ^{*}}) + d_{\beta }(q^{*}_{\ell ^{*}}, q^{*}_{\ell ^{*} - 1}) \\&\qquad + \cdots + d_{\beta }(q^{*}_{2}, q^{*}_{1}) + d_{\beta }(q^{*}_{1}, p_{1}) \end{aligned}$$

and

$$\begin{aligned}&\min \left\{ d_{\beta }(p_{1}, q_{1}) + d_{\beta }(q_{1}, q_{2}) + \cdots + d_{\beta }(q_{\ell - 1}, q_{\ell }) \right. \\&\qquad \left. + d_{\beta }(q_{\ell }, p_{2}): \ell \in {\mathbb {Z}}^{+}, q_{1}, \ldots , q_{\ell } \in {\mathcal {P}} \right\} \\&\quad = \min \left\{ d_{\beta }(p_{2}, q_{1}) + d_{\beta }(q_{1}, q_{2}) + \cdots + d_{\beta }(q_{\ell - 1}, q_{\ell }) \right. \\&\qquad \left. + d_{\beta }(q_{\ell }, p_{1}): \ell \in {\mathbb {Z}}^{+}, q_{1}, \ldots , q_{\ell } \in {\mathcal {P}} \right\} . \end{aligned}$$

Thus, \(d_{\textrm{min}}(p_{1}, p_{2}) = d_{\textrm{min}}(p_{2}, p_{1})\) holds from Eq. (25).

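Lastly, supposing that there exist \(p_{1}, p_{2}\), and \(p_{3}\) such that \(d_{B}(p_{1}, p_{3}) > d_{B}(p_{1}, p_{2}) + d_{B}(p_{2}, p_{3})\) holds leads to a contradiction with Eqs. (24) and (25), because joining a minimizing chain from \(p_{1}\) to \(p_{2}\) and a minimizing chain from \(p_{2}\) to \(p_{3}\) at \(p_{2}\) gives a chain from \(p_{1}\) to \(p_{3}\) whose total length equals the right-hand side. Therefore, \(d_{B}\) satisfies the triangle inequality. \(\square \)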
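The role of the chain minimum in the triangle inequality can be illustrated on a finite set of points. In the Python sketch below, a small hypothetical dissimilarity matrix stands in for \(d_{\beta }\), and the minimum over chains of intermediate points is computed with the Floyd–Warshall algorithm, a standard technique swapped in here for illustration and not the paper's Algorithm 2. Because the matrix has a zero diagonal, the result already includes the direct dissimilarity and so plays the role of \(\min \{d_{\beta }, d_{\textrm{min}}\}\) restricted to this finite set. The set \({\mathcal {P}}\) in the theorem is not finite, so this is only a sketch under that restriction.

```python
def chain_closure(d):
    """Minimum total dissimilarity over all chains of intermediate points.

    d is a square matrix (list of lists) with a zero diagonal. The
    Floyd-Warshall recursion allows one more intermediate point k per pass.
    """
    n = len(d)
    dist = [row[:] for row in d]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist


def is_metric(d, tol=1e-12):
    """Check zero diagonal, positivity, symmetry, and the triangle inequality."""
    n = range(len(d))
    return (all(d[i][i] == 0 for i in n)
            and all(d[i][j] > 0 for i in n for j in n if i != j)
            and all(abs(d[i][j] - d[j][i]) <= tol for i in n for j in n)
            and all(d[i][j] <= d[i][k] + d[k][j] + tol
                    for i in n for j in n for k in n))


# Hypothetical dissimilarities: symmetric, but 5 > 1 + 1 violates the triangle inequality.
d_beta = [[0, 1, 5],
          [1, 0, 1],
          [5, 1, 0]]
d_closed = chain_closure(d_beta)

assert not is_metric(d_beta)
assert is_metric(d_closed)
assert d_closed[0][2] == 2  # the chain 0 -> 1 -> 2 replaces the direct value 5
```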

Proof of Theorem 3

We choose \(p_{1}, p_{2} \in {\mathcal {P}}\) arbitrarily. From the condition of the theorem, using Proposition 1 in [51], we have

$$\begin{aligned} & d_{\beta }\big ({\hat{p}}_{1}( \ \cdot \ , S_{1}^{(n_{1})}), q_{1}\big ) {\mathop {\longrightarrow }\limits ^{\mathrm {\small a.s.}}}d_{\beta }(p_{1}, q_{1}), \\ & d_{\beta }\big (q_{\ell }, {\hat{p}}_{2}( \ \cdot \ , S_{2}^{(n_{2})})\big ) {\mathop {\longrightarrow }\limits ^{\mathrm {\small a.s.}}}d_{\beta }(q_{\ell }, p_{2}) \end{aligned}$$

as \(n_{1}, n_{2} \rightarrow \infty \) for any \(\ell \in {\mathbb {Z}}^{+}\) and \(q_{1}, \ldots , q_{\ell } \in {\mathcal {P}}\). Therefore,

$$\begin{aligned} & d_{\beta }\big ({\hat{p}}_{1}( \ \cdot \ , S_{1}^{(n_{1})}), q_{1}\big ) + c_{\ell }(q_{1}, \ldots , q_{\ell }) + d_{\beta }\big (q_{\ell }, {\hat{p}}_{2}( \ \cdot \ , S_{2}^{(n_{2})})\big ) \\ & \quad {\mathop {\longrightarrow }\limits ^{\mathrm {\small a.s.}}}d_{\beta }(p_{1}, q_{1}) + c_{\ell }(q_{1}, \ldots , q_{\ell }) + d_{\beta }(q_{\ell }, p_{2}) \end{aligned} \tag{29}$$

holds as \(n_{1}, n_{2} \rightarrow \infty \) by the Mann–Wald theorem [62]. We suppose that

$$\begin{aligned} d_{\textrm{min}}(p_{1}, p_{2}) = d_{\beta }(p_{1}, q^{*}_{1}) + c_{\ell ^{*}}(q^{*}_{1}, \ldots , q^{*}_{\ell ^{*}}) + d_{\beta }\left( q^{*}_{\ell ^{*}}, p_{2}\right) \end{aligned}$$

holds for \(\ell ^{*} \in {\mathbb {Z}}^{+}\) and \(q^{*}_{1}, \ldots , q^{*}_{\ell ^{*}} \in {\mathcal {P}}\). In addition, we assume that \(\ell ^{*} \le u\) holds for the input \(u \in {\mathbb {Z}}^{+}\) of Algorithm 2. We denote the probabilities that \(\ell ^{*} \in {\mathbb {Z}}^{+}\) and \(q^{*}_{1}, \ldots , q^{*}_{\ell ^{*}} \in {\mathcal {P}}\) are chosen in the loop with respect to i from the first to eighth lines in Algorithm 2 by \(\psi ^{(i)}_{0}\) and \(\psi ^{(i)}_{1}, \ldots , \psi ^{(i)}_{\ell ^{*}}\), respectively, for each \(i = 1, \ldots , m\). Then, \(\psi ^{(i)} = \prod _{j = 0}^{\ell ^{*}} \psi ^{(i)}_{j} > 0\) holds from condition \(\hbox {G}_{{2}}\) on “choose \(x \in X\) randomly”. Thus, we have

$$\begin{aligned} \frac{1}{m} \sum _{i = 1}^{m} \psi ^{(i)} > 0 \end{aligned} \tag{30}$$

for any \(m \in {\mathbb {Z}}^{+}\). We define a random variable \(X_{i}\) as

$$\begin{aligned} X_{i} = {\left\{ \begin{array}{ll} 1 & \text{if } \ell ^{*} \text{ and } q^{*}_{1}, \ldots , q^{*}_{\ell ^{*}} \text{ were chosen in the } i\text{-th loop in Algorithm 2}, \\ 0 & \text{otherwise}. \end{array}\right. } \end{aligned} \tag{31}$$

We have \(E(X_{i}) = \psi ^{(i)}\) and \(V(X_{i}) = \psi ^{(i)} (1 - \psi ^{(i)}) < 1\), where \(E(X_{i})\) and \(V(X_{i})\) are the expectation and variance of \(X_{i}\), respectively. Hence, from condition \(\hbox {G}_{{1}}\), using the strong law of large numbers, we obtain

$$\begin{aligned} \frac{1}{m} \sum _{i = 1}^{m} X_{i} {\mathop {\longrightarrow }\limits ^{\mathrm {\small a.s.}}}\frac{1}{m} \sum _{i = 1}^{m} \psi ^{(i)} \end{aligned}$$

as \(m \rightarrow \infty \). Therefore, from Eqs. (30) and (31), there exists \(m_{0} \in {\mathbb {Z}}^{+}\) such that if \(m \ge m_{0}\), then at least one of \(X_{1}, \ldots , X_{m}\) is equal to one. The above discussion holds under the assumption of \(\ell ^{*} \le u\). There always exists \(u_{0} \in {\mathbb {Z}}^{+}\) satisfying \(\ell ^{*} \le u_{0}\), and thus, we choose such \(u_{0}\). If \(m \ge m_{0}\) and \(u \ge u_{0}\), then

$$\begin{aligned} d_{\text {min}}(p_{1}, p_{2}) = \min _{1 \le i \le m} \big \{ d_{\beta }(p_{1}, q_{i 1}) + c_{\ell ^{(u)}_{i}}\big (q_{i 1}, \ldots , q_{i \ell ^{(u)}_{i}}\big ) + d_{\beta }(q_{i \ell ^{(u)}_{i}}, p_{2}) \big \} \quad \mathrm {a.s.} \end{aligned}$$

holds. Hence, noting Eq. (29) and the definition of \({\hat{d}}^{(m, u)}_{\textrm{min}}\) in the tenth line in Algorithm 2 leads to

$$\begin{aligned} {\hat{d}}_{\textrm{min}}^{(m, u)}(S_{1}^{(n_{1})}, S_{2}^{(n_{2})}) {\mathop {\longrightarrow }\limits ^{\mathrm {\small a.s.}}}d_{\text {min}}(p_{1}, p_{2}) \end{aligned}$$

as \(m, u, n_{1}, n_{2} \rightarrow \infty \). Therefore, from the Mann–Wald theorem and the condition of the theorem, applying Proposition 1 in [51], we obtain

$$\begin{aligned} & \min \left\{ {\hat{d}}_{\beta }(S_{1}^{(n_{1})}, S_{2}^{(n_{2})}), {\hat{d}}_{\textrm{min}}^{(m, u)}(S_{1}^{(n_{1})}, S_{2}^{(n_{2})}) \right\} \\ & \quad {\mathop {\longrightarrow }\limits ^{\mathrm {\small a.s.}}}\min \big \{ d_{\beta }(p_{1}, p_{2}), d_{\textrm{min}}(p_{1}, p_{2}) \big \} \end{aligned}$$

as \(m, u, n_{1}, n_{2} \rightarrow \infty \). Consequently, noting Eq. (24) and the definition of \({\hat{d}}_{B}^{(m, u)}\) in the eleventh line in Algorithm 2 provides the desired theorem. \(\square \)
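The random-search argument in the proof can be mimicked on a finite toy problem. Algorithm 2 is not reproduced in this section, so the Python sketch below is not that algorithm: it assumes a finite candidate set, a hypothetical dissimilarity matrix standing in for \(d_{\beta }\), and uniform random choices of the chain length in \(\{1, \ldots , u\}\) and of the intermediate points, so that every chain has positive probability in each loop (the property the proof draws from condition \(\hbox {G}_{{2}}\)). The names `chain_cost` and `random_chain_search` are hypothetical.

```python
import random


def chain_cost(d, start, end, chain):
    """Total dissimilarity along start -> chain[0] -> ... -> chain[-1] -> end."""
    nodes = [start, *chain, end]
    return sum(d[a][b] for a, b in zip(nodes, nodes[1:]))


def random_chain_search(d, start, end, m, u, seed=0):
    """Minimum of the direct dissimilarity and the best of m random chains."""
    rng = random.Random(seed)
    n = len(d)
    best = float("inf")
    for _ in range(m):
        length = rng.randint(1, u)  # chain length in {1, ..., u}
        chain = [rng.randrange(n) for _ in range(length)]
        best = min(best, chain_cost(d, start, end, chain))
    return min(d[start][end], best)


# Hypothetical dissimilarities; the cheapest chain from 0 to 2 costs 2.
d_beta = [[0, 1, 5],
          [1, 0, 1],
          [5, 1, 0]]
few = random_chain_search(d_beta, 0, 2, m=5, u=2)
many = random_chain_search(d_beta, 0, 2, m=2000, u=2)

assert many <= few  # same seed, so the first 5 loops are shared
assert many >= 2    # no chain is cheaper than 0 -> 1 -> 2
```

In this toy setting a chain of cost 2 is drawn with probability 1/3 in each loop, so the probability that all \(m\) loops miss it is \((2/3)^{m}\), which tends to zero as \(m\) grows; this mirrors the role of \(m \ge m_{0}\) in the proof.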

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Koyano, H., Sawada, K., Yamamoto, N. et al. Modeling and analysis of the dynamics of communities of microbial DNA sequences in environments. Nonlinear Dyn 111, 5767–5797 (2023). https://doi.org/10.1007/s11071-022-08105-y

  • DOI: https://doi.org/10.1007/s11071-022-08105-y
