Using genetic data to estimate diffusion rates in heterogeneous landscapes


Precise knowledge of the dispersal ability of a population in a heterogeneous environment is of critical importance in agroecology and conservation biology, as it can provide management tools to limit the effects of pests or to increase the survival of endangered species. In this paper, we propose a mechanistic-statistical method to estimate space-dependent diffusion parameters of spatially-explicit models based on stochastic differential equations, using genetic data. After dividing the total population into subpopulations corresponding to different habitat patches with known allele frequencies, we compute the expected proportions of individuals from each subpopulation at each position by solving a system of reaction–diffusion equations. Modelling the capture and genotyping of the individuals with a statistical approach, we derive a numerically tractable formula for the likelihood function associated with the diffusion parameters. In a simulated environment made of three types of regions, each associated with a different diffusion coefficient, we successfully estimate the diffusion parameters with a maximum-likelihood approach. Although higher genetic differentiation among subpopulations leads to more accurate estimations, once a certain level of differentiation has been reached, the finite size of the genotyped population becomes the limiting factor for accurate estimation.



References

  1. Anderson E, Thompson E (2002) A model-based method for identifying species hybrids using multilocus genetic data. Genetics 160(3):1217–1229
  2. Berliner LM (2003) Physical-statistical modeling in geophysics. J Geophys Res 108:8776
  3. Bohonak AJ (1999) Dispersal, gene flow, and population structure. Q Rev Biol 1999:21–45
  4. Broquet T, Ray N, Petit E, Fryxell JM, Burel F (2006) Genetic isolation by distance and landscape connectivity in the American marten (Martes americana). Landsc Ecol 21(6):877–889
  5. Calderón AP (1980) On an inverse boundary value problem. In: Raupp MA, Meyer WH (eds) Seminar on numerical analysis and its applications to continuum physics. Sociedade Brasileira de Matematica, Brazil, pp 63–73
  6. Cantrell RS, Cosner C (2003) Spatial ecology via reaction–diffusion equations. Wiley, Chichester
  7. Cornuet J-M, Piry S, Luikart G, Estoup A, Solignac M (1999) New methods employing multilocus genotypes to select or exclude populations as origins of individuals. Genetics 153(4):1989–2000
  8. Doyle PG, Snell JL (1984) Random walks and electric networks. AMC 10:12
  9. Durbin J, Koopman SJ (2012) Time series analysis by state space methods, vol 38. Oxford University Press, Oxford
  10. Gardiner C (2009) Stochastic methods. In: Springer series in synergetics. Springer, Berlin
  11. Gilligan CA (2008) Sustainable agriculture and plant diseases: an epidemiological perspective. Philos Trans R Soc B Biol Sci 363(1492):741–759
  12. Graves T, Chandler RB, Royle JA, Beier P, Kendall KC (2014) Estimating landscape resistance to dispersal. Landsc Ecol 29(7):1201–1211
  13. Graves TA, Beier P, Royle JA (2013) Current approaches using genetic distances produce poor estimates of landscape resistance to interindividual dispersal. Mol Ecol 22(15):3888–3903
  14. Hamrick J, Trapnell DW (2011) Using population genetic analyses to understand seed dispersal patterns. Acta Oecologica 37(6):641–649
  15. Hanski IA, Gilpin ME (1996) Metapopulation biology: ecology, genetics, and evolution. Academic Press, New York
  16. Hecht F (2012) New development in FreeFem++. J Numer Math 20(3–4):251–265
  17. Hewitt GM (2000) The genetic legacy of the Quaternary ice ages. Nature 405:907–913
  18. Kareiva PM (1983) Local movement in herbivorous insects: applying a passive diffusion model to mark-recapture field experiments. Oecologia 57:322–327
  19. Klein EK, Bontemps A, Oddou-Muratorio S (2013) Seed dispersal kernels estimated from genotypes of established seedlings: does density-dependent mortality matter? Meth Ecol Evol 4(11):1059–1069
  20. Kot M, Lewis M, van den Driessche P (1996) Dispersal data and the spread of invading organisms. Ecology 77:2027–2042
  21. Marin J, Robert CP (2007) Bayesian Core. Springer, New York, NY
  22. McRae BH (2006) Isolation by resistance. Evolution 60(8):1551–1561
  23. Nachman AI (1996) Global uniqueness for a two-dimensional inverse boundary value problem. Ann Math 1996:71–96
  24. Nei M (1973) Analysis of gene diversity in subdivided populations. Proc Natl Acad Sci 70(12):3321–3323
  25. Ovaskainen O, Rekola H, Meyke E, Arjas E (2008) Bayesian methods for analyzing movements in heterogeneous landscapes from mark-recapture data. Ecology 89(2):542–554
  26. Paetkau D, Calvert W, Stirling I, Strobeck C (1995) Microsatellite analysis of population structure in Canadian polar bears. Mol Ecol 4(3):347–354
  27. Papaïx J, Goyeau H, Du Cheyron P, Monod H, Lannou C (2011) Influence of cultivated landscape composition on variety resistance: an assessment based on wheat leaf rust epidemics. N Phytol 191(4):1095–1107
  28. Patterson TA, Thomas L, Wilcox C, Ovaskainen O, Matthiopoulos J (2008) State-space models of individual animal movement. Trends Ecol Evol 23(2):87–94
  29. Preisler HK, Ager AA, Johnson BK, Kie JG (2004) Modeling animal movements using stochastic differential equations. Environmetrics 15(7):643–657
  30. Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics 155(2):945–959
  31. Rannala B, Mountain JL (1997) Detecting immigration by using multilocus genotypes. Proc Natl Acad Sci 94(17):9197–9201
  32. Robledo-Arnuncio JJ (2012) Joint estimation of contemporary seed and pollen dispersal rates among plant populations. Mol Ecol Res 12(2):299–311
  33. Robledo-Arnuncio JJ, Garcia C (2007) Estimation of the seed dispersal kernel from exact identification of source plants. Mol Ecol 16(23):5098–5109
  34. Roques L (2013) Modèles de réaction-diffusion pour l’écologie spatiale. Editions Quae
  35. Roques L, Auger-Rozenberg M-A, Roques A (2008) Modelling the impact of an invasive insect via reaction–diffusion. Math Biosci 216(1):47–55
  36. Roques L, Garnier J, Hamel F, Klein EK (2012) Allee effect promotes diversity in traveling waves of colonization. Proc Natl Acad Sci USA 109(23):8828–8833
  37. Roques L, Hosono Y, Bonnefon O, Boivin T (2014) The effect of competition on the neutral intraspecific diversity of invasive species. J Math Biol. doi:10.1007/s00285-014-0825-4
  38. Roques L, Soubeyrand S, Rousselet J (2011) A statistical-reaction–diffusion approach for analyzing expansion processes. J Theor Biol 274:43–51
  39. Rousset F (1997) Genetic differentiation and estimation of gene flow from F-statistics under isolation by distance. Genetics 145(4):1219–1228
  40. Shigesada N, Kawasaki K (1997) Biological invasions: theory and practice. Oxford series in ecology and evolution. Oxford University Press, Oxford
  41. Slatkin M (1987) Gene flow and the geographic structure of natural populations. Science 236(4803):787–792
  42. Smouse PE, Focardi S, Moorcroft PR, Kie JG, Forester JD, Morales JM (2010) Stochastic modelling of animal movement. Philos Trans R Soc B Biol Sci 365(1550):2201–2211
  43. Soubeyrand S, Laine AL, Hanski I, Penttinen A (2009) Spatio-temporal structure of host-pathogen interactions in a metapopulation. Am Nat 174:308–320
  44. Soubeyrand S, Roques L (2014) Parameter estimation for reaction–diffusion models of biological invasions. Popul Ecol 56(2):427–434
  45. Southwood TRE, Henderson PA (2009) Ecological methods. Wiley, New York
  46. Sylvester J, Uhlmann G (1987) A global uniqueness theorem for an inverse boundary value problem. Ann Math 125(1):153–169
  47. Tetali P (1991) Random walks and the effective resistance of networks. J Theor Prob 4(1):101–109
  48. Turchin P (1998) Quantitative analysis of movement: measuring and modeling population redistribution in animals and plants. Sinauer, Sunderland
  49. Valdinoci E (2009) From the long jump random walk to the fractional Laplacian. arXiv:0901.3261
  50. Wikle CK (2003) Hierarchical models in environmental science. Int Stat Rev 71:181–199
  51. Wright S (1943) Isolation by distance. Genetics 28:114–138

Conflict of interest

The authors declare that they have no conflict of interest.

Author information



Corresponding author

Correspondence to L. Roques.

Additional information

The research leading to these results has received funding from the French Agence Nationale pour la Recherche, within the ANR-12-AGRO-0006 PEERLESS, ANR-13-ADAP-0006 MECC and ANR-14-CE25-0013 NONLOCAL projects and from the European Research Council under the European Union’s Seventh Framework Programme (FP/2007-2013)/ERC Grant Agreement n.321186-ReaDi-Reaction-Diffusion Equations, Propagation and Modellings.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (avi 375 KB)

Supplementary material 2 (avi 323 KB)

Supplementary material 3 (avi 354 KB)

Supplementary material 4 (avi 343 KB)

Supplementary material 5 (avi 351 KB)

Supplementary material 6 (avi 351 KB)


Appendix A: gradual release of the pre-dispersal populations

Equation (2.2) describes a simultaneous release of all the individuals at \(t=0.\) To account for a possible gradual release of the individuals, Eq. (2.2) can be replaced by:

$$\begin{aligned} \frac{\partial u}{\partial t}=\varDelta (D(x) \, u) -\frac{u}{\nu }+u_0(x)\, f(t), \ t>0, \, x\in \varOmega , \end{aligned}$$

where the term \(u_0(x)\, f(t)\) describes the release of the individuals; \(u_0(x)\) still corresponds to the pre-dispersal density and \(f(t)\) is the release rate. The release rate can be described by any nonnegative function or distribution with integral 1 and with support in \([0,T]\), where \(T\) is the end of the release period. In this framework, the density of dispersers coming from habitat \(\varOmega ^h\) satisfies the equation:

$$\begin{aligned} \frac{\partial u^h}{\partial t}=\varDelta (D(x) \, u^h) -\frac{u^h}{\nu }+u_0^h(x)\, f(t), \ t>0, \, x\in \varOmega , \end{aligned}$$

where \(u_0^h\) is still given by (2.7).
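To illustrate the gradual-release equation, the following is a minimal one-dimensional sketch, not the solver used in the paper. It assumes a uniform grid, reflecting (no-flux) boundaries, a zero initial density and an implicit Euler time discretisation; the function names `no_flux_laplacian` and `simulate_gradual_release` are hypothetical.

```python
import numpy as np


def no_flux_laplacian(n, dx):
    """Cell-centred second-difference matrix with reflecting (no-flux) ends."""
    lap = np.zeros((n, n))
    idx = np.arange(n - 1)
    lap[idx, idx + 1] = 1.0
    lap[idx + 1, idx] = 1.0
    lap -= np.diag(lap.sum(axis=1))  # diagonal: -2 inside, -1 at both ends
    return lap / dx**2


def simulate_gradual_release(D, u0, nu, f, dx, dt, n_steps):
    """Integrate du/dt = Lap(D u) - u/nu + u0 * f(t) in 1D, starting from u = 0.

    D, u0 : arrays of diffusion coefficients and pre-dispersal densities
    nu    : life expectancy (np.inf switches mortality off)
    f     : release rate, a callable of time with integral 1 (assumed)
    """
    n = len(u0)
    lap = no_flux_laplacian(n, dx)
    # Implicit Euler: (I*(1 + dt/nu) - dt*Lap*diag(D)) u_new = u_old + dt*f(t_new)*u0
    A = np.eye(n) * (1.0 + dt / nu) - dt * lap @ np.diag(D)
    u = np.zeros(n)
    for step in range(1, n_steps + 1):
        u = np.linalg.solve(A, u + dt * f(step * dt) * u0)
    return u
```

Without mortality (`nu=np.inf`) and once the release period is over, the total mass of `u` in this sketch equals the total mass of `u0`, which is a convenient sanity check on the discretisation.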

Appendix B: precise shape of the diffusion terms

In our numerical computations, we took

$$\begin{aligned} \phi (x)=\mu _{2\, R}(\Vert x\Vert ) \hbox { and }\psi (x)=\psi (x_1,x_2)=\mu _{R}\left( x_1- q\right) , \end{aligned}$$

for the function \(\mu \) defined by (see Fig. 7):

$$\begin{aligned} \mu _R(r)=\exp \left( \frac{-r^4}{(r^2-R^2)^2}\right) \hbox { for }r \in (-R,R) \hbox { and }\mu _R(r)=0 \hbox { otherwise}. \end{aligned}$$
Fig. 7: The function \(\mu _R(r)\), for \(R=0.05\) and \(r\in (-0.1, 0.1)\)
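The bump function \(\mu _R\) and the two profiles \(\phi \) and \(\psi \) can be evaluated as in the short sketch below. The function names are hypothetical, and the offset \(q\) is passed as a plain argument because its value is set in the main text.

```python
import numpy as np


def mu(r, R):
    """Bump function mu_R(r): exp(-r^4 / (r^2 - R^2)^2) on (-R, R), 0 elsewhere."""
    r = np.atleast_1d(np.asarray(r, dtype=float))
    out = np.zeros_like(r)
    inside = np.abs(r) < R
    ri = r[inside]
    out[inside] = np.exp(-ri**4 / (ri**2 - R**2) ** 2)
    return out


def phi(x1, x2, R):
    """phi(x) = mu_{2R}(||x||) for a point x = (x1, x2)."""
    return mu(np.hypot(x1, x2), 2.0 * R)


def psi(x1, R, q):
    """psi(x) = mu_R(x1 - q); it depends on the first coordinate only."""
    return mu(np.asarray(x1, dtype=float) - q, R)
```

The function equals 1 at \(r=0\) and vanishes for \(|r|\ge R\).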

Appendix C: computation of the \(F_{ST}\)

The index \(F_{ST}\) is used as a measure of genetic differentiation among the subpopulations. It was computed as follows: we set

$$\begin{aligned} J_S=\frac{1}{\varLambda }\sum \limits _{\lambda =1}^{\varLambda } \sum \limits _{a=1}^{A} \sum \limits _{h=1}^{H} \frac{1}{H}\left( p_{h \lambda a}\right) ^2 \hbox { and } J_T=\frac{1}{\varLambda }\sum \limits _{\lambda =1}^{\varLambda }\sum \limits _{a=1}^{A} \left( \frac{1}{H} \sum \limits _{h=1}^{H} p_{h \lambda a}\right) ^2, \end{aligned}$$

where \(\varLambda \) is the number of loci, \(A\) the number of alleles per locus whose frequency is measured, and \(H\) the number of subpopulations. Here, \(J_S\) and \(J_T\) denote the mean homozygosity across subpopulations and the homozygosity of the total population, respectively. Then, we can write

$$\begin{aligned} F_{ST}= \frac{J_S-J_T}{1-J_T}. \end{aligned}$$

This formula corresponds to Nei’s \(G_{ST}\) for a single locus (Nei 1973), with numerator and denominator averaged over the \(\varLambda \) loci. In our computations, all the subpopulations had the same size; in other situations, the weight \(1/H\) in the above formulas for \(J_S\) and \(J_T\) should be replaced by the relative sizes of the subpopulations.
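The computation of \(F_{ST}\) described above can be sketched in a few lines. The array layout `p[h, locus, allele]` and the function name `fst` are assumptions made for illustration; the optional `weights` argument stands for the relative subpopulation sizes mentioned above.

```python
import numpy as np


def fst(p, weights=None):
    """F_ST from allele frequencies p with shape (H, n_loci, A).

    weights: relative subpopulation sizes summing to 1 (defaults to 1/H each).
    """
    p = np.asarray(p, dtype=float)
    n_subpops, n_loci, _ = p.shape
    if weights is None:
        w = np.full(n_subpops, 1.0 / n_subpops)
    else:
        w = np.asarray(weights, dtype=float)
    # J_S: weighted mean over subpopulations of the homozygosity, averaged over loci
    j_s = np.einsum("h,hla->", w, p**2) / n_loci
    # J_T: homozygosity of the pooled population, averaged over loci
    p_bar = np.einsum("h,hla->la", w, p)
    j_t = np.sum(p_bar**2) / n_loci
    return (j_s - j_t) / (1.0 - j_t)
```

Two subpopulations fixed for different alleles at a single biallelic locus give \(F_{ST}=1\), and identical subpopulations give \(F_{ST}=0\).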

Appendix D: numerical computation of the cumulated population densities

In order to compute the cumulated densities \(w_\infty (x)\) and \(w_\infty ^h(x),\) we used the time-dependent partial differential equation solver Comsol Multiphysics\(^{\copyright }\) applied to the evolution equations (10.2) and (10.4) below at large time (\(t=20\)), with default parameter values (finite element method with second order basis elements) and a triangular mesh adapted to the geometry of our landscape and made of 5296 elements.

We defined the cumulated population density at intermediate times t and position x by:

$$\begin{aligned} w_t(x)=\int _0^{t}u(s,x)\, ds, \ \hbox { for all }t>0, \ x\in \varOmega . \end{aligned}$$

Integrating (2.2) between 0 and \(t>0,\) we note that \(w_t(x)\) satisfies the following equation:

$$\begin{aligned} \frac{\partial w_t}{\partial t}=\varDelta (D(x) \, w_t) -\frac{w_t}{\nu }+u_0(x), \ t>0, \, x\in \varOmega , \end{aligned}$$

and \(w_0(x)=0.\)

Similarly, the cumulated population density of individuals coming from \(\varOmega ^h\) is:

$$\begin{aligned} w_t^h(x)=\int _0^{t}u^h(s,x)\, ds, \ \hbox { for all }t>0, \ x\in \varOmega . \end{aligned}$$

This function satisfies:

$$\begin{aligned} \frac{\partial w_t^h}{\partial t}=\varDelta (D(x) \, w_t^h) -\frac{w_t^h}{\nu }+u_0^h(x), \ t>0, \, x\in \varOmega , \end{aligned}$$

and \(w_0^h(x)=0.\)
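Formally letting \(t\rightarrow \infty \) in the equation for \(w_t\), with a vanishing time derivative, gives the stationary problem \(0=\varDelta (D(x)\, w_\infty ) -w_\infty /\nu +u_0(x)\). The sketch below solves this problem directly in one dimension instead of integrating in time as done above. It is only an illustration under stated assumptions: a uniform grid, reflecting (no-flux) boundaries and a hypothetical function name `cumulated_density_1d`.

```python
import numpy as np


def cumulated_density_1d(D, u0, nu, dx):
    """Solve 0 = Lap(D w) - w/nu + u0 on a uniform 1D grid with no-flux ends."""
    n = len(u0)
    # Cell-centred second-difference matrix; its column sums are zero,
    # so the discrete diffusion term neither creates nor destroys mass.
    lap = np.zeros((n, n))
    idx = np.arange(n - 1)
    lap[idx, idx + 1] = 1.0
    lap[idx + 1, idx] = 1.0
    lap -= np.diag(lap.sum(axis=1))
    lap /= dx**2
    # Rearranged stationary equation: (I/nu - Lap*diag(D)) w = u0
    A = np.eye(n) / nu - lap @ np.diag(D)
    return np.linalg.solve(A, u0)
```

Under these assumptions, summing the equation over the grid shows that the total mass of \(w_\infty \) equals \(\nu \) times the total mass of \(u_0\), which gives a quick check of the numerical solution.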

Appendix E: using abundance data in the inference of the diffusion rates

As already mentioned, an important feature of our approach is that the likelihood does not depend on the capture rates \(\beta _\tau \). As the expected number of individuals captured in a trap \(\theta _\tau \) is proportional to \(\alpha \, \beta _\tau \), the absolute number of individuals captured in \(\theta _\tau \) cannot be used directly to infer the diffusion parameters if the capture rates are not known. However, if the capture rate were the same (\(=\beta \)) for all traps, we could include the information on the absolute number of captured individuals \(\mathbf {I}=\{I_1, \ldots ,I_J\}\) by computing the likelihood

$$\begin{aligned} \begin{array}{ll} \mathcal {L}(D,\alpha \, \beta ) &{} =\mathbb {P}(\mathbf {\mathcal {G}},\mathbf {I}|D, \alpha \, \beta ) \\ &{} = \mathbb {P}(\mathbf {\mathcal {G}}|\mathbf {I},D, \alpha \, \beta ) \mathbb {P}(\mathbf {I}|D, \alpha \, \beta ), \end{array} \end{aligned}$$

where \(\mathbf {\mathcal {G}}\) is the genotype information. In our framework, the genotype information does not depend on the number of captured individuals in each trap, as we assumed a constant number of genotyped individuals per trap, G. Besides, we have shown that the quantity \(\mathbb {P}(\mathbf {\mathcal {G}}|D, \alpha \, \beta )\) was independent of \(\alpha \, \beta \). Using the assumptions of Sect. 2 on the capture process, the quantity \(\mathbb {P}(\mathbf {I}|D, \alpha \, \beta )\) can be computed explicitly:

$$\begin{aligned} \mathbb {P}(\mathbf {I}|D, \alpha \, \beta )=\prod \limits _{\tau =1}^{J}\exp (-C_\tau )\frac{C_\tau ^{I_\tau }}{(I_\tau !)}. \end{aligned}$$

Finally, one can infer the diffusion parameters, together with the product \( \alpha \, \beta \) by maximising the likelihood:

$$\begin{aligned} \mathcal {L}(D,\alpha \, \beta )=2^{k} \prod \limits _{\tau =1}^{J} \left\{ \exp (-C_\tau )\frac{C_\tau ^{I_\tau }}{I_\tau !} \prod \limits _{i=1}^{G} \sum \limits _{h=1}^{H}\left[ \frac{C^h_\tau }{C_\tau }\prod \limits _{\lambda =1}^{\varLambda }p_{h \lambda a^1} \, p_{h \lambda a^2}\right] \right\} , \end{aligned}$$

where k is the total number of heterozygous loci in the genotyped population. For the computation of \(C_\tau \) and \(C^h_\tau ,\) the pre-dispersal density \(\alpha \) can be fixed arbitrarily to 1.
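As an illustration, the logarithm of this likelihood can be evaluated as sketched below. The sketch follows the factorisation \(\mathbb {P}(\mathbf {\mathcal {G}}|\mathbf {I},D, \alpha \, \beta )\, \mathbb {P}(\mathbf {I}|D, \alpha \, \beta )\), so the Poisson term for the counts enters once per trap. The array layouts and the function name `log_likelihood` are assumptions; the quantities \(C_\tau \) and \(C^h_\tau \) are taken as precomputed inputs.

```python
import math

import numpy as np


def log_likelihood(C, Ch, counts, genotypes, p):
    """Log-likelihood of genotypes and capture counts (assumed array layouts).

    C         : (J,)   expected number of captures per trap
    Ch        : (J, H) expected captures per trap and source habitat
    counts    : (J,)   observed number of captured individuals I_tau
    genotypes : (J, G, n_loci, 2) integer allele indices (a^1, a^2)
    p         : (H, n_loci, A) allele frequencies in each subpopulation
    """
    n_traps, n_genotyped, n_loci, _ = genotypes.shape
    loci = np.arange(n_loci)
    # k: total number of heterozygous loci among genotyped individuals
    k = int(np.sum(genotypes[..., 0] != genotypes[..., 1]))
    loglik = k * math.log(2.0)
    for tau in range(n_traps):
        # Poisson log-probability of the count observed in trap tau
        loglik += (-C[tau] + counts[tau] * math.log(C[tau])
                   - math.lgamma(counts[tau] + 1))
        for i in range(n_genotyped):
            a1 = genotypes[tau, i, :, 0]
            a2 = genotypes[tau, i, :, 1]
            # Product over loci of p_{h,locus,a1} * p_{h,locus,a2}, one value per h
            per_source = np.prod(p[:, loci, a1] * p[:, loci, a2], axis=1)
            loglik += math.log(np.dot(Ch[tau] / C[tau], per_source))
    return loglik
```

Working with the logarithm avoids numerical underflow when the number of traps, individuals and loci is large; maximising it over the diffusion parameters would require recomputing `C` and `Ch` for each candidate \(D\).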

Appendix F: modelling sharp transitions between regions with different diffusion rates

For the sake of simplicity, we assumed in this paper that the coefficient \(D\) was a smooth function of the position \(x\), leading to the scalar equation (2.2), which has a unique classical solution. Sharp transitions could be modelled by replacing Eq. (2.2) by a system of \(N\) equations, where \(N\) is the number of patches \(\varOmega _i\) where the diffusion coefficient takes a constant value \(D_i\) and \(u_i\) is the population density in the patch \(\varOmega _i\):

$$\begin{aligned}\left\{ \begin{array}{l} \displaystyle \frac{\partial u_i}{\partial t}=D_i \, \varDelta u_i - \frac{u_i}{\nu }, \ x \in \varOmega _i, \\ \displaystyle u_i=u_j, \ x \in \partial \varOmega _i\cap \partial \varOmega _j, \\ \displaystyle D_i \, \nabla u_i \cdot \mathbf {n}_i=-D_j \, \nabla u_j \cdot \mathbf {n}_j, \ x \in \partial \varOmega _i\cap \partial \varOmega _j, \end{array}\right. \end{aligned}$$

where \(\partial \varOmega _i\) denotes the boundary of \(\varOmega _i\) and \(\mathbf {n}_i\) the outward unit normal to the boundary. The first boundary condition corresponds to the continuity of the population density in \(\varOmega =\bigcup \nolimits _{i}\varOmega _i\). The second boundary condition guarantees the conservation of mass in the absence of mortality (\(\nu =+\infty \)) and with reflecting boundary conditions on \(\partial \varOmega \).
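The conservation of mass can be checked with a short computation. This is a sketch which assumes that each \(u_i\) is smooth enough to apply the divergence theorem, that \(\nu =+\infty \), and that \(\nabla u_i \cdot \mathbf {n}_i=0\) on \(\partial \varOmega _i\cap \partial \varOmega \):

$$\begin{aligned} \frac{d}{dt}\sum \limits _{i}\int _{\varOmega _i}u_i\, dx=\sum \limits _{i}\int _{\partial \varOmega _i}D_i \, \nabla u_i \cdot \mathbf {n}_i\, ds=\sum \limits _{i<j}\int _{\partial \varOmega _i\cap \partial \varOmega _j}\left( D_i \, \nabla u_i \cdot \mathbf {n}_i+D_j \, \nabla u_j \cdot \mathbf {n}_j\right) ds=0. \end{aligned}$$

Each interface is counted once from either side, and the two contributions cancel because of the second boundary condition.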

Cite this article

Roques, L., Walker, E., Franck, P. et al. Using genetic data to estimate diffusion rates in heterogeneous landscapes. J. Math. Biol. 73, 397–422 (2016).

Keywords


  • Reaction–diffusion
  • Stochastic differential equation
  • Inference
  • Mechanistic-statistical model
  • Allele frequencies
  • Genotype measurements

Mathematics Subject Classification

  • 35K45
  • 35K57
  • 35Q92
  • 65C30
  • 92D10
  • 92D40