A Model for Cell Proliferation in a Developing Organism

In mathematical biology, there is a great deal of interest in producing continuum models by scaling discrete agent-based models governed by local stochastic rules. We discuss a particular example of this approach: a model for the proliferation of neural crest cells that can help us understand the development of Hirschprung’s disease, a potentially-fatal condition in which the enteric nervous system of a new-born child does not extend all the way through the intestine and colon. Our starting point is a discrete-state, continuous-time Markov chain model proposed by Hywood et al. (2013a) for the location of the neural crest cells that make up the enteric nervous system. Hywood et al. (2013a) scaled their model to derive an approximate second order partial differential equation describing how the limiting expected number of neural crest cells evolve in space and time. In contrast, we exploit the relationship between the above-mentioned Markov chain model and the well-known Yule-Furry process to derive the exact form of the scaled version of the process. Furthermore, we provide expressions for other features of the domain agent occupancy process, such as the variance of the marginal occupancy at a particular site, the distribution of the number of agents that are yet to reach a given site and a stochastic description of the process itself.


Introduction
Cell motility and cell proliferation play an indispensable role in collective cell spreading, which is essential to many key biological processes, medicine and pathological mechanisms including foetal tissue growth, tumour growth and wound healing (see e.g., Maini et al. 2004a;Deng et al. 2006;Cai et al. 2007;Friedl and Gilmour 2009), cell mobility in orchestrating morphogenesis during embryonic development (Gilbert 2003;Keller 2005;Binder et al. 2008;Vulin et al. 2014), immune responses Madri and Graesser (2000), tissue domain expansion (Crampin, Hackborn and Maini 2002), cancer Weinberg and Hanahan (2000) and vascular disease Raines (2000). Cell proliferation and cell division process lead to domain expansion. This can change the cellular spatial distribution and consequently lead to mechanisms of transport for proliferating cells. Understanding this mechanism in controlling collective cell migration can, potentially, enable us to prevent any abnormalities such as developmental anomalies or cancer metastasis. Agent-based cellular models have widely been used to model proliferative tissue growth (Binder and Landman 2009;Hywood et al. 2013b;Ross et al. 2016). The methods that are typically used to develop these models are based on using a master equation or by using the Fokker-Planck partial differential equation approach, Pillay et al. (2017). Applications of using these developed mathematical models have extensively been used for bone and tumour growth (Czarnecki et al. 2014;Monteagudo and Santos 2015), embryonic tissue growth by assuming time evolution of the length of the tissue domain was known (Binder et al. 2008;De Oliveira and Binder 2019) and fungal and yeast colonies (Matsuura 2000;Tronnolone et al. 2017). Therefore, to improve our understanding of proliferation cell processes, mathematical models have been formulated for the behaviour of the cell population as a whole, based on the behaviour of the individual cells within the population. For example, in the model for the growth of the foetal gut proposed by Hywood et al. (2013a), neural crest cells populate the gut during its growth and remain in the gut tissue to form the neurons that construct the enteric nervous system (ENS). Failure of the neural crest cells to invade the gut tissue completely can cause imperfect formation of the ENS, and lead to the existence of Hirschsprung's disease Newgreen and Young (2002). According to Grosfeld (2008), the first report of this disease may have occurred back in 1691. However, the disease is named after the Danish pediatrician Hirschsprung (1981), who described it. This disease can result in serious colon complications, such as enterocolitis and toxic megacolon, which can be life threatening, and it is vital that it be diagnosed early so that medical care can be commenced in a timely manner. In the literature, there are two main approaches to explaining collective cell spreading: (i) discrete models, and (ii) continuum models Murray (2012). The first class, discrete, individual-based models, usually involve discretizing time and/or space. These models capture the stochasticity and nonuniformity observed in experiments Simpson et al. (2007). The benefit of these models is that they can predict features based on information Murray et al. (2011) and fluctuation Simpson et al. (2014) that can be measured in experimental data Simpson et al. (2010). In parallel to discrete models, much attention has also been given to developing continuum models, illustrating the behaviour of populations of cells using reaction-diffusion equations. Continuum models can be used to clarify the relationships among the model parameters so that they can be investigated numerically Maini et al. (2004b). The interplay between pattern formation and domain growth, studied in Crampin, Gaffney and Maini (2002), and Woolley et al. (2011), makes use of both deterministic and stochastic aspects. A multiscale conception of the complex processes driving cell migration can be attained by linking these two modelling approaches together in an equivalent framework; this gives intuition into the interplay between the individual level and population level models that enable us to use either modelling scenario to study appropriate properties of the system. Hywood et al. (2013a) developed such a linkage by studying a discrete, stochastic, domain growth model where discrete 'domain agents' representing the gut cells proliferate. This model is a one dimensional semi-infinite lattice model for tissue growth with lattice spacing . Let N (t) be the number of domain agents at time t and L(t) ≡ N (t) be the length of the gut at this time. It follows that x = i is the position of the ith lattice site. In this model a proliferation event can occur, which results in the inclusion of an additional domain agent into the lattice. Consequently, the number of agents rises by one with each proliferation. In this process, if the domain agent at site i is selected to proliferate, it moves to site i + 1 and pushes all domain agents in sites i + 1, i + 2, . . . one place higher. This process is depicted in Fig. 1 which demonstrates a realization of two proliferation events with a subset of blue marked and green unmarked agents. Whenever a marked or an unmarked agent is selected to proliferate, it moves one site to the right and an unmarked domain agent is inserted into the lattice. If a marked domain agent is selected to proliferate, then the addition of the unmarked domain agent will not only extend the lattice but possibly also separate marked domain agents. By letting the spacing shrink to zero, Hywood et al. (2013a) derived a PDE (their Eq. (14)) to describe how the continuous expected occupancy evolves in time. The coefficient of the diffusion term in this PDE is a linear function of , and hence it should be taken to be zero as the lattice spacing shrinks. However, Hywood et al. (2013a) observed a very good fit between simulations and the trajectories of this PDE, when is taken to be small and positive. Our aim in this work is to analyse the model of Hywood et al. (2013a) in a different way, by observing that the proliferation process of Hywood et al. (2013a) can be interpreted as a superposition of Yule-Furry processes. We are able to derive expressions for the expectation and variance of the occupancy at a given point, the distribution of the number of agents yet to reach a given point and a stochastic description of the process itself.

The Proliferation Process
In this paper we work on the proliferation process introduced by Hywood et al. (2013a). Let {X (t)} t≥0 = {(X 1 (t), X 2 (t), ..., X N (t) (t))} t≥0 be a random vector that models the state of a proliferation process at time t, with N (t) the number of sites at time t and X i (t) = 1 if site i is occupied by a marked domain agent at time t and equal to zero otherwise. Let T k denote the time at which the k-th proliferation event occurs with T 0 = 0 and τ k+1 = T k+1 − T k the time between the kth In the next step, the unmarked domain agent at site 3 is selected to proliferate. The new unmarked domain agent is added to the lattice, occupying site 3 and k + 1st events. We assume that the random variable τ k+1 is exponentially distributed with rate λN (T k ) = λ(N (0) + k), so that the expected value of τ k+1 is 1/(λ(N (0) + k)), and that the location of the k + 1st proliferation event has a discrete uniform distribution on {1, . . . , N (0) + k}. If the proliferation event occurs at site j then X (T k+1 ) = (X 1 (T k ), . . . , X j−1 (T k ), 0, X j (T k ), . . . , X N (T k ) (T k )).
The above assumptions imply that the probability that a proliferation event occurs in (t, t+τ ], when the state is X (t), is λN (t)τ +o(τ ) and the location of such a proliferation is to the left of site i with probability (i − 1)/N (T ). Noting that X i (t + τ ) = 0 if a proliferation event occurs exactly at site i, it follows that the expected occupancy E(X i (t + τ )|X (t)) of site i at time t + τ satisfies the equation (1) Taking expectations of both sides, we can derive an equation for C i (t) ≡ E(X i (t)|X (0)), We can model an initial condition where there is a contiguous segment of lattice sites r + 1, . . . , s that are occupied by marked agents by taking To derive a PDE, we make the transformation of variables i → x and C i (t) to C(x, t), divide both sides of Eq. (2) by τ , and take limits → 0 and τ → 0 so that If we want to take as an initial condition that C(x, 0) = 1 on some interval (a, b], and zero otherwise, we need to let both r and s in the above model approach infinity in such a way that lim →0 r = a and lim →0 s = b.
, observing that, for any t and τ in the discrete model, and so We see that the expected length of the domain in the discrete model grows exponentially. Via the limiting procedure used above, this is also true of the length L(t) of the deterministic domain of the PDE (3). Hywood et al. (2013a) proposed that the expected occupancy C(x, t) is well-modelled by the second-order partial differential equation.
Because of the form (5), the first order term on the right hand side of (6) is the same as that in (3) /τ required to derive (6) cannot both be finite and non-zero. Thus, this equation has to be regarded as an approximation. In Fig. 2, we compare Eq. (6) derived by Hywood et al. (2013a) with simulation results averaged over 1000 of realizations for small values of . We can see that it does not fit well. In Sect. 3, we shall present an alternative view of the proliferation process that exploits the fact that individual particles move according to a Yule-Furry process. In

Yule-Furry Processes and Proliferation Processes
The pure birth process with a Markovian structure introduced by McKendrick (1914), is considered as one of the simplest branching processes. In the case that the birth rate is linear, it is called the pure linear birth, classical Yule or Yule-Furry process see, for example, Yule (1925), Bharucha-Reid (1997), De La Fortelle (2006 and references therein. The Yule-Furry process has been used to model stochastic population growth with a preferential attachment process as a model of speciation in biology. It was developed with the aim to model various stochastic dynamical systems processes in different fields ranging from epidemics, demography and population biology to server queuing modelling Elia and Taricco (1992), mathematical biology Baake and Wakolbinger (2015) and branching processes Kendall (1966). For instance, Romero-Arias et al. (2018) analysed a growth model of an avascular tumour that considers the basic biological principles of proliferation and genetic mutations of the cell. They modelled the cell mutations dynamics by using a Yule-Furry process and by assuming that cell mutation depends only on the previous genetic state. Mathematically, the Yule or Yule-Furry process is a pure-birth continuous-time Markov chain on the state space {1, 2, . . .} with transition rates q(i, i +1) = λi for some λ > 0, which is known as the splitting rate, see for example (Taylor 1975, Page 122). It is well-known that if a Yule-Furry process starts in state j at time zero then its position at time t has a negative binomial distribution with parameters j and e −λt , that is the probability that it is in state k ≥ j at time t is with p jk (t) = 0 if k < j. A domain agent in the proliferation process of Hywood et al. (2013a) moves according to a Yule-Furry process. Observing that the number of domain agents U k (t) at a given position k ≥ 1 at time t is necessarily 0 or 1, it easily follows that Now we are interested in the expected total number of particles at position k at time t, therefore, we consider s − r of these processes, Y ( j) (t), independently evolving on the positive integers, with common splitting rate λ and with different starting points j = r + 1, ..., s. Let the number of particles at position k at time t in the jth process be denoted by U Then the expected total number of particles at position k at time t is given by An important observation, made by Hywood et al. (2013a), is that this expected total number of particles is given by (9) even if the processes Y ( j) (t) are dependent, and that imposing a particular type of dependence results in the discrete space proliferation model described in Sect. 1. To see this, observe that the position Y (t) at time t of a particle evolving according to a Yule-Furry Process with splitting rate λ and with Y (0) = j satisfies the stochastic equation where N ( (t)) is the number of points in an inhomogeneous Poisson process with an intensity measure depending on the evolution of Y on the interval [0, t) given by To simulate a Yule-Furry process according to this viewpoint, we could use the fact that the integrand in (11) is piecewise constant and so, when Y (t) = k, we draw an exponential random variable with parameter kλ and increase the state to k + 1 when this time has elapsed. (Fig. 3) Another way to simulate the process would be to generate independent homogeneous Poisson processes {N i } with rate λ at each integer point i and move the particle at k one place to the right at any instant when there is an event in one of the Poisson processes N i with i ≤ k. This is equivalent to thinking of the random intensity measure in (11) as generated by the independent homogeneous Poisson processes {N i }. Specifically, by expressing the area represented by the integral in (11) as a sum of the areas of horizontal rectangles, we see that with S i = 0 for i ≤ j and S i = inf{u : Y (u) ≥ i} for i > j. These two viewpoints are illustrated in Fig. 3 for a case where j = 2. In the left hand panel the jumps are events in a Poisson process of increasing intensity. In the right hand panel, they are events in the superposition of an increasing number of Poisson processes. Now let's return to the s − r processes Y ( j) (t) driving Eq. (9). In the spirit of (12), we can think of each of these as being generated by a sequence {N  Hywood et al. (2013a): points in the ith Poisson process correspond to proliferation events at the ith site and move particles from all processes Y ( j) (t) that are to the right of i simultaneously. In this way, the proliferation process can be regarded as a coupled version of the asymmetric random walk. This leads us to the following theorem.

be a random vector that models the state of a proliferation process as defined in Sect. 2 at time t, with N (t) the number of sites at time t and X i (0) = I (r < i ≤ s). Then
(i) The expected number of points at site k in the proliferation process is given by C k (t) in (9). (ii) The variance in the number of points at site k in the proliferation process is given by measures the distance between the jth and ( j + 1)st marked domain agents at time t. Then the s − r random variables Z ( j) (t) are independent with common probability mass function given by (iv) For k > r, let M k (t) = k =r +1 X (t) be the number of marked domain agents in positions r + 1, . . . , k at time t. Then where 2 F 1 (a, b; c; z) is the hypergeometric function (Abramowitz and Stegun 1965, Definition 15.1.1).
Proof See the Appendix.

Asymptotic Properties
With a view to establishing a continuum limit of the model in Sect. 3, we suppose that there are N (0) = (b − a)/ particles initially situated at positions ( a/ + 1), · · · , b/ within the interval (a, b] with equal spacing . The idea is that there is an initial 'particle mass' (b − a)/ distributed evenly among these N (0) points. The point at position ( a/ + j) carries a mass , which we can think of as a rectangle of unit height over the interval [ ( a/ + j), ( a/ + j + 1)]. The expected height of the rectangle at position y at time t, where with respect to (9), we have used the correspondence k = y/ , r = a/ and s = b/ . Note that C (y, t) is a piecewise constant function, with jumps at points of the form k . Furthermore, k, r and s increase at the same rate as goes to zero, in particular, r /k → a/y and s/k → b/y.  initially located at sites 12 up to 118, and = 1/n with n ranging from 1 to 5. We observe that, as decreases, the curves become flatter with decreased tail mass.

The limiting profile C(y, t)
Figures 4 and the left hand panel of 5 suggest that, taken as a function of y for a fixed value of t, C (y, t) approaches a limit as → 0. We investigate this behaviour in this section, commencing with an intuitive characterisation using the normal approximation to the binomial distribution. Equation (14) for the expected number of particles at position y at time t when the spacing is has the form where S (y, t) is a random variable with a binomial distribution with parameters y/ − 1 and e −λt . The mean and variance of S (y, t) are, respectively, e −λt ( y/ − 1) and e −λt (1 − e −λt )( y/ − 1). Employing the normal approximation to approximate C (y, t) when is small, we get where Z is a standard normal random variable. The performance of this approximation is illustrated in red in the left hand panel of Fig. 4 and in the left-hand panel of Fig.  5. In our numerical studies, without loss of generality, we assume the value of λ to be 0.69. We plotted C (y, t) for different values of λ in the right hand panel of Fig. 5. In Fig. 6 we compare the normal approximation result in Eq. 16 with Eq. 6, derived by Hywood et al. (2013a). We see that Eq. 16 fits well with the simulation results averaged over 1000 of realizations and for = 0.1, = 0.01 and = 0.001. Figure  7 illustrates how expression (16) (16) is equal to zero if a/y = e −λt and approaches −∞ or ∞ as → 0 depending on whether a/y − e −λt is negative or positive, that is according as y > ae λt or y < ae λt . Similarly, approaches 1/2 if y = ae λt or be λt , 1 if ae λt < y < be λt and 0 otherwise. Thus we would expect the limiting profile for C (y, t) to be Remark 4.1 Note that C(y, t), given by (17) • has the correct mass b − a for all t, and • is a weak solution of the first-order partial differential Eq. (3).
It remains to establish the form (17) rigorously as → 0. This is given in the following theorem.  We started this section by observing that we can think of our initial condition as spreading a mass of (b − a)/ over N (0) = (b − a)/ points, so that each point carries a mass of . In the limit as → 0 the mass at each point goes to zero but the number of points approaches infinity in such a way that the total mass approaches b − a. For y ∈ (a, ∞), this motivates us to define M (y, t) = M y/ (t) in the model where the spacing is set to and M k (t) is defined in part (iv) of Theorem 3.1. The limiting distribution of M (y, t) is given in the following theorem.  ∈ (a, b), (ii) For positive t, y ∈ (ae λt , be λt ] and z ∈ (−∞, ∞), Proof See the Appendix.

Remark 4.2
If y < ae λt , then Theorem 4.2(i) implies that lim →0 Pr(M (y, t) ≤ z − a) = 1 for all z ∈ (a, b]. In particular, this implies that lim →0 Pr(M (y, t) ≤ 0) = 1, which tells us that M (y, t) weakly converges to a point mass at zero. On the other hand, if y > be λt then lim →0 Pr(M (y, t) ≤ z − a) = 0 for all z ∈ (a, b) and M (y, t) weakly converges to a point mass at b − a. Otherwise, M (y, t) weakly converges to a point mass at z = ye −λt − a.
This agrees with the observation of Theorem 4.1 that the limiting expected number of agents C(y, t) has the form of a square wave of height e −λt over the interval y ∈ (ae λt , be λt ).

Conclusion
In this paper we have analysed the cellular proliferation process model of Hywood et al. (2013a) by taking advantage of its relationship to the superposition of a set of coupled Yule-Furry processes, each with the same splitting rate. In Sect. 2, we presented the model of Hywood et al. (2013a) together with the argument of Hywood et al. (2013a) to derive an approximating second order PDE for the expected occupancy of the process at position y at time t. In Sect. 3, we analysed the discrete model with positive , and used the model's representation as a set of coupled Yule-Furry processes to derive an exact expression for the expected occupancy C k (t) at site k at time t. We also derived expressions for the variance of the occupancy and the distribution of the number of sites between adjacent domain agents. Finally, we showed that at time t, domain agents occupy positions that are distributed as a geometric renewal process. In Sect. 4, we considered limiting versions as → 0 of the process that we analysed in Sect. 3. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Appendix
Proof of Theorm 3.1 (i) The first assertion follows from the argument outlined in Sect. 3: the expected number of particles at site k is the same in the proliferation process as it is in the asymmetric random walk process, which is given by (9). (ii) The number of points at site k in the proliferation process is a Bernoulli random variable with a probability C k (t) of being equal to one. It follows that its variance is equal to C k (t)(1 − C k (t)). (iii) Using our view of the Yule-Furry process as being generated by a sequence of independent homogeneous Poisson processes {N i } with rate λ at each integer point i, we see that the distance Z ( j) between two marked domain agents also grows as a Yule-Furry process, but one that starts off with Z ( j) (0) = 1. Equation (13) then follows from Eq. (7). (iv) If k ≤ j, then the proliferation process started with less than or equal to j − r agents in positions r + 1, . . . , k and, since agents never move to lower numbered positions, Pr(M k (t) ≤ j − r ) = 1 for all t.
For k > j, the event that there are less than j − r marked agents in positions r + 1, . . . , k at time t is the same as the event that the agent starting in position j has moved to one of the positions > k at time t. From this observation, it follows that Substituting Eq. (7) we see that, for j = r + 1, . . . , s, between the Bernoulli distribution with success probability and the Bernoulli distribution with success probability θ . Suppose S k has a binomial B(k, θ) distribution. Then, by Theorem 2 of Arratia and Gordon (1989), if θ < < 1, where We can deal with the complementary case where 0 < < θ by noting that k − S k has a binomial B(k, 1 − θ) distribution. Clearly H (1 − , 1 − θ) = H ( , θ ), and r (1 − , 1 − θ) = 1/r ( , θ ) (< 1), and so Finally, when = θ , the Central Limit Theorem tells us that We proceed to the proof of the theorem.
(ii) For any finite value of , the number of particles at location y, that is X y/ , is a Bernoulli random variable: there is either a particle at site y/ or there isn't. Since the mean of this Bernoulli random variable is C (y, t), its variance must be V ar (y, t) = C (y, t)(1 − C (y, t)). This proves (ii). The summand on the right hand side of (29) is Pr(R( z/ + 1, t) = ) where R(n, t) is a negative binomial random variable with parameters n and e −λt . The right hand side of (29) itself can thus be interpreted as Pr (R( z/ + 1, t) ≥ y/ + 1). The random variable R(n, t) has the same distribution as the sum n i=1 T (i) (t) where T (i) (t) are independent geometric random variables with common parameter e −λt . The mean of T (i) (t) is e λt and its variance is e 2λt − e λt .
We can apply the Weak Law of Large Numbers to conclude that the distribution of R( z/ + 1, t)/( z/ + 1) approaches a point mass at e λt as → 0. For z ∈ (a, min(y, b)], we conclude that lim →0 Pr (M (y, t) ≤ ( z/ − a/ )) = lim where n( ) = ye −λt / + z (e −λt − e −2λt ) y/ . The limiting probability (30) is the same as by the Central Limit Theorem and because the expression on the right hand side approaches −z as → 0.