1 Introduction

Cancer invasion is a complex process involving numerous interactions between the cancer cells and the extracellular matrix (ECM) (cf. the tumour microenvironment) facilitated by matrix degrading enzymes. Along with active cell migration (both individual and collective) and increased/excessive proliferation, these processes enable the local spread of cancer cells into the surrounding tissue. Any encounter with blood or lymphatic vessels (cf. tumour-induced angiogenesis, lymph-angiogenesis) in the tumour microenvironment initiates and facilitates the spread of the cancer to secondary locations in the host, i.e., metastasis or metastatic spread. A comprehensive historical overview of the biology of metastastic spread can be found in the article by Talmadge and Fidler (2010), while an overview of the core aspects of invasion can be found in the articles of Hanahan and Weinberg (2000, 2011) and the review article of Friedl and Wolf (2003). From a mathematical modelling perspective, cancer invasion has been a topic of interest for almost 30 years with a range of approaches and techniques being used, and an overview can be found in the recent review paper by Sfakianakis and Chaplain (2021). Broadly speaking, two different approaches have been used to model cancer invasion-continuum approaches (i.e. using differential equations with cancer cell density as one of the dependent variables) and individual-based or agent-based approaches (i.e. focusing on the movement of individual cells). Some have also adopted a so-called hybrid approach e.g. Anderson et al. (2000), deriving a discrete model governing the migration of individual cancer cells from the discretization of an associated system of PDEs.

It is not our intention here to discuss in detail the previous modelling work in the area. Rather the aim of this paper is to investigate mathematically the connection between the stochastic differential and the partial differential equations (SDEs and PDEs respectively) that are typically used to describe the migration of living cells. The precise interplay between the SDE and PDE approach is not yet clear and has been in the research focus the last years. The difference between these two approaches is significant and it lies primarily in the immediate focus of the mathematical model. Namely, whereas the SDE approach focuses primarily on the migratory behaviour of the individual cells, the PDE approach describes the macroscopic behavioural pattern of a large collective of cells. As the behavioural pattern, in real life biology, of large cell collectives is related to the migration of individual cells so should the two mathematical approaches be connected.

By identifying the interplay between the two mathematical approaches, we shed light in the complexity of multiscale modelling and simulations. This has direct implications in the biological understanding of solitary cancer cell migration and the development/growth of tumours. Such detailed knowledge of how far individual cancer cells can penetrate into the local tissue is very important from a surgical point of view and can help to minimise the amount of resection required, a point initially raised and investigated in the work of Anderson et al. (2000).

Our aim, hence, in this work is to investigate the interplay between the SDE and PDE modeling approaches of cancer invasion and growth. Namely, we exhibit that the terms of the numerical scheme solving the underlying SDE are not in a one-to-one correspondence to the terms of the PDE. In more detail, we show that a particular “correction” in the drift terms of the numerical scheme of the SDE that improves the approximation qualities of the schemes when compared with the numerical solution of the PDE.

These ideas are studied in the remainder paper in the following way. In Sect. 2 we provide some background for the motivation to this paper stating the general forms of PDE and SDE to be considered, while in Sect. 3 we derive the SDE schemes in some detail. In Sect. 4 we undertake numerical experiments and compare results from computational simulations of the underlying cell migration PDE model with simulations of two different SDE schemes. Finally in Sect. 5 concluding remarks are made.

2 Motivation

We are motivated in this work by typical continuum cancer invasion models (e.g. Chaplain and Lolas (2005, 2006); Andasari et al. (2011)) and consider the following general Advection-Diffusion PDE

$$\begin{aligned} \frac{\partial u}{\partial t}+\nabla \cdot \left( {\textbf{A}} u\right) - \nabla \cdot \left( \textbf{D}\nabla u\right) =0, \end{aligned}$$
(1)

where, according to the usual practice, \(u:[0,T]\times \Omega \rightarrow {\mathbb {R}}\) represents the space-time dependent density of cancer cells, and where \(\Omega \subset {\mathbb {R}}^d\), with \(d=2,3\), is a Lipschitz domain. We assume throughout this work that both the advection and diffusion coefficients \(\textbf{A}\) and \(\textbf{D}\) are non-constant in the sense that they depend on \(\textbf{x}\in {\mathbb {R}}^d\).

It is biologically understood that the macroscopic patterns of a large collective of cells is related to the migration of individual cells. Hence, our aim is to understand the connection between the continuous model (1), and models that capture the migration of solitary cancer cells. Following the ideas developed in the seminal works by Einstein (1998) on the investigation of Brownian motion, as well as by Stratonovich (1966); Ito et al. (2012); Kitanidis (1994), the motion of the solitary cancer cells is usually described via SDEs that track the position of the cells. These SDEs typically take the form

$$\begin{aligned} d\textbf{X}_t = \textbf{a}(\textbf{X}_t,t) \, dt + \textbf{d}(\textbf{X}_t,t) \, d\textbf{W}_t \, , \end{aligned}$$
(2)

with \(t\ge 0\). Here \(\textbf{X}_t\in {\mathbb {R}}^d\) is a stochastic process that represents the position of the solitary cancer cells, and where \(\textbf{a}(\textbf{X}_t,t)\) and \(\textbf{d}(\textbf{X}_t,t)\) are the drift and the diffusion coefficients respectively. Here \(\textbf{W}_t\) represents a d-dimensional Wiener process.

The critical question that arises in this work is the following: how do the advection and diffusion terms \(\textbf{A}\), \(\textbf{D}\) of the deterministic PDE (1) relate to the drift and diffusion terms \(\textbf{a}\), \(\textbf{d}\) of the SDE (2)?

By identifying these relations we shed light in the complexity of multiscale modelling and simulations in general, and in the migration of solitary cancer cells and macroscopic tumour growth within a living organism in particular. Succeeding, hence, in identifying such relations will allow well established—and phenomenologically verifiable—macroscopic models to be used in order to extract, at the smaller scale, the migratory pattern of solitary cells or small clusters of cancer cells. Vice-versa, these relations will provide an additional validation to the use of deterministic models or even encourage ideas for a hybrid method using both deterministic and stochastic models.

We proceed in this effort under the assumption that u represents a single solitary cancer cell or a small cluster of cancer cells. We denote by \(\textbf{x}_t\in {\mathbb {R}}^d\) the numerical approximation of the solution stochastic process \(\textbf{X}_t\in {\mathbb {R}}^d\) that represents the position of the cell’s centroid at time t. Following (Kloeden and Platen 1992; Kitanidis 1994; Ito et al. 2012; Stratonovich 1966; Evans 2013) we exhibit in Sect. 3 that the less intuitive numerical scheme:

$$\begin{aligned} \textbf{x}_{t+\Delta t} = \textbf{x}_t + \big (\textbf{A}(\textbf{x}_t) +\nabla \cdot \textbf{D}(\textbf{x}_t)\big ) \, \Delta t + \sqrt{2\Delta t}\textbf{V}(\textbf{x}_t)\varvec{\,} \xi , \end{aligned}$$
(3)

where \(\varvec{\xi }\) is a vector of d independent and normally distributed random variables of mean 0 and variance 1, and where \({\textbf{V}}\) given by

$$\begin{aligned} {\textbf{V}}(\textbf{x}_t) \textbf{V}^T(\textbf{x}_t) = {\textbf{D}}(\textbf{x}_t) \ , \end{aligned}$$
(4)

provides a better approximation to the PDE (1) than the more intuitive numerical scheme:

$$\begin{aligned} \textbf{x}_{t+\Delta t} = \textbf{x}_t + {\textbf{A}} (\textbf{x}_t) \, \Delta t + \sqrt{2\Delta t}\textbf{V}(\textbf{x}_t)\varvec{\,} \xi . \end{aligned}$$
(5)

Both schemes account, in the same way, for the “square root” of the diffusion \(\textbf{D}\) in their corresponding diffusion/noise terms. As we will see in Sect. 3 this is derived in a very natural way.

The SDE (2) is of an Ito-type and, accordingly, the schemes (3) and (5) are derived using the Euler-Maruyama approximation. The less intuitive scheme (3) should not be mistaken for the equivalent scheme of a Stratonovich-type SDE emerging from (2), cf. Evans (2013); it should rather be understood as an Ito-type scheme that is different from (5).

Structurally, the difference of the two schemes (3) and (5) lies in their corresponding drift terms. In particular, in the “corrected” scheme (3) the drift term accounts for both the advection, \(\textbf{A}\), and the diffusion, \(\textbf{D}\), coefficients of the PDE (1). In the scheme (5) the drift term accounts only for the advection term \(\textbf{A}\). This final remark, the one-to-one correspondence between the terms of the scheme (5) and the equation (2), justifies the characterisation of the scheme (5) as “more intuitive”, cf. Evans (2013); Arnold (1974).

In Sect. 3 we derive the scheme (3) from the PDE (1) and in Sect. 4 we compare it numerically with the scheme (5) and the PDE (1).

We, hence, conclude that there is not a one-to-one relation between the advection term of the PDE and the drift term of the SDE—unless the divergence of the diffusion \(\textbf{D}\) vanishes; this is clearly the case when the diffusion \(\textbf{D}\) does not depend on the spatial variable \(\textbf{x}\).

3 Derivation of the SDE schemes

We make the fundamental modeling assumption that a cell can be viewed as a sufficiently small cell-cluster that satisfies the PDE (1). We furthermore assume that, without loss of generality, such a cell-cluster has unit mass i.e.

$$\begin{aligned} \int _{V_{\infty }} u(t,\textbf{x}) \,d\textbf{x}= 1 , \end{aligned}$$
(6)

where \(u(t,\textbf{x})\) is the density and \(V_{\infty }\) the volume of the cell-cluster.

To capture the behaviour of the cell-cluster, and in particular, the position of its centroid and spread, we use the method of moments. Accordingly, the first and second moments of u read respectively

$$\begin{aligned} \varvec{\nu }_1(t) = \int _{V_{\infty }} \textbf{x u}(t,\textbf{x}) \,d\textbf{x}, \end{aligned}$$
(7a)
$$\begin{aligned} \varvec{\nu }_2(t) = \int _{V_{\infty }} \textbf{x}\textbf{x}^T u(t,\textbf{x}) \,d\textbf{x}. \end{aligned}$$
(7b)

Note that \(\varvec{\nu }_1(t)\) is a d-dimensional vector and \(\varvec{\nu }_2(t)\) a \(d\times d\)-dimensional matrix, and they are both understood via their physical rather than probabilistic interpretation. We also consider the second moment about the mean

$$\begin{aligned} {\varvec{N}}(t) = \int _{V_{\infty }} \left( \textbf{x}- \varvec{\nu }_1(t)\right) \left( \textbf{x}- \varvec{\nu }_1(t)\right) ^T u(t,\textbf{x}) \,d\textbf{x}= \varvec{\nu }_2(t) - \varvec{\nu }_1(t) \varvec{\nu }_1^T(t), \end{aligned}$$
(8)

where \({\varvec{N}}(t)\) is \(d\times d\)-dimensional matrix. The first moment \(\varvec{\nu }_1\) in (7a) can be interpreted as the position of the cell-cluster centroid, and, accordingly, it’s (time) rate of change as the velocity of the cell-cluster. On the other hand, the rate of change of the second moment \(\varvec{\nu }_2\) in (7b) represents the spreading of the cell-cluster. These two remarks together allow to (heuristically) identify the drift and diffusion coefficients, \(\textbf{a}\) and \(\textbf{d}\), of the SDE (2) as \(\frac{d\varvec{\nu }_1(t)}{dt}\) and \(\frac{d{\varvec{N}}(t)}{dt}\) respectively.

As a direct consequence of that, we will construct a numerical scheme in line with the classical Euler-Maruyama approximation, cf. Kloeden and Platen (1992), as follows

$$\begin{aligned} \textbf{x}_{t+\Delta t} = \textbf{x}_t + \frac{d\varvec{\nu }_1(t)}{dt} \Delta t + \sqrt{\frac{d{\varvec{N}}(t)}{dt} \Delta t}\, \varvec{\xi }, \end{aligned}$$
(9)

where the square root in the above, should be understood in the usual matrix notation, see e.g. (4).

To this end, we impose the following boundary conditions that indicate an exponential decay of \(u(t,\textbf{x})\) as \(|\textbf{x}|\rightarrow \infty \).

$$\begin{aligned}&\int _{S_{\infty }} e^{{\textbf{z}}^T \textbf{x}} u(t,\textbf{x}) \textbf{A}^T \textbf{n}\,dS = 0 , \end{aligned}$$
(10a)
$$\begin{aligned}&\int _{S_{\infty }} e^{{\textbf{z}}^T \textbf{x}} (\textbf{D}\nabla u(t,\textbf{x}))^T \textbf{n}\,dS = 0 , \end{aligned}$$
(10b)
$$\begin{aligned}&\int _{S_{\infty }} e^{{\textbf{z}}^T \textbf{x}} u(t,\textbf{x}) \textbf{D}\textbf{n}\,dS = 0 , \end{aligned}$$
(10c)

where \(S_{\infty }\) represents the surface of the cell-cluster \(V_{\infty }\) and \(\textbf{z}\) a d-dimensional vector. Through the i-th derivative of (10a)–(10c) with respect to \(\textbf{z}\) and setting \(\textbf{z}= 0\) one can retrieve the boundary conditions for the i-th moment. In our case we only need to differentiate twice since we only need the first two moments.

With all these in mind, we proceed by calculating through (7a) and (8) the time derivatives of \(\varvec{\nu }_1(t)\) and \({\varvec{N}}(t)\). For \(\varvec{\nu }_1\) we multiply (1) by \(x_i\), for \(i = 1, \ldots , n\), integrate over the cell-cluster volume \(V_\infty \), and obtain

$$\begin{aligned}&\int _{V_{\infty }} x_i \bigg [ \frac{\partial u}{\partial t}+\nabla \cdot \left( {\textbf{A}} u\right) - \nabla \cdot \left( {\textbf{D}}\nabla u\right) \bigg ] \,d\textbf{x}= 0 \iff \nonumber \\&\underbrace{\int _{V_{\infty }} x_i \frac{\partial u}{\partial t} \,d\textbf{x}}_{= I_{11}} + \underbrace{\int _{V_{\infty }} x_i \nabla \cdot \left( {\textbf{A}} u\right) \,d\textbf{x}}_{= I_{12}} - \underbrace{\int _{V_{\infty }} x_i \nabla \cdot \left( {\textbf{D}}\nabla u\right) \,d\textbf{x}}_{= I_{13}} = 0 . \end{aligned}$$
(11a)

We work on each of the terms \(I_{11}\), \(I_{12}\), \(I_{13}\) on the left hand side separately; and obtain for \(I_{11}\)

$$\begin{aligned} {I_{11}} = \int _{V_{\infty }} x_i \frac{\partial u}{\partial t} \,d\textbf{x}= \frac{\partial }{\partial t} \int _{V_{\infty }} x_i u \,d\textbf{x}= \frac{d(\varvec{\nu }_1)_i}{d t }. \end{aligned}$$
(11b)

For the second term \(I_{12}\), after invoking the Divergence Theorem and the boundary conditions (10a)–(10c), we obtain

$$\begin{aligned} {I_{12}}&= \int _{V_{\infty }} x_i \nabla \cdot \left( {\textbf{A}} u\right) \,d\textbf{x}= \int _{V_{\infty }} \nabla \cdot \left( {\textbf{A}} u x_i\right) \,d\textbf{x}- \int _{V_{\infty }} A_i u \, d\textbf{x}\nonumber \\&= \int _{S_{\infty }} u x_i {\textbf{A}}^T \textbf{n}\,dS - \int _{V_{\infty }} A_i u \,d\textbf{x}\nonumber \\&{\mathop {=}\limits ^{(10a)}} - \int _{V_{\infty }} A_i u \, d\textbf{x}, \end{aligned}$$
(11c)

where \(A_i\) is the i-th element of the vector \(\textbf{A}\). Similarly, the third term \(I_{13}\) recasts into

$$\begin{aligned} {I_{13}}&= \int _{V_{\infty }} x_i \nabla \cdot \left( \textbf{D}\nabla u\right) \, d\textbf{x}= \int _{V_{\infty }} \nabla \cdot (x_i {\textbf{D}} \nabla u) \, d\textbf{x}- \int _{V_{\infty }} \left( \nabla u\right) ^T {\textbf{D}}_i \, d\textbf{x}\nonumber \\&= \int _{S_{\infty }} x_i ({\textbf{D}} \nabla u)^T {\textbf{n}} \, dS - \int _{V_{\infty }} \nabla \cdot \left( u {\textbf{D}}_i\right) \, d\textbf{x}+ \int _{V_{\infty }} u \nabla \cdot {\textbf{D}}_i \, d\textbf{x}\nonumber \\&{\mathop {=}\limits ^{(10b)}} - \int _{S_{\infty }} u {\textbf{D}}_i^T {\textbf{n}} \, dS + \int _{V_{\infty }} u \nabla \cdot {\textbf{D}}_i \, d\textbf{x}\nonumber \\&{\mathop {=}\limits ^{(10c)}} \int _{V_{\infty }} u \nabla \cdot {\textbf{D}}_i \, d\textbf{x}, \end{aligned}$$
(11d)

where \(\textbf{D}_i\) is the i-th column of the diffusion matrix \(\textbf{D}\). So, by combining (11a) with (11b)–(11d) we obtain the following representation for the rate of change of the i-th (vector) component of the first moment:

$$\begin{aligned} \frac{d(\varvec{\nu }_1)_i}{dt} = \int _{V_{\infty }} (A_i + \nabla \cdot {\textbf{D}}_i) u \, d\textbf{x}. \end{aligned}$$
(12)

In a similar way, we identify the relation satisfied by the \(ij^\text {th}\) element of the second moment \(\varvec{\nu }_2\), which allows us to compute the rate of change of \({\varvec{N}}(t)\) in (8):

$$\begin{aligned} \begin{aligned}&\int _{V_{\infty }} x_i x_j \bigg [ \frac{\partial u}{\partial t}+\nabla \cdot \left( {\textbf{A}} u\right) - \nabla \cdot \left( {\textbf{D}}\nabla u\right) \bigg ] \,d\textbf{x}= 0 \iff \\&\underbrace{\int _{V_{\infty }} x_i x_j \frac{\partial u}{\partial t} \,d\textbf{x}}_{= I_{21}} + \underbrace{\int _{V_{\infty }} x_i x_j \nabla \cdot \left( {\textbf{A}} u\right) \,d\textbf{x}}_{= I_{22}} - \underbrace{\int _{V_{\infty }} x_i x_j \nabla \cdot \left( {\textbf{D}}\nabla u\right) \,d\textbf{x}}_{= I_{23}} = 0. \end{aligned} \end{aligned}$$
(13a)

As with (11a), we employ the Divergence Theorem and the boundary condition (10a)–(10c) and calculate these three integrals one-by-one. The first one, \(I_{21}\), reads

$$\begin{aligned} {I_{21}} = \int _{V_{\infty }} x_i x_j \frac{\partial u}{\partial t} \, d\textbf{x}= \frac{\partial }{\partial t} \int _{V_{\infty }} x_i x_j u \,d\textbf{x}=\frac{d(\varvec{\nu }_2)_{ij}}{dt}, \end{aligned}$$
(13b)

the second integral, \(I_{21}\), recasts into

$$\begin{aligned} {I_{22}} = \int _{V_{\infty }} x_i x_j \nabla \cdot ({\textbf{A}} u) \, d\textbf{x}&= \int _{V_{\infty }} \nabla \cdot (x_i x_j {\textbf{A}} u) \, d\textbf{x}- \int _{V_{\infty }} (x_j A_i + x_i A_j)u \, d\textbf{x}\nonumber \\&= \int _{S_{\infty }} x_i x_j u {\textbf{A}}^T {\textbf{n}} dS - \int _{V_{\infty }} (x_j A_i + x_i A_j) u \,d\textbf{x}\nonumber \\&{\mathop {=}\limits ^{(10a)}} - \int _{V_{\infty }} (x_j A_i + x_i A_j) u \,d\textbf{x}, \end{aligned}$$
(13c)

and for the third term, \(I_{21}\), it holds

$$\begin{aligned} {I_{23}}&= \int _{V_{\infty }} x_i x_j \nabla \cdot ({\textbf{D}} \nabla u) \,d\textbf{x}\nonumber \\&= \int _{V_{\infty }} \nabla \cdot (x_i x_j {\textbf{D}} \nabla u) \,d\textbf{x}- \int _{V_{\infty }} \big [ x_j ({\textbf{D}} \nabla u)_i + x_i({\textbf{D}} \nabla u)_j \big ] \, d\textbf{x}\nonumber \\&= \int _{S_{\infty }} x_i x_j ({\textbf{D}} \nabla u)^T {\textbf{n}} \, dS - \int _{V_{\infty }} \nabla \cdot (u x_j {\textbf{D}}_i + u x_i {\textbf{D}}_j) \, d\textbf{x}\nonumber \\&\qquad + \int _{V_{\infty }} \big [ x_j \nabla \cdot {\textbf{D}}_i + x_i \nabla \cdot {\textbf{D}}_j + 2D_{ij} \big ] u \,d\textbf{x}\nonumber \\&{\mathop {=}\limits ^{(10b)}} - \int _{S_{\infty }} (u x_j {\textbf{D}}_i + u x_i {\textbf{D}}_j)^T {\textbf{n}} \,d\textbf{x}\nonumber \\&\qquad + \int _{V_{\infty }} \big [ x_j \nabla \cdot {\textbf{D}}_i + x_i \nabla \cdot {\textbf{D}}_j + 2D_{ij} \big ] u \,d\textbf{x}\nonumber \\&{\mathop {=}\limits ^{(10c)}} \int _{V_{\infty }} \big [ x_j \nabla \cdot {\textbf{D}}_i + x_i \nabla \cdot {\textbf{D}}_j + 2D_{ij} \big ] u \,d\textbf{x}, \end{aligned}$$
(13d)

where \(D_{ij}\) is the element in the i-th row and j-th column of the matrix \(\textbf{D}\). By combining (13a) with (13b)–(13d), we obtain the following relation for ij-th (matrix) element of the second moment

$$\begin{aligned} \frac{d(\varvec{\nu }_2)_{ij}}{dt} = \int _{V_{\infty }} (x_j A_i + x_i A_j)u \,d\textbf{x}+ \int _{V_{\infty }} \big [ x_j \nabla \cdot {\textbf{D}}_i + x_i \nabla \cdot {\textbf{D}}_j + 2D_{ij} \big ] u \,d\textbf{x}.\nonumber \\ \end{aligned}$$
(14)

Summing up (12) and (14), we deduce the following set of equations in a vector form for any given advection and diffusion terms A and D in (1)

$$\begin{aligned} \frac{d\varvec{\nu }_1}{dt} =&\int _{V_{\infty }} \left( \textbf{A}+ \nabla \cdot {\textbf{D}} \right) u(t,\textbf{x}) \, d\textbf{x}\end{aligned}$$
(15)
$$\begin{aligned} \frac{d\varvec{\nu }_2}{dt} =&\int _{V_{\infty }} \big (\textbf{x}(\textbf{A}+\nabla \cdot {\textbf{D}})^T + (\textbf{A}+ \nabla \cdot {\textbf{D}}) {\textbf{x}}^T + 2{\textbf{D}} \big ) u(t,\textbf{x}) \,d\textbf{x}\end{aligned}$$
(16)
$$\begin{aligned} \frac{d {\varvec{N}}}{dt} =&\int _{V_{\infty }} \big (\textbf{x}(\textbf{A}+\nabla \cdot {\textbf{D}})^T + (\textbf{A}+ \nabla \cdot {\textbf{D}}) {\textbf{x}}^T + 2{\textbf{D}} \big ) u(t,\textbf{x}) \,d\textbf{x}\nonumber \\&- \varvec{\nu }_1(t) \left( \int _{V_{\infty }} (\textbf{A}+ \nabla \cdot {\textbf{D}}) u(t,\textbf{x}) \, d\textbf{x}\right) ^T \nonumber \\&- \left( \int _{V_{\infty }} (\textbf{A}+ \nabla \cdot {\textbf{D}}) u(t,\textbf{x}) \, d\textbf{x}\right) \varvec{\nu }_1^T(t) . \end{aligned}$$
(17)

We retrieve (17) by taking the time derivative of (8) and substituting equations (15) and (16).

Assuming that the mass of the cell-cluster is concentrated, at time t, in a single point \({\textbf{x}}_t\), we can represent the density of the cell-cluster as

$$\begin{aligned}u(t,\textbf{x}) = \delta \left( \textbf{x}- {\textbf{x}}_t\right) , \end{aligned}$$

where \(\delta \) is the Dirac function centered at \({\textbf{x}}_t\). Then the velocity of the cell-cluster in (15), is given as the sum of the advection A and the divergence of the diffusion matrix D at \(\textbf{x}_t\):

$$\begin{aligned} \frac{d\varvec{\nu }_1}{dt} = \textbf{A}(x_t) + \nabla \cdot {\textbf{D}}\vert _{\textbf{x}_t} . \end{aligned}$$
(18)

Accordingly, the rate of change of second moment about the mean in (17) recasts into

$$\begin{aligned} \frac{d {\varvec{N}}}{dt}&= {\textbf{x}}_t (\textbf{A}+\nabla \cdot {\textbf{D}})^T\bigg \vert _{\textbf{x}_t} + (\textbf{A}+ \nabla \cdot {\textbf{D}})\bigg \vert _{\textbf{x}_t} {\textbf{x}}^T_t + 2{\textbf{D}}\bigg \vert _{\textbf{x}_t} \nonumber \\&\qquad \qquad - {\textbf{x}}_t (\textbf{A}+\nabla \cdot {\textbf{D}})^T\bigg \vert _{\textbf{x}_t} - (\textbf{A}+ \nabla \cdot {\textbf{D}})\bigg \vert _{\textbf{x}_t} {\textbf{x}}^T_t \nonumber \\&= 2{\textbf{D}}\bigg \vert _{\textbf{x}_t}, \end{aligned}$$
(19)

which clearly indicates that the rate of spreading of the cell-cluster is given by the diffusion matrix at the point \(\textbf{x}_t\).

We close this section by substituting (18) and (19) into (9) to obtain the corrected numerical scheme

$$\begin{aligned} \textbf{x}_{t+\Delta t} = \textbf{x}_t + \big (\textbf{A}(\textbf{x}_t) +\nabla \cdot \textbf{D}(\textbf{x}_t)\big ) \Delta t + \sqrt{2\Delta t}\textbf{V}(x_t)\varvec{\xi }_2, \end{aligned}$$
(20)

where \({\textbf{V}}(\textbf{x}_t) {\textbf{V}}^T(\textbf{x}_t) = {\textbf{D}}(\textbf{x}_t)\) and where \(\varvec{\xi }_2\) is a vector of d independent and normally distributed random variables with 0 mean and variance 1.

The scheme (20) that we have just derived, is different from the one without the correction of the drift term, i.e. (5), which we repeat here for completion:

$$\begin{aligned} \textbf{x}_{t+\Delta t} = \textbf{x}_t + \textbf{A}(\textbf{x}_t) \Delta t + \sqrt{2\Delta t}\textbf{V}(x_t)\varvec{\xi }_1, \end{aligned}$$
(21)

where, as before, \(\varvec{\xi }_1\) is a vector of independent and normally distributed random variables with mean 0 and variance 1.

In the next section we numerically investigate the two schemes (21) and (20) and provide evidence of their differences and their fitting with the corresponding/underlying PDE (1).

4 Numerical experiments

We have seen in the previous section that the drift term of the SDE scheme (20) incorporates a diffusion-based contribution/correction that is not found in the more intuitive scheme (21). This is a significant difference between the two schemes and is central in our numerical investigations. Namely our aim in this section is to numerically investigate the impact that this correction has on the simulations of these two schemes. To this end, we consider three specific numerical settings that highlight the difference of the two schemes, and compare their predictions with each other and with the corresponding/underlying PDE (1).

In more detail, in Experiment 1, we investigate the behaviour of the two schemes, (21) and (20), on a particular application where the corresponding noise terms \(\varvec{\xi }_1\), \(\varvec{\xi }_2\) are the same. This allows to csompare the two schemes as a result of their differences on the drift terms alone.

In Experiment 2, we consider the same computational setting as in Experiment 1, with the only difference being that the two SDE schemes (21), (20) are augmented with different noise terms \(\varvec{\xi }_1,\) \(\varvec{\xi }_2\). With a large number of scheme realisations we extract information on the full spectrum of differences between the two schemes.

In Experiment 3 we consider a more generic, and common in the cancer invasion modelling literature, experimental setting where the advection and diffusion terms \(\textbf{A}\), \(\textbf{D}\) of (1) depend on the spatial variable \(\textbf{x}\in {\mathbb {R}}^d\) through their dependence on the non-uniform tumour microenvironment \(v:{\mathbb {R}}^d\rightarrow {\mathbb {R}}\). For both SDE schemes (21) and (20), we perform a large number of realisations and compare them with the solution of the underlying PDE (1). We accordingly conclude that the corrected SDE scheme (20) provides a much better approximation to the PDE (1) than the more intuitive SDE scheme (21).

For the numerical solution of the PDE (1) we use a numerical method that was previously developed in Kolbe et al. (2016); Sfakianakis et al. (2017) and which we briefly discuss in the Appendix 1. All algorithm implementations, simulations, and visualisations were conducted in MATLAB (2022).

Experiment 1

In the first experiment, we consider a modelling setting where the advection and diffusion terms, \(\textbf{A}\) and \(\textbf{D}\), of the PDE (1) have an explicit dependence on the space variable \(\textbf{x}\), namely:

$$\begin{aligned} \textbf{A}(\textbf{x}) = \begin{pmatrix}a x_1^7 \\ ax_2^7\end{pmatrix}, \qquad \textbf{D}(\textbf{x}) = \begin{pmatrix} b x_1^7 &{} 0 \\ 0 &{} b x_2^7 \end{pmatrix}, \end{aligned}$$
(22)

where \(\textbf{x} = (x_1, x_2)\in {\mathbb {R}}^2\) and \(a, b\in {\mathbb {R}}\) are constants. Accordingly, the PDE (1) reads as

$$\begin{aligned} \frac{\partial u}{\partial t}+\nabla \cdot \left[ {\begin{pmatrix} a x_1^7 \\ a x_2^7 \end{pmatrix} } u \right] - \nabla \cdot \left[ \begin{pmatrix} b x_1^7 &{} 0 \\ 0 &{} b x_2^7\end{pmatrix} \nabla u\right] =0, \end{aligned}$$
(23)

where \(t \in [0,T]\). Similarly, the SDE schemes (21) and (20) are re-formulated, for

$$\begin{aligned} \textbf{V}(\mathbf {x_t}) = \begin{pmatrix} \sqrt{b x_1^7} &{} 0 \\ 0 &{} \sqrt{b x_2^7} \end{pmatrix}, \end{aligned}$$
(24)

as follows:

$$\begin{aligned} \textbf{x}_{t+\Delta t}&= \textbf{x}_t + {\begin{pmatrix} a x_1^7 \\ a x_2^7 \end{pmatrix}} \Delta t + \sqrt{2\Delta t} \begin{pmatrix} \sqrt{b x_1^7} &{} 0 \\ 0 &{} \sqrt{b x_2^7} \end{pmatrix}\varvec{\xi }_1, \end{aligned}$$
(25)
$$\begin{aligned} \textbf{x}_{t+\Delta t}&= \textbf{x}_t + \left[ {\begin{pmatrix} a x_1^7 \\ a x_2^7 \end{pmatrix}} +\nabla \cdot {\begin{pmatrix} b x_1^7 &{} 0 \\ 0 &{} b x_2^7 \end{pmatrix} }\right] \Delta t + \sqrt{2\Delta t} \begin{pmatrix} \sqrt{b x_1^7} &{} 0 \\ 0 &{} \sqrt{b x_2^7} \end{pmatrix}\varvec{\xi }_2. \end{aligned}$$
(26)

These two SDE schemes are augmented with the same noise terms \(\varvec{\xi }_1\), \(\varvec{\xi }_2\) to allow for a direct, one-to-one, comparison of their realisations.

The parameters used in this numerical experiment can be found in Table 1 and the simulation results are presented in Fig. 1. In more detail, we perform 100 realisations of the two schemes (25), (26)—each pair of experiments with the same noise—starting from the same initial position \((x_1,x_2)=(1,1)\) and running over the time \(t\in [0,1]\). These results show that the realisations of the two schemes appear to be—in a one-to-one conformation—slightly shifted and parallel to each other. More specifically, as can be seen, the corrected scheme (26) introduces an additional displacement of the cell migration track/sample paths directed towards larger values of \(x_1\) and \(x_2\). This is due to the nature and structure of the drift term as well as the diffusion-based correction introduced in (26) and the positivity of the parameter b, cf. Table 1.

Table 1 Parameters used in Experiments 1 and 2
Fig. 1
figure 1

Experiment 1. Showing here the cell migration tracks/sample paths of 100 realisations of the SDE schemes (25), in orange, and (26), in light blue, with the same random noise. The initial position of the cells is set at (1,1) for all realisations and the total travelling time is \(T=1\). Note that the sample paths of the two schemes are parallel shifts of one another. The shift between the two schemes is due to the adaptation of the drift term of the scheme (26) (color figure online)

Fig. 2
figure 2

Experiment 2. Comparison of the migration tracks traversed the cells following the SDE schemes (25) and (26). For both schemes, we considered initial position at (1, 1) and a total travelling time \(T=1\). a, b the full migration tracks of \(10^5\) cells following the SDE scheme (25) and (26) respectively. c superimposing the migration patterns of (a) and (b) clearly reveals that the corrected scheme (26) introduces additional migratory bias

Fig. 3
figure 3

Experiment 2. Showing here the final positions of all cell migration after \(10^5\) realisations of the SDE schemes (25) and (26)—in orange and light blue respectively—as shown in Fig. 2. All realisations start from (1, 1) and run for time \(t\in [0,1]\). The corresponding convex hulls of these final positions are also shown. Note that the adaptation in the drift term of the corrected scheme (26), as opposed to the scheme (25), induces additional migration of the cells and spread of their final positions (color figure online)

Experiment 2

The intuition we have gained from Experiment 1, namely the way the adaptation of the drift term in the corrected scheme (26) affects the distance and direction of the cell migration, can be investigated further by considering independent noise terms for the two schemes. Hence, in the current experiment, we choose independent noise terms \( \varvec{\xi }_1\), \(\varvec{\xi }_2 \) for the two SDE schemes (25) and (26) and perform \(10^5\) new realisations with each, all start from the initial point \((x_1,x_2)=(1,1)\) and running over time \(t\in [0,1]\). We otherwise consider the same setting and parameters as in Experiment 1; the parameters for this experiment can be found in in Table 1.

The simulation results are presented in Figs. 2 and 3. In Fig. 2, in particular, we see that cells migrate further away from the origin and in a more biased fashion when they follow the corrected scheme (26) rather than the scheme (25). This is clearly the result of the additional bias introduced in the drift term of the corrected scheme (26). The qualitative difference between the two schemes can be further seen in Fig. 3 where we present the final positions of the cells at time T of both schemes, along with their corresponding convex hulls.

To measure the quantitative difference of the two schemes (25) and (26), we first measure the average distance traversed by the cells from their initial position until the final time T. In more detail, we perform K realisations of the schemes and calculate, for each one, the distance between the initial and the final position. We then calculate the average distance the cells have traversed by the formula

$$\begin{aligned} d = \frac{1}{K} \sum _{i=1}^{K} \sqrt{(x_{1,T}^i - x_{1,t_0}^i)^2 + (x_{2,T}^i - x_{2,t_0}^i)^2}, \end{aligned}$$
(27)

where \(\left( x_{1,{t_0}}^i, x_{2,{t_0}}^i\right) \) and \(\left( x_{1,{T}}^i, x_{2,{T}}^i\right) \) represent the initial and final positions of the cells in the realisation \(i = 1,\ldots , K\). We apply the above formula for the two schemes (25) and (26), after performing \(K=10^6\) realisations of each, calculate their respective average distances \(d_1\) and \(d_2\) respectively, and evaluate the signed relative difference between the two to obtain

$$\begin{aligned} E = \frac{d_2 - d_1}{d_1}\approx 0.076 \, . \end{aligned}$$
(28)

To further quantify the difference between the two schemes, we perform a uniform binning approach of the final positions of the cancer cells as described in Anderson et al. (2008). Namely, we consider a partition of the \(x_1\)-axis into non-overlapping bins of fixed size r, and assign to each bin the quantity:

$$\begin{aligned} U_i = \sum _{j=1}^{N_i} I_{i,j}, \end{aligned}$$
(29)

where \(N_i\) is the total number of positions \((x_1,x_2)\) in the i-th bin and \(I_{i,j}\) is the j-th element of the set \(I_i = \{ x_2 : \text {for positions } (x_1,x_2) \text { in the } i \text {-th bin} \}\). The quantification was made by computing the \(J_2\) criterion which is defined as follows

$$\begin{aligned} J_2 = \frac{|S_w + S_b |}{S_w}, \end{aligned}$$
(30)

where

$$\begin{aligned}&S_w = \sum _{i=1}^M P_i S_i, \qquad S_b = \sum _{i=1}^M P_i (\varvec{\mu }_i - \varvec{\mu }_0)(\varvec{\mu }_i - \varvec{\mu }_0)^T, \nonumber \\&P_i = \frac{n_i}{K}, \qquad S_i = {\mathbb {E}} \big [ (\textbf{x}- \varvec{\mu }_0)(\textbf{x}- \varvec{\mu }_0)^T \big ], \qquad \varvec{\mu }_0 = \sum _{i=1}^M P_i \varvec{\mu }_i \end{aligned}$$
(31)

where \(n_i\) is the number of positions in the i-th bin, K is the number of realisations, and \(\varvec{\mu }_i\) is the mean value of i-th bin. After computing the \(J_2\) values of both schemes (25) and (26), we compute the signed relative error for different values r of the size of the bins

$$\begin{aligned} {\mathcal {E}}^r = \frac{ J_{2}^{2,r} - J_{2}^{1,r} }{J_{2}^{1,r}} \ , \end{aligned}$$
(32)

where \(J_{2}^{1,r}, J_{2}^{2,r}\) are the values of the \(J_2\)-criterion for schemes (25) and (26) respectively. Larger values of \(J_2\) correspond to better separated data. For the choice of \(r = 0.01 + kh \, , \text { where } h = 0.001 , \text { for } k = 1,\ldots ,60 \, \), we observe that for smaller sizes of the bins we get larger values of \(J_2\) and that \(J_{2}^{1,r} < J_{2}^{2,r}\) for all the values of r. The average value of the \(k-\)different values of \({\mathcal {E}}^r\) is

$$\begin{aligned} {\mathcal {E}}^r \approx 0.0082. \end{aligned}$$
(33)

This result provides a second confirmation that (26) introduces additional migratory bias to the one side of the plane.

Fig. 4
figure 4

Experiment 3. Numerical solution of the PDE (34). a showing here the isolines of the concentration u at time \(t=0\); they serve as initial conditions for the PDE (34). b the structure of the (fixed) extracellular environment v that participates in the advection and diffusion terms of (34); the formula of v is given in (36). c isolines of the solution u of the PDE (34) at the final time \(t=10\); they reveal a higher concentration of the cancer cells in the “valleys” of the (fixed) extracellular environment v shown in b

Experiment 3

In this experiment we consider a PDE that is more common in the field of cancer invasion than the PDE in Eq. (23). Namely, we consider here a model where the growth of the tumour depends on the extracellular environment. This could represent, e.g. an extracellular chemical signal, the density of the extracellular matrix, or a completely different extracellular bio-chemical queue. Still, for the sake of simplicity of presentation, we do not make any particular biological assumptions on the nature of the extracellular environment and rather refer to it simply as “environment”.

Furthermore, we assume that this environment is non-uniform in space, does not change in time, and influences the growth of tumour in a very specific fashion. These assumptions are incorporated in the following model:

$$\begin{aligned} \frac{\partial u}{\partial t}(t,\textbf{x}) - \nabla \cdot \bigg (\nabla v(\textbf{x}) u(t,\textbf{x})\bigg ) - \nabla \cdot \bigg ( \left( v(\textbf{x})-1 \right) \nabla u(t,\textbf{x}) \bigg )=0, \end{aligned}$$
(34)

where \(t\ge 0\), \(\textbf{x}= (x_1,x_2)\in \Omega =\left[ -12,12\right] ^2\), and where \(u:[0,\infty )\times \Omega \rightarrow {\mathbb {R}}\) represents the density of the cancer cells. As previously mentioned, we do not investigate the biomedical realism of this model, nor do we interpret its findings under this light.

The PDE (34) is augmented with the initial condition

$$\begin{aligned} u(0,\textbf{x})=u(0,x_1,x_2) = 3\times 10^5\,e^{-20( x_1^2 + x_2^2 )}, \end{aligned}$$
(35)

for \(\textbf{x}\in \Omega \) as shown in Fig. 4, and zero-Neumann boundary conditions over \(\partial \Omega \). Furthermore, the (fixed) extracellular environment \(v:\Omega \rightarrow {\mathbb {R}}\), shown in Fig. 4, is given by

$$\begin{aligned} v(\textbf{x})=v(x_1,x_2) = 1 + 0.5\left( \sin ^2(x_1) +\cos ^2(x_2) \right) . \end{aligned}$$
(36)

Note that, for v given in (36), the diffusivity \(v(\textbf{x})-1\) of (34) is non-negative for all \(\textbf{x}\in \Omega \).

We plot in Fig. 4 the isolines of the solution of the PDE (34) at the final time \(t=10\). These illustrate clearly a significantly higher concentration of the cancer cells in the “valleys” of the extracellular environment v.

Based on the advection and diffusion terms of the PDE (34), we rewrite the SDE schemes (21) and (20) as follows:

$$\begin{aligned} \textbf{x}_{t+\Delta t}&= \textbf{x}_t - \nabla v(\textbf{x}_t) \Delta t + \sqrt{2\Delta t} \sqrt{v(\textbf{x}_t)-1}\varvec{\xi }_1, \end{aligned}$$
(37a)
$$\begin{aligned} \textbf{x}_{t+\Delta t}&= \textbf{x}_t + \sqrt{2\Delta t} \sqrt{v(\textbf{x}_t)-1}\varvec{\xi }_2. \end{aligned}$$
(37b)

Note that the corrected SDE scheme (37b), which corresponds to the corrected scheme (20), lacks a drift term; this is a result of the particular structure of the advection and diffusion terms of the PDE (34) and the way they are combined in the drift term of the corrected scheme (20).

Fig. 5
figure 5

Experiment 3. Simulations and comparison of \(10^4\) realisations of SDE schemes (37a) and (37b). a, b full set of tracks/sample paths for the more intuitive scheme (37a) and the corrected scheme (37b) respectively. It can be clearly seen that the spread in (37b) is much wider than in (37a); this is justified by the adaptation introduced in the drift term of the corrected scheme (26). This remark is confirmed by superimposing the cell tracks of (a) in (b) in (c). d this shows the final positions of the \(10^4\) realisations of (37a) and (37b), in orange and light blue respectively, along with the convex hulls of the corresponding points. It is clearly seen that the cells concentrate in the “valleys” of the environment v, cf. with the solution of the PDE (34) in Fig. 4, and that the final positions of (37b) (shown in light blue) spread more than those of (37a). The final positions of (37a) are not visible as they are overlapped by the final positions of (37b) (color figure online)

Fig. 6
figure 6

Experiment 3. Qualitative comparison of the numerical solution of the PDE (34) against the predictions of the SDE schemes (37a) and (37b). a isolines of the final time solution u of the PDE (34), cf. Fig. 4, superimposed with \(10^4\) (final time) realisations of the SDE scheme (37a). b isolines of the final time solution u of the PDE (34), cf. Fig. 4, superimposed with \(10^4\) (final time) realisations of the SDE scheme (37b). We note that the cells described by the SDE scheme (37a) are concentrated, at the final time, almost exclusively in the “valleys” of the (fixed) environment v, cf. Fig. 4, much more than cells described by the corrected SDE scheme (37b), and much more than the final time solution u of the PDE (34). The results confirm that the corrected scheme (37b) offers a much better approximation of the PDE (34) than the more more intuitive scheme (37a)

The simulations that we perform for Experiment 3 are shown in Fig. 5 and are similar to the ones for Experiment 2 (cf. Figs. 2 and 3). We note that the cell tracks of the more intuitive scheme (37a) spread out from the origin in a lesser extend than the corrected scheme (37b). This is due to the presence of the drift term in the scheme (37a) which leads almost all cancer cells to high densities of the extracellular environment v. On the other hand, the final positions of the sample paths given by (37b) concentrate less in the “valleys” of the extracellular environment.

In Fig. 6 we present a direct qualitative comparison between the numerical solution of the PDE (34) and the SDE schemes (37a) and (37b). What this figure shows is that, at the final time \(t=10\), the positions of the cells of \(10^4\) realisations of the SDE scheme (37a) are much more concentrated than the corresponding solution of the PDE (34). On the other hand, the distribution at the same final time \(t=10\), of \(10^4\) realisations of the corrected scheme (37b) is more spread out and much closer to the solution of the PDE (34). In effect, these simulation results indicate that the corrected scheme (37b) is a better approximation to the numerical solution of the PDE (34) than the scheme (37a).

Similarly to Experiment 2, we quantify the difference between the two schemes by measuring the corresponding average distances traversed by the cells from their initial position. In more detail, we perform \(K=10^4\) realisations for each scheme, calculate the average distances \(d_1\) and \(d_2\) using the formula (27), and their relative difference via:

$$\begin{aligned} E = \frac{ d_2 - d_1 }{d_1}\approx 0.516 \, . \end{aligned}$$
(38)

To further quantify the difference of the two schemes, we perform a uniform binning and calculate the relative error of the \(J_2\)-criterion for (37a) and (37b) for \(k-\)different values of the size r of the bins as in (32). We noticed that, for smaller values r we have that again \(J_{2}^{1,r} < J_{2}^{2,r}\). We choose \(r = 0.01 + kh\), where \(h = 0.001\), for \(k = 1,...,140\), and calculate the average value of the \(k-\)different values of the signed relative error \({\mathcal {E}}^r\) which is

$$\begin{aligned} {\mathcal {E}}^r \approx 3.0692\,. \end{aligned}$$
(39)

This confirms, yet again, that (37b) brings an additional migratory bias to the movement of solitary cancer cells to the entire plain.

5 Discussion

In this paper we have investigated the long-standing question of bridging the scales in problems of multiscale modelling and simulations. As we are motivated by the study of cancer growth and invasion models, we can rephrase this question as follows: how does one identify the correct SDEs (2) that dictate the migratory pattern of solitary cancer cells or small clusters of cancer cell, when the macroscopic evolution of the cancer cell colony follows the advection-diffusion PDE (1)?

We have exhibited in this paper, that the answer to this question is not trivial. The usual heuristic understanding that the advection and diffusion terms of the PDE (1) are responsible for the biased and random motion of the cancer cells, respectively, is not precise. Were this correct, the drift term in the SDE schemes that describe the migration of the cells in the solitary cell regime would have been solely dependent on the advection term \(\textbf{A}\) of the PDE (1), as e.g. is the case in the numerical scheme (5). We have seen with the derivations of the SDE schemes in Sect. 3 and with the numerical investigations in Sect. 4 that this is not the case. On the contrary, we have shown with (3) that the drift term of the correct SDE schemes should account for the advection \(\textbf{A}\) as well as the divergence of the diffusion \(\textbf{D}\) of the PDE (1) in a very specific way.

When comparing the corrected scheme (3) with the more heuristically expected one (5), we see that their difference depends solely on the divergence of the diffusion \(\textbf{D}\). This clearly indicates that the two schemes would be/are identical in the case of space independent diffusion \(\textbf{D}\). If though, both the advection and diffusion terms depend on the spatial variable—as is typically the case in cancer invasion models—then, identifying the inconsistency of the SDE schemes with the underlying PDE is not trivial and quite often is not readily apparent due to the inherent stochasticity.

From a biological/oncological perspective this is important in determining accurately how far individual cancer cells can penetrate the local tissue, and so from a clinical point of view the model results have much predictive potential. It is known that resection margins (i.e. how much tissue is removed surgically) and the pattern of cancer invasion are predictors both of local recurrence and of survival in patients who undergo surgery (Tam et al. 2022; Lee et al. 2012; Spiro et al. 1999; Yokota et al. 1993; Kim et al. 2014; Tsujitani et al. 1995). With further refinement, accurate parameterisation and testing, using the results from the model would enable quantitative estimates to be made of the likely extent of local spread by an invasive cancer. This would then enable a surgical oncologist to tailor the radicality of surgical excision for a given individual situation. Further, more accurate estimation of metastatic spread (with the associated implications for adjuvant systemic therapy) will also be possible.

Future work in further developing the insights gained from the current modelling will focus on parameterising the model more accurately in order to make quantitative comparison with any available data. A promising avenue of development in this regard, given the difficulty in obtaining in vivo data, is to use in vitro data from organotypic invasion assays (cf. Franssen et al. 2021; Nurmenniemi et al. 2009; Nyström et al. 2005).