Relative distance between tracers as a measure of diffusivity within moving aggregates

Tracking of particles, be it a passive tracer or an actively moving bacterium in the growing bacterial colony, is a powerful technique to probe the physical properties of the environment of the particles. One of the most common measures of particle motion driven by fluctuations and random forces is its diffusivity, which is routinely obtained by measuring the mean squared displacement of the particles. However, often the tracer particles may be moving in a domain or an aggregate which itself experiences some regular or random motion and thus masks the diffusivity of tracers. Here we provide a method for assessing the diffusivity of tracer particles within mobile aggregates by measuring the so-called mean squared relative distance (MSRD) between two tracers. We provide analytical expressions for both the ensemble and time averaged MSRD allowing for direct identification of diffusivities from experimental data.


Introduction
In most cases, living matter is organized in the form of multicellular aggregates, agglomerates consisting of many individual cells. Examples range from microcolonies formed by bacteria [1,2] (see figure 1a) to eukaryotic cells forming aggregates [3,4,5] and tissues [6].
One of the standard ways to experimentally assess the mechanical properties of such agglomerates is by performing particle tracking and analyzing the trajectories of individual cells or embedded passive tracer particles. Similar measurements can be performed on a subcellular level with injected particles or tracing cell organelles as means of quantifiying the physical properties of the cell cytoplasm [7,8]. By assuming a random motion of cells within agglomerates or tracers in the cell cytoplasm one typically measures the mean squared displacement (MSD) and thus gets access to the diffusion constant and the scaling of diffusion.
However, frequently cell aggregates or individual cells exhibit spatial translation and rotation [1,2]. This motion contributes to the MSD of tracers and makes it difficult to disentangle the diffusivity of the tracers.
In this paper we investigate a quantity that enables us to measure the diffusion coefficient of tracers within mobile domains, the so called mean squared relative distance (MSRD). It is similar to the standard MSD except it utilizes the relative distance between two particles. This results in the MSRD, unlike the MSD, being insensitive to the translation and rotation of the domain in which the tracking is happening. The problem of relative diffusion is more than a century old. From classical works of Richardson and Batchelor [9,10,11], to direct applications in biophysical tracking [12], this topic is extensively studied. The prototypical quantity of interest is a vector of relative displacement of two tracers. However, the second moment of the displacement carries information about the initial positions of the particles. Normally, when, for example, measuring two tracers in a turbulent atmosphere that would not pose any particular difficulty. Let us imagine we want to analyze relative diffusion of two tracers in a cloud, which itself is rotated and advected by a larger scale atmospheric currents. In this case, the initial displacement between the particles matters. Rotation of the cloud would lead to changes in the relative displacement even if there is no diffusion inside the cloud. By focusing specifically on the statistics of the absolute distance between tracers we circumvent this issue. Interestingly, although previously introduced at least in some works [13], the statistics of the relative distance has not been studied in detail before.
2 The mean-squared relative distance of two random walkers As an example, let us define the MSRD of two cells within a bacterial colony. Two cells, a and b, at positions r a (t) and r b (t) have the absolute distance This quantity is independent of any translational or rotational motion of the cell aggregate. We define the MSRD, denoted δ(t), as the squared mean of the change of this distance with time t: Here, we average over an ensemble of different realizations of d ab (t) while keeping d ab (0) fixed. An illustration of the quantity δ(t) is shown in figure 1c. For simplicity, but also in agreement with experimental measurements which are often taking place in a single focal plane of a microscope, we will consider a two-dimensional scenario. The generalization to higher dimensions is straightforward. From figure 1c it is easy to see that the MSRD is probably the only measurable quantity similar to the standard MSD but has the advantage of being insensitive to the motion of the cell aggregate as a whole. Our goal is to relate the behavior of δ(t) to the diffusivity of individual cells.
In order to study the behavior of the MSRD, we first simulated the trajectories of pairs of Gaussian random walks with diffusion coefficients D = D a = D b = 0.5 (given as a unitless quantity) and an initial distance d 0 in an unbounded domain (details of the simulations can be found in appendix A). By computing the scalar distance of the two trajectories, defined in equation 1, we can compute the MSRD (see equation 2) in the ensemble average sense (see figure 2a). We observe that the MSRD exhibits two regimes which can be approximated as δ ∼ 4Dt for small (c) Sketch of a simplified two-dimensional aggregate. Two particles are initially separated by a distance d ab (0). Up to a time τ , they perform random motion in a circular domain, see solid lines illustrating the trajectories of particles. However, the aggregate itself rotates (as marked by the black dot on the boundary of the domain) and its center of mass experiences some random motion (grey line). To quantify the diffusivity of particles we follow their absolute relative distance as a function of time d ab (t), which is independent of the motion of the domain. times t, and by δ ∼ 8Dt for large times. The transition point between the regimes depends on the initial distance d 0 with a later transition corresponding to a larger d 0 (see figure 2b). We discuss the origin of the two regimes later in the text where the corresponding analytical expression for the MSRD is derived. In the transition region, the MSRD can be approximated by a power law with δ ∝ t α , α > 1 (see figure 2a). Of particular note, this transient behavior can be misinterpreted as a signature of superdiffusion if the time traces are not long enough to detect the second diffusive regime [14].
As an alternative to the ensemble-average, we computed the MSRD by timeaveraging relative distances for a pair of very long trajectories. The time averaged MSRD δ t is given by Here, for every lag time t, we average the relative distance over all starting points t 0 along the trajectory. Interestingly, we observe in our simulations that the MSRD follows a single scaling δ t ∼ 4Dt and is independent of the initial distance d 0 (see figure 2c). Often, the time-averaging is applied for the estimation of the diffusion coefficient in data where the statistics are not strong enough to deliver a reliable ensemble average. Examples for such cases are usually experiments in which a high effort is required to measure the trajectory of a single tracer, such as the study of the motion of individual tracers within cells or single cells themselves [7,8,15,16,17]. However, as became apparent recently, care should be taken when interpreting the time-averaged data [18].
Differences between time-averaged and ensemble-averaged quantities appear quite frequently, for example, if the tracers of the ensemble are in different dynamic states [12,19], or if the diffusion coefficient is not spatially homogeneous [20]. Additionally, time-averages and ensemble-averages can differ for the case of the so called weak ergodicity breaking which can be linked to power-law distributed waiting times present in a system of interest [21]. Examples of such systems are subdiffusive continuous time random walks and Lévy walks [21,22,23], see also a recent review [18].
In our case, the difference of the ensemble-averaged and the time-averaged MSRD stems from the difference in the initial conditions of the random walks. While we picked the same initial condition d 0 for the computation of the ensemble-averaged MSRD, it follows from the definition of the time-averaged MSRD (see equation 3) that the initial condition is constantly changing. This idea is further supported by performing an additional averaging over the ensemble-averaged MSRD with respect to randomly chosen initial distances d 0 . In this case, we observe that the MSRD shows the same behavior as the time-averaged one (see figure 2d).
Next, we analytically calculate the ensemble-averaged and time-averaged MSRD. This allows us to explain the origin of the two regimes of the ensemble-averaged MSRD and the difference between ensemble-averaged and time-averaged MSRDs. 3 Ensemble-and time-averaged mean squared relative distance In order to calculate the ensemble-averaged MSRD, we first reduce the motion of two particles (denoted by i = a, b) to the effective motion of a single particle. The probability density functions of each particle position, r a and r b , defined in cartesian coordinates (x, y), are given by a Gaussian distribution with the diffusion coefficient D i and the initial position (x i,0 , y i,0 ). The probability density function of the distance vector d ab (t) = r a (t) − r b (t) of these two particles, starting with an initial distance d 0 = |d ab (0)| in y−direction, is then given by This corresponds to the probability density function of a Gaussian random walker with a starting position (0, d 0 ) and a diffusion coefficient of For the particular case of d 0 = 0, the probability density function of the scalar distance d ab of a Gaussian random walk is given by the Rayleigh distribution [24,25], For an arbitrary initial distance d 0 , the probability density function of the scalar distance d ab is given by the Rice distribution [26], where I x is the modified Bessel function of the first kind [27]. The fact that the Rice distribution characterizes the distribution of the relative distance of two normally distributed particles is well known (see [28,29,30]). This equation is frequently used in radar and sonar signal processing. For completeness, we have included the derivation of the Rice Distribution in Appendix B.
In order to compute the MSRD, we calculate the first two moments of this probability density function: Thus, the ensemble-averaged MSRD is given by where d ab (0) = d 0 . Alternatively, one can also compute the MSRD by taking the probability density function of the distance vector between the two particles and computing the mean value of the mean squared scalar distance: In both cases we arrive at the same result. The calculated ensemble-averaged MSRD (equation 10) reproduces the results of the numerical simulations (see figure 2a). We can approximate thus the MSRD agrees with the observed limits (see figure 2a). The two regimes of diffusion exist due to the effect of the initial condition d 0 . For small times, d 0 is much larger than the relative displacement due to tracers' diffusion ∆d ab , d 2 0 ≫ ∆d ab ∼ D ab t. By expanding Eq.(10) in this limit, we can show that the change of the distance d ab for a small displacement ∆d ab is approximated by the projection of ∆d ab in the direction of r a − r b , mimicking a one-dimensional random motion and explaining why the MSRD follows δ ∼ 2D ab t scaling. Later, when the the limit d 2 0 ≪ D ab t is fulfilled, the distance d 0 can be neglected and the distribution of displacements can be approximated by the Rayleigh distribution (see equation 6). In this case, the motion is fully two-dimensional and we get d 2 ab (t) pray ∼ 4D ab t. While d 0 does not affect the scaling behavior for large or small values of the time t (compared to d 2 0 /D ab ), it does determine the transition time between these two limits, as can be seen in figure 2b.
In order to compute the time-averaged MSRD, we average the time-dependent probability density function of the distances (see equation 7) over a time interval [0, T ] and call the resulting probability density functionp rice (see Appendix C for the derivation). Then, we compute the mean value of the ensemble-averaged MSRD for this distribution. The resulting time-averaged MSRD is given by (see Appendix C for its derivation). This result agrees with the behavior observed in simulations (see figure 2c) and is independent of the initial distance d 0 . We can relate this result to the ensemble-averaged MSRD. In the calculation of the running time average, and in the limit of the trajectory length going to infinity, the initial distances between tracers entering averaging also grow infinitely with time. Thus, in the corresponding ensemble-average picture we would be operating in the regime, where the diffusive displacement is much smaller than the initial distance. Hence the time-averaged MSRD has the same asymptotic as the first scaling regime of the ensemble-averaged MSRD. Until now, we neglected the role of boundary conditions while studying the MSRD. In most cases, the cells will move within aggregates that are spatially confined (see figure 1a and 1b). In the next section we consider the effects of a finite domain size.

Effects of the finite domain size
Often, tracers move within domains of finite size, for example individual molecules inside single cells [7,17,31] or individual cells within cell aggregates [1,32]. To account for such boundary effects, we simulated the motion of the two Gaussian random walkers within a circle (see figure 1c) with reflective boundaries (details of the simulations are given in Appendix A). As might be expected, the behavior of the ensemble-averaged and time-averaged MSRD starts to be affected by the boundary when the displacement becomes comparable to the radius R of the circle (see figure 3). For longer times, the MSRD saturates towards the values δ sat,e for the ensemble average and δ sat,t for the time average. The values of δ sat,e and δ sat,t can be estimated analytically (see Appendix D). The saturation values depend on the initial positions of the two particles for the ensemble-average and do not depend on the initial condition for the time average. The transition region towards saturation might be interpreted as a signature of subdiffusion. While subdiffusion might indeed occur in cells as a result of tracer particles being trapped in local environments, it is important to discriminate such behaviors from the effects of the domain (cell) boundary. In that respect, the analytical results of the MSRD saturation values δ sat,e and δ sat,t can provide an estimate on when to expect the influence of boundary effects.

Conclusions
In this paper, we presented a tool to measure the diffusion coefficients of individual tracers within mobile domains. To mitigate the effect of domain movement, we suggest looking at the relative distance between pairs of tracer particles. Therefore, it is required to track the positions of at least two tracers simultaneously. From these data, one can measure the mean-squared relative distance of a pair of particles. The MSRD enables us to quantify the sum of the diffusion coefficients of motile cells within biological aggregates (for example bacterial microcolonies, cell spheroids or tissues), independent of translations and rotations of the agglomerates. Under the assumption of identical diffusivities of cells, this can be directly translated into the characteristics of the individual cells.
In order to compute not just the sum of the diffusion coefficients, but their values, it is necessary to track not just two but three cells (a, b, c) simultaneously. By computing the three sums D a + D b , D a + D c and D b + D c it is possible to estimate the individual diffusion coefficients D a , D b and D c of the individual cells. The method based on MSRD measurements can, with some limitations, be used for non-uniform diffusivities, where, for example, the diffusion constant is a function of the distance from the center of the aggregate [1].
In this manuscript, we have shown that even in the simplest case of normal diffusion, analysis of the MSRD can exhibit some non-trivial behavior. The apparent diffusion constant read out from the ensemble-averaged MSRD may differ by a factor of 2 or even look like a superdiffusion in a transient regime. There is also a factor 2 difference in the long time scaling of the ensemble and time-averaged values of diffusion constant. We linked these differences to the initial separation between the tracers. Moreover, the domain size may lead to the saturation of the measured MSRD as a function of time. Our analytical results provide the guidelines for how the diffusivity of particles can be reliably extracted from the tracking data. This approach is viable for generalizing to anomalous (subdiffusion) and heterogeneous diffusion, which are both frequently encountered in biological settings [8,33,34]. There are differences between the time-averages and ensemble averages which contrasts the case of normal diffusion. These differences are rooted in the underlying transport mechanisms, more robust and thus might be diagnostically relevant.

A Simulation Details
In order to compute the ensemble-averaged MSRD, we simulated 2000 pairs of random walks in two dimensions. The walkers possessed a normally distributed step length and a constant step time, corresponding to a diffusion coefficient D = 0.5. In figure 2d we studied the ensemble-averaged MSRD for random initial conditions. Therefore, the start points of the random walks where chosen independently of each other such that they were randomly distributed within a circle of radius R = 1000.
In order to compute the time-averaged MSRD, we simulated two trajectories consisting of 10 7 individual steps, with similar properties as the trajectories simulated for the ensemble-average.
To consider the effect of boundaries, we included reflective boundary conditions relative to a circle of radius R = 200. The initial positions of the random walks were chosen such that the first walker starts at the center of the circle and the second one starts at a distance d 0 = 4 from the first one.

B Derivation of the Rice distribution
The Rice distribution can be derived by considering the probability density function of the scalar distance of a random walk from the origin of the coordinate system, given in equation 5. By computing the distribution of the distances d ab and transforming the integral to polar coordinates, one can compute the distribution, which is defined to be the Rice distribution: Here, I 0 is the modified Bessel function of the first kind [27].

C Time-averaged MSRD
The probability density function of the time-dependent scalar distance d(t) of two particles, performing a Gaussian random walk, is given by the Rice distribution (see equation 7). In order to compute the time averaged mean squared displacements of d(t), we take this time-dependent distribution of distances and compute its mean over a time interval t ∈ [0, T ], given byp The resulting equation, calledp rice , represents the probability density function of having two particles with a distance d at some point in the given time interval. t ∈ [0, T ] By performing the substitution u = t/T , this equation takes the form The equation under the integral, is uniformly convergent [35], so that we can exchange the limit and the integral. For T → ∞ the function within the integral converges lim T →∞ w(T ) = 0.
This tells us that for an infinitely long time interval t ∈ [0, ∞] all distances d are equally likely and that there are no memory effects of the initial distance d 0 .
In the next step we compute the mean of the MSRD δ(d, τ ) for such an uniform distribution in the time interval t ∈ [0, g] and then compute the limit g → ∞. This is given by δ t (τ ) = lim g→∞ g 0 dd δ(d, τ ) g = 2D ab τ.
Again, we used a substitution of the form u = d/g to simplify the integral. This is the time-averaged MSRD.

D Additional calculations for radial boundary conditions
For large times, the positions of two particles (i = a, b) within a circle of radius R are uncorrelated and homogeneously distributed The probability density function for two independent particles is then given by p ab,circ (r a , r b ) = p circ (r a )p circ (r b ).
In order to compute the ensemble averaged saturation value of the MSRD, δ sat,e , we define the distance of the two particles in polar coordinates (R 1 , φ 1 ) and (R 2 , φ 2 ) so that δ sat,e (R, d 0 ) = (d ab (t) − d ab (0)) 2 p ab,circ where f (R) = 8 and with E(x) being the complete elliptic integral of x. Here, we define d ab (0) = d 0 . This integral can be computed numerically. The time-averaged saturation value of the MSRD is calculated by assuming that the initial positions of the particles are homogeneously distributed within the circle area A and thus we integrate d 0 = |r 0,a − r 0,b | over all positions of the two particles within the circle: Again, the resulting value of the MSRD δ sat,t can be computed numerically.