Reconstructing the Intrinsic Statistical Properties of Intermittent Locomotion Through Corrections for Boundary Effects

Locomotion characteristics are often recorded within bounded spaces, a constraint which introduces geometry-specific biases and potentially complicates the inference of behavioural features from empirical observations. We describe how statistical properties of an uncorrelated random walk, namely the steady-state stopping location probability density and the empirical step probability density, are affected by enclosure in a bounded space. The random walk here is considered as a null model for an organism moving intermittently in such a space, that is, the points represent stopping locations and the step is the displacement between them. Closed-form expressions are derived for motion in one dimension and simple two-dimensional geometries, in addition to an implicit expression for arbitrary (convex) geometries. For the particular choice of no-go boundary conditions, we demonstrate that the empirical step distribution is related to the intrinsic step distribution, i.e. the one we would observe in unbounded space, via a multiplicative transformation dependent solely on the boundary geometry. This conclusion allows in practice for the compensation of boundary effects and the reconstruction of the intrinsic step distribution from empirical observations.


Introduction
Recent theoretical work on animal movement focuses attention on boundary effects at different scales, for example, population edge effects where two habitats conjoin, random walks in confined spaces (Bearup and Petrovskii 2015), insects following walls in circular experimental arenas (Jeanson et al. 2003), and the efficiency of catching ground-dwelling arthropods depending on trap shape (Ahmed and Petrovskii 2019). When the movement range of the studied organism is of the same order as the arena size, the boundary could affect many of the recorded locomotion characteristics. Such external bias requires theoretical models to provide expected values for intrinsic locomotion characteristics under the null hypothesis that the individual's movement follows a simple random walk. Failure to compare the empirical values against those expected under the correct null hypothesis could lead to wrong conclusions. For example, an organism might appear to have an intrinsic movement bias when, in fact, it follows a simple random walk.
In studies of anxiety levels in animals such as mice, for instance, it is common practice to measure locomotion on a grid in terms of displacements and their respective frequencies (Michel and Tirelli 2002). To reduce the interference of the experimenter, these tests tend to be conducted in enclosures such as cages and monitored using cameras (Kas and Olivier 2008). Locomotion characteristics are also among the main measurements of explorative behaviour across animal species, and these are often recorded within bounded spaces (Russell et al. 2010; von Merten and Siemers 2012; Degen et al. 2015). However, as demonstrated in a recent paper by some of the present authors (Christensen et al. 2020), confining the space available to the test subject affects the statistical properties of the movement in a non-trivial, geometry-specific way, which can lead to erroneous conclusions about the existence of an inherent bias. Similar effects were observed, for example, in an experiment by Mallapur et al. (2009), who found that a decrease in the enclosure size of domestic fowl leads to a decrease in the mean and maximum step length as well as the total distance travelled. Even in the wild, many species are observed to move within ranges or home territories (Giuggioli et al. 2011; Potts and Lewis 2014; Riotte-Lambert et al. 2015). In theory, the relevance of boundary effects could be diminished by selecting data produced 'sufficiently far' from the boundaries. However, this is not always possible, and the difficulty of estimating a suitable cut-off distance will inevitably introduce errors of unknown size into the statistics.
These effects are not limited to ecology and the study of animal behaviour but appear in a wide range of contexts. A relevant example from biophysics is the motion of molecules undergoing totally confined diffusion in cell membranes. These can be modelled as random walks following a symmetric Gaussian displacement distribution confined within a cuboid. A study by Ritchie et al. (2005) found that increasing the time span over which the molecule's displacements were averaged resulted in an increasingly peaked and circular distribution in position probabilities. This observation demonstrates that the interplay of dynamics and geometry affects the ability to infer the shape of the enclosing geometry solely from the time averaged position probability.
In the present study, we establish a firm theoretical framework for analysing the effect of an arbitrary boundary on the empirical probability density of the stopping location and step of an uncorrelated random walk (Sect. 2). Our inspiration originates in intermittent locomotion where the stopping locations are the turning points and the steps are the displacements between them. Due to their special role in locomotion, the distribution of stopping locations is not necessarily the same as the distribution of locations. For the particular choice of no-go boundary conditions, our results show how to compensate for these boundary effects and provide a procedure for the reconstruction of the unconstrained step distribution from the empirically observed one (Sect. 3). Extensions to the more realistic case of a correlated random walk are beyond the scope of the present work; however, as shown in Sects. 2.2.1 and 2.2.2, this may not be needed to understand the aforementioned observations (Mallapur et al. 2009; Ritchie et al. 2005), whose qualitative features already arise when considering uncorrelated random walks in a bounded space. This suggests that the results for the uncorrelated random walk could be used to provide estimates of boundary effects on a correlated random walk.

Uncorrelated Random Walk
In this study, we consider a symmetric Markovian random walk, which can be interpreted as a null model for intermittent locomotion (Kramer and McLaughlin 2001), also referred to as a saltatory pattern (O'Brien et al. 1990), of an individual organism with no bodily features and no underlying decision-making capacity (Schwartz 2016). A clearly defined null model is of the utmost importance in the identification of relevant (e.g. behavioural) features from empirical data, as it provides an explicit term of comparison. Since the observables of interest are static, we henceforth ignore any temporal aspect of the motion, thus reducing the random walk to a sequence of stopping locations. A sample sequence of consecutive stopping locations can be generated via the following procedure:
1. Initiate the walker at location $\mathbf{r}_{i=0}$ chosen uniformly at random within the bounded domain $\Omega$.
2. Pick a step $\delta\mathbf{r}$ with probability $f_i(\delta\mathbf{r})$, where $f_i(\delta\mathbf{r})$ denotes the probability of observing a step $\delta\mathbf{r}$ in unbounded space. For the purpose of numerical simulations, this probability distribution is assumed to be known. However, as we demonstrate in Sect. 2, extracting it from experimental observations is not trivial.
3. If $\mathbf{r}_i + \delta\mathbf{r} \in \Omega$, the random walker moves to a new location $\mathbf{r}_{i+1} = \mathbf{r}_i + \delta\mathbf{r}$. Otherwise, a new step is determined consistently with the particular choice of boundary conditions. This might require re-sampling the probability density $f_i(\delta\mathbf{r})$, e.g. in the case of no-go boundary conditions.
4. Steps 2-3 are repeated $N$ times. Time series of stopping location coordinates and steps are recorded.
Relevant observables can be extracted from the recorded sequence. Steps 1-4 are then repeated $M$ times and ensemble averages are calculated. It is understood that time averages converge to ensemble averages for large enough $N$, due to the ergodic nature of the process. Finally, it is important to notice that for a fixed functional form of $f_i(\delta\mathbf{r})$, the statistical properties of the random walk depend solely on the dimensionless ratio of the system size and some relevant length scale of the step distribution (e.g. its standard deviation). In other words, they are independent of our choice of units. Hence, we are free to fix the system size while varying $f_i(\delta\mathbf{r})$ without loss of generality.
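The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' simulation code: the names `simulate_no_go_walk`, `unit_square` and `gaussian_step`, the Gaussian form of the intrinsic step density, its width of 0.3 and the unit-square domain are all assumptions made for the example, and only the no-go rule is implemented.

```python
import math
import random


def simulate_no_go_walk(n_steps, sample_step, in_domain, start, rng):
    """Generate one sample path with no-go boundary conditions.

    Returns the list of stopping locations (n_steps + 1 entries) and the
    list of accepted steps (n_steps entries).
    """
    location = start
    locations = [location]
    steps = []
    while len(steps) < n_steps:
        step = sample_step(rng)  # step 2: draw a step from the intrinsic density
        candidate = tuple(c + d for c, d in zip(location, step))
        if in_domain(candidate):  # step 3: accept only if the walker stays inside
            location = candidate
            locations.append(location)
            steps.append(step)
        # otherwise the step is rejected and a new one is drawn (no-go rule)
    return locations, steps


def unit_square(point):
    """Indicator of the unit square [0, 1] x [0, 1] (assumed example domain)."""
    return all(0.0 <= c <= 1.0 for c in point)


def gaussian_step(rng, sigma=0.3):
    """Rotationally symmetric Gaussian step (assumed intrinsic density)."""
    return (rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


rng = random.Random(1)
start = (rng.random(), rng.random())  # step 1: uniform initial location
locations, steps = simulate_no_go_walk(1000, gaussian_step, unit_square, start, rng)
mean_magnitude = sum(math.hypot(*s) for s in steps) / len(steps)
```

An ensemble average over M sample paths would be obtained by repeating the call with independent seeds and averaging the recorded observables.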

Step and Stopping Location Probability Density
The most straightforward statistical analyses of intermittent locomotion are the determination of the probability density functions of the empirical stopping location and displacement. The latter is also known as the movement kernel (Avgar et al. 2016). Indeed, both functions are of immediate interest for application in ecology as they are often considered to convey information about dispersal and foraging strategies (Clobert et al. 2012; Okubo and Levin 2001; Turchin 2015; Benhamou 2014; Lepš 1981). However, the boundary which is usually present in experiments affects the empirical observations in a non-trivial manner. This makes it difficult to disentangle the intrinsic motion of the test subject from these boundary effects. Assuming that the intrinsic properties of a subject's locomotion only depend on its internal state (Bartumeus 2009; Maye et al. 2007; Anteneodo and Chialvo 2009) and not on the subject's location within the bounded domain, it is reasonable to disregard the information about the starting point of each step and to consider only the location-averaged form of the step distributions. Sometimes this approach may be necessitated by data which only informs about the displacements or their magnitudes. Consequently, one must first find general expressions for the stopping location probability. For simplicity, the effects of enclosure in a confined space upon the random walker's empirical displacement probability density function are first investigated in one dimension (Sect. 2.1) before extending the results to arbitrary convex domains in two dimensions (Sect. 2.2, where we also briefly discuss convex domains of arbitrary dimension). For practical reasons, most experiments involving motion in enclosed spaces take place in either circular or rectangular geometries, such as a Petri dish or a cage. Consequently, Sects. 2.2.1 and 2.2.2 address these particular two-dimensional geometries explicitly.

One-dimensional Case
We will use the phrase "step of $\ell$", where $\ell \in \mathbb{R}$, to indicate a displacement between stopping locations of magnitude $|\ell|$. We say that the step is to the right (resp. left) if $\ell > 0$ (resp. $\ell < 0$). We denote by $f_i(\ell)$ the probability density function of a step $\ell$ on the whole real line, which we assume to be independent of the starting point and symmetric, $f_i(\ell) = f_i(-\ell)$. Thus, we think of $f_i(\ell)$ as an intrinsic, or internal, characteristic of the intermittent locomotion, as opposed to the extrinsic, or external, features introduced by the interaction with the boundary. In unbounded space, $f_i(\ell)$ is the limiting form of the empirical histogram of the observed steps and can therefore be reconstructed directly from experimental data.
In bounded space, however, the probability density of a step differs from $f_i(\ell)$ in a way that depends on the boundary condition. Here, we follow the treatment of Bearup and Petrovskii (2015) and consider three types of boundary conditions: a no-go boundary, where steps extending outside of the domain are rejected; a stop-go boundary, where such steps are terminated at the boundary; and a reflecting boundary, where the portion of such steps extending beyond the domain boundary is reflected back into the domain. A schematic is provided in Fig. 1. Given a starting point at location $x \in [0, 1]$, the probability density function of the next step being a step of $\ell$ terminating within the domain $[0, 1]$ is then
$$P(\ell|x) = \begin{cases} \dfrac{f_i(\ell)\,\theta(x+\ell)\,\theta(1-x-\ell)}{N(x)} & \text{(no-go)} \\[2ex] f_i(\ell)\,\theta(x+\ell)\,\theta(1-x-\ell) + \delta(\ell+x)\displaystyle\int_{-\infty}^{-x} f_i(\ell')\,d\ell' + \delta(\ell-1+x)\displaystyle\int_{1-x}^{\infty} f_i(\ell')\,d\ell' & \text{(stop-go)} \\[2ex] \big[f_i(\ell) + f_i(-2x-\ell) + f_i(2(1-x)-\ell)\big]\,\theta(x+\ell)\,\theta(1-x-\ell) + \text{h.o.t.} & \text{(reflecting)} \end{cases} \tag{1}$$
where $\theta$ is the Heaviside step function and $\delta$ is the Dirac delta function. The location-dependent normalisation factor $N(x)$ appearing in the no-go case reads
$$N(x) = \int_{-x}^{1-x} f_i(\ell)\,d\ell . \tag{2}$$
The higher order terms, h.o.t. in Eq. (1), entering the conditional probability for the reflecting boundary condition account for cases where multiple reflections occur in a single burst of motion between stopping events. They read
$$\text{h.o.t.} = \theta(x+\ell)\,\theta(1-x-\ell)\Big[\sum_{n \neq 0} f_i(\ell+2n) + \sum_{n \notin \{0,1\}} f_i(2n-2x-\ell)\Big], \tag{3}$$
where $n$ runs over the integers, and vanish if the support of the intrinsic step probability $f_i(\ell)$ is a subset of $(-1, 1)$, since a step shorter than the domain can be reflected at most once. In a realistic setting, where the studied organism is not in distress, multiple reflections are expected to play a negligible role. The probability density of a step of $\ell$ starting and ending in $[0, 1]$, which we denote $f_t(\ell)$ and will henceforth refer to as the 'transformed' probability, is then obtained by integrating Eq. (1) over the starting location $x \in [0, 1]$ with some measure $d\mu(x)$,
$$f_t(\ell) = \int_0^1 P(\ell|x)\,d\mu(x) . \tag{4}$$
It corresponds to the limiting form of the empirical histogram of the observed steps in bounded space. In the present context, it is most meaningful to assume that the measure $d\mu(x)$ corresponds to the steady-state stopping location probability density, which we denote $g(x)$, whence $d\mu(x) = g(x)\,dx$. The probability density $g(x)$, which depends both on $f_i(\ell)$ and on the choice of boundary conditions, can be found by solving a homogeneous Fredholm equation of the second kind of the form
$$g(x) = \int_0^1 P(x-x'|x')\,g(x')\,dx' \tag{5}$$
with $P(x-x'|x')$ the conditional probability densities given in Eq. (1).
Fig. 1 … with $a = 0.75$ and starting location $x = 0.4$ (black dashed), which we compare to the conditional step probability density for each boundary condition (blue solid, shaded). Unlike the intrinsic step probability $f_i(\ell)$, the conditional probability $P(\ell|x)$ is not symmetric in general. The top part of the schematic is adapted from Bearup and Petrovskii (2015)
For the no-go boundary condition, one can check by substitution that the integral equation (5) is solved by the ansatz $g(x) \propto N(x)$. For this particular boundary condition, the expression for the transformed probability density function simplifies dramatically,
$$f_t(\ell) = \frac{f_i(\ell)\,h_{1D}(\ell)}{\int_{-1}^{1} f_i(\ell')\,h_{1D}(\ell')\,d\ell'} . \tag{6}$$
This is a remarkable result as it indicates that, for no-go boundaries, the mapping between the intrinsic step probability $f_i(\ell)$ and the empirically observed (transformed) step probability $f_t(\ell)$ is multiplicative and, up to normalisation, independent of the specific choice of $f_i(\ell)$. In particular, it only depends on the geometry of the bounded space through
$$h_{1D}(\ell) = \int_0^1 \theta(x+\ell)\,\theta(1-x-\ell)\,dx = (1-|\ell|)\,\theta(1-|\ell|), \tag{7}$$
shown in Fig. 2, which we will henceforth refer to as the 'shaper' function. We will see that this result extends straightforwardly to higher dimensions. Notably, the variance of the transformed step probability density, $\sigma_t^2$, is always smaller than that of the corresponding probability density in unbounded space, $\sigma_i^2$,
\begin{align}
\sigma_i^2 &= \int_{-\infty}^{\infty} \ell^2 f_i(\ell)\,d\ell \tag{8} \\
&\geq \frac{\int_0^1 \ell^2 f_i(\ell)\,d\ell}{\int_0^1 f_i(\ell)\,d\ell} \tag{9} \\
&> \frac{\int_0^1 \ell^2 f_i(\ell)\,h_{1D}(\ell)\,d\ell}{\int_0^1 f_i(\ell)\,h_{1D}(\ell)\,d\ell} = \sigma_t^2, \tag{10}
\end{align}
where to go from Eq. (8) to (9) we have reduced the support of $f_i(\ell)$ to $[0, 1]$ (normalising the resulting probability density accordingly) and the inequality relating Eqs. (9) and (10) follows from the shaper function being monotonically decreasing in $|\ell|$ (see "Appendix A").
Fig. 2 Shaper function for a simple random walk on the unit segment, $x \in [0, 1]$, with no-go boundary conditions. It enters the expression for the transformed one-dimensional step probability density as a multiplicative factor in Eq. (6). This shaper function linearly suppresses the intrinsic step probability for increasing step magnitude and vanishes for step magnitudes larger than or equal to the system size
For the case of reflecting boundaries, one can check by substitution that Eq. (5) is solved by the uniform probability density $g(x) = 1$, independently of $f_i(\ell)$. Substituting into Eq. (4) we find
$$f_t(\ell) = \theta(1-|\ell|)\Big[(1-|\ell|)\,f_i(\ell) + \int_{|\ell|}^{2-|\ell|} f_i(\ell')\,d\ell'\Big] + \int_0^1 \text{h.o.t.}\;dx, \tag{11}$$
which lacks the simple structure that was observed for no-go boundaries.
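Finally, for the case of stop-go boundaries, Eq. (5) for the steady-state stopping location probability density cannot in general be solved in closed form. To proceed further, we assume that $g(x) = 1$, corresponding to an experiment where the subject is placed uniformly at random within the bounded domain and only statistics of the first step are collected. Substituting the relevant form of Eq. (1) into Eq. (4) produces
\begin{align}
f_t(\ell) &= (1-|\ell|)\,\theta(1-|\ell|)\,f_i(\ell) + \theta(-\ell)\,\theta(1+\ell)\int_{-\infty}^{\ell} f_i(\ell')\,d\ell' + \theta(\ell)\,\theta(1-\ell)\int_{\ell}^{\infty} f_i(\ell')\,d\ell' \tag{12} \\
&= \theta(1-|\ell|)\Big[(1-|\ell|)\,f_i(\ell) + \int_{|\ell|}^{\infty} f_i(\ell')\,d\ell'\Big] \tag{13}
\end{align}
for the stop-go boundary, where in going from Eq. (12) to Eq. (13) we have used $f_i(\ell) = f_i(-\ell)$ to express the transformed step probability in a more symmetric fashion. Once again, the expression for the transformed step probability lacks the simple structure observed in the no-go case.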
While the precise form of the transformed step probability density function depends on the combination of $f_i(\ell)$ and boundary conditions, Eqs. (6), (11), and (12) share the property that $f_t(\ell)$ vanishes for $|\ell| > 1$, reflecting the impossibility for the random walker to exit the domain $[0, 1]$. In the following, we will focus solely on the case of no-go boundary conditions, which we consider compatible with the behaviour of animals that have habituated to their environment: for such animals, we expect neither an increased probability of stopping at the boundary (the 'stop-go' boundary condition) nor a simple reflection off the boundary (the 'reflecting' boundary condition).
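The multiplicative relation for no-go boundaries on the unit segment can be checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' code: the intrinsic step density is taken to be uniform on [-A, A] with A = 0.75, and the function names are hypothetical. It compares the second moment of the steps accepted in a simulated no-go walk with the second moment predicted from the product of the intrinsic density and the linear shaper function.

```python
import random

A = 0.75  # half-width of the assumed uniform intrinsic step density


def f_i(step):
    """Assumed intrinsic density: uniform on [-A, A]."""
    return 1.0 / (2.0 * A) if abs(step) <= A else 0.0


def shaper_1d(step):
    """Shaper function of the unit segment: 1 - |step| inside (-1, 1), else 0."""
    return max(0.0, 1.0 - abs(step))


def predicted_second_moment(n_grid=20000):
    """Second moment of f_t, proportional to f_i * shaper, by the midpoint rule."""
    width = 2.0 / n_grid
    numerator = denominator = 0.0
    for k in range(n_grid):
        step = -1.0 + (k + 0.5) * width
        weight = f_i(step) * shaper_1d(step)
        numerator += step * step * weight
        denominator += weight
    return numerator / denominator


def simulated_second_moment(n_steps, rng):
    """Second moment of the accepted steps of a no-go walk on [0, 1]."""
    x = rng.random()
    total = 0.0
    accepted = 0
    while accepted < n_steps:
        step = rng.uniform(-A, A)
        if 0.0 <= x + step <= 1.0:  # no-go rule: reject steps leaving [0, 1]
            x += step
            total += step * step
            accepted += 1
    return total / n_steps


intrinsic = A * A / 3.0  # variance of the uniform density on [-A, A]
predicted = predicted_second_moment()
simulated = simulated_second_moment(200000, random.Random(7))
```

Under these assumptions, `predicted` and `simulated` should agree within sampling error, and both should lie below `intrinsic`, in line with the variance reduction discussed above.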

Two-dimensional Case (Convex Domain)
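Here, we extend the analysis presented in Sect. 2.1 to intermittent locomotion in two-dimensional bounded space, focusing solely on the case of no-go boundary conditions. Displacements between successive stopping locations will be denoted $\boldsymbol{\ell} = (\ell_x, \ell_y)$, with $\boldsymbol{\ell} \in \mathbb{R}^2$. The probability density function of $\boldsymbol{\ell}$ on the real plane is denoted $f_i(\ell_x, \ell_y)$ and is assumed to be rotationally symmetric,
$$f_i(\ell_x, \ell_y) = f_i(\ell_x\cos\theta - \ell_y\sin\theta,\; \ell_x\sin\theta + \ell_y\cos\theta) \tag{14}$$
for all $\theta \in [0, 2\pi)$. Now let $\Omega$ be a bounded convex domain and $I_\Omega(x, y)$ be the indicator function taking the value 1 if $(x, y) \in \Omega$ and 0 otherwise. For a convex polygon with $N$ sides, such that each side $i \in \{1, \ldots, N\}$ is given by the equation $y = a_i x + c_i$, the indicator function $I_\Omega(x, y)$ is the following product of Heaviside step functions
$$I_\Omega(x, y) = \prod_{i=1}^{N} \theta\big(q(i)\,J_i(x, y)\big), \tag{15}$$
where $q(i) = 1$ (resp. $q(i) = -1$) if the polygon is above (resp. below) the line $y = a_i x + c_i$ and $J_i(x, y) = y - a_i x - c_i$. In bounded space and with the specific choice of no-go boundary conditions, the probability density function of the next step being a step of $\boldsymbol{\ell}$ terminating within $\Omega$ given a starting point at location $\mathbf{r} = (x, y)$ is
$$P(\boldsymbol{\ell}|\mathbf{r}) = \frac{f_i(\ell_x, \ell_y)\,I_\Omega(x+\ell_x, y+\ell_y)}{N(\mathbf{r})} \tag{16}$$
with the normalisation factor
$$N(\mathbf{r}) = \int_{\mathbb{R}^2} f_i(\ell_x, \ell_y)\,I_\Omega(x+\ell_x, y+\ell_y)\,d\ell_x\,d\ell_y . \tag{17}$$
The reason for imposing that $\Omega$ is convex is to avoid ambiguities when both the starting and finishing points are within $\Omega$ but some convex combination of the two is not. The probability density of a step $\boldsymbol{\ell}$ starting and ending in $\Omega$, which we denote $f_t(\ell_x, \ell_y)$, is then obtained by integrating Eq. (16) over the starting location $\mathbf{r} \in \Omega$ with some measure $d\mu(\mathbf{r})$,
$$f_t(\ell_x, \ell_y) = \int_\Omega P(\boldsymbol{\ell}|\mathbf{r})\,d\mu(\mathbf{r}) . \tag{18}$$
Similar to the one-dimensional case, we argue that this measure should correspond to the steady-state stopping location probability density function $g(\mathbf{r})$, which is defined as the solution of the integral equation
$$g(\mathbf{r}) = \int_\Omega P(\mathbf{r}-\mathbf{r}'|\mathbf{r}')\,g(\mathbf{r}')\,d\mathbf{r}' . \tag{19}$$
One can check by substitution that Eq. (19) is solved by the ansatz $g(\mathbf{r}) \propto N(\mathbf{r})$, whence
$$f_t(\ell_x, \ell_y) = \frac{f_i(\ell_x, \ell_y)\,h_{2D}(\ell_x, \ell_y)}{\int_{\mathbb{R}^2} f_i(\ell'_x, \ell'_y)\,h_{2D}(\ell'_x, \ell'_y)\,d\ell'_x\,d\ell'_y} . \tag{20}$$
The mapping between the intrinsic and transformed step probability density for the case of no-go boundary conditions thus has the same structure as seen in the one-dimensional case, namely that of a multiplication by a geometry-specific shaper function of the form
$$h_{2D}(\ell_x, \ell_y) = \int_{\mathbb{R}^2} I_\Omega(x, y)\,I_\Omega(x+\ell_x, y+\ell_y)\,dx\,dy . \tag{21}$$
Assuming that $\Omega$ is convex, the shaper function Eq. (21) is monotonically decreasing in $|\boldsymbol{\ell}|$, the magnitude of the step (see "Appendix B"). Consequently, the variance of $|\boldsymbol{\ell}|$ under the transformed probability density $f_t(\ell_x, \ell_y)$ is always smaller than that under the corresponding probability density $f_i(\ell_x, \ell_y)$ in unbounded space. This can be shown along the lines of Eqs. (8)-(10).
The construction outlined in this section can be extended straightforwardly to (convex) domains of arbitrary dimensionality. With $\Omega$ a bounded $d$-dimensional convex domain and $I_\Omega(\mathbf{r})$ the associated indicator function of $\mathbf{r} \in \mathbb{R}^d$, one arrives at the now familiar multiplicative relation
$$f_t(\boldsymbol{\ell}) \propto f_i(\boldsymbol{\ell})\,h_{dD}(\boldsymbol{\ell}), \tag{22}$$
where the higher-dimensional shaper function $h_{dD}(\boldsymbol{\ell})$ is defined as
$$h_{dD}(\boldsymbol{\ell}) = \int_{\mathbb{R}^d} I_\Omega(\mathbf{r})\,I_\Omega(\mathbf{r}+\boldsymbol{\ell})\,d^d r . \tag{23}$$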
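For a convex polygon whose sides are given as lines y = a x + c, the indicator function is a product of half-plane tests, and the shaper function is the area of overlap between the domain and a displaced copy of itself. The following sketch estimates that area by Monte Carlo sampling. It is only an illustration: the function names, the example triangle and the sample size are assumptions, and polygons with vertical sides would need a different parametrisation of the sides.

```python
import random


def polygon_indicator(x, y, sides):
    """Product of half-plane tests; each side is (a, c, q) for the line
    y = a*x + c, with q = +1 if the polygon lies above it and q = -1 if below."""
    return all(q * (y - a * x - c) >= 0.0 for a, c, q in sides)


def shaper_monte_carlo(step, sides, box, n_samples, rng):
    """Estimate the overlap area of the domain with a copy displaced by `step`."""
    (x_min, x_max), (y_min, y_max) = box
    box_area = (x_max - x_min) * (y_max - y_min)
    hits = 0
    for _ in range(n_samples):
        x = rng.uniform(x_min, x_max)
        y = rng.uniform(y_min, y_max)
        start_inside = polygon_indicator(x, y, sides)
        end_inside = polygon_indicator(x + step[0], y + step[1], sides)
        if start_inside and end_inside:
            hits += 1
    return box_area * hits / n_samples


# Assumed example: triangle with vertices (0, 0), (1, 0) and (0.5, 1).
TRIANGLE = [(0.0, 0.0, +1), (2.0, 0.0, -1), (-2.0, 2.0, -1)]
BOX = ((0.0, 1.0), (0.0, 1.0))  # bounding box of the triangle

rng = random.Random(3)
area = shaper_monte_carlo((0.0, 0.0), TRIANGLE, BOX, 100000, rng)
overlap = shaper_monte_carlo((0.4, 0.0), TRIANGLE, BOX, 100000, rng)
```

For this particular triangle and a horizontal displacement of size s, elementary geometry gives an overlap area of (1 - s)^2 / 2, so `area` should be close to 0.5 and `overlap` close to 0.18.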

Square Geometry
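As an example, we can use Eq. (15) for the indicator function of a polygonal domain in combination with Eq. (21) for a general shaper function to obtain the shaper function associated with a square domain of unit side, in which case
$$h_{\square}(\ell_x, \ell_y) = \int_0^1\!\!\int_0^1 \theta(x+\ell_x)\,\theta(1-x-\ell_x)\,\theta(y+\ell_y)\,\theta(1-y-\ell_y)\,dx\,dy, \tag{24}$$
which can also be written as a product of one-dimensional shaper functions, Eq. (7),
$$h_{\square}(\ell_x, \ell_y) = h_{1D}(\ell_x)\,h_{1D}(\ell_y) = (1-|\ell_x|)(1-|\ell_y|)\,\theta(1-|\ell_x|)\,\theta(1-|\ell_y|) . \tag{25}$$
Notably, the shaper function $h_{\square}(\ell_x, \ell_y)$, shown in Fig. 3, does not satisfy the same rotational symmetry as the step probability density function in unbounded space, Eq. (14), meaning that this symmetry is broken by the interaction with the boundary.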
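The product structure of the square shaper function, and the way it breaks rotational symmetry, can be made concrete with a short sketch. The function names are hypothetical; the only ingredient is the one-dimensional shaper function of the unit segment, 1 - |l| for |l| < 1 and zero otherwise.

```python
import math


def shaper_1d(step):
    """One-dimensional shaper function of the unit segment."""
    return max(0.0, 1.0 - abs(step))


def shaper_square(step_x, step_y):
    """Overlap area of the unit square with a copy displaced by (step_x, step_y)."""
    return shaper_1d(step_x) * shaper_1d(step_y)


# Two steps of the same magnitude but different direction.
magnitude = 0.5
along_axis = shaper_square(magnitude, 0.0)
along_diagonal = shaper_square(magnitude / math.sqrt(2), magnitude / math.sqrt(2))
```

The two values differ although the step magnitude is the same, which is the symmetry breaking described above: steps along the diagonal are suppressed more strongly than steps of equal length along a side.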

Circular Geometry
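While the discussion above encompasses the disk of unit radius as a specific case of bounded convex domain with indicator function
$$I_{\bullet}(x, y) = \theta(1 - x^2 - y^2), \tag{26}$$
the commonness of (at least approximately) circular boundaries in natural as well as experimental conditions makes it worthwhile presenting the corresponding results in an explicit form. Starting with the steady-state stopping location distribution, we use $g(\mathbf{r}) \propto N(\mathbf{r})$ together with Eqs. (17), (14) and the indicator function Eq. (26) to obtain
$$g(\mathbf{r}) \propto \int_{\mathbb{R}^2} f_i(\ell_x, \ell_y)\,\theta\big(1 - (x+\ell_x)^2 - (y+\ell_y)^2\big)\,d\ell_x\,d\ell_y, \tag{27}$$
which, by the rotational symmetry Eq. (14), depends on $\mathbf{r}$ only through $|\mathbf{r}|$. The geometry-specific shaper function, which we denote $h_{\bullet}(\ell_x, \ell_y)$, can be obtained by substituting Eq. (26) into Eq. (21) and reads
$$h_{\bullet}(\ell_x, \ell_y) = \Big[2\arccos\Big(\frac{|\boldsymbol{\ell}|}{2}\Big) - \frac{|\boldsymbol{\ell}|}{2}\sqrt{4 - |\boldsymbol{\ell}|^2}\Big]\,\theta(2 - |\boldsymbol{\ell}|), \tag{28}$$
i.e. the area of the lens formed by two unit disks whose centres are a distance $|\boldsymbol{\ell}|$ apart. Unlike the shaper function for a unit square domain, Eq. (24), the shaper function for a circular domain $h_{\bullet}(\ell_x, \ell_y)$ is rotationally symmetric, see Fig. 4, and the symmetry of the intrinsic step probability density $f_i(\ell_x, \ell_y)$ is preserved under the interaction with the boundary.
Fig. 3 Shaper function for a square geometry of unit side as defined in Eq. (24). It vanishes in every direction when $|\boldsymbol{\ell}| > \sqrt{2}$, i.e. when the step magnitude exceeds the diameter of the domain. Remarkably, the shaper function does not satisfy the same rotational symmetry of the intrinsic step probability density function, which is therefore broken in the empirical step distribution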
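For the unit disk, the overlap with a displaced copy is the lens formed by two unit circles, whose area has a standard closed form. The sketch below implements that formula and cross-checks it by Monte Carlo sampling. The function names, the chosen displacement and the sample size are assumptions made for illustration.

```python
import math
import random


def shaper_disk(step_x, step_y):
    """Overlap area of the unit disk with a copy displaced by (step_x, step_y)."""
    d = math.hypot(step_x, step_y)
    if d >= 2.0:
        return 0.0
    return 2.0 * math.acos(d / 2.0) - 0.5 * d * math.sqrt(4.0 - d * d)


def shaper_disk_monte_carlo(step_x, step_y, n_samples, rng):
    """Cross-check by sampling the bounding square [-1, 1] x [-1, 1]."""
    hits = 0
    for _ in range(n_samples):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        inside = x * x + y * y <= 1.0
        shifted_inside = (x + step_x) ** 2 + (y + step_y) ** 2 <= 1.0
        if inside and shifted_inside:
            hits += 1
    return 4.0 * hits / n_samples  # 4 is the area of the bounding square


exact = shaper_disk(0.6, 0.8)    # step of magnitude 1
rotated = shaper_disk(1.0, 0.0)  # same magnitude, different direction
estimate = shaper_disk_monte_carlo(0.6, 0.8, 100000, random.Random(5))
```

Because the result depends on the displacement only through its magnitude, `exact` and `rotated` coincide, in contrast to the square geometry.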

Statistics of the Step Magnitude
In the discussion above, the dependence of the transformed step probability density function $f_t(\ell_x, \ell_y)$ on its two arguments $\ell_x$ and $\ell_y$ was left explicit. This might have been redundant for the intrinsic probability $f_i(\ell_x, \ell_y)$, which was assumed to be rotationally symmetric; however, it is not so for $f_t(\ell_x, \ell_y)$ since the shaper function $h_{2D}(\ell_x, \ell_y)$ need not satisfy that same symmetry (see Sect. 2.2.1 for an example).
On the other hand, the magnitude of the displacement between successive stopping locations is often more easily accessible from an experimental point of view, as it does not require the set of coordinates to be consistent across sample paths. It is therefore instructive to consider how the probability density function of the step magnitude is affected by enclosure in a bounded space with no-go boundary conditions. We denote by $\tilde{f}_i(\ell)$ the probability to observe a step of magnitude $|\boldsymbol{\ell}| = \ell$ in the real plane. It is related to its two-dimensional counterpart via
$$\tilde{f}_i(\ell) = \int_0^{2\pi} f_i(\ell\cos\theta, \ell\sin\theta)\,\ell\,d\theta = 2\pi\ell\,f_i(\ell, 0) . \tag{29}$$
Similarly, we denote by $\tilde{f}_t(\ell)$ the probability to observe a step of magnitude $|\boldsymbol{\ell}| = \ell$ starting and ending in $\Omega$. It can be written using Eqs. (20) and (21) as
$$\tilde{f}_t(\ell) = \int_0^{2\pi} f_t(\ell\cos\theta, \ell\sin\theta)\,\ell\,d\theta = \frac{\tilde{f}_i(\ell)\,\tilde{h}_{2D}(\ell)}{\int_0^{\infty} \tilde{f}_i(\ell')\,\tilde{h}_{2D}(\ell')\,d\ell'}, \tag{30}$$
which has the same multiplicative structure as Eq. (20) with a modified shaper function
$$\tilde{h}_{2D}(\ell) = \frac{1}{2\pi}\int_0^{2\pi} h_{2D}(\ell\cos\theta, \ell\sin\theta)\,d\theta \tag{31}$$
depending only on the step magnitude $\ell$.
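The modified shaper function for the step magnitude is the average of the two-dimensional shaper function over the direction of the step. A minimal sketch for the unit square follows, assuming a simple midpoint rule for the angular integral; the function names and the number of angles are illustrative choices.

```python
import math


def shaper_square(step_x, step_y):
    """Shaper function of the unit square (product of 1D shaper functions)."""
    return max(0.0, 1.0 - abs(step_x)) * max(0.0, 1.0 - abs(step_y))


def magnitude_shaper(magnitude, shaper, n_angles=3600):
    """Average of shaper(l cos t, l sin t) over the direction t (midpoint rule)."""
    total = 0.0
    for k in range(n_angles):
        angle = 2.0 * math.pi * (k + 0.5) / n_angles
        total += shaper(magnitude * math.cos(angle), magnitude * math.sin(angle))
    return total / n_angles


values = {l: magnitude_shaper(l, shaper_square) for l in (0.0, 0.5, 1.0, 1.5)}
```

For step magnitudes not exceeding the side of the square, the angular average can also be done by hand and gives 1 - 4l/pi + l^2/pi, which the numerical values should reproduce; beyond the diagonal length the average is zero.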

Correcting for Boundary Effects Under No-Go Boundary Conditions
In Sect. 2, we explored how three common choices of boundary conditions (no-go, stop-go and reflecting) affect the statistical properties of intermittent locomotion in bounded space. For the particular case of no-go boundary conditions, starting from the conditional probability density function of the next step, Eq. (1) for one-dimensional domains or Eq. (16) for two-dimensional ones, we showed that the empirical step probability density $f_t(\boldsymbol{\ell})$ is related to the intrinsic step probability density $f_i(\boldsymbol{\ell})$ by a multiplicative transformation involving a 'shaper function', Eqs. (7) or (21), that depends solely on the geometry of the domain. This simple structure can be exploited to correct for boundary effects, assuming that the boundary geometry is known. In particular, we can write
$$f_i(\ell) \propto \frac{f_t(\ell)}{h_{1D}(\ell)}, \tag{32}$$
$$f_i(\ell_x, \ell_y) \propto \frac{f_t(\ell_x, \ell_y)}{h_{2D}(\ell_x, \ell_y)} \tag{33}$$
for one- and two-dimensional geometries, respectively, with the constant of proportionality fixed by normalisation over the range of observable steps. When the step exceeds the maximum linear dimension of the domain in a given direction, the shaper function vanishes and the expressions in Eqs. (32) and (33) become undefined. This behaviour is expected, since such steps cannot be performed inside the domain. In two dimensions and focusing on the statistics of the step magnitude, Eqs. (30) and (31) give $\tilde{f}_i(\ell) \propto \tilde{f}_t(\ell)/\tilde{h}_{2D}(\ell)$, which is undefined for step magnitudes exceeding the maximum linear dimension of the bounded space. We demonstrate this procedure numerically in Fig. 5 for the case of a non-trivial intrinsic step probability density in a square geometry.
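The correction procedure can be illustrated in the simplest setting, a one-dimensional no-go walk on the unit segment. The sketch below is not the calculation behind Fig. 5: it assumes a uniform intrinsic step density on [-A, A] with A = 0.75, bins the steps observed in the bounded domain, divides the empirical density by the one-dimensional shaper function 1 - |l| evaluated at the bin centres, and renormalises. All names and parameter values are hypothetical.

```python
import random

A = 0.75          # assumed half-width of the uniform intrinsic step density
N_BINS = 40       # bins of width 0.05 covering [-1, 1]
WIDTH = 2.0 / N_BINS


def shaper_1d(step):
    """Shaper function of the unit segment."""
    return max(0.0, 1.0 - abs(step))


def observed_histogram(n_steps, rng):
    """Histogram of the accepted steps of a no-go walk on the unit segment."""
    counts = [0] * N_BINS
    x = rng.random()
    accepted = 0
    while accepted < n_steps:
        step = rng.uniform(-A, A)
        if 0.0 <= x + step <= 1.0:  # no-go rule
            x += step
            counts[min(int((step + 1.0) / WIDTH), N_BINS - 1)] += 1
            accepted += 1
    return counts


def reconstruct(counts):
    """Divide the empirical density by the shaper function and renormalise."""
    n_steps = sum(counts)
    centres = [-1.0 + (k + 0.5) * WIDTH for k in range(N_BINS)]
    raw = [c / (n_steps * WIDTH) / shaper_1d(m) for c, m in zip(counts, centres)]
    norm = sum(raw) * WIDTH
    return centres, [value / norm for value in raw]


centres, estimate = reconstruct(observed_histogram(200000, random.Random(11)))
```

Under these assumptions the reconstructed density should be approximately flat at 1/(2A) inside [-A, A] and zero outside. The statistical noise grows towards large step magnitudes, where the shaper function is small and few steps are observed.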

Conclusion and Outlook
We studied the effect of enclosure in a bounded space on locomotion characteristics in a null model of an organism moving intermittently in one- and two-dimensional domains of arbitrary convex geometry. Intermittent locomotion, which characterises organisms across taxa (Kramer and McLaughlin 2001), is modelled here as a simple random walk, where we consider individual steps to represent the net displacement between stopping locations. The temporal aspect of locomotion is not taken into account. For the particular case of no-go boundary conditions, our analysis yielded closed-form analytical expressions for the probability density function of the stopping location and step. Corresponding expressions were also obtained in one dimension for stop-go and reflecting boundary conditions. Both these locomotion characteristics are affected by the boundary geometry in a non-trivial way, thus demonstrating how a superficial statistical analysis can lead to erroneous conclusions. For example, the empirical variance of the step probability density function is reduced upon enclosure, which could be misinterpreted as a behavioural response to the latter. For the particular case of no-go boundary conditions, we also demonstrated that the relation between the intrinsic step probability density, $f_i(\boldsymbol{\ell})$, and its empirical counterpart in bounded space, $f_t(\boldsymbol{\ell})$, amounts to a multiplication by a geometry-specific function of the step $\boldsymbol{\ell}$, which we termed the 'shaper' function and denoted $h(\boldsymbol{\ell})$. This shaper function displays a maximum at $|\boldsymbol{\ell}| = 0$ and is a monotonically decreasing function of the step magnitude $|\boldsymbol{\ell}|$ in a given direction. In one dimension and for general two-dimensional convex domains, the shaper function can be calculated straightforwardly and often in closed form. The simple relation between $f_i(\boldsymbol{\ell})$ and $f_t(\boldsymbol{\ell})$ further enables the reconstruction of $f_i(\boldsymbol{\ell})$ from an empirically measured $f_t(\boldsymbol{\ell})$ up to the maximum step allowed within the enclosure. 
Any information about steps larger than this cut-off is lost, as such an observation is forbidden within the boundary. The reliability of such reconstruction has been numerically demonstrated.
The possibility of consistently eliminating boundary effects is of immediate interest in behaviour and ecology, where properties of movement can provide useful information about the state of an organism. Locomotion as performed by real organisms is generally functional to other activities (e.g. foraging) and is expected to be subjected to considerable biases. These biases can be intrinsic, such as asymmetries in body or behaviour (Wiper 2017) and changes due to learning, or extrinsic, such as the particular conformation of the bounded environment. As such, the random walk examined in this paper can be considered a null model for the displacement of an actual organism in space. Nonetheless, our study sheds light on the risks associated with overlooking boundary effects when interpreting locomotion data as a behavioural feature. Higher moments of the step probability density, particularly the correlation function of sequential steps, were not considered in this work. However, a recent study by some of the authors (Christensen et al. 2020) demonstrates how enclosure also introduces extrinsic negative correlations, which are required to prevent the random walker from exiting the domain.

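A Variance of $f_t(\ell)$ in One Dimension
Let $F(x)$ be a non-negative, monotonically decreasing function of $x$. Let $P(x)$ be a probability density function of $x \in [0, 1]$ satisfying the normalisation condition
$$\int_0^1 P(x)\,dx = 1, \tag{34}$$
and define
$$I = \int_0^1 x^2 P(x)\,dx, \qquad J = \frac{\int_0^1 x^2 P(x)\,F(x)\,dx}{\int_0^1 P(x)\,F(x)\,dx}, \tag{35}$$
corresponding to expressions (9) and (10), Sect. 2.1. In order to prove the inequality $I > J$, we first use the definition of the covariance, with $\mathbb{E}[\cdot]$ the expectation with respect to $P(x)$, to write
$$J = \frac{\mathbb{E}[x^2 F(x)]}{\mathbb{E}[F(x)]} = I + \frac{\mathrm{Cov}(x^2, F(x))}{\mathbb{E}[F(x)]}, \tag{36}$$
whence
$$I > J \iff \mathrm{Cov}(x^2, F(x)) < 0 . \tag{37}$$
The proof of the right-hand side of equivalence (37) for $F(x)$ a monotonically decreasing function of $x$ is a standard textbook exercise. It follows by noticing that $(F(x) - F(y))(x^2 - y^2) \leq 0$ for any pair $x, y \in [0, 1]$, with strict inequality whenever $F(x) \neq F(y)$, and that $\mathbb{E}[(F(x) - F(y))(x^2 - y^2)] = 2\,\mathrm{Cov}(x^2, F)$ when $x$ and $y$ are i.i.d. with density $P$.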
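The identity relating the pairwise product to the covariance can be spelled out as follows. This is a short worked derivation added for convenience, assuming x and y are independent draws from the same density and writing the expectation as E; it uses nothing beyond linearity of the expectation and independence.

```latex
\begin{align}
\mathbb{E}\big[(F(x)-F(y))(x^2-y^2)\big]
 &= \mathbb{E}[x^2F(x)] - \mathbb{E}[y^2F(x)] - \mathbb{E}[x^2F(y)] + \mathbb{E}[y^2F(y)] \\
 &= 2\,\mathbb{E}[x^2F(x)] - 2\,\mathbb{E}[x^2]\,\mathbb{E}[F(x)] \\
 &= 2\,\mathrm{Cov}\big(x^2, F(x)\big) .
\end{align}
```

Since the integrand on the left is never positive when F is decreasing on [0, 1], the covariance cannot be positive, and it is strictly negative unless F is constant on the support of the density.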

B Overlap of Indicator Functions
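The generic two-dimensional shaper function for convex domains $h_{2D}(\ell_x, \ell_y)$ of Eq. (21) can be interpreted as the area of the intersection between $\Omega$ and a copy of itself, $\Omega'$, which was offset by $\boldsymbol{\ell} = (\ell_x, \ell_y)$. Using the indicator function,
$$h_{2D}(\ell_x, \ell_y) = \int_{\mathbb{R}^2} I_\Omega(x, y)\,I_{\Omega'}(x, y)\,dx\,dy = |\Omega \cap \Omega'|, \qquad I_{\Omega'}(x, y) = I_\Omega(x+\ell_x, y+\ell_y) . \tag{38}$$
To demonstrate that $h_{2D}(|\boldsymbol{\ell}|\cos\theta, |\boldsymbol{\ell}|\sin\theta)$ decreases monotonically with $|\boldsymbol{\ell}|$ at fixed $\theta$, we first perform a rotation of the coordinate axes so that, without loss of generality, $\theta = 0$ and the relative offset of $\Omega$ and $\Omega'$ is aligned with the $x$-axis. We now approximate $\Omega$ and $\Omega'$ as unions of $N$ non-overlapping rectangular domains $r_i$ and $r'_i$ of height $h$; see Fig. 6, with the understanding that this approximation becomes exact in the limit $h \to 0$. The intersection $\Omega \cap \Omega'$ can now be written as a union of intersections
$$\Omega \cap \Omega' = \bigcup_{i=1}^{N} \big(r_i \cap r'_i\big), \tag{39}$$
so that
$$|\Omega \cap \Omega'| = \sum_{i=1}^{N} |r_i \cap r'_i| . \tag{40}$$
It is straightforward to show that each term appearing in the sum in the right-hand side of Eq. (40) is monotonically decreasing in $|\boldsymbol{\ell}|$: since $r_i$ and $r'_i$ are identical rectangles of width $w_i$ displaced by $|\boldsymbol{\ell}|$ along the $x$-axis, their overlap has area $h\,\max(0, w_i - |\boldsymbol{\ell}|)$. Consequently, $h_{2D}(|\boldsymbol{\ell}|\cos\theta, |\boldsymbol{\ell}|\sin\theta)$ also decreases monotonically with $|\boldsymbol{\ell}|$ at fixed $\theta$.
Fig. 6 Area of the intersection of two convex domains of identical geometry, $\Omega$ and $\Omega'$, with a relative displacement of $\boldsymbol{\ell}$. Each domain is approximated as a union of non-overlapping rectangular domains of height $h$ (shaded), such that each rectangular sub-domain of $\Omega$ can only ever overlap with one rectangular sub-domain of $\Omega'$ as $|\boldsymbol{\ell}|$ varies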