Data-driven stochastic modelling of zebrafish locomotion

In this work, we develop a data-driven modelling framework to reproduce the locomotion of fish in a confined environment. Specifically, we highlight the primary characteristics of the motion of individual zebrafish (Danio rerio), and study how these can be suitably encapsulated within a mathematical framework utilising a limited number of calibrated model parameters. Using data captured from individual zebrafish via automated visual tracking, we develop a model using stochastic differential equations and describe fish as a self propelled particle moving in a plane. Based on recent experimental evidence of the importance of speed regulation in social behaviour, we extend stochastic models of fish locomotion by introducing experimentally-derived processes describing dynamic speed regulation. Salient metrics are defined which are then used to calibrate key parameters of coupled stochastic differential equations, describing both speed and angular speed of swimming fish. The effects of external constraints are also included, based on experimentally observed responses. Understanding the spontaneous dynamics of zebrafish using a bottom-up, purely data-driven approach is expected to yield a modelling framework for quantitative investigation of individual behaviour in the presence of various external constraints or biological assays. Electronic supplementary material The online version of this article (doi:10.1007/s00285-014-0843-2) contains supplementary material, which is available to authorized users.

Swimming filter -effect of pre-processing on speed. Time series plots of the instantaneous speed ut of raw observation data for 10 zebrafish, with 28 isolated segments each representing 1 min of active swimming (green) as determined by the pre-processing filter. Periods in which the speed of an individual drops below a threshold value of 1 BL.s −1 (3 cm.s −1 ) for longer than 2 s were rejected. From the remainder of the data, 60 s periods of sustained swimming were isolated, with segments denoted S 1 . . . S 28 . Segments of data less than 60 s in duration are discarded (e.g. remaining F 6 data after segment S 19 is 58.2 s in duration and is therefore discarded) Fig. S2 Swimming filter: effect of pre-processing on turning speed. Time series plots of the turning speed ωt of raw observation data for 10 zebrafish, with 28 isolated segments each representing 1 min of active swimming (green) as determined by the preprocessing filter. Erratic motion, or thrashing, was found to coincide with periods of nearstationary (forward) motion and thus was largely rejected by the filter, as required.

Fig. S3
Wall-corrected turning speed as a function of projected collision distance. Computed values of the wall-corrected turning speed ωc are plotted against projected collision distance d W for all experimental data segments (grey squares), and compared with data from individually calibrated random walkers (red circles). Non-parametric regression (thin dashed) and exponential fits (thick solid) are shown for experimental (red) and simulated (blue) data. The effect of the wall avoidance function fw(d W , φ W ) implemented in our model, is to bend RW trajectories away from the boundaries in a broadly similar manner to that observed experimentally. Where there was sufficient data to interpolate the exponential functions for both experimental and simulated data, a similar functional dependence on d W was found for some segments. However, due to the intrinsic stochasticity of the trajectory data, these plots are intended primarily to show the general trend for simulated trajectories to exhibit a similar skewed distribution of ωc as a function of d W . Computed values of the wall-corrected turning speed ωc are plotted against projected collision time t W for all experimental data segments (grey squares), and compared with data from individually calibrated random walkers (red circles). Non-parametric regression (thin dashed) and exponential fits (thick solid) are shown for experimental (blue) and simulated (red) data. Similar to the data shown in Fig. S3, we found a broadly similar (skewed) distribution of ωc as a function of t W due to wall interactions governed by the wall-avoidance function fw(d W , φ W ) (dependant on the projected collision distance d W ).

Fig. S5
Fish speed as a function of projected collision distance. Values of speed ut are plotted against projected collision distance d W for all experimental data segments (grey squares), and compared with data from individually calibrated random walkers (red circles). Non-parametric regression (dashed lines) are shown for experimental (blue) and simulated (red) data. Finding no strong trend in the distribution of speed ut as a function of projected wall collisions, we do not include an explicit speed dependence in our model. In these plots we find that both experimental and simulated speed data exhibit similarly flat responses as a function of d W . Values of speed ut are plotted against projected collision time t W for all experimental data segments (grey squares), and compared with data from individually calibrated random walkers (red circles). Non-parametric regression (dashed lines) are shown for experimental (blue) and simulated (red) data. Finding no strong trend in the distribution of speed ut as a function of projected wall collisions, we do not include an explicit speed dependence in our model. In these plots we find that both experimental and simulated speed data exhibit similar flat responses as a function of t W . To study the robustness of simulated trajectories with respect to the numerical timestep value ∆t and corresponding sample generation frequency fs (Hz) -we compare unbounded RW trajectories, simulated using identical stochastic processes (dWt and dZt) sampled at different frequencies, with model parameters calibrated for experimental segment S 1 7. Columns A-F show respectively: 60 s RW trajectory portraits; speed-turning speed correlation; speed distribution, turning speed distribution; speed and turning speed autocorrelation. Each row represents trials computed with sample frequencies ranging from 1000 Hz to 5 Hz (∆t =0.001 s -0.2 s). Experimental segment data (grey) is compared to the RW realisations (red) for each trial.

Fig. S8
Comparison between experimental segment trajectories and corresponding random walker. Trajectory portraits for individually calibrated random walkers (thick blue lines), simulated for 60 s, compared with experimental source trajectories (thin lines coloured according to parent fish ID). Qualitative comparisons are gleaned from these trajectory portraits which indicate differing patterns of swimming behaviour across each of the segments. Some trajectories followed more tightly curving, or erratic paths, whilst others appeared to follow more fluid, open curves. The unique parameter calibration of the SDEs driving each random walker was found in many examples to reflect these different characteristics, producing similar curvature to the corresponding source data. For example, the tightly winding trajectory of segment S 6 is well captured by the model, producing a similarly curving random walker path. In contrast, the trajectory of S 24 is found to be more fluid, punctuated by occasional rapid turns and exhibiting strong thigmotactic-like behaviour, where both features are again reflected in the corresponding random walker trajectory.

Fig. S9
Comparison of speed distribution. Speed Ut distributions of independently calibrated random walkers, simulated for 60 s at 5 Hz (red), compared with the associated experimental segment distribution (grey). Normal pdf for simulated data and experimental data are indicated by red and blue curves respectively. Simulated speed data was found, as expected, to be normally distributed with means and variances closely matched to that of the source data segments.
Fig. S10 Comparison of turning speed distribution. Turning speed Ωt distributions of independently calibrated random walkers, simulated for 60 s at 5 Hz (red), compared with the associated experimental segment distribution (grey). Normal pdf for simulated data and experimental data are indicated by red and blue curves respectively. Turning speed data was consistently more peaked than would be expected for a normal distribution, due to the coupling between SDEs, and therefore compared more favourably with experimental data across all segments.
Fig. S11 Comparisons for speed autocorrelation function -ACFu. Speed ACFs for independently calibrated random walkers (red) are compared with corresponding ACFs from experimental data (grey). The autocorrelation functions for both speed and turning speed (Fig. S12) for all segments are well matched between experimental and simulated realisations. Subject to intrinsic noise, and the effects of the boundaries on turning, these ACF comparisons indicate that the main features of the autocorrelation, primarily the decay rates, remain consistent with those of the corresponding experimental processes ut and ωt. However, by employing the MLE calibration method described by van den Berg, 2011, we find that estimation of rate parameter θ is dominated by the lag-one autocorrelation coefficient r 1 as we described in (6). Recovering the initial ACF decay accurately is therefore sensitive to the data acquisition rate fs and the smoothness of the experimental ACF, under the assumption that it decays exponentially.
Fig. S12 Comparisons for turning speed autocorrelation function -ACFω. Turning speed ACFs for independently calibrated random walkers (red) are compared with corresponding ACFs from experimental data (grey). The autocorrelation functions for both speed (Fig. S11) and turning speed for all segments are well matched between experimental and simulated realisations. Subject to intrinsic noise, and the effects of the boundaries on turning, these ACF comparisons indicate that the main features of the autocorrelation, primarily the decay rates, remain consistent with those of the corresponding experimental processes ut and ωt. However, by employing the MLE calibration method described by van den Berg, 2011, we find that estimation of rate parameter θ is dominated by the lag-one autocorrelation coefficient r 1 as we described in (6). Recovering the initial ACF decay accurately is therefore sensitive to the data acquisition rate fs and the smoothness of the experimental ACF, under the assumption that it decays exponentially. The oscillations observed in the experimental ACFω functions are found to be more pronounced in trajectories which exhibit stronger thigmotactic behaviour, reflecting the periodic quarter turns at the tank corners, with a period dependant on the swimming speed. Similar oscillations, for example those in ACFω for S 23 and other simulated segments exhibiting strong wall-following, are observed with a comparable period.
Fig. S13 Comparing cross-correlation of Ut and Ωt for simulated and experimental segments. The cross-correlation between speed and turning speed of independently calibrated random walkers are compared with corresponding experimental data. The joint distributions of speed and turning speed values for each segment confirm that the coupling function fc enforces a suitable cross-correlation between Ut and Ωt, uniquely parameterised by MLE calibrated values of µu and σω for each segment (σ 0 = 10 rad.s −1 fixed). For each simulated segment comparison, we recovered the characteristic tapering of the range and variance of Ωt values which decrease at higher speeds.
Fig. S14 Effects of coupling function fc on trajectory data. To study the effects of the additional coupling between speed and turning speed -decribed by fc, two representative segments S 3 and S 17 were used to calibrate random walkers, simulated with and without the coupling function (fixed σω equal to calibrated segment value). Columns A-F show respectively: 60 s RW trajectory portraits; speed-turning speed correlation; speed distribution, turning speed distribution; speed and turning speed autocorrelation. Rows contain data from each trial, with experimental segment data (grey) compared to the RW realisations (red).