Guppy experiments
We performed experiments between 9.00 am and 9.00 pm in March and April 2011 at the University of Exeter. We used Trinidadian guppies (Poecilia reticulata) from a large laboratory-reared population bred from fish collected on the island of Trinidad in 2008 from the lower reaches of the Aripo River (10°40’N 61°14’W). Adult fish were collected from laboratory stock tanks (150 × 300 cm), separated according to biological sex and subsequently housed on a 12:12-h light-dark cycle in gravel-bottomed holding tanks (30 × 30 × 45 cm) with no more than 50 individuals per tank. The numbers of male and female fish in holding tanks were continuously replenished, but each fish was only used once and was collected from stock tanks at least 24 h prior to being used in our experiment. All experimental procedures were performed in conformance with UK Home Office guidelines.
We performed experiments in a white Perspex, flat-bottomed, square tank with a side length of 60 cm. This tank was emptied, wiped with 98% alcohol and rinsed with untreated water between trials. For experiments, we filled the tank with 21 l of water. This resulted in a water depth of approximately 6 cm. We then transferred 12 guppies from the holding tanks into the experimental tank, let them habituate for 30 min and subsequently recorded their movement for 25 min at a rate of 10 frames per second using a standard definition, overhead camera (Sony Handycam DCR-SX33). Fish were transferred to separate housing tanks after experiments to ensure they were not reused.
While we performed our experiment on shoals with different sex-ratios, we did not investigate the effects of sex-ratios on shoal behaviour here. For completeness, the three group compositions we tested were all-male, all-female and a balanced sex-ratio of 6 male and 6 female fish. We performed 20 replicate trials for male and female groups and 22 replicate trials for mixed-sex groups. Thus, we conducted a total of 62 experimental trials including 744 guppies.
The light colour of the tank and the absence of plants or rocks inside the brightly-lit experimental tank meant that guppies were highly conspicuous. This aided data acquisition (see below), but was also likely to heighten stress levels in the fish (Bode et al. 2010).
Guppy data acquisition
We obtained data on individual fish positions and subsequently trajectories from the video recordings of experiments using the open-source tracking software ‘SwisTrack’ (Correll et al. 2006) and previously established methodology (Bode et al. 2010). As the water in the experimental tank was shallow, we only considered movement in two dimensions. Recent work further corroborates the appropriateness of this approximation (Watts et al. 2017). We smoothed position time series using a moving-window average with a width of 3 frames and approximated the instantaneous speed of individuals from the distance Δx between consecutive tracked positions (0.1 s apart) as Δx/(0.1 s). Speed time series at this temporal resolution were highly auto-correlated (Pearson’s correlation at lag 0.1 s: 0.94). Because our statistical models did not capture auto-correlations at such a fine temporal scale, we coarse-grained the speed time series by only using instantaneous speeds that were 1 s apart (Pearson’s correlation in coarse-grained data at lag 1 s: 0.54). Figure 1 shows a top-down view of our experimental tank and example trajectories.
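The smoothing, speed calculation and coarse-graining described above can be sketched as follows. This is a minimal illustration and not the analysis code used in the study: the function names are hypothetical, and the treatment of the first and last frame of a series (averaging over the frames that are available) is an assumption, as the text does not specify it.

```python
import math

def moving_average_3(values):
    """Centred 3-frame moving average.
    Assumption: end points are averaged over the frames that exist."""
    n = len(values)
    out = []
    for i in range(n):
        window = values[max(0, i - 1):min(n, i + 2)]
        out.append(sum(window) / len(window))
    return out

def instantaneous_speeds(xs, ys, dt):
    """Speed between consecutive positions: distance travelled divided by dt."""
    return [math.hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i]) / dt
            for i in range(len(xs) - 1)]

def coarse_grain(series, step):
    """Keep every `step`-th value (step=10 turns 0.1 s data into 1 s data)."""
    return series[::step]

def lag_correlation(series, lag):
    """Pearson correlation between a series and itself shifted by `lag` samples."""
    a, b = series[:-lag], series[lag:]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Usage with made-up positions (cm) recorded at 10 frames per second.
xs = moving_average_3([0.5 * i for i in range(31)])
ys = moving_average_3([0.0] * 31)
speeds_1s = coarse_grain(instantaneous_speeds(xs, ys, dt=0.1), step=10)
```

For the stickleback recordings at 25 frames per second, the same sketch would use `dt=0.04` and `step=25`.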
While we did not use sex-specific data here, differences in body shape and size in guppies (Magurran 2005) allowed for a semi-automated way to identify smaller males and larger females by using the number of pixels representing fish in video recordings. We visually checked the correctness of the automated part of this procedure and manually labelled sexes based on body shape whenever size differences were ambiguous. In the discussion, we indicate how these additional data could be used.
Reflections on the water surface and overlaps of fish resulted in missing or erroneous fish positions that we had to remove from our data (approach described in Bode et al. 2010). Furthermore, we only used data when the positions of all 12 fish in the experimental tank were tracked. Following this, we obtained a total of 445,332 observations of instantaneous speeds across all experimental trials and individuals (from 37,111 observation time points of entire shoals). This represents approximately 40% of all time points in our experiments. We also discuss this in more detail in the supplementary information, section S1.1.
Stickleback data
We used previously published experimental data on stickleback movement (Bode et al. 2010). In these experiments, the movement of shoals of 8 three-spined sticklebacks (Gasterosteus aculeatus) was observed inside a circular experimental arena. Eight shoals were tested and each shoal was exposed to four different experimental treatments that altered the threat levels perceived by the fish. Here, we combine the data from all experimental treatments and therefore average over observed differences in behaviour across treatments (Bode et al. 2010). All further details on the stickleback experiments can be found in Bode et al. (2010).
We applied the same procedures as described above for the guppy data to the stickleback data. The stickleback experiments were recorded at a frame rate of 25 frames per second and at this temporal resolution speed time-series were highly auto-correlated (Pearson’s correlation at lag 0.04 s: 0.96). As for the guppy data, we thus used instantaneous speeds that were 1 s apart (Pearson’s correlation in coarse-grained data at lag 1 s: 0.68). We only used data when the positions of all 8 fish in the experimental tank were tracked. This resulted in a total of 256,816 observations of instantaneous speeds across all experimental trials and individuals (from 32,102 observation time points of entire shoals).
We tested the robustness of our findings to the size of the time gap between consecutive data points used in our analysis (see below).
Initial trajectory analysis
For a shoal-level characterisation of movement behaviour, we computed the two most widely adopted order parameters used to categorise collective behaviour (Tunstrøm et al. 2013). Contrasting these broadly adopted measures with our Hidden Markov Model approach served to contextualise and highlight the additional insights we gained.
The first order parameter, polarisation, $O_p$, measures how aligned individuals are. It is computed using the normalised instantaneous movement direction, $\mathbf{u}_i$, of fish, where individuals are numbered i = 1,…,N with N indicating the size of the shoal:
$$ O_p=\left|\sum_{i=1}^{N}\mathbf{u}_i\right|/N. $$
(1)
The polarisation, or the absolute value of the mean individual movement direction, takes value 1, if all individuals move in the same direction, and value 0 if there is no alignment on average (e.g. when individuals move in random directions).
The second order parameter, rotation, $O_r$, measures the extent to which the shoal rotates around its centre of mass. In addition to individuals’ movement direction, it also incorporates the unit vector $\mathbf{r}_i$ pointing from the shoal’s centre of mass towards individual i to compute the mean normalised angular momentum,
$$ O_r=\left|\sum_{i=1}^{N}\mathbf{u}_i\times \mathbf{r}_i\right|/N, $$
(2)
where we set the vertical component of all vectors to zero, as we only considered two-dimensional movement. The rotation takes values between 0 (no rotation) and 1 (strong rotation).
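As an illustration of Eqs. (1) and (2), the following sketch computes both order parameters for a single time point. The function names are hypothetical. With the vertical component set to zero, only the vertical component of each cross product is non-zero, so the sum reduces to a scalar. The sketch assumes that every fish is moving (so that a unit direction exists) and that no fish sits exactly on the centre of mass.

```python
import math

def unit(vx, vy):
    """Normalise a two-dimensional vector to unit length."""
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm)

def polarisation(directions):
    """Eq. (1): length of the summed unit direction vectors, divided by N."""
    n = len(directions)
    sx = sum(u[0] for u in directions)
    sy = sum(u[1] for u in directions)
    return math.hypot(sx, sy) / n

def rotation(positions, directions):
    """Eq. (2): absolute value of the summed cross products u_i x r_i, divided by N."""
    n = len(positions)
    cx = sum(p[0] for p in positions) / n
    cy = sum(p[1] for p in positions) / n
    total = 0.0
    for (px, py), (ux, uy) in zip(positions, directions):
        rx, ry = unit(px - cx, py - cy)   # unit vector from centre of mass to fish
        total += ux * ry - uy * rx        # vertical component of u x r
    return abs(total) / n

# Usage: four fish circling their common centre of mass (made-up values).
positions = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
directions = [(0.0, 1.0), (-1.0, 0.0), (0.0, -1.0), (1.0, 0.0)]
o_p = polarisation(directions)
o_r = rotation(positions, directions)
```

In this made-up configuration the movement directions cancel while all fish circle in the same sense, which corresponds to low polarisation and high rotation.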
We used these order parameters to obtain an overview of the shoal-level organisation and movement structure in our guppy and stickleback data. However, the polarisation and rotation do not capture important aspects of social behaviour. For example, if only pairs of fish interacted by moving consistently at the same speed this would not necessarily be captured by shoal-level measures. Therefore, we developed statistical models to characterise social movement in more detail.
Statistical movement models
The guppy trajectories shown in Fig. 1 already suggested high variability in fish movement. At any given time, some fish may move, possibly in a group, while others remain stationary. To quantitatively investigate and further characterise these movements, we fit statistical models to our data. In contrast to the shoal-level order parameters introduced above, we considered individuals’ speeds in our models rather than their movement directions. Modelling speeds had the advantage that they were easily measured and that, in contrast to movement directions, they did not need to be expressed as non-trivially truncated probability distributions whenever fish were near tank walls (as seen in Fig. 1). An additional argument for considering speeds is the empirical evidence suggesting that speed regulation dominates fish interactions (Katz et al. 2011). We developed three separate models. Models 1 and 2 assumed that individual fish move independently of each other. Using model 3, we tested for social interactions between fish. By comparing how well model 3 explained our data relative to the other models, we provide quantitative evidence for the existence and ubiquity of speed-mediated social interactions in our guppy and stickleback data.
Model 1 is a simple baseline model that has previously been used for individual speeds in fish (Aoki 1982). In this model, we assumed that the individual speed of a guppy at time t, denoted by V(t), can be modelled by a gamma distribution with constant mean μ and standard deviation σ:
$$ \text{Model 1:}\quad V(t)\sim \Gamma \left(\mu, \sigma \right) $$
(3)
Here, μ and σ are model parameters to be estimated by fitting the model to our data. We used gamma distributions to model speeds throughout, because they capture the mean and variance with one parameter each and only support values greater than or equal to zero, as required for speeds. In model 1, the speed of an individual therefore does not depend on the speeds of other individuals and the mean speed of individuals does not change over time. Under this framework, individuals may temporarily be stationary, but longer time periods of maintaining the same speed are highly unlikely.
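To make the mean and standard deviation parameterisation concrete: a gamma distribution with shape k and scale θ has mean kθ and variance kθ², so that k = μ²/σ² and θ = σ²/μ. The sketch below uses this conversion to evaluate the log-likelihood of model 1. The function names are hypothetical, and the sketch requires strictly positive speeds; how speeds of exactly zero were handled in the original analysis is not stated here.

```python
import math

def gamma_shape_scale(mean, sd):
    """Convert (mean, standard deviation) to the (shape, scale) of a gamma distribution."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return shape, scale

def gamma_logpdf(v, mean, sd):
    """Log density of Gamma(mean, sd) at a speed v > 0."""
    if v <= 0:
        raise ValueError("this sketch only handles strictly positive speeds")
    shape, scale = gamma_shape_scale(mean, sd)
    return ((shape - 1.0) * math.log(v) - v / scale
            - math.lgamma(shape) - shape * math.log(scale))

def model1_log_likelihood(speeds, mean, sd):
    """Model 1: all speeds are independent draws from one gamma distribution."""
    return sum(gamma_logpdf(v, mean, sd) for v in speeds)

# Usage with made-up speeds (cm/s) and made-up parameter values.
ll = model1_log_likelihood([1.2, 0.8, 3.5, 2.1], mean=2.0, sd=1.5)
```

Maximising `model1_log_likelihood` over `mean` and `sd` with a numerical optimiser would give the maximum likelihood estimates for model 1.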
To allow for changes in mean individual speed over time (e.g. stationary versus movement phases) and to incorporate possibly intermittent social behaviour, we extended model 1 by incorporating additional behavioural states into our models. To this end, we used the well-established framework of Hidden Markov Models (HMMs) (Zucchini et al. 2016) and built on previous work on animal group movement (Langrock et al. 2014). In models 2 and 3 we assumed that individuals display two or three behavioural states, respectively, that we cannot observe directly, and that they switch between these states according to transition probabilities that only depend on the state they are currently in (Markov property). Each behavioural state is associated with parameterised probability distributions for individual speeds. Given data and using established methodology, it is possible to estimate the parameters of these distributions, as well as the transition probabilities (see below). We refer the reader to general textbooks for further background on HMMs and algorithmic details (Zucchini et al. 2016).
In model 2, we assumed individuals move independently from each other and display two distinct behavioural states, both with constant mean and standard deviation:
$$ \begin{array}{lll} \text{Model 2:} & \text{State 1:} & V(t)\sim \Gamma \left({\mu}_1,{\sigma}_1\right)\\ & \text{State 2:} & V(t)\sim \Gamma \left({\mu}_2,{\sigma}_2\right)\end{array} $$
(4)
This model has six parameters: two means, two standard deviations and two parameters capturing the four transition probabilities (since the probabilities of remaining in the current state and switching to the other state sum to one).
Model 3 extended model 2 with an additional behavioural state that captures social interactions by assuming that the speed of individuals depends on the speed of their nearest neighbour, $V_{n.n.}(t)$:
$$ \begin{array}{lll} \text{Model 3:} & \text{State 1:} & V(t)\sim \Gamma \left({\mu}_1,{\sigma}_1\right)\\ & \text{State 2:} & V(t)\sim \Gamma \left({\mu}_2,{\sigma}_2\right)\\ & \text{State 3:} & V(t)\sim \Gamma \left({V}_{n.n.}(t),{\sigma}_3\right)\end{array} $$
(5)
This model has eleven parameters: two means, three standard deviations and six parameters capturing the nine transition probabilities (since the probabilities of leaving or remaining in each of the three states sum to one). For simplicity, we assumed that the mean speed of individuals in state 3 is the current speed of the nearest neighbour, rather than its speed a short time ago. We argue that the high auto-correlation in speed time series in combination with short reaction times justifies this approach. We also assumed that while the mean of speeds in state 3 could vary, the standard deviation, $\sigma_3$, was constant. Large means in gamma distributions are often associated with large variances and vice versa. Thus, our assumption could affect the fit of model 3 to the data, but we nevertheless decided to keep our model as simple as possible.
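The state-dependent distributions of model 3 can be sketched as follows for one fish at one time point. This is an illustration only: the function names are hypothetical, the nearest neighbour is taken to be the fish at the smallest Euclidean distance in the tank plane, and the sketch assumes strictly positive speeds for both the focal fish and its neighbour, because a gamma distribution with mean zero is not defined.

```python
import math

def gamma_logpdf(v, mean, sd):
    """Log density at v > 0 of a gamma distribution with given mean and standard deviation."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return ((shape - 1.0) * math.log(v) - v / scale
            - math.lgamma(shape) - shape * math.log(scale))

def nearest_neighbour_speed(i, positions, speeds):
    """Speed of the fish closest to fish i at a single time point."""
    xi, yi = positions[i]
    best_j, best_dist = None, math.inf
    for j, (xj, yj) in enumerate(positions):
        if j == i:
            continue
        dist = math.hypot(xj - xi, yj - yi)
        if dist < best_dist:
            best_j, best_dist = j, dist
    return speeds[best_j]

def model3_log_densities(v, v_nn, mu1, sd1, mu2, sd2, sd3):
    """Log densities of speed v in states 1 to 3; the state 3 mean is the neighbour's speed."""
    return [gamma_logpdf(v, mu1, sd1),
            gamma_logpdf(v, mu2, sd2),
            gamma_logpdf(v, v_nn, sd3)]

# Usage with made-up positions (cm), speeds (cm/s) and parameter values.
positions = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
speeds = [1.0, 2.0, 3.0]
v_nn = nearest_neighbour_speed(0, positions, speeds)
log_dens = model3_log_densities(speeds[0], v_nn, 0.5, 0.4, 4.0, 2.0, 1.0)
```

Because the mean in state 3 changes at every time point while the standard deviation stays fixed, the shape of the state 3 distribution varies with the neighbour's speed.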
We used a maximum likelihood approach to fit models 1–3 to our guppy and stickleback data (see also supplemental discussion, section S1.2). Our implementation in the R programming environment (version 3.01; R Core Team 2012) builds on previous work (Langrock et al. 2014). As explained above, there were gaps in our speed time series due to missing data. We accounted for these by separately computing the contribution to the likelihood of uninterrupted time series segments. In technical terms, we re-started the forward algorithm using the stationary distribution of the underlying Markov chain for behavioural states every time there was a gap in our data and multiplied the resulting likelihood contributions to obtain the overall likelihood.
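The treatment of gaps can be illustrated with a short sketch of the scaled forward algorithm. It is written in Python for illustration and is not the R implementation used in the study; all names are hypothetical. The stationary distribution is approximated by repeatedly applying the transition matrix, which is a simplification that assumes the chain converges, and the example emission densities are those of model 2.

```python
import math

def gamma_pdf(v, mean, sd):
    """Gamma density at v > 0, parameterised by mean and standard deviation."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return math.exp((shape - 1.0) * math.log(v) - v / scale
                    - math.lgamma(shape) - shape * math.log(scale))

def stationary_distribution(P, n_iter=1000):
    """Approximate the stationary distribution by iterating pi <- pi P."""
    k = len(P)
    pi = [1.0 / k] * k
    for _ in range(n_iter):
        pi = [sum(pi[j] * P[j][i] for j in range(k)) for i in range(k)]
    return pi

def segment_log_likelihood(segment, P, delta, densities):
    """Scaled forward algorithm for one uninterrupted segment of speeds."""
    k = len(P)
    d = densities(segment[0])
    phi = [delta[s] * d[s] for s in range(k)]
    total = sum(phi)
    log_lik = math.log(total)
    phi = [p / total for p in phi]
    for v in segment[1:]:
        d = densities(v)
        phi = [sum(phi[j] * P[j][s] for j in range(k)) * d[s] for s in range(k)]
        total = sum(phi)
        log_lik += math.log(total)   # accumulate the scaling factors
        phi = [p / total for p in phi]
    return log_lik

def log_likelihood(segments, P, densities):
    """Restart the forward algorithm from the stationary distribution after every gap."""
    delta = stationary_distribution(P)
    return sum(segment_log_likelihood(seg, P, delta, densities) for seg in segments)

def model2_densities(mu1, sd1, mu2, sd2):
    """State-dependent densities of model 2 as a function of the observed speed."""
    return lambda v: [gamma_pdf(v, mu1, sd1), gamma_pdf(v, mu2, sd2)]

# Usage: two uninterrupted segments separated by a gap (made-up values).
P = [[0.9, 0.1], [0.2, 0.8]]
segments = [[0.4, 0.6, 2.5], [3.1, 2.8]]
ll = log_likelihood(segments, P, model2_densities(0.5, 0.4, 3.0, 1.5))
```

Multiplying the likelihood contributions of the segments corresponds to summing their logarithms, which is what `log_likelihood` does.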
To compare the relative fit of the three models to the data, we computed the Akaike Information Criterion (AIC) from the maximised likelihood of each model. Lower AIC values suggest better model fit and the AIC penalises models with more parameters. We used a permutation test to assess whether the change in AIC, denoted ΔAIC, between models 2 and 3 was larger than expected by chance. Specifically, we tested the hypothesis that the observed ΔAIC was no larger than we would expect under random pairings of individual speeds and nearest neighbour speeds. To compute a p-value, we fit model 3 to randomised data in which the $(V(t), V_{n.n.}(t))$ pairings had been shuffled and recorded whether the ΔAIC of this model fit relative to model 2 was larger than the ΔAIC observed for the original data. Computing the fraction of times this was the case across replicate repetitions of this procedure yielded our p-value. We used a similar randomisation procedure to assess the robustness of our parameter estimates (see supplementary information).
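The model comparison can be sketched as follows, using the standard definition AIC = 2k − 2 ln L for k parameters and maximised likelihood L. The names are hypothetical, as are the number of permutations `n_perm` and the callback `delta_aic_fn`, which stands in for fitting model 3 to a given pairing and returning its ΔAIC relative to model 2. The sketch assumes that ΔAIC is defined such that larger values indicate a greater improvement of model 3 over model 2.

```python
import random

def aic(max_log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * max_log_likelihood

def permutation_p_value(v, v_nn, delta_aic_fn, n_perm=1000, seed=0):
    """Fraction of shuffled pairings whose delta AIC exceeds the observed one.

    delta_aic_fn(v, v_nn) is a hypothetical callback that fits model 3 to the
    given (V(t), V_nn(t)) pairing and returns its delta AIC relative to model 2.
    """
    rng = random.Random(seed)
    observed = delta_aic_fn(v, v_nn)
    exceed = 0
    for _ in range(n_perm):
        shuffled = list(v_nn)
        rng.shuffle(shuffled)          # break the pairing of V(t) and V_nn(t)
        if delta_aic_fn(v, shuffled) > observed:
            exceed += 1
    return exceed / n_perm

# Usage with made-up maximised log-likelihoods for models 2 and 3.
delta_aic = aic(-1520.0, 6) - aic(-1490.0, 11)

# Toy stand-in for the model fits: rewards pairings with similar speeds.
toy_delta = lambda a, b: -sum((x - y) ** 2 for x, y in zip(a, b))
p = permutation_p_value([1.0, 2.0, 3.0, 4.0], [1.1, 2.1, 2.9, 4.2], toy_delta, n_perm=200)
```

In practice each call to `delta_aic_fn` involves a full maximum likelihood fit, so the number of permutations is limited by computation time.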
In addition to presenting a way to test for the existence and ubiquity of social interactions, model 3 also allowed us to classify individual fish behaviour into the three behavioural states. After fitting the model, we applied the Viterbi algorithm to speed time series to infer the most likely behavioural state sequence of individuals. Briefly, for a fitted HMM and given a sequence of observations, the algorithm determines the most likely sequence of hidden states using the estimated probability distributions for speeds in the different states and the transition probabilities between states. From this analysis, we thus obtained for a given time series of individual speeds a sequence of inferred behavioural states that individuals were in. We used the Viterbi-decoded occurrence of state 3 to characterise speed-mediated social behaviour.
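The decoding step can be illustrated with a minimal log-space sketch of the Viterbi algorithm. The function names are hypothetical, states are indexed from 0 (so state 3 of model 3 would have index 2), and the inputs are assumed to be the log densities of each observation in each state, the log transition probabilities and the log initial state probabilities of an already fitted model.

```python
import math

def viterbi(log_dens, log_P, log_delta):
    """Most likely hidden state sequence.

    log_dens[t][k]: log density of observation t in state k
    log_P[j][k]:    log probability of switching from state j to state k
    log_delta[k]:   log probability of starting in state k
    """
    n_states = len(log_delta)
    score = [log_delta[k] + log_dens[0][k] for k in range(n_states)]
    back_pointers = []
    for t in range(1, len(log_dens)):
        new_score, pointers = [], []
        for k in range(n_states):
            best_j = max(range(n_states), key=lambda j: score[j] + log_P[j][k])
            new_score.append(score[best_j] + log_P[best_j][k] + log_dens[t][k])
            pointers.append(best_j)
        score = new_score
        back_pointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back_pointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

def state_fraction(path, state):
    """Fraction of time points decoded as the given state."""
    return sum(1 for s in path if s == state) / len(path)

# Usage with made-up densities: three observations favour state 0, then three favour state 1.
log_dens = [[math.log(0.9), math.log(0.1)]] * 3 + [[math.log(0.1), math.log(0.9)]] * 3
log_P = [[math.log(0.9), math.log(0.1)], [math.log(0.1), math.log(0.9)]]
log_delta = [math.log(0.5), math.log(0.5)]
path = viterbi(log_dens, log_P, log_delta)
```

Applying `state_fraction` to the decoded path with the index of the social state gives one simple summary of how often that state occurs.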