Tracking data and movement model
Tracking data of bar-headed geese were available to us from a broader disease and migration ecology study implemented by the Food & Agriculture Organization of the United Nations (FAO) and the U.S. Geological Survey (USGS). In total, 91 individuals were captured during the years 2007–2009 in several locations: Lake Qinghai in China (hereafter termed “Lake Qinghai”), Chilika Lake and Koonthankulum bird sanctuary in India (hereafter termed “India”), and Terkhiin Tsagaan Lake, Mongolia (hereafter termed “West Mongolia”). All individuals were equipped with ARGOS-GPS tags (ARGOS PTT-100; Microwave Telemetry, Columbia, Maryland, USA), which were programmed to record the animals’ location every 2 h. Eighty of the deployed tags collected and transmitted data for \(241 \pm 253\) days (mean ± SD). In total, 169,887 fixes were acquired over the course of the tracking period (Table 1 and Takekawa et al. 2009; Hawkes et al. 2011). Individuals that were tracked for less than a complete year were excluded from the subsequent analyses, which left a total of 66 individuals (Lake Qinghai: 20, India: 20, West Mongolia: 26). We pooled data from all capture sites for the analyses.
Table 1 A summary of the catching sites and corresponding sample sizes
We used the recently developed empirical Random Trajectory Generator (eRTG, Technitis et al. 2016) to simulate the migrations of unobserved individuals of bar-headed geese. This movement model is conditional, i.e. it simulates the movement between two end locations with a fixed number of steps based on a dynamic drift derived from a step-wise joint probability surface. One main advantage of the eRTG is that the trajectories it simulates retain the geometric characteristics of the empirical tracking data (step length, turning angle, as well as covariance and auto-correlation of step length and turning angle), as it relies entirely on empirical distribution functions. Consequently, if a destination cannot be reached within the limits of the empirical distributions of e.g. step lengths and turning angles, the simulation fails rather than forcing the last step towards the destination.
We extended this movement model by incorporating a stochastic switch between the two main states of bar-headed goose migration, non-stop migratory flights (“migratory state”) and movements during staging periods at stopover locations (“stopover state”). We classified the entire tracking data according to the individuals’ movement behaviour to identify these states prior to extracting the empirical distribution functions for the eRTG. First, we clustered the locations in the tracking data using an expectation-maximisation binary clustering algorithm designed for annotating animal movement data (EMbC, Garriga et al. 2016). The EMbC divided the trajectories of bar-headed geese into four behavioural classes (slow speed & low turning angles, slow speed & high turning angles, high speed & low turning angles, and high speed & high turning angles), which we then re-classified into two behavioural classes, namely high-speed movements (combining the two high-speed classes) and low-speed movements (combining the two low-speed classes). Within the high-speed behavioural cluster, the average speed between locations was \(8.4 \pm 6.7\,\text {m\,s}^{-1}\) (mean ± SD), whereas the average speed for the low-speed behavioural cluster was \(0.3 \pm 1.0\,\text {m\,s}^{-1}\) (mean ± SD). As estimates of speed and turning angle are highly dependent on the sampling rate of the data, we removed those parts of the trajectories in which the time between fixes exceeded the average sampling interval of 2 h. Subsequently, we used the low-speed locations for the empirical distribution functions for the stopover state of the two-state eRTG, and the locations classified as high-speed for the empirical distribution functions for the migratory state of the eRTG (see Figure S2). Finally, we derived the step lengths and turning angles from each coherent stretch of data (i.e. only consecutive fixes with a sampling interval of 2 h).
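The restriction to coherent stretches can be illustrated with a minimal Python sketch. It assumes fix times expressed in hours and accepts a small tolerance around the 2 h interval; the function name `coherent_stretches` and the tolerance `tol_h` are hypothetical choices for illustration and are not taken from the original analysis.

```python
def coherent_stretches(timestamps_h, interval_h=2.0, tol_h=0.25):
    """Group fix indices into runs whose consecutive gaps match the interval.

    timestamps_h: fix times in hours, sorted ascending.
    tol_h: accepted deviation from the nominal interval (an assumption).
    Returns a list of index lists; isolated single fixes are dropped.
    """
    stretches = []
    current = [0] if timestamps_h else []
    for i in range(1, len(timestamps_h)):
        gap = timestamps_h[i] - timestamps_h[i - 1]
        if abs(gap - interval_h) <= tol_h:
            current.append(i)
        else:
            # Gap deviates from the schedule: close the current stretch.
            if len(current) > 1:
                stretches.append(current)
            current = [i]
    if len(current) > 1:
        stretches.append(current)
    return stretches


# Hypothetical fix times with two gaps longer than 2 h.
example = coherent_stretches([0, 2, 4, 10, 12, 14, 16, 30])
```

In this example the fixes at 0 to 4 h and at 10 to 16 h form two separate stretches, and the isolated fix at 30 h is discarded.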
Following this, we calculated the changes in step length and turning angle at a lag of one observation, as well as the covariance between contemporary observations of step length and turning angle. We derived the corresponding empirical distribution functions for both movement states and prepared them for use in the eRTG functions.
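As an illustration of the quantities described above, the following sketch computes step lengths, turning angles, their lag-one changes, and the covariance between contemporary step lengths and turning angles. It assumes locations that have already been projected to planar coordinates in metres; all function names are hypothetical, and the eRTG itself works with empirical distribution functions rather than these summary values alone.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def step_metrics(points):
    """Return (step lengths, turning angles) for planar (x, y) points."""
    pairs = list(zip(points, points[1:]))
    steps = [math.dist(p, q) for p, q in pairs]
    headings = [math.atan2(q[1] - p[1], q[0] - p[0]) for p, q in pairs]
    # A turning angle is the change in heading between consecutive steps.
    turns = [wrap_angle(h1 - h0) for h0, h1 in zip(headings, headings[1:])]
    return steps, turns


def lag1_changes(values, angular=False):
    """Differences between consecutive observations (lag of one)."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    return [wrap_angle(d) for d in diffs] if angular else diffs


def covariance(xs, ys):
    """Sample covariance of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical track: three unit steps with two left turns of 90 degrees.
steps, turns = step_metrics([(0, 0), (1, 0), (1, 1), (0, 1)])
# Pair each turning angle with the step that follows the turn.
step_turn_cov = covariance(steps[1:], turns)
```

For geographic coordinates, the planar distance would need to be replaced by a great-circle distance or the data projected first.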
Finally, we determined the duration of staging periods, and the duration and cumulative distance of individual migratory legs from the tracking data. We first identified seasonal migration events between breeding and wintering grounds (and vice versa) in the empirical trajectories using the behavioural annotation. We then determined migratory legs (sequential locations classified as migratory state) as well as stopovers (sequential locations classified as stopover state, with a duration \(> 12\,\text {h}\)). We used two main proxies to characterise migratory legs, namely cumulative migratory distance and duration, and one proxy to characterise staging periods, namely stopover duration. We calculated these proxies for all individuals and migrations, and determined the maximum observed distance (\(\text {dm}_{\text {max}}\)) and duration (\(\text {Tm}_{\text {max}}\)) of a migratory leg. As we did not distinguish extended staging (e.g. during moult, or after unsuccessful breeding attempts) from the use of stopover locations during migration, we calculated the \(95\%\) quantile of the observed stopover durations (\(\text {Ts}_{\text {max}}\)) rather than the maximum.
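The derivation of these three parameters can be sketched as follows. The sketch assumes one state label and one step length per 2 h fix; the labels, the function names, and the linear-interpolation quantile are illustrative assumptions rather than the procedure used for the published analysis.

```python
from itertools import groupby

FIX_INTERVAL_H = 2.0  # sampling interval of the tags


def summarise_runs(states, step_km):
    """Split a state sequence into migratory legs and stopovers.

    states[i] labels step i as "migratory" or "stopover";
    step_km[i] is the length of that step in km.
    Returns (legs, stopovers): legs as (duration_h, distance_km) tuples,
    stopovers as durations in hours, keeping only those longer than 12 h.
    """
    legs, stopovers = [], []
    start = 0
    for state, run in groupby(states):
        n = len(list(run))
        duration = n * FIX_INTERVAL_H
        if state == "migratory":
            legs.append((duration, sum(step_km[start:start + n])))
        elif duration > 12.0:
            stopovers.append(duration)
        start += n
    return legs, stopovers


def quantile(values, q):
    """Quantile by linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


# Hypothetical annotated track: two legs, one long and one short stopover.
states = ["migratory"] * 3 + ["stopover"] * 7 + ["migratory"] * 2 + ["stopover"] * 2
step_km = [50, 60, 40] + [1] * 7 + [70, 30] + [1, 1]
legs, stopovers = summarise_runs(states, step_km)

dm_max = max(distance for _, distance in legs)
tm_max = max(duration for duration, _ in legs)
ts_max = quantile(stopovers, 0.95)
```

The short 4 h stay at the end of the example track falls below the 12 h threshold and is therefore not counted as a stopover.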
Simulating a bar-headed goose migration with the two-state eRTG
When simulating a conditional random trajectory between two arbitrary locations a and z, the two-state eRTG initially draws from the distribution functions for the migratory state, producing a fast, directed trajectory. To determine the time available for moving from a to z, we assumed the mean empirical flight speed derived for the migratory state, and calculated the number of required steps accordingly. While simulating the trajectory, after each step modelled by the eRTG, the cumulative distance of the trajectory as well as the duration since the start of the migratory leg were calculated. By using cumulative distance and duration as well as the empirically derived \(\text {dm}_{\text {max}}\) and \(\text {Tm}_{\text {max}}\), our two-state eRTG was based on a binomial experiment with two possible outcomes: switching to the stopover state with a probability of \(p_{ms}\), or resuming migration with a probability of \(1-p_{ms}\). We defined \(p_{ms}\), the transition probability to switch from migratory state to stopover state, as
$$\begin{aligned} p_{ms}(t) = \frac{\sum _{i=0}^{t}(dm)}{dm_{\text {max}}} \times \frac{\sum _{i=0}^{t}(Tm)}{Tm_{\text {max}}} \end{aligned}$$
(1)
where \(\text {dm}\) and \(\text {Tm}\) represent the distance and duration between two consecutive locations during a migratory leg. At step t, the simulation of the migratory movement can switch to the unconditional stopover state, corresponding to a correlated random walk, with a probability of \(p_{ms}(t)\). Likewise, the simulation can switch back from stopover state to migratory state with the probability \(p_{sm}(t)\), which we defined as
$$\begin{aligned} p_{sm}(t) = \left( \frac{\sum _{i=0}^{t}(Ts)}{Ts_{\text {max}}}\right) ^2 \end{aligned}$$
(2)
where \(\text {Ts}\) represents the duration between two consecutive locations during a stopover. This process is repeated until the simulation terminates, either because the trajectory has reached its destination or because the step-wise joint probability surface does not allow the destination to be reached with the remaining number of steps (resulting in a dead end, i.e. zero probability).
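The state switching defined by Eqs. 1 and 2 can be sketched in Python as below. This covers only the sequence of states, not the spatial simulation performed by the eRTG. Several points are assumptions made for illustration: a constant step length replaces draws from the empirical distributions, probabilities are clipped at 1, and the counters are reset whenever the state changes. All names are hypothetical.

```python
import random


def p_ms(cum_dist, cum_time, dm_max, tm_max):
    """Eq. 1: probability of switching from migratory to stopover state.

    Clipping at 1 is an assumption of this sketch.
    """
    return min(1.0, (cum_dist / dm_max) * (cum_time / tm_max))


def p_sm(cum_time, ts_max):
    """Eq. 2: probability of switching from stopover to migratory state."""
    return min(1.0, (cum_time / ts_max) ** 2)


def simulate_states(n_steps, step_km, dt_h, dm_max, tm_max, ts_max, rng):
    """Return the state of each step for a fixed number of steps.

    step_km is a constant step length (an assumption; the eRTG draws
    step lengths from empirical distribution functions).
    """
    state = "migratory"
    cum_dist = cum_time = 0.0
    states = []
    for _ in range(n_steps):
        states.append(state)
        cum_time += dt_h
        if state == "migratory":
            cum_dist += step_km
            if rng.random() < p_ms(cum_dist, cum_time, dm_max, tm_max):
                state = "stopover"
                cum_dist = cum_time = 0.0  # a new stopover begins
        elif rng.random() < p_sm(cum_time, ts_max):
            state = "migratory"
            cum_dist = cum_time = 0.0  # a new migratory leg begins


    return states


# Hypothetical parameter values, chosen only to show the call signature.
sequence = simulate_states(
    n_steps=48, step_km=60.0, dt_h=2.0,
    dm_max=1500.0, tm_max=30.0, ts_max=240.0,
    rng=random.Random(42),
)
```

Because both probabilities grow with the time since the last switch, short legs and short stopovers are unlikely to end early, while legs approaching the empirical maxima are almost certain to be interrupted.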
Evaluating the plausibility of simulated migrations
We estimated the plausibility of each simulated trajectory, representing a unique migratory route, using a measure we called route viability \(\Phi\), which aims to integrate the ecological context into the movement simulations. We developed this measure specifically with the stepping-stone migratory strategy of bar-headed geese or similar species in mind, and it is defined by the time spent in migratory mode, the time spent at stopover sites, and the habitat suitability of the respective utilised stopover sites. For this specific measure of route viability, we made two main assumptions: (1) it is desirable to reach the destination quickly, i.e. staging at a stopover site comes at the cost of delaying migration, and (2) the cost imposed by delaying migration is inversely proportional to the quality of the stopover site, i.e. the use of superior stopover sites can counterbalance the delay. Our argument for these assumptions is that during spring migration, the arrival of the geese at the breeding grounds needs to be well-timed with the phenology of their major food resources (Bauer et al. 2008). Furthermore, the quality of stopover sites has been shown to be of crucial importance for other species of geese with similar migratory strategies (Green et al. 2002; Drent et al. 2007).
Each simulated multi-state trajectory between two arbitrary locations a and z can be characterised by a total migration duration \(\tau _{a,z}\), which consists of the total flight time \(\tau _{M,a,z}\) and the total staging time at stopover sites \(\tau _{S,a,z}\). The total flight time \(\tau _{M,a,z}\) is the sum of the time spent flying during each migratory leg l, and is thus \(\tau _{M,a,z} \, = \sum _{l=0}^{n}t_M(l)\), with \(t_M(l)\) corresponding to the time spent flying during migratory leg l. Similarly, the total staging time \(\tau _{S,a,z}\) consists of the staging times at all visited stopover sites, corresponding to \(\tau _{S,a,z} \, = \sum _{k=0}^{n}t_S(k)\), where \(t_S(k)\) amounts to the staging time at stopover site k. For our metric of route viability, we consider the time spent staging at stopover locations \(\tau _{S,a,z}\) as a delay compared to the time spent in flight. This delay is, however, mediated by the benefit b an individual gains at the stopover site from replenishing its fat reserves. We define this benefit gained by staying at stopover site k, b(k), as proportional to the time spent at site k, \(t_S(k)\), and the habitat suitability of site k, S(k). This habitat suitability S should lie within [0, 1], so that our measure of route viability also lies within [0, 1]. We further assume the effects of several sequential stopovers to be cumulative, and thus define the total benefit of a migratory trajectory between locations a and z with n stopovers as \(B_{a,z} \, = \sum _{k=0}^{n}S(k) \times t_{S}(k)\). Finally, we define the route viability \(\Phi _{a,z}\) of any trajectory between a and z as:
$$\begin{aligned} \Phi _{a,z} \, = \frac{\tau _{M,a,z}}{\tau _{M,a,z} + \tau _{S,a,z} - B_{a,z}} \, = \frac{\tau _{M,a,z}}{\tau _{a,z} - B_{a,z}} \end{aligned}$$
(3)
Thus, the viability of a trajectory with no stopovers and a trajectory with stopovers of the highest possible quality (\(S(k)=1\)) will be equal, and is defined solely by the time the individual spent in migratory state (\(\Phi _{a,z}=1\)). For trajectories with stopovers in less than optimal sites, however, the viability of trajectories is relative to both the staging duration and quality of stopover sites, and should take values of \(\frac{\tau _{M,a,z}}{\tau _{a,z}}< \Phi _{a,z} < 1\). Using this metric, we assessed simulated trajectories in a way that is biologically meaningful for bar-headed geese. In the next section, we detail how we calculated the route viability \(\Phi\) for each simulated migration.
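A short numerical illustration of Eq. 3 may help. The sketch below assumes each stopover is given as a pair of staging time and habitat suitability; the function name and the example values are hypothetical.

```python
def route_viability(flight_time_h, stopovers):
    """Route viability (Eq. 3) for one trajectory.

    flight_time_h: total time in migratory state (tau_M).
    stopovers: iterable of (t_S, S) pairs, with staging time t_S in hours
               and habitat suitability S in [0, 1].
    """
    stopovers = list(stopovers)
    for _, suitability in stopovers:
        if not 0.0 <= suitability <= 1.0:
            raise ValueError("habitat suitability must lie within [0, 1]")
    staging = sum(t for t, _ in stopovers)       # tau_S
    benefit = sum(t * s for t, s in stopovers)   # B
    return flight_time_h / (flight_time_h + staging - benefit)


# Hypothetical trajectories, each with 40 h of flight.
no_stopover = route_viability(40.0, [])
perfect_site = route_viability(40.0, [(24.0, 1.0)])
mixed_sites = route_viability(40.0, [(20.0, 0.5), (20.0, 0.0)])
```

The first two cases both yield a viability of 1, whereas the third yields 40 / (40 + 40 - 10), approximately 0.57, which lies between the lower bound of 40 / 80 = 0.5 and 1.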
A migratory connectivity network for bar-headed geese
We simulated migrations of bar-headed geese within the native range of the species, which occurs naturally in Central Asia (68–107\(^{\circ }\)E, 9–52\(^{\circ }\)N). According to BirdLife International and NatureServe (2013), both the breeding and wintering range are separated into four distinct range fragments (see also Figure S1), with minimum distances between range fragments ranging from 79 km to 2884 km. For this study, we investigated how well, in terms of an environmentally informed measure of route viability and the number of stopovers required to reach a range fragment, these range fragments can be connected by simulated migrations of bar-headed geese.
To choose start- and endpoints for the simulated migrations, we sampled ten random locations from each of the range fragments indicated in the distribution data provided by BirdLife International and NatureServe (2013). We simulated 1000 trajectories for all pairs of range fragments (100 trajectories per location pair) and counted the number of successes (trajectories that reached the destination) and failures (trajectories that terminated in a dead end). We then calculated the viability of simulated routes as follows. First, we determined the total duration of the trajectory between locations a and z, \(\tau _{a,z}\), the number of stopover sites used, \(n_{a,z}\), as well as the time spent at each stopover site, \(t_{S}(k)\), for each of the \(n_{a,z}\) stopovers (corresponding to the number of steps multiplied by the location interval of 2 h). We determined the habitat suitability of stopover locations S(k) using habitat suitability landscapes for bar-headed geese during five periods of the year (see Figure S3): winter/early spring (mid-November–February), mid-spring (mid-March–mid-April), late spring/summer (mid-April–mid-August), early autumn (mid-August–mid-September), and late autumn (mid-September–mid-November). We identified these periods using a segmentation by habitat use (van Toor et al. 2016, for details see Section A in the Electronic Supplementary Material (ESM)). The segmentation-by-habitat-use procedure uses animal location data and associated environmental information to identify time periods for which habitat use is consistent. Habitat suitability models derived for these time periods should thus reflect differences in habitat use by bar-headed geese throughout the year.
We used time series of remotely sensed environmental information and Random Forest models (Breiman 2001) to derive habitat suitability models corresponding to these five time periods, and predicted the corresponding habitat suitability landscapes (Section A in the ESM). Following the prediction of habitat suitability landscapes for winter/early spring, mid-spring, late spring/summer, early autumn, and late autumn, we annotated all stopover state locations of the simulated trajectories with the corresponding habitat suitability. We then calculated the benefit b gained by using a stopover location k using the mean suitability for each of the stopover locations, S(k), and the duration spent at stopover locations, \(t_S(k)\).
To calculate the route viability \(\Phi _{a,z}\), we also required an estimate of the duration of migration if a simulation used the migratory state exclusively, \(\tau _{M,a,z}\), without the utilisation of stopover sites. We used a simple linear model to predict flight time as a function of geographic distance, which we trained on the empirical data derived from the migratory legs (see Section B in the ESM for details). By basing the linear model on the empirical migratory legs rather than mean flight speed, the estimate for \(\tau _M\) retains the inherent tortuosity of waterbird migrations. For each simulated trajectory, we then calculated the geographic distance between its start- and endpoint, and predicted the expected flight time \(\tau _{M,a,z}\). Finally, we calculated route viability \(\Phi _{a,z}\) for all trajectories using Eq. 3, repeating the process for each of the five suitability landscapes derived from the segmentation by habitat use. This resulted in five different values of \(\Phi _{a,z}\) for every simulated trajectory, corresponding to winter/early spring, mid-spring, late spring/summer, early autumn, and late autumn, respectively.
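The prediction of flight time from distance can be sketched as follows. The sketch assumes a great-circle (haversine) distance on a sphere of radius 6371 km and an ordinary least-squares line with an intercept; neither choice is confirmed by the text above, and the training values are invented purely to show the mechanics.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Invented migratory legs: distance (km) and flight time (h).
leg_distance_km = [200.0, 400.0, 600.0, 800.0]
leg_time_h = [5.0, 9.0, 13.0, 17.0]
intercept, slope = fit_line(leg_distance_km, leg_time_h)

# Expected flight time for a hypothetical start- and endpoint.
distance = haversine_km(90.0, 35.0, 85.0, 27.0)
tau_m = intercept + slope * distance
```

The fitted line would then supply \(\tau _{M,a,z}\) for every simulated trajectory from the distance between its endpoints.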
Calculating migratory connectivity as average route viability
We calculated migratory connectivity between range fragments as the average route viability \(\Phi _{\text {avg.}}\) of all trajectories connecting two range fragments. We calculated this average by using non-parametric bootstrapping on the median route viability \(\Phi _{\text {avg.}}\) (using 1000 replicates), and also computed the corresponding \(95\%\) confidence intervals (CI) of the median route viability \(\Phi _{\text {avg.}}\). We did this for each of the five time periods represented in the suitability landscapes, and also computed an overall migratory connectivity by averaging all five habitat suitability values for each stopover site prior to calculating \(\Phi\).
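A non-parametric bootstrap of the median can be sketched as below. The sketch assumes a percentile confidence interval, which the text above does not specify, and uses invented viability values; the function name and its arguments are hypothetical.

```python
import random
import statistics


def bootstrap_ci(values, stat=statistics.median, n_rep=1000, alpha=0.05, seed=0):
    """Point estimate and percentile bootstrap CI of a statistic.

    values: observed data (e.g. route viabilities of all trajectories).
    stat: statistic to bootstrap; the median by default.
    Returns (estimate, lower, upper).
    """
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and recompute the statistic each time.
    replicates = sorted(stat(rng.choices(values, k=n)) for _ in range(n_rep))
    lower_idx = int(n_rep * alpha / 2)
    upper_idx = n_rep - 1 - lower_idx
    return stat(values), replicates[lower_idx], replicates[upper_idx]


# Invented route viabilities for one pair of range fragments.
phi_values = [0.5, 0.6, 0.7, 0.8, 0.9]
phi_avg, ci_low, ci_high = bootstrap_ci(phi_values)
```

Passing a different function as `stat`, for example `statistics.stdev`, applies the same resampling scheme to another summary statistic.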
We wanted to compare migratory connectivity within the breeding range and migratory connectivity in the wintering range to test our first hypothesis stating that migratory connectivity should be higher within the breeding range. To do so, we differentiated between route viability among breeding range fragments (\(\Phi_{\text{breeding}}\)), among the wintering areas (\(\Phi_{\text{wintering}}\)), and between breeding and wintering range fragments (\(\Phi_{\text{mixed}}\)). We computed the median and 95% CIs of route viability with non-parametric bootstrapping with 1000 replicates, using the average habitat suitability of all five suitability landscapes for all trajectories within the breeding range, all trajectories in the wintering range, and all trajectories connecting breeding range fragments with wintering range fragments.
To test our second hypothesis, stating that variation in migratory connectivity throughout the year should be higher in the breeding range than in the wintering range, we calculated the standard deviation of route viability for the five suitability landscapes in the breeding range and in the wintering range. We did this by again differentiating trajectories in the wintering range, trajectories in the breeding range, and trajectories connecting breeding range fragments with wintering range fragments. We computed route viability \(\Phi\) for each of the five suitability landscapes for all trajectories, and pooled the corresponding values for \(\Phi _{\text {late \, winter/early\, spring}}\), \(\Phi _{\text {mid-spring}}\), \(\Phi _{\text {late \, spring/summer}}\), \(\Phi _{\text {early \,autumn}}\), and \(\Phi _{\text {late\, autumn}}\) separately for the wintering range, for the breeding range, and for trajectories connecting breeding range fragments with wintering range fragments. We then used non-parametric bootstrapping (1000 replicates) on the standard deviation over the five time periods, and determined the corresponding 95% CIs on the standard deviation.
Calculating route viability for empirical migrations
Following the procedure described above, we annotated the stopover locations of empirical migrations with the habitat suitability of the corresponding time period, and calculated the route viability for these migratory trajectories in the same way as for the simulated trajectories. We then used non-parametric bootstrapping on the median route viability for all empirical migrations (\(\Phi _{\text {emp., total}}\)), only spring migrations (\(\Phi _{\text {emp., spring}}\)) and only autumn migrations (\(\Phi _{\text {emp.,autumn}}\)), and computed 95% CIs for the median of \(\Phi _{\text {emp., total}}\), \(\Phi _{\text {emp., spring}}\), and \(\Phi _{\text {emp., autumn}}\).