# Incorporating Telemetry Error into Hidden Markov Models of Animal Movement Using Multiple Imputation


## Abstract

When data streams are observed without error and at regular time intervals, discrete-time hidden Markov models (HMMs) have become immensely popular for the analysis of animal location and auxiliary biotelemetry data. However, measurement error and temporally irregular data are often pervasive in telemetry studies, particularly in marine systems. While relatively small amounts of missing data that are missing-completely-at-random are not typically problematic in HMMs, temporal irregularity can result in few (if any) observations aligning with the regular time steps required by HMMs. Fitting HMMs that explicitly account for uncertainty attributable to location measurement error, temporally irregular observations, or other forms of missing data typically requires computationally demanding techniques, such as Markov chain Monte Carlo (MCMC). Using simulation and a real-world bearded seal (*Erignathus barbatus*) example, I investigate a practical alternative to incorporating measurement error and temporally irregular observations into HMMs based on multiple imputation of the position process drawn from a single-state continuous-time movement model. This two-stage approach is relatively simple, performed with existing software using efficient maximum likelihood methods, and completely parallelizable. I generally found the approach to perform well across a broad range of simulated measurement error and irregular sampling rates, with latent states and locations reliably recovered in nearly all simulated scenarios. However, high measurement error coupled with low sampling rates often induced bias in both the estimated probability distributions of data streams derived from the imputed position process and the estimated effects of spatial covariates on state transition probabilities. 
Results from the two-stage analysis of the bearded seal data were similar to a more computationally intensive single-stage MCMC analysis, but the two-stage analysis required much less computation time and no custom model-fitting algorithms. I thus found the two-stage multiple-imputation approach to be promising in terms of its ease of implementation, computation time, and performance. Code for implementing the approach using the R package “momentuHMM” is provided.

Supplementary materials accompanying this paper appear online.

## Keywords

Argos satellite telemetry; Bearded seal; Biotelemetry; crawl; momentuHMM; moveHMM; Multistate models

## 1 Introduction

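An HMM is a time series model in which the observations \(({x_{1},\ldots ,x_{T}})\) are generated by *Z* state-dependent probability distributions, where the unobservable (hidden) state sequence (\(z_{t} \in \{{1,\ldots ,Z}\}\) for \(t=1,\ldots ,T\)) is assumed to be a Markov chain. The state sequence of the Markov chain is governed by (typically first-order) state transition probabilities, \(\gamma _{ij}^{(t)} =\hbox {Pr}({z_{t+1} =j\mid z_t =i} )\) for \(i,j=1,\ldots ,Z\), and an initial distribution \({\varvec{\delta }} ^{(1)}\). The likelihood of an HMM can be succinctly expressed using the forward algorithm:

\[ \mathcal{L}={\varvec{\delta }}^{(1)}\mathbf{P}({x_{1}}){\varvec{\Gamma }}^{(1)}\mathbf{P}({x_{2}})\cdots {\varvec{\Gamma }}^{(T-1)}\mathbf{P}({x_{T}})\mathbf{1}, \qquad (1) \]

where \({\varvec{\Gamma }}^{(t)}\) is the \(Z\times Z\) matrix of transition probabilities \(\gamma _{ij}^{(t)}\), \(\mathbf{P}({x_{t}})\) is the \(Z\times Z\) diagonal matrix of state-dependent densities evaluated at \(x_{t}\), and \(\mathbf{1}\) is a *Z*-vector of ones (see Zucchini et al. 2016 for a thorough introduction to HMMs).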
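As a minimal sketch of how the forward algorithm evaluates this likelihood, the following Python function computes the log-likelihood with scaling at each time step to avoid numerical underflow. It assumes a time-homogeneous transition matrix for simplicity, and the function and argument names are illustrative rather than taken from any package. A missing observation is represented by `None` and contributes a factor of one for every state, which is equivalent to using the identity matrix at that time step.

```python
import math

def hmm_log_likelihood(delta, gamma, densities):
    """Scaled forward algorithm for a Z-state HMM (illustrative sketch).

    delta     : initial state distribution, length Z
    gamma     : Z x Z transition matrix, gamma[i][j] = Pr(z_{t+1}=j | z_t=i)
                (assumed constant over time here)
    densities : list of length T; entry t holds the Z state-dependent
                densities evaluated at x_t, or None if x_t is missing
    """
    Z = len(delta)

    def dens(p, z):
        # A missing observation contributes 1 for every state.
        return 1.0 if p is None else p[z]

    # alpha_1 = delta P(x_1)
    alpha = [delta[z] * dens(densities[0], z) for z in range(Z)]
    total = sum(alpha)
    log_lik = math.log(total)
    alpha = [a / total for a in alpha]

    for p in densities[1:]:
        # alpha_t = alpha_{t-1} Gamma P(x_t), rescaled to sum to one
        alpha = [
            sum(alpha[i] * gamma[i][j] for i in range(Z)) * dens(p, j)
            for j in range(Z)
        ]
        total = sum(alpha)
        log_lik += math.log(total)
        alpha = [a / total for a in alpha]

    return log_lik
```

Because the scaling constants are accumulated on the log scale, the returned value is the log of the full matrix product, and it can be passed to a general-purpose optimizer for maximum likelihood estimation.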

While HMMs for animal movement based solely on location data are somewhat limited in the number and type of biologically meaningful movement behavior states they are able to accurately identify (Morales et al. 2004; Beyer et al. 2013; Bagniewska et al. 2013; McClintock et al. 2014), multivariate HMMs that utilize both location and auxiliary biotelemetry data (e.g., McClintock et al. 2013; Russell et al. 2015; DeRuiter et al. 2016) can facilitate the identification of additional states that go beyond the two-state approaches that are most frequently used by practitioners (“encamped” and “exploratory” states *sensu* Morales et al. 2004 or “foraging” and “transit” states *sensu* Jonsen et al. 2005).

When data streams are observed without error and at regular time intervals, a major advantage of HMMs is the relatively fast and efficient maximization of the likelihood using the forward algorithm (Zucchini et al. 2016), and user-friendly software is available for fitting movement HMMs under these circumstances (e.g., Michelot et al. 2016a). However, location measurement error is rarely nonexistent and depends on both the telemetry device and the system under study. For example, GPS errors are typically less than 50 m, but Argos errors can exceed 10 km (Costa et al. 2010; Silva et al. 2014). Missing location or auxiliary biotelemetry data typically arise when transmitters cannot communicate with satellites at temporally regular intervals. While missing data can be a problem in terrestrial systems (e.g., in canyons or dense forest), it can often be pervasive in marine environments. If some \(X_{t}\) are missing, these data gaps are often ignored in maximum likelihood analyses (by replacing \(\mathbf{P}({x_{t}})\) with the \(Z\times Z\) identity matrix) and thus do not contribute information to the estimation of state-dependent probability distribution parameters (e.g., Zucchini et al. 2016), but if data are frequently missing or not missing-completely-at-random, this strategy could have undesirable inferential consequences (Nakagawa and Freckleton 2008). Another strategy is to impute missing data using simple linear interpolation (e.g., Russell et al. 2015; DeRuiter et al. 2016; Michelot et al. 2016b), although the reliability of this approach is poorly understood. While relatively small amounts of missing data that are missing-completely-at-random are not typically problematic in HMMs, an extreme case of missing data can arise when location data are obtained with little or no temporal regularity, as in many marine mammal telemetry studies (e.g., Jonsen et al. 2005), such that few (if any) observations align with the regular time steps required by discrete-time HMMs. 
Temporal irregularity is therefore different from the conventional HMM missing data scenario where data streams are consistently observed at regular time intervals, but some of these observations happen to be missing.

When explicitly accounting for uncertainty attributable to location measurement error, temporally irregular observations, or other forms of missing data, one must typically fit HMMs using computationally intensive (and often time-consuming) model-fitting techniques such as Markov chain Monte Carlo (Jonsen et al. 2005; McClintock et al. 2013; Turek et al. 2016). For example, a recent multivariate HMM analysis of bearded seal (*Erignathus barbatus*) telemetry data that incorporated six behavior states, seven data streams, location measurement error, temporal irregularity, and missing auxiliary data required several *weeks* to fit using MCMC (McClintock et al. 2017). Given the time and money typically expended in deploying animal-borne telemetry devices, one could posit that such an “expensive” analysis is entirely justified. However, complex analyses requiring novel statistical methods and custom model-fitting algorithms are not practical for many of the biologists and ecologists conducting these studies.

Here I investigate a practical approach to incorporating measurement error and temporal irregularity into HMMs for animal movement using multiple imputation (Rubin 1987; Hooten et al. 2017). This two-stage approach is relatively simple and can be performed with existing software using efficient maximum likelihood methods. After describing the approach in more detail in the next section, I investigate its performance properties in a series of simulation experiments. I then approximate the bearded seal analysis of McClintock et al. (2017) using multiple imputation and compare the results.

## 2 Methods

The multiple-imputation approach for incorporating location measurement error and temporally irregular or missing observations into animal movement HMMs consists of two stages. The basic concept is to first employ a single-state (i.e., \(Z=1\)) movement model that is relatively easy to fit but can accommodate location measurement error and temporally irregular or missing observations. The second stage involves repeatedly fitting an HMM to *n* temporally regular realizations of the position process drawn from the model output of the first stage. These temporally regular realizations of the position process constitute the data “we wish we had” if the observation process were not subject to location measurement error and temporally irregular or missing observations. Inferences about behavior states (e.g., state decoding, state probabilities, “activity budgets”), transition probabilities, and state-dependent probability distribution parameters are based on a pooling of the *n* imputed data HMM analyses using standard multiple-imputation formulae (Rubin 1987).

For the first stage of the analysis, the continuous-time correlated random walk model of Johnson et al. (2008) is very well suited and easily implemented in the R package “crawl” (Johnson 2016) using maximum likelihood methods, although any movement model that accommodates both location measurement error and temporally irregular or missing observations could be fitted to the telemetry locations during this stage. Some advantages of the Johnson et al. (2008) model in “crawl” are its speed, flexible measurement error model specification, and ease of repeatedly drawing realizations of the temporally regular position process from the model output.

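In the second stage, an HMM is fitted to each of the *n* imputed data sets using standard methods. The HMM data streams need not be limited to step length or turning angle (e.g., Morales et al. 2004), and auxiliary biotelemetry data streams that inform behavior states can also be included at this stage (e.g., dive activity, altitude, heart rate; see McClintock et al. 2013). Data streams or covariates that are dependent on location (e.g., step length, turning angle, habitat type, snow depth, sea surface temperature) will of course vary among the *n* realizations of the position process, and the pooled inferences across the HMM analyses will therefore reflect location uncertainty. Pooled point and variance estimates can be calculated as

\[ \bar{\theta } =\frac{1}{n}\sum _{i=1}^{n} \theta ^{(i)} \qquad (2) \]

and

\[ \hbox {var}({\bar{\theta }}) =\frac{1}{n}\sum _{i=1}^{n} \hbox {var}({\theta ^{(i)}}) +\left( {1+\frac{1}{n}}\right) \frac{1}{n-1}\sum _{i=1}^{n} \left( {\theta ^{(i)} -\bar{\theta }}\right) ^{2}, \qquad (3) \]

where \(\theta ^{(i)}\) and \(\hbox {var}({\theta ^{(i)}})\) are the point and variance estimates from the *i*th imputed data set (Rubin 1987).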
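The standard multiple-imputation pooling rules (Rubin 1987) can be sketched in a few lines of Python. This is an illustrative helper, not code from “momentuHMM”; the function name and return values are assumptions. It returns the pooled point estimate, the total variance (average within-imputation variance plus inflated between-imputation variance), and the degrees of freedom commonly used for *t*-based confidence intervals.

```python
import math

def pool_estimates(estimates, variances):
    """Pool n imputation-specific estimates with Rubin's rules (sketch).

    estimates : point estimates of a scalar parameter, one per imputation
    variances : corresponding variance estimates
    Returns (pooled estimate, pooled variance, degrees of freedom).
    """
    n = len(estimates)
    if n < 2 or len(variances) != n:
        raise ValueError("need at least two imputations with matching variances")

    theta_bar = sum(estimates) / n                      # pooled point estimate
    within = sum(variances) / n                         # mean within-imputation variance
    between = sum((e - theta_bar) ** 2 for e in estimates) / (n - 1)
    total = within + (1.0 + 1.0 / n) * between          # pooled variance

    # Degrees of freedom for a t-based interval; infinite when the
    # imputations agree exactly (no between-imputation variance).
    if between > 0:
        df = (n - 1) * (1.0 + within / ((1.0 + 1.0 / n) * between)) ** 2
    else:
        df = math.inf
    return theta_bar, total, df
```

For example, three estimates of 1, 2, and 3, each with variance 0.5, pool to a point estimate of 2 with variance 0.5 + (4/3)(1) ≈ 1.83, so most of the pooled uncertainty in that case comes from disagreement among the imputations.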

## 3 Simulation Study

### 3.1 Simulation Methods

Table 1. Simulation inputs for evaluating performance of multiple imputation as an approach to incorporating location measurement error and temporally irregular observations into hidden Markov models of animal movement. Behavior states included “foraging” (*F*), “transit” (*T*), “hauled out on ice” (*I*), and “resting at sea” (*S*). State-dependent data streams included step length \(({s_{z}})\), turning angle \(( {\phi _{z}})\), number of dives \(({d_{z}})\), proportion dive time \(({w_{z}})\), and proportion dry time \(( {v_{z}})\) for \(z\in \{{F,T,I,S}\}\) and were generated from Gamma(shape, scale), wrapped Cauchy [wCauchy(mean, concentration)], Poisson(rate), or beta (shape1, shape2) distributions. Each simulation was based on \(N=7\) simulated individual tracks. Three sets of simulations were conducted. The “two-state with covariate” simulations were similar to the “two-state” simulations, but included a spatially correlated covariate on transition probabilities.

| Data stream | Two-state | Two-state with covariate | Four-state |
|---|---|---|---|
| Step length | \(s_{F} \sim \hbox {Gamma}({0.25,400})\) | \(s_{F} \sim \hbox {Gamma}({0.25,400})\) | \(s_{F} \sim \hbox {Gamma}({100,5})\) |
| | \(s_{T} \sim \hbox {Gamma}({25,40})\) | \(s_{T} \sim \hbox {Gamma}({25,40})\) | \(s_{T} \sim \hbox {Gamma}({100,50})\) |
| | | | \(s_I ,s_S \sim \hbox {Gamma}( {1,10} )\) |
| Turning angle | \(\phi _F \sim \hbox {wCauchy}( {0,0.00} )\) | \(\phi _F \sim \hbox {wCauchy}( {0,0.00} )\) | \(\phi _F \sim \hbox {wCauchy}( {0,0.10} )\) |
| | \(\phi _T \sim \hbox {wCauchy}( {0,0.75} )\) | \(\phi _T \sim \hbox {wCauchy}( {0,0.75} )\) | \(\phi _T \sim \hbox {wCauchy}( {0,0.75} )\) |
| | | | \(\phi _I ,\phi _S \sim \hbox {wCauchy}( {0,0.85} )\) |
| No. of dives | | | \(d_F \sim \hbox {Poisson}( {20} )\) |
| | | | \(d_T \sim \hbox {Poisson}(5)\) |
| | | | \(d_I ,d_S \sim \hbox {Poisson}( 1 )\) |
| Dive time | | | \(w_F ,w_T \sim \hbox {Beta}( {10,1} )\) |
| | | | \(w_I ,w_S \sim \hbox {Beta}( {1,10} )\) |
| Dry time | | | \(v_F ,v_T ,v_S \sim \hbox {Beta}( {1,10} )\) |
| | | | \(v_I \sim \hbox {Beta}( {10,1} )\) |

Simulated tracks consisted of *T* regular time steps. Temporal irregularity was introduced by allowing observations \(({{{\varvec{y}}}_{k}})\) to fall somewhere along the straight line between the temporally regular locations for each time step.

The first set used simulated location data typical of the most commonly employed two-state HMMs of animal movement (e.g., Morales et al. 2004; Jonsen et al. 2005) consisting of an area-restricted-search-type state (i.e., “encamped” or “foraging”) and a high-speed, directionally persistent state (i.e., “exploratory” or “transit”). Based on the results of Beyer et al. (2013) demonstrating HMMs can perform poorly when movement behavior states are not sufficiently distinct, the probability distributions for step length and turning angle were chosen such that they had little overlap between states (Table 1). The transition probability matrix for these simulations was \( {\varvec{\Gamma }}= \left[ {{\begin{array}{ll} {0.8}&{} {0.2} \\ {0.1}&{} {0.9} \\ \end{array}}}\right] \), where each element \(\gamma _{i,j} \) is the probability of switching from state *i* at time *t* to state *j* at time \(t+1\) for \(i,j\in \{{1=\hbox {``foraging''},2=\hbox {``transit''}}\}\). Temporal irregularity was simulated by assuming the wait times between the observed locations followed an exponential distribution with rate \(\lambda \), which can be interpreted as the expected number of (temporally irregular) locations observed between (temporally regular) time steps *t* and \(t+1\). I limited the design points to \(\lambda =2, 1\), and 0.5, but also included a design point assuming temporal regularity (i.e., \(T+1\) observations occurring at temporally regular times \(t=1,\ldots ,T+1)\). Location measurement error was assumed to arise from a bivariate normal distribution based on an error ellipse with semi-major axis of length *M*, semi-minor axis of length *m*, and orientation *c* (McClintock et al. 2015). For simplicity, I assumed \(M=m, c=0\), and limited the design points to \(M=50\) (low measurement error relative to state-dependent scales of movement), \(M=500\) (moderate error), \(M=1500\) (high error), and \(M=3000\) (extreme error).
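The observation process described above can be sketched as follows, assuming exponential wait times with rate \(\lambda\), linear interpolation between consecutive temporally regular locations, and independent normal errors with variance \(M^{2}/2\) on each coordinate (the \(M=m\), \(c=0\) case). The function name, the choice to start the clock at \(t=1\), and the per-coordinate variance are assumptions made for illustration and are not taken from the code accompanying the paper.

```python
import math
import random

def simulate_observations(track, rate, M, seed=0):
    """Simulate temporally irregular, error-prone observations (sketch).

    track : list of (x, y) true locations at regular times t = 1, ..., T+1
    rate  : lambda, the expected number of observations per time step
    M     : error ellipse semi-axis length (assumes M = m and c = 0)
    Returns a list of (time, x_obs, y_obs) tuples.
    """
    rng = random.Random(seed)
    sd = M / math.sqrt(2.0)        # assumed variance M^2 / 2 per coordinate
    t_end = len(track)             # the last regular time is T + 1
    obs = []
    t = 1.0 + rng.expovariate(rate)
    while t <= t_end:
        # Index of the regular location at or before time t.
        i = min(int(t) - 1, len(track) - 2)
        frac = t - (i + 1)         # fraction of the way to the next location
        (x0, y0), (x1, y1) = track[i], track[i + 1]
        x = x0 + frac * (x1 - x0)
        y = y0 + frac * (y1 - y0)
        obs.append((t, x + rng.gauss(0.0, sd), y + rng.gauss(0.0, sd)))
        t += rng.expovariate(rate)  # exponential wait until the next fix
    return obs
```

With `rate=0.5`, roughly one observation is expected every two time steps, which corresponds to the sparsest sampling design point considered here.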

The second set of simulations was identical to the first except for one key difference. For these scenarios, state transition probabilities were assumed to be a function of a spatially correlated covariate, whereby simulated individuals were more likely to switch to (and remain in) the “foraging” state \(({z_t =1})\) at locations with larger covariate values. The transition probability matrix for these simulations was \( {\varvec{\Gamma }}^{(t)}=\left[ {{\begin{array}{ll} {1-\gamma _{1,2}^{\left( t \right) }}&{} {\gamma _{1,2}^{\left( t \right) }} \\ {\gamma _{2,1}^{\left( t \right) }}&{} {1-\gamma _{2,1}^{\left( t \right) }} \\ \end{array} }} \right] \), where \(\gamma _{1,2}^{(t)} =\hbox {logit}^{-1}({5-25c_{t}}), \gamma _{2,1}^{( t)} =\hbox {logit}^{-1}({-10+50c_{t}})\), and \(c_{t} \in [{0,1}]\) is the spatial covariate value corresponding to the animal’s location at time *t*.
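To make the covariate effect concrete, the short Python sketch below evaluates this transition probability matrix for a given covariate value. The function names are illustrative; only the coefficients come from the text.

```python
import math

def inv_logit(x):
    """Inverse logit (logistic) function."""
    return 1.0 / (1.0 + math.exp(-x))

def transition_matrix(c):
    """Two-state transition matrix for covariate value c in [0, 1].

    State 1 is "foraging" and state 2 is "transit".
    """
    g12 = inv_logit(5.0 - 25.0 * c)     # foraging -> transit
    g21 = inv_logit(-10.0 + 50.0 * c)   # transit -> foraging
    return [[1.0 - g12, g12],
            [g21, 1.0 - g21]]
```

Both linear predictors equal zero at \(c_{t}=0.2\), so every entry of the matrix is 0.5 there. Below that value individuals tend to leave and stay out of the “foraging” state, and above it they tend to switch to and remain in that state.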

The third set of simulations was motivated by multivariate HMMs that utilize both location and additional data streams to inform behavioral states of ice-associated seals (e.g., McClintock et al. 2017). Data were generated under \(Z=4\) behavior states (1 \(=\) “hauled out on ice,” 2 \(=\) “resting at sea,” 3 \(=\) “foraging,” and 4 \(=\) “transit”) and characterized by 6 data streams: step length, turning angle, number of dives, proportion dive time, proportion dry time, and proportion sea ice cover (see Table 1). Because the step length distribution, turning angle distribution, and transition probabilities for the “hauled out on ice” and “resting at sea” states were identical, I simply simulated state sequences for three states \(\left( {{\varvec{\Gamma }}=\left[ {{\begin{array}{lll} {0.65}&{} {0.35}&{} {0.01} \\ {0.09}&{} {0.82}&{} {0.09} \\ {0.04}&{} {0.24}&{} {0.72} \\ \end{array}}}\right] }\right) \) and then delineated these two non-diving states based on whether or not the sea ice concentration at the initial location of the corresponding time step was \(>0.05\). A spatially correlated sea ice concentration grid was simulated using the R package “gstat” (Pebesma 2004). The second stage of the multiple-imputation approach requires more computation time with four states, so I limited the measurement error scenarios to \(M\in \{{50,500,1500}\}\) for this set of simulations.

- (1) Separately fit the continuous-time correlated random walk movement model of Johnson et al. (2008) to the observed data \(( {{\varvec{y}}})\) for each of the \(N=7\) individuals using the crwMLE() function in the R package “crawl” (Johnson 2016). Assume a bivariate normal error ellipse model for each observed location \({{\varvec{y}}}_{k} \sim N({{\varvec{\mu }}_{k},{\varvec{\varSigma }}_{k}})\), where \({\varvec{\mu }} _{k} \) is the true location and \({\varvec{\varSigma }}_{k} =\left( {{\begin{array}{ll} {\frac{M^{2}}{2}}&{} 0 \\ 0&{} {\frac{M^{2}}{2}} \\ \end{array}}}\right) \) at time \(k\in [{1,T+1}]\).
- (2) Use the “crawl” function crwPostIS() to draw *n* samples from the posterior distribution of the position process \(\left( {{\varvec{\mu }} _{t}^{(i)} ,i=1,\ldots ,n} \right) \) at temporally regular intervals \((t=1,2,\ldots ,T+1)\) conditional on the fitted parameters for each individual from step 1. For each of the *n* realizations of the position process, calculate step lengths, turning angles, and any other data streams or covariates that depend on location.
- (3) Estimate the model parameters by fitting the corresponding HMM (consisting of the “Data stream” distributions in Table 1) to each of the *n* imputed data sets using maximum likelihood methods. For each of the *n* HMM fits, use the Viterbi algorithm to estimate the most likely state sequence and the forward–backward algorithm to estimate state probabilities for each time step (Zucchini et al. 2016).
- (4) Use Eqs. (2) and (3) to calculate pooled parameter estimates, variances, and 95% *t*-distributed confidence intervals (Rubin and Schenker 1986).
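The location-dependent data streams in step 2 are recomputed for every realization of the position process. As a minimal sketch, assuming planar (projected) coordinates and illustrative function names, step lengths and turning angles can be derived from a temporally regular track as follows.

```python
import math

def steps_and_angles(track):
    """Step lengths and turning angles for a regular track (sketch).

    track : list of (x, y) locations at consecutive regular time steps
    Returns (steps, angles); angles are changes in bearing wrapped to
    (-pi, pi], so there is one fewer angle than there are steps.
    A zero-length step has no defined bearing; atan2(0, 0) returns 0
    here, and such angles should be treated as missing in practice.
    """
    steps, bearings = [], []
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        steps.append(math.hypot(x1 - x0, y1 - y0))
        bearings.append(math.atan2(y1 - y0, x1 - x0))

    angles = []
    for b0, b1 in zip(bearings, bearings[1:]):
        d = b1 - b0
        # Wrap the difference in bearings onto (-pi, pi].
        angles.append(math.atan2(math.sin(d), math.cos(d)))
    return steps, angles
```

Applying this function to each of the *n* imputed tracks yields *n* versions of the step length and turning angle data streams, which is how location uncertainty propagates into the second-stage HMM fits.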

*n* HMM fits. For “pooled” point and variance estimates of the state probabilities for each time step \(( {q_{z,t} ,z\in \{{1,\ldots ,Z}\},t=1,\ldots ,T} ),\) I respectively used Eqs. (2) and (3) assuming \(\frac{1}{n}\sum _{i=1}^{n} \hbox {var}({\theta ^{(i)}})=0\).

When fitting the four-state HMMs, I included the imputed sea ice concentration data as an additional data stream modeled as a (state-dependent) beta distribution. This of course is not how the data were generated, and a more sophisticated approach would be to incorporate sea ice concentration into the transition probabilities for the non-diving states (“hauled out on ice” and “resting at sea”), possibly using a hidden semi-Markov model (Zucchini et al. 2016). However, hidden semi-Markov models are more challenging to implement than HMMs, and my goal was to evaluate this approximation as a similar strategy was used by McClintock et al. (2017) to help distinguish different types of resting behavior in bearded seals.

I evaluated the performance of the multiple-imputation approach based on its ability to estimate the unobserved state sequence, the parameters of the state-dependent probability distributions, and the temporally regular locations \(({{\varvec{\mu }}_{t}})\). Classification accuracy was evaluated based on the proportion of time steps for the estimated most likely state sequence that matched the true state, a measure of agreement between the true and estimated states that takes chance agreement into account (Congalton 1991), and the proportion of estimated state probabilities \(( {q_{z,t}})\) with at least 0.05 probability assigned to the true state. I also compared the multiple-imputation approach to an HMM fitted to the single most likely position process predicted by each of the “crawl” model fits (hereafter referred to as “the single-imputation approach”). For all three sets of simulations, I first fit the single-imputation model using the true parameters as starting values and then used the estimated parameters from the single-imputation model fit as starting values for the multiple-imputation model fits. I used similar parameter constraints to those used by McClintock et al. (2017) to avoid label switching among the *n* HMMs fitted for each simulation scenario.
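The classification metrics described above can be sketched in Python as follows. The chance-corrected agreement measure is assumed here to be Cohen's kappa, the statistic discussed by Congalton (1991); the function names and the zero-based state indexing in the second function are illustrative choices.

```python
def accuracy_and_kappa(true_states, est_states):
    """Proportion of correctly classified time steps and Cohen's kappa."""
    n = len(true_states)
    labels = set(true_states) | set(est_states)
    # Observed agreement.
    p_o = sum(t == e for t, e in zip(true_states, est_states)) / n
    # Agreement expected by chance from the marginal state frequencies.
    p_e = sum((true_states.count(k) / n) * (est_states.count(k) / n)
              for k in labels)
    kappa = (p_o - p_e) / (1.0 - p_e) if p_e < 1.0 else float("nan")
    return p_o, kappa

def true_state_coverage(state_probs, true_states, threshold=0.05):
    """Proportion of time steps with at least `threshold` probability
    assigned to the true state.

    state_probs : list of per-time-step probability vectors
    true_states : true state index (0-based) for each time step
    """
    hits = sum(p[z] >= threshold for p, z in zip(state_probs, true_states))
    return hits / len(true_states)
```

Kappa equals one for perfect agreement and is near zero when the estimated sequence matches the truth no more often than expected from the marginal state frequencies, which guards against inflated accuracy when one state dominates the track.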

For all scenarios within each set of simulations, \(n=400\) imputations were performed. To provide some insight into the number of imputations that may be required in practice, pooled estimates from the \(n=400\) model fits were compared to those of randomly selected subsets of \(n=30\) and \(n=5\) imputations. All analyses were performed in R (R Core Team 2016) using an extension of the “moveHMM” package (Michelot et al. 2016a) that is currently under development (https://github.com/bmcclintock/momentuHMM). The “momentuHMM” package allows for additional data streams, as well as user-specified design matrices and constraints for the state-dependent probability distribution parameters. R code for simulating data and implementing the two-stage analyses using “momentuHMM” is provided in ESM of Appendix A.

### 3.2 Simulation Results

For the two-state simulations that included a spatial covariate on state transition probabilities \(({T=7000})\), each individual HMM typically required about 20 s \((\hbox {range}=6\)–\(40\,\hbox {s})\) to converge. All \(n=400\) imputed data analyses required about 19 min \((\hbox {range}=12\)–\(24\,\hbox {min})\) when run in parallel on a 3.7 GHz processor with seven cores. Both the single- and multiple-imputation approaches generally performed well, and in most cases the inclusion of a spatial covariate on state transition probabilities made the models more robust to location measurement error and temporal irregularity relative to the two-state scenarios without a covariate. Across all scenarios, the multiple-imputation approach tended to outperform the single-imputation approach in terms of state (Fig. 2) and probability density estimation (ESM of Appendix B). As with the two-state simulations without a covariate, both approaches were increasingly unable to recover the true step length and turning angle distributions as measurement error increased and sampling rate decreased. Similarly, the effect of the spatial covariate on transition probabilities was increasingly underestimated as measurement error increased and sampling rate decreased (ESM of Appendix B). The multiple-imputation approach tended to estimate 95% confidence bands that covered \({\varvec{\mu }}_{t} \), but mean coverage when \(M=50\) m was as low as 81, 84, and 86% for \(\lambda =2\), 1, and 0.5 (ESM of Appendix B), and the mean errors in these estimated locations were 81.1, 142.3, and 280.1 m, respectively.

I was somewhat surprised that measurement error and sampling rate did not have a larger impact on identification of the two non-diving states (“hauled out on ice” and “resting at sea”) in the four-state simulations. I therefore performed an identical analysis excluding the “dry time” data stream from the fitted HMMs and thus leaving sea ice concentration as the sole data stream distinguishing these two non-diving states. However, I still found that measurement error and sampling rate did not have a deleterious effect on state or location estimation (ESM of Appendix C), and it appears the beta distribution approximation for sea ice used by McClintock et al. (2017) can be useful for identifying different types of resting behavior in ice-associated seals.

For all three sets of simulations, pooled estimates based on all \(n=400\) and a subset of \(n=30\) randomly selected imputations were virtually indistinguishable. However, while pooled estimates based on \(n=5\) imputations generally performed better than the single-imputation approach, they did not perform as well as the pooled estimates from \(n=400\) or \(n=30\) imputations, particularly in the estimation of the true underlying position process (ESM of Appendix B).

## 4 Bearded Seal Example

### 4.1 Example Methods

The bearded seal is a documented benthic forager whose life history is intricately linked with sea ice, but the nature of that relationship and other ecological connections with habitat is poorly understood. McClintock et al. (2017; hereafter MLCB) recently conducted a single-stage analysis of bearded seal biotelemetry data that incorporated six behavior states (“hauled out on ice,” “resting at sea,” “hauled out on land,” “mid-water foraging,” “benthic foraging,” and “transit”), seven data streams (step length, bearing, proportion dive time, proportion dry time, number of benthic dives, proportion sea ice cover, and proportion land cover), location measurement error and temporal irregularity, and missing auxiliary biotelemetry data. To formally account for these sources of uncertainty, MLCB formulated their six-state hierarchical HMM as a Bayesian model using a complete data likelihood that conditions on the unobserved states. Their analysis required substantial model development, a custom MCMC model-fitting algorithm written in the C programming language, and several weeks of run time to achieve their target convergence diagnostics. While here I compare the two-stage approach to that of MLCB, it deserves note that alternative single-stage model specifications, model-fitting algorithms, or software could lead to decreases (or increases) in computation times relative to MLCB (e.g., Turek et al. 2016).

The bearded seal data consist of \(N=7\) individuals deployed with Argos (Service Argos 2013) satellite-linked tags near Kotzebue, Alaska, USA, between 2009 and 2012. In addition to location data, the tags were equipped with sensors for recording dive and wet/dry data from which the dive time, dry time, and the number of benthic dive data streams were calculated at 6-hr time steps over the duration of each deployment. Given the nature of the Argos platform, bearded seal diving behavior leading to limited or irregular exposure to satellites, and limited bandwidth for transferring data, the location data were subject to measurement error and temporal irregularity, while the auxiliary biotelemetry data were subject to missing or incomplete records. The Argos error ellipses were overwhelmingly oriented toward the *x*-axis, with mean semi-major axis \(M= 11252\,\hbox {m}\,(\hbox {median}=4174\,\hbox {m}, \hbox {SD}=36190)\), semi-minor axis \(m=493\,\hbox {m}\,(\hbox {median}=239\,\hbox {m}, \hbox {SD}=4894)\), and orientation \(c=90^{\circ } \,(\hbox {median}=89^{\circ }, \hbox {SD}=20)\). There was an average of 7.2 \((\hbox {SD}=5.5)\) location observations per 6-h time step, with 18% of time steps having no observed locations. Full details of the data can be found in London (2016).

I approximated the model implemented by MLCB using the same single- and multiple-imputation approaches described in *Simulation methods*. The six-state HMM assumes step length \(s_{t} \mid z_{t} =i\sim \hbox {Gamma}({a_{i},b_{i}}),\) turning angle \(\phi _{t} \mid z_{t} =i\sim \hbox {wCauchy}({0,\rho _{i}})\), and the number of benthic dives \(d_{t} \mid z_{t} =i\sim \hbox {Poisson}( {r_{i}})\) for \(z\in \{{I,S,L,M,B,T}\}\), where *I* denotes “hauled out on ice,” *S* denotes “resting at sea,” *L* denotes “hauled out on land,” *M* denotes “mid-water foraging,” *B* denotes “benthic foraging,” *T* denotes “transit,” and \(\hbox {wCauchy}( {0,\rho _{z}})\) denotes a wrapped Cauchy distribution with mean zero and concentration parameter \(\rho _z \in ( {-1,1} )\). For the data streams corresponding to proportions \((w_t =\hbox {dive}\, \hbox {time}, v_{t} =\hbox {dry}\, \hbox {time}, c_{t} =\hbox {sea}\, \hbox {ice}\, \hbox {cover}, l_{t} =\hbox {land}\, \hbox {cover})\), the HMM assumes \(f_{t} \mid z_{t} =i\sim \hbox {Beta}\left( {\upsilon _{i}^{f},\delta _{i}^{f}}\right) \) for \(f\in \{ {w,v,c,l} \}\). To facilitate comparisons and avoid label switching among the *n* HMM fits, I used the same constraints on the state-dependent probability distribution parameters as MLCB.

Because these tags produced both GPS and Argos locations, the error model for the first-stage “crawl” model fits depended on the location type. For GPS locations, I used a bivariate normal model \({{\varvec{y}}}_{k} \sim N({{\varvec{\mu }}_{k} ,{\varvec{\varSigma }}_{k}})\) assuming \(M=50\). For the Argos locations, I used a bivariate normal model based on the Argos error ellipse (McClintock et al. 2015). For each realization of the position process, the step length, turning angle, sea ice cover, and land cover data streams were calculated at temporally regular time steps matching the 6-hr resolution of the auxiliary biotelemetry data. Unlike the other biotelemetry data streams, the number of benthic dives \(({d_t})\), defined as the number of (presumably foraging) dives to the sea floor, was not directly observable because the exact locations (and thus sea floor depth) during each 6-hr time step were unknown. I therefore calculated the number of benthic dives based on the sea floor depths at the start and end locations for each time step (see MLCB for further details on how benthic dives were calculated). All analyses were performed in R using the “momentuHMM” package.

There are seven key differences between the “full” treatment of MLCB and the two-stage approaches. These differences include: (1) MLCB used MCMC to fit a Bayesian model based on a complete data likelihood that conditions on the unobserved states, while the two-stage approaches maximize the HMM likelihood (Eq. 1) directly; (2) the fitted HMMs using the two-stage approaches are not hierarchical and thus do not contain individual-level random effects on the state-dependent probability distribution parameters; (3) the two-stage approach does not incorporate environmental data (e.g., sea floor depth, sea ice, land) or a maximum bearded seal travelling speed of 2 m/s into the position process; (4) the two-stage approaches used here do not impute any missing auxiliary biotelemetry data (dive time, dry time, and number of benthic dives); (5) the two-stage approaches use a bivariate normal Argos error ellipse model instead of a bivariate *t*-distributed model because the latter is not implemented in “crawl”; (6) the two-stage approaches utilize a single-state continuous-time correlated random walk model for the position process instead of a multistate discrete-time correlated random walk model (see McClintock et al. 2014); and (7) the two-stage approaches assume the initial distribution \(({\varvec{\delta }})\) is the stationary distribution. These differences clearly suggest that the results from the analyses will not be identical, but my objective was to assess how well the two-stage approaches approximate the approach of MLCB.

### 4.2 Example Results

## 5 Discussion

Motivated by my experiences with complex movement HMMs that incorporate measurement error and temporally irregular or missing data, I have investigated the utility of a two-stage approach based on multiple imputation as a more practical alternative to computationally demanding and often time-consuming model-fitting techniques such as MCMC. Based on limited simulations and the bearded seal example, I found this approach to be promising in terms of its ease of implementation, computation time, and performance. The two-stage approach can be performed with existing software using maximum likelihood methods and thus alleviates the need for custom model-fitting algorithms and computer code that is typically required for analogous single-stage analyses. Because maximum likelihood methods are used, likelihood-based model selection criteria (e.g., AIC, BIC) could potentially be used to select among competing models at either stage in the analysis. Further, unlike MCMC and similar techniques, multiple imputation is completely parallelizable; with sufficient processing power computation times need not be longer than that required to fit a single HMM. While \(n=400\) imputations were used here, I found that far fewer imputations may actually be necessary in practice. However, the appropriate number of imputations is likely to be case dependent.

In terms of state identification and path reconstruction, I generally found multiple imputation of the position process based on the continuous-time correlated random walk model of Johnson et al. (2008) to be robust to sampling rate and location measurement error. The two-stage approach thus appears to be a reliable method for inferring both when and where individuals exhibit particular behaviors and could therefore be used to investigate hypotheses about activity budgets, space use, resource selection, and many other areas of movement and behavioral ecology. Somewhat to my surprise, low sampling rates and high measurement error did not adversely affect identification of very similar states solely distinguishable by a spatially correlated covariate (i.e., “hauled out on ice” and “resting at sea” in the four-state simulations; see ESM of Appendix C), suggesting the beta approximation for sea ice concentration used by McClintock et al. (2017) can be useful for this purpose. Although not investigated here, it is important to note that difficulties arising from measurement error and temporally irregular or missing data will almost certainly be amplified when state-dependent probability distributions are less distinct than they were in my simulations (Beyer et al. 2013). However, as demonstrated in the four-state simulations, additional data streams can help facilitate accurate state estimation when step length and turning angle distributions are similar among states.

In terms of state transition probabilities and probability distribution estimation for data streams that depend on the position process (e.g., step length, turning angle), multiple imputation based on the model of Johnson et al. (2008) was less robust to sampling rate and, in particular, measurement error. While the estimated effects of the spatial covariate on transition probabilities were all significant except in the most extreme cases, both the single- and multiple-imputation approaches increasingly underestimated the true transition probability coefficients as measurement error increased and sampling rate decreased (see ESM of Appendix B). Clearly, finer-scale spatial relationships can become masked as uncertainty about the position process increases. Based on the simulations examined here, low sampling rates and high measurement error can result in the single-state model of Johnson et al. (2008) smoothing the track toward the more dominant state (“transit” in the two-state simulations without a spatial covariate, “foraging” in the two-state simulations with a spatial covariate, and “foraging” in the four-state simulations; see Fig. 1, ESM of Appendix B). However, the smoothed distributions typically remained distinct enough for the states and spatial covariate effects to be reasonably well inferred in the second stage. Thus, while generally reliable for state and location estimation, if primary interest is in unbiased estimation of distributions for data streams or covariate effects that depend on the position process, I would not recommend using this two-stage approach based on the model of Johnson et al. (2008) unless: (1) measurement errors are small relative to the (state-dependent) scales of movement; and (2) sampling rates are high relative to the time step of interest. Although not explored here, alternative movement models or the inclusion of movement covariates (if available) in the first stage could help mitigate this smoothing. 
For example, auxiliary biotelemetry data can be used as covariates in the model of Johnson et al. (2008), but these were not used in the first stage of the bearded seal analysis because missing covariate data are not currently permitted in “crawl.”
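Because step lengths and turning angles are computed from successive positions, any smoothing of the imputed track carries over directly into these data streams. The following Python sketch shows that derivation. It is illustrative only: the function name `steps_and_angles` is an assumption made here, and the coordinates are assumed to be planar (projected) rather than longitude and latitude.

```python
import math


def steps_and_angles(xs, ys):
    """Derive step lengths and turning angles from a regular-interval track.

    Assumes planar (projected) coordinates. Returns one step per pair of
    consecutive positions and one turning angle per pair of consecutive steps.
    """
    dx = [x1 - x0 for x0, x1 in zip(xs, xs[1:])]
    dy = [y1 - y0 for y0, y1 in zip(ys, ys[1:])]
    steps = [math.hypot(a, b) for a, b in zip(dx, dy)]
    bearings = [math.atan2(b, a) for a, b in zip(dx, dy)]
    angles = []
    for b0, b1 in zip(bearings, bearings[1:]):
        d = b1 - b0
        # Wrap the change in bearing to the interval [-pi, pi].
        angles.append(math.atan2(math.sin(d), math.cos(d)))
    return steps, angles


# Made-up track: one unit east, then one unit north (a left turn of pi/2).
steps, angles = steps_and_angles([0.0, 1.0, 1.0], [0.0, 0.0, 1.0])
print(steps, angles)
```

Applying such a function to each imputed track yields one set of data streams per imputation, which is why uncertainty in the position process propagates into the estimated step length and turning angle distributions.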

The single-imputation approach generally performed similarly to multiple imputation when sampling rates were high and measurement error was low. It could therefore potentially serve as a fast and reliable alternative to simple linear imputation or more computationally intensive methods for handling temporally irregular or missing location data under these circumstances. However, as expected, the performance of the single-imputation approach declined with lower sampling rates and higher measurement error.
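For comparison, the “simple linear imputation” baseline mentioned above can be sketched as follows. This is a minimal Python illustration with assumed names (`interpolate_track` and its arguments are hypothetical), and it treats observed fixes as exact, so it ignores measurement error entirely, unlike the model-based imputation examined here.

```python
from bisect import bisect_right


def interpolate_track(times, xs, ys, step):
    """Linearly interpolate irregular fixes onto a regular time grid.

    Assumes `times` is strictly increasing and `step` is positive.
    Measurement error is ignored: observed fixes are treated as exact.
    """
    n_steps = int((times[-1] - times[0]) // step)
    track = []
    for k in range(n_steps + 1):
        t = times[0] + k * step
        j = bisect_right(times, t) - 1  # last fix at or before time t
        if j >= len(times) - 1:         # t coincides with the final fix
            track.append((t, xs[-1], ys[-1]))
            continue
        w = (t - times[j]) / (times[j + 1] - times[j])
        track.append((t,
                      xs[j] + w * (xs[j + 1] - xs[j]),
                      ys[j] + w * (ys[j + 1] - ys[j])))
    return track


# Made-up irregular fixes at times 0, 1.5, and 4, interpolated to unit steps.
track = interpolate_track([0.0, 1.5, 4.0], [0.0, 3.0, 8.0], [0.0, 0.0, 5.0], 1.0)
for row in track:
    print(row)
```

Linear interpolation returns a single deterministic track with no associated uncertainty, which is the main contrast with drawing one or more realizations of the position process from a fitted movement model.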

Although I have found imputation to have several potentially desirable qualities with respect to ease of implementation and computation time, this two-stage approach to accounting for measurement error and missing data in movement HMMs has some important limitations relative to more complicated single-stage alternatives. As highlighted in the bearded seal example, perhaps the most important limitation is the separation of the position process (i.e., the movement model) from the latent behavior states and the environment (e.g., bathymetry, land, sea ice). The latter is perhaps best demonstrated in Fig. 4, where the two-stage approach clearly imputed some inland positions that are rather dubious for an ice-associated seal. The two-stage approach used here also does not include individual-level random effects in the HMMs (Altman 2007; DeRuiter et al. 2016). Extending the two-stage approach to incorporate other types of missing data, random effects, and environmental constraints to movement (e.g., land) is the focus of ongoing research. As with any maximum likelihood analysis, starting values for the optimization are also an important consideration when using the two-stage approach.

Here I have focused on imputation of the position process when location data are subject to measurement error and temporal irregularity, but multiple imputation need not be limited to these scenarios. For example, missing auxiliary biotelemetry data (e.g., dive time, dry time, number of dives) in the bearded seal example could be imputed using standard missing data techniques (Rubin 1987). This would also allow for the investigation of different mechanisms for missingness that can be problematic if not accounted for (Nakagawa and Freckleton 2008).

Despite the subtle but important differences between two-stage approaches and the single-stage treatment, the overall inferences from the bearded seal example are quite similar in terms of the primary objectives of McClintock et al. (2017), including the quantification of seasonal activity budgets, the identification of bearded seal foraging habitat, and the characterization of different movement behaviors in relation to seasonal sea ice. However, if one prefers single-stage approaches, the two-stage approach could still prove useful for exploring alternative HMMs from which to choose a potential candidate for the single-stage treatment. Given its ease of implementation and relatively fast computation, the two-stage approach could also be used to generate initial values or help diagnose convergence in single-stage analyses. Whether used for primary analysis or as an exploratory aid, I found multiple imputation to be a promising addition to the ecologist’s toolbox for inference when HMM data streams are subject to observation error.

## Notes

### Acknowledgements

I thank the guest editors of this special issue (M. Hooten, R. King, and R. Langrock) for the invitation to contribute, and D. Johnson and J. London for helpful comments. Bearded seal tagging was funded by the Bureau of Ocean Energy Management Alaska Environmental Studies Program through Interagency Agreement M07RG13317. The findings and conclusions herein are those of the author(s) and do not necessarily represent the views of NOAA/NMFS.

## Supplementary material

## References

- Altman, R. M. (2007), “Mixed hidden Markov models”, *Journal of the American Statistical Association*, 102, 201–210.
- Bagniewska, J. M., Hart, T., Harrington, L. A., and Macdonald, D. W. (2013), “Hidden Markov analysis describes dive patterns in semiaquatic animals”, *Behavioral Ecology*, 24, 659–667.
- Beyer, H. L., Morales, J. M., Murray, D., and Fortin, M. J. (2013), “The effectiveness of Bayesian state-space models for estimating behavioural states from movement paths”, *Methods in Ecology and Evolution*, 4, 433–441.
- Congalton, R. G. (1991), “A review of assessing the accuracy of classifications of remotely sensed data”, *Remote Sensing of Environment*, 37, 35–46.
- Costa, D. P., Robinson, P., Arnould, J., Harrison, A-L., Simmons, S., et al. (2010), “Accuracy of ARGOS locations of pinnipeds at-sea estimated using Fastloc GPS”, *PLoS ONE*, 5, e8677.
- DeRuiter, S. L., Langrock, R., Skirbutas, T., Goldbogen, J. A., Chalambokidis, J., Friedlaender, A. S., and Southall, B. L. (2016), “A multivariate mixed hidden Markov model to analyze blue whale diving behaviour during controlled sound exposures”, arXiv:1602.06570.
- Hooten, M. B., Johnson, D. S., McClintock, B. T., and Morales, J. M. (2017), *Animal Movement: Statistical Models for Telemetry Data*. CRC Press. Boca Raton, Florida, USA.
- Johnson, D. S. (2016), “crawl: Fit Continuous-Time Correlated Random Walk Models to Animal Movement Data”, *R package version 2.0.4*. https://cran.r-project.org/web/packages/crawl/
- Johnson, D. S., London, J. M., Lea, M. A., and Durban, J. W. (2008), “Continuous-time correlated random walk model for animal telemetry data”, *Ecology*, 89, 1208–1215.
- Jonsen, I. D., Flemming, J. M., and Myers, R. A. (2005), “Robust state-space modeling of animal movement data”, *Ecology*, 86, 2874–2880.
- Langrock, R., King, R., Matthiopoulos, J., Thomas, L., Fortin, D., and Morales, J. M. (2012), “Flexible and practical modeling of animal telemetry data: hidden Markov models and extensions”, *Ecology*, 93, 2336–2342.
- London, J. M. (2016), kotzeb0912: v1.0 [Data set]. *Zenodo*. http://doi.org/10.5281/zenodo.57290
- McClintock, B. T., King, R., Thomas, L., Matthiopoulos, J., McConnell, B. J., and Morales, J. M. (2012), “A general discrete-time modeling framework for animal movement using multistate random walks”, *Ecological Monographs*, 82, 335–349.
- McClintock, B. T., Russell, D. J., Matthiopoulos, J., and King, R. (2013), “Combining individual animal movement and ancillary biotelemetry data to investigate population-level activity budgets”, *Ecology*, 94, 838–849.
- McClintock, B. T., Johnson, D. S., Hooten, M. B., Ver Hoef, J. M., and Morales, J. M. (2014), “When to be discrete: the importance of time formulation in understanding animal movement”, *Movement Ecology*, 2, 21.
- McClintock, B. T., London, J. M., Cameron, M. F., and Boveng, P. L. (2015), “Modelling animal movement using the Argos satellite telemetry location error ellipse”, *Methods in Ecology and Evolution*, 6, 266–277.
- McClintock, B. T., London, J. M., Cameron, M. F., and Boveng, P. L. (2017), “Bridging the gaps in animal movement: hidden behaviors and ecological relationships revealed by integrated data stream”, *Ecosphere*, 8, e01751.
- Michelot, T., Langrock, R., and Patterson, T. A. (2016a), “moveHMM: an R package for the statistical modelling of animal movement data using hidden Markov models”, *Methods in Ecology and Evolution*, 7, 1308–1315.
- Michelot, T., Langrock, R., Bestley, S., Jonsen, I. D., Photopoulou, T., and Patterson, T. A. (2016b), “Estimation and simulation of foraging trips in land-based marine predators”, arXiv:1610.06953.
- Morales, J. M., Haydon, D. T., Frair, J., Holsinger, K. E., and Fryxell, J. M. (2004), “Extracting more out of relocation data: building movement models as mixtures of random walks”, *Ecology*, 85, 2436–2445.
- Nakagawa, S., and Freckleton, R. (2008), “Missing inaction: the dangers of ignoring missing data”, *Trends in Ecology and Evolution*, 23, 592–596.
- Pebesma, E. J. (2004), “Multivariable geostatistics in S: the gstat package”, *Computers and Geosciences*, 30, 683–691.
- R Core Team (2016), *R: a Language and Environment for Statistical Computing*. R Foundation for Statistical Computing. Vienna, Austria. http://www.R-project.org
- Rubin, D. (1987), *Multiple Imputation for Nonresponse in Surveys*. Wiley, New York, USA.
- Rubin, D. B., and Schenker, N. (1986), “Multiple imputation for interval estimation from simple random samples with ignorable nonresponse”, *Journal of the American Statistical Association*, 81, 366–374.
- Russell, D. J., McClintock, B., Matthiopoulos, J., Thompson, P., Thompson, D., Hammond, P., Jones, E., MacKenzie, M., Moss, S., and McConnell, B. (2015), “Intrinsic and extrinsic drivers of activity budgets in sympatric grey and harbour seals”, *Oikos*, 124, 1462–1472.
- Service Argos (2013), *Argos User’s Manual*. CLS. http://www.argos-system.org/manual.
- Silva, M. A., Jonsen, I., Russell, D. J. F., Prieto, R., Thompson, D., and Baumgartner, M. F. (2014), “Assessing performance of Bayesian state-space models fit to Argos satellite telemetry locations processed with Kalman filtering”, *PLoS ONE*, 9, e92277.
- Turek, D., de Valpine, P., and Paciorek, C. J. (2016), “Efficient Markov chain Monte Carlo sampling for hierarchical hidden Markov models”, *Environmental and Ecological Statistics*, 23, 549–564.
- Zucchini, W., MacDonald, I. L., and Langrock, R. (2016), *Hidden Markov Models for Time Series: An Introduction Using R*, Second Edition. CRC Press. Boca Raton, Florida, USA.

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.