Abstract
Many statistical models in ecology follow the state space paradigm. For such models, the important step of model validation rarely receives as much attention as estimation or hypothesis testing, perhaps due to a lack of available algorithms and software. Model validation is often based on a naive adaptation of Pearson residuals, i.e. the difference between observations and posterior means, even though this approach is flawed. Here, we consider validation of state space models through one-step prediction errors, and discuss principles and practicalities arising when the model has been fitted with a tool for estimation in general mixed effects models. Implementing one-step predictions in the R package Template Model Builder, we demonstrate that it is possible to perform model validation with little effort, even if the ecological model is multivariate or has non-linear dynamics, and regardless of whether observations are continuous or discrete. Using both simulated data and a real data set related to geolocation of seals, we demonstrate both the potential and the limitations of the techniques. Our results fill a need for convenient methods for validating a state space model, or alternatively, rejecting it while indicating useful directions in which the model could be improved.
Notes
TMB is an R package (R Core Team 2015) available both at the Comprehensive R Archive Network (cran.r-project.org) and in a development version at GitHub (github.com/kaskr/adcomp).
References
Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control AC–19:716–723 (system identification and time-series analysis)
Albertsen CM, Whoriskey K, Yurkowski D, Nielsen A, Mills Flemming J (2015) Fast fitting of non-Gaussian state-space models to animal movement data via template model builder. Ecology 96(10):2598–2604
Anscombe FJ, Tukey JW (1963) The examination and analysis of residuals. Technometrics 5(2):141–160. doi:10.1080/00401706.1963.10490071
Berg CW, Nielsen A (2016) Accounting for correlated observations in an age-based state-space stock assessment model. ICES J Mar Sci. doi:10.1093/icesjms/fsw046
Bolker BM, Gardner B, Maunder M, Berg CW, Brooks M, Comita L, Crone E, Cubaynes S, Davies T, Valpine P et al (2013) Strategies for fitting nonlinear ecological models in R, AD Model Builder, and BUGS. Methods Ecol Evol 4(6):501–512
Box GE, Draper NR (1987) Empirical model-building and response surfaces. Wiley, New York
Box GEP, Jenkins GM (1970) Time series analysis: forecasting and control, 1976. ISBN: 0-8162-1104-3
Cadigan N, Morgan M, Brattey J (2014) Improved estimation and forecasts of stock maturities using generalised linear mixed models with auto-correlated random effects. Fish Manag Ecol 21(5):343–356
Clark C, Mangel M (2000) Dynamic state variable models in ecology: methods and applications. Oxford University Press, Oxford
Clark JS (2007) Models for ecological data: an introduction, vol 11. Princeton University Press, Princeton
Cox D, Hinkley D (1974) Theoretical statistics. Chapman & Hall, London
Cox DR, Snell EJ (1968) A general definition of residuals. J R Stat Soc Ser B (Methodol) 30(2):248–275, URL http://www.jstor.org/stable/2984505
Dawid AP (1984) Present position and potential developments: Some personal views: Statistical theory: the prequential approach. J R Stat Soc Ser A (General) 147(2):278–292
de Valpine P, Hastings A (2002) Fitting population models incorporating process noise and observation error. Ecol Monogr 72(1):57–76
Dunn PK, Smyth GK (1996) Randomized quantile residuals. J Comput Graph Stat 5(3):236–244
Evensen G (2003) The ensemble Kalman filter: theoretical formulation and practical implementation. Ocean Dyn 53(4):343–367
Fournier DA, Skaug HJ, Ancheta J, Ianelli J, Magnusson A, Maunder MN, Nielsen A, Sibert J (2012) AD Model Builder: using automatic differentiation for statistical inference of highly parameterized complex nonlinear models. Optim Methods Softw 27(2):233–249
Frühwirth-Schnatter S (1996) Recursive residuals and model diagnostics for normal and non-normal state space models. Environ Ecol Stat 3(4):291–309
Gelman A, Carlin JB, Stern HS, Rubin DB (2014) Bayesian data analysis, vol 2. Taylor & Francis, Abingdon
Gilks WR, Richardson S, Spiegelhalter DJ (1996) Markov chain Monte Carlo in practice. Interdisciplinary statistics. Chapman and Hall, London
Griewank A, Walther A (2008) Evaluating derivatives: principles and techniques of algorithmic differentiation. SIAM, New Delhi
Harvey AC (1989) Forecasting, structural time series models and the Kalman filter. Cambridge University Press, Cambridge
Jonsen I, Flemming J, Myers R (2005) Robust state-space modeling of animal movement data. Ecology 86(11):2874–2880
Jonsen I, Basson M, Bestley S, Bravington M, Patterson T, Pedersen MW, Thomson R, Thygesen UH, Wotherspoon S (2013) State-space models for bio-loggers: a methodological road map. Deep Sea Res Part II Top Stud Ocean 88:34–46
Kalliovirta L (2012) Misspecification tests based on quantile residuals. Econom J 15(2):358–393. doi:10.1111/j.1368-423X.2011.00364.x
Kalman RE (1960) A new approach to linear filtering and prediction problems. J Basic Eng 82:35–45
Kristensen K, Nielsen A, Berg CW, Skaug H, Bell BM (2016) TMB: automatic differentiation and Laplace approximation. J Stat Softw 70(5):1–21. doi:10.18637/jss.v070.i05
Liu JS, Chen R (1998) Sequential Monte Carlo methods for dynamic systems. J Am Stat Assoc 93:1032–1044
Ljung GM, Box GEP (1978) On a measure of lack of fit in time series models. Biometrika 65(2):297. doi:10.1093/biomet/65.2.297
Ljung L (1999) System Identification—Theory for the User. Information and system sciences series, 2nd edn. Prentice-Hall, Upper Saddle River
Madsen H (2007) Time series analysis. Chapman & Hall/CRC, London
May RM (1974) Biological populations with nonoverlapping generations: stable points, stable cycles, and chaos. Science 186(4164):645–647
Murray J (1989) Mathematical Biology. Springer-Verlag, Berlin
Nielsen A, Berg CW (2014) Estimation of time-varying selectivity in stock assessments using state-space models. Fish Res 158:96–101
Øksendal B (2010) Stochastic differential equations—An Introduction with Applications, 6th edn. Springer-Verlag, Berlin
Patterson T, Thomas L, Wilcox C, Ovaskainen O, Matthiopoulos J (2008) State-space models of individual animal movement. Trends Ecol Evol 23(2):87–94
Pebesma EJ (2004) Multivariable geostatistics in S: the gstat package. Comput Geosci 30:683–691
Pedersen MW, Berg CW (2016) A stochastic surplus production model in continuous time. Fish Fish 18:226–243
Pedersen MW, Berg CW, Thygesen UH, Nielsen A, Madsen H (2011) Estimation methods for nonlinear state-space models in ecology. Ecol Model 222(8):1394–1400
R Core Team (2015) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, URL http://www.R-project.org/
Rall LB (1980) Applications of software for automatic differentiation in numerical computation. In: Alefeld G, Grigorieff RD (eds) Fundamentals of numerical computation (computer-oriented numerical analysis), Springer, pp 141–156
Rosenblatt M (1952) Remarks on a multivariate transformation. Ann Math Stat 23(3):470–472
Rue H, Martino S, Chopin N (2009) Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. J R Stat Soc Ser B (Stat Methodol) 71(2):319–392
Shapiro SS, Wilk MB (1965) An analysis of variance test for normality (complete samples). Biometrika 52(3–4):591. doi:10.1093/biomet/52.3-4.591
Skaug HJ, Fournier DA (2006) Automatic approximation of the marginal likelihood in non-Gaussian hierarchical models. Comput Stat Data Anal 51(2):699–709
Smith J (1985) Diagnostic checks of non-standard time series models. J Forecast 4(3):283–291
Thygesen UH, Sommer L, Evans K, Patterson TA (2016) Dynamic optimal foraging theory explains vertical migrations of bigeye tuna. Ecology 97:1852–1861
Tierney L, Kadane JB (1986) Accurate approximations for posterior moments and marginal densities. J Am Stat Assoc 81(393):82–86
Waagepetersen R (2006) A simulation-based goodness-of-fit test for random effects in generalized linear mixed models. Scand J Stat 33(4):721–731
Wan EA, Van Der Merwe R (2000) The unscented Kalman filter for nonlinear estimation. In: Adaptive Systems for signal processing, communications, and control symposium 2000. AS-SPCC. The IEEE 2000, IEEE, pp 153–158
Wasserstein RL, Lazar NA (2016) The ASA’s statement on p-values: context, process, and purpose. Am Stat 70:129–133
Zucchini W, MacDonald IL (2009) Hidden Markov models for time series: an introduction using R. CRC Press, Boca Raton
Additional information
Handling Editor: Pierre Dutilleul.
About this article
Cite this article
Thygesen, U.H., Albertsen, C.M., Berg, C.W. et al. Validation of ecological state space models using the Laplace approximation. Environ Ecol Stat 24, 317–339 (2017). https://doi.org/10.1007/s10651-017-0372-4