Validation of ecological state space models using the Laplace approximation

Abstract

Many statistical models in ecology follow the state space paradigm. For such models, the important step of model validation rarely receives as much attention as estimation or hypothesis testing, perhaps due to a lack of available algorithms and software. Model validation is often based on a naive adaptation of Pearson residuals, i.e. the difference between observations and posterior means, even though this approach is flawed. Here, we consider validation of state space models through one-step prediction errors, and discuss principles and practicalities arising when the model has been fitted with a tool for estimation in general mixed effects models. Implementing one-step predictions in the R package Template Model Builder, we demonstrate that it is possible to perform model validation with little effort, even if the ecological model is multivariate and has non-linear dynamics, and regardless of whether observations are continuous or discrete. Using both simulated data and a real data set related to geolocation of seals, we demonstrate both the potential and the limitations of the techniques. Our results fill a need for convenient methods for validating a state space model or, alternatively, rejecting it while indicating useful directions in which the model could be improved.
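
To make the idea of one-step prediction errors concrete, the following is a minimal Python sketch, not the authors' Template Model Builder implementation. It assumes a linear Gaussian local level model (a random walk observed with noise), for which the one-step predictive distribution is available in closed form from the Kalman filter; the model, the function names and the parameter values are all illustrative assumptions. If the model is correct, the standardized residuals should look like independent standard normal draws; with a misspecified process variance they become clearly autocorrelated.

```python
import math
import random


def simulate_local_level(n, q, r, seed=1):
    """Simulate a random walk state (variance q) observed with noise (variance r)."""
    rng = random.Random(seed)
    x, ys = 0.0, []
    for _ in range(n):
        x += rng.gauss(0.0, math.sqrt(q))            # process noise
        ys.append(x + rng.gauss(0.0, math.sqrt(r)))  # observation noise
    return ys


def one_step_residuals(ys, q, r, m0=0.0, p0=0.0):
    """Standardized one-step prediction residuals from a scalar Kalman filter."""
    m, p, out = m0, p0, []
    for y in ys:
        m_pred, p_pred = m, p + q      # predict the state one step ahead
        s = p_pred + r                 # variance of the predicted observation
        out.append((y - m_pred) / math.sqrt(s))
        k = p_pred / s                 # Kalman gain
        m = m_pred + k * (y - m_pred)  # update with the new observation
        p = (1.0 - k) * p_pred
    return out


def lag1_autocorrelation(z):
    """Sample lag-1 autocorrelation of a sequence."""
    mean = sum(z) / len(z)
    num = sum((a - mean) * (b - mean) for a, b in zip(z, z[1:]))
    den = sum((a - mean) ** 2 for a in z)
    return num / den


ys = simulate_local_level(n=2000, q=1.0, r=0.5)

# Residuals under the true parameters: approximately iid N(0, 1).
z = one_step_residuals(ys, q=1.0, r=0.5)
z_mean = sum(z) / len(z)
z_var = sum((v - z_mean) ** 2 for v in z) / (len(z) - 1)

# Residuals under a deliberately wrong process variance (an assumption
# made only to show what a failed validation looks like).
z_bad = one_step_residuals(ys, q=0.01, r=0.5)

print("correct model:      mean, var, lag-1 acf =",
      round(z_mean, 2), round(z_var, 2), round(lag1_autocorrelation(z), 2))
print("misspecified model: lag-1 acf =", round(lag1_autocorrelation(z_bad), 2))
```

In this sketch the diagnostic is a simple lag-1 autocorrelation; in practice one would typically also inspect quantile plots and formal tests on the same residuals.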

Notes

  1. TMB is an R package (R Core Team 2015) available both at the Comprehensive R Archive Network (cran.r-project.org) and in a development version at GitHub (github.com/kaskr/adcomp).

References

  1. Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control AC–19:716–723 (system identification and time-series analysis)

  2. Albertsen CM, Whoriskey K, Yurkowski D, Nielsen A, Mills Flemming J (2015) Fast fitting of non-Gaussian state-space models to animal movement data via template model builder. Ecology 96(10):2598–2604

  3. Anscombe FJ, Tukey JW (1963) The examination and analysis of residuals. Technometrics 5(2):141–160. doi:10.1080/00401706.1963.10490071

  4. Berg CW, Nielsen A (2016) Accounting for correlated observations in an age-based state-space stock assessment model. ICES J Mar Sci. doi:10.1093/icesjms/fsw046

  5. Bolker BM, Gardner B, Maunder M, Berg CW, Brooks M, Comita L, Crone E, Cubaynes S, Davies T, de Valpine P et al (2013) Strategies for fitting nonlinear ecological models in R, AD Model Builder, and BUGS. Methods Ecol Evol 4(6):501–512

  6. Box GE, Draper NR (1987) Empirical model-building and response surfaces. Wiley, New York

  7. Box GEP, Jenkins GM (1970) Time series analysis: forecasting and control, 1976. ISBN: 0-8162-1104-3

  8. Cadigan N, Morgan M, Brattey J (2014) Improved estimation and forecasts of stock maturities using generalised linear mixed models with auto-correlated random effects. Fish Manag Ecol 21(5):343–356

  9. Clark C, Mangel M (2000) Dynamic state variable models in ecology: methods and applications. Oxford University Press, Oxford

  10. Clark JS (2007) Models for ecological data: an introduction, vol 11. Princeton University Press, Princeton

  11. Cox D, Hinkley D (1974) Theoretical statistics. Chapman & Hall, London

  12. Cox DR, Snell EJ (1968) A general definition of residuals. J R Stat Soc Ser B (Methodol) 30(2):248–275, URL http://www.jstor.org/stable/2984505

  13. Dawid AP (1984) Present position and potential developments: Some personal views: Statistical theory: the prequential approach. J R Stat Soc Ser A (General) 147(2):278–292

  14. de Valpine P, Hastings A (2002) Fitting population models incorporating process noise and observation error. Ecol Monogr 72(1):57–76

  15. Dunn PK, Smyth GK (1996) Randomized quantile residuals. J Comput Graph Stat 5(3):236–244

  16. Evensen G (2003) The ensemble Kalman filter: theoretical formulation and practical implementation. Ocean Dyn 53(4):343–367

  17. Fournier DA, Skaug HJ, Ancheta J, Ianelli J, Magnusson A, Maunder MN, Nielsen A, Sibert J (2012) AD Model Builder: using automatic differentiation for statistical inference of highly parameterized complex nonlinear models. Optim Methods Softw 27(2):233–249

  18. Frühwirth-Schnatter S (1996) Recursive residuals and model diagnostics for normal and non-normal state space models. Environ Ecol Stat 3(4):291–309

  19. Gelman A, Carlin JB, Stern HS, Rubin DB (2014) Bayesian data analysis, vol 2. Taylor & Francis, Abingdon

  20. Gilks WR, Richardson S, Spiegelhalter DJ (1996) Markov chain Monte Carlo in practice. Interdisciplinary statistics. Chapman and Hall, London

  21. Griewank A, Walther A (2008) Evaluating derivatives: principles and techniques of algorithmic differentiation. SIAM, New Delhi

  22. Harvey AC (1989) Forecasting, structural time series models and the Kalman filter. Cambridge University Press, Cambridge

  23. Jonsen I, Flemming J, Myers R (2005) Robust state-space modeling of animal movement data. Ecology 86(11):2874–2880

  24. Jonsen I, Basson M, Bestley S, Bravington M, Patterson T, Pedersen MW, Thomson R, Thygesen UH, Wotherspoon S (2013) State-space models for bio-loggers: a methodological road map. Deep Sea Res Part II Top Stud Ocean 88:34–46

  25. Kalliovirta L (2012) Misspecification tests based on quantile residuals. Econom J 15(2):358–393. doi:10.1111/j.1368-423X.2011.00364.x

  26. Kalman RE (1960) A new approach to linear filtering and prediction problems. J Basic Eng 82:35–45

  27. Kristensen K, Nielsen A, Berg CW, Skaug H, Bell BM (2016) TMB: automatic differentiation and Laplace approximation. J Stat Softw 70(5):1–21. doi:10.18637/jss.v070.i05

  28. Liu JS, Chen R (1998) Sequential Monte Carlo methods for dynamic systems. J Am Stat Assoc 93:1032–1044

  29. Ljung GM, Box GEP (1978) On a measure of lack of fit in time series models. Biometrika 65(2):297. doi:10.1093/biomet/65.2.297

  30. Ljung L (1999) System Identification—Theory for the User. Information and system sciences series, 2nd edn. Prentice-Hall, Upper Saddle River

  31. Madsen H (2007) Time series analysis. Chapman & Hall/CRC, London

  32. May RM (1974) Biological populations with nonoverlapping generations: stable points, stable cycles, and chaos. Science 186(4164):645–647

  33. Murray J (1989) Mathematical Biology. Springer-Verlag, Berlin

  34. Nielsen A, Berg CW (2014) Estimation of time-varying selectivity in stock assessments using state-space models. Fish Res 158:96–101

  35. Øksendal B (2010) Stochastic differential equations—An Introduction with Applications, 6th edn. Springer-Verlag, Berlin

  36. Patterson T, Thomas L, Wilcox C, Ovaskainen O, Matthiopoulos J (2008) State-space models of individual animal movement. Trends Ecol Evol 23(2):87–94

  37. Pebesma EJ (2004) Multivariable geostatistics in S: the gstat package. Comput Geosci 30:683–691

  38. Pedersen MW, Berg CW (2016) A stochastic surplus production model in continuous time. Fish Fish 18:226–243

  39. Pedersen MW, Berg CW, Thygesen UH, Nielsen A, Madsen H (2011) Estimation methods for nonlinear state-space models in ecology. Ecol Model 222(8):1394–1400

  40. R Core Team (2015) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, URL http://www.R-project.org/

  41. Rall LB (1980) Applications of software for automatic differentiation in numerical computation. In: Alefeld G, Grigorieff RD (eds) Fundamentals of numerical computation (computer-oriented numerical analysis), Springer, pp 141–156

  42. Rosenblatt M (1952) Remarks on a multivariate transformation. Ann Math Stat 23(3):470–472

  43. Rue H, Martino S, Chopin N (2009) Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. J R Stat Soc Ser B (Stat Methodol) 71(2):319–392

  44. Shapiro SS, Wilk MB (1965) An analysis of variance test for normality (complete samples). Biometrika 52(3–4):591. doi:10.1093/biomet/52.3-4.591

  45. Skaug HJ, Fournier DA (2006) Automatic approximation of the marginal likelihood in non-Gaussian hierarchical models. Comput Stat Data Anal 51(2):699–709

  46. Smith J (1985) Diagnostic checks of non-standard time series models. J Forecast 4(3):283–291

  47. Thygesen UH, Sommer L, Evans K, Patterson TA (2016) Dynamic optimal foraging theory explains vertical migrations of bigeye tuna. Ecology 97:1852–1861

  48. Tierney L, Kadane JB (1986) Accurate approximations for posterior moments and marginal densities. J Am Stat Assoc 81(393):82–86

  49. Waagepetersen R (2006) A simulation-based goodness-of-fit test for random effects in generalized linear mixed models. Scand J Stat 33(4):721–731

  50. Wan EA, Van Der Merwe R (2000) The unscented Kalman filter for nonlinear estimation. In: Adaptive Systems for signal processing, communications, and control symposium 2000. AS-SPCC. The IEEE 2000, IEEE, pp 153–158

  51. Wasserstein RL, Lazar NA (2016) The ASA’s statement on p-values: context, process, and purpose. Am Stat 70:129–133

  52. Zucchini W, MacDonald IL (2009) Hidden Markov models for time series: an introduction using R. CRC Press, Boca Raton

Author information

Corresponding author

Correspondence to Uffe Høgsbro Thygesen.

Additional information

Handling Editor: Pierre Dutilleul.

About this article

Cite this article

Thygesen, U.H., Albertsen, C.M., Berg, C.W. et al. Validation of ecological state space models using the Laplace approximation. Environ Ecol Stat 24, 317–339 (2017). https://doi.org/10.1007/s10651-017-0372-4

Keywords

  • Maximum likelihood estimation
  • Model validation
  • Residual analysis
  • Statistical ecology
  • State space methods
  • Time series analysis