Testing Simulation Models Using Frequentist Statistics

  • Andrew P. Robinson
Part of the Simulation Foundations, Methods and Applications book series (SFMA)


One approach to validating simulation models is to formally compare model outputs with independent data. We consider such model validation from the point of view of Frequentist statistics. A range of estimates and tests of goodness of fit have been advanced. We review these approaches, and demonstrate that some of the tests suffer from difficulties in interpretation because they rely on the null hypothesis that the model is similar to the observations. This reliance creates two unpleasant possibilities, namely, a model could be spuriously validated when data are too few, or inappropriately rejected when data are too many. Finally, these tests do not allow a principled declaration of what a reasonable level of difference would be considering the purposes to which the model will be put. We consider equivalence tests, and demonstrate that they do not suffer from the previously identified shortcomings. We provide two case studies to illustrate the claims of the chapter.
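The contrast drawn above can be sketched numerically. The example below is a minimal illustration, not the chapter's own procedure: it assumes paired predictions and observations, uses a large-sample normal approximation in place of the t distribution, and applies a two one-sided tests (TOST) form of equivalence test. The function names, the data, and the equivalence margin of 1.0 are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def _mean_and_se(predicted, observed):
    """Mean and standard error of the paired differences (predicted - observed)."""
    diffs = [p - o for p, o in zip(predicted, observed)]
    return mean(diffs), stdev(diffs) / sqrt(len(diffs))


def null_test_p(predicted, observed):
    """Two-sided p-value for H0: mean difference = 0 (normal approximation)."""
    d_bar, se = _mean_and_se(predicted, observed)
    return 2 * (1 - NormalDist().cdf(abs(d_bar) / se))


def tost_p(predicted, observed, margin):
    """TOST p-value for H0: |mean difference| >= margin (normal approximation)."""
    d_bar, se = _mean_and_se(predicted, observed)
    z = NormalDist()
    p_lower = 1 - z.cdf((d_bar + margin) / se)  # H0: mean difference <= -margin
    p_upper = z.cdf((d_bar - margin) / se)      # H0: mean difference >= +margin
    return max(p_lower, p_upper)


# Hypothetical data set 1: few, noisy comparisons.
few_obs = [10.0, 12.0, 9.0, 11.0]
few_pred = [13.0, 10.0, 13.0, 8.0]

# Hypothetical data set 2: many precise comparisons with a small, consistent bias.
many_obs = [10.0] * 100
many_pred = [10.2, 9.9] * 50

MARGIN = 1.0  # assumed tolerable difference, chosen for illustration only

for label, pred, obs in [("few", few_pred, few_obs), ("many", many_pred, many_obs)]:
    print(label, "null test p =", round(null_test_p(pred, obs), 4),
          "| TOST p =", round(tost_p(pred, obs, MARGIN), 4))
```

With the small noisy sample, the conventional test fails to reject the hypothesis of no difference, which could be misread as validation, while the equivalence test also declines to declare the model equivalent. With the large sample, the conventional test rejects the model over a small bias, while the equivalence test accepts it because the bias lies well inside the stated margin.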


Keywords: Equivalence testing; Null hypothesis significance testing; Statistical models; Model validation



This study is supported in part by the Centre of Excellence for Biosecurity Risk Analysis, School of BioSciences, University of Melbourne, Australia. Thoughtful review comments by Lori Dalton, Steve Lane, Anca Hanea, James Camac, and the two editors have greatly improved this chapter.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. CEBRA, School of BioSciences, The University of Melbourne, Parkville, Australia
