Issues in information theory-based statistical inference—a commentary from a frequentist’s perspective

  • Review
Behavioral Ecology and Sociobiology

Abstract

After several decades during which applied statistical inference in research on animal behaviour and behavioural ecology was heavily dominated by null hypothesis significance testing (NHST), a new approach based on information-theoretic (IT) criteria has recently become increasingly popular, and it has occasionally been considered generally superior to conventional NHST. In this commentary, I discuss some limitations the IT-based method may have under certain circumstances. In addition, I review some recent articles published in the fields of animal behaviour and behavioural ecology and point to some common failures, misunderstandings and issues that frequently appear in the practical application of IT-based methods. Based on this, I give some hints about how to avoid common pitfalls in the application of IT-based inference and about when to choose one or the other approach, and I discuss under which circumstances mixing the two approaches might be appropriate.

References

  • Aiken LS, West SG (1991) Multiple regression: testing and interpreting interactions. Sage, Newbury Park

  • American Psychological Association (1994) Publication manual of the American Psychological Association, 4th edn. APA, Washington

  • Anderson DR, Burnham KP, Thompson WL (2000) Null hypothesis testing: problems, prevalence, and an alternative. J Wildl Manage 64:912–923

  • Anderson DR, Link WA, Johnson DH, Burnham KP (2001) Suggestions for presenting the results of data analyses. J Wildl Manage 65:373–378

  • Austin PC, Tu JV (2004) Automated variable selection methods for logistic regression produced unstable models for predicting acute myocardial infarction mortality. J Clin Epidemiol 57:1138–1146

  • Burnham KP, Anderson DR (2002) Model selection and multimodel inference, 2nd edn. Springer, Berlin

  • Burnham KP, Anderson DR, Huyvaert KP (2010) AICc model selection in Ecological and behavioral science: some background, observations, and comparisons. Behav Ecol Sociobiol. doi:10.1007/s00265-010-1029-6

  • Chatfield C (1995) Model uncertainty, data mining and statistical inference. J Roy Stat Soc A Sta 158:419–466

  • Cohen J (1988) Statistical power analysis for the behavioral sciences, 2nd edn. Lawrence Erlbaum Associates, New York

  • Cohen J (1994) The earth is round (p < .05). Am Psychol 49:997–1003

  • Cohen J, Cohen P (1983) Applied multiple regression/correlation analysis for the behavioral sciences, 2nd edn. Lawrence Erlbaum Associates, Mahwah

  • Derksen S, Keselman HJ (1992) Backward, forward and stepwise automated subset selection algorithms: frequency of obtaining authentic and noise variables. Br J Math Stat Psychol 45:265–282

  • Dobson AJ (2002) An introduction to generalized linear models. Chapman & Hall, Boca Raton

  • Dochtermann NA, Jenkins SH (2010) Developing multiple hypotheses in behavioral ecology. Behav Ecol Sociobiol. doi:10.1007/s00265-010-1039-4

  • Field A (2005) Discovering statistics using SPSS. Sage, London

  • Forstmeier W, Schielzeth H (2010) Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner’s curse. Behav Ecol Sociobiol. doi:10.1007/s00265-010-1038-5

  • Freckleton RP (2010) Dealing with collinearity in behavioural and ecological data: model averaging and the problems of measurement error. Behav Ecol Sociobiol. doi:10.1007/s00265-010-1045-6

  • Freedman DA (1983) A note on screening regression equations. Am Stat 37:152–155

  • Garamszegi LZ (2010) Information-theoretic approaches to statistical analysis in behavioural ecology: an introduction. Behav Ecol Sociobiol. doi:10.1007/s00265-010-1028-7

  • Garamszegi LZ, Calhim S, Dochtermann N, Hegyi G, Hurd PL, Jørgensen C, Kutsukake N, Lajeunesse MJ, Pollard KA, Schielzeth H, Symonds MRE, Nakagawa S (2009) Changing philosophies and tools for statistical inferences in behavioral ecology. Behav Ecol 20:1363–1375

  • Guthery FS, Brennan LA, Peterson MJ, Lusk JJ (2005) Information theory in wildlife science: critique and viewpoint. J Wildl Manage 69:457–465

  • Harrell FE Jr (2001) Regression modeling strategies. Springer, New York

  • Hector A, von Felten S, Schmid B (2010) Analysis of variance with unbalanced data: an update for ecology & evolution. J Anim Ecol 79:308–316

  • Hegyi G, Garamszegi LZ (2010) Using information theory as a substitute for stepwise regression in ecology and behavior. Behav Ecol Sociobiol. doi:10.1007/s00265-010-1036-7

  • Hurlbert SH, Lombardi CM (2009) Final collapse of the Neyman–Pearson decision theoretic framework and rise of the neoFisherian. Ann Zool Fenn 46:311–349

  • Ioannidis JPA (2005) Why most published research findings are false. PLoS Med 2:696–701

  • James FC, McCulloch CE (1990) Multivariate analysis in ecology and systematics: panacea or Pandora’s box? Annu Rev Ecol Evol Syst 21:129–166

  • Johnson DH (1999) The insignificance of statistical significance testing. J Wildl Manage 63:763–772

  • Johnson DH (2002) The role of hypothesis testing in wildlife science. J Wildl Manage 66:272–276

  • Johnson JB, Omland KS (2004) Model selection in ecology and evolution. Trends Ecol Evol 19:101–108

  • Lovell MC (1983) Data mining. Rev Econ Stat 65:1–12

  • Lukacs PM, Thompson WL, Kendall WL, Gould WR, Doherty PF Jr, Burnham KP, Anderson DR (2007) Concerns regarding a call for pluralism of information theory and hypothesis testing. J Appl Ecol 44:456–460

  • McCullagh P, Nelder JA (2008) Generalized linear models. Chapman and Hall, London

  • Møller AP, Jennions MD (2002) How much variance can be explained by ecologists and evolutionary biologists? Oecologia 132:492–500

  • Mundry R, Nunn CL (2009) Stepwise model fitting and statistical inference: turning noise into signal pollution. Am Nat 173:119–123

  • Nakagawa S, Cuthill IC (2007) Effect size, confidence interval and statistical significance: a practical guide for biologists. Biol Rev 82:591–605

  • Nickerson RS (2000) Null hypothesis significance testing: a review of an old and continuing controversy. Psychol Methods 5:241–301

  • Quinn GP, Keough MJ (2002) Experimental designs and data analysis for biologists. Cambridge University Press, Cambridge

  • R Development Core Team (2009) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria

  • Richards SA, Whittingham MJ, Stephens PA (2010) Model selection and model averaging in behavioural ecology: the utility of the IT-AIC framework. Behav Ecol Sociobiol. doi:10.1007/s00265-010-1035-8

  • Royall R (1997) Statistical evidence, a likelihood paradigm. Chapman & Hall, London

  • Sakamoto Y, Akaike H (1978) Analysis of cross classified data by AIC. Ann Inst Stat Math 30:185–197

  • Schielzeth H (2010) Simple means to improve the interpretability of regression coefficients. Methods Ecol Evol 1:103–113

  • Siegel S, Castellan NJ (1988) Nonparametric statistics for the behavioral sciences, 2nd edn. McGraw-Hill, New York

  • Sleep DJH, Drever MC, Nudds TD (2007) Statistical versus biological hypothesis testing: response to Steidl. J Wildl Manage 71:2120–2121

  • Smith GD, Ebrahim S (2002) Data dredging, bias, or confounding. Brit Med J 325:1437–1438

  • Sokal RR, Rohlf FJ (1995) Biometry—the principles and practice of statistics in biological research, 3rd edn. Freeman, New York

  • Steidl RJ (2006) Model selection, hypothesis testing, and risks of condemning analytical tools. J Wildl Manage 70:1497–1498

  • Stephens PA, Buskirk SW, Hayward GD, del Rio CM (2005) Information theory and hypothesis testing: a call for pluralism. J Appl Ecol 42:4–12

  • Stephens PA, Buskirk SW, del Rio CM (2007) Inference in ecology and evolution. Trends Ecol Evol 22:192–197

  • Stoehr AM (1999) Are significance thresholds appropriate for the study of animal behaviour? Anim Behav 57:F22–F25

  • Symonds MRE, Moussalli A (2010) A brief guide to model selection, multimodel inference and model averaging in behavioural ecology using Akaike’s Information Criterion. Behav Ecol Sociobiol. doi:10.1007/s00265-010-1037-6

  • Tabachnick BG, Fidell LS (2001) Using multivariate statistics, 4th edn. Allyn & Bacon, Boston

  • Vul E, Harris C, Winkielman P, Pashler H (2010) Puzzlingly high correlations in fMRI studies of emotion, personality, and social cognition. Perspect Psychol Sci 4:274–290

  • Whittingham MJ, Stephens PA, Bradbury RB, Freckleton RP (2006) Why do we still use stepwise modelling in ecology and behaviour? J Anim Ecol 75:1182–1189

  • Young SS, Bang H, Oktay K (2009) Cereal-induced gender selection? Most likely a multiple testing false positive. Proc R Soc Lond, Ser B 276:1211–1212

  • Zar JH (1999) Biostatistical analysis, 4th edn. Prentice Hall, New Jersey

Acknowledgements

I wish to thank Peter Walsh and Hjalmar Kühl for introducing some of the concepts of IT-based inference to me and for several invaluable discussions about this approach. Kai F. Abt, Hjalmar Kühl, Kevin Langergraber, Rainer Stollhoff and three anonymous referees provided very helpful comments on an earlier version of this paper. Finally, I wish to thank László Zsolt Garamszegi for inviting me to write this paper and for his patience with me during the process of writing it. This work was supported by the Max Planck Society.

Conflict of interest

The author declares no conflict of interest.

Author information

Corresponding author

Correspondence to Roger Mundry.

Additional information

Communicated by L. Garamszegi

This contribution is part of the Special Issue ‘Model selection, multimodel inference and information-theoretic approaches in behavioural ecology’ (see Garamszegi 2010).

About this article

Cite this article

Mundry, R. Issues in information theory-based statistical inference—a commentary from a frequentist’s perspective. Behav Ecol Sociobiol 65, 57–68 (2011). https://doi.org/10.1007/s00265-010-1040-y

  • DOI: https://doi.org/10.1007/s00265-010-1040-y
