
Behavioral Ecology and Sociobiology, Volume 65, Issue 1, pp 1–11

Information-theoretic approaches to statistical analysis in behavioural ecology: an introduction

  • László Zsolt Garamszegi
Review

Abstract

Scientific thinking may require the consideration of multiple hypotheses, which often call for complex statistical models at the level of data analysis. The aim of this introduction is to provide a brief overview of how competing hypotheses are evaluated statistically in behavioural ecological studies and to offer potentially fruitful avenues for future methodological developments. Complex models have traditionally been handled by model selection approaches that remove terms according to a threshold, i.e. stepwise selection. A more recently introduced method for model selection applies an information-theoretic (IT) approach, which evaluates hypotheses simultaneously by balancing model complexity against goodness of fit. The IT method has been increasingly promoted in the field of ecology, but a literature survey shows that its spread in behavioural ecology has been much slower, and model simplification using stepwise selection is still more widespread than IT-based model selection. Why has the use of IT methods in behavioural ecology lagged behind other disciplines? This special issue examines the suitability of the IT method for analysing data with multiple predictors, which researchers encounter in our field. The volume brings together different viewpoints to aid behavioural ecologists in understanding the method, with the hope of enhancing the statistical integration of our discipline.
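
As a rough illustration of what "balancing model complexity against goodness of fit" can mean in practice, the sketch below computes the Akaike information criterion, AIC = -2 ln(L) + 2K, and the corresponding Akaike weights for a small candidate set. The model names, log-likelihoods, and parameter counts are hypothetical values invented for this example; they are not taken from the article or from any data set.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 ln(L) + 2K."""
    return -2.0 * log_likelihood + 2.0 * n_params


def akaike_weights(aic_values):
    """Relative support for each model within the candidate set."""
    best = min(aic_values)
    # Relative likelihood of each model given its AIC difference from the best model
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]


# Hypothetical candidates: (name, maximized log-likelihood, number of parameters)
candidates = [
    ("body_size", -120.0, 2),
    ("body_size + age", -118.5, 3),
    ("body_size + age + sex", -118.4, 4),
]

aics = [aic(ll, k) for _, ll, k in candidates]
weights = akaike_weights(aics)

for (name, _, _), a, w in zip(candidates, aics, weights):
    print(f"{name}: AIC={a:.1f}, weight={w:.3f}")
```

In this made-up set, the third model fits slightly better than the second, but the gain in log-likelihood does not offset the penalty for its extra parameter, so the second model has the lowest AIC. All candidates are compared at once rather than by adding or dropping terms one at a time.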

Keywords

Akaike information criterion; AIC; GLM; Likelihood; Null hypothesis testing; Parsimony; Stepwise regression

Notes

Acknowledgements

I am grateful to D. R. Anderson, R. Freckleton, F. Guthery, R. Montgomerie, S. Nakagawa, and P. Stephens for their constructive comments at different stages of the manuscript. Special thanks to all referees who participated in the evaluation of the contributed papers (see details at the end of this volume). P. A. Bednekoff kindly assisted during the editorial process and helped obtain reports from independent referees. During this study, I was supported by a “Ramón y Cajal” research grant from the Spanish National Research Council (Consejo Superior de Investigaciones Científicas–CSIC). The Department of Systematic Zoology and Ecology, Eötvös Loránd University, Hungary, provided a stimulating working environment, for which I am indebted to J. Török.

Copyright information

© Springer-Verlag 2010

Authors and Affiliations

  1. Department of Evolutionary Ecology, Estación Biológica de Doñana–CSIC, Seville, Spain
