Behavioral Ecology and Sociobiology

Volume 65, Issue 12, pp 2361–2372

Neglected biological patterns in the residuals

A behavioural ecologist’s guide to co-operating with heteroscedasticity
  • Ian R. Cleasby
  • Shinichi Nakagawa


One of the fundamental assumptions underlying linear regression models is that the errors have a constant variance (i.e., are homoscedastic). When this assumption is violated, standard errors from a regression can be biased and inconsistent, meaning that the associated p values and 95% confidence intervals cannot be trusted. The assumption of homoscedasticity is made for statistical reasons rather than biological reasons; in most real datasets, some form of heteroscedasticity is likely to exist. However, a survey of the behavioural ecology literature showed that only about 5% of articles explicitly mentioned heteroscedasticity, leaving 95% of articles in which heteroscedasticity was apparently absent. These results strongly indicate that heteroscedasticity is widely under-reported within behavioural ecology. The aim of this article is to raise awareness of heteroscedasticity amongst behavioural ecologists. Using topical examples from fields in behavioural ecology such as sexual dimorphism and animal personality, we highlight the biological importance of considering heteroscedasticity. We also emphasize that researchers should pay closer attention to the variance in their data and consider what factors could cause heteroscedasticity. In addition, we introduce some simple methods of dealing with heteroscedasticity. The two methods we focus on are: (1) incorporating variance functions within a generalised least squares (GLS) framework to model the functional form of heteroscedasticity; and (2) heteroscedasticity-consistent standard error (HCSE) estimators, which can be used when the functional form of heteroscedasticity is unknown. Using case studies, we show how both methods can influence the output from linear regression models. Finally, we hope that more researchers will consider heteroscedasticity as an important source of additional information about the particular biological process being studied, rather than an impediment to statistical analysis.
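As a rough illustration of the second method, the sketch below fits an ordinary least squares regression to simulated data whose error variance grows with the predictor, then compares the classical standard errors with heteroscedasticity-consistent ones. It is a minimal sketch under stated assumptions, not the authors' analysis: the simulated data, the function name `ols_with_hc3`, and the choice of the HC3 variant (one of several HCSE estimators) are all illustrative.

```python
import numpy as np


def ols_with_hc3(X, y):
    """Return OLS coefficients, classical SEs, and HC3 SEs (illustrative helper)."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    # Classical standard errors assume a single, constant error variance.
    sigma2 = resid @ resid / (n - k)
    se_ols = np.sqrt(np.diag(sigma2 * XtX_inv))

    # Leverage of each observation: diagonal of the hat matrix X (X'X)^-1 X'.
    leverage = np.einsum("ij,jk,ik->i", X, XtX_inv, X)

    # HC3 sandwich estimator: each squared residual is inflated by
    # 1 / (1 - leverage)^2 before forming the "meat" of the sandwich.
    weights = resid**2 / (1.0 - leverage) ** 2
    meat = X.T @ (X * weights[:, None])
    cov_hc3 = XtX_inv @ meat @ XtX_inv
    se_hc3 = np.sqrt(np.diag(cov_hc3))
    return beta, se_ols, se_hc3


# Simulated (hypothetical) data: the error SD increases with x,
# so the errors are heteroscedastic by construction.
rng = np.random.default_rng(1)
n = 200
x = rng.uniform(0, 10, n)
y = 1.0 + 0.5 * x + rng.normal(0, 0.2 + 0.3 * x)
X = np.column_stack([np.ones(n), x])  # intercept column plus predictor

beta, se_ols, se_hc3 = ols_with_hc3(X, y)
print("coefficients :", beta)
print("classical SEs:", se_ols)
print("HC3 SEs      :", se_hc3)
```

The coefficient estimates are the same under both approaches; only the standard errors, and therefore the p values and confidence intervals, differ. When the functional form of the heteroscedasticity is known or can be modelled, the first method (variance functions within GLS) is the alternative the abstract describes.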


Heteroscedasticity, Homoscedasticity, Linear regression, Residuals, Standard errors, Variance, Within-group errors



We thank Barbara Morrissey, Eduardo Santos and Alistair Senior for commenting upon earlier versions of this manuscript. We are grateful to three anonymous reviewers for comments which improved the manuscript. We would also like to thank Terry Burke for his encouragement during the writing of this manuscript. S.N. is supported by the Marsden Fund.

Supplementary material

265_2011_1254_MOESM1_ESM.doc (230 kb)
ESM 1 (DOC 230 kb)
265_2011_1254_MOESM2_ESM.doc (3 kb)
ESM 2 (DOC 3 kb)
265_2011_1254_MOESM3_ESM.doc (1 kb)
ESM 3 (DOC 1 kb)
265_2011_1254_MOESM4_ESM.csv (3 kb)
ESM 4 (CSV 3 kb)
265_2011_1254_MOESM5_ESM.csv (1 kb)
ESM 5 (CSV 2 kb)



Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  1. Department of Animal and Plant Sciences, University of Sheffield, Sheffield, UK
  2. Department of Zoology, University of Otago, Dunedin, New Zealand
