Mac Nally, R. Biodiversity and Conservation (2002) 11: 1397. doi:10.1023/A:1016250716679
Ecologists and conservation biologists frequently use multiple regression (MR) to try to identify factors influencing response variables such as species richness or occurrence. Many frequently used regression methods may generate spurious results due to multicollinearity. It has been argued that there are actually two kinds of MR modelling: (1) seeking the best predictive model; and (2) isolating amounts of variance attributable to each predictor variable. The former has attracted most attention, with a plethora of criteria (measures of model fit penalized for model complexity, i.e. number of parameters) and Bayes-factor-based methods having been proposed, while the latter has been little considered, although hierarchical methods seem promising (e.g. hierarchical partitioning). If the two approaches agree on which predictor variables to retain, then it is more likely that meaningful predictor variables (of those considered) have been found. There has been a problem in that, while hierarchical partitioning allowed the ranking of predictor variables by amounts of independent explanatory power, there was no (statistical) way to decide which variables to retain. A solution using randomization of the data matrix coupled with hierarchical partitioning is presented, as is an ecological example.
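The idea of coupling hierarchical partitioning with randomization can be illustrated with a minimal sketch. It is not the paper's implementation, and several details are assumptions that the abstract does not state: goodness of fit is measured by ordinary least-squares R², each predictor column is shuffled independently to build the null distribution, and a predictor is retained when its Z-score exceeds 1.65 (a one-sided 95% normal bound). The function names and the synthetic data are hypothetical.

```python
import itertools
import numpy as np


def r_squared(X, y, cols):
    """R^2 of an OLS fit of y on the chosen columns of X plus an intercept."""
    design = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


def independent_contributions(X, y):
    """Independent contribution of each predictor by hierarchical partitioning.

    For predictor j, the gain in R^2 from adding j is averaged over all
    subsets of the other predictors of a given size, and those per-size
    averages are then averaged over all sizes.
    """
    p = X.shape[1]
    # Fit every one of the 2^p subset models once and cache its R^2.
    fit = {s: r_squared(X, y, s)
           for k in range(p + 1)
           for s in itertools.combinations(range(p), k)}
    contrib = np.zeros(p)
    for j in range(p):
        others = [c for c in range(p) if c != j]
        level_means = []
        for k in range(p):
            gains = [fit[tuple(sorted(s + (j,)))] - fit[s]
                     for s in itertools.combinations(others, k)]
            level_means.append(np.mean(gains))
        contrib[j] = np.mean(level_means)
    return contrib


def randomization_z_scores(X, y, n_reps=100, seed=0):
    """Z-scores of observed contributions against a randomized data matrix.

    ASSUMPTION: each predictor column is permuted independently, which breaks
    its link to the response and to the other predictors.
    """
    rng = np.random.default_rng(seed)
    observed = independent_contributions(X, y)
    null = np.empty((n_reps, X.shape[1]))
    for r in range(n_reps):
        X_perm = np.column_stack(
            [rng.permutation(X[:, c]) for c in range(X.shape[1])])
        null[r] = independent_contributions(X_perm, y)
    z = (observed - null.mean(axis=0)) / null.std(axis=0, ddof=1)
    return observed, z


# Synthetic illustration (not data from the paper): x1 is correlated with x0,
# x2 is unrelated to the response.
rng = np.random.default_rng(1)
n = 60
x0 = rng.normal(size=n)
x1 = x0 + 0.5 * rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([x0, x1, x2])
y = 2.0 * x0 + rng.normal(size=n)

observed, z = randomization_z_scores(X, y, n_reps=100)
for j in range(X.shape[1]):
    # ASSUMPTION: 1.65 is used as the retention threshold.
    print(f"x{j}: I = {observed[j]:.3f}, Z = {z[j]:.2f}, retain = {z[j] > 1.65}")
```

The independent contributions sum to the R² of the full model, which is what allows predictors to be ranked by share of explained variance. Because every subset of predictors is fitted, the cost grows as 2^p, so a sketch like this is only practical for a modest number of predictors.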
Correlated variables; General linear model; Prediction; Variable subsets