Abstract
Conventional measures of predictor importance in linear models are applicable only when the assumption of homoscedasticity is satisfied. Moreover, they cannot be adapted to evaluating predictor importance in models of heteroscedasticity (i.e., dispersion), an issue that seems not to have been systematically addressed in the literature. We compare two suitable approaches, Dominance Analysis (DA) and Bayesian Model Averaging (BMA), for simultaneously evaluating predictor importance in models of location and dispersion. We apply them to the beta general linear model as a test case and illustrate the comparison with an example using real data. Simulations using several different model structures, sample sizes, and degrees of multicollinearity suggest that DA and BMA largely agree on the relative importance of predictors of the mean, but differ when ranking predictors of dispersion. The main implication for researchers is that the choice between DA and BMA matters most when the goal is to evaluate the importance of predictors of dispersion.
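For readers unfamiliar with DA, the following is a minimal sketch of general dominance weights in the simplest setting: an ordinary least-squares model with R² as the fit measure. It is not the authors' implementation and does not cover the beta model or its dispersion submodel, which would need a different fit measure. The function names (`r_squared`, `general_dominance`) and the simulated data are assumptions made for illustration only.

```python
from itertools import combinations

import numpy as np


def r_squared(X, y, cols):
    """R^2 of an OLS fit of y on an intercept plus the columns in `cols`."""
    if len(cols) == 0:
        return 0.0  # intercept-only model explains no variance
    A = np.column_stack([np.ones(len(y)), X[:, list(cols)]])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    tss = np.sum((y - y.mean()) ** 2)
    return 1.0 - (resid @ resid) / tss


def general_dominance(X, y):
    """General dominance weight of each predictor.

    For predictor j, average its incremental R^2 over all subsets of the
    other predictors of a given size, then average across subset sizes.
    """
    p = X.shape[1]
    weights = np.zeros(p)
    for j in range(p):
        others = [c for c in range(p) if c != j]
        per_size = []
        for k in range(p):  # subset sizes 0 .. p-1
            gains = [
                r_squared(X, y, S + (j,)) - r_squared(X, y, S)
                for S in combinations(others, k)
            ]
            per_size.append(np.mean(gains))
        weights[j] = np.mean(per_size)
    return weights


# Simulated data (hypothetical): x0 has a strong effect, x1 a weak one, x2 none.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=500)

weights = general_dominance(X, y)
print(weights)
```

The weights add up to the R² of the full model, so each one can be read as that predictor's share of explained variance. The computation fits every subset of predictors, so its cost grows exponentially with the number of predictors.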

Notes
- 1. Supplementary materials are available at http://dl.dropbox.com/u/39185009/PIE2012supp.html, including Appendices and R code for evaluating predictor importance via DA and BMA.
Cite this article
Shou, Y., Smithson, M. Evaluating Predictors of Dispersion: A Comparison of Dominance Analysis and Bayesian Model Averaging. Psychometrika 80, 236–256 (2015). https://doi.org/10.1007/s11336-013-9375-8
Key words
- heteroscedasticity
- predictor importance
- dominance analysis
- Bayesian model averaging
- beta regression
- beta GLM