Evaluating Predictors of Dispersion: A Comparison of Dominance Analysis and Bayesian Model Averaging

Abstract

Conventional measures of predictor importance in linear models are applicable only when the assumption of homoscedasticity is satisfied. Moreover, they cannot be adapted to evaluating predictor importance in models of heteroscedasticity (i.e., dispersion), an issue that seems not to have been systematically addressed in the literature. We compare two suitable approaches, Dominance Analysis (DA) and Bayesian Model Averaging (BMA), for simultaneously evaluating predictor importance in models of location and dispersion. We apply both to the beta general linear model as a test case and illustrate them with an example using real data. Simulations using several different model structures, sample sizes, and degrees of multicollinearity suggest that DA and BMA largely agree on the relative importance of predictors of the mean but differ when ranking predictors of dispersion. The main implication for researchers is that the choice between DA and BMA matters most when the goal is to evaluate the importance of predictors of dispersion.
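
To make the two importance measures concrete, the sketch below computes general dominance weights and BMA posterior inclusion probabilities for a small simulated data set. It is only an illustration under simplifying assumptions that are not taken from the article: it uses ordinary least squares rather than the beta general linear model, R² as the fit measure for DA, a BIC approximation to the marginal likelihood with a uniform prior over models for BMA, and only location predictors. The function names `fit_ols` and `importance` and the simulated data are hypothetical, and the code is not the authors' supplementary R code.

```python
import itertools
import math

import numpy as np


def fit_ols(X, y, cols):
    """Fit OLS with an intercept plus the listed columns; return (R^2, BIC)."""
    n = len(y)
    design = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    tss = float(np.sum((y - y.mean()) ** 2))
    k = design.shape[1]  # number of regression coefficients
    bic = n * math.log(rss / n) + k * math.log(n)
    return 1.0 - rss / tss, bic


def importance(X, y):
    """Return (general dominance weights, posterior inclusion probabilities)."""
    p = X.shape[1]
    subsets = [s for r in range(p + 1)
               for s in itertools.combinations(range(p), r)]
    fits = {s: fit_ols(X, y, s) for s in subsets}

    # DA: average the gain in R^2 from adding predictor j to every subset
    # of the other predictors, first within each subset size, then across sizes.
    dominance = np.zeros(p)
    for j in range(p):
        others = [i for i in range(p) if i != j]
        size_means = []
        for r in range(p):
            gains = [fits[tuple(sorted(s + (j,)))][0] - fits[s][0]
                     for s in itertools.combinations(others, r)]
            size_means.append(np.mean(gains))
        dominance[j] = np.mean(size_means)

    # BMA (assumed BIC approximation, uniform model prior): weight each model
    # by exp(-BIC / 2), then sum the weights of the models containing j.
    bics = np.array([fits[s][1] for s in subsets])
    weights = np.exp(-0.5 * (bics - bics.min()))
    weights /= weights.sum()
    pip = np.array([sum(w for w, s in zip(weights, subsets) if j in s)
                    for j in range(p)])
    return dominance, pip


# Simulated example (hypothetical data): x0 strong, x1 weak, x2 pure noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = 1.0 * X_demo[:, 0] + 0.3 * X_demo[:, 1] + rng.normal(size=200)
dom_demo, pip_demo = importance(X_demo, y_demo)
print("general dominance:", np.round(dom_demo, 3))
print("inclusion probability:", np.round(pip_demo, 3))
```

The general dominance weights sum to the R² of the full model, which gives a quick sanity check on the enumeration. Extending the sketch to dispersion predictors would require a model with a separate dispersion submodel and a fit measure suited to it, which this example does not attempt.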

Notes

  1. Supplementary materials are available at http://dl.dropbox.com/u/39185009/PIE2012supp.html, including Appendices and R code for evaluating predictor importance via DA and BMA.

Author information

Corresponding author

Correspondence to Yiyun Shou.

About this article

Cite this article

Shou, Y., Smithson, M. Evaluating Predictors of Dispersion: A Comparison of Dominance Analysis and Bayesian Model Averaging. Psychometrika 80, 236–256 (2015). https://doi.org/10.1007/s11336-013-9375-8

Key words

  • heteroscedasticity
  • predictor importance
  • dominance analysis
  • Bayesian model averaging
  • beta regression
  • beta GLM