Abstract
An important part of model selection with generalized linear models (GLMs) is the choice of link function. We propose using approximate Bayes factors to assess the improvement in fit obtained when a parametric link family is used in place of the canonical link. The approximate Bayes factors are calculated using the Laplace approximations given in [32], together with a reference set of prior distributions. This methodology can be used to choose between different parametric link families, and to select the link family and the independent variables jointly. Such comparisons involve nonnested models, so standard significance tests cannot be used. The approach also accounts explicitly for uncertainty about the link function. The methods are illustrated using the parametric link families studied in [12] on two data sets with binomial responses.
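To make the idea of a Bayes factor for a link family concrete, the following is a minimal sketch rather than the method of the paper. It swaps the Laplace approximation of [32] for the cruder BIC (Schwarz) approximation, uses the one-parameter asymmetric Aranda-Ordaz family as an assumed example of a parametric link family (the logit link is the member with lam = 1), profiles the link parameter over a small grid instead of maximizing it exactly, and runs on made-up data. All function and variable names are hypothetical.

```python
import numpy as np

# Made-up dose-response data, for illustration only (not from the paper).
x = np.array([-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5])    # covariate
n = np.full(7, 40.0)                                      # trials per group
y = np.array([1.0, 3.0, 6.0, 12.0, 22.0, 33.0, 39.0])     # successes per group
X = np.column_stack([np.ones_like(x), x])                 # intercept + slope

def inv_link(eta, lam):
    """Asymmetric Aranda-Ordaz response function; lam = 1 is the logistic."""
    return 1.0 - (1.0 + lam * np.exp(eta)) ** (-1.0 / lam)

def link(p, lam):
    """Inverse of inv_link: maps a probability to the linear predictor."""
    return np.log(((1.0 - p) ** (-lam) - 1.0) / lam)

def dp_deta(eta, lam):
    """Derivative of inv_link with respect to eta."""
    e = np.exp(eta)
    return e * (1.0 + lam * e) ** (-1.0 / lam - 1.0)

def max_loglik(lam, n_iter=100):
    """Maximized binomial log-likelihood for a fixed link parameter (IRLS)."""
    eta = link((y + 0.5) / (n + 1.0), lam)        # start from smoothed proportions
    for _ in range(n_iter):
        eta = np.clip(eta, -30.0, 30.0)           # guard against overflow
        p = np.clip(inv_link(eta, lam), 1e-12, 1 - 1e-12)
        d = np.maximum(dp_deta(eta, lam), 1e-12)
        w = n * d**2 / (p * (1 - p))              # Fisher scoring weights
        z = eta + (y / n - p) / d                 # working response
        beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
        eta = X @ beta
    p = np.clip(inv_link(np.clip(eta, -30.0, 30.0), lam), 1e-12, 1 - 1e-12)
    # The binomial coefficient is omitted; it cancels in likelihood differences.
    return float(np.sum(y * np.log(p) + (n - y) * np.log(1 - p)))

# Profile the link parameter over a coarse grid that contains lam = 1 (logit).
lambdas = [0.1, 0.25, 0.5, 0.75, 1.0, 1.5, 2.0]
profile = {lam: max_loglik(lam) for lam in lambdas}
ll_logit = profile[1.0]
lam_hat = max(profile, key=profile.get)
ll_family = profile[lam_hat]

# BIC approximation: the family has one extra parameter (lam).
# Assumption: the sample size N is taken to be the total number of trials.
N = float(n.sum())
two_log_bf = 2.0 * (ll_family - ll_logit) - np.log(N)

print(f"grid maximizer lam_hat = {lam_hat}")
print(f"approximate 2 log B (family vs. logit) = {two_log_bf:.3f}")
```

Positive values of `two_log_bf` favour the parametric family over the canonical logit link, and negative values favour the logit link. Because the extra link parameter costs log N, a small gain in log-likelihood is not enough to justify the more flexible link.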
References
M. Aitkin. Posterior Bayes factors (disc: P128–142). Journal of the Royal Statistical Society, Series B, Methodological, 53: 111–128, 1991.
F. J. Aranda-Ordaz. On two families of transformations to additivity for binary response data (corr: V70 p303). Biometrika, 68: 357–363, 1981.
S. Banerjee, B. P. Carlin, and A. E. Gelfand. Hierarchical Modeling and Analysis for Spatial Data. Chapman & Hall/CRC, Boca Raton, London, New York, Washington, D. C., 2004.
M. S. Bartlett. A comment on D. V. Lindley's statistical paradox. Biometrika, 44: 533, 1957.
C. I. Bliss. The calculation of the dose-mortality curve. Annals of Applied Biology, 22: 134–167, 1935.
D. Collett. Modelling Binary Data. Chapman & Hall, London, 2nd edition, 2002.
P. Congdon. Applied Bayesian Modelling. John Wiley & Sons, Chichester, U.K., 2003.
T. W. Copenhaver and P. W. Mielke. Quantit analysis: A quantal assay refinement. Biometrics, 33: 175–186, 1977.
D. R. Cox and N. Reid. Parameter orthogonality and approximate conditional inference (c/r: P18–39). Journal of the Royal Statistical Society, Series B, Methodological, 49: 1–18, 1987.
C. Czado. On link selection in generalized linear models. In Advances in GLIM and Statistical Modelling. Proceedings of the GLIM92 Conference, pages 60–65. Springer-Verlag (Berlin; New York), 1992.
C. Czado. Bayesian inference of binary regression models with parametric link. Journal of Statistical Planning and Inference, 41: 121–140, 1994.
C. Czado. On selecting parametric link transformation families in generalized linear models. Journal of Statistical Planning and Inference, 61: 125–139, 1997.
C. Czado and A. Munk. Noncanonical links in generalized linear models—when is the effort justified? Journal of Statistical Planning and Inference, 87: 317–345, 2000.
C. Czado and T. J. Santner. The effect of link misspecification on binary regression inference. Journal of Statistical Planning and Inference, 33: 213–231, 1992.
D. Dey, S. Ghosh, and B. Mallick. Generalized Linear Models: A Bayesian Perspective. Marcel Dekker, New York, 2000.
D. Draper. Assessment and propagation of model uncertainty (disc: P71–97). Journal of the Royal Statistical Society, Series B, Methodological, 57: 45–70, 1995.
A. E. Gelfand and D. K. Dey. Bayesian model choice: Asymptotics and exact calculations. Journal of the Royal Statistical Society, Series B, Methodological, 56: 501–514, 1994.
W. R. Gilks, S. Richardson, and D. J. Spiegelhalter. Markov Chain Monte Carlo in Practice. Chapman & Hall, London, 1996.
V. M. Guerrero and R. A. Johnson. Use of the Box-Cox transformation with binary response models. Biometrika, 69: 309–314, 1982.
C. Han and B. P. Carlin. MCMC methods for computing Bayes factors: a comparative review. Journal of the American Statistical Association, 96: 1122–1132, 2001.
J. A. Hoeting, D. Madigan, A. E. Raftery, and C. T. Volinsky. Bayesian model averaging: A tutorial (disc: P401–417). Statistical Science, 14: 382–401, 1999. Corrected version available at www.stat.washington.edu/www/research/online/hoeting1999.pdf.
R. E. Kass and A. E. Raftery. Bayes factors. Journal of the American Statistical Association, 90: 773–795, 1995.
D. V. Lindley. A statistical paradox. Biometrika, 44: 187–192, 1957.
P. McCullagh and J. A. Nelder. Generalized Linear Models. Chapman and Hall, London, 2nd edition, 1989.
H. Milicer and F. Szczotka. Age at menarche in Warsaw girls in 1965. Human Biology, 38: 199–203, 1966.
M. A. J. Montfort and A. Otten. Quantal response analysis: Enlargement of the logistic model with a kurtosis parameter. Biometrical Journal, 18: 371–380, 1976.
B. J. T. Morgan. Observations on quantit analysis. Biometrics, 39: 879–886, 1983.
M. A. Newton, C. Czado, and R. Chappell. Bayesian inference for semiparametric binary regression. Journal of the American Statistical Association, 91: 142–153, 1996.
I. Ntzoufras, P. Dellaportas, and J. J. Forster. Bayesian variable and link determination for generalised linear models. preprint, 2001.
D. Pregibon. Goodness of link tests for generalized linear models. Applied Statistics, 29: 15–24, 1980.
R. L. Prentice. Generalization of the probit and logit methods for dose response curves. Biometrics, 32: 761–768, 1976.
A. E. Raftery. Approximate Bayes factors and accounting for model uncertainty in generalized linear models. Biometrika, 83: 251–266, 1996.
G. Schwarz. Estimating the dimension of a model. Annals of Statistics, 6: 461–464, 1978.
T. Stukel. Generalized logistic models. Journal of the American Statistical Association, 83: 426–431, 1988.
J. M. G. Taylor. The cost of generalizing logistic regression. Journal of the American Statistical Association, 83: 1078–1083, 1988.
J. M. G. Taylor, A. L. Siqueira, and R. E. Weiss. The cost of adding parameters to a model. Journal of the Royal Statistical Society, Series B, Methodological, 58: 593–607, 1996.
V. Viallefont, A. E. Raftery, and S. Richardson. Variable selection and Bayesian model averaging in epidemiological case-control studies. Statistics in Medicine, 20: 3215–3230, 2001.
A. S. Whittemore. Transformations to linearity in binary regression. SIAM Journal on Applied Mathematics, 43: 703–710, 1983.
Additional information
The first author was supported by Sonderforschungsbereich 386 Statistische Analyse Diskreter Strukturen, and the second author by NIH Grant 1R01CA094212-01 and ONR Grant N00014-01-10745.
About this article
Cite this article
Czado, C., Raftery, A.E. Choosing the link function and accounting for link uncertainty in generalized linear models using Bayes factors. Statistical Papers 47, 419–442 (2006). https://doi.org/10.1007/s00362-006-0296-9