Abstract
In many biostatistical applications concerned with the analysis of duration times, especially those involving high-dimensional genetic information, three extensions of classical accelerated failure time (AFT) models are required: (1) a flexible, nonparametric estimate of the survival time distribution, (2) a structured additive predictor including linear as well as nonlinear effects of continuous covariates and possibly further types of effects such as random or spatial effects, and (3) regularization and variable selection for high-dimensional effect vectors. Although much research has dealt with these features separately, AFT models combining them in a unified framework have not yet been developed. We present a Bayesian approach for modeling and inference in such flexible AFT models, incorporating a penalized Gaussian mixture error distribution, a structured additive predictor with Bayesian P-splines as a main ingredient, and Bayesian versions of ridge and LASSO as well as spike and slab priors to enforce sparseness. Priors for regression coefficients are conditionally Gaussian, facilitating Markov chain Monte Carlo inference. The proposed model class is extensively tested in simulation studies and applied to the analysis of acute myeloid leukemia survival times, with microarray information as well as clinical covariates as prognostic factors.
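To make the first ingredient concrete, the following is a minimal sketch of a survival function under an AFT model whose error follows a finite Gaussian mixture. It assumes the common form log T = eta + sigma * eps, with mixture components placed on a fixed grid of means and weights obtained from unconstrained coefficients through a softmax transform. The function names, the knot grid, the basis standard deviation, and the softmax parametrization are illustrative assumptions and are not taken from the paper; the penalty on the coefficients and the MCMC sampler are omitted.

```python
import math

def softmax(coefs):
    """Map unconstrained coefficients to mixture weights that sum to one."""
    m = max(coefs)
    expd = [math.exp(a - m) for a in coefs]
    total = sum(expd)
    return [e / total for e in expd]

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def mixture_survival(t, eta, sigma, knots, coefs, basis_sd):
    """S(t | eta) = P(T > t) for log T = eta + sigma * eps, where eps has
    density sum_k w_k * N(knots[k], basis_sd**2) (hypothetical setup)."""
    weights = softmax(coefs)
    z = (math.log(t) - eta) / sigma
    cdf = sum(w * normal_cdf((z - mk) / basis_sd)
              for w, mk in zip(weights, knots))
    return 1.0 - cdf

# Illustrative values only: nine equally spaced component means, equal weights.
knots = [-2.0 + 0.5 * k for k in range(9)]
coefs = [0.0] * len(knots)
eta, sigma, basis_sd = 1.0, 0.8, 0.5

surv = [mixture_survival(t, eta, sigma, knots, coefs, basis_sd)
        for t in (0.5, 1.0, 2.0, 5.0, 10.0)]
```

With a symmetric knot grid and equal weights the error distribution is symmetric around zero, so the median survival time equals exp(eta). Unequal coefficients would yield skewed or multimodal error shapes, which is the flexibility a mixture error is meant to provide.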
Acknowledgments
Financial support by the German Research Foundation (DFG), grant FA 128/5-1/2 is gratefully acknowledged.
Cite this article
Konrath, S., Fahrmeir, L. & Kneib, T. Bayesian accelerated failure time models based on penalized mixtures of Gaussians: regularization and variable selection. AStA Adv Stat Anal 99, 259–280 (2015). https://doi.org/10.1007/s10182-014-0240-6
DOI: https://doi.org/10.1007/s10182-014-0240-6