Bayesian penalized Buckley-James method for high dimensional bivariate censored regression models

Yin, Wenjing; Zhao, Sihai Dave; Liang, Feng

doi:10.1007/s10985-022-09549-5

Published: 03 March 2022

Lifetime Data Analysis, volume 28, pages 282–318 (2022)

Abstract

For high-dimensional gene expression data, one important goal is to identify a small number of genes that are associated with disease progression or patient survival. In this paper, we consider the problem of variable selection for multivariate survival data. We propose an estimation procedure for high-dimensional accelerated failure time (AFT) models with bivariate censored data. The method extends the Buckley-James method by minimizing a penalized $L_2$ loss function with a penalty function induced from a bivariate spike-and-slab prior specification. In the proposed algorithm, censored observations are imputed using the Kaplan-Meier estimator, which avoids a parametric assumption on the error terms. Our empirical studies demonstrate that the proposed method performs better than alternative procedures designed for univariate survival data, regardless of whether the true events are correlated, and that it provides a formal way of handling bivariate survival data in AFT models. Findings from the analysis of a myeloma clinical trial using the proposed method are also presented.
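For orientation, the classical univariate Buckley-James imputation step that the abstract builds on can be sketched as below. This is not the authors' bivariate penalized algorithm: it only illustrates how a censored log-time is replaced by its conditional expectation under a Kaplan-Meier estimate of the residual distribution, for a given coefficient vector. The function names `km_jumps` and `buckley_james_impute` are hypothetical, and treating the largest residual as uncensored is a common convention assumed here rather than something stated in the abstract.

```python
import numpy as np

def km_jumps(e, delta):
    """Kaplan-Meier probability mass placed on each residual (hypothetical helper)."""
    e = np.asarray(e, dtype=float)
    delta = np.asarray(delta)
    n = len(e)
    # Sort by residual; among ties, uncensored observations come first.
    order = np.lexsort((1 - delta, e))
    d = delta[order].astype(float)
    d[-1] = 1.0  # assumed convention: largest residual treated as uncensored
    at_risk = n - np.arange(n)
    surv = np.cumprod(1.0 - d / at_risk)
    surv_prev = np.concatenate(([1.0], surv[:-1]))
    w = np.empty(n)
    w[order] = surv_prev - surv  # zero for censored residuals
    return w

def buckley_james_impute(y, delta, X, beta):
    """Replace censored responses by X @ beta + E[e | e > e_i] under the KM estimate."""
    y = np.asarray(y, dtype=float)
    delta = np.asarray(delta)
    e = y - X @ beta
    w = km_jumps(e, delta)
    y_star = y.copy()
    for i in np.flatnonzero(delta == 0):
        later = e > e[i]
        mass = w[later].sum()  # estimated P(e > e_i)
        if mass > 0:  # otherwise keep the observed value
            y_star[i] = X[i] @ beta + (w[later] * e[later]).sum() / mass
    return y_star
```

In a Buckley-James-type iteration, the imputed responses would then be passed to a least-squares or penalized least-squares fit and the two steps alternated; the penalty used in the paper is induced by a bivariate spike-and-slab prior and is not reproduced here.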

Author information

Authors and Affiliations

Department of Statistics, University of Illinois, Urbana-Champaign, Champaign, IL, USA
Wenjing Yin, Sihai Dave Zhao & Feng Liang

Corresponding author

Correspondence to Feng Liang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Multicollinearity design

Let the sample size be $n = 100$ and the dimension be $p = 100$. The first 10 variables are generated independently from a standard normal distribution. Then, for $j = 11,\cdots ,20$, set

$$\begin{aligned} {\varvec{X}}_j = {\varvec{X}}_{j-10} + \tau , \end{aligned}$$

where $\tau $ is a random error from a standard normal distribution. The remaining variables are generated from a multivariate normal distribution with mean zero and covariance matrix with elements $\varvec{\varSigma }_{ij} = 0.5^{|i-j|}$. Following Sect. 4.1, we generate ${\varvec{T}}_{\cdot 1}$ and ${\varvec{T}}_{\cdot 2}$ together with the corresponding censoring times and censoring indicators. The sets of relevant variables are specified as follows:

no sharing: $\left\{ j: \varvec{\beta }_{j1} \ne 0\right\} = \left\{ 1,\cdots ,10\right\} , \left\{ j: \varvec{\beta }_{j2} \ne 0\right\} = \left\{ 21,\cdots ,30\right\} $
all sharing: $\left\{ j: \varvec{\beta }_{jk} \ne 0\right\} = \left\{ 1,\cdots ,10\right\} $
some sharing: $\left\{ j: \varvec{\beta }_{j1} \ne 0\right\} = \left\{ 1,\cdots ,10\right\} , \left\{ j: \varvec{\beta }_{j2} \ne 0\right\} = \left\{ 1,\cdots ,5, 21,\cdots ,25\right\} $

The true coefficients of the relevant variables are generated independently from $\textsf {N}(3,0.5)$. We repeat each simulation setup 200 times and keep the true coefficient values fixed across all simulation runs.
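The design matrix described above can be sketched as follows. This is a minimal illustration, not the authors' simulation code: the function names are hypothetical, the error $\tau$ is assumed to be drawn afresh for each observation and each $j$, and the AR(1)-type covariance is assumed to be indexed within the block of the remaining $p - 20$ variables. Generation of the survival and censoring times follows Sect. 4.1, which is not part of this excerpt, so it is omitted.

```python
import numpy as np

def multicollinear_design(n=100, p=100, rho=0.5, seed=0):
    """Sketch of the multicollinearity design matrix (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    X = np.empty((n, p))
    # Variables 1-10: independent standard normals.
    X[:, :10] = rng.standard_normal((n, 10))
    # Variables 11-20: X_j = X_{j-10} + tau, with tau ~ N(0, 1) (assumed i.i.d.).
    X[:, 10:20] = X[:, :10] + rng.standard_normal((n, 10))
    # Remaining p - 20 variables: N(0, Sigma) with Sigma_ij = rho^|i-j|.
    q = p - 20
    idx = np.arange(q)
    sigma = rho ** np.abs(idx[:, None] - idx[None, :])
    chol = np.linalg.cholesky(sigma)
    X[:, 20:] = rng.standard_normal((n, q)) @ chol.T
    return X

def support_sets(setup):
    """1-based index sets of relevant variables for the two responses."""
    first = set(range(1, 11))
    if setup == "no sharing":
        return first, set(range(21, 31))
    if setup == "all sharing":
        return first, set(first)
    if setup == "some sharing":
        return first, set(range(1, 6)) | set(range(21, 26))
    raise ValueError(f"unknown setup: {setup}")

X = multicollinear_design()
s1, s2 = support_sets("some sharing")
```

Because variables 11 to 20 are noisy copies of variables 1 to 10, each such pair has a population correlation of $1/\sqrt{2}$, which is what makes the copies easy to mistake for signals.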

Table 11 False positives and false negatives reported for multicollinearity design

Table 12 Sensitivity, specificity, and MCC scores reported for multicollinearity design

The results of the multicollinearity design can be found in Tables 11 and 12. For this simulation design, the univariate AEnet failed to give any results because of a singular matrix in its computation, so we report results only from the other four competing methods. All of the methods tend to recognize the ten irrelevant variables as signals. For the no-sharing and all-sharing cases, the proposed method gives the smallest number of false positives while still recognizing almost all of the relevant variables, with almost zero false negatives. For the some-sharing cases, we observe a more pronounced trade-off between false positives and false negatives between using $\lambda _{min}$ and $\lambda _{1se}$, while the proposed method selects variables more strictly, yielding fewer false positives and more false negatives. In terms of the MCC score as an overall measure, however, the proposed method achieves the highest MCC scores in all setups, demonstrating that it can outperform existing methods and handle complicated data examples.
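For readers unfamiliar with the measures reported in these tables, the following sketch computes false positives, false negatives, sensitivity, specificity, and the Matthews correlation coefficient (MCC) from a selected set and a true relevant set. The function name `selection_metrics` is hypothetical, and how the paper pools the counts across the two responses is not stated in this excerpt, so the sketch handles a single response.

```python
import math

def selection_metrics(selected, relevant, p):
    """Variable-selection metrics for one response with p candidate variables."""
    selected, relevant = set(selected), set(relevant)
    tp = len(selected & relevant)   # relevant variables that were selected
    fp = len(selected - relevant)   # irrelevant variables that were selected
    fn = len(relevant - selected)   # relevant variables that were missed
    tn = p - tp - fp - fn           # irrelevant variables correctly left out
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is set to 0 when the denominator vanishes (a common convention).
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"fp": fp, "fn": fn, "sensitivity": sens,
            "specificity": spec, "mcc": mcc}

# Example: 9 of 10 relevant variables found, plus one false positive.
m = selection_metrics(set(range(1, 10)) | {11}, set(range(1, 11)), p=100)
```

MCC combines all four cells of the confusion table, which is why it serves as an overall measure when one method trades false positives against false negatives.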

Dense design

Let $n = 100$ and $p = 100$. Following Sect. 4.1, we generate the design matrix ${\varvec{X}}$ from a multivariate normal distribution with mean zero and covariance matrix with elements $\varvec{\varSigma }_{ij} = 0.5^{|i-j|}$. We then generate ${\varvec{T}}_{\cdot 1}$ and ${\varvec{T}}_{\cdot 2}$ and the corresponding censoring times and censoring indicators in a similar manner. In this simulation design, we assume that each column of the true coefficient matrix has 20 relevant variables. That is, for the some-sharing setups, there are 45 relevant variables in total. The true coefficients of the relevant variables are generated independently from $\textsf {N}(3,0.5)$. We repeat each simulation setup 200 times and keep the true coefficient values fixed across all simulation runs.

Table 13 False positives and false negatives reported for dense design

Table 14 Sensitivity, specificity, and MCC scores reported for dense design

The results of the dense design can be found in Tables 13 and 14. The proposed method consistently achieves the best MCC scores among all competing methods. For the no-sharing and all-sharing setups, it gives the best combination of false positives and false negatives, achieving the highest sensitivity and specificity scores. For the some-sharing setups, when $c \ne 1$, the proposed method is stricter in selecting signals, which results in missing almost half of the relevant variables. However, it still correctly identifies more relevant variables and noise variables than the other competing methods, achieving the highest MCC scores.

About this article

Cite this article

Yin, W., Zhao, S.D. & Liang, F. Bayesian penalized Buckley-James method for high dimensional bivariate censored regression models. Lifetime Data Anal 28, 282–318 (2022). https://doi.org/10.1007/s10985-022-09549-5

Received: 25 June 2020
Accepted: 22 January 2022
Published: 03 March 2022
Issue Date: April 2022
DOI: https://doi.org/10.1007/s10985-022-09549-5
