Generalized accelerated failure time spatial frailty model for arbitrarily censored data

Zhou, Haiming; Hanson, Timothy; Zhang, Jiajia

doi:10.1007/s10985-016-9361-4

Published: 18 March 2016

Lifetime Data Analysis, Volume 23, pages 495–515 (2017)


Abstract

Flexible incorporation of both geographical patterning and risk effects in cancer survival models is becoming increasingly important, due in part to the recent availability of large cancer registries. Most spatial survival models stochastically order survival curves from different subpopulations. However, survival curves from two subpopulations commonly cross in epidemiological cancer studies, so interpretable standard survival models cannot be used without modification. Common fixes are the inclusion of time-varying regression effects in the proportional hazards model or fully nonparametric modeling, either of which destroys the easy interpretability of the fitted model. To address this issue, we develop a generalized accelerated failure time model that allows stratification on continuous or categorical covariates and provides per-variable tests for whether stratification is necessary via novel approximate Bayes factors. The model is interpretable in terms of how median survival changes and is able to capture crossing survival curves in the presence of spatial correlation. A detailed Markov chain Monte Carlo algorithm is presented for posterior inference, and a freely available function, frailtyGAFT, is provided in the R package spBayesSurv to fit the model. We apply our approach to a subset of the prostate cancer data gathered for Louisiana by the Surveillance, Epidemiology, and End Results (SEER) program of the National Cancer Institute.

Acknowledgments

This work was supported by NCI grant 5R03CA176739. The authors would like to thank the editor, the associate editor, and the two referees for their valuable comments, which led to great improvements to the paper.

Author information

Authors and Affiliations

Division of Statistics, Northern Illinois University, DeKalb, IL, 60115, USA
Haiming Zhou
Department of Statistics, University of South Carolina, Columbia, SC, 29208, USA
Timothy Hanson
Department of Epidemiology and Biostatistics, University of South Carolina, Columbia, SC, 29208, USA
Jiajia Zhang

Corresponding author

Correspondence to Haiming Zhou.

Electronic supplementary material

Supplementary material 1 (pdf 1122 KB)

Appendix

Proposition 1

Assume that ${\varvec{\gamma }}_{l,k}|\alpha \overset{ind.}{\sim } N_{q+1}\left( {\mathbf {0}}, \frac{2n}{\alpha \rho (l+1)} ({\mathbf {X}}'{\mathbf {X}})^{-1}\right) $ under $H_1$ and ${\varvec{\gamma }}_{l,k,-j}|\alpha \overset{ind.}{\sim } N_{q}\left( {\mathbf {0}}, \frac{2n}{\alpha \rho (l+1)} ({\mathbf {X}}_{-j}'{\mathbf {X}}_{-j})^{-1}\right) $ under $H_0$, where $\alpha $ is fixed and ${\mathbf {X}}_{-j}$ is the design matrix ${\mathbf {X}}$ with its $(j+1)$th column removed. Then Assumption (2.12) holds, and

$$\begin{aligned} p({\varvec{\varUpsilon }}_j={\mathbf {0}}|\alpha ) = \prod _{l=1}^{L-1}\prod _{k=1}^{2^l} \phi \left( 0\bigg |0, \frac{2n}{\alpha \rho (l+1)} ({\mathbf {X}}'{\mathbf {X}})_{jj}^{-1}\right) , \end{aligned}$$

where $({\mathbf {X}}'{\mathbf {X}})_{jj}^{-1}$ is the $(j+1,j+1)$th element of $({\mathbf {X}}'{\mathbf {X}})^{-1}$, and $\phi (\cdot |\mu , \sigma ^2)$ denotes the normal density with mean $\mu $ and variance $\sigma ^2$.
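The product in Proposition 1 is easiest to evaluate on the log scale, since every one of the $2^l$ factors at level $l$ is the same normal density at zero. The following is a minimal sketch, not the spBayesSurv implementation. The function name is hypothetical, and it assumes that $n$ is the number of rows of ${\mathbf {X}}$, that the first column of ${\mathbf {X}}$ is the intercept (so covariate $j$ sits at zero-based column index `j`), and that $\rho $ and $L$ are supplied as defined in the main text.

```python
import numpy as np

def log_prior_density_at_zero(X, j, alpha, rho, L):
    """Sketch: log p(Upsilon_j = 0 | alpha) under the Proposition 1 prior.

    Assumptions (not confirmed by the text shown here): n = number of rows
    of X, and column 0 of X is the intercept.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # (j+1, j+1)th element (1-based) of (X'X)^{-1}
    xtx_inv_jj = np.linalg.inv(X.T @ X)[j, j]
    total = 0.0
    for l in range(1, L):  # levels l = 1, ..., L-1
        var_l = 2.0 * n / (alpha * rho * (l + 1)) * xtx_inv_jj
        # 2^l coefficients at level l, each contributing log phi(0 | 0, var_l)
        total += (2 ** l) * (-0.5 * np.log(2.0 * np.pi * var_l))
    return total
```

Working with the logarithm avoids underflow or overflow when $L$ is moderately large, because the number of factors grows like $2^L$.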

Proof

Since ${\varvec{\gamma }}_{l,k}|\alpha $ is multivariate normal, the conditional distribution of $({\varvec{\gamma }}_{l,k,-j}|{\varvec{\gamma }}_{l,k,j}=0,\alpha )$ is also multivariate normal:

$$\begin{aligned} p({\varvec{\gamma }}_{l,k,-j}|{\varvec{\gamma }}_{l,k,j}=0,\alpha )&\propto \exp \left\{ -\frac{\alpha \rho (l+1)}{4n} {\varvec{\gamma }}_{l,k}'({\mathbf {X}}'{\mathbf {X}}){\varvec{\gamma }}_{l,k} \right\} \nonumber \\&\propto \exp \left\{ -\frac{\alpha \rho (l+1)}{4n} {\varvec{\gamma }}_{l,k,-j}'({\mathbf {X}}_{-j}'{\mathbf {X}}_{-j}){\varvec{\gamma }}_{l,k,-j} \right\} \nonumber \\&\propto N_{q}\left( {\mathbf {0}}, \frac{2n}{\alpha \rho (l+1)} ({\mathbf {X}}_{-j}'{\mathbf {X}}_{-j})^{-1}\right) . \end{aligned}$$

This implies that $p({\varvec{\gamma }}_{l,k,-j}|{\varvec{\gamma }}_{l,k,j}=0,\alpha )=p_0({\varvec{\gamma }}_{l,k,-j}|\alpha )$ and, by independence across $(l,k)$, $p({\varvec{\varUpsilon }}_{-j}|{\varvec{\varUpsilon }}_j={\mathbf {0}},\alpha )=p_0({\varvec{\varUpsilon }}_{-j}|\alpha )$. In addition, $\alpha $ is fixed and ${\varvec{\varUpsilon }}_{-j}$ is independent of all other parameters in ${\varvec{\psi }}$, so Assumption (2.12) holds. The expression for $p({\varvec{\varUpsilon }}_j={\mathbf {0}}|\alpha )$ follows because each ${\varvec{\gamma }}_{l,k,j}|\alpha $ is marginally normal with mean $0$ and variance $\frac{2n}{\alpha \rho (l+1)} ({\mathbf {X}}'{\mathbf {X}})_{jj}^{-1}$, independently across $(l,k)$. $\square $
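The key step of the proof, that conditioning on ${\varvec{\gamma }}_{l,k,j}=0$ turns the covariance $({\mathbf {X}}'{\mathbf {X}})^{-1}$ into $({\mathbf {X}}_{-j}'{\mathbf {X}}_{-j})^{-1}$, can be checked numerically with the usual Schur-complement formula for a conditional normal covariance. The sketch below uses a simulated design matrix and a hypothetical helper name. The common factor $\frac{2n}{\alpha \rho (l+1)}$ is omitted because it scales both sides equally.

```python
import numpy as np

def conditional_cov_given_zero(X, j):
    """Covariance of gamma_{-j} | gamma_j = 0 when gamma ~ N(0, (X'X)^{-1})."""
    sigma = np.linalg.inv(X.T @ X)
    keep = [c for c in range(X.shape[1]) if c != j]
    s_aa = sigma[np.ix_(keep, keep)]  # block for the retained coefficients
    s_aj = sigma[keep, j]             # cross-covariances with coefficient j
    # Schur complement: S_aa - S_aj S_jj^{-1} S_ja
    return s_aa - np.outer(s_aj, s_aj) / sigma[j, j]

# Simulated design matrix (illustrative only): intercept plus three covariates
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(30), rng.normal(size=(30, 3))])
j = 2
X_minus_j = np.delete(X, j, axis=1)

lhs = conditional_cov_given_zero(X, j)
rhs = np.linalg.inv(X_minus_j.T @ X_minus_j)
print(np.allclose(lhs, rhs))
```

The agreement reflects the fact that the conditional precision of a multivariate normal is the corresponding sub-block of the joint precision, which here is ${\mathbf {X}}_{-j}'{\mathbf {X}}_{-j}$.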

Proposition 2

Assume the same priors on ${\varvec{\gamma }}_{l,k}$ as in Proposition 1 and an additional prior on $\alpha $, $\pi (\alpha )={\varGamma }(\alpha |a_0, b_0)$, under both $H_1$ and $H_0$. Then, provided that all the expectations involved exist, $BF_{10}$ can be written as

$$\begin{aligned} BF_{10} = \left\{ p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\mathcal {D}})\right\} ^{-1} \left\{ E\left[ \frac{1}{p({\varvec{\varUpsilon }}_j={\mathbf {0}}|\alpha )} \right] \right\} ^{-1}, \end{aligned}$$

where the expectation is with respect to $p(\alpha |{\varvec{\varUpsilon }}_j={\mathbf {0}}, {\mathcal {D}})$.
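Proposition 2 suggests a simple Monte Carlo form once its two ingredients are available. The sketch below is an illustration under stated assumptions, not the estimator used in frailtyGAFT. It assumes that an estimate of $\log p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\mathcal {D}})$ has already been obtained, and that the values $\log p({\varvec{\varUpsilon }}_j={\mathbf {0}}|\alpha ^{(s)})$ have been evaluated at draws $\alpha ^{(s)}$ from (an approximation to) $p(\alpha |{\varvec{\varUpsilon }}_j={\mathbf {0}}, {\mathcal {D}})$. How those draws and that estimate are produced is not specified here, and the function and argument names are hypothetical.

```python
import numpy as np

def log_bf10(log_post_at_zero, log_prior_at_zero_draws):
    """Sketch of log BF_10 from Proposition 2.

    log_post_at_zero: estimate of log p(Upsilon_j = 0 | D) (assumed given).
    log_prior_at_zero_draws: log p(Upsilon_j = 0 | alpha_s) at draws alpha_s
        assumed to come from p(alpha | Upsilon_j = 0, D).
    """
    lp = np.asarray(log_prior_at_zero_draws, dtype=float)
    neg = -lp
    m = neg.max()
    # log of the Monte Carlo mean of 1 / p(Upsilon_j = 0 | alpha_s),
    # computed with the log-sum-exp trick for numerical stability
    log_mean_inv = m + np.log(np.mean(np.exp(neg - m)))
    return -log_post_at_zero - log_mean_inv
```

When $\alpha $ is fixed, all draws coincide and the expression reduces to the ratio $p({\varvec{\varUpsilon }}_j={\mathbf {0}}|\alpha )/p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\mathcal {D}})$ of prior to posterior density at zero, which is the Savage-Dickey form.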

Proof

First note that ${\varvec{\psi }}$ represents all model parameters other than ${\varvec{\varUpsilon }}_j$, and the prior for ${\varvec{\varUpsilon }}_j$ depends only on the precision parameter $\alpha $, so we have $p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\varvec{\psi }})=p({\varvec{\varUpsilon }}_j={\mathbf {0}}|\alpha )$. Also note that ${\mathcal {L}}({\varvec{\varUpsilon }}_j, {\varvec{\psi }})$ is the likelihood function, so we can denote it by $p({\mathcal {D}}|{\varvec{\varUpsilon }}_j, {\varvec{\psi }}) $. It follows that

$$\begin{aligned}&p({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }}|{\mathcal {D}})\int {\mathcal {L}}({\varvec{\varUpsilon }}_j, {\varvec{\psi }}) p({\varvec{\varUpsilon }}_j, {\varvec{\psi }})d({\varvec{\varUpsilon }}_j,{\varvec{\psi }}) \\&\quad = p({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }}|{\mathcal {D}})\int p({\mathcal {D}}|{\varvec{\varUpsilon }}_j, {\varvec{\psi }}) p({\varvec{\varUpsilon }}_j, {\varvec{\psi }})d({\varvec{\varUpsilon }}_j,{\varvec{\psi }})\\&\quad = p({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }}|{\mathcal {D}}) p({\mathcal {D}}) \\&\quad = p({\mathcal {D}}|{\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }}) p({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }})\\&\quad = {\mathcal {L}}({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }}) p({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }}). \end{aligned}$$

Then we have

$$\begin{aligned} BF_{10}^{-1}&= p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\mathcal {D}}) \int \frac{ {\mathcal {L}}({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }}) p({\varvec{\psi }})}{p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\mathcal {D}})\int {\mathcal {L}}({\varvec{\varUpsilon }}_j, {\varvec{\psi }}) p({\varvec{\varUpsilon }}_j, {\varvec{\psi }})d({\varvec{\varUpsilon }}_j,{\varvec{\psi }})}d{\varvec{\psi }}\\&= p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\mathcal {D}}) \int \frac{ {\mathcal {L}}({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }}) p({\varvec{\psi }}) p({\varvec{\psi }}|{\varvec{\varUpsilon }}_j={\mathbf {0}},{\mathcal {D}})}{p({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }}|{\mathcal {D}})\int {\mathcal {L}}({\varvec{\varUpsilon }}_j, {\varvec{\psi }}) p({\varvec{\varUpsilon }}_j, {\varvec{\psi }})d({\varvec{\varUpsilon }}_j,{\varvec{\psi }})}d{\varvec{\psi }}\\&= p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\mathcal {D}}) \int \frac{ {\mathcal {L}}({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }}) p({\varvec{\psi }}) p({\varvec{\psi }}|{\varvec{\varUpsilon }}_j={\mathbf {0}},{\mathcal {D}})}{{\mathcal {L}}({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }})p({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }})}d{\varvec{\psi }}\\&= p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\mathcal {D}}) \int \frac{ p({\varvec{\psi }}) p({\varvec{\psi }}|{\varvec{\varUpsilon }}_j={\mathbf {0}},{\mathcal {D}})}{p({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }})}d{\varvec{\psi }}\\&= p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\mathcal {D}}) \int \frac{ p({\varvec{\psi }}|{\varvec{\varUpsilon }}_j={\mathbf {0}},{\mathcal {D}})}{p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\varvec{\psi }})}d{\varvec{\psi }}\\&= p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\mathcal {D}}) \int \int \frac{1}{p({\varvec{\varUpsilon }}_j={\mathbf {0}}|\alpha )} p({\varvec{\psi }}_{-\alpha }, \alpha |{\varvec{\varUpsilon }}_j={\mathbf {0}},{\mathcal {D}}) d{\varvec{\psi }}_{-\alpha } d\alpha \\&= p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\mathcal {D}}) \int \frac{1}{p({\varvec{\varUpsilon }}_j={\mathbf {0}}|\alpha )} p(\alpha |{\varvec{\varUpsilon }}_j={\mathbf {0}},{\mathcal {D}}) d\alpha , \end{aligned}$$

where $({\varvec{\psi }}_{-\alpha }, \alpha )={\varvec{\psi }}$. $\square $

About this article

Cite this article

Zhou, H., Hanson, T. & Zhang, J. Generalized accelerated failure time spatial frailty model for arbitrarily censored data. Lifetime Data Anal 23, 495–515 (2017). https://doi.org/10.1007/s10985-016-9361-4

Received: 16 October 2014
Accepted: 12 March 2016
Published: 18 March 2016
Issue Date: July 2017
DOI: https://doi.org/10.1007/s10985-016-9361-4
