Generalized accelerated failure time spatial frailty model for arbitrarily censored data

Lifetime Data Analysis

Abstract

Flexible incorporation of both geographical patterning and risk effects in cancer survival models is becoming increasingly important, due in part to the recent availability of large cancer registries. Most spatial survival models stochastically order survival curves from different subpopulations. However, in epidemiological cancer studies it is common for survival curves from two subpopulations to cross, so interpretable standard survival models cannot be used without some modification. Common fixes are the inclusion of time-varying regression effects in the proportional hazards model or fully nonparametric modeling, either of which sacrifices the easy interpretability of the fitted model. To address this issue, we develop a generalized accelerated failure time model which allows stratification on continuous or categorical covariates and provides per-variable tests for whether stratification is necessary via novel approximate Bayes factors. The model is interpretable in terms of how median survival changes and is able to capture crossing survival curves in the presence of spatial correlation. A detailed Markov chain Monte Carlo algorithm is presented for posterior inference, and a freely available function frailtyGAFT is provided to fit the model in the R package spBayesSurv. We apply our approach to a subset of the prostate cancer data gathered for Louisiana by the Surveillance, Epidemiology, and End Results (SEER) program of the National Cancer Institute.



Acknowledgments

This work was supported by NCI grant 5R03CA176739. The authors would like to thank the editor, the associate editor, and the two referees for their valuable comments, which led to great improvements to the paper.

Author information

Corresponding author

Correspondence to Haiming Zhou.

Appendix

Proposition 1

Assume that \({\varvec{\gamma }}_{l,k}|\alpha \overset{ind.}{\sim } N_{q+1}\left( {\mathbf {0}}, \frac{2n}{\alpha \rho (l+1)} ({\mathbf {X}}'{\mathbf {X}})^{-1}\right) \) under \(H_1\) and \({\varvec{\gamma }}_{l,k,-j}|\alpha \overset{ind.}{\sim } N_{q}\left( {\mathbf {0}}, \frac{2n}{\alpha \rho (l+1)} ({\mathbf {X}}_{-j}'{\mathbf {X}}_{-j})^{-1}\right) \) under \(H_0\), where \(\alpha \) is fixed and \({\mathbf {X}}_{-j}\) is the design matrix \({\mathbf {X}}\) excluding the \((j+1)\)th column. Then Assumption (2.12) holds, and

$$\begin{aligned} p({\varvec{\varUpsilon }}_j={\mathbf {0}}|\alpha ) = \prod _{l=1}^{L-1}\prod _{k=1}^{2^l} \phi \left( 0\bigg |0, \frac{2n}{\alpha \rho (l+1)} ({\mathbf {X}}'{\mathbf {X}})_{jj}^{-1}\right) , \end{aligned}$$

where \(({\mathbf {X}}'{\mathbf {X}})_{jj}^{-1}\) is the \((j+1,j+1)\)th element of \(({\mathbf {X}}'{\mathbf {X}})^{-1}\), and \(\phi (\cdot |\mu , \sigma ^2)\) denotes the normal density with mean \(\mu \) and variance \(\sigma ^2\).
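The product in Proposition 1 is easiest to evaluate on the log scale, because the \(2^l\) factors at level \(l\) are identical. The following is a minimal sketch, not the authors' implementation: the function name, the argument names, and the choice to pass \(\rho \) as a callable and \(({\mathbf {X}}'{\mathbf {X}})_{jj}^{-1}\) as a precomputed scalar are all assumptions made for illustration.

```python
import math
from typing import Callable


def log_prior_density_at_zero(
    alpha: float,
    n: int,
    num_levels: int,
    xtx_inv_jj: float,
    rho: Callable[[int], float],
) -> float:
    """Log of p(Upsilon_j = 0 | alpha) as given in Proposition 1.

    alpha       precision parameter (held fixed)
    n           sample size appearing in the prior variance
    num_levels  L, so that l runs over 1, ..., L-1
    xtx_inv_jj  the (j+1, j+1)th element of (X'X)^{-1}, precomputed
    rho         the function rho(.) from the prior (hypothetical callable)
    """
    if alpha <= 0 or n <= 0 or xtx_inv_jj <= 0:
        raise ValueError("alpha, n and xtx_inv_jj must be positive")
    total = 0.0
    for l in range(1, num_levels):  # l = 1, ..., L-1
        variance = 2.0 * n / (alpha * rho(l + 1)) * xtx_inv_jj
        # log of the N(0, variance) density evaluated at 0
        log_phi_at_zero = -0.5 * math.log(2.0 * math.pi * variance)
        # k = 1, ..., 2^l contributes 2^l identical factors
        total += (2 ** l) * log_phi_at_zero
    return total
```

A call such as `log_prior_density_at_zero(1.0, 100, 4, 0.02, lambda l: l ** 2)` would evaluate the log density for a four-level tree; the quadratic `rho` used here is only an example value, not one taken from this appendix.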

Proof

Since \({\varvec{\gamma }}_{l,k}|\alpha \) is multivariate normal, the conditional distribution of \(({\varvec{\gamma }}_{l,k,-j}|{\varvec{\gamma }}_{l,k,j}=0,\alpha )\) is also multivariate normal:

$$\begin{aligned} p({\varvec{\gamma }}_{l,k,-j}|{\varvec{\gamma }}_{l,k,j}=0,\alpha )&\propto \exp \left\{ -\frac{\alpha \rho (l+1)}{4n} {\varvec{\gamma }}_{l,k}'({\mathbf {X}}'{\mathbf {X}}){\varvec{\gamma }}_{l,k} \right\} \nonumber \\&\propto \exp \left\{ -\frac{\alpha \rho (l+1)}{4n} {\varvec{\gamma }}_{l,k,-j}'({\mathbf {X}}_{-j}'{\mathbf {X}}_{-j}){\varvec{\gamma }}_{l,k,-j} \right\} \nonumber \\&\propto N_{q}\left( {\mathbf {0}}, \frac{2n}{\alpha \rho (l+1)} ({\mathbf {X}}_{-j}'{\mathbf {X}}_{-j})^{-1}\right) . \end{aligned}$$

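This implies that \(p({\varvec{\gamma }}_{l,k,-j}|{\varvec{\gamma }}_{l,k,j}=0,\alpha )=p_0({\varvec{\gamma }}_{l,k,-j}|\alpha )\) and, by independence across \((l,k)\), \(p({\varvec{\varUpsilon }}_{-j}|{\varvec{\varUpsilon }}_j={\mathbf {0}},\alpha )=p_0({\varvec{\varUpsilon }}_{-j}|\alpha )\). In addition, \(\alpha \) is fixed and \({\varvec{\varUpsilon }}_{-j}\) is independent of all other parameters in \({\varvec{\psi }}\), so Assumption (2.12) holds. Finally, under \(H_1\) the marginal distribution of \({\varvec{\gamma }}_{l,k,j}\) is \(N\left( 0, \frac{2n}{\alpha \rho (l+1)} ({\mathbf {X}}'{\mathbf {X}})_{jj}^{-1}\right) \), so evaluating each marginal density at zero and multiplying over the independent \((l,k)\) pairs gives the stated expression for \(p({\varvec{\varUpsilon }}_j={\mathbf {0}}|\alpha )\). \(\square \)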
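The key step of the proof, that conditioning on a zero coefficient turns the covariance \(c({\mathbf {X}}'{\mathbf {X}})^{-1}\) into \(c({\mathbf {X}}_{-j}'{\mathbf {X}}_{-j})^{-1}\), can be checked numerically in the smallest case of two coefficients. The sketch below is illustrative only: the function name and the numerical entries of \({\mathbf {X}}'{\mathbf {X}}\) are made up, and \(c\) stands in for the scalar \(2n/\{\alpha \rho (l+1)\}\).

```python
import math


def conditional_variance_2x2(a: float, b: float, d: float, c: float) -> float:
    """Var(g0 | g1 = 0) when (g0, g1) ~ N(0, c * inverse([[a, b], [b, d]])).

    [[a, b], [b, d]] plays the role of X'X, and the coefficient g1 is the
    one being set to zero, so X_{-j}'X_{-j} reduces to the scalar a.
    """
    det = a * d - b * b
    if a <= 0 or det <= 0:
        raise ValueError("X'X must be positive definite")
    # Entries of the covariance matrix c * (X'X)^{-1}
    s00 = c * d / det
    s01 = -c * b / det
    s11 = c * a / det
    # Standard Gaussian conditioning: S00 - S01 * S11^{-1} * S10
    return s00 - s01 * s01 / s11


# Hypothetical X'X entries and scale factor
a, b, d, c = 4.0, 1.5, 3.0, 0.7
conditional = conditional_variance_2x2(a, b, d, c)
reduced_model = c / a  # c * (X_{-j}'X_{-j})^{-1} in the scalar case
assert math.isclose(conditional, reduced_model)
```

The conditional mean is zero because the conditioning value is zero, so matching variances is enough to match the two normal distributions in this case.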

Proposition 2

Assume the same priors on \({\varvec{\gamma }}_{l,k}\) as in Proposition 1 and an additional prior \(\pi (\alpha )={\varGamma }(\alpha |a_0, b_0)\) on \(\alpha \) under both \(H_1\) and \(H_0\). Then, provided all the expectations involved exist, \(BF_{10}\) can be written as

$$\begin{aligned} BF_{10} = \left\{ p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\mathcal {D}})\right\} ^{-1} \left\{ E\left[ \frac{1}{p({\varvec{\varUpsilon }}_j={\mathbf {0}}|\alpha )} \right] \right\} ^{-1}, \end{aligned}$$

where the expectation is with respect to \(p(\alpha |{\varvec{\varUpsilon }}_j={\mathbf {0}}, {\mathcal {D}})\).

Proof

First note that \({\varvec{\psi }}\) represents all model parameters other than \({\varvec{\varUpsilon }}_j\), and the prior for \({\varvec{\varUpsilon }}_j\) depends only on the precision parameter \(\alpha \), so we have \(p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\varvec{\psi }})=p({\varvec{\varUpsilon }}_j={\mathbf {0}}|\alpha )\). Also note that \({\mathcal {L}}({\varvec{\varUpsilon }}_j, {\varvec{\psi }})\) is the likelihood function, so we can write it as \(p({\mathcal {D}}|{\varvec{\varUpsilon }}_j, {\varvec{\psi }}) \). It follows that

$$\begin{aligned}&p({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }}|{\mathcal {D}})\int {\mathcal {L}}({\varvec{\varUpsilon }}_j, {\varvec{\psi }}) p({\varvec{\varUpsilon }}_j, {\varvec{\psi }})d({\varvec{\varUpsilon }}_j,{\varvec{\psi }}) \\&\quad = p({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }}|{\mathcal {D}})\int p({\mathcal {D}}|{\varvec{\varUpsilon }}_j, {\varvec{\psi }}) p({\varvec{\varUpsilon }}_j, {\varvec{\psi }})d({\varvec{\varUpsilon }}_j,{\varvec{\psi }})\\&\quad = p({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }}|{\mathcal {D}}) p({\mathcal {D}}) \\&\quad = p({\mathcal {D}}|{\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }}) p({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }})\\&\quad = {\mathcal {L}}({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }}) p({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }}). \end{aligned}$$

Then we have

$$\begin{aligned} BF_{10}^{-1}&= p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\mathcal {D}}) \int \frac{ {\mathcal {L}}({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }}) p({\varvec{\psi }})}{p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\mathcal {D}})\int {\mathcal {L}}({\varvec{\varUpsilon }}_j, {\varvec{\psi }}) p({\varvec{\varUpsilon }}_j, {\varvec{\psi }})d({\varvec{\varUpsilon }}_j,{\varvec{\psi }})}d{\varvec{\psi }}\\&= p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\mathcal {D}}) \int \frac{ {\mathcal {L}}({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }}) p({\varvec{\psi }}) p({\varvec{\psi }}|{\varvec{\varUpsilon }}_j={\mathbf {0}},{\mathcal {D}})}{p({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }}|{\mathcal {D}})\int {\mathcal {L}}({\varvec{\varUpsilon }}_j, {\varvec{\psi }}) p({\varvec{\varUpsilon }}_j, {\varvec{\psi }})d({\varvec{\varUpsilon }}_j,{\varvec{\psi }})}d{\varvec{\psi }}\\&= p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\mathcal {D}}) \int \frac{ {\mathcal {L}}({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }}) p({\varvec{\psi }}) p({\varvec{\psi }}|{\varvec{\varUpsilon }}_j={\mathbf {0}},{\mathcal {D}})}{{\mathcal {L}}({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }})p({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }})}d{\varvec{\psi }}\\&= p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\mathcal {D}}) \int \frac{ p({\varvec{\psi }}) p({\varvec{\psi }}|{\varvec{\varUpsilon }}_j={\mathbf {0}},{\mathcal {D}})}{p({\varvec{\varUpsilon }}_j={\mathbf {0}}, {\varvec{\psi }})}d{\varvec{\psi }}\\&= p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\mathcal {D}}) \int \frac{ p({\varvec{\psi }}|{\varvec{\varUpsilon }}_j={\mathbf {0}},{\mathcal {D}})}{p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\varvec{\psi }})}d{\varvec{\psi }}\\&= p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\mathcal {D}}) \int \int \frac{1}{p({\varvec{\varUpsilon }}_j={\mathbf {0}}|\alpha )} p({\varvec{\psi }}_{-\alpha }, \alpha |{\varvec{\varUpsilon }}_j={\mathbf {0}},{\mathcal {D}}) d{\varvec{\psi }}_{-\alpha } d\alpha \\&= p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\mathcal {D}}) \int \frac{1}{p({\varvec{\varUpsilon }}_j={\mathbf {0}}|\alpha )} p(\alpha |{\varvec{\varUpsilon }}_j={\mathbf {0}},{\mathcal {D}}) d\alpha , \end{aligned}$$

where \(({\varvec{\psi }}_{-\alpha }, \alpha )={\varvec{\psi }}\). \(\square \)
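The last line of the derivation suggests a simple Monte Carlo estimate of \(BF_{10}\): average \(1/p({\varvec{\varUpsilon }}_j={\mathbf {0}}|\alpha )\) over draws of \(\alpha \) from \(p(\alpha |{\varvec{\varUpsilon }}_j={\mathbf {0}},{\mathcal {D}})\) and combine the result with an estimate of \(p({\varvec{\varUpsilon }}_j={\mathbf {0}}|{\mathcal {D}})\). The sketch below assumes both ingredients are already available on the log scale; how they are obtained is not covered in this appendix, and the function and argument names are hypothetical. It is not the spBayesSurv implementation.

```python
import math
from typing import Sequence


def log_bf10(
    log_post_density_at_zero: float,
    log_prior_density_draws: Sequence[float],
) -> float:
    """Log Bayes factor log(BF_10) from the identity in Proposition 2.

    log_post_density_at_zero  an estimate of log p(Upsilon_j = 0 | D)
                              (assumed to be supplied by the caller)
    log_prior_density_draws   log p(Upsilon_j = 0 | alpha_s) for draws
                              alpha_s from p(alpha | Upsilon_j = 0, D)
    """
    num_draws = len(log_prior_density_draws)
    if num_draws == 0:
        raise ValueError("at least one draw of alpha is required")
    # E[1 / p(Upsilon_j = 0 | alpha)] is a mean of exp(-log p); use the
    # log-sum-exp trick so very small densities do not overflow.
    neg = [-v for v in log_prior_density_draws]
    shift = max(neg)
    log_mean_inverse = (
        shift
        + math.log(sum(math.exp(v - shift) for v in neg))
        - math.log(num_draws)
    )
    # BF_10 = {p(Upsilon_j = 0 | D)}^{-1} * {E[1 / p(. | alpha)]}^{-1}
    return -log_post_density_at_zero - log_mean_inverse
```

Working on the log scale matters here because \(p({\varvec{\varUpsilon }}_j={\mathbf {0}}|\alpha )\) is a product of many normal densities and can be extremely small or large in absolute terms.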

Cite this article

Zhou, H., Hanson, T. & Zhang, J. Generalized accelerated failure time spatial frailty model for arbitrarily censored data. Lifetime Data Anal 23, 495–515 (2017). https://doi.org/10.1007/s10985-016-9361-4

  • DOI: https://doi.org/10.1007/s10985-016-9361-4
