Abstract
The joint modeling of longitudinal and survival data has received extraordinary attention in the statistics literature recently, with models and methods becoming increasingly complex. Most of these approaches pair a proportional hazards survival model with longitudinal trajectory modeling through parametric or nonparametric specifications. In this paper we closely examine one data set previously analyzed using a two-parameter parametric model for Mediterranean fruit fly (medfly) egg-laying trajectories paired with accelerated failure time and proportional hazards survival models. We consider parametric and nonparametric versions of these two models, as well as a proportional odds rate model paired with a wide variety of longitudinal trajectory assumptions reflecting the types of analyses seen in the literature. In addition to developing novel nonparametric Bayesian methods for joint models, we emphasize the importance of model selection from among joint and non-joint models. The default in the literature is to omit non-joint models from consideration at the outset. For the medfly data, a predictive diagnostic criterion suggests that both the choice of survival model and longitudinal assumptions can grossly affect model adequacy and prediction. Specifically for these data, the simple joint model used by Tseng et al. (Biometrika 92:587–603, 2005) and models with much more flexibility in their longitudinal components are predictively outperformed by simpler analyses. This case study underscores the need for data analysts to compare different joint models on the basis of predictive performance and to include non-joint models in the pool of candidates under consideration.
References
Aalen OO (1980) A model for nonparametric regression analysis of counting processes. In: Klonecki W, Kozek A, Rosinski J (eds) Mathematical statistics and probability theory. Lecture notes in statistics. Springer, New York, pp 1–25
Banerjee S, Gelfand AE, Finley AO, Sang H (2008) Gaussian predictive process models for large spatial datasets. J Roy Stat Soc Ser B 70: 825–848
Bedrick EJ, Christensen R, Johnson WO (2000) Bayesian accelerated failure time analysis with application to veterinary epidemiology. Stat Med 19: 221–237
Berger JO, Guglielmi A (2001) Bayesian testing of a parametric model versus nonparametric alternatives. J Am Stat Assoc 96: 174–184
Branscum AJ, Hanson TE (2008) Bayesian nonparametric meta-analysis using Polya tree mixture models. Biometrics 64: 825–833
Brown ER, Ibrahim JG (2003) A Bayesian semiparametric joint hierarchical model for longitudinal and survival data. Biometrics 59: 221–228
Brown ER, Ibrahim JG, DeGruttola V (2005) A flexible B-spline model for multiple longitudinal biomarkers and survival. Biometrics 61: 64–73
Bycott P, Taylor J (1998) A comparison of smoothing techniques for CD4 data measured with error in a time-dependent Cox proportional hazards model. Stat Med 17: 2061–2077
Carey JR (2003) Longevity: the biology and demography of life span. Princeton University Press, Princeton
Carey JR, Liedo P, Müller HG, Wang JL, Chiou JM (1998) Relationship of age patterns of fecundity to mortality, longevity, and lifetime reproduction in a large cohort of Mediterranean fruit fly females. J Gerontol A Biol Sci Med Sci 53: 245–251
Chen M-H, Shao Q-M, Xu D (2002) Sufficient and necessary conditions on the propriety of posterior distributions for generalized linear mixed models. Sankhya Ser A 64: 57–85
Chen M-H, Ibrahim JG, Shao Q-M (2006) Posterior propriety and computation for the Cox regression model with applications to missing covariates. Biometrika 93: 791–807
Chiou JM, Müller HG, Wang JL, Carey JR (2003) A functional multiplicative effects model for longitudinal data, with application to reproductive histories of female medflies. Stat Sinica 13: 1119–1133
Cox DR (1972) Regression models and life-tables (with discussion). J Roy Stat Soc Ser B 34: 187–220
Cox DR, Oakes D (1984) Analysis of survival data. Chapman and Hall, London
De Blasi P, Hjort NL (2007) Bayesian survival analysis in proportional hazard models with logistic relative risk. Scand J Stat 34: 229–257
Fahrmeir L, Kneib T (2009) Propriety of posteriors in structured additive regression models: theory and empirical evidence. J Stat Plann Infer 139: 843–859
Faucett CL, Thomas DC (1996) Simultaneously modelling censored survival data and repeatedly measured covariates: a Gibbs sampling approach. Stat Med 15: 1663–1685
Ferguson TS (1974) Prior distributions on spaces of probability measures. Ann Stat 2: 615–629
Geisser S, Eddy WF (1979) A predictive approach to model selection. J Am Stat Assoc 74: 153–160
Hanson TE (2006) Inference for mixtures of finite Polya tree models. J Am Stat Assoc 101: 1548–1565
Hanson T, Johnson WO (2002) Modeling regression error with a mixture of Polya trees. J Am Stat Assoc 97: 1020–1033
Hanson TE, Yang M (2007) Bayesian semiparametric proportional odds models. Biometrics 63: 88–95
Hanson T, Johnson WO, Laud P (2009) A unified approach to semiparametric inference for survival models with step-stress covariates. Can J Stat 37: 60–79
Higdon D (2001) Space and space-time modeling using process convolutions. Discussion paper 2001-2003, Institute for Statistics and Decision Sciences, Duke University
Hsieh F, Tseng Y-K, Wang J-L (2006) Joint modeling of survival and longitudinal data: likelihood approach revisited. Biometrics 62: 1037–1043
Ibrahim JG, Chen M-H, Sinha D (2001) Bayesian survival analysis. Springer-Verlag, New York
Kneib T (2006) Mixed model based inference in structured additive regression. PhD Thesis, Munich University
Laird NM, Ware JH (1982) Random-effects models for longitudinal data. Biometrics 38: 963–974
Lang S, Brezger A (2004) Bayesian P-splines. J Comput Graph Stat 13: 183–212
Lavine M (1992) Some aspects of Polya tree distributions for statistical modeling. Ann Stat 20: 1222–1235
Lavine M (1994) More aspects of Polya tree distributions for statistical modeling. Ann Stat 22: 1161–1176
Law NJ, Taylor JMG, Sandler H (2002) The joint modeling of a longitudinal disease progression marker and the failure time process in the presence of cure. Biostatistics 3: 547–563
Li Y, Lin X, Müller P (2009) Bayesian inference in semiparametric mixed models for longitudinal data. Biometrics. doi:10.1111/j.1541-0420.2009.01227.x
Martinussen T, Scheike TH (2006) Dynamic regression models for survival analysis. Springer, New York
Mauldin RD, Sudderth WD, Williams SC (1992) Polya trees and random distributions. Ann Stat 20: 1203–1221
Müller H-G, Stadtmüller U (2005) Generalized functional linear models. Ann Stat 33: 774–805
Müller H-G, Carey JR, Wu D, Liedo P, Vaupel JW (2001) Reproductive potential predicts longevity of female Mediterranean fruit flies. Proc Roy Soc Lond B 268: 445–450
Paddock SM, Ruggeri F, Lavine M, West M (2003) Randomized Polya tree models for nonparametric Bayesian inference. Stat Sinica 13: 443–460
Prentice RL (1982) Covariate measurement errors and parameter estimation in a failure time regression model. Biometrika 69: 331–342
Prentice RL, Kalbfleisch JD (1979) Hazard rate models with covariates. Biometrics 35: 25–39
Rice JA (2004) Functional and longitudinal data analysis: perspectives on smoothing. Stat Sinica 14: 631–647
Schoop R, Graf E, Schumacher M (2008) Quantifying the predictive performance of prognostic models for censored survival data with time-dependent covariates. Biometrics 64: 603–610
Song X, Wang CY (2008) Semiparametric approaches for joint modeling of longitudinal and survival data with time-varying coefficients. Biometrics 64: 557–566
Song X, Davidian M, Tsiatis AA (2002) An estimator for the proportional hazards model with multiple longitudinal covariates measured with error. Biostatistics 3: 511–528
Sundaram S (2006) Semiparametric inference in proportional odds model with time-dependent covariates. J Stat Plann Infer 136: 320–334
Therneau TM, Grambsch PM (2000) Modeling survival data: extending the Cox model. Springer, New York
Tseng Y-K, Hsieh F, Wang J-L (2005) Joint modelling of accelerated failure time and longitudinal data. Biometrika 92: 587–603
Tsiatis AA, Davidian M (2004) Joint modeling of longitudinal and time-to-event data: an overview. Stat Sinica 14: 809–834
Wang Y, Taylor JMG (2001) Jointly modeling longitudinal and event time data with application to acquired immunodeficiency syndrome. J Am Stat Assoc 96: 895–905
Wulfsohn MS, Tsiatis AA (1997) A joint model for survival and longitudinal data measured with error. Biometrics 53: 330–339
Yang R, Chen M-H (1995) Bayesian analysis for random coefficient regression models using noninformative priors. J Multivar Anal 55: 283–311
Zhang M, Davidian M (2008) “Smooth” semiparametric regression analysis for arbitrarily censored time-to-event data. Biometrics 64: 567–576
Acknowledgements
The authors thank Professor Yi-Kuan Tseng for providing \({\widehat{\mathbf {b}}_i}\) under the model fit in Tseng et al. (2005) and mapping them to the original data.
Open Access
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
Rights and permissions
Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Hanson, T.E., Branscum, A.J. & Johnson, W.O. Predictive comparison of joint longitudinal-survival modeling: a case study illustrating competing approaches. Lifetime Data Anal 17, 3–28 (2011). https://doi.org/10.1007/s10985-010-9162-0
DOI: https://doi.org/10.1007/s10985-010-9162-0