Frailty Models and Heterogeneity

Modeling Discrete Time-to-Event Data

Part of the book series: Springer Series in Statistics ((SSS))

Abstract

The regression modeling techniques considered in the previous chapters are based on the assumption that all relevant covariates are known and included in the analysis set. In practice, however, typically only a limited number of potentially influential variables are known, and it often happens that part of the heterogeneity in the population remains unobserved. In survival modeling, this “unobserved heterogeneity,” when ignored, may cause severe artifacts. This chapter presents various approaches to account for unobserved heterogeneity in discrete time-to-event models. We first consider the discrete hazard frailty model, which incorporates random intercept terms to account for subject-specific variations caused by unobserved covariate information (Sects. 9.1 and 9.2). In Sect. 9.3 discrete hazard frailty models are extended to the case where covariate effects are allowed to be smooth and nonlinear. The model class that is considered is the discrete additive hazard frailty model. Because model misspecification is a critical issue in random-effects models, Sect. 9.4 presents data-driven strategies for variable selection in discrete hazard frailty models. Alternative approaches to incorporate unobserved heterogeneity in discrete time-to-event models are presented in Sects. 9.5 and 9.6, which deal with penalized fixed-effects and finite mixture modeling, respectively. Finally, the connection between discrete hazard frailty models and sequential models in item response theory is investigated (Sect. 9.7).
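To illustrate the kind of artifact that ignored heterogeneity can produce, the following minimal sketch assumes a discrete hazard model with a logistic link and a normally distributed random intercept, in which every subject has a hazard that is constant over time. The function names, the parameter values (`gamma0`, `sigma`), and the use of simulated intercepts are illustrative assumptions, not taken from the chapter. When the intercepts are ignored, the population-level hazard falls from period to period, because subjects with high hazards leave the risk set early.

```python
import math
import random


def logistic(eta):
    """Inverse logit link: maps a linear predictor to a hazard in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def population_hazard(gamma0, sigma, periods, n_subjects=5000, seed=1):
    """Marginal hazard per period when the random intercept b_i is ignored.

    Assumed (hypothetical) model: subject i has the time-constant hazard
    logistic(gamma0 + b_i) with b_i drawn from N(0, sigma^2).
    """
    rng = random.Random(seed)
    hazards = [logistic(gamma0 + rng.gauss(0.0, sigma)) for _ in range(n_subjects)]
    # at_risk[i] is P(T_i >= t | b_i); every subject is at risk in period 1.
    at_risk = [1.0] * n_subjects
    marginal = []
    for _ in range(periods):
        expected_events = sum(s * lam for s, lam in zip(at_risk, hazards))
        marginal.append(expected_events / sum(at_risk))
        # Subjects survive the period with probability 1 - hazard.
        at_risk = [s * (1.0 - lam) for s, lam in zip(at_risk, hazards)]
    return marginal


homogeneous = population_hazard(gamma0=-1.0, sigma=0.0, periods=6)
heterogeneous = population_hazard(gamma0=-1.0, sigma=1.5, periods=6)
print([round(h, 3) for h in homogeneous])
print([round(h, 3) for h in heterogeneous])
```

With `sigma=0.0` the population hazard stays flat at `logistic(gamma0)`, while with `sigma=1.5` it declines over the periods even though no individual hazard changes. A model without a frailty term would read this decline as genuine negative duration dependence.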

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Tutz, G., Schmid, M. (2016). Frailty Models and Heterogeneity. In: Modeling Discrete Time-to-Event Data. Springer Series in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-28158-2_9
