Frailty Models and Heterogeneity

Tutz, Gerhard; Schmid, Matthias

doi:10.1007/978-3-319-28158-2_9

Part of the book series: Springer Series in Statistics (SSS)

Abstract

The regression modeling techniques considered in the previous chapters are based on the assumption that all relevant covariates are known and included in the analysis set. In practice, however, typically only a limited number of potentially influential variables are known, and it often happens that part of the heterogeneity in the population remains unobserved. In survival modeling, this “unobserved heterogeneity,” when ignored, may cause severe artifacts. This chapter presents various approaches to account for unobserved heterogeneity in discrete time-to-event models. We first consider the discrete hazard frailty model, which incorporates random intercept terms to account for subject-specific variations caused by unobserved covariate information (Sects. 9.1 and 9.2). In Sect. 9.3 discrete hazard frailty models are extended to the case where covariate effects are allowed to be smooth and nonlinear. The model class that is considered is the discrete additive hazard frailty model. Because model misspecification is a critical issue in random-effects models, Sect. 9.4 presents data-driven strategies for variable selection in discrete hazard frailty models. Alternative approaches to incorporate unobserved heterogeneity in discrete time-to-event models are presented in Sects. 9.5 and 9.6, which deal with penalized fixed-effects and finite mixture modeling, respectively. Finally, the connection between discrete hazard frailty models and sequential models in item response theory is investigated (Sect. 9.7).
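
The kind of artifact the abstract refers to can be illustrated with a small exact calculation. The sketch below is not taken from the chapter: it assumes a hypothetical population made up of two unobserved groups, each with a constant discrete hazard (0.1 and 0.4, in equal shares), and computes the hazard one would see if group membership were ignored. The function name `population_hazard` and all numeric values are illustrative assumptions.

```python
def population_hazard(hazards, weights, horizon):
    """Marginal discrete hazard for a finite mixture of constant group-level hazards.

    hazards: per-group probability of an event in a period, given survival so far
    weights: initial population share of each (unobserved) group
    horizon: number of discrete time periods to evaluate
    """
    result = []
    for t in range(1, horizon + 1):
        # Share of each latent group still at risk at the start of period t.
        at_risk = [w * (1.0 - h) ** (t - 1) for h, w in zip(hazards, weights)]
        # Share of each latent group experiencing the event in period t.
        events = [r * h for r, h in zip(at_risk, hazards)]
        result.append(sum(events) / sum(at_risk))
    return result


# Illustrative values only: two equally sized groups with constant hazards.
marginal = population_hazard(hazards=[0.1, 0.4], weights=[0.5, 0.5], horizon=10)
for t, lam in enumerate(marginal, start=1):
    print(f"t={t:2d}  population hazard={lam:.3f}")
```

Although every individual hazard is constant over time in this setup, the population-level hazard starts at 0.25 and declines toward 0.1, because high-risk subjects leave the risk set earlier. A model that ignores the grouping would read this as a time-dependent baseline hazard, which is one reason to model the heterogeneity explicitly, for example through random intercepts or finite mixtures.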

Author information

Authors and Affiliations

LMU Munich, Munich, Germany
Gerhard Tutz
University of Bonn, Bonn, Germany
Matthias Schmid

About this chapter

Cite this chapter

Tutz, G., Schmid, M. (2016). Frailty Models and Heterogeneity. In: Modeling Discrete Time-to-Event Data. Springer Series in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-28158-2_9

DOI: https://doi.org/10.1007/978-3-319-28158-2_9
Published: 15 June 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-28156-8
Online ISBN: 978-3-319-28158-2
eBook Packages: Mathematics and Statistics (R0)
