Abstract
We consider methods for the analysis of discrete-time recurrent event data when interest is mainly in prediction. The Aalen additive model provides an extremely simple and effective method for determining covariate effects for this type of data, especially in the presence of time-varying effects and time-varying covariates, including dynamic summaries of prior event history. For predictive purposes, however, the method is weakened by the possibility of negative estimates of event probabilities. The obvious alternative of a standard logistic regression analysis at each time point can be unstable when event frequency is low and maximum likelihood estimation is used. The Firth penalised likelihood approach is stable, but in removing bias in regression coefficients it introduces bias into predicted event probabilities. We propose an alternative modified penalised likelihood, intermediate between Firth and no penalty, as a pragmatic compromise between stability and bias. The methods are illustrated on two data sets.
References
Aalen OO, Fosen J, Weedon-Fekjær H, Borgan Ø, Husebye E (2004) Dynamic analysis of multivariate failure time data. Biometrics 60:764–773
Albert A, Anderson JA (1984) On the existence of maximum likelihood estimates in logistic regression models. Biometrika 71:1–10
Anscombe FJ (1956) On estimating binomial response relations. Biometrika 43:461–464
Berkson J (1953) A statistically precise and relatively simple method of estimating the bioassay with quantal response, based on the logistic function. J Am Statist Assoc 48:565–599
Borgan Ø, Fiaccone RL, Henderson R, Barreto ML (2007) Dynamic analysis of recurrent event data with missing observations, with application to infant diarrhoea in Brazil. Scandinavian J Statist 34:53–69
Cox DR (1970) Analysis of binary data, 1st edn. Chapman and Hall, London
Diggle PJ, Heagerty PJ, Liang K-Y, Zeger S (2002) Analysis of longitudinal data, 2nd edn. Oxford University Press, Oxford
Ferro CAT, Stephenson DB (2012) Deterministic forecasts of extreme events and warnings. In: Jolliffe IT, Stephenson DB (eds) Forecast verification: a practitioner’s guide in atmospheric science, 2nd edn. Wiley, Chichester
Firth D (1993) Bias reduction of maximum likelihood estimates. Biometrika 80:27–38
Fosen J, Borgan Ø, Weedon-Fekjær H, Aalen OO (2006) Dynamic analysis of recurrent event data using the additive hazard model. Biometr J 48:381–398
Haldane JBS (1956) The estimation and significance of the logarithm of a ratio of frequencies. Ann Human Genet 20:309–311
Heinze G, Puhr R (2010) Bias-reduced and separation-proof conditional logistic regression with small or sparse data sets. Statist Med 29:770–777
Heinze G, Schemper M (2002) A solution to the problem of separation in logistic regression. Statist Med 21:2409–2419
Heinze G (2006) A comparative investigation of methods for logistic regression with separated or nearly separated data. Statist Med 25:4216–4226
Henderson R, Diggle PJ, Dobson A (2002) Identification and efficacy of longitudinal markers for survival. Biostatistics 3:33–50
Henderson R, Keiding N (2005a) Individual survival time prediction using statistical models. (Forudsigelse af individuelle levetider ved hjælp af statistiske modeller). Danish Med J 167/10:1174–1177
Henderson R, Keiding N (2005b) Individual survival time prediction using statistical models. J Med Ethics 31:703–706
Jachan M, Feldwisch H, Posdziech F, Brandt A, Altenmüller D-M, Schulze-Bonhage A, Timmer J, Schelter B (2009) Probabilistic forecasts of epileptic seizures and evaluation by the Brier score. Fourth Eur Conf Int Federation Med Biol Eng Proc 22:1701–1705
Martinussen T, Scheike TH (2006) Dynamic regression models for survival data. Springer, New York
Mehta CR, Patel NR (1995) Exact logistic regression: theory and examples. Statist Med 14:2143–2160
Proust-Lima C, Taylor JMG (2009) Development and validation of a dynamic prognostic tool for prostate cancer recurrence using repeated measures of posttreatment PSA: a joint modeling approach. Biostatistics 10:535–549
van Houwelingen H, Putter H (2011) Dynamic prediction in clinical survival analysis. Chapman and Hall/CRC Press, London
Acknowledgments
The research of Rosemeire Fiaccone was supported in part by National Council of Technological and Scientific Development—CNPq, Brazil (Num. 237094/2012-6, 480614/2011-3). Robin Henderson benefited from participation in Deutsche Forschungsgemeinschaft research programme FR 3070/1-1. We are grateful for the comments of the reviewers.
Appendix: logistic regression with separation not detected in R
In the following, y is a vector of length 100, with all elements zero except the first, which is one, and x1 is a vector of 50 zeros followed by 50 ones, representing two equally sized groups. If we attempt to fit the logistic regression \(\mathrm{logit}\{P(y_i=1)\}=\beta _0+\beta _1 x_{1i}\)
then clearly a perfect fit is obtained at \(\hat{\beta }_0=\mathrm{logit}(1/50)=-3.892\) and \(\hat{\beta }_1=-\infty \). Some R (version 3.1.2) output, edited to remove unnecessary material (marked by [...]), is:
Of most concern is the statement of convergence, which is true in the sense that the maximised likelihood has indeed converged: moving either coefficient away from its current value leads to no material improvement. The fitted probabilities \(\hat{\pi }_0\) and \(\hat{\pi }_1\) for the two groups are accurate, but clearly \(\hat{\beta }_1\) is unrealistic. Uncritical assessment of the results might lead to this problem being missed.
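The behaviour described above can be reproduced outside R. The following is a minimal sketch in Python with NumPy, not the authors' R session: it runs plain Fisher scoring for the two-group example and records the deviance at each step. The cap of 25 iterations and the names `deviance` and `deviances` are illustrative assumptions.

```python
import numpy as np

# Data as described in the appendix: one event among the first 50 subjects,
# none among the last 50.
y = np.zeros(100)
y[0] = 1.0
x1 = np.repeat([0.0, 1.0], 50)
X = np.column_stack([np.ones(100), x1])  # intercept and group indicator


def deviance(beta):
    """Return -2 * log-likelihood of the logistic model at beta."""
    eta = X @ beta
    # log(1 + exp(eta)) is computed stably with logaddexp
    return -2.0 * np.sum(y * eta - np.logaddexp(0.0, eta))


beta = np.zeros(2)
deviances = []
for iteration in range(25):  # arbitrary cap, chosen for illustration only
    pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = pi * (1.0 - pi)
    info = X.T @ (w[:, None] * X)  # Fisher information X'WX
    score = X.T @ (y - pi)         # score vector X'(y - pi)
    beta = beta + np.linalg.solve(info, score)
    deviances.append(deviance(beta))

print("beta0, beta1:", beta)
print("last change in deviance:", deviances[-2] - deviances[-1])
```

In this sketch the intercept settles at logit(1/50) within a few iterations, while the slope keeps falling by roughly one unit per iteration and the deviance changes by a negligible amount. A stopping rule based only on the change in deviance would therefore declare convergence at a finite, arbitrary value of the slope.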
If we use the Firth correction as implemented in Kosmidis’ bias reduction package brglm, we obtain:
Hence the coefficients are stabilised, at the expense of higher values of \(\hat{\pi }_0\) and \(\hat{\pi }_1\), as expected. Heinze’s package logistf gives the same results.
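The stabilising effect of the Firth penalty on this example can be sketched as follows. This is a hedged illustration in Python with NumPy and does not call brglm or logistf. It assumes the usual modified score for Firth-type logistic regression, \(X^{T}\{y-\pi +h(1/2-\pi )\}\), where \(h\) holds the diagonal of the hat matrix. The function name `firth_logistic` and its defaults are hypothetical. For this saturated two-group model the penalised fitted probabilities reduce to adding one half to each cell, \((y_g+1/2)/(n_g+1)\), which the final lines use as a check.

```python
import numpy as np


def firth_logistic(X, y, n_iter=50, tol=1e-10):
    """Fisher scoring on the Firth-modified score (illustrative sketch)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = pi * (1.0 - pi)
        info_inv = np.linalg.inv(X.T @ (w[:, None] * X))
        # Hat-matrix diagonal: h_i = w_i * x_i' (X'WX)^{-1} x_i
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
        # Modified score: X'(y - pi + h * (1/2 - pi))
        score = X.T @ (y - pi + h * (0.5 - pi))
        step = info_inv @ score
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Same data as in the appendix: a single event, in the first group.
y = np.zeros(100)
y[0] = 1.0
x1 = np.repeat([0.0, 1.0], 50)
X = np.column_stack([np.ones(100), x1])

beta_firth = firth_logistic(X, y)
# Fitted probabilities for the groups x1 = 0 and x1 = 1
pi_firth = 1.0 / (1.0 + np.exp(-np.array([beta_firth[0], beta_firth.sum()])))

# Closed form for this two-group case: add 1/2 to each cell
pi_closed = np.array([(1 + 0.5) / (50 + 1), (0 + 0.5) / (50 + 1)])
print("Firth coefficients:", beta_firth)
print("Firth probabilities:", pi_firth, "closed form:", pi_closed)
```

Both coefficients are finite here, and both fitted probabilities exceed the raw group proportions of 1/50 and 0, which matches the trade-off described above.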
Cite this article
Elgmati, E., Fiaccone, R.L., Henderson, R. et al. Penalised logistic regression and dynamic prediction for discrete-time recurrent event data. Lifetime Data Anal 21, 542–560 (2015). https://doi.org/10.1007/s10985-015-9321-4
DOI: https://doi.org/10.1007/s10985-015-9321-4