Consistent and robust inference in hazard probability and odds models with discrete-time survival data

Lifetime Data Analysis

Abstract

For discrete-time survival data, conditional likelihood inference in Cox’s hazard odds model is theoretically desirable, but exact calculation is numerically intractable with a moderate to large number of tied events. Unconditional maximum likelihood estimation over both regression coefficients and baseline hazard probabilities can be problematic with a large number of time intervals. We develop new methods and theory using numerically simple estimating functions, along with model-based and model-robust variance estimation, in hazard probability and odds models. For the hazard probability model, we derive as a consistent estimator the Breslow–Peto estimator, previously known as an approximation to the conditional likelihood estimator in the hazard odds model. For the hazard odds model, we propose a weighted Mantel–Haenszel estimator, which satisfies conditional unbiasedness given the numbers of events in addition to the risk sets and covariates, similarly to the conditional likelihood estimator. Our methods are expected to perform satisfactorily in a broad range of settings, with small or large numbers of tied events corresponding to a large or small number of time intervals. The methods are implemented in the R package dSurvival.
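
As a rough illustration of the kind of numerically simple estimator the abstract refers to, the sketch below computes the classical, unweighted Mantel–Haenszel common odds ratio (Mantel and Haenszel 1959) for two-sample discrete-time data, treating each time interval as one 2 by 2 table of events against survivors in the risk set. It is not the weighted Mantel–Haenszel estimator proposed in the article and does not reproduce the dSurvival package; the function name, the table layout, and the restriction to a single binary covariate are assumptions made for this example.

```python
from typing import Iterable, Tuple

# One table per time interval: (d1, n1, d0, n0), where
#   d1, n1 = number of events and number at risk in group 1
#   d0, n0 = number of events and number at risk in group 0
# This layout is an assumption of the sketch, not the article's notation.
Table = Tuple[int, int, int, int]


def mantel_haenszel_odds_ratio(tables: Iterable[Table]) -> float:
    """Classical (unweighted) Mantel-Haenszel common odds ratio.

    Returns sum_j d1_j (n0_j - d0_j) / n_j divided by
            sum_j d0_j (n1_j - d1_j) / n_j, with n_j = n1_j + n0_j.
    """
    numerator = 0.0
    denominator = 0.0
    for d1, n1, d0, n0 in tables:
        n = n1 + n0
        if n == 0:
            continue  # an empty risk set carries no information
        numerator += d1 * (n0 - d0) / n
        denominator += d0 * (n1 - d1) / n
    if denominator == 0.0:
        raise ValueError("odds ratio is undefined: denominator sum is zero")
    return numerator / denominator


if __name__ == "__main__":
    # Two hypothetical time intervals with tied events in each group.
    example = [(2, 10, 1, 10), (3, 8, 2, 9)]
    print(mantel_haenszel_odds_ratio(example))
```

Each interval contributes only sums and products of counts, so the cost grows linearly in the number of intervals regardless of how many events are tied, in contrast to exact conditional likelihood calculation.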

References

  • Allison PD (1982) Discrete-time methods for the analysis of event histories. Sociol Methodol 13:61–98

  • Andersen PK, Borgan O, Gill RD, Keiding N (1993) Statistical Models Based on Counting Processes. Springer, New York

  • Breslow NE (1974) Covariance analysis of censored survival data. Biometrics 30:89–100

  • Breslow NE (1981) Odds ratio estimators when the data are sparse. Biometrika 68:73–84

  • Buja A, Berk R, Brown L, George E, Pitkin E, Traskin M, Zhao L, Zhang K (2019) Models as approximations I: Consequences illustrated with linear regression. Stat Sci 34:523–544

  • Cochran WG (1954) Some methods for strengthening the common \(\chi ^2\) tests. Biometrics 10:417–451

  • Cox DR (1972) Regression models and life tables (with discussion). J R Stat Soc Ser B 34:187–220

  • Cox DR, Oakes D (1984) Analysis of survival data. Chapman & Hall, London

  • Efron B (1977) The efficiency of Cox’s likelihood function for censored data. J Am Stat Assoc 72:557–565

  • Kalbfleisch JD, Prentice RL (1980) The statistical analysis of failure time data. Wiley, New York

  • Kaplan EL, Meier P (1958) Nonparametric estimation from incomplete observations. J Am Stat Assoc 53:457–481

  • Lin DY, Wei LJ (1989) The robust inference for the Cox proportional hazards model. J Am Stat Assoc 84:1074–1079

  • Lindsay BG (1980) Nuisance parameters, mixture models and the efficiency of partial likelihood estimators. Philos Trans R Soc Ser A 296:639–665

  • Lindsay BG (1983) Efficiency of the conditional score in a mixture setting. Ann Stat 11:486–497

  • Manski CF (1988) Analog estimation methods in econometrics. Chapman & Hall, New York

  • Mantel N, Haenszel WM (1959) Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst 22:719–748

  • Peto R (1972) Contribution to the discussion of Cox (1972): regression models and life tables. J R Stat Soc Ser B 34:205–207

  • Prentice RL, Gloeckler LA (1978) Regression analysis of grouped survival data with application to breast cancer data. Biometrics 34:57–67

  • Robins JM, Breslow NE, Greenland S (1986) Estimators of the Mantel-Haenszel variance consistent in both sparse data and large strata limiting models. Biometrics 42:311–324

  • Tan Z (2020a) Regularized calibrated estimation of propensity scores with model misspecification and high-dimensional data. Biometrika 107:137–158

  • Tan Z (2020b) dSurvival: discrete-time survival analysis, R package version 1.0. http://www.stat.rutgers.edu/~ztan

  • Tan Z (2022) Analysis of odds, probability, and hazard ratios: from 2 by 2 tables to two-sample survival data. J Stat Plan Inference 221:248–265

  • Therneau TM (2015) A Package for Survival Analysis, version 2.38

  • Therneau TM, Grambsch PM, Fleming TR (1990) Martingale based residuals for survival models. Biometrika 77:147–160

  • Therneau TM, Grambsch PM (2000) Modeling survival data: extending the Cox model. Springer, New York

  • Thompson WA Jr (1977) On the treatment of grouped observations in life studies. Biometrics 33:463–470

  • Tsiatis AA (1981) A large sample study of Cox’s regression model. Ann Stat 9:93–108

  • White H (1982) Maximum likelihood estimation of misspecified models. Econometrica 50:1–25

  • Willett JB, Singer JD (2004) Discrete-time survival analysis. In: Kaplan D (ed) SAGE handbook of quantitative methodology for the social sciences, pp 200–213

Author information

Corresponding author

Correspondence to Zhiqiang Tan.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 306 KB)

About this article

Cite this article

Tan, Z. Consistent and robust inference in hazard probability and odds models with discrete-time survival data. Lifetime Data Anal 29, 555–584 (2023). https://doi.org/10.1007/s10985-022-09585-1

  • DOI: https://doi.org/10.1007/s10985-022-09585-1
