What price semiparametric Cox regression?

  • Martin Jullum
  • Nils Lid Hjort

Abstract

Cox’s proportional hazards regression model is the standard method for modelling censored life-time data with covariates. In its standard form, this method relies on a semiparametric proportional hazards structure, leaving the baseline unspecified. Naturally, specifying a parametric model also for the baseline hazard, leading to fully parametric Cox models, will be more efficient when the parametric model is correct, or close to correct. The aim of this paper is two-fold. (a) We compare parametric and semiparametric models in terms of their asymptotic relative efficiencies when estimating different quantities. We find that for some quantities the gain of restricting the model space is substantial, while it is negligible for others. (b) To deal with such selection in practice we develop certain focused and averaged focused information criteria (FIC and AFIC). These aim at selecting the most appropriate proportional hazards models for given purposes. Our methodology applies also to the simpler case without covariates, when comparing Kaplan–Meier and Nelson–Aalen estimators to parametric counterparts. Applications to real data are also provided, along with analyses of theoretical behavioural aspects of our methods.
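The covariate-free comparison mentioned above can be illustrated with a minimal sketch that sets the nonparametric Kaplan–Meier estimate of a survival probability beside a parametric counterpart. The constant-hazard (exponential) model, the function names, and the toy data are all assumptions made for illustration; none of them is taken from the paper, and the sketch does not implement the FIC or AFIC criteria.

```python
import math

def kaplan_meier(times, events, t):
    """Kaplan-Meier estimate of S(t); events[i] == 1 marks an observed failure,
    events[i] == 0 a right-censored observation."""
    surv = 1.0
    # Distinct observed failure times up to and including t, in increasing order.
    failure_times = sorted({ti for ti, di in zip(times, events) if di == 1 and ti <= t})
    for u in failure_times:
        at_risk = sum(1 for ti in times if ti >= u)
        deaths = sum(1 for ti, di in zip(times, events) if ti == u and di == 1)
        surv *= 1.0 - deaths / at_risk
    return surv

def exponential_survival(times, events, t):
    """Parametric estimate of S(t) under an assumed constant hazard.
    The maximum likelihood rate is (number of failures) / (total time at risk)."""
    rate = sum(events) / sum(times)
    return math.exp(-rate * t)

# Hypothetical toy data: follow-up times and failure indicators.
times = [2.0, 3.0, 3.0, 5.0, 8.0, 9.0]
events = [1, 1, 0, 1, 0, 1]

km = kaplan_meier(times, events, 4.0)           # nonparametric estimate of S(4)
par = exponential_survival(times, events, 4.0)  # parametric estimate of S(4)
print(f"Kaplan-Meier: {km:.4f}, exponential: {par:.4f}")
```

When the constant-hazard assumption is close to correct, the parametric estimate typically has smaller variance, and when it is wrong, it is biased. That bias-variance trade-off for a chosen quantity of interest is the kind of comparison the abstract describes.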

Keywords

Cox regression; Focused information criteria; Model selection; Parametrics and semiparametrics; Survival data

Notes

Acknowledgements

Our efforts have been supported in part by the Norwegian Research Council, through the project FocuStat (Focus Driven Statistical Inference With Complex Data) and the research-based innovation centre Statistics for Innovation, (sfi)². We are also grateful to the reviewers and editor Mei-Ling T. Lee for constructive comments which led to an improved presentation.

Supplementary material

Supplementary material 1: 10985_2018_9450_MOESM1_ESM.pdf (PDF, 203 KB)

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Mathematics, University of Oslo, Oslo, Norway