Computational Statistics, Volume 22, Issue 4, pp 619–634

Selection between proportional and stratified hazards models based on expected log-likelihood

  • Benoit Liquet
  • Jérôme Saracco
  • Daniel Commenges
Original Paper


The problem of selecting between semi-parametric and proportional hazards models is considered. We propose to base this choice on the expectation of the log-likelihood (ELL), which can be estimated by the likelihood cross-validation (LCV) criterion. The criterion is used to choose an estimator within families of semi-parametric estimators defined by penalized likelihood. A simulation study shows that, in terms of Kullback–Leibler distance, the ELL criterion performs nearly as well in this problem as the optimal Kullback–Leibler criterion, and that LCV performs reasonably well. The approach is applied to a model of the age-specific risk of dementia as a function of sex and educational level, using data from a large cohort study.
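The idea of choosing between estimators by likelihood cross-validation can be illustrated with a minimal sketch. It is a deliberately simplified stand-in and not the method of the paper: it assumes constant (exponential) hazards with right censoring instead of penalized-likelihood estimators, compares a single pooled hazard against one hazard per stratum, and uses a leave-one-out score in which higher is better. All function names, the simulated data, and the parameter values are hypothetical.

```python
import math
import random


def fit_rate(times, events):
    """MLE of a constant hazard under right censoring: events / time at risk."""
    return sum(events) / sum(times)


def loglik_one(rate, t, d):
    """Log-likelihood contribution of one observation under a constant hazard."""
    return (math.log(rate) if d else 0.0) - rate * t


def lcv(times, events, groups, stratified):
    """Leave-one-out likelihood cross-validation score (higher is better here)."""
    n = len(times)
    total = 0.0
    for i in range(n):
        # Training set: everyone except i, restricted to i's stratum if stratified.
        keep = [j for j in range(n)
                if j != i and (not stratified or groups[j] == groups[i])]
        rate = fit_rate([times[j] for j in keep], [events[j] for j in keep])
        total += loglik_one(rate, times[i], events[i])
    return total / n


def simulate(n, rates, censor_time, seed):
    """Hypothetical data: two alternating strata, administrative censoring."""
    rng = random.Random(seed)
    times, events, groups = [], [], []
    for i in range(n):
        g = i % 2
        t = rng.expovariate(rates[g])
        times.append(min(t, censor_time))
        events.append(1 if t <= censor_time else 0)
        groups.append(g)
    return times, events, groups


# Assumed setting: the two strata have clearly different hazards.
times, events, groups = simulate(200, rates=(0.5, 2.0), censor_time=3.0, seed=0)
lcv_pooled = lcv(times, events, groups, stratified=False)
lcv_strat = lcv(times, events, groups, stratified=True)
best = "stratified" if lcv_strat > lcv_pooled else "pooled"
print(f"LCV pooled={lcv_pooled:.4f}, stratified={lcv_strat:.4f}, choose {best}")
```

In this toy setting the held-out log-likelihood plays the role of an estimate of the ELL; whichever estimator scores higher is retained. The sign convention and the leave-one-out scheme are choices made for the sketch and may differ from those used in the paper.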


Kullback–Leibler information; Likelihood cross-validation; Model selection; Proportional hazards model; Smoothing; Stratified hazards model





Copyright information

© Springer-Verlag 2007

Authors and Affiliations

  • Benoit Liquet (1)
  • Jérôme Saracco (2)
  • Daniel Commenges (3)
  1. INSERM U875, ISPED, Université Bordeaux 2, Bordeaux Cedex, France
  2. GREThA, UMR CNRS 5113, Université Montesquieu-Bordeaux IV, Pessac Cedex, France
  3. INSERM U875, ISPED, Université Bordeaux 2, Bordeaux Cedex, France
