Weighted Lindley frailty model: estimation and application to lung cancer data

Abstract

In this paper, we propose a novel frailty model for unobserved heterogeneity in survival data. The model uses a weighted Lindley distribution as the frailty distribution. This distribution has a simple Laplace transform, which makes it straightforward to obtain the marginal survival and hazard functions. We take the Weibull and Gompertz hazard functions as baseline hazards. A classical inference procedure based on the maximum likelihood method is presented. Extensive simulation studies are performed to verify the behavior of the maximum likelihood estimators under different proportions of right-censoring and to assess the performance of the likelihood ratio test for detecting unobserved heterogeneity at different sample sizes. Finally, to demonstrate the applicability of the proposed model, we use it to analyze a medical dataset from a population-based study of incident cases of lung cancer diagnosed in the state of São Paulo, Brazil.
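
As a brief worked illustration of why a closed-form Laplace transform is convenient, the sketch below uses the standard two-parameter weighted Lindley density with shape \(\phi >0\) and rate \(\theta >0\). The symbols \(Z\) (frailty), \(\theta\), \(\phi\), \(H_0\) (baseline cumulative hazard) and \(x^\top \beta\) (linear predictor) are notation assumed here for illustration, and the parametrization used in the paper itself may differ.

\[
f(z)=\frac{\theta ^{\phi +1}}{(\theta +\phi )\,\Gamma (\phi )}\,z^{\phi -1}(1+z)\,e^{-\theta z},\qquad z>0,
\]
\[
\mathcal {L}(s)=E\left[ e^{-sZ}\right] =\frac{\theta ^{\phi +1}}{(\theta +\phi )\,\Gamma (\phi )}\left[ \frac{\Gamma (\phi )}{(\theta +s)^{\phi }}+\frac{\Gamma (\phi +1)}{(\theta +s)^{\phi +1}}\right] =\frac{\theta ^{\phi +1}\,(\theta +\phi +s)}{(\theta +\phi )\,(\theta +s)^{\phi +1}}.
\]

If the conditional hazard is \(Z\,h_0(t)\exp (x^\top \beta )\), integrating the frailty out of \(\exp \{-Z\,H_0(t)\exp (x^\top \beta )\}\) gives the marginal survival function directly as \(S(t\mid x)=\mathcal {L}\left( H_0(t)\exp (x^\top \beta )\right)\), and the marginal hazard follows as \(-\partial \log S(t\mid x)/\partial t\).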

Notes

  1. ICD-10 is the \(10^\mathrm{th}\) revision of the International Statistical Classification of Diseases and Related Health Problems (ICD), a medical classification list by the World Health Organization (WHO).

Acknowledgements

The authors are very grateful to the Associate Editor and the anonymous referees for their helpful comments, which improved the manuscript. The authors are also very grateful to the São Paulo Oncocenter Foundation (FOSP) for providing the lung cancer dataset. Alex Mota acknowledges a grant from the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Finance Code 001. Jeremias Leão is supported by FAPEAM. Francisco Louzada is supported by the Brazilian agencies CNPq (grant number 301976/2017-1) and FAPESP (grant number 2013/07375-0).

Author information

Corresponding author

Correspondence to Alex Mota.

Ethics declarations

Data availability statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A1—Simulation results for the WL frailty model with Gompertz baseline hazard function

Here, we fixed \(\kappa =0.5\), \(\rho =0.6\), \(\sigma ^2=0.8\), and \(\beta _1=0.7\). We analyzed the performance of the ML estimates using the same criteria adopted for the model with the Weibull baseline hazard function. In general, we observed similar results. However, when comparing the two studies, we noticed that the RMSE, SD, and CP were better for the model with the Weibull baseline hazard function.
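
The data-generating step of such a study can be sketched as follows. This is a minimal illustration and not the authors' code. It assumes a Gompertz baseline hazard of the form \(h_0(t)=\kappa \exp (\rho t)\), a mean-one weighted Lindley frailty whose variance \(\sigma ^2\) is mapped to a (rate, shape) pair by the helper `wl_params_from_variance` (a mapping derived for this sketch from \(E[Z]=1\)), a hypothetical binary covariate, and uniform censoring with an upper bound supplied by the caller. All function names are illustrative.

```python
import math
import random


def wl_params_from_variance(sigma2):
    """Map a frailty variance sigma2 > 0 to (theta, phi) with E[Z] = 1.

    Assumption of this sketch: E[Z] = 1 forces theta**2 = phi * (phi + 1),
    which in turn gives Var(Z) = 2 / (theta + phi).
    """
    denom = sigma2 * (4.0 + sigma2)
    theta = 2.0 * (2.0 + sigma2) / denom
    phi = 4.0 / denom
    return theta, phi


def sample_wl(theta, phi, rng):
    """Draw from WL(theta, phi) via its two-component gamma mixture."""
    # Gamma(phi, rate theta) with prob. theta / (theta + phi),
    # Gamma(phi + 1, rate theta) otherwise.
    shape = phi if rng.random() < theta / (theta + phi) else phi + 1.0
    return rng.gammavariate(shape, 1.0 / theta)  # second argument is the scale


def simulate(n, kappa, rho, sigma2, beta1, cens_upper, seed=1):
    """Return a list of (observed_time, event_indicator, x) tuples."""
    rng = random.Random(seed)
    theta, phi = wl_params_from_variance(sigma2)
    data = []
    for _ in range(n):
        x = 1.0 if rng.random() < 0.5 else 0.0  # hypothetical binary covariate
        z = sample_wl(theta, phi, rng)
        # Invert z * exp(beta1 * x) * H0(T) = E with E ~ Exp(1), where
        # H0(t) = (kappa / rho) * (exp(rho * t) - 1) is the assumed Gompertz form.
        e = rng.expovariate(1.0)
        t = math.log1p(rho * e / (kappa * z * math.exp(beta1 * x))) / rho
        c = rng.uniform(0.0, cens_upper)  # uniform right-censoring time
        data.append((min(t, c), int(t <= c), x))
    return data


if __name__ == "__main__":
    sample = simulate(500, kappa=0.5, rho=0.6, sigma2=0.8, beta1=0.7, cens_upper=14.0)
    censored = 1.0 - sum(d for _, d, _ in sample) / len(sample)
    print(f"proportion censored: {censored:.3f}")
```

The mixture representation avoids rejection sampling: only a uniform draw and one gamma draw are needed per frailty value. The censoring proportion printed by the usage example depends on the assumed Gompertz form and covariate, so it need not match the proportions reported in the paper.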

Table 6 Bias, RMSE and SD of the ML estimates, and empirical CP of \(95\%\) asymptotic confidence intervals for the simulated data of the WL frailty model with Gompertz baseline hazard function

We also repeated the simulation study to analyze the performance of the LR test for \(H_0:\sigma ^2=0\). In this study, we set \(\kappa =0.5\), \(\rho =0.6\), \(\beta _1=0.7\) and \(\sigma ^2\in \{0, 0.01, 0.10, 0.20, 0.50, 0.75, 1.00, 1.50\}\). The sample size was configured to study the model with the Gompertz baseline hazard function. The censoring times were generated from the \(\text {Uniform}(0,14)\) distribution, with the proportion of censored observations varying from 5% to 17%. Again, the results are similar to those for the model with the Weibull baseline hazard function. However, a test power of at least 0.9 requires \(\sigma ^2 \ge 0.75\) and \(n\ge 500\).

Table 7 Rejection rates of the null hypothesis (absence of unobserved heterogeneity) at the \(5\%\) nominal significance level for several levels of unobserved heterogeneity and several sample sizes, considering the WL frailty model with Gompertz baseline hazard function
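
Rejection rates of this kind can be computed from the maximized log-likelihoods of each replicate, as in the minimal sketch below. Because \(\sigma ^2=0\) lies on the boundary of the parameter space, the sketch assumes the commonly used reference distribution for such tests, a 50:50 mixture of a point mass at zero and a \(\chi ^2_1\) distribution. The excerpt does not state which reference distribution the authors used, and the function names are illustrative.

```python
import math


def lr_boundary_pvalue(loglik_full, loglik_null):
    """P-value for H0: sigma^2 = 0 under a 50:50 mix of chi2_0 and chi2_1."""
    w = 2.0 * (loglik_full - loglik_null)  # likelihood ratio statistic
    if w <= 0.0:
        return 1.0
    # P(chi2_1 > w) = erfc(sqrt(w / 2)); the mixture halves this tail.
    return 0.5 * math.erfc(math.sqrt(w / 2.0))


def rejection_rate(pvalues, alpha=0.05):
    """Share of Monte Carlo replicates that reject H0 at level alpha."""
    return sum(p < alpha for p in pvalues) / len(pvalues)
```

Under this assumed mixture, rejecting at the 5% level is equivalent to comparing the LR statistic with the 0.90 quantile of \(\chi ^2_1\) rather than the 0.95 quantile.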

Cite this article

Mota, A., Milani, E.A., Calsavara, V.F. et al. Weighted Lindley frailty model: estimation and application to lung cancer data. Lifetime Data Anal 27, 561–587 (2021). https://doi.org/10.1007/s10985-021-09529-1

Keywords

  • Lung cancer
  • Maximum likelihood method
  • Unobserved heterogeneity
  • Weighted Lindley distribution