Asymptotic justification of maximum likelihood estimation for the proportional excess hazard model in analysis of cancer registry data

  • Original Paper
  • Statistics for Stochastic Processes
Japanese Journal of Statistics and Data Science

Abstract

Population-based cancer registry studies are conducted to investigate various questions about cancer and have an important impact on cancer control. To investigate cancer prognosis from cancer registry data, it is necessary to adjust for the effect of deaths from other causes, since cancer registry data include deaths from causes other than cancer. To correct for this effect, excess hazard models are often used. The concept of the excess hazard model is that the hazard function for death from any cause in a cancer registry population is the sum of the hazard for cancer deaths, referred to as the excess hazard, and the hazard for deaths from other causes. A Cox proportional hazards model for the excess hazard has been developed, and for this model, Perme et al. (Biostatistics 10:136–146, 2009) proposed an inference procedure for the regression coefficients that uses the EM algorithm to compute the maximum likelihood estimator. In this article, we present the large-sample properties of the maximum likelihood estimator. We establish the consistency and asymptotic normality of the estimator and, based on semiparametric theory, introduce a consistent estimator of the variance of the estimated regression coefficients. The empirical properties of the variance estimator are investigated through finite-sample simulation studies. We also apply the variance estimator to cancer registry data for stomach, lung, and liver cancer patients from the Surveillance, Epidemiology, and End Results (SEER) database in the U.S.

References

  • Allemani, C., Weir, H. K., Carreira, H., Harewood, R., Spika, D., Wang, X. S., et al. (2015). Global surveillance of cancer survival 1995–2009: Analysis of individual data for 25,676,887 patients from 279 population-based registries in 67 countries (CONCORD-2). Lancet, 385, 977–1010.

  • Allemani, C., Matsuda, T., Carlo, V. D., Harewood, R., Matz, M., Maja, N., et al. (2018). Global surveillance of trends in cancer survival 2000–14 (CONCORD-3): Analysis of individual records for 37,513,025 patients diagnosed with one of 18 cancers from 322 population-based registries in 71 countries. Lancet, 391, 1023–1075.

  • Andersen, P. K., Borgan, O., Gill, R. D., & Keiding, N. (1993). Statistical models based on counting processes. Springer.

  • Bolard, P., Quantin, C., Abrahamowicz, M., Estève, J., Giorgi, R., Chadha-Boreham, H., et al. (2002). Assessing time-by-covariate interactions in relative survival models using restrictive cubic spline functions. Journal of Cancer Epidemiology and Prevention, 7, 113–122.

  • Coleman, M. P., Quaresma, M., Berrino, F., Lutz, J., Angelis, R. D., Capocaccia, R., et al. (2008). Cancer survival in five continents: A worldwide population-based study (CONCORD). Lancet Oncology, 9, 730–756.

  • Cortese, G., & Scheike, T. H. (2008). Dynamic regression hazards models for relative survival. Statistics in Medicine, 27, 3563–3584.

  • Derks, M. G. M., Bastiaannet, E., Kiderlen, M., Hilling, D. E., Boelens, P. G., Walsh, P. M., et al. (2018). Variation in treatment and survival of older patients with nonmetastatic breast cancer in five European countries: A population-based cohort study from the EURECCA Breast Cancer Group. British Journal of Cancer, 119, 121–129.

  • Dickman, P. W., Sloggett, A., Hills, M., & Hakulinen, T. (2004). Regression models for relative survival. Statistics in Medicine, 23(1), 51–64.

  • Ederer, F., Axtell, L. M., & Cutler, S. J. (1961). The relative survival rate: A statistical methodology. National Cancer Institute Monograph, 6, 101–121.

  • Estève, J., Benhamou, E., Croasdale, M., & Raymond, L. (1990). Relative survival and the estimation of net survival: Elements for further discussion. Statistics in Medicine, 9, 529–538.

  • Fang, H.-B., Li, G., & Sun, J. (2005). Maximum likelihood estimation in a semiparametric logistic/proportional-hazard mixture model. Scandinavian Journal of Statistics, 32, 59–75.

  • Fine, J. P., & Gray, R. J. (1999). A proportional hazards model for the subdistribution of a competing risk. Journal of the American Statistical Association, 94, 496–509.

  • Fleming, T. R., & Harrington, D. P. (1991). Counting processes and survival analysis. Wiley.

  • Giorgi, R., Abrahamowicz, M., Quantin, C., Bolard, P., Estève, J., Gouvernet, J., & Faivre, J. (2003). A relative survival regression model using B-spline functions to model non-proportional hazards. Statistics in Medicine, 22, 2767–2784.

  • Hakulinen, T. (1982). Cancer survival corrected for heterogeneity in patient withdrawal. Biometrics, 38, 933–942.

  • Hakulinen, T., & Tenkanen, L. (1987). Regression analysis of relative survival rates. Journal of the Royal Statistical Society, Series C., 36, 309–317.

  • Kalager, M., Adami, H.-O., Lagergren, P., Steindorf, K., & Dickman, P. W. (2021). Cancer outcomes research-a European challenge: Measures of the cancer burden. Molecular Oncology, 15, 3223–3241.

  • Komukai, S., & Hattori, S. (2017). Doubly robust estimator for net survival rate in analyses of cancer registry data. Biometrics, 73, 124–133.

  • Komukai, S., & Hattori, S. (2020). Doubly robust inference procedure for relative survival ratio in population-based cancer registry data. Statistics in Medicine, 39(13), 1884–1900.

  • Lambert, P. C., Smith, L. K., Jones, D. R., & Botha, J. L. (2005). Additive and multiplicative covariate regression models for relative survival incorporating fractional polynomials for time-dependent effects. Statistics in Medicine, 24, 3871–3885.

  • Li, M., Reintals, M., D’Onise, K., Farshid, G., Holmes, A., Joshi, R., Karapetis, C. S., Miller, C. L., Olver, I. N., Buckley, E. S., Townsend, A., Walters, D., & Roder, D. M. (2021). Investigating the breast cancer screening-treatment-mortality pathway of women diagnosed with invasive breast cancer: Results from linked health data. European Journal of Cancer Care, 31, e13539. https://doi.org/10.1111/ecc.13539

  • Louis, T. A. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society, Series B, 44, 226–233.

  • Murphy, S. A., Rossini, A. J., & van der Vaart, A. W. (1997). Maximum likelihood estimation in the proportional odds model. Journal of the American Statistical Association, 92, 968–976.

  • Nelson, C. P., Lambert, P. C., Squire, I. B., & Jones, D. R. (2007). Flexible parametric models for relative survival, with application in coronary heart disease. Statistics in Medicine, 26, 5486–5498.

  • Perme, M. P., Stare, J., & Estève, J. (2012). On estimation in relative survival. Biometrics, 68, 113–120.

  • Perme, M. P., Henderson, R., & Stare, J. (2009). An approach to estimation in relative survival regression. Biostatistics, 10, 136–146.

  • Perme, M. P., Estève, J., & Rachet, B. (2016). Analysing population-based cancer survival–settling the controversies. BMC Cancer, 16, 933. https://doi.org/10.1186/s12885-016-2967-9

  • Pollard, D. (1990). Empirical processes: Theory and applications. Institute of Mathematical Statistics.

  • Rubio, F. J., Remontet, L., Jewell, N. P., & Belot, A. (2018). On a general structure for hazard-based regression models: An application to population-based cancer research. Statistical Methods in Medical Research.

  • Rubio, F. J., Rachet, B., Giorgi, R., Maringe, C., Belot, A., & the CENSUR working survival group. (2021). On models for the estimation of the excess mortality hazard in case of insufficiently stratified life tables. Biostatistics, 22(1), 51–67.

  • Sasieni, P. D. (1996). Proportional excess hazards. Biometrika, 83, 127–141.

  • Schuil, H., Derks, M., Liefers, G.-J., Portielje, J., van de Velde, C., Syed, B., Green, A., Ellis, I., Cheung, K.-L., & Bastiaannet, E. (2018). Treatment strategies and survival outcomes in older women with breast cancer: A comparative study between the FOCUS cohort and Nottingham cohort. Journal of Geriatric Oncology, 9, 635–641.

  • Syriopoulou, E., Rutherford, M. R., & Lambert, P. C. (2021). Inverse probability weighting and doubly robust standardization in the relative survival framework. Statistics in Medicine, 40, 6069–6092.

  • Touraine, C., Graféo, N., Giorgi, R., & the CENSUR working survival group. (2020). More accurate cancer-related excess mortality through correcting background mortality for extra variables. Statistical Methods in Medical Research, 29(1), 122–136.

  • Tsiatis, A. (2006). Semiparametric Theory and Missing Data. New York: Springer.

  • Van der Vaart, A. W. (2000). Asymptotic statistics (Vol. 3). Cambridge University Press.

  • Woods, L. M., Rachet, B., Morris, M., Bhaskaran, K., & Coleman, M. P. (2021). Are socio-economic inequalities in breast cancer survival explained by peri-diagnostic factors? BMC Cancer, 21, 485. https://doi.org/10.1186/s12885-021-08087-x

Acknowledgements

The first author’s research was partly supported by Grant-in-Aid for Early-Career Scientists (20K19754) from the Ministry of Education, Science, Sports and Technology of Japan. The second author’s research was partly supported by Grant-in-Aid for Challenging Exploratory Research (16K12403) and for Scientific Research (16H06299, 18H03208) from the Ministry of Education, Science, Sports and Technology of Japan. Computational calculations were performed at the Institute of Medical Science (the University of Tokyo).

Funding

The research leading to these results received funding from the Ministry of Education, Science, Sports and Technology of Japan under Grant-in-Aid for Early-Career Scientists No. 20K19754, for Challenging Exploratory Research No. 16K12403, and for Scientific Research No. 16H06299 and No. 18H03208.

Author information

Corresponding author

Correspondence to Satoshi Hattori.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Existence of the maximum likelihood estimator and identifiability of \(\beta _0\) and \(\Lambda _0\)

The existence of a pair of parameters \((\beta ,\Lambda )\) maximizing the observed likelihood (3) is proved by arguments similar to those in the proof of Theorem 1 in Fang et al. (2005). Along the same lines, we also prove in this Appendix the identifiability of \((\beta _0,\Lambda _0)\), in the sense that \(L(\Lambda ,\beta ;t,\delta ,z) = L(\Lambda _0, \beta _0;t,\delta ,z)\) implies \(\beta = \beta _0\) and \(\Lambda (t)=\Lambda _0(t)\) for all \(t \in [0,\tau ]\).

Suppose that the parameter space \(\mathscr {B} \subset \mathbb {R}^p\) of \(\beta \) is compact, where p is the dimension of \(\beta \). Since the vector of covariates Z is bounded, \(e^{\beta ^TZ}\) is also bounded on \(\mathscr {B}\); we denote its lower and upper bounds by \(K_l\) and \(K_u\), respectively, where \(K_l > 0\). Let \(t_1<t_2<\cdots <t_k\) be the distinct failure times. Then, for any right-continuous and non-decreasing function \(\Lambda (t)\), it holds that

$$\begin{aligned} 0 \le L(\beta , \Lambda )&\le \prod _{i=1}^n{ \left\{ K_u \Lambda (T_i) + \Lambda _P(T_i|Z_i) \right\} ^{\Delta _i} e^{ - \Lambda (T_i)K_l } } \nonumber \\&\le \prod _{i:T_i < t_k}{ \left\{ \frac{ K_u }{ K_l } + \Lambda _P(T_i|Z_i) \right\} ^{\Delta _i} }\nonumber \\&\times \prod _{i:T_i = t_k}{ \left\{ K_u \Lambda (t_k)e^{ - \Lambda (t_k)K_l } + \Lambda _P(T_i|Z_i) \right\} ^{\Delta _i} }. \end{aligned}$$
(12)

Because forcing \(\Lambda (T_i)=\Lambda (t_k)\) for all \(T_i \ge t_k\) increases the likelihood when \(t_k\) is sufficiently large that \(\Lambda (t_k) \ge 1\), it suffices to restrict \(\Lambda (t)\) to the space \(\Omega _0\), where

$$\begin{aligned} \Omega _0 =&\left\{ \Lambda : \Lambda (t) \ \text{is a right-continuous and non-decreasing function} \right. \\&\qquad \left. \text{with} \ \Lambda (t)=\Lambda (t_k) \ \text{for all} \ t \ge t_k \right\} . \end{aligned}$$

Let \(A_M = \left\{ \Lambda \in \Omega _0: \Lambda (t_k) \le M \right\} \) for any \(0<M<\infty \). Because \(L(\beta ,\Lambda )\) is continuous in \(\beta \) and \(\Lambda \), it attains a maximum on the compact subspace \(\mathscr {B} \times A_M\) for any given M. Let \(L^{(M)}\) be the maximum value of \(L(\beta ,\Lambda )\) on \(\mathscr {B} \times A_M\). Since \(Me^{-MK_l} \rightarrow 0\) as \(M \rightarrow \infty \), there exists an \(M_0 \ge 1\) such that the right-hand side of (12) is less than \(L^{(M_0)}\) for all \(\Lambda \) outside \(A_{M_0}\). Therefore, the likelihood evaluated along any sequence \(\Lambda _m\) with \(\Lambda _m(t_k)\) diverging to infinity as \(m \rightarrow \infty \) cannot approach the maximum value of \(L(\beta ,\Lambda )\). As a consequence, when maximizing the observed likelihood (3), we can restrict attention to the compact subspace \(\mathscr {B} \times A_{M_0}\), and the existence of the maximum likelihood estimator follows from the continuity of the likelihood.
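The decay that drives this argument can be checked numerically. The following is a minimal Python sketch, not part of the original proof, that evaluates the factor \(K_u \Lambda (t_k)e^{-\Lambda (t_k)K_l}\) appearing in the bound (12). The values chosen for `K_L` and `K_U` and the function name `tail_factor` are hypothetical, used only for illustration.

```python
import math

def tail_factor(x, k_lower, k_upper):
    """Return K_u * x * exp(-K_l * x), the Lambda(t_k)-dependent factor in bound (12)."""
    return k_upper * x * math.exp(-k_lower * x)

# Hypothetical bounds on exp(beta^T Z); any 0 < K_l <= K_u behaves the same way.
K_L, K_U = 0.5, 2.0

# Elementary calculus: the factor is maximized at x = 1 / K_l, then decays to zero.
peak_location = 1.0 / K_L
peak_value = tail_factor(peak_location, K_L, K_U)
print(f"peak at x = {peak_location}: {peak_value:.4f}")

# Evaluate the factor for growing values of M = Lambda(t_k).
for m in (1, 5, 10, 50, 100):
    print(f"M = {m:>3}: K_u * M * exp(-K_l * M) = {tail_factor(m, K_L, K_U):.3e}")
```

Because the factor is bounded and vanishes as \(\Lambda (t_k)\) grows, candidate values of \(\Lambda (t_k)\) beyond some finite \(M_0\) can be ignored, which is the role that \(A_{M_0}\) plays above.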

We next prove that both \(\beta _0\) and \(\Lambda _0\) are identifiable. By considering \(L(\Lambda ,\beta ;t,0,z) = L(\Lambda _0, \beta _0;t,0,z)\), we have \(\Lambda (t)/\Lambda _0(t)=e^{-\left( \beta - \beta _0 \right) ^TZ}\) for all \(t \le \tau \) and all Z such that \(\Pr (T> \tau |Z)>0\). Because the left-hand side does not depend on Z, \(\left( \beta - \beta _0 \right) ^TZ\) is constant over all Z. Therefore, it holds that \(\beta =\beta _0\) if the covariance of Z is nondegenerate, and hence \(\Lambda (t)=\Lambda _0(t)\) for all \(t \le \tau \). By considering \(L(\Lambda ,\beta ;t,1,z) = L(\Lambda _0, \beta _0;t,1,z)\), we also have \(\mathrm{{d}}\Lambda (t)=\mathrm{{d}}\Lambda _0(t)\) for all \(t \le \tau \).
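The step from the likelihood identity to the ratio of cumulative hazards can be written out as follows. This is a sketch under an assumption: the observed likelihood (3) is stated in the main text, and here we only suppose that, up to factors free of \((\beta ,\Lambda )\), a single observation contributes in the form suggested by the submodel log-likelihood of Appendix B,

$$\begin{aligned} L(\Lambda ,\beta ;t,\delta ,z) \propto \left\{ \mathrm{{d}}\Lambda (t)e^{\beta ^Tz} + \mathrm{{d}}\Lambda _P(t|z) \right\} ^{\delta } \exp \left\{ -\Lambda (t)e^{\beta ^Tz} \right\} . \end{aligned}$$

Under this form, setting \(\delta =0\) and assuming \(\Lambda _0(t)>0\) gives

$$\begin{aligned} \exp \left\{ -\Lambda (t)e^{\beta ^Tz} \right\} = \exp \left\{ -\Lambda _0(t)e^{\beta _0^Tz} \right\}&\iff \Lambda (t)e^{\beta ^Tz} = \Lambda _0(t)e^{\beta _0^Tz} \\&\iff \frac{\Lambda (t)}{\Lambda _0(t)} = e^{-\left( \beta - \beta _0 \right) ^Tz}. \end{aligned}$$

Once \(\beta =\beta _0\) and \(\Lambda =\Lambda _0\) are established, setting \(\delta =1\) equates the two bracketed terms, so that \(\mathrm{{d}}\Lambda (t)e^{\beta _0^Tz} = \mathrm{{d}}\Lambda _0(t)e^{\beta _0^Tz}\) and hence \(\mathrm{{d}}\Lambda (t)=\mathrm{{d}}\Lambda _0(t)\).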

Appendix B: Nuisance tangent space and its orthogonal complement

Let \(\mathcal {H}\) be the Hilbert space consisting of all p-dimensional measurable functions of \((T,\Delta , Z)\) with mean zero and finite variance, equipped with the inner product \(\left<h_1,h_2\right>=E\left[ h_1^T(T,\Delta ,Z)h_2(T,\Delta ,Z)\right] \). To derive the nuisance tangent space for the nuisance parameter \(\eta =\left\{ \Lambda , \Lambda _C, F_Z \right\} \), we consider parametric submodels \(\Lambda _{h_1}(t;\gamma _1)\), \(\Lambda _{C,h_2}(t|Z;\gamma _2)\), and \(F_{Z,h_3}(z;\gamma _3)\) for \(\Lambda \), \(\Lambda _C\), and \(F_Z\), respectively, which were defined in Sect. 3, where \(\gamma _1\), \(\gamma _2\), and \(\gamma _3\) are finite-dimensional nuisance parameters. The nuisance tangent space for each nuisance parameter is then derived as the mean-square closure of all parametric submodel nuisance tangent spaces. Since the derivations of the nuisance tangent spaces \(\Gamma _2\) and \(\Gamma _3\) in Sect. 4, which are for the nuisance parameters \(\Lambda _C\) and \(F_Z\), respectively, are the same as those of Section 5.2 in Tsiatis (2006), we derive here only the nuisance tangent space \(\Gamma _1\) in Theorem 1, which is for the nuisance parameter \(\Lambda \).

Again, we consider a parametric submodel \(\Lambda _{h_1}(t;\gamma _1) = \int _0^t{ \left\{ 1 + \gamma _1 h_1(u) \right\} \mathrm{{d}}\Lambda _0(u) } = \int _0^t{ \left\{ 1 + \gamma _1 h_1(u) \right\} \lambda _0(u) \mathrm{{d}}u }\), where \(h_1(u)\) is an arbitrary p-dimensional bounded function. The contribution to the log-likelihood function under the parametric submodel is

$$\begin{aligned} \ell _n(\beta , \gamma _1; h_1)&= \sum _{i=1}^n{ \Delta _i \log { \left[ \left\{ 1 + \gamma _1 h_1(T_i) \right\} \mathrm{{d}}\Lambda _0(T_i)e^{\beta ^TZ_i} + \mathrm{{d}}\Lambda _P(T_i|Z_i) \right] } } \\&\quad - \sum _{i=1}^n{ \int _{0}^{T_i}{ \left\{ 1 + \gamma _1 h_1(t) \right\} \mathrm{{d}}\Lambda _0(t) }e^{\beta ^TZ_i } }. \end{aligned}$$

Taking the derivative of \(\ell _n(\beta , \gamma _1; h_1)\) with respect to \(\gamma _1\) and evaluating it at \(\beta =\beta _0\) and \(\gamma _1=0\), we obtain the score function

$$\begin{aligned} U_{n,\gamma _1}(\beta _0;h_1)&= \sum _{i=1}^n{ \int _{0}^{\tau }{ h_1(t) W(t|Z_i;\beta _0, \Lambda _0) \mathrm{{d}}M_i(t)} }. \end{aligned}$$

Then, the score function for this parametric submodel is in the nuisance tangent space \(\Gamma _1\). Since any element of \(\mathcal {H}\) can be approximated by a sequence of bounded functions (Tsiatis 2006, Section 4), the score function of the parametric submodel remains in \(\Gamma _1\) even when \(h_1(t)\) is not bounded.
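As a numerical illustration of this derivation, the following Python sketch compares a finite-difference derivative of the submodel log-likelihood in \(\gamma _1\) with the closed-form score obtained by differentiating term by term. It is a sketch under simplifying assumptions that are not taken from the article: a scalar covariate, a constant baseline excess hazard `lam0`, constant population hazards `lam_p`, the perturbation \(h_1(t)=\cos t\), and hypothetical toy data and function names. The weight `excess / (excess + lam_p)` is what differentiating the logarithmic term yields; we take it to correspond to \(W(t|Z;\beta _0,\Lambda _0)\) of the main text, which is also an assumption.

```python
import math

def submodel_loglik(gamma, data, beta, lam0, h, h_integral):
    """Log-likelihood under Lambda_h(t; gamma) with constant baseline hazard lam0.

    data holds tuples (T, Delta, Z, lam_p); h_integral(t) is the integral of h over [0, t].
    """
    total = 0.0
    for t, delta, z, lam_p in data:
        excess = lam0 * math.exp(beta * z)
        if delta == 1:
            total += math.log((1.0 + gamma * h(t)) * excess + lam_p)
        total -= excess * (t + gamma * h_integral(t))
    return total

def analytic_score(data, beta, lam0, h, h_integral):
    """Derivative of submodel_loglik in gamma at gamma = 0, computed in closed form."""
    total = 0.0
    for t, delta, z, lam_p in data:
        excess = lam0 * math.exp(beta * z)
        weight = excess / (excess + lam_p)  # share of the total hazard due to cancer
        total += delta * h(t) * weight - excess * h_integral(t)
    return total

# Hypothetical toy data: (T, Delta, Z, population hazard).
toy_data = [(0.8, 1, 0.0, 0.02), (1.5, 0, 1.0, 0.03), (2.3, 1, 1.0, 0.05), (3.0, 0, 0.0, 0.04)]
beta0, lam0 = 0.4, 0.1
h, h_integral = math.cos, math.sin  # sin(t) is the integral of cos over [0, t]

eps = 1e-6
numeric = (submodel_loglik(eps, toy_data, beta0, lam0, h, h_integral)
           - submodel_loglik(-eps, toy_data, beta0, lam0, h, h_integral)) / (2 * eps)
analytic = analytic_score(toy_data, beta0, lam0, h, h_integral)
print(numeric, analytic)
```

The two printed values should agree up to finite-difference error, reflecting that the score is the event term weighted by the cancer share of the hazard minus the compensating integral over the time at risk.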

For any parametric submodel \(\Lambda (t;\gamma _1) = \int _0^t{ \lambda (u;\gamma _1) \mathrm{{d}}u }\), the score function with respect to \(\gamma _1\), setting \(\gamma _1=0\) and \(\beta =\beta _0\), is expressed as

$$\begin{aligned} U_{1,\gamma _1}(\beta _0)&= \int _{0}^{\tau }{ \left\{ \left. \frac{\partial }{\partial \gamma _1 }\log {\lambda (t;\gamma _1)} \right| _{\gamma _1=0} \right\} W(t|Z;\beta _0, \Lambda _0) \mathrm{{d}}M(t)}. \nonumber \end{aligned}$$

Then, this score function is in the nuisance tangent space \(\Gamma _1\). Conversely, we can show that every element of \(\Gamma _1\) is the score function of some parametric submodel, such as \(\Lambda _{h_1}(t;\gamma _1) = \int _0^t{ \left\{ 1 + \gamma _1 h_1(u) \right\} \mathrm{{d}}\Lambda _0(u) }\), and is therefore an element of a parametric submodel nuisance tangent space. Hence, the nuisance tangent space for \(\Lambda (t)\) is equal to \(\Gamma _1\).

The orthogonality \(\Gamma _1 \perp \Gamma _2\) can be easily proved under assumption (A2), and \(\Gamma _i \perp \Gamma _3 \ (i=1,2)\) can also be proved by \(E[\alpha _i^Th_3(Z)]=0\), where \(\alpha _i \in \Gamma _i\ (i=1,2)\) and \(h_3(Z) \in \Gamma _3\). Then the nuisance tangent space for the nuisance parameter \(\eta =\left\{ \Lambda , \Lambda _C, F_Z \right\} \) is given by the direct sum of three mutually orthogonal spaces, \(\Gamma = \Gamma _1 \oplus \Gamma _2 \oplus \Gamma _3\). The orthogonal complement \(\Gamma ^{\perp }\) is obtained by almost the same procedure as in the proof of Theorem 5.5 in Tsiatis (2006).

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Komukai, S., Hattori, S. Asymptotic justification of maximum likelihood estimation for the proportional excess hazard model in analysis of cancer registry data. Jpn J Stat Data Sci 6, 337–359 (2023). https://doi.org/10.1007/s42081-023-00190-6

  • DOI: https://doi.org/10.1007/s42081-023-00190-6
