Abstract
Population-based cancer registry studies are conducted to investigate a variety of questions about cancer and have an important impact on cancer control. To investigate cancer prognosis from cancer registry data, it is necessary to adjust for the effect of deaths from other causes, since cancer registry data include deaths from causes other than cancer. Excess hazard models are often used for this adjustment. The idea of the excess hazard model is that the hazard function for death from any cause in a cancer registry population is the sum of the hazard for cancer deaths, referred to as the excess hazard, and the hazard for deaths from other causes. A Cox proportional hazards model for the excess hazard has been developed, and for this model, Perme et al. (Biostatistics 10:136–146, 2009) proposed an inference procedure for the regression coefficients that uses the EM algorithm to compute the maximum likelihood estimator. In this article, we present the large sample properties of the maximum likelihood estimator. Based on semiparametric theory and on the consistency and asymptotic normality of the estimator, we introduce a consistent estimator of the variance of the estimated regression coefficients. The finite-sample properties of the variance estimator are investigated in simulation studies. We also apply the variance estimator to cancer registry data for stomach, lung, and liver cancer patients from the Surveillance, Epidemiology, and End Results (SEER) database in the U.S.
References
Allemani, C., Weir, H. K., Carreira, H., Harewood, R., Spika, D., Wang, X. S., et al. (2015). Global surveillance of cancer survival 1995–2009: Analysis of individual data for 25,676,887 patients from 279 population-based registries in 67 countries (CONCORD-2). Lancet, 385, 977–1010.
Allemani, C., Matsuda, T., Di Carlo, V., Harewood, R., Matz, M., Nikšić, M., et al. (2018). Global surveillance of trends in cancer survival 2000–14 (CONCORD-3): Analysis of individual records for 37,513,025 patients diagnosed with one of 18 cancers from 322 population-based registries in 71 countries. Lancet, 391, 1023–1075.
Andersen, P. K., Borgan, O., Gill, R. D., & Keiding, N. (1993). Statistical models based on counting processes. Springer.
Bolard, P., Quantin, C., Abrahamowicz, M., Estève, J., Giorgi, R., Chadha-Boreham, H., et al. (2002). Assessing time-by-covariate interactions in relative survival models using restrictive cubic spline functions. Journal of Cancer Epidemiology and Prevention, 7, 113–122.
Coleman, M. P., Quaresma, M., Berrino, F., Lutz, J.-M., De Angelis, R., Capocaccia, R., et al. (2008). Cancer survival in five continents: A worldwide population-based study (CONCORD). Lancet Oncology, 9, 730–756.
Cortese, G., & Scheike, T. H. (2008). Dynamic regression hazards models for relative survival. Statistics in Medicine, 27, 3563–3584.
Derks, M. G. M., Bastiaannet, E., Kiderlen, M., Hilling, D. E., Boelens, P. G., Walsh, P. M., et al. (2018). Variation in treatment and survival of older patients with nonmetastatic breast cancer in five European countries: A population-based cohort study from the EURECCA Breast Cancer Group. British Journal of Cancer, 119, 121–129.
Dickman, P. W., Sloggett, A., Hills, M., & Hakulinen, T. (2004). Regression models for relative survival. Statistics in Medicine, 23(1), 51–64.
Ederer, F., Axtell, L. M., & Cutler, S. J. (1961). The relative survival rate: A statistical methodology. National Cancer Institute Monograph, 6, 101–121.
Estève, J., Benhamou, E., Croasdale, M., & Raymond, L. (1990). Relative survival and the estimation of net survival: Elements for further discussion. Statistics in Medicine, 9, 529–538.
Fang, H.-B., Li, G., & Sun, J. (2005). Maximum likelihood estimation in a semiparametric logistic/proportional-hazard mixture model. Scandinavian Journal of Statistics, 32, 59–75.
Fine, J. P., & Gray, R. J. (1999). A proportional hazards model for the subdistribution of a competing risk. Journal of the American Statistical Association, 94, 496–509.
Fleming, T. R., & Harrington, D. P. (1991). Counting processes and survival analysis. Wiley.
Giorgi, R., Abrahamowicz, M., Quantin, C., Bolard, P., Estève, J., Gouvernet, J., & Faivre, J. (2003). A relative survival regression model using B-spline functions to model non-proportional hazards. Statistics in Medicine, 22, 2767–2784.
Hakulinen, T. (1982). Cancer survival corrected for heterogeneity in patient withdrawal. Biometrics, 38, 933–942.
Hakulinen, T., & Tenkanen, L. (1987). Regression analysis of relative survival rates. Journal of the Royal Statistical Society, Series C, 36, 309–317.
Kalager, M., Adami, H.-O., Lagergren, P., Steindorf, K., & Dickman, P. W. (2021). Cancer outcomes research-a European challenge: Measures of the cancer burden. Molecular Oncology, 15, 3223–3241.
Komukai, S., & Hattori, S. (2017). Doubly robust estimator for net survival rate in analyses of cancer registry data. Biometrics, 73, 124–133.
Komukai, S., & Hattori, S. (2020). Doubly robust inference procedure for relative survival ratio in population-based cancer registry data. Statistics in Medicine, 39(13), 1884–1900.
Lambert, P. C., Smith, L. K., Jones, D. R., & Botha, J. L. (2005). Additive and multiplicative covariate regression models for relative survival incorporating fractional polynomials for time-dependent effects. Statistics in Medicine, 24, 3871–3885.
Li, M., Reintals, M., D’Onise, K., Farshid, G., Holmes, A., Joshi, R., Karapetis, C. S., Miller, C. L., Olver, I. N., Buckley, E. S., Townsend, A., Walters, D., & Roder, D. M. (2021). Investigating the breast cancer screening-treatment-mortality pathway of women diagnosed with invasive breast cancer: Results from linked health data. European Journal of Cancer Care, 31, e13539. https://doi.org/10.1111/ecc.13539
Louis, T. A. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society, Series B, 44, 226–233.
Murphy, S. A., Rossini, A. J., & van der Vaart, A. W. (1997). Maximum likelihood estimation in the proportional odds model. Journal of the American Statistical Association, 92, 968–976.
Nelson, C. P., Lambert, P. C., Squire, I. B., & Jones, D. R. (2007). Flexible parametric models for relative survival, with application in coronary heart disease. Statistics in Medicine, 26, 5486–5498.
Perme, M. P., Stare, J., & Estève, J. (2012). On estimation in relative survival. Biometrics, 68, 113–120.
Perme, M. P., Henderson, R., & Stare, J. (2009). An approach to estimation in relative survival regression. Biostatistics, 10, 136–146.
Perme, M. P., Estève, J., & Rachet, B. (2016). Analysing population-based cancer survival–settling the controversies. BMC Cancer, 16, 933. https://doi.org/10.1186/s12885-016-2967-9
Pollard, D. (1990). Empirical processes: Theory and applications. Institute of Mathematical Statistics.
Rubio, F. J., Remontet, L., Jewell, N. P., & Belot, A. (2018). On a general structure for hazard-based regression models: An application to population-based cancer research. Statistical Methods in Medical Research.
Rubio, F. J., Rachet, B., Giorgi, R., Maringe, C., Belot, A., & the CENSUR working survival group. (2021). On models for the estimation of the excess mortality hazard in case of insufficiently stratified life tables. Biostatistics, 22(1), 51–67.
Sasieni, P. D. (1996). Proportional excess hazards. Biometrika, 83, 127–141.
Schuil, H., Derks, M., Liefers, G.-J., Portielje, J., van de Velde, C., Syed, B., Green, A., Ellis, I., Cheung, K.-L., & Bastiaannet, E. (2018). Treatment strategies and survival outcomes in older women with breast cancer: A comparative study between the FOCUS cohort and Nottingham cohort. Journal of Geriatric Oncology, 9, 635–641.
Syriopoulou, E., Rutherford, M. J., & Lambert, P. C. (2021). Inverse probability weighting and doubly robust standardization in the relative survival framework. Statistics in Medicine, 40, 6069–6092.
Touraine, C., Grafféo, N., Giorgi, R., & the CENSUR working survival group. (2020). More accurate cancer-related excess mortality through correcting background mortality for extra variables. Statistical Methods in Medical Research, 29(1), 122–136.
Tsiatis, A. (2006). Semiparametric Theory and Missing Data. New York: Springer.
Van der Vaart, A. W. (2000). Asymptotic statistics (Vol. 3). Cambridge University Press.
Woods, L. M., Rachet, B., Morris, M., Bhaskaran, K., & Coleman, M. P. (2021). Are socio-economic inequalities in breast cancer survival explained by peri-diagnostic factors? BMC Cancer, 21, 485. https://doi.org/10.1186/s12885-021-08087-x
Acknowledgements
The first author’s research was partly supported by Grant-in-Aid for Early-Career Scientists (20K19754) from the Ministry of Education, Science, Sports and Technology of Japan. The second author’s research was partly supported by Grant-in-Aid for Challenging Exploratory Research (16K12403) and for Scientific Research (16H06299, 18H03208) from the Ministry of Education, Science, Sports and Technology of Japan. Computational calculations were performed at the Institute of Medical Science (the University of Tokyo).
Funding
The research leading to these results received funding from the Ministry of Education, Science, Sports and Technology of Japan under Grant-in-Aid for Early-Career Scientists No. 20K19754, for Challenging Exploratory Research No. 16K12403, and for Scientific Research No. 16H06299 and No. 18H03208.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Appendix
Appendix A: Existence of the maximum likelihood estimator and identifiability of \(\beta _0\) and \(\Lambda _0\)
The existence of a pair of parameters \((\beta ,\Lambda )\) maximizing the observed likelihood (3) is proved using arguments similar to those in the proof of Theorem 1 in Fang et al. (2005). In this Appendix, along the same lines, we also prove the identifiability of \((\beta _0,\Lambda _0)\) in the sense that \(L(\Lambda ,\beta ;t,\delta ,z) = L(\Lambda _0, \beta _0;t,\delta ,z)\) implies \(\beta = \beta _0\) and \(\Lambda (t)=\Lambda _0(t)\) for all \(t \in [0,\tau ]\).
Suppose that the parameter space \(\mathscr {B} \subset \mathbb {R}^p\) of \(\beta \) is compact, where p is the dimension of \(\beta \). Since the vector of covariates Z is bounded, \(e^{\beta ^TZ}\) is also bounded, and its lower and upper bounds are denoted by \(K_l\) and \(K_u\), respectively. Let \(t_1<t_2<\cdots <t_k\) be the distinct failure times. Then, for any right-continuous and non-decreasing function \(\Lambda (t)\), it holds that
Because forcing \(\Lambda (T_i)=\Lambda (t_k)\) for all \(T_i \ge t_k\) increases the likelihood if \(t_k\) is a sufficiently large value satisfying \(\Lambda (t_k) \ge 1\), it suffices to restrict the space of \(\Lambda (t)\) to the space \(\Omega _0\), where
Let \(A_M = \left\{ \Lambda \in \Omega _0: \Lambda (t_k) \le M \right\} \) for any \(0<M<\infty \). Because \(L(\beta ,\Lambda )\) is continuous in \(\beta \) and \(\Lambda \), it attains a maximum on the compact subspace \(\mathscr {B} \times A_M\) for any given M. Let \(L^{(M)}\) be the maximum value of \(L(\beta ,\Lambda )\) on \(\mathscr {B} \times A_M\). Since \(e^{-MK_l}M \rightarrow 0\) as \(M \rightarrow \infty \), there exists an \(M_0 \ge 1\) such that the right-hand side of (12) is less than \(L^{(M_0)}\) for all \(\Lambda \) outside \(A_{M_0}\). Therefore, the likelihood evaluated along any sequence \(\Lambda _m\) with \(\Lambda _m(t_k)\) diverging to infinity as \(m \rightarrow \infty \) does not approach the maximum value of \(L(\beta ,\Lambda )\). As a consequence, when maximizing the observed likelihood (3), we can restrict attention to the compact subspace \(\mathscr {B} \times A_{M_0}\). The existence of the maximum likelihood estimator then follows from the continuity of the likelihood on this compact set.
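The limit \(e^{-MK_l}M \rightarrow 0\) can be checked with an elementary bound. The following is only a sketch, assuming \(K_l>0\), which is plausible here because \(K_l\) bounds from below the exponential of a bounded quantity. Since \(e^{x} > x^2/2\) for all \(x>0\), taking \(x = MK_l\) gives

\[ 0< M e^{-MK_l}< M \cdot \frac{2}{(MK_l)^2} = \frac{2}{K_l^2 M} \rightarrow 0 \quad \text{as } M \rightarrow \infty . \]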
We next prove that both \(\beta _0\) and \(\Lambda _0\) are identifiable. By considering \(L(\Lambda ,\beta ;t,0,z) = L(\Lambda _0, \beta _0;t,0,z)\), we have \(\Lambda (t)/\Lambda _0(t)=e^{-\left( \beta - \beta _0 \right) ^TZ}\) for all \(t \le \tau \) and all Z such that \(\Pr (T> \tau |Z)>0\). Therefore, since \(\left( \beta - \beta _0 \right) ^TZ\) is constant for all Z, it holds that \(\beta =\beta _0\) if the covariance of Z is nondegenerate, and we also have \(\Lambda (t)=\Lambda _0(t)\) for all \(t \le \tau \). By considering \(L(\Lambda ,\beta ;t,1,z) = L(\Lambda _0, \beta _0;t,1,z)\), we also have \(\mathrm{{d}}\Lambda (t)=\mathrm{{d}}\Lambda _0(t)\) for all \(t \le \tau \).
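The step from the constancy of \(\left( \beta - \beta _0 \right) ^TZ\) to \(\beta =\beta _0\) can be spelled out as follows. This is a sketch in which \(\mathrm{Var}(Z)\) denotes the covariance matrix of Z, taken to be positive definite as one reading of "nondegenerate", and \(c(t)\) is notation introduced here only for illustration. For any t with \(\Lambda _0(t)>0\), the ratio \(\Lambda (t)/\Lambda _0(t)\) does not depend on Z, so

\[ \left( \beta - \beta _0 \right) ^TZ = c(t) := -\log \left\{ \Lambda (t)/\Lambda _0(t) \right\} \ \text{almost surely} \quad \Longrightarrow \quad \left( \beta - \beta _0 \right) ^T \mathrm{Var}(Z) \left( \beta - \beta _0 \right) = \mathrm{Var}\left\{ \left( \beta - \beta _0 \right) ^TZ \right\} = 0. \]

Positive definiteness of \(\mathrm{Var}(Z)\) then forces \(\beta - \beta _0 = 0\), and substituting back gives \(\Lambda (t)/\Lambda _0(t) = e^{0} = 1\).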
Appendix B: Nuisance tangent space and its orthogonal complement
Let \(\mathcal {H}\) be the Hilbert space consisting of all p-dimensional measurable functions of \((T,\Delta , Z)\) with mean zero and finite variance, equipped with the inner product \(\left<h_1,h_2\right>=E\left[ h_1^T(T,\Delta ,Z)h_2(T,\Delta ,Z)\right] \). To derive the nuisance tangent space for the nuisance parameter \(\eta =\left\{ \Lambda , \Lambda _C, F_Z \right\} \), we consider the parametric submodels \(\Lambda _{h_1}(t;\gamma _1)\), \(\Lambda _{C,h_2}(t|Z;\gamma _2)\), and \(F_{Z,h_3}(z;\gamma _3)\) for \(\Lambda \), \(\Lambda _C\), and \(F_Z\), respectively, which were defined in Sect. 3, where \(\gamma _1\), \(\gamma _2\), and \(\gamma _3\) are finite-dimensional nuisance parameters. The nuisance tangent space for each nuisance parameter is then derived as the mean-square closure of all parametric submodel nuisance tangent spaces. Since the derivations of the nuisance tangent spaces \(\Gamma _2\) and \(\Gamma _3\) in Sect. 4, which correspond to the nuisance parameters \(\Lambda _C\) and \(F_Z\), respectively, are the same as those in Section 5.2 of Tsiatis (2006), we derive here only the nuisance tangent space \(\Gamma _1\) in Theorem 1, which corresponds to the nuisance parameter \(\Lambda \).
Again, we consider a parametric submodel \(\Lambda _{h_1}(t;\gamma _1) = \int _0^t{ \left\{ 1 + \gamma _1 h_1(u) \right\} \mathrm{{d}}\Lambda _0(u) } = \int _0^t{ \left\{ 1 + \gamma _1 h_1(u) \right\} \lambda _0(u) \mathrm{{d}}u }\), where \(h_1(u)\) is an arbitrary p-dimensional bounded function. The contribution to the log-likelihood function under the parametric submodel is
Taking the derivative of \(\ell _n(\beta , \gamma _1; h_1)\) with respect to \(\gamma _1\) and evaluating it at \(\beta =\beta _0\) and \(\gamma _1=0\), we obtain the score function
Then, the score function for this parametric submodel is in the nuisance tangent space \(\Gamma _1\). Since any element of \(\mathcal {H}\) can be approximated by a sequence of bounded functions (Tsiatis 2006, Section 4), the score function of a parametric submodel in which \(h_1(t)\) is not bounded is also in \(\Gamma _1\).
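The two derivatives on which the score of the submodel \(\Lambda _{h_1}(t;\gamma _1)\) is built can be checked directly. This is a sketch that reads \(\gamma _1 h_1(u)\) as the inner product \(\gamma _1^T h_1(u)\), an assumption made here so that the hazard is scalar, and writes \(\lambda _{h_1}(u;\gamma _1) = \left\{ 1 + \gamma _1^T h_1(u) \right\} \lambda _0(u)\):

\[ \left. \frac{\partial }{\partial \gamma _1} \log \lambda _{h_1}(u;\gamma _1) \right| _{\gamma _1=0} = \left. \frac{h_1(u)}{1 + \gamma _1^T h_1(u)} \right| _{\gamma _1=0} = h_1(u), \qquad \frac{\partial }{\partial \gamma _1} \Lambda _{h_1}(t;\gamma _1) = \int _0^t h_1(u) \, \mathrm{{d}}\Lambda _0(u). \]

Boundedness of \(h_1\) guarantees \(1 + \gamma _1^T h_1(u) > 0\) for all u once \(\gamma _1\) is close enough to zero, so that \(\lambda _{h_1}(u;\gamma _1)\) is a valid hazard in a neighbourhood of \(\gamma _1=0\).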
For any parametric submodel \(\Lambda (t;\gamma _1) = \int _0^t{ \lambda (u;\gamma _1) \mathrm{{d}}u }\), the score function with respect to \(\gamma _1\), setting \(\gamma _1=0\) and \(\beta =\beta _0\), is expressed as
Then, this score function is in the nuisance tangent space \(\Gamma _1\). Conversely, any element of \(\Gamma _1\) is the score function of some parametric submodel, such as \(\Lambda _{h_1}(t;\gamma _1) = \int _0^t{ \left\{ 1 + \gamma _1 h_1(u) \right\} \mathrm{{d}}\Lambda _0(u) }\), and is therefore an element of a parametric submodel nuisance tangent space. Hence the nuisance tangent space for \(\Lambda (t)\) is equal to \(\Gamma _1\).
The orthogonality \(\Gamma _1 \perp \Gamma _2\) can be easily proved under assumption (A2), and \(\Gamma _i \perp \Gamma _3 \ (i=1,2)\) follows from \(E[\alpha _i^Th_3(Z)]=0\), where \(\alpha _i \in \Gamma _i\ (i=1,2)\) and \(h_3(Z) \in \Gamma _3\). The nuisance tangent space for the nuisance parameter \(\eta =\left\{ \Lambda , \Lambda _C, F_Z \right\} \) is therefore given by the direct sum of three orthogonal spaces, \(\Gamma = \Gamma _1 \oplus \Gamma _2 \oplus \Gamma _3\). The orthogonal complement \(\Gamma ^{\perp }\) is obtained by applying almost the same procedure as in the proof of Theorem 5.5 in Tsiatis (2006).
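The identity \(E[\alpha _i^Th_3(Z)]=0\) can be seen from iterated expectations. The following sketch assumes, as in the setting of Section 5.2 of Tsiatis (2006), that each \(\alpha _i \in \Gamma _i\ (i=1,2)\) is a limit of scores of the conditional distribution of \((T,\Delta )\) given Z and hence satisfies \(E[\alpha _i \mid Z]=0\):

\[ E\left[ \alpha _i^T h_3(Z) \right] = E\left[ E\left[ \alpha _i \mid Z \right] ^T h_3(Z) \right] = 0 \quad (i=1,2). \]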
Cite this article
Komukai, S., Hattori, S. Asymptotic justification of maximum likelihood estimation for the proportional excess hazard model in analysis of cancer registry data. Jpn J Stat Data Sci 6, 337–359 (2023). https://doi.org/10.1007/s42081-023-00190-6
DOI: https://doi.org/10.1007/s42081-023-00190-6