Asymptotic justification of maximum likelihood estimation for the proportional excess hazard model in analysis of cancer registry data

Komukai, Sho; Hattori, Satoshi

doi:10.1007/s42081-023-00190-6

Original Paper
Statistics for Stochastic Processes
Published: 10 March 2023

Japanese Journal of Statistics and Data Science, Volume 6, pages 337–359 (2023)

Abstract

Population-based cancer registry studies are conducted to investigate various cancer-related questions and have an important impact on cancer control. To investigate cancer prognosis from cancer registry data, it is necessary to adjust for the effect of deaths from other causes, since cancer registry data include deaths from causes other than cancer. To correct for the effect of deaths from other causes, excess hazard models are often used. The concept of the excess hazard model is that the hazard function for any death in a cancer registry population is the sum of the hazard for cancer deaths, referred to as the excess hazard, and the hazard for deaths from other causes. The Cox proportional hazards model for the excess hazard has been developed, and for this model, Perme et al. (Biostatistics 10:136–146, 2009) proposed an inference procedure for the regression coefficients that uses the EM algorithm to compute the maximum likelihood estimator. In this article, we present the large-sample properties of the maximum likelihood estimator. Using semiparametric theory, we establish the consistency and asymptotic normality of the estimator and introduce a consistent estimator of the variance of the estimated regression coefficients. The finite-sample properties of the variance estimator are investigated in simulation studies. We also apply the variance estimator to cancer registry data for stomach, lung, and liver cancer patients from the Surveillance, Epidemiology, and End Results (SEER) database in the U.S.
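The additive structure of the excess hazard model can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes constant hazards, a single binary covariate, and administrative censoring, and the function name `simulate_excess_hazard` and all parameter values are hypothetical choices made only for illustration.

```python
import numpy as np

def simulate_excess_hazard(n, beta, lambda0, lambda_pop, censor_time, seed=0):
    """Simulate (T, Delta, Z) with total hazard = excess hazard + population hazard.

    Assumptions (illustrative only): constant baseline excess hazard lambda0,
    constant population hazard lambda_pop, one binary covariate Z.
    """
    rng = np.random.default_rng(seed)
    z = rng.binomial(1, 0.5, size=n)
    # Constant hazards correspond to exponential latent times.
    t_excess = rng.exponential(1.0 / (lambda0 * np.exp(beta * z)))  # cancer death
    t_pop = rng.exponential(1.0 / lambda_pop, size=n)               # other causes
    # The hazard of the minimum of independent times is the sum of the hazards.
    t_death = np.minimum(t_excess, t_pop)
    t_obs = np.minimum(t_death, censor_time)
    # Only death from any cause is recorded; the cause itself is not used.
    delta = (t_death <= censor_time).astype(int)
    return t_obs, delta, z

t_obs, delta, z = simulate_excess_hazard(
    n=20000, beta=0.5, lambda0=0.10, lambda_pop=0.05, censor_time=5.0
)
for group in (0, 1):
    # Events per unit of person-time estimate the total hazard in each group.
    rate = delta[z == group].sum() / t_obs[z == group].sum()
    target = 0.10 * np.exp(0.5 * group) + 0.05
    print(f"Z={group}: empirical rate {rate:.3f}, model total hazard {target:.3f}")
```

In each covariate group the empirical all-cause death rate should be close to the sum of the excess hazard and the population hazard, which is the identity the model relies on.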

References

Allemani, C., Weir, H. K., Carreira, H., Harewood, R., Spika, D., Wang, X. S., et al. (2015). Global surveillance of cancer survival 1995–2009: Analysis of individual data for 25,676,887 patients from 279 population-based registries in 67 countries (CONCORD-2). Lancet, 385, 977–1010.
Allemani, C., Matsuda, T., Carlo, V. D., Harewood, R., Matz, M., Maja, N., et al. (2018). Global surveillance of trends in cancer survival 2000–14 (CONCORD-3): Analysis of individual records for 37,513,025 patients diagnosed with one of 18 cancers from 322 population-based registries in 71 countries. Lancet, 391, 1023–1075.
Andersen, P. K., Borgan, O., Gill, R. D., & Keiding, N. (1993). Statistical models based on counting processes. Springer.
Bolard, P., Quantin, C., Abrahamowicz, M., Estève, J., Giorgi, R., Chadha-Boreham, H., et al. (2002). Assessing time-by-covariate interactions in relative survival models using restrictive cubic spline functions. Journal of Cancer Epidemiology and Prevention, 7, 113–122.
Coleman, M. P., Quaresma, M., Berrino, F., Lutz, J., Angelis, R. D., Capocaccia, R., et al. (2008). Cancer survival in five continents: A worldwide population-based study (CONCORD). Lancet Oncology, 9, 730–756.
Cortese, G., & Scheike, T. H. (2008). Dynamic regression hazards models for relative survival. Statistics in Medicine, 27, 3563–3584.
Derks, M. G. M., Bastiaannet, E., Kiderlen, M., Hilling, D. E., Boelens, P. G., Walsh, P. M., et al. (2018). Variation in treatment and survival of older patients with nonmetastatic breast cancer in five European countries: A population-based cohort study from the EURECCA Breast Cancer Group. British Journal of Cancer, 119, 121–129.
Dickman, P. W., Sloggett, A., Hills, M., & Hakulinen, T. (2004). Regression models for relative survival. Statistics in Medicine, 23(1), 51–64.
Ederer, F., Axtell, L. M., & Cutler, S. J. (1961). The relative survival rate: A statistical methodology. National Cancer Institute Monograph, 6, 101–121.
Estève, J., Benhamou, E., Croasdale, M., & Raymond, L. (1990). Relative survival and the estimation of net survival: Elements for further discussion. Statistics in Medicine, 9, 529–538.
Fang, H.-B., Li, G., & Sun, J. (2005). Maximum likelihood estimation in a semiparametric logistic/proportional-hazard mixture model. Scandinavian Journal of Statistics, 32, 59–75.
Fine, J. P., & Gray, R. J. (1999). A proportional hazards model for the subdistribution of a competing risk. Journal of the American Statistical Association, 94, 496–509.
Fleming, T. R., & Harrington, D. P. (1991). Counting processes and survival analysis. Wiley.
Giorgi, R., Abrahamowicz, M., Quantin, C., Bolard, P., Estève, J., Gouvernet, J., & Faivre, J. (2003). A relative survival regression model using B-spline functions to model non-proportional hazards. Statistics in Medicine, 22, 2767–2784.
Hakulinen, T. (1982). Cancer survival corrected for heterogeneity in patient withdrawal. Biometrics, 38, 933–942.
Hakulinen, T., & Tenkanen, L. (1987). Regression analysis of relative survival rates. Journal of the Royal Statistical Society, Series C., 36, 309–317.
Kalager, M., Adami, H.-O., Lagergren, P., Steindorf, K., & Dickman, P. W. (2021). Cancer outcomes research-a European challenge: Measures of the cancer burden. Molecular Oncology, 15, 3223–3241.
Komukai, S., & Hattori, S. (2017). Doubly robust estimator for net survival rate in analyses of cancer registry data. Biometrics, 73, 124–133.
Komukai, S., & Hattori, S. (2020). Doubly robust inference procedure for relative survival ratio in population-based cancer registry data. Statistics in Medicine, 39(13), 1884–1900.
Lambert, P. C., Smith, L. K., Jones, D. R., & Botha, J. L. (2005). Additive and multiplicative covariate regression models for relative survival incorporating fractional polynomials for time-dependent effects. Statistics in Medicine, 24, 3871–3885.
Li, M., Reintals, M., D’Onise, K., Farshid, G., Holmes, A., Joshi, R., Karapetis, C. S., Miller, C. L., Olver, I. N., Buckley, E. S., Townsend, A., Walters, D., & Roder, D. M. (2021). Investigating the breast cancer screening-treatment-mortality pathway of women diagnosed with invasive breast cancer: Results from linked health data. European Journal of Cancer Care, 31, e13539. https://doi.org/10.1111/ecc.13539
Louis, T. A. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society, Series B, 44, 226–233.
Murphy, S. A., Rossini, A. J., & van der Vaart, A. W. (1997). Maximum likelihood estimation in the proportional odds model. Journal of the American Statistical Association, 92, 968–976.
Nelson, C. P., Lambert, P. C., Squire, I. B., & Jones, D. R. (2007). Flexible parametric models for relative survival, with application in coronary heart disease. Statistics in Medicine, 26, 5486–5498.
Perme, M. P., Stare, J., & Estève, J. (2012). On estimation in relative survival. Biometrics, 68, 113–120.
Perme, M. P., Henderson, R., & Stare, J. (2009). An approach to estimation in relative survival regression. Biostatistics, 10, 136–146.
Perme, M. P., Estève, J., & Rachet, B. (2016). Analysing population-based cancer survival–settling the controversies. BMC Cancer, 16, 933. https://doi.org/10.1186/s12885-016-2967-9
Pollard, D. (1990). Empirical processes: Theory and applications. Institute of Mathematical Statistics.
Rubio, F. J., Remontet, L., Jewell, N. P., & Belot, A. (2018). On a general structure for hazard-based regression models: An application to population-based cancer research. Statistical Methods in Medical Research.
Rubio, F. J., Rachet, B., Giorgi, R., Maringe, C., Belot, A., & the CENSUR working survival group. (2021). On models for the estimation of the excess mortality hazard in case of insufficiently stratified life tables. Biostatistics, 22(1), 51–67.
Sasieni, P. D. (1996). Proportional excess hazards. Biometrika, 83, 127–141.
Schuil, H., Derks, M., Liefers, G.-J., Portielje, J., van de Velde, C., Syed, B., Green, A., Ellis, I., Cheung, K.-L., & Bastiaannet, E. (2018). Treatment strategies and survival outcomes in older women with breast cancer: A comparative study between the FOCUS cohort and Nottingham cohort. Journal of Geriatric Oncology, 9, 635–641.
Syriopoulou, E., Rutherford, M. R., & Lambert, P. C. (2021). Inverse probability weighting and doubly robust standardization in the relative survival framework. Statistics in Medicine, 40, 6069–6092.
Touraine, C., Grafféo, N., Giorgi, R., & the CENSUR working survival group. (2020). More accurate cancer-related excess mortality through correcting background mortality for extra variables. Statistical Methods in Medical Research, 29(1), 122–136.
Tsiatis, A. (2006). Semiparametric Theory and Missing Data. New York: Springer.
Van der Vaart, A. W. (2000). Asymptotic statistics (Vol. 3). Cambridge University Press.
Woods, L. M., Rachet, B., Morris, M., Bhaskaran, K., & Coleman, M. P. (2021). Are socio-economic inequalities in breast cancer survival explained by peri-diagnostic factors? BMC Cancer, 21, 485. https://doi.org/10.1186/s12885-021-08087-x

Acknowledgements

The first author’s research was partly supported by Grant-in-Aid for Early-Career Scientists (20K19754) from the Ministry of Education, Science, Sports and Technology of Japan. The second author’s research was partly supported by Grant-in-Aid for Challenging Exploratory Research (16K12403) and for Scientific Research (16H06299, 18H03208) from the Ministry of Education, Science, Sports and Technology of Japan. Computational calculations were performed at the Institute of Medical Science (the University of Tokyo).

Funding

The research leading to these results received funding from the Ministry of Education, Science, Sports and Technology of Japan under Grant-in-Aid for Early-Career Scientists No. 20K19754, for Challenging Exploratory Research No. 16K12403, and for Scientific Research No. 16H06299 and No. 18H03208.

Author information

Authors and Affiliations

Department of Biomedical Statistics, Graduate School of Medicine, Osaka University, Yamadaoka 2-2, Suita, Osaka, 565-0871, Japan
Sho Komukai & Satoshi Hattori
Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Yamadaoka 2-2, Suita, Osaka, 565-0871, Japan
Satoshi Hattori

Corresponding author

Correspondence to Satoshi Hattori.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Existence of the maximum likelihood estimator and identifiability of $\beta _0$ and $\Lambda _0$

The existence of a pair $(\beta ,\Lambda )$ maximizing the observed likelihood (3) is proved using arguments similar to those in the proof of Theorem 1 in Fang et al. (2005). In this Appendix, along the same lines, we also prove the identifiability of $(\beta _0,\Lambda _0)$ in the sense that $L(\Lambda ,\beta ;t,\delta ,z) = L(\Lambda _0, \beta _0;t,\delta ,z)$ implies $\beta = \beta _0$ and $\Lambda (t)=\Lambda _0(t)$ for all $t \in [0,\tau ]$.

Suppose the parameter space $\mathscr {B} \subset \mathbb {R}^p$ of $\beta $ is compact, where $p$ is the dimension of $\beta $. Since the covariate vector $Z$ is bounded, $e^{\beta ^TZ}$ is also bounded over $\mathscr {B}$, and its lower and upper bounds are denoted by $K_l$ and $K_u$, respectively. Let $t_1<t_2<\cdots <t_k$ be the distinct failure times. Then, for any right-continuous and non-decreasing function $\Lambda (t)$, it holds that

$$\begin{aligned} 0 \le L(\beta , \Lambda )&\le \prod _{i=1}^n{ \left\{ K_u \Lambda (T_i) + \Lambda _P(T_i|Z_i) \right\} ^{\Delta _i} e^{ - \Lambda (T_i)K_l } } \\&\le \prod _{i:T_i < t_k}{ \left\{ \frac{ K_u }{ K_l } + \Lambda _P(T_i|Z_i) \right\} ^{\Delta _i} } \\&\quad \times \prod _{i:T_i = t_k}{ \left\{ K_u \Lambda (t_k)e^{ - \Lambda (t_k)K_l } + \Lambda _P(T_i|Z_i) \right\} ^{\Delta _i} }. \end{aligned} \tag{12}$$

Because setting $\Lambda (T_i)=\Lambda (t_k)$ for all $T_i \ge t_k$ increases the likelihood if $t_k$ is sufficiently large that $\Lambda (t_k) \ge 1$, it suffices to restrict $\Lambda (t)$ to the space $\Omega _0$, where

$$\begin{aligned} \Omega _0 =&\left\{ \Lambda : \Lambda (t) \text{ is a right-continuous and non-decreasing function} \right. \\&\qquad \left. \text{with } \Lambda (t)=\Lambda (t_k) \text{ for all } t \ge t_k \right\} . \end{aligned}$$

Let $A_M = \left\{ \Lambda \in \Omega _0: \Lambda (t_k) \le M \right\} $ for any $0<M<\infty $. Because $L(\beta ,\Lambda )$ is continuous in $\beta $ and $\Lambda $, it attains a maximum on the compact subspace $\mathscr {B} \times A_M$ for any given $M$. Let $L^{(M)}$ be the maximum value of $L(\beta ,\Lambda )$ on $\mathscr {B} \times A_M$. Since $Me^{-MK_l} \rightarrow 0$ as $M \rightarrow \infty $, there exists an $M_0 \ge 1$ such that the right-hand side of (12) is less than $L^{(M_0)}$ for all $\Lambda $ outside $A_{M_0}$. Therefore, the likelihood evaluated along any sequence $\Lambda _m$ with $\Lambda _m(t_k)$ diverging to infinity as $m \rightarrow \infty $ cannot approach the maximum value of $L(\beta ,\Lambda )$. Consequently, when maximizing the observed likelihood (3), we can restrict attention to the compact subspace $\mathscr {B} \times A_{M_0}$, and the existence of the maximum likelihood estimator follows from the continuity of the likelihood.
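The role of the decay $Me^{-MK_l} \rightarrow 0$ can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' implementation: it takes the log-likelihood in the form implied by the submodel log-likelihood of Appendix B at $\gamma _1=0$, treats $\Lambda $ as a step function with jumps at the distinct failure times, and drops terms that do not depend on $(\beta ,\Lambda )$. The function name `log_likelihood`, the argument `d_pop` (standing for $\mathrm{d}\Lambda _P(T_i|Z_i)$), and the toy data are all hypothetical.

```python
import numpy as np

def log_likelihood(beta, jumps, failure_times, t, delta, z, d_pop):
    """Observed log-likelihood for a step-function Lambda (illustrative form).

    jumps[j] is the jump of Lambda at failure_times[j] (sorted ascending);
    d_pop[i] plays the role of dLambda_P(T_i | Z_i).
    """
    beta = np.atleast_1d(np.asarray(beta, dtype=float))
    risk = np.exp(z @ beta)                                   # e^{beta^T Z_i}
    # Lambda(T_i): sum of all jumps at failure times <= T_i.
    cum = np.concatenate(([0.0], np.cumsum(jumps)))
    big_lambda = cum[np.searchsorted(failure_times, t, side="right")]
    # dLambda(T_i): the jump located exactly at T_i, or 0 if there is none.
    pos = np.minimum(np.searchsorted(failure_times, t, side="left"),
                     len(failure_times) - 1)
    d_lambda = np.where(failure_times[pos] == t, jumps[pos], 0.0)
    # Event term enters only for Delta_i = 1; the inner where avoids log(0).
    inside = np.where(delta == 1, d_lambda * risk + d_pop, 1.0)
    event = np.where(delta == 1, np.log(inside), 0.0)
    return float(np.sum(event - big_lambda * risk))

# Toy data (hypothetical values chosen only for illustration).
t = np.array([1.0, 2.0, 3.0, 4.0])
delta = np.array([1, 0, 1, 1])
z = np.array([[0.0], [1.0], [0.0], [1.0]])
d_pop = np.full(4, 0.01)
failure_times = np.array([1.0, 3.0, 4.0])
jumps = np.array([0.2, 0.2, 0.2])

# Inflate Lambda by a factor c: the linear penalty dominates the log gain.
lls = [log_likelihood(0.3, c * jumps, failure_times, t, delta, z, d_pop)
       for c in (1.0, 10.0, 100.0)]
for c, ll in zip((1.0, 10.0, 100.0), lls):
    print(f"scale {c:6.1f}: log-likelihood {ll:.3f}")
```

Scaling the jumps upward adds at most a logarithmic gain in the event terms while the exponential survival terms impose a linear penalty, so the log-likelihood decreases without bound. This is the behaviour that allows the search to be confined to a bounded set such as $A_{M_0}$.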

We now prove that both $\beta _0$ and $\Lambda _0$ are identifiable. By considering $L(\Lambda ,\beta ;t,0,z) = L(\Lambda _0, \beta _0;t,0,z)$, we have $\Lambda (t)/\Lambda _0(t)=e^{-\left( \beta - \beta _0 \right) ^TZ}$ for all $t \le \tau $ and all $Z$ such that $\Pr (T> \tau |Z)>0$. Because the left-hand side does not depend on $Z$, $\left( \beta - \beta _0 \right) ^TZ$ must be constant in $Z$. Hence it holds that $\beta =\beta _0$ if the covariance matrix of $Z$ is nondegenerate, and consequently $\Lambda (t)=\Lambda _0(t)$ for all $t \le \tau $. By considering $L(\Lambda ,\beta ;t,1,z) = L(\Lambda _0, \beta _0;t,1,z)$, we also have $\mathrm{{d}}\Lambda (t)=\mathrm{{d}}\Lambda _0(t)$ for all $t \le \tau $.
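The first identifiability step can be spelled out as follows. This is a sketch under the assumption, not stated explicitly in this Appendix, that for $\delta =0$ the likelihood contribution is proportional to the survival probability $\exp \{ -\Lambda (t)e^{\beta ^Tz} - \Lambda _P(t|z) \}$, with $\Lambda _P$ known from life tables. Equating the two contributions gives

$$\begin{aligned} \exp \left\{ -\Lambda (t)e^{\beta ^Tz} - \Lambda _P(t|z) \right\}&= \exp \left\{ -\Lambda _0(t)e^{\beta _0^Tz} - \Lambda _P(t|z) \right\} \\ \Longleftrightarrow \quad \Lambda (t)e^{\beta ^Tz}&= \Lambda _0(t)e^{\beta _0^Tz} \\ \Longleftrightarrow \quad \frac{\Lambda (t)}{\Lambda _0(t)}&= e^{-\left( \beta - \beta _0 \right) ^Tz}. \end{aligned}$$

The left-hand side is free of $z$, so $\left( \beta - \beta _0 \right) ^TZ = c$ for some constant $c$, and taking variances yields

$$\begin{aligned} \left( \beta - \beta _0 \right) ^T \mathrm{Var}(Z) \left( \beta - \beta _0 \right) = 0. \end{aligned}$$

If $\mathrm{Var}(Z)$ is positive definite, this forces $\beta =\beta _0$, so the ratio $\Lambda (t)/\Lambda _0(t)$ equals $e^{0}=1$.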

Appendix B: Nuisance tangent space and its orthogonal complement

Let $\mathcal {H}$ be the Hilbert space consisting of all p-dimensional measurable functions of $(T,\Delta , Z)$ with mean zero and finite variance, equipped with the inner product $\left<h_1,h_2\right>=E\left[ h_1^T(T,\Delta ,Z)h_2(T,\Delta ,Z)\right] $. To derive the nuisance tangent space for the nuisance parameter $\eta =\left\{ \Lambda , \Lambda _C, F_Z \right\} $, we consider the parametric submodels $\Lambda _{h_1}(t;\gamma _1)$, $\Lambda _{C,h_2}(t|Z;\gamma _2)$, and $F_{Z,h_3}(z;\gamma _3)$ for $\Lambda $, $\Lambda _C$, and $F_Z$, respectively, which were defined in Sect. 3, where $\gamma _1$, $\gamma _2$, and $\gamma _3$ are finite-dimensional nuisance parameters. The nuisance tangent space for each nuisance parameter is then derived as the mean-square closure of all parametric submodel nuisance tangent spaces. Since the derivations of the nuisance tangent spaces $\Gamma _2$ and $\Gamma _3$ in Sect. 4, which correspond to the nuisance parameters $\Lambda _C$ and $F_Z$, respectively, are the same as those in Section 5.2 of Tsiatis (2006), we derive here only the nuisance tangent space $\Gamma _1$ in Theorem 1, which corresponds to the nuisance parameter $\Lambda $.

Again, we consider a parametric submodel $\Lambda _{h_1}(t;\gamma _1) = \int _0^t{ \left\{ 1 + \gamma _1 h_1(u) \right\} \mathrm{{d}}\Lambda _0(u) } = \int _0^t{ \left\{ 1 + \gamma _1 h_1(u) \right\} \lambda _0(u) \mathrm{{d}}u }$, where $h_1(u)$ is an arbitrary p-dimensional bounded function. The contribution to the log-likelihood function under the parametric submodel is

$$\begin{aligned} \ell _n(\beta , \gamma _1; h_1)&= \sum _{i=1}^n{ \Delta _i \log { \left[ \left\{ 1 + \gamma _1 h_1(T_i) \right\} \mathrm{{d}}\Lambda _0(T_i)e^{\beta ^TZ_i} + \mathrm{{d}}\Lambda _P(T_i|Z_i) \right] } } \\&\quad - \sum _{i=1}^n{ \int _{0}^{T_i}{ \left\{ 1 + \gamma _1 h_1(t) \right\} \mathrm{{d}}\Lambda _0(t) }e^{\beta ^TZ_i } }. \end{aligned}$$

Taking the derivative of $\ell _n(\beta , \gamma _1; h_1)$ with respect to $\gamma _1$ and evaluating it at $\beta =\beta _0$ and $\gamma _1=0$, we obtain the score function

$$\begin{aligned} U_{n,\gamma _1}(\beta _0;h_1)&= \sum _{i=1}^n{ \int _{0}^{\tau }{ h_1(t) W(t|Z_i;\beta _0, \Lambda _0) \mathrm{{d}}M_i(t)} }. \end{aligned}$$

Then, the score function for this parametric submodel is in the nuisance tangent space $\Gamma _1$. Since any element of $\mathcal {H}$ can be approximated by a sequence of bounded functions (Tsiatis 2006, Section 4), the score function of the parametric submodel remains in $\Gamma _1$ even when $h_1(t)$ is not bounded.
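The passage from the submodel log-likelihood to the martingale form of the score can be sketched as follows. The definitions used here are assumptions, since $W$ and $M_i$ are introduced in the main text rather than in this Appendix: $W(t|Z;\beta ,\Lambda ) = e^{\beta ^TZ}\mathrm{{d}}\Lambda (t) / \{ e^{\beta ^TZ}\mathrm{{d}}\Lambda (t) + \mathrm{{d}}\Lambda _P(t|Z) \}$, $N_i(t)=I(T_i \le t, \Delta _i=1)$, $Y_i(t)=I(T_i \ge t)$, and $\mathrm{{d}}M_i(t) = \mathrm{{d}}N_i(t) - Y_i(t)\{ e^{\beta _0^TZ_i}\mathrm{{d}}\Lambda _0(t) + \mathrm{{d}}\Lambda _P(t|Z_i) \}$. Under these definitions, with $W_i(t)=W(t|Z_i;\beta _0,\Lambda _0)$,

$$\begin{aligned} \left. \frac{\partial \ell _n}{\partial \gamma _1} \right| _{\gamma _1=0,\, \beta =\beta _0}&= \sum _{i=1}^n{ \Delta _i h_1(T_i) \frac{ e^{\beta _0^TZ_i}\mathrm{{d}}\Lambda _0(T_i) }{ e^{\beta _0^TZ_i}\mathrm{{d}}\Lambda _0(T_i) + \mathrm{{d}}\Lambda _P(T_i|Z_i) } } - \sum _{i=1}^n{ \int _0^{T_i}{ h_1(t) e^{\beta _0^TZ_i} \mathrm{{d}}\Lambda _0(t) } } \\&= \sum _{i=1}^n{ \int _0^{\tau }{ h_1(t) W_i(t) \mathrm{{d}}N_i(t) } } - \sum _{i=1}^n{ \int _0^{\tau }{ h_1(t) W_i(t) Y_i(t) \left\{ e^{\beta _0^TZ_i}\mathrm{{d}}\Lambda _0(t) + \mathrm{{d}}\Lambda _P(t|Z_i) \right\} } } \\&= \sum _{i=1}^n{ \int _0^{\tau }{ h_1(t) W_i(t) \mathrm{{d}}M_i(t) } }. \end{aligned}$$

The second equality uses the identity $W_i(t)\{ e^{\beta _0^TZ_i}\mathrm{{d}}\Lambda _0(t) + \mathrm{{d}}\Lambda _P(t|Z_i) \} = e^{\beta _0^TZ_i}\mathrm{{d}}\Lambda _0(t)$, which converts the compensator of the excess hazard alone into the compensator of the total hazard weighted by $W_i$.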

For any parametric submodel $\Lambda (t;\gamma _1) = \int _0^t{ \lambda (u;\gamma _1) \mathrm{{d}}u }$, the score function with respect to $\gamma _1$, setting $\gamma _1=0$ and $\beta =\beta _0$, is expressed as

$$\begin{aligned} U_{1,\gamma _1}(\beta _0)&= \int _{0}^{\tau }{ \left\{ \left. \frac{\partial }{\partial \gamma _1 }\log {\lambda (t;\gamma _1)} \right| _{\gamma _1=0} \right\} W(t|Z;\beta _0, \Lambda _0) \mathrm{{d}}M(t)}. \nonumber \end{aligned}$$

Hence, this score function is in the nuisance tangent space $\Gamma _1$. Conversely, each element of $\Gamma _1$ can be shown to be the score function of some parametric submodel, such as $\Lambda _{h_1}(t;\gamma _1) = \int _0^t{ \left\{ 1 + \gamma _1 h_1(u) \right\} \mathrm{{d}}\Lambda _0(u) }$, and is therefore an element of a parametric submodel nuisance tangent space. Therefore, the nuisance tangent space for $\Lambda (t)$ is equal to $\Gamma _1$.

The orthogonality $\Gamma _1 \perp \Gamma _2$ can be easily proved under assumption (A2), and $\Gamma _i \perp \Gamma _3 \ (i=1,2)$ follows from $E[\alpha _i^Th_3(Z)]=0$, where $\alpha _i \in \Gamma _i\ (i=1,2)$ and $h_3(Z) \in \Gamma _3$. Then the nuisance tangent space for the nuisance parameter $\eta =\left\{ \Lambda , \Lambda _C, F_Z \right\} $ is given by the direct sum of the three orthogonal spaces, $\Gamma = \Gamma _1 \oplus \Gamma _2 \oplus \Gamma _3$. The orthogonal complement $\Gamma ^{\perp }$ is obtained by following almost the same procedure as in the proof of Theorem 5.5 in Tsiatis (2006).

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Komukai, S., Hattori, S. Asymptotic justification of maximum likelihood estimation for the proportional excess hazard model in analysis of cancer registry data. Jpn J Stat Data Sci 6, 337–359 (2023). https://doi.org/10.1007/s42081-023-00190-6

Received: 30 June 2022
Revised: 13 January 2023
Accepted: 28 January 2023
Published: 10 March 2023
Issue Date: June 2023
DOI: https://doi.org/10.1007/s42081-023-00190-6
