Abstract
Re-parametrization is often used to convert a constrained optimization problem into an unconstrained one. This paper focuses on the non-parametric maximum likelihood estimation of the sub-distribution functions for current status data with competing risks. Our main aim is to propose a method using re-parametrization, which is simpler and easier to handle than the constrained maximization methods discussed in Jewell and Kalbfleisch (Biostatistics, 5, 291–306, 2004) and Maathuis (2006), when both the monitoring times and the number of individuals observed at these times are fixed. The Expectation-Maximization (EM) algorithm is then used for estimating the unknown parameters. We also establish some asymptotic results for these maximum likelihood estimators. Finite sample properties of these estimators are investigated through an extensive simulation study. Some generalizations are also discussed.
References
Dempster, A.P., Laird, N. and Rubin, D.B. (1977). Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B Methodol.39, 1–38.
Groeneboom, P., Maathuis, M.H. and Wellner, J.A. (2008). Current status data with competing risks: consistency and rates of convergence of the MLE. Ann. Stat. 36, 1031–1063.
Groeneboom, P., Maathuis, M.H. and Wellner, J.A. (2008). Current status data with competing risks: limiting distribution of the MLE. Ann. Stat. 36, 1064–1089.
Hudgens, M.G., Satten, G.A. and Longini, I.M. Jr (2001). Nonparametric maximum likelihood estimation for competing risks survival data subject to interval censoring and truncation. Biometrics57, 74–80.
Jewell, N.P. and Kalbfleisch, J.D. (2004). Maximum likelihood estimation of ordered multinomial parameters. Biostatistics 5, 291–306.
Jewell, N.P., van der Laan, M. and Henneman, T. (2003). Nonparametric estimation from current status data with competing risks. Biometrika 90, 183–197.
Koley, T. and Dewanji, A. (2016). Non-parametric maximum likelihood estimation of current status data with competing risks. Technical Report No. ASU/2016/6. http://www.isical.ac.in/~asu/TR/TechRepASU201606.pdf.
Lehmann, E.L. and Casella, G. (1998). Theory of Point Estimation, 2nd edn. Springer, Berlin, pp. 463–465.
Louis, T.A. (1982). Finding the observed information matrix when using the EM algorithm. J. R. Statist. Soc. B44, 226–233.
Maathuis, M.H. and Hudgens, M.G. (2011). Nonparametric inference for competing risks current status data with continuous, discrete or grouped observation times. Biometrika98, 325–340.
Maathuis, M.H. (2006). Nonparametric estimation for current status data with competing risks. Diss. University of Washington, Washington.
Turnbull, B.W. (1976). The empirical distribution function with arbitrarily grouped, censored and truncated data. J. R. Stat. Soc. Ser. B Methodol. 38, 290–295.
Ververidis, D. and Kotropoulos, C. (2008). Gaussian mixture modeling by exploiting the Mahalanobis distance. IEEE Trans. Signal Process. 56, 2797–2811.
Wu, C.F. (1983). On the convergence properties of the EM algorithm. Ann. Stat.11, 95–103.
Acknowledgments
We would like to thank the Associate Editor and the anonymous reviewers for a careful reading of the manuscript and several helpful suggestions.
Appendix
Proof of Theorem 1
Let us first note the following properties:
P1 : The true value of the parameter \(\underset {\sim }{\lambda }\), denoted by \(\underset {\sim }{\lambda _{0}}\), lies in an open set. In the present framework, we have \(0 < \underset {\sim }{\lambda } < 1\) in the sense that 0 ≤ λji ≤ 1, for all j, i, with some restrictions. Note that \(\underset {\sim }{\lambda }= 0\) means λji = 0 for all j, i, which is not of interest; similarly, \(\underset {\sim }{\lambda }= 1\) means λji = 1 for all j, i, which is also not of interest.
P2 : Let \(\underset {\sim }{\lambda }^{\prime }\) and \(\underset {\sim }{\lambda }^{\prime \prime }\) be two values of the parameter vector \(\underset {\sim }{\lambda }\) with \(\underset {\sim }{\lambda }^{\prime } \neq \underset {\sim }{\lambda }^{\prime \prime }\). Then, at least one component of these two vectors differs, say \(\lambda _{ji}^{\prime } \neq \lambda _{ji}^{\prime \prime }\), for some i = 1,..,k and j = 1,..,m. Also, the densities \(p^{i} (\delta ^{(i)} \mid \underset {\sim }{\lambda })\) are functions of \(\lambda _{ji^{\prime }}\), for j = 1,2,..,m and i′≤ i, for i = 1,...,k. Hence, \(p^{i^{\prime }} (\delta ^{(i^{\prime })} \mid \underset {\sim }{\lambda }^{\prime }) \neq p^{i^{\prime }} (\delta ^{(i^{\prime })} \mid \underset {\sim }{\lambda }^{\prime \prime })\), for i′≥ i, and the likelihood \(L(\underset {\sim }{\lambda })\), being the product of these densities, satisfies \(L(\underset {\sim }{\lambda }^{\prime }) \neq L(\underset {\sim }{\lambda }^{\prime \prime })\).
P3 : For each i = 1,..,k, \(E_{i,\underset {\sim }{\lambda }} \left [ \frac {\partial }{\partial \underset {\sim }{\lambda }} \log \hspace {4 pt} p^{i} (\delta ^{(i)} \mid \underset {\sim }{\lambda }) \right ] = 0\), where \(E_{i,\underset {\sim }{\lambda }} \left [ \cdot \right ] \) is the expectation with respect to the density \(p^{i} (\delta ^{(i)} \mid \underset {\sim }{\lambda })\). Again, differentiating with respect to \(\underset {\sim }{\lambda }\), we have \(E_{i,\underset {\sim }{\lambda }}\left [ \left (\frac {\partial }{\partial \underset {\sim }{\lambda }} \log \hspace {4 pt} p^{i} (\delta ^{(i)} \mid \underset {\sim }{\lambda })\right ) \left (\frac {\partial }{\partial \underset {\sim }{\lambda }} \log \hspace {4 pt} p^{i} (\delta ^{(i)} \mid \underset {\sim }{\lambda })\right )^{\text {T}} \right ] = E_{i,\underset {\sim }{\lambda }} \left [ - \frac {\partial ^2}{\partial \underset {\sim }{\lambda } \partial \underset {\sim }{\lambda }^{\text {T}}} \log \hspace {4 pt}p^{i} (\delta ^{(i)} \mid \underset {\sim }{\lambda }) \right ] \)\( = \mathscr{I}_i(\underset {\sim }{\lambda }) \), say, which is assumed to be non-negative definite.
P4 : The density function \(p^{i} (\delta ^{(i)} \mid \underset {\sim }{\lambda })\) is a polynomial in the λji’s. Therefore, it is continuous in each λji and admits all third order derivatives. Also, since each λji is bounded above by 1, it can be shown that these third order derivatives are bounded by functions with finite expectations in a neighbourhood of the true value \(\underset {\sim }{\lambda _{0}}\) of \(\underset {\sim }{\lambda }\).
Let us consider a sphere Qa with center at the true value \(\underset {\sim }{\lambda _{0}}\) and radius a. We will show that, for a sufficiently small value of a, log \(L_I(\underset {\sim }{\lambda }) < \) log \(L_I(\underset {\sim }{\lambda _{0}})\) with probability tending to 1 as n →∞, for all \(\underset {\sim }{\lambda }\) on the surface of Qa. Then \(L_I(\underset {\sim }{\lambda })\) has a local maximum in the interior of Qa and, hence, the likelihood equations have a solution within Qa. Note that, from (4.2), we have log \(L_I(\underset {\sim }{\lambda }) = {\sum }_{i = 1}^k {\sum }_{l = 1}^{n_i} \log \hspace {4 pt} p^i (\delta ^{(i)}_l \mid \underset {\sim }{\lambda })\), so that
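The display referred to below as (8.1) does not survive in this version of the text. Following the corresponding argument in Lehmann and Casella (1998), it is presumably the third-order Taylor expansion of the log-likelihood difference about \(\underset {\sim }{\lambda _{0}}\); a sketch, writing λji for the components of \(\underset {\sim }{\lambda }\):

```latex
\log L_I(\lambda) - \log L_I(\lambda_0)
  = \sum_{j,i} (\lambda_{ji} - \lambda_{0,ji})
      \frac{\partial}{\partial \lambda_{ji}} \log L_I(\lambda_0)
  + \frac{1}{2} \sum_{j,i} \sum_{j',i'}
      (\lambda_{ji} - \lambda_{0,ji})(\lambda_{j'i'} - \lambda_{0,j'i'})
      \frac{\partial^2}{\partial \lambda_{ji}\, \partial \lambda_{j'i'}} \log L_I(\lambda_0)
  + R_3(\lambda),
```

where \(R_3(\lambda)\) is the third-order remainder, whose coefficients are bounded by functions with finite expectations in a neighbourhood of \(\underset {\sim }{\lambda _{0}}\) by property P4; properties P1 and P3 then control the first two terms on the surface of Qa.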
Then, following the proof given in Lehmann and Casella (1998) and using the properties P1–P4, it can be proved that the maximum of (8.1), over all \(\underset {\sim }{\lambda }\) on the surface of Qa, is less than zero. This completes the proof of (i). Using a Taylor series expansion of the likelihood equation, we have
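The display (8.2) is likewise missing here. Under the standard argument, expanding the score equation \(\frac {\partial }{\partial \underset {\sim }{\lambda }} \log L_I(\hat {\underset {\sim }{\lambda }}) = 0\) around \(\underset {\sim }{\lambda _{0}}\) presumably gives an expression of the form:

```latex
\sqrt{n}\,(\hat{\lambda} - \lambda_0)
  = \left[ -\frac{1}{n}
      \frac{\partial^2}{\partial \lambda\, \partial \lambda^{\mathrm{T}}}
      \log L_I(\lambda^{*}) \right]^{-1}
    \frac{1}{\sqrt{n}}
    \frac{\partial}{\partial \lambda} \log L_I(\lambda_0),
```

with \(\lambda^{*}\) a point between \(\hat{\lambda}\) and \(\lambda_0\); the limit results below are then applied to the two factors on the right-hand side.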
For a particular i, by the weak law of large numbers, we have
Also, using the Central Limit Theorem, we have \(\frac {1}{\sqrt {n_i}} {\sum }_{l = 1}^{n_i} \frac {\partial }{\partial \underset {\sim }{\lambda }} \log {\kern 4pt} p^i(\delta _l^{(i)} | \underset {\sim }{\lambda }) | _{\underset {\sim }{\lambda }= \underset {\sim }{\lambda _{0}}} \xrightarrow {d} N(0,\mathscr{I}_i(\underset {\sim }{\lambda _{0}}))\) and \(-\frac {1}{n} \frac {\partial ^2}{\partial \underset {\sim }{\lambda } \partial \underset {\sim }{\lambda }^T} \log {\kern 4pt} L_I(\underset {\sim }{\lambda })= - {\sum }_{i = 1}^{k} \frac {n_i}{n} \times \frac {1}{n_i} {\sum }_{l = 1}^{n_i} \frac {\partial ^2}{\partial \underset {\sim }{\lambda } \partial \underset {\sim }{\lambda }^T} \log {\kern 4pt} p^{i}(\delta _l^{(i)} | \underset {\sim }{\lambda })\)\(\xrightarrow {p} {\sum }_{i = 1}^{k} w_i \mathscr{I}_i(\underset {\sim }{\lambda })= \mathscr{I}(\underset {\sim }{\lambda }), \) say. Using these results and Slutsky's theorem, we have from (8.2), \(\sqrt {n}(\hat {\underset {\sim }{\lambda }} - \underset {\sim }{\lambda _{0}}) \xrightarrow {d} N(0,\mathscr{I}^{-1}(\underset {\sim }{\lambda _{0}}))\). This completes the proof of (ii). Note that, since \(\hat {\underset {\sim }{\lambda }}\) is a consistent estimate, using the weak law of large numbers as before, \(-\frac {1}{n} \frac {\partial ^2}{ \partial \underset {\sim }{\lambda } \partial \underset {\sim }{\lambda }^T} \log \hspace {4 pt} L_I(\underset {\sim }{\lambda })\) evaluated at \(\underset {\sim }{\lambda }= \hat {\underset {\sim }{\lambda }}\) can be taken as a consistent estimate of \(\mathscr{I}(\underset {\sim }{\lambda _{0}}) ={\sum }_{i = 1}^k w_i \mathscr{I}_i(\underset {\sim }{\lambda _{0}})\).
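The final step estimates \(\mathscr{I}(\underset {\sim }{\lambda _{0}})\) by the scaled negative Hessian of the log-likelihood evaluated at the MLE. A minimal numerical sketch of this plug-in consistency, using a hypothetical one-parameter Bernoulli model in place of the paper's multinomial setup (the model, names, and seed are illustrative assumptions, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(0)
lam0 = 0.3                    # hypothetical true parameter (scalar case)
n = 100_000
x = rng.binomial(1, lam0, n)  # current-status-type binary indicators

lam_hat = x.mean()            # MLE of lam0 in the Bernoulli model

# log p(x | lam) = x log lam + (1 - x) log(1 - lam), so the negative
# second derivative is x / lam^2 + (1 - x) / (1 - lam)^2; averaging it
# at lam_hat gives the plug-in estimate of the Fisher information.
obs_info = np.mean(x / lam_hat**2 + (1 - x) / (1 - lam_hat)**2)

# True Fisher information of the Bernoulli model, for comparison.
true_info = 1.0 / (lam0 * (1.0 - lam0))
```

For large n, `obs_info` is close to `true_info`, mirroring the claim above that the averaged negative Hessian at \(\hat {\underset {\sim }{\lambda }}\) consistently estimates \(\mathscr{I}(\underset {\sim }{\lambda _{0}})\).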
Koley, T., Dewanji, A. Revisiting Non-Parametric Maximum Likelihood Estimation of Current Status Data with Competing Risks. Sankhya B 81, 39–59 (2019). https://doi.org/10.1007/s13571-018-0172-3
Keywords and phrases.
- Monitoring time
- Isotonic constraints
- Re-parametrization
- Cover percentage
- Observed Mahalanobis distance
- Interval hazards
- EM algorithm
- Complete data likelihood