Abstract
This paper presents and studies a new algorithm for fast computation of the nonparametric maximum likelihood estimate of a U-shaped hazard function. It overcomes a central difficulty in computing such an estimate: a U-shaped hazard function is properly defined only once its anti-mode is known, yet the anti-mode itself has to be found during the computation. Specifically, the new algorithm maintains the constant hazard segment, whether its length is zero or positive. The length varies naturally, according to the mass values allocated to the associated knots after each update. As an extension of the constrained Newton method, the new algorithm inherits that method's fast convergence, as demonstrated on several real-world data examples. The algorithm works not only for exact observations but also for purely interval-censored data and for data that mix exact and interval-censored observations.
References
Ayer, M., Brunk, H.D., Ewing, G.M., Reid, W.T., Silverman, E.: An empirical distribution function for sampling with incomplete information. Ann. Math. Stat. 26, 641–647 (1955)
Banerjee, M.: Estimating monotone, unimodal and U-shaped failure rates using asymptotic pivots. Stat. Sin. 18, 467–492 (2008)
Bray, T.A., Crawford, G.B., Proschan, F.: Maximum Likelihood Estimation of a U-shaped Failure Rate Function. Defense Technical Information Center, Mathematical Note 534, Boeing Research Laboratories, Seattle (1967)
Dümbgen, L., Freitag-Wolf, S., Jongbloed, G.: Estimating a unimodal distribution from interval-censored data. J. Am. Stat. Assoc. 101, 1094–1106 (2006)
Grenander, U.: On the theory of mortality measurement. II. Skand. Aktuarietidskr. 39, 125–153 (1956)
Groeneboom, P., Jongbloed, G.: Nonparametric Estimation under Shape Constraints: Estimators, Algorithms and Asymptotics. Cambridge University Press, Cambridge (2014)
Groeneboom, P., Jongbloed, G., Wellner, J.A.: The support reduction algorithm for computing non-parametric function estimates in mixture models. Scand. J. Stat. 35, 385–399 (2008)
Hall, P., Huang, L.S., Gifford, J.A., Gijbels, I.: Nonparametric estimation of hazard rate under the constraint of monotonicity. J. Comput. Graph. Stat. 10, 592–614 (2001)
Huang, J., Wellner, J.A.: Estimation of a monotone density or monotone hazard under random censoring. Scand. J. Stat. 22, 3–33 (1995)
Jankowski, H., Wang, I., McCague, H., Wellner, J.A.: R Package ConvexHaz: Nonparametric MLE/LSE of Convex Hazard (Version 0.2). http://cran.r-project.org/web/packages/convexHaz/index.html (2009)
Jankowski, H.K., Wellner, J.A.: Computation of nonparametric convex hazard estimators via profile methods. J. Nonparametric Stat. 21, 505–518 (2009a)
Jankowski, H.K., Wellner, J.A.: Nonparametric estimation of a convex bathtub-shaped hazard function. Bernoulli 15, 1010–1035 (2009b)
Kaplan, E.L., Meier, P.: Nonparametric estimation from incomplete observations. J. Am. Stat. Assoc. 53, 457–481 (1958)
Klein, J.P., Moeschberger, M.L.: Survival Analysis: Techniques for Censored and Truncated Data, 2nd edn. Springer, Berlin (2003)
Lawson, C.L., Hanson, R.J.: Solving Least Squares Problems. Prentice-Hall, Inc, Englewood Cliffs (1974)
Lee, E.T., Wang, J.W.: Statistical Methods for Survival Data Analysis, 3rd edn. Wiley, London (2003)
Meyer, M.C., Habtzghi, D.: Nonparametric estimation of density and hazard rate functions with shape restrictions. J. Nonparametric Stat. 23, 455–470 (2011)
Mykytyn, S.W., Santner, T.J.: Maximum likelihood estimation of the survival function based on censored data under hazard rate assumptions. Commun. Stat. Theory Methods 10, 1369–1387 (1981)
Peto, R.: Experimental survival curves for interval-censored data. J. R. Stat. Soc. Ser. C 22, 86–91 (1973)
Proschan, F.: Theoretical explanation of observed decreasing failure rate. Technometrics 5, 375–383 (1963)
R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna (2015)
Reboul, L.: Estimation of a function under shape restrictions: applications to reliability. Ann. Stat. 33, 1330–1356 (2005)
Schick, A., Yu, Q.: Consistency of the GMLE with mixed case interval-censored data. Scand. J. Stat. 27, 45–55 (2000)
Tsai, W.-Y.: Estimation of the survival function with increasing failure rate based on left truncated and right censored data. Biometrika 75, 319–324 (1988)
Turnbull, B.W.: Nonparametric estimation of a survivorship function with doubly censored data. J. Am. Stat. Assoc. 69, 169–173 (1974)
Wang, Y.: On fast computation of the non-parametric maximum likelihood estimate of a mixing distribution. J. R. Stat. Soc. Ser. B 69, 185–198 (2007)
Wang, Y.: Dimension-reduced nonparametric maximum likelihood computation for interval-censored data. Comput. Stat. Data Anal. 52, 2388–2402 (2008)
Wang, Y.: npsurv: Non-parametric Survival Analysis (R Package Version 0.3-4). http://cran.r-project.org/package=npsurv (2015)
Wellner, J.A.: Interval censoring case 2: alternative hypotheses. In: Koul, H., Deshpande, J.V. (eds.) Analysis of Censored Data, Proceedings of the Workshop on Analysis of Censored Data, vol. 27, pp. 271–291. University of Pune, Pune (1995)
Acknowledgements
The authors thank the editor, associate editor and two referees for their constructive suggestions, which led to many improvements in the manuscript.
Appendices
Appendix 1: Derivatives and the Hessian matrix
Consider the partial derivatives of the modified log-likelihood function (3) with respect to masses, in the general case when there exist both exact and interval-censored data. The first partial derivatives are simply the gradient functions evaluated at the corresponding support points, i.e.,
The Hessian matrix \(\mathbf {H}\) can be computed by \( \mathbf {H}= -{\mathbf {D}}^\top {\mathbf {D}}\), where \({\mathbf {D}}\) is an \(n \times (|\mathcal{I}| + m)\) matrix, with its (i, j)-th element given by, for \(i = 1, \ldots , |\mathcal{I}|\),
and, for \(i = |\mathcal{I}| + 1, \ldots , n\),
where
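As an illustration of why the factorization \(\mathbf {H}= -{\mathbf {D}}^\top {\mathbf {D}}\) is convenient, the following minimal Python sketch forms \(\mathbf {H}\) from a matrix \({\mathbf {D}}\) and checks that the quadratic form \(v^\top \mathbf {H} v\) equals \(-||{\mathbf {D}} v||^2\), so it is never positive. The entries of `D` are arbitrary placeholders rather than the elements defined in this appendix, and the function names `neg_gram` and `quad_form` are hypothetical.

```python
def neg_gram(D):
    """Return H = -D^T D for a matrix D stored as a list of rows."""
    n_cols = len(D[0])
    return [[-sum(row[j] * row[k] for row in D) for k in range(n_cols)]
            for j in range(n_cols)]


def quad_form(H, v):
    """Return the quadratic form v^T H v."""
    size = len(v)
    return sum(v[j] * H[j][k] * v[k] for j in range(size) for k in range(size))


# Placeholder 4 x 3 matrix (assumption: illustrative values only).
D = [[1.0, 0.5, 0.0],
     [0.0, 2.0, 1.0],
     [3.0, 0.0, 1.5],
     [0.5, 0.5, 0.5]]

H = neg_gram(D)

# For any v, v^T H v = -||D v||^2 <= 0, so H is negative semidefinite.
v = [1.0, -2.0, 0.5]
Dv = [sum(d * x for d, x in zip(row, v)) for row in D]
print(quad_form(H, v), -sum(x * x for x in Dv))
```

Because \(\mathbf {H}\) is a negated Gram matrix, negative semidefiniteness holds by construction, whatever the entries of \({\mathbf {D}}\) are.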
Appendix 2: Proofs
Proof of Lemma 2
Since \({\varvec{\pi }}' = {{\mathrm{arg\,min}}}_{{\varvec{\pi }}\ge 0} ||{\mathbf {S}}{\varvec{\pi }}- \mathbf {b}||\) and \({\varvec{\pi }}= \mathbf {0}\) is feasible, it holds that \(||{\mathbf {S}}{\varvec{\pi }}' - \mathbf {b}|| \le ||{\mathbf {S}}\mathbf {0}- \mathbf {b}|| = ||\mathbf {b}||\) and hence, by the triangle inequality, \(||{\mathbf {S}}{\varvec{\pi }}'|| \le ||{\mathbf {S}}{\varvec{\pi }}' - \mathbf {b}|| + ||\mathbf {b}|| \le 2 ||\mathbf {b}||\). Therefore, \(||{\mathbf {S}}{\varvec{\delta }}|| \le 2 ||\mathbf {b}|| + \sqrt{n}\). Because \(||\mathbf {b}||\) depends only on \(h(T_i)\), \(i \in \mathcal{I}\), each of which is bounded away from zero for all \(h \in \mathcal{K}_0\), \(||\mathbf {b}||\) is bounded above, and so \(||{\mathbf {S}}{\varvec{\delta }}||\) is bounded above uniformly over \(\mathcal{K}_0\). \(\square \)
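The two inequalities used in this proof can be checked numerically. The sketch below is a rough illustration under stated assumptions: it solves a small nonnegative least squares problem by projected gradient descent started at zero (a stand-in chosen for brevity, not the Lawson and Hanson method listed in the references), and the matrix `S`, the vector `b` and all function names are hypothetical.

```python
import math


def matvec(S, x):
    """Return S x."""
    return [sum(s * xj for s, xj in zip(row, x)) for row in S]


def rmatvec(S, r):
    """Return S^T r."""
    return [sum(S[i][j] * r[i] for i in range(len(S))) for j in range(len(S[0]))]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def nnls_projected_gradient(S, b, iters=500):
    """Approximately minimize 0.5 * ||S pi - b||^2 subject to pi >= 0."""
    # Step 1 / ||S||_F^2 is at most 1 / lambda_max(S^T S), which guarantees
    # that the objective never increases from one iterate to the next.
    step = 1.0 / sum(x * x for row in S for x in row)
    pi = [0.0] * len(S[0])  # start at the feasible point pi = 0
    for _ in range(iters):
        resid = [a - c for a, c in zip(matvec(S, pi), b)]
        grad = rmatvec(S, resid)
        # Gradient step followed by projection onto the nonnegative orthant.
        pi = [max(0.0, p - step * g) for p, g in zip(pi, grad)]
    return pi


# Placeholder problem (assumption: illustrative values only).
S = [[1.0, 2.0],
     [0.0, 1.0],
     [1.0, 0.0]]
b = [1.0, -2.0, 3.0]

pi_hat = nnls_projected_gradient(S, b)
resid_norm = norm([a - c for a, c in zip(matvec(S, pi_hat), b)])
print(resid_norm <= norm(b), norm(matvec(S, pi_hat)) <= 2 * norm(b))
```

For this small example the constrained minimizer is \((2, 0)\), with the second constraint active, and both bounds hold with room to spare.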
Proof of Lemma 3
Since \({\varvec{\delta }}\equiv {\varvec{\pi }}' - {\varvec{\pi }}\) maximizes
under the restriction \({\varvec{\pi }}' \ge 0\), we have
Noting the Taylor series expansion
for any \(0< \alpha < \frac{1}{2}\), there is a \(\lambda > 0\) such that if \(||{\mathbf {S}}{\varvec{\delta }}|| \le \lambda \), then
thus satisfying the Armijo rule.
If \(||{\mathbf {S}}{\varvec{\delta }}|| > \lambda \), then \(||\sigma ^k {\mathbf {S}}{\varvec{\delta }}|| \le \lambda \) holds for some \(k > 0\). Because \(||{\mathbf {S}}{\varvec{\delta }}|| \le u\) from Lemma 2, we need at most
steps for Armijo’s rule to be satisfied in all cases. \(\square \)
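The backtracking argument above can be sketched in Python as follows. This is a generic illustration, not the line search applied to \({\tilde{\ell }}\) in the paper: it assumes the common maximization form of the Armijo condition, \(f(x + \sigma ^k \delta ) \ge f(x) + \alpha \sigma ^k f'(x) \delta \), with \(0< \sigma < 1\), and the test function, starting point and function names are hypothetical. The helper `max_backtracks` returns the smallest \(k\) with \(\sigma ^k u \le \lambda \), which is the kind of bound the proof relies on.

```python
import math


def armijo_backtrack(f, x, delta, slope, alpha=0.25, sigma=0.5, max_steps=60):
    """Return (k, step) for the first k >= 0 such that step = sigma**k
    satisfies f(x + step * delta) >= f(x) + alpha * step * slope."""
    fx = f(x)
    for k in range(max_steps + 1):
        step = sigma ** k
        if f(x + step * delta) >= fx + alpha * step * slope:
            return k, step
    raise RuntimeError("Armijo condition not met within max_steps")


def max_backtracks(u, lam, sigma):
    """Smallest k >= 0 with sigma**k * u <= lam, assuming 0 < sigma < 1."""
    k = 0
    while sigma ** k * u > lam:
        k += 1
    return k


def f(x):
    """Concave test function on x > 0, maximized at x = 1."""
    return math.log(x) - x


x0 = 0.1
delta = 5.0                      # deliberately too long a step
slope = (1.0 / x0 - 1.0) * delta  # directional derivative f'(x0) * delta

k, step = armijo_backtrack(f, x0, delta, slope)
print(k, step, max_backtracks(8.0, 1.0, 0.5))
```

The full step overshoots the maximizer here, so several reductions by the factor \(\sigma \) are needed before the sufficient-increase condition holds. The count returned by `max_backtracks` depends only on \(u\), \(\lambda \) and \(\sigma \), not on the current iterate.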
Proof of Theorem 5
Owing to its monotone increase, \({\tilde{\ell }}(h_s)\) will converge to a finite value no greater than \({\tilde{\ell }}({\hat{h}})\), where \({\hat{h}}\) maximizes \({\tilde{\ell }}(h)\). Further,
because of Armijo’s rule and the nonnegative definiteness of \({\mathbf {S}}_s^{+\top } {\mathbf {S}}_s^+\).
Consider all point-mass directions \(e \in \{\pm e_0, \pm e_{1,\tau }, \pm e_{2,\eta }\}\) from \(h_s\), that are valid in the sense that there exists an \(\epsilon > 0\) such that \(h_s + \epsilon e \in \mathcal{K}\). Denote the steepest ascent direction by
and denote by \({\varvec{\delta }}_s^*\) the direction resulting from moving from \(h_s\) to \(h_s + e_s^*\). Hence, for any \(\epsilon \in {\mathbb {R}}\) such that \(h_s + \epsilon e_s^* \in \mathcal{K}\), we have
because of the optimality of \({\varvec{\delta }}_s\).
Now, let us assume that \(d(h_s + e_s^*; h_s)\) does not approach 0 as \(s \rightarrow \infty \). There are, hence, infinitely many s such that \(d(h_s + e_s^*; h_s) \ge \tau \), for some \(\tau > 0\). For such an s and noting that
we have, with Lemma 2,
Without loss of generality, assume \(\tau \le u^2\) and let \(\epsilon = \tau / u^2\). As a result,
a positive value that is independent of s. Since this violates the Cauchy property of a convergent sequence, we must have \(d(h_s + e_s^*; h_s) \rightarrow 0\) as \(s \rightarrow \infty \). Therefore, \(d({\hat{h}}; h_s) \le d(h_s + e_s^*; h_s) (|h_s| + |{\hat{h}}|) \rightarrow 0\) from Corollary 2, and \({\tilde{\ell }}(h_s) \rightarrow {\tilde{\ell }}({\hat{h}})\) from Lemma 1. \(\square \)
Cite this article
Wang, Y., Fani, S. Nonparametric maximum likelihood computation of a U-shaped hazard function. Stat Comput 28, 187–200 (2018). https://doi.org/10.1007/s11222-017-9724-z
DOI: https://doi.org/10.1007/s11222-017-9724-z