## Abstract

We propose a new algorithm for computing the maximum likelihood estimate of a nonparametric survival function for interval-censored data, by extending the recently proposed constrained Newton method in a hierarchical fashion. The new algorithm exploits the fact that a mixture distribution can be recursively written as a mixture of mixtures, and takes a divide-and-conquer approach to break a large-scale constrained optimization problem down into many small-scale ones, each of which can be solved rapidly. During the course of optimization, the new algorithm, which we call the hierarchical constrained Newton method, can efficiently reallocate probability mass, both locally and globally, among potential support intervals. Its convergence is established theoretically through an equilibrium analysis. Numerical studies suggest that the new algorithm is the best choice for data sets of any size and for solutions with any number of support intervals.

## References

Bogaerts, K., Lesaffre, E.: A new, fast algorithm to find the regions of possible support for bivariate interval-censored data. J. Comput. Graph. Stat. **13**, 330–340 (2004)

Böhning, D.: A vertex-exchange-method in *D*-optimal design theory. Metrika **33**, 337–347 (1986)

Böhning, D., Schlattmann, P., Dietz, E.: Interval censored data: A note on the nonparametric maximum likelihood estimator of the distribution function. Biometrika **83**, 462–466 (1996)

Chen, L., Jha, P., Sirling, B., Sgaier, S.K., Daid, T., Kaul, R., Nagelkerke, N.: Sexual risk factors for HIV infection in early and advanced HIV epidemics in Sub-Saharan Africa: systematic overview of 68 epidemiological studies. PLoS ONE **2**, e1001 (2007)

Dax, A.: The smallest point of a polytope. J. Optim. Theory Appl. **64**, 429–432 (1990)

Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. B **39**, 1–22 (1977)

Dümbgen, L., Freitag-Wolf, S., Jongbloed, G.: Estimating a unimodal distribution from interval-censored data. J. Am. Stat. Assoc. **101**, 1094–1106 (2006)

Gentleman, R., Vandal, A.C.: Computational algorithms for censored-data problems using intersection graphs. J. Comput. Graph. Stat. **10**, 403–421 (2001)

Gentleman, R., Vandal, A.C.: Icens: NPMLE for censored and truncated data. R package version 1.18.0 (2009)

Groeneboom, P.: Nonparametric maximum likelihood estimators for interval censoring and deconvolution. Technical report 378, Department of Statistics, Stanford University (1991)

Groeneboom, P., Jongbloed, G., Wellner, J.A.: The support reduction algorithm for computing nonparametric function estimates in mixture models. Scand. J. Stat. **35**, 385–399 (2008)

Groeneboom, P., Wellner, J.A.: Information Bounds and Nonparametric Maximum Likelihood Estimation. Birkhäuser, Basel (1992)

Jongbloed, G.: The iterative convex minorant algorithm for nonparametric estimation. J. Comput. Graph. Stat. **7**, 301–321 (1998)

Kumwenda, N.I., Hoover, D.R., Mofenson, L.M., Thigpen, M.C., Kafulafula, G., Li, Q., Mipando, L., Nkanaunena, K., Mebrahtu, T., Bulterys, M., Fowler, M.G., Taha, T.E.: Extended antiretroviral prophylaxis to reduce breast-milk HIV-1 transmission. N. Engl. J. Med. **359**, 119–129 (2008)

Lawson, C.L., Hanson, R.J.: Solving Least Squares Problems. Prentice-Hall, New York (1974)

Lesperance, M.L., Kalbfleisch, J.D.: An algorithm for computing the nonparametric MLE of a mixing distribution. J. Am. Stat. Assoc. **87**, 120–126 (1992)

Lindsay, B.G.: Mixture Models: Theory, Geometry and Applications. NSF-CBMS Regional Conference Series in Probability and Statistics, vol. 5. Institute for Mathematical Statistics, Hayward (1995)

Luenberger, D.G., Ye, Y.: Linear and Nonlinear Programming, 3rd edn. Springer, New York (2008)

Maathuis, M.H.: Reduction algorithm for the NPMLE for the distribution of bivariate interval-censored data. J. Comput. Graph. Stat. **14**, 352–362 (2005)

Maathuis, M.H.: MLEcens: Computation of the MLE for bivariate (interval) censored data. R package version 0.1-2 (2007)

Peto, R.: Experimental survival curves for interval-censored data. Appl. Stat. **22**, 86–91 (1973)

Pilla, R.S., Lindsay, B.G.: Alternative EM methods for nonparametric finite mixture models. Biometrika **88**, 535–550 (2001)

Siegfried, N., Clarke, M., Volmink, J.: Randomised controlled trials in Africa of HIV and AIDS: descriptive study and spatial distribution. BMJ **331**, 742 (2005)

Sun, J.: The Statistical Analysis of Interval-censored Failure Time Data. Springer, Berlin (2006)

Turnbull, B.W.: Nonparametric estimation of a survivorship function with doubly censored data. J. Am. Stat. Assoc. **69**, 169–173 (1974)

Turnbull, B.W.: The empirical distribution function with arbitrarily grouped, censored and truncated data. J. R. Stat. Soc. B **38**, 290–295 (1976)

Wang, Y.: On fast computation of the non-parametric maximum likelihood estimate of a mixing distribution. J. R. Stat. Soc. B **69**, 185–198 (2007)

Wang, Y.: Dimension-reduced nonparametric maximum likelihood computation for interval-censored data. Comput. Stat. Data Anal. **52**, 2388–2402 (2008)

Wellner, J.A., Zhan, Y.: A hybrid algorithm for computation of the nonparametric maximum likelihood estimator from censored data. J. Am. Stat. Assoc. **92**, 945–959 (1997)

Wong, G.Y., Yu, Q.: Generalized MLE of a joint distribution function with multivariate interval-censored data. J. Multivar. Anal. **69**, 155–166 (1999)

Wu, C.F.: Some algorithmic aspects of the theory of optimal designs. Ann. Stat. **6**, 1286–1301 (1978)

## Acknowledgements

The authors thank the Editor, the Associate Editor and two reviewers for many constructive comments, and are grateful to Bruce Lindsay for helpful suggestions. This research was supported by a Marsden grant of the Royal Society of New Zealand (9145/3608546).

## Appendix: Linear regression over a simplex

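Consider the constrained least squares problem:

\[\text{minimize } \Vert \mathbf{A}\mathbf{x}-\mathbf{b}\Vert^{2}, \quad \text{subject to } \mathbf{x}^{\top}\mathbf{1}=\delta,\ \mathbf{x}\ge \mathbf{0},\]

for *δ*>0. It can be solved by the NNLS algorithm of Lawson and Hanson (1974), after a transformation suggested by Dax (1990). Letting **y**=**x**/*δ* and **c**=**b**/*δ*, it is apparent that the problem is equivalent to

\[\text{minimize } \Vert \mathbf{A}\mathbf{y}-\mathbf{c}\Vert^{2}, \quad \text{subject to } \mathbf{y}^{\top}\mathbf{1}=1,\ \mathbf{y}\ge \mathbf{0},\]

which is further equivalent to

\[\text{minimize } \Vert \mathbf{P}\mathbf{y}\Vert^{2}, \quad \text{subject to } \mathbf{y}^{\top}\mathbf{1}=1,\ \mathbf{y}\ge \mathbf{0}, \tag{22}\]

where **P**=**A**−(**c**,…,**c**), since **Ay**−**c**=**Py** whenever the components of **y** sum to one. The solution to problem (22) can be found by solving the following least squares problem with only non-negativity constraints:

\[\text{minimize } \Vert \mathbf{P}\mathbf{y}\Vert^{2}+\vert \mathbf{y}^{\top}\mathbf{1}-1\vert^{2}, \quad \text{subject to } \mathbf{y}\ge \mathbf{0}. \tag{23}\]

By relating the Karush-Kuhn-Tucker conditions for both problems, Dax established that if \({\tilde {\mathbf {y}}}\) solves problem (23), then \({\tilde {\mathbf {y}}}/({\tilde {\mathbf {y}}}^{{\top }}\mathbf {1})\) solves problem (22).

Problem (23) can be solved by the NNLS algorithm of Lawson and Hanson (1974).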
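The transformation above can be illustrated with a minimal Python sketch. It assumes that the original problem is to minimize ‖**Ax**−**b**‖² subject to **x**≥**0** and **x**ᵀ**1**=*δ*, and that problem (23) minimizes ‖**Py**‖²+|**y**ᵀ**1**−1|² subject to **y**≥**0**. The function name `simplex_least_squares` is hypothetical, and SciPy's `scipy.optimize.nnls` stands in for the NNLS routine; this is not the authors' implementation.

```python
import numpy as np
from scipy.optimize import nnls


def simplex_least_squares(A, b, delta=1.0):
    """Sketch: minimize ||A x - b||^2 subject to x >= 0 and sum(x) == delta.

    Assumes delta > 0. The name and interface are illustrative only.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n_rows, n_cols = A.shape

    # Rescale so that the constraint becomes sum(y) == 1.
    c = b / delta

    # P = A - (c, ..., c): subtract c from every column of A.
    P = A - c[:, None]

    # Stack a row of ones under P so that
    # ||M y - rhs||^2 = ||P y||^2 + (sum(y) - 1)^2,
    # which leaves only the non-negativity constraint.
    M = np.vstack([P, np.ones((1, n_cols))])
    rhs = np.concatenate([np.zeros(n_rows), [1.0]])

    y_tilde, _ = nnls(M, rhs)

    # Normalize so the components sum to one, then undo the rescaling.
    y = y_tilde / y_tilde.sum()
    return delta * y


if __name__ == "__main__":
    A = np.eye(2)
    b = np.array([2.0, 0.0])
    print(simplex_least_squares(A, b, delta=1.0))
```

Stacking the row of ones turns the sum-to-one penalty into one extra least squares residual, so an unmodified NNLS solver can be applied directly.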

## About this article

### Cite this article

Wang, Y., Taylor, S.M. Efficient computation of nonparametric survival functions via a hierarchical mixture formulation. *Stat Comput* **23**, 713–725 (2013). https://doi.org/10.1007/s11222-012-9341-9

### Keywords

- Nonparametric maximum likelihood
- Survival function
- Interval censoring
- Clinical trial
- Constrained Newton method
- Disease-free survival