Efficient computation of nonparametric survival functions via a hierarchical mixture formulation

Statistics and Computing

Abstract

We propose a new algorithm for computing the maximum likelihood estimate of a nonparametric survival function for interval-censored data, by extending the recently proposed constrained Newton method in a hierarchical fashion. The new algorithm exploits the fact that a mixture distribution can be written recursively as a mixture of mixtures, and takes a divide-and-conquer approach that breaks a large-scale constrained optimization problem into many small-scale ones, each of which can be solved rapidly. During optimization, the new algorithm, which we call the hierarchical constrained Newton method, can efficiently reallocate probability mass, both locally and globally, among potential support intervals. Its convergence is established theoretically through an equilibrium analysis. Numerical studies suggest that the new algorithm is the best choice for data sets of any size and for solutions with any number of support intervals.

References

  • Bogaerts, K., Lesaffre, E.: A new, fast algorithm to find the regions of possible support for bivariate interval-censored data. J. Comput. Graph. Stat. 13, 330–340 (2004)

  • Böhning, D.: A vertex-exchange-method in D-optimal design theory. Metrika 33, 337–347 (1986)

  • Böhning, D., Schlattmann, P., Dietz, E.: Interval censored data: A note on the nonparametric maximum likelihood estimator of the distribution function. Biometrika 83, 462–466 (1996)

  • Chen, L., Jha, P., Stirling, B., Sgaier, S.K., Daid, T., Kaul, R., Nagelkerke, N.: Sexual risk factors for HIV infection in early and advanced HIV epidemics in Sub-Saharan Africa: systematic overview of 68 epidemiological studies. PLoS ONE 2, e1001 (2007)

  • Dax, A.: The smallest point of a polytope. J. Optim. Theory Appl. 64, 429–432 (1990)

  • Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. B 39, 1–22 (1977)

  • Dümbgen, L., Freitag-Wolf, S., Jongbloed, G.: Estimating a unimodal distribution from interval-censored data. J. Am. Stat. Assoc. 101, 1094–1106 (2006)

  • Gentleman, R., Vandal, A.C.: Computational algorithms for censored-data problems using intersection graphs. J. Comput. Graph. Stat. 10, 403–421 (2001)

  • Gentleman, R., Vandal, A.C.: Icens: NPMLE for censored and truncated data. R package version 1.18.0 (2009)

  • Groeneboom, P.: Nonparametric maximum likelihood estimators for interval censoring and deconvolution. Technical report 378, Department of Statistics, Stanford University (1991)

  • Groeneboom, P., Jongbloed, G., Wellner, J.A.: The support reduction algorithm for computing nonparametric function estimates in mixture models. Scand. J. Stat. 35, 385–399 (2008)

  • Groeneboom, P., Wellner, J.A.: Information Bounds and Nonparametric Maximum Likelihood Estimation. Birkhäuser, Basel (1992)

  • Jongbloed, G.: The iterative convex minorant algorithm for nonparametric estimation. J. Comput. Graph. Stat. 7, 301–321 (1998)

  • Kumwenda, N.I., Hoover, D.R., Mofenson, L.M., Thigpen, M.C., Kafulafula, G., Li, Q., Mipando, L., Nkanaunena, K., Mebrahtu, T., Bulterys, M., Fowler, M.G., Taha, T.E.: Extended antiretroviral prophylaxis to reduce breast-milk HIV-1 transmission. N. Engl. J. Med. 359, 119–129 (2008)

  • Lawson, C.L., Hanson, R.J.: Solving Least Squares Problems. Prentice-Hall, New York (1974)

  • Lesperance, M.L., Kalbfleisch, J.D.: An algorithm for computing the nonparametric MLE of a mixing distribution. J. Am. Stat. Assoc. 87, 120–126 (1992)

  • Lindsay, B.G.: Mixture Models: Theory, Geometry and Applications. NSF-CBMS Regional Conference Series in Probability and Statistics, vol. 5. Institute of Mathematical Statistics, Hayward (1995)

  • Luenberger, D.G., Ye, Y.: Linear and Nonlinear Programming, 3rd edn. Springer, New York (2008)

  • Maathuis, M.H.: Reduction algorithm for the NPMLE for the distribution of bivariate interval-censored data. J. Comput. Graph. Stat. 14, 352–362 (2005)

  • Maathuis, M.H.: MLEcens: Computation of the MLE for bivariate (interval) censored data. R package version 0.1-2. (2007)

  • Peto, R.: Experimental survival curves for interval-censored data. Appl. Stat. 22, 86–91 (1973)

  • Pilla, R.S., Lindsay, B.G.: Alternative EM methods for nonparametric finite mixture models. Biometrika 88, 535–550 (2001)

  • Siegfried, N., Clarke, M., Volmink, J.: Randomised controlled trials in Africa of HIV and AIDS: descriptive study and spatial distribution. BMJ 331, 742 (2005)

  • Sun, J.: The Statistical Analysis of Interval-censored Failure Time Data. Springer, Berlin (2006)

  • Turnbull, B.W.: Nonparametric estimation of a survivorship function with doubly censored data. J. Am. Stat. Assoc. 69, 169–173 (1974)

  • Turnbull, B.W.: The empirical distribution function with arbitrarily grouped, censored and truncated data. J. R. Stat. Soc. B 38, 290–295 (1976)

  • Wang, Y.: On fast computation of the non-parametric maximum likelihood estimate of a mixing distribution. J. R. Stat. Soc. B 69, 185–198 (2007)

  • Wang, Y.: Dimension-reduced nonparametric maximum likelihood computation for interval-censored data. Comput. Stat. Data Anal. 52, 2388–2402 (2008)

  • Wellner, J.A., Zhan, Y.: A hybrid algorithm for computation of the nonparametric maximum likelihood estimator from censored data. J. Am. Stat. Assoc. 92, 945–959 (1997)

  • Wong, G.Y., Yu, Q.: Generalized MLE of a joint distribution function with multivariate interval-censored data. J. Multivar. Anal. 69, 155–166 (1999)

  • Wu, C.F.: Some algorithmic aspects of the theory of optimal designs. Ann. Stat. 6, 1286–1301 (1978)

Acknowledgements

The authors thank the Editor, the Associate Editor and two reviewers for many constructive comments, and are grateful to Bruce Lindsay for helpful suggestions. This research was supported by a Marsden grant of the Royal Society of New Zealand (9145/3608546).

Author information

Corresponding author

Correspondence to Yong Wang.

Appendix: Linear regression over a simplex

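Consider the constrained least squares problem:

\[\min_{\mathbf{x}}\ \|A\mathbf{x}-\mathbf{b}\|^{2} \quad \text{subject to} \quad \mathbf{x}^{\top}\mathbf{1}=\delta,\ \mathbf{x}\ge \mathbf{0},\]

for \(\delta>0\). It can be solved by the NNLS algorithm of Lawson and Hanson (1974), after a transformation suggested by Dax (1990). Letting \(\mathbf{y}=\mathbf{x}/\delta\) and \(\mathbf{c}=\mathbf{b}/\delta\), it is apparent that the problem is equivalent to

\[\min_{\mathbf{y}}\ \|A\mathbf{y}-\mathbf{c}\|^{2} \quad \text{subject to} \quad \mathbf{y}^{\top}\mathbf{1}=1,\ \mathbf{y}\ge \mathbf{0},\]

which is further equivalent to

\[\min_{\mathbf{y}}\ \|P\mathbf{y}\|^{2} \quad \text{subject to} \quad \mathbf{y}^{\top}\mathbf{1}=1,\ \mathbf{y}\ge \mathbf{0}, \tag{22}\]

where \(P=A-(\mathbf{c},\ldots,\mathbf{c})\), since \(A\mathbf{y}-\mathbf{c}=(A-\mathbf{c}\mathbf{1}^{\top})\mathbf{y}=P\mathbf{y}\) whenever \(\mathbf{y}^{\top}\mathbf{1}=1\). The solution to problem (22) can be found by solving the following least squares problem with only non-negativity constraints:

\[\min_{\mathbf{y}}\ \|P\mathbf{y}\|^{2}+|\mathbf{y}^{\top}\mathbf{1}-1|^{2} \quad \text{subject to} \quad \mathbf{y}\ge \mathbf{0}. \tag{23}\]

By relating the Karush-Kuhn-Tucker conditions for both problems, Dax established that if \({\tilde {\mathbf {y}}}\) solves problem (23), then \({\tilde {\mathbf {y}}}/{\tilde {\mathbf {y}}}^{{\top }}\mathbf {1}\) solves problem (22).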

Problem (23) can be solved by the NNLS algorithm of Lawson and Hanson (1974).
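
As an illustration only, the transformation can be sketched in Python. This is not the authors' implementation: the function name `simplex_least_squares` is hypothetical, and the sketch assumes that problem (23) is posed as an ordinary NNLS problem by stacking a row of ones under P, with SciPy's `scipy.optimize.nnls` standing in for the Lawson and Hanson routine.

```python
import numpy as np
from scipy.optimize import nnls


def simplex_least_squares(A, b, delta=1.0):
    """Sketch: minimize ||A x - b||^2 subject to x >= 0 and sum(x) = delta.

    Hypothetical helper following the transformation attributed to
    Dax (1990) in the appendix; assumes delta > 0.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    m, n = A.shape

    c = b / delta                 # rescale so the constraint becomes sum(y) = 1
    P = A - c[:, None]            # P = A - (c, ..., c): subtract c from every column

    # ||E y - f||^2 = ||P y||^2 + (1'y - 1)^2, the objective of the
    # non-negativity-only problem.
    E = np.vstack([P, np.ones((1, n))])
    f = np.zeros(m + 1)
    f[-1] = 1.0

    y_tilde, _ = nnls(E, f)       # non-negative least squares solve
    y = y_tilde / y_tilde.sum()   # normalize to recover the simplex solution
    return delta * y              # map back to the original scale


# Usage: with A the identity, the problem is the Euclidean projection of b
# onto the unit simplex; for b = (2, 0) that projection is (1, 0).
x = simplex_least_squares(np.eye(2), [2.0, 0.0])
print(x)
```

Stacking the row of ones keeps the sketch to a single NNLS call, and the final normalization is the step that turns a solution of problem (23) into a solution of problem (22).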

About this article

Cite this article

Wang, Y., Taylor, S.M. Efficient computation of nonparametric survival functions via a hierarchical mixture formulation. Stat Comput 23, 713–725 (2013). https://doi.org/10.1007/s11222-012-9341-9

  • DOI: https://doi.org/10.1007/s11222-012-9341-9
