A Study of Interval Censoring in Parametric Regression Models
Abstract

Parametric models for interval censored data can now easily be fitted with minimal programming in certain standard statistical software packages. Regression equations can be introduced, both for the location and for the dispersion parameters. Finite mixture models can also be fitted, with a point mass on right (or left) censored observations, to allow for individuals who cannot have the event (or already have it). This mixing probability can also be allowed to follow a regression equation.
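As an illustration of the finite mixture idea, the following minimal sketch computes a log-likelihood in which right censored observations receive a point mass for individuals who cannot have the event. It rests on several assumptions not stated in the abstract: an exponential event-time distribution, a logistic link for the mixing probability, and the hypothetical names `mixture_loglik`, `rate`, `b0` and `b1`. It is not the authors' implementation.

```python
import math

def expit(x):
    """Inverse logit; the logistic link for the mixing probability is an assumption."""
    return 1.0 / (1.0 + math.exp(-x))

def exp_survivor(t, rate):
    """Survivor function of an exponential distribution (assumed for simplicity)."""
    return math.exp(-rate * t)

def mixture_loglik(data, rate, b0, b1):
    """Log-likelihood of a mixture with a point mass on right censored observations.

    data: list of (lo, hi, x); hi = math.inf marks a right censored observation,
    x is a single covariate in the regression for the mixing probability.
    """
    total = 0.0
    for lo, hi, x in data:
        p_never = expit(b0 + b1 * x)  # probability that the event cannot occur
        if math.isinf(hi):
            # Right censored: either never at risk, or at risk and surviving past lo.
            contrib = p_never + (1.0 - p_never) * exp_survivor(lo, rate)
        else:
            # Interval censored: at risk, with the event falling in (lo, hi].
            contrib = (1.0 - p_never) * (exp_survivor(lo, rate) - exp_survivor(hi, rate))
        total += math.log(contrib)
    return total

# Made-up data purely for illustration.
example = [(1.0, 2.0, 0.0), (0.5, 1.5, 1.0), (3.0, math.inf, 1.0)]
print(mixture_loglik(example, rate=0.5, b0=-1.0, b1=0.8))
```

In practice this function would be passed, with its sign reversed, to a general-purpose optimiser; the left censored case is analogous, with the point mass placed on individuals who already have the event.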

Here, models based on nine different distributions are compared for three examples of heavily censored data as well as a set of simulated data. We find that, for parametric models, interval censoring can often be ignored and that the density, evaluated at the centres of the intervals, can be used instead in the likelihood function, although the approximation is not always reliable. In the context of heavily interval censored data, the conclusions from parametric models are remarkably robust to changes in distributional assumptions and are generally more informative than those from the corresponding non-parametric models.
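The comparison between the exact interval censored likelihood and the density-at-centre approximation can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the Weibull distribution is chosen as one of the families listed in the keywords, and the function names, the `shape` and `scale` parameterisation, and the data are all hypothetical. The interval width is included as a factor in the approximation so that the two log-likelihoods are on the same scale; since the widths do not depend on the parameters, they do not affect the estimates.

```python
import math

def weibull_cdf(t, shape, scale):
    """Weibull distribution function F(t) = 1 - exp(-(t/scale)^shape)."""
    if t <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))

def weibull_pdf(t, shape, scale):
    """Weibull density f(t)."""
    if t <= 0.0:
        return 0.0
    z = t / scale
    return (shape / scale) * z ** (shape - 1.0) * math.exp(-(z ** shape))

def exact_loglik(intervals, shape, scale):
    """Exact log-likelihood: each interval (lo, hi] contributes F(hi) - F(lo)."""
    total = 0.0
    for lo, hi in intervals:
        if math.isinf(hi):  # right censored: survivor function at lo
            prob = 1.0 - weibull_cdf(lo, shape, scale)
        else:
            prob = weibull_cdf(hi, shape, scale) - weibull_cdf(lo, shape, scale)
        total += math.log(prob)
    return total

def midpoint_loglik(intervals, shape, scale):
    """Approximate log-likelihood: density at the interval centre times the width."""
    total = 0.0
    for lo, hi in intervals:
        if math.isinf(hi):  # right censored observations are treated exactly
            prob = 1.0 - weibull_cdf(lo, shape, scale)
        else:
            centre = 0.5 * (lo + hi)
            prob = weibull_pdf(centre, shape, scale) * (hi - lo)
        total += math.log(prob)
    return total

# Made-up intervals purely for illustration; the last one is right censored.
data = [(1.0, 1.1), (2.0, 2.1), (3.0, math.inf)]
print(exact_loglik(data, shape=1.5, scale=2.0))
print(midpoint_loglik(data, shape=1.5, scale=2.0))
```

For narrow intervals the two values agree closely. As intervals widen relative to the curvature of the density the agreement can deteriorate, which is consistent with the caution that the approximation is not always reliable.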

Keywords: AIC; dispersion regression; exponential distribution; finite mixture model; gamma distribution; intensity function; interval censoring; inverse Gaussian distribution; log Cauchy distribution; log Laplace distribution; log logistic distribution; log normal distribution; log Student distribution; normed profile likelihood; robustness; Weibull distribution

© Kluwer Academic Publishers 1998

Authors and Affiliations: 1. Biostatistics, Limburgs Universitair Centrum, Belgium