Abstract
One or a few observations can be highly influential on the Kaplan-Meier estimator, and consequently on the log-rank test statistic for comparing two survival functions. In this paper we derive case influence diagnostics for the Kaplan-Meier estimator and the log-rank test. We note that diagnostics in this context differ considerably from those in the regression context, where observations are usually assumed to be independent. Simulation studies are carried out to provide guidelines for identifying influential observations that deserve special attention. Illustrative examples are also given.
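To make the idea of case influence on the Kaplan-Meier estimator concrete, the following is a minimal sketch of a brute-force case-deletion check: it recomputes the estimate of S(t) with each observation removed in turn and reports the change. This is not the closed-form diagnostic derived in the paper; the function names `km_survival` and `deletion_influence` and the toy data in the usage note are assumptions made for illustration only.

```python
def km_survival(times, events, t):
    """Kaplan-Meier estimate of S(t) from right-censored data.

    times  : observed times X_i (failure or censoring time, whichever came first)
    events : 1 if X_i is an observed failure, 0 if it is censored
    """
    times, events = list(times), list(events)
    surv = 1.0
    # Distinct observed failure times no later than t, in increasing order.
    failure_times = sorted({x for x, d in zip(times, events) if d == 1 and x <= t})
    for u in failure_times:
        at_risk = sum(1 for x in times if x >= u)  # size of the risk set at u
        deaths = sum(1 for x, d in zip(times, events) if x == u and d == 1)
        surv *= 1.0 - deaths / at_risk
    return surv


def deletion_influence(times, events, t):
    """Hypothetical helper: S_hat(t) minus the estimate with case i deleted."""
    times, events = list(times), list(events)
    full = km_survival(times, events, t)
    influence = []
    for i in range(len(times)):
        times_wo_i = times[:i] + times[i + 1:]
        events_wo_i = events[:i] + events[i + 1:]
        influence.append(full - km_survival(times_wo_i, events_wo_i, t))
    return influence
```

For example, `deletion_influence([1, 2, 3, 4, 5], [1, 0, 1, 1, 0], t=3)` returns one value per observation. Deleting a case changes the risk sets at the other failure times, so even a censored observation, or one with X_i > t, can shift the estimate at t.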
Additional information
The first author was supported by the Korea Research Foundation Grant (KRF-2002-041-C00049), and he is a member of the Research Institute of Computer, Information and Communication at Pusan National University.
Appendix: Proof of Eq. (2.5)
For Xi ≤ t,
and
Also, for Xi > t,
which completes the proof.
Cite this article
Kim, C., Bae, W. Case influence diagnostics in the Kaplan-Meier estimator and the log-rank test. Computational Statistics 20, 521–534 (2005). https://doi.org/10.1007/BF02741312
DOI: https://doi.org/10.1007/BF02741312