Survival analysis: up from Kaplan–Meier–Greenwood

  • Methods
European Journal of Epidemiology

Abstract

In the type of survival analysis that now is routine, only the points of follow-up at which deaths from the cause at issue occurred make contributions to the Greenwood standard error (SE) of the survival rate’s Kaplan–Meier (KM) point estimate. An equivalent of this ‘KMG’ analysis draws from defined subintervals of the survival period being addressed. The data on each subinterval consist of the number of deaths from the cause at issue and the amount of population–time of follow-up, \(d_{j}\) and \(T_{j}\), together with the duration of the interval, \(t_{j}\). The KM point estimate is replicated by \( \exp [-\sum\nolimits_{j} (d_{j}/T_{j})\,t_{j}], \) and the KMG interval estimate is replicated by treating the \(\{d_{j}\}\) as a set of point estimates of Poisson parameters \(\{\lambda_{j}\}\), thus taking the SE of \( \sum\nolimits_{j} (d_{j}/T_{j})\,t_{j} \) to be \( [\sum\nolimits_{j} d_{j}\,(t_{j}/T_{j})^{2}]^{1/2}. \) In both the KMG analysis and this equivalent of it, the SE used to derive the survival rate’s lower confidence limit needs to be augmented by a factor that accounts for the loss of information due to censorings subsequent to the last ‘failure’ in the survival period at issue. But SE-based interval estimation of survival rate actually needs to be replaced by a first-principles counterpart of it. A suitable point of departure in this is first-principles asymptotic interval estimation of the Poisson parameter \( \lambda = \sum\nolimits_{j} \lambda_{j}, \) if not the exact counterpart of this. A confidence limit for the survival rate can then be based on suitable augmentation or contraction of the \(\{d_{j}\}\) set to \( \{d_{j}^{*}\} \) consistent with a given limit for \(\lambda\), the corresponding survival-rate limit being \( \exp [-\sum\nolimits_{j} (d_{j}^{*}/T_{j})\,t_{j}]. \) Suitable augmentation is constituted by an identical addition to each \(d_{j}^{1/2},\) suitable contraction by an identical subtraction from each \(d_{j}^{1/2} \ge 1.\)
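
As a rough illustration of the two formulas in the abstract, the following Python sketch computes the point estimate \( \exp [-\sum_{j} (d_{j}/T_{j})\,t_{j}] \) and the Poisson-based SE of the sum in its exponent. The function name, its return layout, and the sample numbers are assumptions made for illustration only; they do not come from the article, and the sketch does not attempt the first-principles interval estimation that the abstract goes on to recommend.

```python
import math

def interval_survival_estimate(d, T, t):
    """Point estimate and SE from per-subinterval data (illustrative sketch).

    d: deaths from the cause at issue in each subinterval (d_j)
    T: population-time of follow-up in each subinterval (T_j)
    t: duration of each subinterval (t_j)
    """
    if not (len(d) == len(T) == len(t)):
        raise ValueError("d, T and t must have the same length")
    if any(Tj <= 0 for Tj in T):
        raise ValueError("population-time must be positive in every subinterval")

    # Sum over subintervals of (d_j / T_j) * t_j
    cumulative = sum(dj / Tj * tj for dj, Tj, tj in zip(d, T, t))
    # SE of that sum, treating each d_j as a Poisson count
    se_cumulative = math.sqrt(sum(dj * (tj / Tj) ** 2 for dj, Tj, tj in zip(d, T, t)))
    survival = math.exp(-cumulative)
    return survival, cumulative, se_cumulative

# Hypothetical data: three one-year subintervals (not taken from the article)
d = [2, 1, 3]
T = [95.0, 80.5, 60.0]
t = [1.0, 1.0, 1.0]
survival, cumulative, se_cumulative = interval_survival_estimate(d, T, t)
print(f"survival estimate: {survival:.4f}")
print(f"sum of (d_j/T_j) t_j: {cumulative:.4f}, SE: {se_cumulative:.4f}")
```

Note that the SE returned here is that of the sum in the exponent, as in the abstract, not of the survival rate itself.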

References

  1. The International Early Lung Cancer Action Program investigators. Survival of patients with Stage I lung cancer detected on CT screening. N Engl J Med. 2006;355(17):1763–71. doi:10.1056/NEJMoa060476.

  2. Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. J Am Stat Assoc. 1958;53(282):457–81. doi:10.2307/2281868.

  3. Greenwood M. A report on the natural duration of cancer. In: Reports on public health and medical subjects, vol. 33. His Majesty’s Stationery Office: London; 1926. p. 1–26.

  4. Miettinen OS. Estimability and estimation in case-referent studies. Am J Epidemiol. 1976;103:30–6.

  5. Borkowf CB. A simple hybrid variance estimator for the Kaplan–Meier survival function. Stat Med. 2005;24:827–51. doi:10.1002/sim.1960.

  6. Peto R, Pike MC, Armitage P, Breslow NE, Cox DR, Howard SV, et al. Design and analysis of randomized clinical trials requiring prolonged observation of each patient. II. Analysis and examples. Br J Cancer. 1977;35(1):1–39.

  7. Barber S, Jennison C. A review of inferential methods for the Kaplan-Meier estimator. In: Research Report 98: 02, Statistics Group, University of Bath, UK. 1998. http://www.maths.leeds.ac.uk/~stuart/research/publications.html.

  8. Collett D. Modelling survival data in medical research. 2nd ed. Boca Raton: Chapman & Hall/CRC; 2003.


Correspondence to Olli S. Miettinen.

Appendix 1

The Nelson–Aalen estimator

An eminent alternative to the KMG statistics is now constituted by the Nelson–Aalen (NA) statistics [8], which, like the statistics proposed here, are based on consideration of the (‘cause-specific’) survival rate as the complement of cumulative incidence based on the integral of incidence density (ID). In the NA approach, the ID integral is derived as \( \sum_{i} d_{i}/S_{i}, \) where \(d_{i}\) is the number of deaths/failures at a point in follow-up time when \(S_{i}\) survivors were at risk for that outcome (and \(d_{i}\) of them experienced this event). Thus the point estimate of the survival rate is taken to be \( \widehat{\hbox{SR}} = \exp \left(-\sum_{i} d_{i}/S_{i} \right). \) The value of this is always somewhat higher than that of the corresponding KM estimate [8].

As a simple example, we might have \(d_{1}/S_{1}=2/6\) together with only \(d_{2}/S_{2}=2/4,\) this in the absence of any censorings. The correct point estimate in this case may be taken to be the binomial one: \( \widehat{\hbox{SR}}=2/6=0.33.\) The corresponding NA estimate is \(\exp{[-(2/6+2/4)]}=0.43,\) while the KM estimate is \((4/6)(2/4)=2/6=0.33.\)
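
The arithmetic of this example can be checked with a short Python sketch. The function names are chosen here for illustration and are not from the article; the data are the two failure points given above, with no censoring.

```python
import math

def nelson_aalen_survival(d, S):
    """NA survival estimate: exp(-sum_i d_i / S_i)."""
    return math.exp(-sum(di / Si for di, Si in zip(d, S)))

def kaplan_meier_survival(d, S):
    """KM survival estimate: product over failure points of (1 - d_i / S_i)."""
    return math.prod(1 - di / Si for di, Si in zip(d, S))

# The example from the text: d_1/S_1 = 2/6 and d_2/S_2 = 2/4, no censorings
d = [2, 2]
S = [6, 4]

print(round(nelson_aalen_survival(d, S), 2))  # 0.43
print(round(kaplan_meier_survival(d, S), 2))  # 0.33
print(round(2 / 6, 2))                        # binomial estimate: 0.33
```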

One way to arrive at the NA estimator, based on data in the form of \(\{S_{i}, d_{i}\}\), is to focus on time elements of duration \({\text{d}}t\) backward (sic) from each of the failure times (indexed by i = 1, 2, …). From the ith one of these time elements the contribution to the ID integral is \( [d_{i}/(S_{i}\,{\text{d}}t)]\,{\text{d}}t = d_{i}/S_{i}. \) As only these time elements contribute to the ID integral, the latter is \( \sum_{i} d_{i}/S_{i}. \)

If, however, we consider time elements of duration \(2\,{\text{d}}t\) and, specifically, of duration \({\text{d}}t\) forward as well as backward from each of the failure times, as is natural, then the ID integral becomes \( \sum\nolimits_{i} \{d_{i}/[S_{i}\,{\text{d}}t + (S_{i} - d_{i})\,{\text{d}}t]\}\,2\,{\text{d}}t = \sum\nolimits_{i} d_{i}/(S_{i} - \tfrac{1}{2} d_{i}). \) In the simple example above, this modification replaces the NA estimate, 0.43 (above), by 0.34 \((\simeq 2/6).\)
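
A minimal Python sketch of this modified estimator, applied to the same two failure points, reproduces the quoted value. The function name is an assumption made for illustration; the article does not name the modified estimator.

```python
import math

def modified_na_survival(d, S):
    """Survival estimate exp(-sum_i d_i / (S_i - d_i / 2)).

    Uses the population at risk averaged over dt backward (S_i)
    and dt forward (S_i - d_i) from each failure time.
    """
    return math.exp(-sum(di / (Si - 0.5 * di) for di, Si in zip(d, S)))

# Same example as in the text: d_1/S_1 = 2/6 and d_2/S_2 = 2/4
d = [2, 2]
S = [6, 4]

# ID integral: 2/5 + 2/3 = 16/15
print(round(modified_na_survival(d, S), 2))  # 0.34
```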

For the here-proposed approach, the NA estimator has the virtue of suggesting that the population–time for the jth interval (with \(d_{j} > 0\)) can be derived from the usual \(\{S_{i}, d_{i}\}\) data, supplemented by the timings of the \(\{d_{i}\}\), as \( T_{j} = t_{j} \sum\nolimits_{i} d_{ij} \big/ \sum\nolimits_{i} [d_{ij}/(S_{ij} - \tfrac{1}{2} d_{ij})], \) the \(\{S_{ij}, d_{ij}\}\) constituting the set falling in the jth interval of follow-up time.
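
The derivation of \(T_{j}\) can be sketched in Python as follows. The function name and the choice of an interval duration of 1 time unit are assumptions for illustration; the failure data are those of the simple example above, treated as all falling within a single interval.

```python
def population_time(t_j, d, S):
    """Population-time T_j for one interval of duration t_j.

    d, S: the numbers of failures d_ij and of survivors at risk S_ij at the
    failure points that fall within the interval (requires sum(d) > 0).
    """
    total_deaths = sum(d)
    if total_deaths <= 0:
        raise ValueError("the interval must contain at least one failure")
    # ID integral over the interval: sum_i d_ij / (S_ij - d_ij / 2)
    id_integral = sum(dij / (Sij - 0.5 * dij) for dij, Sij in zip(d, S))
    return t_j * total_deaths / id_integral

# Simple example from the text, with an assumed interval duration of 1
T_j = population_time(1.0, [2, 2], [6, 4])
print(T_j)

# By construction, (d_j / T_j) * t_j equals the ID integral for the interval
d_j = 4
print(d_j / T_j * 1.0)
```

With \(T_{j}\) defined this way, the interval's contribution \((d_{j}/T_{j})\,t_{j}\) to the exponent of the survival estimate equals \( \sum_{i} d_{ij}/(S_{ij} - \tfrac{1}{2} d_{ij}) \) for that interval.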

Miettinen, O.S. Survival analysis: up from Kaplan–Meier–Greenwood. Eur J Epidemiol 23, 585–592 (2008). https://doi.org/10.1007/s10654-008-9278-7
