Z-estimation and stratified samples: application to survival models
The infinite dimensional Z-estimation theorem offers a systematic approach to joint estimation of both Euclidean and non-Euclidean parameters in probability models for data. It is easily adapted for stratified sampling designs. This is important in applications to censored survival data because the inverse probability weights that modify the standard estimating equations often depend on the entire follow-up history. Since the weights are not predictable, they complicate the usual theory based on martingales. This paper considers joint estimation of regression coefficients and baseline hazard functions in the Cox proportional and Lin–Ying additive hazards models. Weighted likelihood equations are used for the former and weighted estimating equations for the latter. Regression coefficients and baseline hazards may be combined to estimate individual survival probabilities. Efficiency is improved by calibrating or estimating the weights using information available for all subjects. Although inefficient in comparison with likelihood inference for incomplete data, which is often difficult to implement, the approach provides consistent estimates of desired population parameters even under model misspecification.
Keywords: Semiparametric models; Proportional hazards; Additive hazards; Calibration of sampling weights; Model misspecification; Survey sampling
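As a rough illustration of the weighted likelihood approach described in the abstract, the sketch below solves an inverse-probability-weighted Cox partial likelihood score for a single covariate, forms a weighted Breslow-type estimate of the cumulative baseline hazard, and combines the two into an individual survival probability. It is a minimal sketch and not the paper's implementation: the function names (`weighted_cox_fit`, `weighted_breslow`, `survival_prob`) are hypothetical, the covariate is assumed scalar and time-fixed, ties are handled in the Breslow manner, and the weights are assumed to be known inverse sampling probabilities rather than calibrated or estimated ones. No variance estimation is attempted.

```python
import numpy as np

def weighted_cox_fit(time, event, z, w, n_iter=25, tol=1e-10):
    """Newton iterations for the weighted partial-likelihood score (scalar z).

    Assumption: w[i] is the inverse sampling probability of subject i.
    """
    time, event, z, w = (np.asarray(a, dtype=float) for a in (time, event, z, w))
    # at_risk[i, j] is 1.0 when subject j is still at risk at subject i's time
    at_risk = (time[None, :] >= time[:, None]).astype(float)
    beta = 0.0
    for _ in range(n_iter):
        r = w * np.exp(beta * z)          # weighted relative risks
        s0 = at_risk @ r                  # weighted risk-set sums
        s1 = at_risk @ (r * z)
        s2 = at_risk @ (r * z ** 2)
        zbar = s1 / s0                    # weighted risk-set mean of z
        score = np.sum(w * event * (z - zbar))
        info = np.sum(w * event * (s2 / s0 - zbar ** 2))
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta

def weighted_breslow(time, event, z, w, beta):
    """Weighted Breslow-type estimate of the cumulative baseline hazard."""
    time, event, z, w = (np.asarray(a, dtype=float) for a in (time, event, z, w))
    r = w * np.exp(beta * z)
    event_times = np.unique(time[event == 1])
    # weighted number of events and weighted risk-set total at each event time
    d = np.array([np.sum(w[(time == t) & (event == 1)]) for t in event_times])
    s0 = np.array([np.sum(r[time >= t]) for t in event_times])
    return event_times, np.cumsum(d / s0)

def survival_prob(t, z0, beta, event_times, cumhaz):
    """S(t | z0) = exp(-Lambda0(t) * exp(beta * z0)) under the Cox model."""
    idx = np.searchsorted(event_times, t, side="right")
    lam = cumhaz[idx - 1] if idx > 0 else 0.0
    return float(np.exp(-lam * np.exp(beta * z0)))

# Toy data (made up for illustration): follow-up times, event indicators,
# a binary covariate, and inverse sampling probabilities for three strata.
time = [1, 2, 3, 4, 5, 6]
event = [1, 1, 0, 1, 1, 1]
z = [1, 0, 1, 0, 1, 0]
w = [1, 1, 2, 2, 4, 4]

beta_hat = weighted_cox_fit(time, event, z, w)
event_times, cumhaz = weighted_breslow(time, event, z, w, beta_hat)
print(beta_hat, survival_prob(3.5, 1.0, beta_hat, event_times, cumhaz))
```

Because the weights enter the score and the information in the same way, rescaling all weights by a constant leaves the coefficient estimate unchanged; only their relative sizes across strata matter for the point estimate.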
Wellner’s research was supported in part by National Science Foundation Grant DMS-1104832 and National Institute of Allergy and Infectious Diseases Grant 2R01 AI291968-04. Dedicated to Niels Keiding on the occasion of his 70th birthday.