Coarsening at Random: Characterizations, Conjectures, Counter-Examples
The notion of coarsening at random (CAR) was introduced by Heitjan and Rubin (1991) to describe the most general form of randomly grouped, censored, or missing data for which the coarsening mechanism can be ignored when making likelihood-based inference about the parameters of the distribution of the variable of interest. The CAR assumption is popular, and applications abound. However, the full implications of the assumption have not been realized. Moreover, a satisfactory theory of CAR for continuously distributed data (which is needed in many applications, particularly in survival analysis) hardly exists as yet. This paper gives a detailed study of CAR. We show that grouped data from a finite sample space always fit a CAR model: a nonparametric model for the variable of interest, together with the assumption of an arbitrary CAR mechanism, puts no restriction at all on the distribution of the observed data. In a slogan, CAR is everything. We describe what would seem to be the most general way CAR data could occur in practice, a sequential procedure called randomized monotone coarsening. We show that CAR mechanisms exist which are not of this type. Such a coarsening mechanism uses information about the underlying data which is not revealed to the observer, without this affecting the observer's conclusions. In a second slogan, CAR is more than it seems. This implies that if the analyst can argue from subject-matter considerations that coarsened data are CAR, he or she has knowledge about the structure of the coarsening mechanism which can be put to good use in non-likelihood-based inference procedures. We argue that this is a valuable option in multivariate survival analysis. We give a new definition of CAR in general sample spaces, criticising earlier proposals, and we establish parallel results to the discrete case. The new definition focusses on the distribution rather than the density of the data. It allows us to generalise the theory of CAR to the important situation where coarsening variables (e.g., censoring times) are partially observed as well as the variables of interest.
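As an illustration of the discrete case only, the usual CAR condition on a finite sample space says that the probability of observing a set A is the same for every underlying value x contained in A. The sketch below is not taken from the paper: the function name `is_car`, the dictionary encoding of a coarsening mechanism, and the two toy mechanisms are all assumptions made for this example.

```python
from fractions import Fraction as F

def is_car(mechanism):
    """Check the finite-sample-space CAR condition (illustrative helper).

    mechanism[x][A] is P(observe set A | X = x), with A a frozenset.
    Assumes every element of every observed set is a key of `mechanism`.
    """
    for x, dist in mechanism.items():
        if sum(dist.values()) != 1:
            raise ValueError(f"probabilities for x={x!r} do not sum to 1")
        # The observed set must always contain the true value.
        if any(p > 0 and x not in A for A, p in dist.items()):
            return False
    observed_sets = {A for dist in mechanism.values() for A in dist}
    for A in observed_sets:
        # CAR: P(observe A | X = x) must not depend on x, for x in A.
        probs = {mechanism[x].get(A, F(0)) for x in A}
        if len(probs) > 1:
            return False
    return True

s = frozenset  # shorthand for building observed sets

# Toy mechanism on {1, 2, 3} that satisfies the condition.
car_mechanism = {
    1: {s({1}): F(2, 3), s({1, 2}): F(1, 3)},
    2: {s({1, 2}): F(1, 3), s({2, 3}): F(2, 3)},
    3: {s({2, 3}): F(2, 3), s({3}): F(1, 3)},
}

# Toy mechanism that violates it: {1, 2} is reported with
# probability 1 when X = 1 but only 1/2 when X = 2.
not_car_mechanism = {
    1: {s({1, 2}): F(1)},
    2: {s({1, 2}): F(1, 2), s({2}): F(1, 2)},
    3: {s({3}): F(1)},
}

print(is_car(car_mechanism), is_car(not_car_mechanism))
```

Exact fractions are used so that the equality checks are not affected by floating-point rounding.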
Keywords: Conditional Distribution; Marginal Distribution; Sample Space; Discrete Case; Semiparametric Model
- P.J. Bickel, C.A.J. Klaassen, Y. Ritov and J.A. Wellner (1993), Efficient and Adaptive Inference in Semi-parametric Models, Johns Hopkins University Press, Baltimore.
- J.T. Chang and D. Pollard (1997), Conditioning as disintegration, Statistica Neerlandica 51 (to appear).
- R.D. Gill and J.M. Robins (1997), Sequential models for coarsening and missingness, Proc. First Seattle Symposium on Biostatistics: Survival Analysis, ed. D.Y. Lin, Springer-Verlag.
- M.J. van der Laan (1993), Efficient and Inefficient Estimation in Semiparametric Models, Ph.D. Thesis, Dept. Mathematics, University Utrecht; reprinted (1995) as CWI tract 114, Centre for Mathematics and Computer Science, Amsterdam.
- S.F. Nielsen (1996), Incomplete Observations and Coarsening at Random, preprint, Institute of Mathematical Statistics, Univ. of Copenhagen.
- J.M. Robins (1996a), Locally efficient median regression with random censoring and surrogate markers, pp. 263–274 in: Lifetime Data: Models in Reliability and Survival Analysis, N.P. Jewell, A.C. Kimber, M.L. Ting Lee, G.A. Whitmore (eds), Kluwer, Dordrecht.
- J.M. Robins (1996b), Non-response models for the analysis of non-monotone non-ignorable missing data, Statistics in Medicine, Special Issue, to appear.
- J.M. Robins and R.D. Gill (1996), Non-response models for the analysis of non-monotone ignorable missing data, Statistics in Medicine, to appear.
- J.M. Robins and Y. Ritov (1996), Towards a curse of dimensionality appropriate (CODA) asymptotic theory for semiparametric models, Statistics in Medicine, to appear.
- J.M. Robins and A. Rotnitzky (1992), Recovery of information and adjustment for dependent censoring using surrogate markers, pp. 297–331 in: AIDS Epidemiology: Methodological Issues, N. Jewell, K. Dietz, V. Farewell (eds), Birkhäuser, Boston.