Coarsening at Random: Characterizations, Conjectures, Counter-Examples

  • Conference paper
Proceedings of the First Seattle Symposium in Biostatistics

Part of the book series: Lecture Notes in Statistics (LNS, volume 123)

Abstract

The notion of coarsening at random (CAR) was introduced by Heitjan and Rubin (1991) to describe the most general form of randomly grouped, censored, or missing data, for which the coarsening mechanism can be ignored when making likelihood-based inference about the parameters of the distribution of the variable of interest. The CAR assumption is popular, and applications abound. However, the full implications of the assumption have not been realized. Moreover, a satisfactory theory of CAR for continuously distributed data (which is needed in many applications, particularly in survival analysis) hardly exists as yet. This paper gives a detailed study of CAR. We show that grouped data from a finite sample space always fit a CAR model: a nonparametric model for the variable of interest together with the assumption of an arbitrary CAR mechanism puts no restriction at all on the distribution of the observed data. In a slogan, CAR is everything. We describe what would seem to be the most general way CAR data could occur in practice, a sequential procedure called randomized monotone coarsening. We show that CAR mechanisms exist which are not of this type. Such a coarsening mechanism uses information about the underlying data which is not revealed to the observer, without this affecting the observer’s conclusions. In a second slogan, CAR is more than it seems. This implies that if the analyst can argue from subject-matter considerations that coarsened data is CAR, he or she has knowledge about the structure of the coarsening mechanism which can be put to good use in non-likelihood-based inference procedures. We argue that this is a valuable option in multivariate survival analysis. We give a new definition of CAR in general sample spaces, criticising earlier proposals, and we establish parallel results to the discrete case. The new definition focusses on the distribution rather than the density of the data. It allows us to generalise the theory of CAR to the important situation where coarsening variables (e.g., censoring times) are partially observed as well as the variables of interest.
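The abstract does not spell out the CAR condition itself. As a minimal sketch, assuming the usual finite-sample-space formulation (the observed set A always contains the underlying value x, and the probability of reporting A given x is the same for every x in A), the condition can be checked numerically. The function name `is_car`, the dictionary representation of the mechanism, and the two toy mechanisms are illustrative assumptions, not notation from the paper.

```python
def is_car(mechanism, tol=1e-9):
    """Check the discrete CAR condition for a coarsening mechanism.

    mechanism[x][A] is assumed to hold P(report A | X = x), where A is a
    frozenset of sample-space points. The sample space is taken to be the
    set of keys of `mechanism` (an assumption of this sketch).
    """
    space = set(mechanism)
    for x, row in mechanism.items():
        # Each row must be a probability distribution over reported sets.
        if abs(sum(row.values()) - 1.0) > tol:
            raise ValueError(f"probabilities for x={x!r} do not sum to 1")
        for A, p in row.items():
            if not A <= space:
                raise ValueError(f"set {set(A)} leaves the sample space")
            # A reported set must contain the true value.
            if p > tol and x not in A:
                raise ValueError(f"x={x!r} reports a set not containing it")

    reported_sets = {A for row in mechanism.values() for A in row}
    for A in reported_sets:
        # CAR: P(report A | X = x) must not depend on x within A.
        values = [mechanism[x].get(A, 0.0) for x in A]
        if max(values) - min(values) > tol:
            return False
    return True


# Toy example 1 (hypothetical): the value is hidden completely at random.
space = (1, 2, 3)
full = frozenset(space)
mcar = {x: {frozenset({x}): 0.7, full: 0.3} for x in space}

# Toy example 2 (hypothetical): grouping into {1, 2} depends on the true value.
not_car = {
    1: {frozenset({1}): 0.5, frozenset({1, 2}): 0.5},
    2: {frozenset({2}): 0.8, frozenset({1, 2}): 0.2},
    3: {frozenset({3}): 1.0},
}

print(is_car(mcar))     # True
print(is_car(not_car))  # False
```

In the first mechanism the probability of reporting the whole sample space is 0.3 whatever the true value, so the condition holds; in the second, the probability of reporting {1, 2} is 0.5 when the true value is 1 and 0.2 when it is 2, so it fails.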

Bibliography

  • P.J. Bickel, C.A.J. Klaassen, Y. Ritov and J.A. Wellner (1993), Efficient and Adaptive Inference in Semi-parametric Models, Johns Hopkins University Press, Baltimore.

  • J.T. Chang and D. Pollard (1997), Conditioning as disintegration, Statistica Neerlandica 51 (to appear).

  • D.M. Dabrowska (1988), Kaplan-Meier estimation on the plane, Ann. Statist. 16, 1475–1489.

  • R.D. Gill (1989), Non- and semi-parametric maximum likelihood estimators and the von Mises method, Part 1, Scand. J. Statist. 16, 97–128.

  • R.D. Gill and J.M. Robins (1997), Sequential models for coarsening and missingness, Proc. First Seattle Symposium on Biostatistics: Survival Analysis, ed. D.Y. Lin, Springer-Verlag.

  • D.F. Heitjan (1993), Ignorability and coarse data: some biomedical examples, Biometrics 49, 1099–1109.

  • D.F. Heitjan (1994), Ignorability in general incomplete-data models, Biometrika 81, 701–708.

  • D.F. Heitjan and D.B. Rubin (1991), Ignorability and coarse data, Ann. Statist. 19, 2244–2253.

  • M. Jacobsen and N. Keiding (1995), Coarsening at random in general sample spaces and random censoring in continuous time, Ann. Statist. 23, 774–786.

  • R. Kress (1989), Linear Integral Equations, Springer-Verlag, Berlin.

  • M.J. van der Laan (1993), Efficient and Inefficient Estimation in Semiparametric Models, Ph.D. Thesis, Dept. Mathematics, University Utrecht; reprinted (1995) as CWI tract 114, Centre for Mathematics and Computer Science, Amsterdam.

  • M.J. van der Laan (1996), Efficient estimation in the bivariate censoring model and repairing NPMLE, Ann. Statist. 24, 596–627.

  • R.J.A. Little and D.B. Rubin (1987), Statistical Analysis with Missing Data, Wiley, New York.

  • S.F. Nielsen (1996), Incomplete Observations and Coarsening at Random, preprint, Institute of Mathematical Statistics, Univ. of Copenhagen.

  • R.L. Prentice and J. Cai (1992), Covariance and survivor function estimation using censored multivariate failure time data, Biometrika 79, 495–512.

  • J.M. Robins (1996a), Locally efficient median regression with random censoring and surrogate markers, pp. 263–274 in: Lifetime Data: Models in Reliability and Survival Analysis, N.P. Jewell, A.C. Kimber, M.L. Ting Lee, G.A. Whitmore (eds), Kluwer, Dordrecht.

  • J.M. Robins (1996b), Non-response models for the analysis of non-monotone non-ignorable missing data, Statistics in Medicine, Special Issue, to appear.

  • J.M. Robins and R.D. Gill (1996), Non-response models for the analysis of non-monotone ignorable missing data, Statistics in Medicine, to appear.

  • J.M. Robins and Y. Ritov (1996), Towards a curse of dimensionality appropriate (CODA) asymptotic theory for semiparametric models, Statistics in Medicine, to appear.

  • J.M. Robins and A. Rotnitzky (1992), Recovery of information and adjustment for dependent censoring using surrogate markers, pp. 297–331 in: AIDS Epidemiology—Methodological Issues, N. Jewell, K. Dietz, V. Farewell (eds), Birkhäuser, Boston.

  • J.M. Robins, A. Rotnitzky and L.P. Zhao (1994), Estimation of regression coefficients when some regressors are not always observed, J. Amer. Statist. Assoc. 89, 846–866.

  • D.B. Rubin (1976), Inference and missing data, Biometrika 63, 581–592.

  • D.B. Rubin, H.S. Stern and V. Vehovar (1995), Handling “Don’t Know” survey responses: the case of the Slovenian plebiscite, J. Amer. Statist. Assoc. 90, 822–828.

  • A.W. van der Vaart (1991), On differentiable functionals, Ann. Statist. 19, 178–204.

  • P. Whittle (1971), Optimization under Constraints, Wiley, New York.

Copyright information

© 1997 Springer-Verlag New York, Inc.

About this paper

Cite this paper

Gill, R.D., van der Laan, M.J., Robins, J.M. (1997). Coarsening at Random: Characterizations, Conjectures, Counter-Examples. In: Lin, D.Y., Fleming, T.R. (eds) Proceedings of the First Seattle Symposium in Biostatistics. Lecture Notes in Statistics, vol 123. Springer, New York, NY. https://doi.org/10.1007/978-1-4684-6316-3_14

  • DOI: https://doi.org/10.1007/978-1-4684-6316-3_14

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-0-387-94992-5

  • Online ISBN: 978-1-4684-6316-3

  • eBook Packages: Springer Book Archive