Lifetime Data Analysis, Volume 16, Issue 2, pp 215–230

Misclassification of current status data

  • Karen McKeown
  • Nicholas P. Jewell
Open Access


We describe a simple method for nonparametric estimation of a distribution function from current status data when the observed current status is subject to misclassification. Nonparametric maximum likelihood techniques lead to a straightforward set of adjustments to the familiar pool-adjacent-violators estimator used when misclassification is assumed absent. The methods accommodate alternative misclassification models and extend to regression models for the underlying survival time. The ideas are motivated by, and applied to, an example on human papillomavirus (HPV) infection status in a sample of women examined in San Francisco.
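As a rough illustration of what an adjusted pool-adjacent-violators estimator might look like, the sketch below assumes the simplest misclassification model: a constant, known sensitivity and specificity that do not depend on the examination time. That model, and the names `pava`, `adjusted_npmle`, `sens` and `spec`, are assumptions made for this example and are not taken from the abstract. Under that assumption the probability of an observed positive at time t is G(t) = sens·F(t) + (1 − spec)·(1 − F(t)), so one plausible adjustment is to run the usual isotonic fit, clip it to the range [1 − spec, sens], and invert the relation to recover F.

```python
from itertools import groupby


def pava(y, w):
    """Weighted non-decreasing least-squares fit (pool-adjacent-violators)."""
    blocks = []  # each block holds [mean, total weight, number of points]
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, n1 + n2])
    fit = []
    for m, _, n in blocks:
        fit.extend([m] * n)
    return fit


def adjusted_npmle(times, status, sens=1.0, spec=1.0):
    """Estimate F at the distinct examination times from 0/1 observed status.

    Assumes constant, known sensitivity and specificity (illustrative only).
    """
    if sens + spec <= 1.0:
        raise ValueError("need sens + spec > 1 for F to be identifiable")
    # Pool tied examination times into one weighted observation each.
    ts, means, counts = [], [], []
    for t, grp in groupby(sorted(zip(times, status)), key=lambda p: p[0]):
        ys = [s for _, s in grp]
        ts.append(t)
        means.append(sum(ys) / len(ys))
        counts.append(len(ys))
    g = pava(means, counts)              # isotonic fit to observed status
    lo, hi = 1.0 - spec, sens
    g = [min(max(v, lo), hi) for v in g]  # clip to the attainable range
    f = [(v - lo) / (hi - lo) for v in g]  # invert G = lo + (hi - lo) * F
    return ts, f


# Hypothetical toy data: examination times and observed (possibly wrong) status.
times = [1, 2, 3, 4, 5, 6]
status = [0, 1, 0, 0, 1, 1]
ts, f_hat = adjusted_npmle(times, status, sens=0.9, spec=0.8)
```

With `sens=1.0` and `spec=1.0` the function reduces to the ordinary pool-adjacent-violators estimate, which gives a quick sanity check on any data set.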


Keywords: Current status data, Misclassification



The authors wish to thank Dr. B. Moscicki for permission to use the HPV data, obtained with support from the National Institutes of Health through grant #R37-CA51323. We also acknowledge support for this research from the National Institute of Allergy and Infectious Diseases through grant #R01-ES015493.

Open Access

This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.



Copyright information

© The Author(s) 2010

Authors and Affiliations

  1. Division of Biostatistics, School of Public Health, University of California, Berkeley, USA
