Censored Data

Abstract

In environmental data analysis, values are often reported simply as being “below detection limit,” along with the stated detection limit (e.g., Helsel 2012; Porter et al. 1988; USEPA 1992c, 2001, 2002a, d, 2009; Singh et al. 2002, 2006, 2010b). A sample contains censored observations if some of the observations are reported only as being below or above some censoring level. Although censoring results in some loss of information, we can still use data that contain nondetects for graphical and statistical analyses. Statistical methods for dealing with censored data have a long history in the fields of survival analysis and life testing (e.g., Hosmer et al. 2008; Kleinbaum and Klein 2011; Nelson 2004). In this chapter, we discuss how to create graphs, estimate distribution parameters and quantiles, construct prediction and tolerance intervals, perform goodness-of-fit tests, and compare distributions using censored data. See Helsel (2012) and Millard et al. (2014) for a more in-depth discussion of analyzing censored environmental data.

References

  • Bain, L.J., and M. Engelhardt. (1991). Statistical Analysis of Reliability and Life-Testing Models. Marcel Dekker, New York, 496 pp.

  • Balakrishnan, N., and A.C. Cohen. (1991). Order Statistics and Inference: Estimation Methods. Academic Press, San Diego, CA, 375 pp.

  • Breslow, N.E. (1970). A Generalized Kruskal–Wallis Test for Comparing K Samples Subject to Unequal Patterns of Censorship. Biometrika 57, 579–594.

  • Cohen, A.C. (1959). Simplified Estimators for the Normal Distribution When Samples are Singly Censored or Truncated. Technometrics 1(3), 217−237.

  • Cohen, A.C. (1991). Truncated and Censored Samples. Marcel Dekker, New York, 312 pp.

  • Cohn, T.A. (1988). Adjusted Maximum Likelihood Estimation of the Moments of Lognormal Populations from Type I Censored Samples. U.S. Geological Survey Open-File Report 88–350, 34 pp.

  • Cohn, T.A., L.L. DeLong, E.J. Gilroy, R.M. Hirsch, and D.K. Wells. (1989). Estimating Constituent Loads. Water Resources Research 25(5), 937−942.

  • Cox, D.R. (1972). Regression Models and Life Tables (with Discussion). Journal of the Royal Statistical Society of London, Series B 34, 187–220.

  • Efron, B., and R.J. Tibshirani. (1993). An Introduction to the Bootstrap. Chapman & Hall, New York, 436 pp.

  • El-Shaarawi, A.H. (1989). Inferences about the Mean from Censored Water Quality Data. Water Resources Research 25(4), 685−690.

  • El-Shaarawi, A.H., and S.R. Esterby. (1992). Replacement of Censored Observations by a Constant: An Evaluation. Water Research 26(6), 835−844.

  • Gehan, E.A. (1965). A Generalized Wilcoxon Test for Comparing Arbitrarily Singly–Censored Samples. Biometrika 52, 203–223.

  • Gilbert, R.O. (1987). Statistical Methods for Environmental Pollution Monitoring. Van Nostrand Reinhold, New York, 320 pp.

  • Gilliom, R.J., and D.R. Helsel. (1986). Estimation of Distributional Parameters for Censored Trace Level Water Quality Data: 1. Estimation Techniques. Water Resources Research 22, 135−146.

  • Gleit, A. (1985). Estimation for Small Normal Data Sets with Detection Limits. Environmental Science and Technology 19, 1201−1206.

  • Haas, C.N., and P.A. Scheff. (1990). Estimation of Averages in Truncated Samples. Environmental Science and Technology 24(6), 912−919.

  • Hashimoto, L.K., and R.R. Trussell. (1983). Evaluating Water Quality Data Near the Detection Limit. Paper presented at the Advanced Technology Conference, American Water Works Association, Las Vegas, Nevada, June 5−9, 1983.

  • Helsel, D.R. (2012). Statistics for Censored Environmental Data Using Minitab and R, 2nd Edition. John Wiley & Sons, Hoboken, New Jersey, 344 pp.

  • Helsel, D.R., and R.M. Hirsch. (1992). Statistical Methods in Water Resources. Elsevier, New York, 522 pp.

  • Hirsch, R.M., and J.R. Stedinger. (1987). Plotting Positions for Historical Floods and Their Precision. Water Resources Research 23(4), 715−727.

  • Hosmer, D.W., S. Lemeshow, and S. May. (2008). Applied Survival Analysis: Regression Modeling of Time to Event Data, 2nd Edition. John Wiley & Sons, Hoboken, New Jersey, 416 pp.

  • Kaplan, E.L., and P. Meier. (1958). Nonparametric Estimation from Incomplete Observations. Journal of the American Statistical Association 53, 457−481.

  • Kleinbaum, D.G., and M. Klein. (2011). Survival Analysis: A Self-Learning Text, 3rd Edition. Springer, New York, 700 pp.

  • Korn, L.R., and D.E. Tyler. (2001). Robust Estimation for Chemical Concentration Data Subject to Detection Limits. In Fernholz, L., S. Morgenthaler, and W. Stahel, eds. Statistics in Genetics and in the Environmental Sciences. Birkhäuser Verlag, Basel, pp. 41–63.

  • Krishnamoorthy, K., and T. Mathew. (2009). Statistical Tolerance Regions: Theory, Applications, and Computation. John Wiley & Sons, Hoboken, New Jersey.

  • Latta, R.B. (1981). A Monte Carlo Study of Some Two-Sample Rank Tests with Censored Data. Journal of the American Statistical Association 76(375), 713−719.

  • Lee, E.T., and J.W. Wang. (2003). Statistical Methods for Survival Data Analysis, 3rd Edition. John Wiley & Sons, Hoboken, New Jersey, 513 pp.

  • Mantel, N. (1966). Evaluation of Survival Data and Two New Rank Order Statistics Arising in its Consideration. Cancer Chemotherapy Reports 50, 163–170.

  • Michael, J.R., and W.R. Schucany. (1986). Analysis of Data from Censored Samples. In D’Agostino, R.B., and M.A. Stephens, eds. Goodness-of-Fit Techniques. Marcel Dekker, New York, 560 pp., Chapter 11, 461−496.

  • Millard, S.P., P. Dixon, and N.K. Neerchal. (2014). Environmental Statistics with R. CRC Press, Boca Raton, Florida.

  • Millard, S.P., and S.J. Deverel. (1988). Nonparametric Statistical Methods for Comparing Two Sites Based on Data with Multiple Nondetect Limits. Water Resources Research 24(12), 2087−2098.

  • Nelson, W. (1972). Theory and Applications of Hazard Plotting for Censored Failure Data. Technometrics 14, 945−966.

  • Nelson, W. (1982). Applied Life Data Analysis. John Wiley & Sons, New York, 634 pp.

  • Nelson, W. (2004). Accelerated Testing: Statistical Models, Test Plans, and Data Analysis. John Wiley & Sons, Hoboken, New Jersey.

  • Newman, M.C., P.M. Dixon, B.B. Looney, and J.E. Pinder. (1989). Estimating Mean and Variance for Environmental Samples with Below Detection Limit Observations. Water Resources Bulletin 25(4), 905−916.

  • Peto, R., and J. Peto. (1972). Asymptotically Efficient Rank Invariant Test Procedures (with Discussion). Journal of the Royal Statistical Society of London, Series A 135, 185–206.

  • Porter, P.S., R.C. Ward, and H.F. Bell. (1988). The Detection Limit. Environmental Science and Technology 22(8), 856−861.

  • Prentice, R.L. (1978). Linear Rank Tests with Right Censored Data. Biometrika 65, 167−179.

  • Prentice, R.L., and P. Marek. (1979). A Qualitative Discrepancy between Censored Data Rank Tests. Biometrics 35, 861−867.

  • Royston, P. (1993). A Toolkit for Testing for Non-Normality in Complete and Censored Samples. The Statistician 42, 37−43.

  • Saw, J.G. (1961a). Estimation of the Normal Population Parameters Given a Type I Censored Sample. Biometrika 48, 367−377.

  • Schmee, J., D. Gladstein, and W. Nelson. (1985). Confidence Limits for Parameters of a Normal Distribution from Singly Censored Samples, Using Maximum Likelihood. Technometrics 27(2), 119−128.

  • Schneider, H. (1986). Truncated and Censored Samples from Normal Populations. Marcel Dekker, New York, 273 pp.

  • Shumway, R.H., A.S. Azari, and P. Johnson. (1989). Estimating Mean Concentrations under Transformations for Environmental Data with Detection Limits. Technometrics 31(3), 347−356.

  • Singh, A., A.K. Singh, and R.J. Iaci. (2002). Estimation of the Exposure Point Concentration Term Using a Gamma Distribution. EPA/600/R-02/084. October 2002. Technology Support Center for Monitoring and Site Characterization, Office of Research and Development, Office of Solid Waste and Emergency Response, U.S. Environmental Protection Agency, Washington, D.C.

  • Singh, A., R. Maichle, and S. Lee. (2006). On the Computation of a 95% Upper Confidence Limit of the Unknown Population Mean Based Upon Data Sets with Below Detection Limit Observations. EPA/600/R-06/022, March 2006. Office of Research and Development, U.S. Environmental Protection Agency, Washington, D.C.

  • Singh, A., N. Armbya, and A. Singh. (2010b). ProUCL Version 4.1.00 Technical Guide (Draft). EPA/600/R-07/041, May 2010. Office of Research and Development, U.S. Environmental Protection Agency, Washington, D.C.

  • Travis, C.C., and M.L. Land. (1990). Estimating the Mean of Data Sets with Nondetectable Values. Environmental Science and Technology 24, 961−962.

  • USEPA. (1992c). Statistical Analysis of Ground-Water Monitoring Data at RCRA Facilities: Addendum to Interim Final Guidance. Office of Solid Waste, U.S. Environmental Protection Agency, Washington, D.C. Currently available as part of: Statistical Training Course for Ground-Water Monitoring Data Analysis, EPA/530-R-93-003, which may be obtained through the RCRA Docket (202/260-9327).

  • USEPA. (2001). Risk Assessment Guidance for Superfund: Volume III - Part A, Process for Conducting Probabilistic Risk Assessment. EPA 540-R-02-002, OSWER 9285.7-45, PB2002 963302, December 2001. Office of Emergency and Remedial Response, U.S. Environmental Protection Agency, Washington, D.C.

  • USEPA. (2002a). Guidance for Comparing Background and Chemical Concentrations in Soil for CERCLA Sites. EPA 540-R-01-003, OSWER 9285.7-41, September 2002. Office of Emergency and Remedial Response, U.S. Environmental Protection Agency, Washington, D.C.

  • USEPA. (2009). Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities: Unified Guidance. EPA 530-R-09-007, March 2009. Office of Resource Conservation and Recovery, Program Implementation and Information Division, U.S. Environmental Protection Agency, Washington, D.C.

  • Venzon, D.J., and S.H. Moolgavkar. (1988). A Method for Computing Profile–Likelihood-Based Confidence Intervals. Journal of the Royal Statistical Society, Series C (Applied Statistics) 37(1), 87–94.

  • Verrill, S., and R.A. Johnson. (1988). Tables and Large-Sample Distribution Theory for Censored-Data Correlation Statistics for Testing Normality. Journal of the American Statistical Association 83, 1192−1197.

Copyright information

© 2013 Springer Science+Business Media New York

Cite this chapter

Millard, S.P. (2013). Censored Data. In: EnvStats. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8456-1_8
