EnvStats pp 175–209

Censored Data

Abstract

Often in environmental data analysis, values are reported simply as being “below detection limit,” along with the stated detection limit (e.g., Helsel 2012; Porter et al. 1988; USEPA 1992c, 2001, 2002a, d, 2009; Singh et al. 2002, 2006, 2010b). A sample of data contains censored observations if some of the observations are reported only as being below or above some censoring level. Although this results in some loss of information, we can still use data that contain nondetects for graphical and statistical analyses. Statistical methods for dealing with censored data have a long history in the fields of survival analysis and life testing (e.g., Hosmer et al. 2008; Kleinbaum and Klein 2011; Nelson 2004). In this chapter, we will discuss how to create graphs, estimate distribution parameters and quantiles, construct prediction and tolerance intervals, perform goodness-of-fit tests, and compare distributions using censored data. See Helsel (2012) and Millard et al. (2014) for a more in-depth discussion of analyzing environmental censored data.
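
To make the idea of estimating distribution parameters from data containing nondetects concrete, here is a minimal sketch in Python. It is not the EnvStats implementation (EnvStats is an R package). It assumes a normal distribution, a single detection limit, and left censoring, and it uses a plain EM iteration to approximate the maximum likelihood estimates. The function name `censored_normal_em` and the sample values are hypothetical. For lognormal data, the same sketch could be applied to the logarithms of the observations and of the detection limit.

```python
import math
from statistics import NormalDist


def censored_normal_em(detected, n_censored, detection_limit,
                       tol=1e-10, max_iter=10_000):
    """Approximate the MLE of (mu, sigma) for a normal sample in which
    n_censored values are known only to lie below detection_limit.

    Hypothetical helper for illustration: single detection limit,
    left censoring, normality assumed.
    """
    std = NormalDist()  # standard normal, used for pdf and cdf
    n_det = len(detected)
    n = n_det + n_censored
    sum_x = sum(detected)
    sum_x2 = sum(x * x for x in detected)

    # Starting values: mean and SD of the detected values only.
    mu = sum_x / n_det
    sigma = math.sqrt(sum((x - mu) ** 2 for x in detected) / n_det)
    if sigma == 0.0:
        raise ValueError("need at least two distinct detected values")

    for _ in range(max_iter):
        # E-step: moments of a normal truncated above at the detection limit.
        z = (detection_limit - mu) / sigma
        lam = std.pdf(z) / std.cdf(z)
        e1 = mu - sigma * lam                           # E[X | X < DL]
        var_c = sigma ** 2 * (1.0 - z * lam - lam ** 2)  # Var[X | X < DL]
        e2 = var_c + e1 ** 2                            # E[X^2 | X < DL]

        # M-step: update mu and sigma from the completed-data moments.
        new_mu = (sum_x + n_censored * e1) / n
        new_sigma = math.sqrt((sum_x2 + n_censored * e2) / n - new_mu ** 2)

        converged = abs(new_mu - mu) < tol and abs(new_sigma - sigma) < tol
        mu, sigma = new_mu, new_sigma
        if converged:
            break
    return mu, sigma


# Hypothetical sample: five detected values and three nondetects at "< 2.0".
detected_values = [2.1, 2.5, 3.0, 3.4, 4.2]
mu_hat, sigma_hat = censored_normal_em(detected_values, 3, 2.0)
print(f"estimated mean = {mu_hat:.3f}, estimated SD = {sigma_hat:.3f}")
```

Because the nondetects are treated as values somewhere below the detection limit rather than being dropped or replaced by a constant, the estimated mean falls below the mean of the detected values alone. The sketch does not guard against a detection limit so far below the bulk of the data that the normal CDF underflows to zero.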

Keywords

  • Lognormal Distribution
  • Censor Data
  • Prediction Interval
  • Tolerance Interval
  • Monte Carlo Trial

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


References

  • Bain, L.J., and M. Engelhardt. (1991). Statistical Analysis of Reliability and Life-Testing Models. Marcel Dekker, New York, 496 pp.

  • Balakrishnan, N., and A.C. Cohen. (1991). Order Statistics and Inference: Estimation Methods. Academic Press, San Diego, CA, 375 pp.

  • Breslow, N.E. (1970). A Generalized Kruskal–Wallis Test for Comparing K Samples Subject to Unequal Patterns of Censorship. Biometrika 57, 579–594.

  • Cohen, A.C. (1959). Simplified Estimators for the Normal Distribution When Samples are Singly Censored or Truncated. Technometrics 1(3), 217−237.

  • Cohen, A.C. (1991). Truncated and Censored Samples. Marcel Dekker, New York, 312 pp.

  • Cohn, T.A. (1988). Adjusted Maximum Likelihood Estimation of the Moments of Lognormal Populations from Type I Censored Samples. U.S. Geological Survey Open-File Report 88–350, 34 pp.

  • Cohn, T.A., L.L. DeLong, E.J. Gilroy, R.M. Hirsch, and D.K. Wells. (1989). Estimating Constituent Loads. Water Resources Research 25(5), 937−942.

  • Cox, D.R. (1972). Regression Models and Life Tables (with Discussion). Journal of the Royal Statistical Society of London, Series B 34, 187–220.

  • Efron, B., and R.J. Tibshirani. (1993). An Introduction to the Bootstrap. Chapman & Hall, New York, 436 pp.

  • El-Shaarawi, A.H. (1989). Inferences about the Mean from Censored Water Quality Data. Water Resources Research 25(4), 685−690.

  • El-Shaarawi, A.H., and S.R. Esterby. (1992). Replacement of Censored Observations by a Constant: An Evaluation. Water Research 26(6), 835−844.

  • Gehan, E.A. (1965). A Generalized Wilcoxon Test for Comparing Arbitrarily Singly–Censored Samples. Biometrika 52, 203–223.

  • Gilbert, R.O. (1987). Statistical Methods for Environmental Pollution Monitoring. Van Nostrand Reinhold, New York, 320 pp.

  • Gilliom, R.J., and D.R. Helsel. (1986). Estimation of Distributional Parameters for Censored Trace Level Water Quality Data: 1. Estimation Techniques. Water Resources Research 22, 135−146.

  • Gleit, A. (1985). Estimation for Small Normal Data Sets with Detection Limits. Environmental Science and Technology 19, 1201−1206.

  • Haas, C.N., and P.A. Scheff. (1990). Estimation of Averages in Truncated Samples. Environmental Science and Technology 24(6), 912−919.

  • Hashimoto, L.K., and R.R. Trussell. (1983). Evaluating Water Quality Data Near the Detection Limit. Paper presented at the Advanced Technology Conference, American Water Works Association, Las Vegas, Nevada, June 5−9, 1983.

  • Helsel, D.R. (2012). Statistics for Censored Environmental Data Using Minitab and R, 2nd Edition. John Wiley & Sons, Hoboken, New Jersey, 344 pp.

  • Helsel, D.R., and R.M. Hirsch. (1992). Statistical Methods in Water Resources. Elsevier, New York, 522 pp.

  • Hirsch, R.M., and J.R. Stedinger. (1987). Plotting Positions for Historical Floods and Their Precision. Water Resources Research 23(4), 715−727.

  • Hosmer, D.W., S. Lemeshow, and S. May. (2008). Applied Survival Analysis: Regression Modeling of Time to Event Data, 2nd Edition. John Wiley & Sons, Hoboken, New Jersey, 416 pp.

  • Kaplan, E.L., and P. Meier. (1958). Nonparametric Estimation from Incomplete Observations. Journal of the American Statistical Association 53, 457−481.

  • Kleinbaum, D.G., and M. Klein. (2011). Survival Analysis: A Self-Learning Text, Third Edition. Springer, New York, 700 pp.

  • Korn, L.R., and D.E. Tyler. (2001). Robust Estimation for Chemical Concentration Data Subject to Detection Limits. In Fernholz, L., S. Morgenthaler, and W. Stahel, eds. Statistics in Genetics and in the Environmental Sciences. Birkhauser Verlag, Basel, pp. 41–63.

  • Krishnamoorthy, K., and T. Mathew. (2009). Statistical Tolerance Regions: Theory, Applications, and Computation. John Wiley & Sons, Hoboken.

  • Latta, R.B. (1981). A Monte Carlo Study of Some Two-Sample Rank Tests with Censored Data. Journal of the American Statistical Association 76(375), 713−719.

  • Lee, E.T., and J.W. Wang. (2003). Statistical Methods for Survival Data Analysis, Third Edition. John Wiley & Sons, Hoboken, New Jersey, 513 pp.

  • Mantel, N. (1966). Evaluation of Survival Data and Two New Rank Order Statistics Arising in its Consideration. Cancer Chemotherapy Reports 50, 163−170.

  • Michael, J.R., and W.R. Schucany. (1986). Analysis of Data from Censored Samples. In D’Agostino, R.B., and M.A. Stephens, eds. Goodness-of-Fit Techniques. Marcel Dekker, New York, 560 pp., Chapter 11, 461−496.

  • Millard, S.P., P. Dixon, and N.K. Neerchal. (2014). Environmental Statistics with R. CRC Press, Boca Raton, Florida.

  • Millard, S.P., and S.J. Deverel. (1988). Nonparametric Statistical Methods for Comparing Two Sites Based on Data with Multiple Nondetect Limits. Water Resources Research 24(12), 2087−2098.

  • Nelson, W. (1972). Theory and Applications of Hazard Plotting for Censored Failure Data. Technometrics 14, 945−966.

  • Nelson, W. (1982). Applied Life Data Analysis. John Wiley & Sons, New York, 634 pp.

  • Nelson, W. (2004). Accelerated Testing: Statistical Models, Test Plans, and Data Analysis. John Wiley & Sons, Hoboken, New Jersey.

  • Newman, M.C., P.M. Dixon, B.B. Looney, and J.E. Pinder. (1989). Estimating Mean and Variance for Environmental Samples with Below Detection Limit Observations. Water Resources Bulletin 25(4), 905−916.

  • Peto, R., and J. Peto. (1972). Asymptotically Efficient Rank Invariant Test Procedures (with Discussion). Journal of the Royal Statistical Society of London, Series A 135, 185–206.

  • Porter, P.S., R.C. Ward, and H.F. Bell. (1988). The Detection Limit. Environmental Science and Technology 22(8), 856−861.

  • Prentice, R.L. (1978). Linear Rank Tests with Right Censored Data. Biometrika 65, 167−179.

  • Prentice, R.L., and P. Marek. (1979). A Qualitative Discrepancy between Censored Data Rank Tests. Biometrics 35, 861−867.

  • Royston, P. (1993). A Toolkit for Testing for Non-Normality in Complete and Censored Samples. The Statistician 42, 37−43.

  • Saw, J.G. (1961a). Estimation of the Normal Population Parameters Given a Type I Censored Sample. Biometrika 48, 367−377.

  • Schmee, J., D. Gladstein, and W. Nelson. (1985). Confidence Limits for Parameters of a Normal Distribution from Singly Censored Samples, Using Maximum Likelihood. Technometrics 27(2), 119−128.

  • Schneider, H. (1986). Truncated and Censored Samples from Normal Populations. Marcel Dekker, New York, 273 pp.

  • Shumway, R.H., A.S. Azari, and P. Johnson. (1989). Estimating Mean Concentrations under Transformations for Environmental Data with Detection Limits. Technometrics 31(3), 347−356.

  • Singh, A., A.K. Singh, and R.J. Iaci. (2002). Estimation of the Exposure Point Concentration Term Using a Gamma Distribution. EPA/600/R-02/084. October 2002. Technology Support Center for Monitoring and Site Characterization, Office of Research and Development, Office of Solid Waste and Emergency Response, U.S. Environmental Protection Agency, Washington, D.C.

  • Singh, A., R. Maichle, and S. Lee. (2006). On the Computation of a 95% Upper Confidence Limit of the Unknown Population Mean Based Upon Data Sets with Below Detection Limit Observations. EPA/600/R-06/022, March 2006. Office of Research and Development, U.S. Environmental Protection Agency, Washington, D.C.

  • Singh, A., N. Armbya, and A. Singh. (2010b). ProUCL Version 4.1.00 Technical Guide (Draft). EPA/600/R-07/041, May 2010. Office of Research and Development, U.S. Environmental Protection Agency, Washington, D.C.

  • Travis, C.C., and M.L. Land. (1990). Estimating the Mean of Data Sets with Nondetectable Values. Environmental Science and Technology 24, 961−962.

  • USEPA. (1992c). Statistical Analysis of Ground-Water Monitoring Data at RCRA Facilities: Addendum to Interim Final Guidance. Office of Solid Waste, U.S. Environmental Protection Agency, Washington, D.C. Currently available as part of: Statistical Training Course for Ground-Water Monitoring Data Analysis, EPA/530-R-93-003, which may be obtained through the RCRA Docket (202/260-9327).

  • USEPA. (2001). Risk Assessment Guidance for Superfund: Volume III - Part A, Process for Conducting Probabilistic Risk Assessment. EPA 540-R-02-002, OSWER 9285.7-45, PB2002 963302, December 2001. Office of Emergency and Remedial Response, U.S. Environmental Protection Agency, Washington, D.C.

  • USEPA. (2002a). Guidance for Comparing Background and Chemical Concentrations in Soil for CERCLA Sites. EPA 540-R-01-003, OSWER 9285.7-41, September 2002. Office of Emergency and Remedial Response, U.S. Environmental Protection Agency, Washington, D.C.

  • USEPA. (2009). Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities: Unified Guidance. EPA 530-R-09-007, March 2009. Office of Resource Conservation and Recovery, Program Implementation and Information Division, U.S. Environmental Protection Agency, Washington, D.C.

  • Venzon, D.J., and S.H. Moolgavkar. (1988). A Method for Computing Profile–Likelihood-Based Confidence Intervals. Journal of the Royal Statistical Society, Series C (Applied Statistics) 37(1), 87–94.

  • Verrill, S., and R.A. Johnson. (1988). Tables and Large-Sample Distribution Theory for Censored-Data Correlation Statistics for Testing Normality. Journal of the American Statistical Association 83, 1192−1197.

Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

Millard, S.P. (2013). Censored Data. In: EnvStats. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8456-1_8
