Abstract
In environmental data analysis, values are often reported simply as “below detection limit,” along with the stated detection limit (e.g., Helsel 2012; Porter et al. 1988; USEPA 1992c, 2001, 2002a, d, 2009; Singh et al. 2002, 2006, 2010b). A sample contains censored observations if some of the observations are reported only as being below or above some censoring level. Although censoring causes some loss of information, data that contain nondetects can still be used for graphical and statistical analyses. Statistical methods for dealing with censored data have a long history in the fields of survival analysis and life testing (e.g., Hosmer et al. 2008; Kleinbaum and Klein 2011; Nelson 2004). In this chapter, we discuss how to create graphs, estimate distribution parameters and quantiles, construct prediction and tolerance intervals, perform goodness-of-fit tests, and compare distributions using censored data. See Helsel (2012) and Millard et al. (2014) for a more in-depth discussion of analyzing censored environmental data.
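As a rough illustration of how nondetects can still inform parameter estimates, the sketch below fits a lognormal distribution to left-censored data by maximum likelihood. It is not the chapter's own implementation (the book works in R with the EnvStats package); it is a minimal Python sketch that assumes a normal model on the log scale and uses a standard EM iteration. Each detected value contributes its density to the likelihood, and each nondetect contributes the probability of falling below its detection limit. The concentrations, detection limits, and function names are hypothetical.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def censored_loglik(mu, sigma, detects, limits):
    """Log-likelihood of a normal sample with left-censored observations.

    detects: fully observed values
    limits:  censoring levels of the nondetects (true value < limit)
    """
    ll = 0.0
    for x in detects:
        ll += math.log(norm_pdf((x - mu) / sigma) / sigma)
    for c in limits:
        ll += math.log(norm_cdf((c - mu) / sigma))
    return ll


def fit_left_censored_normal(detects, limits, tol=1e-10, max_iter=10000):
    """EM iteration for the MLE of (mu, sigma) under left censoring."""
    n = len(detects) + len(limits)
    # Start from the detected values only (biased, but a usable start).
    mu = sum(detects) / len(detects)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in detects) / len(detects))
    sum_x = sum(detects)
    sum_x2 = sum(x * x for x in detects)
    for _ in range(max_iter):
        s1, s2 = sum_x, sum_x2
        for c in limits:
            # E-step: moments of a normal truncated above at c.
            z = (c - mu) / sigma
            lam = norm_pdf(z) / norm_cdf(z)
            s1 += mu - sigma * lam
            s2 += mu * mu + sigma * sigma - sigma * lam * (c + mu)
        # M-step: update the parameters from the expected sums.
        new_mu = s1 / n
        new_sigma = math.sqrt(s2 / n - new_mu ** 2)
        converged = abs(new_mu - mu) < tol and abs(new_sigma - sigma) < tol
        mu, sigma = new_mu, new_sigma
        if converged:
            break
    return mu, sigma


# Hypothetical concentrations: seven detects and three nondetects
# reported at two different detection limits.
detected = [0.8, 0.9, 1.2, 1.7, 2.2, 3.4, 5.1]
detection_limits = [0.5, 0.5, 1.0]

# Lognormal model: work with natural logs of values and limits.
log_detects = [math.log(x) for x in detected]
log_limits = [math.log(c) for c in detection_limits]

mu_hat, sigma_hat = fit_left_censored_normal(log_detects, log_limits)
print("log-scale mean:", mu_hat, "log-scale sd:", sigma_hat)
```

Because the nondetects carry the information that some values lie below the limits, the fitted log-scale mean falls below the mean of the detected values alone, which is the bias that simply discarding nondetects would introduce.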
Keywords
- Lognormal Distribution
- Censor Data
- Prediction Interval
- Tolerance Interval
- Monte Carlo Trial
References
Bain, L.J., and M. Engelhardt. (1991). Statistical Analysis of Reliability and Life-Testing Models. Marcel Dekker, New York, 496 pp.
Balakrishnan, N., and A.C. Cohen. (1991). Order Statistics and Inference: Estimation Methods. Academic Press, San Diego, CA, 375 pp.
Breslow, N.E. (1970). A Generalized Kruskal–Wallis Test for Comparing K Samples Subject to Unequal Patterns of Censorship. Biometrika 57, 579–594.
Cohen, A.C. (1959). Simplified Estimators for the Normal Distribution When Samples are Singly Censored or Truncated. Technometrics 1(3), 217−237.
Cohen, A.C. (1991). Truncated and Censored Samples. Marcel Dekker, New York, 312 pp.
Cohn, T.A. (1988). Adjusted Maximum Likelihood Estimation of the Moments of Lognormal Populations from Type I Censored Samples. U.S. Geological Survey Open-File Report 88–350, 34 pp.
Cohn, T.A., L.L. DeLong, E.J. Gilroy, R.M. Hirsch, and D.K. Wells. (1989). Estimating Constituent Loads. Water Resources Research 25(5), 937−942.
Cox, D.R. (1972). Regression Models and Life Tables (with Discussion). Journal of the Royal Statistical Society of London, Series B 34, 187–220.
Efron, B., and R.J. Tibshirani. (1993). An Introduction to the Bootstrap. Chapman & Hall, New York, 436 pp.
El-Shaarawi, A.H. (1989). Inferences about the Mean from Censored Water Quality Data. Water Resources Research 25(4), 685−690.
El-Shaarawi, A.H., and S.R. Esterby. (1992). Replacement of Censored Observations by a Constant: An Evaluation. Water Research 26(6), 835−844.
Gehan, E.A. (1965). A Generalized Wilcoxon Test for Comparing Arbitrarily Singly-Censored Samples. Biometrika 52, 203–223.
Gilbert, R.O. (1987). Statistical Methods for Environmental Pollution Monitoring. Van Nostrand Reinhold, New York, 320 pp.
Gilliom, R.J., and D.R. Helsel. (1986). Estimation of Distributional Parameters for Censored Trace Level Water Quality Data: 1. Estimation Techniques. Water Resources Research 22, 135−146.
Gleit, A. (1985). Estimation for Small Normal Data Sets with Detection Limits. Environmental Science and Technology 19, 1201−1206.
Haas, C.N., and P.A. Scheff. (1990). Estimation of Averages in Truncated Samples. Environmental Science and Technology 24(6), 912−919.
Hashimoto, L.K., and R.R. Trussell. (1983). Evaluating Water Quality Data Near the Detection Limit. Paper presented at the Advanced Technology Conference, American Water Works Association, Las Vegas, Nevada, June 5−9, 1983.
Helsel, D.R. (2012). Statistics for Censored Environmental Data Using Minitab and R, 2nd Edition. John Wiley & Sons, Hoboken, New Jersey, 344 pp.
Helsel, D.R., and R.M. Hirsch. (1992). Statistical Methods in Water Resources. Elsevier, New York, 522 pp.
Hirsch, R.M., and J.R. Stedinger. (1987). Plotting Positions for Historical Floods and Their Precision. Water Resources Research 23(4), 715−727.
Hosmer, D.W., S. Lemeshow, and S. May. (2008). Applied Survival Analysis: Regression Modeling of Time to Event Data, 2nd Edition. John Wiley & Sons, Hoboken, New Jersey, 416 pp.
Kaplan, E.L., and P. Meier. (1958). Nonparametric Estimation from Incomplete Observations. Journal of the American Statistical Association 53, 457−481.
Kleinbaum, D.G., and M. Klein. (2011). Survival Analysis: A Self-Learning Text, 3rd Edition. Springer, New York, 700 pp.
Korn, L.R., and D.E. Tyler. (2001). Robust Estimation for Chemical Concentration Data Subject to Detection Limits. In Fernholz, L., S. Morgenthaler, and W. Stahel, eds. Statistics in Genetics and in the Environmental Sciences. Birkhäuser Verlag, Basel, pp. 41–63.
Krishnamoorthy, K., and T. Mathew. (2009). Statistical Tolerance Regions: Theory, Applications, and Computation. John Wiley & Sons, Hoboken, New Jersey.
Latta, R.B. (1981). A Monte Carlo Study of Some Two-Sample Rank Tests with Censored Data. Journal of the American Statistical Association 76(375), 713−719.
Lee, E.T., and J.W. Wang. (2003). Statistical Methods for Survival Data Analysis, 3rd Edition. John Wiley & Sons, Hoboken, New Jersey, 513 pp.
Mantel, N. (1966). Evaluation of Survival Data and Two New Rank Order Statistics Arising in its Consideration. Cancer Chemotherapy Reports 50, 163−170.
Michael, J.R., and W.R. Schucany. (1986). Analysis of Data from Censored Samples. In D’Agostino, R.B., and M.A. Stephens, eds. Goodness-of-Fit Techniques. Marcel Dekker, New York, 560 pp., Chapter 11, 461−496.
Millard, S.P., P. Dixon, and N.K. Neerchal. (2014). Environmental Statistics with R. CRC Press, Boca Raton, Florida.
Millard, S.P., and S.J. Deverel. (1988). Nonparametric Statistical Methods for Comparing Two Sites Based on Data with Multiple Nondetect Limits. Water Resources Research 24(12), 2087−2098.
Nelson, W. (1972). Theory and Applications of Hazard Plotting for Censored Failure Data. Technometrics 14, 945−966.
Nelson, W. (1982). Applied Life Data Analysis. John Wiley & Sons, New York, 634 pp.
Nelson, W. (2004). Accelerated Testing: Statistical Models, Test Plans, and Data Analysis. John Wiley & Sons, Hoboken, New Jersey.
Newman, M.C., P.M. Dixon, B.B. Looney, and J.E. Pinder. (1989). Estimating Mean and Variance for Environmental Samples with Below Detection Limit Observations. Water Resources Bulletin 25(4), 905−916.
Peto, R., and J. Peto. (1972). Asymptotically Efficient Rank Invariant Test Procedures (with Discussion). Journal of the Royal Statistical Society of London, Series A 135, 185–206.
Porter, P.S., R.C. Ward, and H.F. Bell. (1988). The Detection Limit. Environmental Science and Technology 22(8), 856−861.
Prentice, R.L. (1978). Linear Rank Tests with Right Censored Data. Biometrika 65, 167−179.
Prentice, R.L., and P. Marek. (1979). A Qualitative Discrepancy between Censored Data Rank Tests. Biometrics 35, 861−867.
Royston, P. (1993). A Toolkit for Testing for Non-Normality in Complete and Censored Samples. The Statistician 42, 37−43.
Saw, J.G. (1961a). Estimation of the Normal Population Parameters Given a Type I Censored Sample. Biometrika 48, 367−377.
Schmee, J., D. Gladstein, and W. Nelson. (1985). Confidence Limits for Parameters of a Normal Distribution from Singly Censored Samples, Using Maximum Likelihood. Technometrics 27(2), 119−128.
Schneider, H. (1986). Truncated and Censored Samples from Normal Populations. Marcel Dekker, New York, 273 pp.
Shumway, R.H., A.S. Azari, and P. Johnson. (1989). Estimating Mean Concentrations under Transformations for Environmental Data with Detection Limits. Technometrics 31(3), 347−356.
Singh, A., A.K. Singh, and R.J. Iaci. (2002). Estimation of the Exposure Point Concentration Term Using a Gamma Distribution. EPA/600/R-02/084. October 2002. Technology Support Center for Monitoring and Site Characterization, Office of Research and Development, Office of Solid Waste and Emergency Response, U.S. Environmental Protection Agency, Washington, D.C.
Singh, A., R. Maichle, and S. Lee. (2006). On the Computation of a 95% Upper Confidence Limit of the Unknown Population Mean Based Upon Data Sets with Below Detection Limit Observations. EPA/600/R-06/022, March 2006. Office of Research and Development, U.S. Environmental Protection Agency, Washington, D.C.
Singh, A., N. Armbya, and A. Singh. (2010b). ProUCL Version 4.1.00 Technical Guide (Draft). EPA/600/R-07/041, May 2010. Office of Research and Development, U.S. Environmental Protection Agency, Washington, D.C.
Travis, C.C., and M.L. Land. (1990). Estimating the Mean of Data Sets with Nondetectable Values. Environmental Science and Technology 24, 961−962.
USEPA. (1992c). Statistical Analysis of Ground-Water Monitoring Data at RCRA Facilities: Addendum to Interim Final Guidance. Office of Solid Waste, U.S. Environmental Protection Agency, Washington, D.C. Currently available as part of: Statistical Training Course for Ground-Water Monitoring Data Analysis, EPA/530-R-93-003, which may be obtained through the RCRA Docket (202/260-9327).
USEPA. (2001). Risk Assessment Guidance for Superfund: Volume III - Part A, Process for Conducting Probabilistic Risk Assessment. EPA 540-R-02-002, OSWER 9285.7-45, PB2002 963302, December 2001. Office of Emergency and Remedial Response, U.S. Environmental Protection Agency, Washington, D.C.
USEPA. (2002a). Guidance for Comparing Background and Chemical Concentrations in Soil for CERCLA Sites. EPA 540-R-01-003, OSWER 9285.7-41, September 2002. Office of Emergency and Remedial Response, U.S. Environmental Protection Agency, Washington, D.C.
USEPA. (2009). Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities: Unified Guidance. EPA 530-R-09-007, March 2009. Office of Resource Conservation and Recovery, Program Implementation and Information Division, U.S. Environmental Protection Agency, Washington, D.C.
Venzon, D.J., and S.H. Moolgavkar. (1988). A Method for Computing Profile-Likelihood-Based Confidence Intervals. Journal of the Royal Statistical Society, Series C (Applied Statistics) 37(1), 87–94.
Verrill, S., and R.A. Johnson. (1988). Tables and Large-Sample Distribution Theory for Censored-Data Correlation Statistics for Testing Normality. Journal of the American Statistical Association 83, 1192−1197.
Copyright information
© 2013 Springer Science+Business Media New York
About this chapter
Cite this chapter
Millard, S.P. (2013). Censored Data. In: EnvStats. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8456-1_8
DOI: https://doi.org/10.1007/978-1-4614-8456-1_8
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8455-4
Online ISBN: 978-1-4614-8456-1
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)