Censored Data

Millard, Steven P.

doi:10.1007/978-1-4614-8456-1_8

Abstract

Often in environmental data analysis, values are reported simply as being “below detection limit” along with the stated detection limit (e.g., Helsel 2012; Porter et al. 1988; USEPA 1992c, 2001, 2002a, d, 2009; Singh et al. 2002, 2006, 2010b). A sample of data contains censored observations if some of the observations are reported only as being below or above some censoring level. Although this results in some loss of information, we can still use data that contain nondetects for graphical and statistical analyses. Statistical methods for dealing with censored data have a long history in the fields of survival analysis and life testing (e.g., Hosmer et al. 2008; Kleinbaum and Klein 2011; Nelson 2004). In this chapter, we will discuss how to create graphs, estimate distribution parameters and quantiles, construct prediction and tolerance intervals, perform goodness-of-fit tests, and compare distributions using censored data. See Helsel (2012) and Millard et al. (2014) for a more in-depth discussion of analyzing environmental censored data.
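
The idea that a nondetect still carries information can be sketched in code. The following Python sketch is illustrative only and is not taken from the chapter: it assumes a normal distribution, left censoring at a known detection limit, and made-up concentration values, and the function names (`left_censored_loglik`, `grid_mle`) are hypothetical. Each detected value contributes its density to the likelihood, while each nondetect contributes the probability of falling below its detection limit. A coarse grid search stands in for a proper numerical optimizer.

```python
import math


def norm_logpdf(x, mu, sigma):
    """Log density of Normal(mu, sigma) at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def norm_cdf(x, mu, sigma):
    """P(X <= x) for X ~ Normal(mu, sigma); erfc keeps the lower tail accurate."""
    return 0.5 * math.erfc(-(x - mu) / (sigma * math.sqrt(2.0)))


def left_censored_loglik(values, censored, mu, sigma):
    """Log-likelihood for left-censored normal data (hypothetical helper).

    values[i] is the measured value, or the detection limit when
    censored[i] is True (the observation is only known to lie below it).
    """
    total = 0.0
    for x, is_nondetect in zip(values, censored):
        if is_nondetect:
            total += math.log(norm_cdf(x, mu, sigma))  # P(X < detection limit)
        else:
            total += norm_logpdf(x, mu, sigma)         # density at the measurement
    return total


def grid_mle(values, censored, mus, sigmas):
    """Return the (mu, sigma) grid point with the largest log-likelihood."""
    _, mu_hat, sigma_hat = max(
        (left_censored_loglik(values, censored, m, s), m, s)
        for m in mus
        for s in sigmas
    )
    return mu_hat, sigma_hat


# Hypothetical data: three nondetects reported as "<2.0" and five detected values.
values = [2.0, 2.0, 2.0, 2.7, 3.1, 3.8, 4.4, 5.0]
censored = [True, True, True, False, False, False, False, False]

mus = [i / 20 for i in range(121)]        # 0.00, 0.05, ..., 6.00
sigmas = [i / 20 for i in range(5, 101)]  # 0.25, 0.30, ..., 5.00

mu_hat, sigma_hat = grid_mle(values, censored, mus, sigmas)
print("grid MLE (mu, sigma):", mu_hat, sigma_hat)
```

Substituting the detection limit (or a fraction of it) for each nondetect treats an interval as a point value; the likelihood above instead uses only what is actually known about each observation.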

References

Bain, L.J., and M. Engelhardt. (1991). Statistical Analysis of Reliability and Life-Testing Models. Marcel Dekker, New York, 496 pp.
Balakrishnan, N., and A.C. Cohen. (1991). Order Statistics and Inference: Estimation Methods. Academic Press, San Diego, CA, 375 pp.
Breslow, N.E. (1970). A Generalized Kruskal–Wallis Test for Comparing K Samples Subject to Unequal Patterns of Censorship. Biometrika 57, 579–594.
Cohen, A.C. (1959). Simplified Estimators for the Normal Distribution When Samples are Singly Censored or Truncated. Technometrics 1(3), 217−237.
Cohen, A.C. (1991). Truncated and Censored Samples. Marcel Dekker, New York, 312 pp.
Cohn, T.A. (1988). Adjusted Maximum Likelihood Estimation of the Moments of Lognormal Populations from Type I Censored Samples. U.S. Geological Survey Open-File Report 88–350, 34 pp.
Cohn, T.A., L.L. DeLong, E.J. Gilroy, R.M. Hirsch, and D.K. Wells. (1989). Estimating Constituent Loads. Water Resources Research 25(5), 937−942.
Cox, D.R. (1972). Regression Models and Life Tables (with Discussion). Journal of the Royal Statistical Society of London, Series B 34, 187–220.
Efron, B., and R.J. Tibshirani. (1993). An Introduction to the Bootstrap. Chapman & Hall, New York, 436 pp.
El-Shaarawi, A.H. (1989). Inferences about the Mean from Censored Water Quality Data. Water Resources Research 25(4), 685−690.
El-Shaarawi, A.H., and S.R. Esterby. (1992). Replacement of Censored Observations by a Constant: An Evaluation. Water Research 26(6), 835−844.
Gehan, E.A. (1965). A Generalized Wilcoxon Test for Comparing Arbitrarily Singly–Censored Samples. Biometrika 52, 203–223.
Gilbert, R.O. (1987). Statistical Methods for Environmental Pollution Monitoring. Van Nostrand Reinhold, New York, 320 pp.
Gilliom, R.J., and D.R. Helsel. (1986). Estimation of Distributional Parameters for Censored Trace Level Water Quality Data: 1. Estimation Techniques. Water Resources Research 22, 135−146.
Gleit, A. (1985). Estimation for Small Normal Data Sets with Detection Limits. Environmental Science and Technology 19, 1201−1206.
Haas, C.N., and P.A. Scheff. (1990). Estimation of Averages in Truncated Samples. Environmental Science and Technology 24(6), 912−919.
Hashimoto, L.K., and R.R. Trussell. (1983). Evaluating Water Quality Data Near the Detection Limit. Paper presented at the Advanced Technology Conference, American Water Works Association, Las Vegas, Nevada, June 5−9, 1983.
Helsel, D.R. (2012). Statistics for Censored Environmental Data Using Minitab and R, 2nd Edition. John Wiley & Sons, Hoboken, New Jersey, 344 pp.
Helsel, D.R., and R.M. Hirsch. (1992). Statistical Methods in Water Resources Research. Elsevier, New York, 522 pp.
Hirsch, R.M., and J.R. Stedinger. (1987). Plotting Positions for Historical Floods and Their Precision. Water Resources Research 23(4), 715−727.
Hosmer, D.W., S. Lemeshow, and S. May. (2008). Applied Survival Analysis: Regression Modeling of Time to Event Data, 2nd Edition. John Wiley & Sons, Hoboken, New Jersey, 416 pp.
Kaplan, E.L., and P. Meier. (1958). Nonparametric Estimation from Incomplete Observations. Journal of the American Statistical Association 53, 457−481.
Kleinbaum, D.G., and M. Klein. (2011). Survival Analysis: A Self-Learning Text, Third Edition. Springer, New York, 700 pp.
Korn, L.R., and D.E. Tyler. (2001). Robust Estimation for Chemical Concentration Data Subject to Detection Limits. In Fernholz, L., S. Morgenthaler, and W. Stahel, eds. Statistics in Genetics and in the Environmental Sciences. Birkhauser Verlag, Basel, pp. 41–63.
Krishnamoorthy, K., and T. Mathew. (2009). Statistical Tolerance Regions: Theory, Applications, and Computation. John Wiley & Sons, Hoboken.
Latta, R.B. (1981). A Monte Carlo Study of Some Two-Sample Rank Tests with Censored Data. Journal of the American Statistical Association 76(375), 713−719.
Lee, E.T., and J.W. Wang. (2003). Statistical Methods for Survival Data Analysis, Third Edition. John Wiley & Sons, Hoboken, New Jersey, 513 pp.
Mantel, N. (1966). Evaluation of Survival Data and Two New Rank Order Statistics Arising in its Consideration. Cancer Chemotherapy Reports 50, 163−170.
Michael, J.R., and W.R. Schucany. (1986). Analysis of Data from Censored Samples. In D’Agostino, R.B., and M.A. Stephens, eds. Goodness-of-Fit Techniques. Marcel Dekker, New York, 560 pp., Chapter 11, 461−496.
Millard, S.P., P. Dixon, and N.K. Neerchal. (2014). Environmental Statistics with R. CRC Press, Boca Raton, Florida.
Millard, S.P., and S.J. Deverel. (1988). Nonparametric Statistical Methods for Comparing Two Sites Based on Data with Multiple Nondetect Limits. Water Resources Research 24(12), 2087−2098.
Nelson, W. (1972). Theory and Applications of Hazard Plotting for Censored Failure Data. Technometrics 14, 945−966.
Nelson, W. (1982). Applied Life Data Analysis. John Wiley & Sons, New York, 634 pp.
Nelson, W. (2004). Accelerated Testing: Statistical Models, Test Plans, and Data Analysis. John Wiley & Sons, Hoboken, New Jersey.
Newman, M.C., P.M. Dixon, B.B. Looney, and J.E. Pinder. (1989). Estimating Mean and Variance for Environmental Samples with Below Detection Limit Observations. Water Resources Bulletin 25(4), 905−916.
Peto, R., and J. Peto. (1972). Asymptotically Efficient Rank Invariant Test Procedures (with Discussion). Journal of the Royal Statistical Society of London, Series A 135, 185–206.
Porter, P.S., R.C. Ward, and H.F. Bell. (1988). The Detection Limit. Environmental Science and Technology 22(8), 856−861.
Prentice, R.L. (1978). Linear Rank Tests with Right Censored Data. Biometrika 65, 167−179.
Prentice, R.L., and P. Marek. (1979). A Qualitative Discrepancy between Censored Data Rank Tests. Biometrics 35, 861−867.
Royston, P. (1993). A Toolkit for Testing for Non-Normality in Complete and Censored Samples. The Statistician 42, 37−43.
Saw, J.G. (1961a). Estimation of the Normal Population Parameters Given a Type I Censored Sample. Biometrika 48, 367−377.
Schmee, J., D. Gladstein, and W. Nelson. (1985). Confidence Limits for Parameters of a Normal Distribution from Singly Censored Samples, Using Maximum Likelihood. Technometrics 27(2), 119−128.
Schneider, H. (1986). Truncated and Censored Samples from Normal Populations. Marcel Dekker, New York, 273 pp.
Shumway, R.H., A.S. Azari, and P. Johnson. (1989). Estimating Mean Concentrations under Transformations for Environmental Data with Detection Limits. Technometrics 31(3), 347−356.
Singh, A., A.K. Singh, and R.J. Iaci. (2002). Estimation of the Exposure Point Concentration Term Using a Gamma Distribution. EPA/600/R-02/084. October 2002. Technology Support Center for Monitoring and Site Characterization, Office of Research and Development, Office of Solid Waste and Emergency Response, U.S. Environmental Protection Agency, Washington, D.C.
Singh, A., R. Maichle, and S. Lee. (2006). On the Computation of a 95% Upper Confidence Limit of the Unknown Population Mean Based Upon Data Sets with Below Detection Limit Observations. EPA/600/R-06/022, March 2006. Office of Research and Development, U.S. Environmental Protection Agency, Washington, D.C.
Singh, A., N. Armbya, and A. Singh. (2010b). ProUCL Version 4.1.00 Technical Guide (Draft). EPA/600/R-07/041, May 2010. Office of Research and Development, U.S. Environmental Protection Agency, Washington, D.C.
Travis, C.C., and M.L. Land. (1990). Estimating the Mean of Data Sets with Nondetectable Values. Environmental Science and Technology 24, 961−962.
USEPA. (1992c). Statistical Analysis of Ground-Water Monitoring Data at RCRA Facilities: Addendum to Interim Final Guidance. Office of Solid Waste, U.S. Environmental Protection Agency, Washington, D.C. Currently available as part of: Statistical Training Course for Ground-Water Monitoring Data Analysis, EPA/530-R-93-003, which may be obtained through the RCRA Docket (202/260-9327).
USEPA. (2001). Risk Assessment Guidance for Superfund: Volume III - Part A, Process for Conducting Probabilistic Risk Assessment. EPA 540-R-02-002, OSWER 9285.7-45, PB2002 963302, December 2001. Office of Emergency and Remedial Response, U.S. Environmental Protection Agency, Washington, D.C.
USEPA. (2002a). Guidance for Comparing Background and Chemical Concentrations in Soil for CERCLA Sites. EPA 540-R-01-003, OSWER 9285.7-41, September 2002. Office of Emergency and Remedial Response, U.S. Environmental Protection Agency, Washington, D.C.
USEPA. (2009). Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities: Unified Guidance. EPA 530-R-09-007, March 2009. Office of Resource Conservation and Recovery, Program Implementation and Information Division, U.S. Environmental Protection Agency, Washington, D.C.
Venzon, D.J., and S.H. Moolgavkar. (1988). A Method for Computing Profile–Likelihood-Based Confidence Intervals. Journal of the Royal Statistical Society, Series C (Applied Statistics) 37(1), 87–94.
Verrill, S., and R.A. Johnson. (1988). Tables and Large-Sample Distribution Theory for Censored-Data Correlation Statistics for Testing Normality. Journal of the American Statistical Association 83, 1192−1197.

Author information

Authors and Affiliations

Probability, Statistics and Information, Seattle, Washington, USA
Steven P. Millard

About this chapter

Cite this chapter

Millard, S.P. (2013). Censored Data. In: EnvStats. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8456-1_8

DOI: https://doi.org/10.1007/978-1-4614-8456-1_8
Published: 10 August 2013
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8455-4
Online ISBN: 978-1-4614-8456-1
eBook Packages: Mathematics and Statistics (R0)
