Evaluating Left-Censored Data Through Substitution, Parametric, Semi-parametric, and Nonparametric Methods: A Simulation Study

  • Original Research Article
Interdisciplinary Sciences: Computational Life Sciences

Abstract

This study examines how much bias different methods for handling left-censored data introduce at particular sample sizes. The aim was to determine deviations from the median, the mean, and the standard deviation (SD) at different sample sizes and censoring rates for the log-normal, exponential, and Weibull distributions, comparing full and censored data samples. The concept of “censoring” and the types of censoring are addressed first. Substitution, parametric (maximum likelihood estimation, MLE), nonparametric (Kaplan-Meier, KM), and semi-parametric (regression on order statistics, ROS) methods are then introduced for the evaluation of left-censored observations. In the simulation, uncensored data were generated using different parameters for each distribution, and the datasets were then left-censored at rates of 5, 25, 45, and 65%. The censored data were evaluated through substitution (the limit of detection, LOD, and LOD/\(\sqrt{2}\)), parametric (MLE), semi-parametric (ROS), and nonparametric (KM) methods. The sample size was increased from 20 to 300 in steps of 10. Performance was compared between the uncensored and the censored datasets on the basis of deviations from the median, the mean, and the SD. The simulation results show that the LOD/\(\sqrt{2}\) and ROS methods give better results than the other methods for deviation from the mean across sample sizes and censoring rates, while ROS gives better results than the other methods for deviation from the median at almost all sample sizes and almost all censoring rates.
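One replicate of the kind of simulation described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' code: it covers only the log-normal case, the two substitution rules, and a simplified single-limit ROS, and it leaves out MLE and KM. The function names, the seed, the log-normal parameters (0, 1), the choice of the smallest detected value as the LOD, and the Blom plotting positions used in `ros` are all assumptions made for the example.

```python
import math
import random
import statistics
from statistics import NormalDist


def left_censor(values, rate):
    """Censor the lowest `rate` fraction; return (lod, detected, n_censored).

    Assumption: the smallest detected value plays the role of the LOD.
    """
    ordered = sorted(values)
    n_censored = int(round(rate * len(ordered)))
    lod = ordered[n_censored]
    return lod, ordered[n_censored:], n_censored


def substitute(detected, n_censored, fill):
    """Substitution: replace every censored observation by a constant."""
    return [fill] * n_censored + list(detected)


def ros(detected, n_censored):
    """Simplified single-limit ROS (hypothetical helper, log-normal assumption).

    Regress log(detected) on normal quantiles of their plotting positions,
    then impute the censored ranks from the fitted line.
    """
    n = n_censored + len(detected)
    # Blom plotting positions for ranks 1..n (an assumed choice).
    z = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    z_cens, z_det = z[:n_censored], z[n_censored:]
    log_det = [math.log(v) for v in detected]  # detected is sorted ascending
    mz, my = statistics.fmean(z_det), statistics.fmean(log_det)
    slope = (sum((a - mz) * (b - my) for a, b in zip(z_det, log_det))
             / sum((a - mz) ** 2 for a in z_det))
    intercept = my - slope * mz
    imputed = [math.exp(intercept + slope * q) for q in z_cens]
    return imputed + list(detected)


def deviations(full, estimate):
    """Deviation of mean, median, and SD from the uncensored sample."""
    return {
        "mean": statistics.fmean(estimate) - statistics.fmean(full),
        "median": statistics.median(estimate) - statistics.median(full),
        "sd": statistics.stdev(estimate) - statistics.stdev(full),
    }


rng = random.Random(0)  # fixed seed so the sketch is reproducible
full = [rng.lognormvariate(0.0, 1.0) for _ in range(200)]
lod, detected, k = left_censor(full, 0.25)  # 25% left-censoring

results = {
    "LOD": deviations(full, substitute(detected, k, lod)),
    "LOD/sqrt(2)": deviations(full, substitute(detected, k, lod / math.sqrt(2))),
    "ROS": deviations(full, ros(detected, k)),
}
for name, dev in results.items():
    print(name, {key: round(val, 4) for key, val in dev.items()})
```

A full study along the lines of the abstract would repeat this over many replicates, over sample sizes from 20 to 300, over the four censoring rates, and over the exponential and Weibull distributions as well. Note that with a censoring rate below 50%, constant substitution leaves the sample median unchanged, so differences between methods in median deviation only appear at the higher censoring rates.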

Author information

Correspondence to Mustafa Agah Tekindal.

Cite this article

Tekindal, M.A., Erdoğan, B.D. & Yavuz, Y. Evaluating Left-Censored Data Through Substitution, Parametric, Semi-parametric, and Nonparametric Methods: A Simulation Study. Interdiscip Sci Comput Life Sci 9, 153–172 (2017). https://doi.org/10.1007/s12539-015-0132-9

  • DOI: https://doi.org/10.1007/s12539-015-0132-9
