Abstract
This study quantifies the degree of bias associated with particular sample sizes and estimation methods. The aim was to determine deviations from the median, the mean, and the standard deviation (SD) at different sample sizes and censoring rates for log-normal, exponential, and Weibull distributions, for both full and censored data. The concept of censoring and the types of censoring are introduced first. Substitution, parametric (maximum likelihood estimation, MLE), nonparametric (Kaplan-Meier, KM), and semi-parametric (regression on order statistics, ROS) methods for evaluating left-censored observations are then described. For the simulations, uncensored datasets were generated from each distribution under different parameter settings and then left-censored at rates of 5, 25, 45, and 65%. The censored observations were estimated with substitution (the limit of detection, LOD, and LOD/\(\sqrt{2}\)), parametric (MLE), semi-parametric (ROS), and nonparametric (KM) methods. The sample size was varied from 20 to 300 in steps of 10. Performance was compared between the uncensored and censored datasets on the basis of deviations from the median, the mean, and the SD. The simulation results show that the LOD/\(\sqrt{2}\) and ROS methods give smaller deviations from the mean than the other methods across sample sizes and censoring rates, while ROS gives smaller deviations from the median than the other methods at almost all sample sizes and almost all censoring rates.
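The simulation design described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It covers only a single log-normal sample, the two substitution rules, and a simplified ROS variant; MLE and KM are omitted. The function names, the parameter values (`mu`, `sigma`, `seed`), the placement of the LOD at the empirical quantile, and the plotting positions i/(n+1) used for ROS are all assumptions made for illustration.

```python
import math
import random
import statistics
from statistics import NormalDist


def left_censor(sample, rate):
    """Split a sample at a single detection limit (assumed design).

    The LOD is taken as the smallest value that remains detected, so that
    about `rate` of the observations fall below it.
    """
    ordered = sorted(sample)
    n_cens = int(round(rate * len(ordered)))
    lod = ordered[n_cens]
    return lod, ordered[n_cens:], n_cens


def substitute(detected, n_cens, value):
    """Replace every censored observation with one constant."""
    return [value] * n_cens + list(detected)


def ros_impute(detected, n_cens):
    """Simplified ROS for one detection limit (illustrative assumption).

    Fits log(detected) against normal scores of the plotting positions
    i/(n+1), then extrapolates the fitted line to the censored ranks.
    """
    n = n_cens + len(detected)
    z = [NormalDist().inv_cdf(i / (n + 1)) for i in range(1, n + 1)]
    z_det = z[n_cens:]
    y_det = [math.log(v) for v in detected]
    z_bar, y_bar = statistics.fmean(z_det), statistics.fmean(y_det)
    sxy = sum((a - z_bar) * (b - y_bar) for a, b in zip(z_det, y_det))
    sxx = sum((a - z_bar) ** 2 for a in z_det)
    slope = sxy / sxx
    intercept = y_bar - slope * z_bar
    imputed = [math.exp(intercept + slope * zi) for zi in z[:n_cens]]
    return imputed + list(detected)


def summary(x):
    """Mean, median, and sample SD of a list of values."""
    return statistics.fmean(x), statistics.median(x), statistics.stdev(x)


def deviations(full, estimate):
    """Estimate minus uncensored value for (mean, median, SD)."""
    return tuple(e - f for e, f in zip(summary(estimate), summary(full)))


def run_once(n=100, rate=0.25, mu=0.0, sigma=1.0, seed=0):
    """Generate one log-normal sample, censor it, and compare methods."""
    rng = random.Random(seed)
    full = [rng.lognormvariate(mu, sigma) for _ in range(n)]
    lod, detected, n_cens = left_censor(full, rate)
    return {
        "LOD": deviations(full, substitute(detected, n_cens, lod)),
        "LOD/sqrt(2)": deviations(
            full, substitute(detected, n_cens, lod / math.sqrt(2))
        ),
        "ROS": deviations(full, ros_impute(detected, n_cens)),
    }


if __name__ == "__main__":
    # Censoring rates taken from the abstract; n and seed are arbitrary.
    for rate in (0.05, 0.25, 0.45, 0.65):
        for method, (d_mean, d_median, d_sd) in run_once(rate=rate).items():
            print(f"rate={rate:.2f} {method:12s} "
                  f"mean={d_mean:+.4f} median={d_median:+.4f} sd={d_sd:+.4f}")
```

A fuller replication would repeat `run_once` over many seeds, over sample sizes from 20 to 300, and over the exponential and Weibull distributions, then average the deviations. A single run such as this one is too noisy to rank the methods.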
Tekindal, M.A., Erdoğan, B.D. & Yavuz, Y. Evaluating Left-Censored Data Through Substitution, Parametric, Semi-parametric, and Nonparametric Methods: A Simulation Study. Interdiscip Sci Comput Life Sci 9, 153–172 (2017). https://doi.org/10.1007/s12539-015-0132-9
DOI: https://doi.org/10.1007/s12539-015-0132-9