Abstract
While estimation of measurement uncertainty (MU) is increasingly acknowledged as an essential component of the chemical measurement process, there is little agreement on how best to use even nominally well-estimated MU. There are philosophical and practical issues involved in defining what is “best” for a given data set; however, there is remarkably little guidance on how well different MU-using estimators perform with imperfect data. This report characterizes the bias, efficiency, and robustness of several commonly used or recently proposed estimators of true location, μ, using “Monte Carlo” (MC) evaluation of “measurement” data sets drawn from well-defined distributions. These synthetic models address a number of issues pertinent to interlaboratory comparison studies. While the MC results do not provide specific guidance on “which estimator is best” for any given set of real data, they do provide broad insight into the expected relative performance within broadly defined scenarios. Perhaps the broadest and most emphatic guidance from the present study is that (1) well-estimated measurement uncertainties can be used to improve the reliability of location determination and (2) some approaches to using measurement uncertainties are better than others. The traditional inverse squared uncertainty-weighted estimators perform well only in the absence of unrepresentative values (value outliers) or underestimated uncertainties (uncertainty outliers); even modest contamination by such outliers may result in relatively inaccurate estimates. In contrast, some inverse total variance-weighted estimators and probability density function area-based estimators perform well for all scenarios evaluated, including underestimated uncertainties, extreme value outliers, and asymmetric contamination.
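The kind of MC evaluation described above can be illustrated with a minimal sketch. It is not the simulation design used in the study: the data set size, the number of MC data sets, the contamination model (one value displaced by 5σ with a stated uncertainty ten times too small), and the function names `iu2_weighted_mean` and `simulate` are all assumptions chosen for illustration. The sketch compares the root-mean-square error about μ of the arithmetic mean, the median, and the inverse squared uncertainty (IU2) weighted mean.

```python
import math
import random
import statistics


def iu2_weighted_mean(values, uncertainties):
    """Inverse squared uncertainty (IU2) weighted mean, w_i = 1 / u(x_i)^2."""
    weights = [1.0 / (u * u) for u in uncertainties]
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


def simulate(n=10, n_mc=2000, mu=0.0, sigma=1.0, contaminated=False, seed=1):
    """Return the RMSE about mu of three location estimators over n_mc data sets.

    Assumed (hypothetical) model: values drawn from N(mu, sigma), each reported
    with u(x_i) = sigma. If contaminated, the first result is replaced by a
    value 5*sigma away from mu with a stated uncertainty of sigma/10, i.e. a
    combined value and uncertainty outlier.
    """
    rng = random.Random(seed)
    estimates = {"mean": [], "median": [], "iu2": []}
    for _ in range(n_mc):
        x = [rng.gauss(mu, sigma) for _ in range(n)]
        u = [sigma] * n
        if contaminated:
            x[0] = mu + 5.0 * sigma
            u[0] = sigma / 10.0
        estimates["mean"].append(statistics.fmean(x))
        estimates["median"].append(statistics.median(x))
        estimates["iu2"].append(iu2_weighted_mean(x, u))
    # Root-mean-square error of each estimator about the true location mu.
    return {
        name: math.sqrt(statistics.fmean([(e - mu) ** 2 for e in est]))
        for name, est in estimates.items()
    }


if __name__ == "__main__":
    print("clean:       ", simulate(contaminated=False))
    print("contaminated:", simulate(contaminated=True))
```

Under these assumptions, the IU2-weighted mean matches the arithmetic mean on the clean data (all weights are equal), whereas a single result with a badly underestimated uncertainty pulls it far from μ, much further than it pulls the median.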
Abbreviations
- CCQM: Comité Consultatif pour la Quantité de Matière
- ITV: Inverse total variance
- IU2: Inverse squared uncertainty
- KC: Key comparison
- Lp: Least power
- MADe: Median absolute deviation from the median, expressed as a standard deviation
- MC: Monte Carlo
- MM: Mixture model
- MU: Measurement uncertainty
- n: Number of measurements in a data set
- \( n_{\mathrm{BS}} \): Number of bootstrap pseudo-data sets
- \( n_{\mathrm{MC}} \): Number of Monte Carlo simulation data sets
- N(μ, σ): Normal (Gaussian) distribution having mean μ and standard deviation σ
- NMI: National Metrology Institute
- P: Level of confidence
- PC: Principal component
- PDF: Probability density function
- UI[ℓ, u]: Uniform (rectangular) distribution of integers having lower limit ℓ and upper limit u
- UR[ℓ, u]: Uniform (rectangular) distribution of real numbers having lower limit ℓ and upper limit u
- s: Estimate of dispersion, expressed as a standard deviation
- \( s(\hat{\mu}) \): Estimate of the variability of an estimator on replicate sampling of a population, expressed as a standard deviation
- \( u(x_i) \): Uncertainty component of the ith measurement, expressed as a standard deviation
- \( U_P(x_i) \): Uncertainty component of the ith measurement, expressed as a coverage interval providing approximately P% level of confidence
- \( w_i \): Weight in a given calculation given to the ith measurement
- \( \bar{x} \): Arithmetic mean
- \( x_i \): Value component of the ith measurement
- μ: True location of a population
- \( \hat{\mu} \): Estimate of location
- σ: True dispersion of a population, expressed as a standard deviation
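Two of the entries above can be made concrete with their usual textbook forms. These expressions are given here as an assumption about the conventional definitions, not as formulas quoted from the article; the subscript IU2 on the estimate is a label introduced for illustration. The IU2-weighted estimate of location uses weights equal to the inverse squared uncertainties:

\[ w_i = \frac{1}{u^2(x_i)}, \qquad \hat{\mu}_{\mathrm{IU2}} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} \]

MADe scales the median absolute deviation so that it estimates σ for normally distributed data:

\[ \mathrm{MADe} = 1.4826 \cdot \operatorname{median}_i \left| x_i - \operatorname{median}_j (x_j) \right| \]

The factor 1.4826 is the reciprocal of the 0.75 quantile of the standard normal distribution.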
Acknowledgments
I thank David L. Banks of Duke University for suggesting the investigation of the shorth and its mixture model analogues; Steven L.R. Ellison of LGC for his generous advice and mathematical insights and for making the RobStat spreadsheet software freely available through the Analytical Methods Committee of the Royal Society of Chemistry; and James J. Filliben and N. Alan Heckert of the Statistical Engineering Division of NIST for developing and maintaining the freely available Dataplot graphical data analysis system. I particularly thank Jim for his critical and, on the whole, encouraging review of this work.
Cite this article
Duewer, D.L. A comparison of location estimators for interlaboratory data contaminated with value and uncertainty outliers. Accred Qual Assur 13, 193–216 (2008). https://doi.org/10.1007/s00769-008-0360-3
DOI: https://doi.org/10.1007/s00769-008-0360-3