A comparison of location estimators for interlaboratory data contaminated with value and uncertainty outliers

Duewer, David Lee

doi:10.1007/s00769-008-0360-3

General Paper
Published: 15 February 2008

Volume 13, pages 193–216, (2008)
Accreditation and Quality Assurance

Abstract

While estimation of measurement uncertainty (MU) is increasingly acknowledged as an essential component of the chemical measurement process, there is little agreement on how best to use even nominally well-estimated MU. There are philosophical and practical issues involved in defining what is “best” for a given data set; however, there is remarkably little guidance on how well different MU-using estimators perform with imperfect data. This report characterizes the bias, efficiency, and robustness properties for several commonly used or recently proposed estimators of true location, μ, using “Monte Carlo” (MC) evaluation of “measurement” data sets drawn from well-defined distributions. These synthetic models address a number of issues pertinent to interlaboratory comparison studies. While the MC results do not provide specific guidance on “which estimator is best” for any given set of real data, they do provide broad insight into the expected relative performance within broadly defined scenarios. Perhaps the broadest and most emphatic guidance from the present study is that (1) well-estimated measurement uncertainties can be used to improve the reliability of location determination and (2) some approaches to using measurement uncertainties are better than others. The traditional inverse squared uncertainty-weighted estimators perform well only in the absence of unrepresentative values (value outliers) or underestimated uncertainties (uncertainty outliers); even modest contamination by such outliers may result in relatively inaccurate estimates. In contrast, some inverse total variance-weighted estimators and probability density function area-based estimators perform well for all scenarios evaluated, including underestimated uncertainties, extreme value outliers, and asymmetric contamination.
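The sensitivity of inverse squared uncertainty weighting to a single underestimated uncertainty can be illustrated with a small Monte Carlo sketch. This is not the article's simulation design: the scenario (ten laboratories drawing from N(μ, σ), one of which reports a tenfold understated uncertainty), the function names `iu2_weighted_mean` and `simulate`, and all parameter values are assumptions chosen only for illustration. The unweighted median is used as a simple MU-ignoring comparison.

```python
import random
import statistics


def iu2_weighted_mean(values, uncertainties):
    """Inverse squared uncertainty weighted mean, with w_i = 1 / u(x_i)**2."""
    weights = [1.0 / u**2 for u in uncertainties]
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


def simulate(n=10, n_mc=2000, mu=0.0, sigma=1.0, understate=0.1, seed=1):
    """Return the RMS error of the IU2-weighted mean and of the median.

    Hypothetical scenario: every value is drawn from N(mu, sigma), but the
    first laboratory reports an uncertainty of understate * sigma while the
    others report sigma (one "uncertainty outlier" per data set).
    """
    rng = random.Random(seed)
    sq_err_weighted = 0.0
    sq_err_median = 0.0
    for _ in range(n_mc):
        x = [rng.gauss(mu, sigma) for _ in range(n)]
        u = [sigma] * n
        u[0] = understate * sigma  # the single understated uncertainty
        sq_err_weighted += (iu2_weighted_mean(x, u) - mu) ** 2
        sq_err_median += (statistics.median(x) - mu) ** 2
    return (sq_err_weighted / n_mc) ** 0.5, (sq_err_median / n_mc) ** 0.5


rmse_weighted, rmse_median = simulate()
print(f"IU2-weighted mean RMSE: {rmse_weighted:.3f}")
print(f"median RMSE:            {rmse_median:.3f}")
```

Under these assumptions the laboratory with the understated uncertainty receives 100 times the weight of any other, so the weighted mean largely tracks that one value and its error should come out well above that of the median.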

Abbreviations

CCQM :: Comité Consultatif pour la Quantité de Matière
ITV :: Inverse total variance
IU² :: Inverse squared uncertainty
KC :: Key comparison
Lp :: Least power
MADe :: Median absolute deviation from the median, expressed as a standard deviation
MC :: Monte Carlo
MM :: Mixture model
MU :: Measurement uncertainty
n :: Number of measurements in a data set
n_BS :: Number of bootstrap pseudo-data sets
n_MC :: Number of Monte Carlo simulation data sets
N(μ, σ) :: Normal (Gaussian) distribution having mean μ and standard deviation σ
NMI :: National Metrology Institute
P :: Level of confidence
PC :: Principal component
PDF :: Probability density function
U_I[ℓ, u] :: Uniform (rectangular) distribution of integers having lower limit ℓ and upper limit u
U_R[ℓ, u] :: Uniform (rectangular) distribution of real numbers having lower limit ℓ and upper limit u
s :: Estimate of dispersion, expressed as a standard deviation
\( s(\hat{\mu}) \) :: Estimate of the variability of an estimator on replicate sampling of a population, expressed as a standard deviation
u(x_i) :: Uncertainty component of the ith measurement, expressed as a standard deviation
U_P(x_i) :: Uncertainty component of the ith measurement, expressed as a coverage interval providing approximately P% level of confidence
w_i :: Weight in a given calculation given to the ith measurement
\( \bar{x} \) :: Arithmetic mean
x_i :: Value component of the ith measurement
μ :: True location of a population
\( \hat{\mu} \) :: Estimate of location
σ :: True dispersion of a population, expressed as a standard deviation
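
For orientation, two of the quantities listed above have widely used textbook forms, written here in the list's own symbols. Whether the article adopts exactly these forms is an assumption, since its full text is not reproduced on this page; the factor 1.4826 is the usual scaling that makes the MADe consistent with σ for normally distributed data.

```latex
% Inverse squared uncertainty (IU^2) weighted mean
\[
  w_i = \frac{1}{u^2(x_i)}, \qquad
  \hat{\mu} = \frac{\sum_{i=1}^{n} w_i \, x_i}{\sum_{i=1}^{n} w_i}
\]

% Median absolute deviation from the median, scaled to a standard deviation
\[
  \mathrm{MADe} = 1.4826 \times
  \operatorname{median}_i \bigl| x_i - \operatorname{median}_j (x_j) \bigr|
\]
```

With equal uncertainties the weights cancel and the first expression reduces to the arithmetic mean \( \bar{x} \).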

Acknowledgments

I thank David L. Banks of Duke University for suggesting the investigation of the shorth and its mixture model analogues; Steven L.R. Ellison of LGC for his generous advice and mathematical insights, and for making the RobStat spreadsheet software freely available through the Analytical Methods Committee of the Royal Society of Chemistry; and James J. Filliben and N. Alan Heckert of the Statistical Engineering Division of NIST for developing and maintaining the freely available Dataplot graphical data analysis system. I particularly thank Jim for his critical and, on the whole, encouraging review of this work.

Author information

Authors and Affiliations

Analytical Chemistry Division, Stop 8390, Chemical Science and Technology Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, 20899-8390, USA
David Lee Duewer

Corresponding author

Correspondence to David Lee Duewer.

About this article

Cite this article

Duewer, D.L. A comparison of location estimators for interlaboratory data contaminated with value and uncertainty outliers. Accred Qual Assur 13, 193–216 (2008). https://doi.org/10.1007/s00769-008-0360-3

Received: 13 September 2007
Accepted: 11 January 2008
Published: 15 February 2008
Issue Date: May 2008
DOI: https://doi.org/10.1007/s00769-008-0360-3
