Abstract
The performance of a number of robust estimators in the presence of distinct secondary subsets of data is assessed. Estimators examined include the kernel mode recommended by IUPAC, the MM-estimator described by Yohai and, for comparison, the mean, median, and Huber estimate. The performance of the estimators was compared by application to simulated data with one major and one minor mode, and with known minor mode location and proportion of data in the minor mode. The MM-estimator generally performed better than classical and Huber estimates and also provided better precision than the kernel mode at lower minor mode proportions (20% or less). At high minor mode proportion (30%), the kernel density mode provided smaller mean bias and better precision at modest minor mode offsets.
References
ISO 13528:2005 (2005) Statistical methods for use in proficiency testing by interlaboratory comparisons. ISO, Geneva
Analytical Methods Committee (1989) Analyst 114:1693
Thompson M, Ellison SLR, Wood R (2006) Pure Appl Chem 78:145–196
Lowthian PJ, Thompson M (2002) Analyst 127:1359–1364
Huber PJ (1981) Robust statistics. Wiley, New York
Maronna RA, Martin RD, Yohai VJ (2006) Robust statistics: theory and methods. Wiley, Chichester
Yohai VJ (1987) Ann Stat 15:642–656
R Development Core Team (2008) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. ISBN 3-900051-07-0, URL http://www.R-project.org
Venables WN, Ripley BD (2002) Modern applied statistics with S, 4th edn. Springer, New York. ISBN 0-387-95457-0
Scott DW (1992) Multivariate density estimation: theory, practice, and visualization. Wiley, Chichester
Rousseeuw PJ, Croux C (1993) J Am Stat Assoc 88:1273–1283
ISO/TS 20612:2007 (2007) Water quality—interlaboratory comparisons for proficiency testing of analytical chemistry laboratories. ISO, Geneva
Acknowledgments
Preparation of this paper was supported under contract with the UK Department for Innovation, Universities and Skills National Measurement System (NMS) Chemical and Biological Metrology Programme. The author would additionally like to thank Dr Antonio Possolo (NIST, USA) for suggesting the possible use of MM-estimators in inter-laboratory studies.
Additional information
Presented at the Eurachem PT Workshop, October 2008, Rome, Italy.
Appendix: Formulae and algorithm for MM-estimation
Tukey’s bisquare
Tukey’s bisquare is defined by setting ρ in equation 1 to
\( \rho (z) = \begin{cases} 1 - \left[ 1 - (z/c)^{2} \right]^{3}, & \left| z \right| \le c \\ 1, & \left| z \right| > c \end{cases} \)
where \( z = (x - \hat{\mu })/\hat{\sigma } \), \( \hat{\mu } \) is the robust estimate of location and \( \hat{\sigma } \) a robust estimate of scale. This leads to the weight function
\( w(z) = \left[ 1 - \min \left( 1, (z/c)^{2} \right) \right]^{2} \)
where c is a tuning parameter which takes the values shown in Table 1 for different desired efficiencies.
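As an illustration, the weight function above can be written directly (a minimal sketch in Python with numpy, rather than the R environment used for the study; the function name is ours):

```python
import numpy as np

def bisquare_weight(x, mu, sigma, c):
    """Tukey bisquare weights: w = [1 - min(1, ((x - mu)/(c*sigma))^2)]^2.

    Points within c robust standard deviations of mu get weights in (0, 1];
    points beyond c*sigma get weight exactly zero.
    """
    z2 = np.minimum(1.0, ((np.asarray(x, dtype=float) - mu) / (c * sigma)) ** 2)
    return (1.0 - z2) ** 2
```

A point at the location estimate itself receives weight 1; any point further than c scale units away is rejected entirely, which is what gives the estimator its bounded influence.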
MM-estimate for location with equal prior weights
Yohai’s “MM-estimate” uses simple initial estimates of scale and location as input to an S-estimate, which refines both, providing a robust estimate of scale with high breakdown point and a somewhat improved estimate of location. In a subsequent step, the S-estimate of scale is used in an M-estimate of location, which applies the bisquare weight function above with the tuning parameter set to the desired higher efficiency. The implementation used for the simulation study above is that of Venables and Ripley [9], which provides MM-estimates for the general case of regression with (optionally) differing prior weights. For the simpler case of a single location estimate with equal prior weights, as in the present comparison, the procedure given below to illustrate the method provides essentially identical results within rounding and convergence error.
The two principal steps are:
(a) Obtain initial S-estimates of scale and location.
For n data with no reported uncertainties (and therefore equal prior weights) the initial S-estimate can be implemented by iterative reweighting as follows:
(1) set \( w_{i} = \left[ 1 - \min \left( 1, \left( \frac{x_{i} - \hat{\mu }_{-1}}{k_{0} \hat{\sigma }_{-1}} \right)^{2} \right) \right]^{2} \), where \( k_{0} \) is a tuning constant (set to 1.548 in the present implementation) and \( \hat{\mu }_{-1} \) and \( \hat{\sigma }_{-1} \) are the previous or initial estimates of location and scale, respectively. For a simple location estimate without reported uncertainties, initial values can be set from the median and the scaled median absolute deviation (described as MADE in reference [2]), respectively.
(2) set \( \hat{\mu }_{0} = \sum {w_{i} x_{i} } / \sum {w_{i} } \)
(3) set \( u_{i} = \left[ \left( x_{i} - \hat{\mu }_{0} \right) / k_{0} \hat{\sigma }_{-1} \right]^{2} \)
(4) set \( \hat{\sigma }_{0} = \hat{\sigma }_{-1} \sqrt{ 2\sum {\min \left( 1,\,3u_{i} - 3u_{i}^{2} + u_{i}^{3} \right)} / (n - 1) } \)
Steps (1)–(4) are repeated until the value of \( \hat{\sigma }_{0} \) converges. For the implementation in the present paper, the value was considered to have converged when \( \left| 1 - \hat{\sigma }_{0} / \hat{\sigma }_{-1} \right| < 10^{-5} \). The choice of \( k_{0} \) provides a scale estimate with 50% breakdown.
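Steps (1)–(4) can be sketched as follows (a minimal Python/numpy illustration under the equal-prior-weights assumption; the study itself used R, and the function name, iteration cap, and MAD scaling constant 1.4826 here are ours):

```python
import numpy as np

def s_estimate(x, k0=1.548, tol=1e-5, max_iter=500):
    """Initial S-estimates of location and scale by iterative reweighting,
    following steps (1)-(4) above. Starting values are the median and the
    scaled median absolute deviation (1.4826 * MAD)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    mu = np.median(x)
    sigma = 1.4826 * np.median(np.abs(x - mu))
    for _ in range(max_iter):
        # step (1): bisquare weights using the previous location and scale
        z2 = np.minimum(1.0, ((x - mu) / (k0 * sigma)) ** 2)
        w = (1.0 - z2) ** 2
        # step (2): weighted-mean update of the location
        mu = np.sum(w * x) / np.sum(w)
        # step (3): squared scaled deviations from the updated location
        u = ((x - mu) / (k0 * sigma)) ** 2
        # step (4): scale update for 50% breakdown; rho is capped at 1
        rho = np.minimum(1.0, 3 * u - 3 * u ** 2 + u ** 3)
        sigma_new = sigma * np.sqrt(2.0 * np.sum(rho) / (n - 1))
        converged = abs(1.0 - sigma_new / sigma) < tol
        sigma = sigma_new
        if converged:
            break
    return mu, sigma
```

With a clear outlier in a small data set, the weight in step (1) falls to zero for the outlying point, so the returned location sits close to the centre of the main body of the data.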
(b) Improve the location estimate.
The final stage uses the scale and location estimates \( \hat{\sigma }_{0} \) and \( \hat{\mu }_{0} \) from the S-estimate, as follows:
(1) set \( \hat{\mu } = \hat{\mu }_{0} \)
(2) set \( w_{i} = \left[ 1 - \min \left( 1, \left( \frac{x_{i} - \hat{\mu }}{c \hat{\sigma }_{0}} \right)^{2} \right) \right]^{2} \), where c is chosen from Table 1 for the desired efficiency (for example, for 95% efficiency c = 4.69)
(3) recalculate \( \hat{\mu } \) from \( \hat{\mu } = \sum {w_{i} x_{i} } / \sum {w_{i} } \)
Steps (2) and (3) are repeated until the value of \( \hat{\mu } \) converges. In the present implementation, the value was considered to have converged if the change in \( \hat{\mu } \) was less than \( 10^{-5} \hat{\sigma }_{0} / \sqrt{n} \).
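The final M-step, taking the S-estimates of location and scale as input, can be sketched as follows (again a Python/numpy illustration rather than the R implementation used in the paper; the function name and iteration cap are ours):

```python
import numpy as np

def m_step_location(x, mu0, sigma0, c=4.69, tol=1e-5, max_iter=500):
    """Refine the location estimate by steps (1)-(3) above, holding the
    S-estimate of scale sigma0 fixed. c = 4.69 gives roughly 95% efficiency
    at the normal distribution."""
    x = np.asarray(x, dtype=float)
    mu = mu0  # step (1): start from the S-estimate of location
    for _ in range(max_iter):
        # step (2): bisquare weights at the current location estimate
        z2 = np.minimum(1.0, ((x - mu) / (c * sigma0)) ** 2)
        w = (1.0 - z2) ** 2
        # step (3): weighted-mean update of the location
        mu_new = np.sum(w * x) / np.sum(w)
        if abs(mu_new - mu) < tol * sigma0 / np.sqrt(len(x)):
            return mu_new
        mu = mu_new
    return mu
```

Because the scale is held fixed at the high-breakdown S-estimate while the weights use the larger tuning constant c, the refined location keeps the 50% breakdown of the S-step while gaining the higher efficiency associated with c.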
Cite this article
Ellison, S.L.R. Performance of MM-estimators on multi-modal data shows potential for improvements in consensus value estimation. Accred Qual Assur 14, 411–419 (2009). https://doi.org/10.1007/s00769-009-0571-2