
Comparison of biased and unbiased estimators of variances of qualitative and semi-quantitative results of testing

Accreditation and Quality Assurance

Abstract

Unbiased estimators of the within-laboratory and between-laboratory (or within reference material unit and between-unit) variances of qualitative and semi-quantitative test results are formulated and discussed. Qualitative and semi-quantitative test results were treated as binary nominal and ordinal values, respectively, in the framework of the newly developed ordinal analysis of variance (ORDANOVA). It is shown that the difference between the unbiased and the biased estimators of the within-laboratory variance does not exceed 5 % when the number of replicate tests in a laboratory is larger than 20. This difference increases as the number of replicates decreases, and it does not depend on the number of laboratories or on the between-laboratory variation, since both the unbiased and the biased estimators are based on the averaged within-laboratory variances. The difference between the unbiased and the biased estimators of the between-laboratory variance depends not only on the number of replicates, but also on the number of laboratories and on the ratio of the contributions to the total variance (the between-laboratory variance and the averaged within-laboratory variance). This difference does not exceed 5 % when the number of replicates and the number of laboratories are both larger than 20 and the ratio of the between-laboratory to the averaged within-laboratory variances does not yield 1. For an experiment of limited size (smaller numbers of replicates and laboratories), the difference increases as the size decreases and can be significant.

References

  1. Bashkansky E, Gadrich T, Kuselman I (2012) Interlaboratory comparison of test results of an ordinal or nominal binary property: analysis of variation. Accred Qual Assur 17:239–243

  2. Goldschmidt H, Libeer JC, De Bièvre P, Schimmel H, Petersen PH (2004) How far can the concepts of traceability and GUM/VIM can be applied to measurement results in laboratory medicine. Accred Qual Assur 9:125–127

  3. Gadrich T, Bashkansky E (2012) ORDANOVA: analysis of ordinal variation. J Stat Plan Inference 142:3174–3188

  4. Wehling P, LaBudde RA, Brunelle SL, Nelson MT (2011) Probability of detection (POD) as a statistical model for the validation of qualitative methods. J AOAC Int 94:335–347

  5. Blair J, Lacy MG (2000) Statistics of ordinal variation. Sociol Methods Res 28:251–280

  6. Franceschini F, Galetto M, Varetto M (2005) Ordered samples control charts for ordinal variables. Qual Reliab Eng Int 21:177–195

  7. Bashkansky E, Gadrich T (2008) Evaluating quality measured on a ternary ordinal scale. Qual Reliab Eng Int 24:957–971

  8. Uhlig S, Niewohner L, Gowik P (2011) Can the usual validation standard series for quantitative methods, ISO 5725, be also applied for qualitative methods? Accred Qual Assur 16:533–537

  9. Light RJ, Margolin BH (1971) An analysis of variance for categorical data. J Am Stat Assoc 66:534–544

  10. Gibbs JP, Poston DL Jr (1975) The division of labor: conceptualization and related measures. Soc Forces 53:468–476

  11. Cardenas S, Valcárcel M (2005) Analytical features in qualitative analysis. Trends Anal Chem 24:477–487

  12. Kuselman I, Fajgelj A (2010) IUPAC/CITAC Guide: selection and use of proficiency testing schemes for a limited number of participants—chemical analytical laboratories (IUPAC Technical Report). Pure Appl Chem 82:1099–1135

Author information

Correspondence to Ilya Kuselman.

Appendices

Appendix 1

Lemma 1

The expectation of the within-laboratory variance estimator is:

$$ E\left[ {\hat{h}_{\left( {\rm{W}} \right)}^{2} } \right] = E\left[ {\frac{4}{M}\sum\limits_{i = 1}^{M} {\hat{p}_{i} \left( {1 - \hat{p}_{i} } \right)} } \right] = 4\left( {1 - \frac{1}{n}} \right){p{_\bullet}} \left( {1 - {p{_\bullet}} } \right) - \left( {1 - \frac{1}{n}} \right)\frac{4}{M}\sum\limits_{i = 1}^{M} {\left( {p_{i} - {p{_\bullet}} } \right)^{2} } $$

Proof

$$ E\left[ {\hat{h}_{\left({\rm{W}} \right)}^{2} } \right] = E\left[ {\frac{4}{M}\sum\limits_{i = 1}^{M} {\hat{p}_{i} \left( {1 - \hat{p}_{i} } \right)} } \right] = 4\left[ {{p{_\bullet}} - \frac{1}{M}\sum\limits_{i = 1}^{M} {\frac{{p_{i} \left( {1 - p_{i} } \right)}}{n}} - \frac{1}{M}\sum\limits_{i = 1}^{M} {p_{i}^{2} } } \right] = 4\left[ {\left( {1 - \frac{1}{n}} \right){p{_\bullet}} \left( {1 - {p{_\bullet}} } \right) - \left( {1 - \frac{1}{n}} \right)\frac{1}{M}\sum\limits_{i = 1}^{M} {\left( {p_{i} -{p{_\bullet}} } \right)^{2} } } \right] $$
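Lemma 1 can be checked numerically. The following is a minimal sketch, assuming that each laboratory's number of positive results is an independent Binomial(n, p_i) count; the function names and the probabilities in `p` are hypothetical and chosen only for illustration. It computes the expectation exactly by summing over the binomial distribution and compares it with the closed form of the lemma.

```python
from math import comb


def expected_biased_within(p, n):
    """Exact E[(4/M) * sum_i p_hat_i * (1 - p_hat_i)] for Binomial(n, p_i) counts."""
    total = 0.0
    for p_i in p:
        for x in range(n + 1):
            prob = comb(n, x) * p_i**x * (1 - p_i) ** (n - x)
            p_hat = x / n
            total += prob * p_hat * (1 - p_hat)
    return 4 * total / len(p)


def lemma1_rhs(p, n):
    """Closed form of Lemma 1: 4(1 - 1/n)[p.(1 - p.) - (1/M) sum_i (p_i - p.)^2]."""
    M = len(p)
    p_bar = sum(p) / M
    spread = sum((p_i - p_bar) ** 2 for p_i in p) / M
    return 4 * (1 - 1 / n) * (p_bar * (1 - p_bar) - spread)


p = [0.2, 0.5, 0.9]  # hypothetical per-laboratory probabilities
n = 5                # hypothetical number of replicates
lhs = expected_biased_within(p, n)
rhs = lemma1_rhs(p, n)
# Population within-laboratory variance, (4/M) * sum_i p_i (1 - p_i)
h_w_true = 4 * sum(p_i * (1 - p_i) for p_i in p) / len(p)
print(lhs, rhs, h_w_true)
```

In this sketch the exact expectation equals (1 - 1/n) times the population value `h_w_true`, so the bias depends only on n, not on the number of laboratories.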

Lemma 2

The expectation of the between-laboratory variance estimator is:

$$E\left( {S_{\left({\rm{B}} \right)}^{2} } \right) = E\left( {\frac{4}{M}\sum\limits_{i = 1}^{M} {\left( {\hat{p}_{i} - \hat{p}_{\bullet} } \right)^{2} } } \right) = \left[ {1 - \frac{1}{n}\left( {1 - \frac{1}{M}} \right)} \right]\frac{4}{M}\sum\limits_{i = 1}^{M} {\left( {p_{i} - {p{_\bullet}} } \right)^{2} } + \frac{4}{n}\left( {1 - \frac{1}{M}} \right){p{_\bullet}} \left( {1 - {p{_\bullet}} } \right) $$

Proof

$$ E\left( {S_{\left({\rm{B}} \right)}^{2} } \right) = E\left( {\frac{4}{M}\sum\limits_{i = 1}^{M} {\left( {\hat{p}_{i} - \hat{p}_{\bullet} } \right)^{2} } } \right) = \frac{4}{M}\sum\limits_{i = 1}^{M} {\left\{ {\frac{{p_{i} \left( {1 - p_{i} } \right)}}{n} + p_{i}^{2} } \right\} - 4\left\{ {\frac{1}{{M^{2} }}\sum\limits_{i = 1}^{M} {\frac{{p_{i} \left( {1 - p_{i} } \right)}}{n}} + {p{_\bullet}}^{2} } \right\}} = 4\left[ {1 - \frac{1}{n}\left( {1 - \frac{1}{M}} \right)} \right]\frac{1}{M}\sum\limits_{i = 1}^{M} {\left( {p_{i} - {p{_\bullet}} } \right)^{2} } + \frac{4}{n}\left( {1 - \frac{1}{M}} \right){p{_\bullet}} \left( {1 - {p{_\bullet}} } \right) $$

Lemma 3

The expectation of total variance is equal to:

$$ E\left[ {\hat{h}_{\left({\rm{T}} \right)}^{2} } \right] = E\left[ {4{\hat{p}{_\bullet}} \left( {1 - {\hat{p}{_\bullet}} } \right)} \right] = 4\left( {1 - \frac{1}{Mn}} \right){p{_\bullet}} \left( {1 - {p{_\bullet}} } \right) + \frac{4}{{M^{2} n}}\sum\limits_{i = 1}^{M} {\left( {p_{i} - {p{_\bullet}} } \right)^{2} } $$

Proof

$$ E\left[ {\hat{h}_{\left({\rm{T}} \right)}^{2} } \right] = E\left[ {4{\hat{p}{_\bullet}} \left( {1 -{\hat{p}{_\bullet}} } \right)} \right] = 4{p{_\bullet}} \left( {1 - {p{_\bullet}} } \right) - \frac{4}{{M^{2} }}\sum\limits_{i = 1}^{M} {\frac{{p_{i} \left( {1 - p_{i} } \right)}}{n}} = 4\left( {1 - \frac{1}{Mn}} \right){p{_\bullet}} \left( {1 - {p{_\bullet}} } \right) + \frac{4}{{M^{2} n}}\sum\limits_{i = 1}^{M} {\left( {p_{i} - {p{_\bullet}} } \right)^{2} } $$
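The expectations in Lemmas 2 and 3 can be verified in the same spirit. The sketch below assumes independent Binomial(n, p_i) counts and enumerates every possible vector of counts, which is feasible only for small M and n; all names and the example values are hypothetical.

```python
from itertools import product
from math import comb


def binom_pmf(x, n, p):
    """Probability of x positives among n independent tests with probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def exact_expectations(p, n):
    """Exact (E[S_B^2], E[h_T^2]) by enumerating all (n + 1)^M count vectors."""
    M = len(p)
    e_sb = e_ht = 0.0
    for counts in product(range(n + 1), repeat=M):
        prob = 1.0
        for x, p_i in zip(counts, p):
            prob *= binom_pmf(x, n, p_i)
        p_hat = [x / n for x in counts]
        p_hat_bar = sum(p_hat) / M
        e_sb += prob * 4 / M * sum((q - p_hat_bar) ** 2 for q in p_hat)
        e_ht += prob * 4 * p_hat_bar * (1 - p_hat_bar)
    return e_sb, e_ht


def lemma_formulas(p, n):
    """Closed forms of Lemma 2 and Lemma 3."""
    M = len(p)
    p_bar = sum(p) / M
    sigma_b = 4 / M * sum((p_i - p_bar) ** 2 for p_i in p)
    h = 4 * p_bar * (1 - p_bar)
    lemma2 = (1 - (1 - 1 / M) / n) * sigma_b + (1 - 1 / M) / n * h
    lemma3 = (1 - 1 / (M * n)) * h + sigma_b / (M * n)
    return lemma2, lemma3


p = [0.1, 0.4, 0.8]  # hypothetical per-laboratory probabilities
n = 4                # hypothetical number of replicates
e_sb, e_ht = exact_expectations(p, n)
lemma2, lemma3 = lemma_formulas(p, n)
print(e_sb, lemma2)
print(e_ht, lemma3)
```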

Lemma 4

Comparing the expressions in Lemmas 2 and 3 yields the following relation, on which the unbiased estimator of the between-laboratory variance \( \sigma_{\left({\rm{B}} \right)}^{2} = \frac{4}{M}\sum\nolimits_{i = 1}^{M} {\left( {p_{i} - {p{_\bullet}} } \right)^{2} } \) is based:

$$ \sigma_{\left({\rm{B}} \right)}^{2} = \frac{Mn - 1}{{M\left( {n - 1} \right)}}E\left( {S_{\left({\rm{B}} \right)}^{2} } \right) - \frac{M - 1}{{M\left( {n - 1} \right)}}E\left( {\hat{h}_{\left({\rm{T}} \right)}^{2} } \right) $$

Proof

From Lemma 2, it follows that \( E\left( {S_{\left({\rm{B}} \right)}^{2} } \right) = \left[ {1 - \frac{1}{n}\left( {1 - \frac{1}{M}} \right)} \right]\sigma_{\left({\rm{B}} \right)}^{2} + \frac{4}{n}\left( {1 - \frac{1}{M}} \right){p{_\bullet}} \left( {1 - {p{_\bullet}} } \right) \), while from Lemma 3, we get:

$$ E\left[ {\hat{h}_{\left({\rm{T}} \right)}^{2} } \right] = \left( {1 - \frac{1}{Mn}} \right)4{p{_\bullet}} \left( {1 - {p{_\bullet}} } \right) + \frac{1}{Mn}\sigma_{\left({\rm{B}} \right)}^{2}.$$

Substituting the second expression into the first one and solving for \( \sigma_{\left({\rm{B}} \right)}^{2} \) gives the required result.
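The elimination step of Lemma 4 can be written out explicitly. As a sketch, let \( H = 4{p{_\bullet}}\left( 1 - {p{_\bullet}} \right) \), a shorthand introduced here only for brevity. Solving the Lemma 3 expression for H gives:

$$ H = \frac{Mn\,E\left[ \hat{h}_{\left({\rm{T}}\right)}^{2} \right] - \sigma_{\left({\rm{B}}\right)}^{2}}{Mn - 1}. $$

Inserting this into the Lemma 2 expression and collecting the terms in \( \sigma_{\left({\rm{B}}\right)}^{2} \), with the help of the identity \( (Mn - M + 1)(Mn - 1) - (M - 1) = M^{2}n(n - 1) \), leads to:

$$ E\left( S_{\left({\rm{B}}\right)}^{2} \right) = \frac{M\left( n - 1 \right)}{Mn - 1}\sigma_{\left({\rm{B}}\right)}^{2} + \frac{M - 1}{Mn - 1}E\left[ \hat{h}_{\left({\rm{T}}\right)}^{2} \right]. $$

Solving this for \( \sigma_{\left({\rm{B}}\right)}^{2} \) reproduces the coefficients \( \frac{Mn - 1}{M(n - 1)} \) and \( \frac{M - 1}{M(n - 1)} \) in the statement of the lemma.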

Appendix 2

Rearranging the elements in formula (11) yields:

$$ S_{\text{u1(B)}}^{ 2} = \frac{Mn - 1}{M(n - 1)}S_{{({\text{B}})}}^{2} - \frac{M - 1}{M(n - 1)}\hat{h}_{{({\text{T}})}}^{2} = \frac{{df_{\text{T}}}}{{df_{\text{W}}}}S_{{({\text{B}})}}^{2} - \frac{{df_{\text{B}}}}{{df_{\text{W}}}}\hat{h}_{{({\text{T}})}}^{2} = \frac{{df_{\text{B}}}}{{df_{\text{W}}}}\hat{h}_{{({\text{T}})}}^{2} \left[ {I - 1} \right], $$

where \( df_{\text{T}} = Mn - 1 \), \( df_{\text{B}} = M - 1 \) and \( df_{\text{W}} = M(n - 1) \) are the degrees of freedom that appear in the coefficients above, and the indicator \( I = [ S_{({\rm{B}})}^{2} /df_{\text{B}} ]/[ \hat{h}_{({\rm{T}})}^{2} /df_{\text{T}} ] \) was defined in Ref. [1]. Since the null hypothesis \( H_{0} \) was rejected (test results obtained in different laboratories are not equivalent), the indicator exceeds unity. Thus, the estimator \( S_{\text{u1(B)}}^{ 2} \) is positive.

The same holds for the unbiased estimator given by formula (12):

$$ S_{\text{u2(B)}}^{2} = S_{({\text{B}})}^{2} - \frac{M - 1}{M(n - 1)}\hat{h}_{({\text{W}})}^{2} = S_{({\text{B}})}^{2} - \frac{df_{\text{B}}}{df_{\text{W}}}\hat{h}_{({\text{W}})}^{2} = \frac{df_{\text{B}}}{df_{\text{W}}}\hat{h}_{({\text{W}})}^{2} \left[ \frac{S_{({\text{B}})}^{2} / df_{\text{B}}}{\hat{h}_{({\text{W}})}^{2} / df_{\text{W}}} - 1 \right]. $$

More details on the indicator I are available in Ref. [1].
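The two rearrangements above can be illustrated on data. The following is a minimal sketch for the binary case, assuming M laboratories with n replicates each; the function name, the dictionary keys and the counts are hypothetical. It evaluates both unbiased estimators and the indicator I from the sample quantities used in "Appendix 1".

```python
def between_lab_estimators(positives, n):
    """Sample variance components and unbiased between-laboratory estimators.

    positives: number of positive results per laboratory (hypothetical data)
    n: number of replicate tests in each laboratory (n > 1)
    """
    M = len(positives)
    p_hat = [x / n for x in positives]
    p_bar = sum(p_hat) / M
    s_b = 4 / M * sum((q - p_bar) ** 2 for q in p_hat)  # biased between-laboratory
    h_w = 4 / M * sum(q * (1 - q) for q in p_hat)        # biased within-laboratory
    h_t = 4 * p_bar * (1 - p_bar)                        # total
    if h_t == 0:
        raise ValueError("total variance is zero; the indicator is undefined")
    df_t, df_b, df_w = M * n - 1, M - 1, M * (n - 1)
    return {
        "s_u1": df_t / df_w * s_b - df_b / df_w * h_t,   # formula (11)
        "s_u2": s_b - df_b / df_w * h_w,                 # formula (12)
        "indicator": (s_b / df_b) / (h_t / df_t),
        "h_t": h_t,
        "ratio": df_b / df_w,
    }


res = between_lab_estimators([2, 9, 5, 10], n=10)
print(res)
```

In this sketch the two estimators give the same number, because the sample quantities satisfy the decomposition of the total variance into the within-laboratory and between-laboratory parts. The estimate is positive exactly when the indicator exceeds unity.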

Appendix 3

Proposition

Assume that M laboratories each perform the same number n of measurements or tests on a scale with K categories. Each m-th laboratory (m = 1, 2, … , M) is characterized by its vector of population frequencies \( p_{km} \) belonging to categories k = 1, 2, … , K. Define the total variance and the population parameters of the between-laboratory and within-laboratory variances as:

$$ h_{\left({\rm{T}} \right)}^{2} = \frac{1}{(K - 1)/4}\sum\limits_{k = 1}^{K - 1} {F_{k\bullet} \left( {1 - F_{k\bullet} } \right)} , $$
$$ \sigma_{\left({\rm{B}} \right)}^{2} = \frac{1}{(K - 1)/4}\sum\limits_{k = 1}^{K - 1} {\frac{1}{M}\sum\limits_{m = 1}^{M} {\left( {F_{km} - F_{k\bullet} } \right)^{2} } } , $$
$$ h_{\left({\rm{W}} \right)}^{2} = \frac{1}{M}\sum\limits_{m = 1}^{M} {\frac{1}{(K - 1)/4}\sum\limits_{k = 1}^{K - 1} {F_{km} \left( {1 - F_{km} } \right)} } . $$

From the multinomial distribution (the multi-categorical generalization of the binomial distribution):

$$ E\left( {\hat{p}_{km} } \right) = p_{km} ,\quad {\text{Var}}(\hat{p}_{km} ) = \frac{{p_{km} (1 - p_{km} )}}{n},\quad {\text{Cov}}(\hat{p}_{km} ,\hat{p}_{{k^{\prime}m}} ) = - \frac{{p_{km} p_{{k^{\prime}m}} }}{n}, $$

where Cov is the covariance of the observed frequencies \( \hat{p}_{km} \) and \( \hat{p}_{{k^{\prime}m}} \) belonging to categories k and k′, respectively.

The unbiased estimator for the within-laboratory variance (9) and the unbiased estimators for the between-laboratory variance (11)–(12) are also valid for the multi-categorical case.

Proof

Similar to Lemma 1 in “Appendix 1”, we get:

$$ E\left[ {\hat{h}_{\left({\rm{W}} \right)}^{2} } \right] = E\left[ {\frac{1}{M}\sum\limits_{m = 1}^{M} {\frac{1}{(K - 1)/4}\sum\limits_{k = 1}^{K - 1} {\hat{F}_{km} \left( {1 - \hat{F}_{km} } \right)} } } \right] = \left( {1 - \frac{1}{n}} \right)\frac{1}{M}\sum\limits_{m = 1}^{M} {\frac{1}{(K - 1)/4}\sum\limits_{k = 1}^{K - 1} {F_{km} \left( {1 - F_{km} } \right)} } = \left( {1 - \frac{1}{n}} \right)h_{\left({\rm{W}} \right)}^{2} = \left( {1 - \frac{1}{n}} \right)h_{\left({\rm{T}} \right)}^{2} - \left( {1 - \frac{1}{n}} \right)\sigma_{\left({\rm{B}} \right)}^{2} $$

Similar to Lemma 2 in “Appendix 1”, we have:

$$ E\left( {S_{\left({\rm{B}} \right)}^{2} } \right) = E\left[ {\frac{1}{(K - 1)/4}\sum\limits_{k = 1}^{K - 1} {\frac{1}{M}\sum\limits_{m = 1}^{M} {\left( {\hat{F}_{km} - \hat{F}_{k\bullet} } \right)^{2} } } } \right] = \frac{1}{(K - 1)/4}\sum\limits_{k = 1}^{K - 1} {\left[ {\frac{1}{M}\sum\limits_{m = 1}^{M} {\left\{ {\frac{{F_{km} \left( {1 - F_{km} } \right)}}{n} + F_{km}^{2} } \right\}} - \left\{ {\frac{1}{{M^{2} }}\sum\limits_{m = 1}^{M} {\frac{{F_{km} \left( {1 - F_{km} } \right)}}{n}} + F_{k\bullet}^{2} } \right\}} \right]} = \left[ {1 - \frac{1}{n}\left( {1 - \frac{1}{M}} \right)} \right]\sigma_{\left({\rm{B}} \right)}^{2} + \frac{1}{n}\left( {1 - \frac{1}{M}} \right)h_{\left({\rm{T}} \right)}^{2} $$

The unbiased estimators given by formulas (9) and (11)–(12) are obtained by comparing the above results. More details are available in Ref. [3].
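The multi-categorical quantities can be computed from a table of counts. The sketch below rests on several assumptions: \( \hat{F}_{km} \) is taken to be the cumulative relative frequency of categories 1 to k in laboratory m (this definition is not restated in the appendix), the unbiased within-laboratory estimator is taken as n/(n - 1) times the biased one (inferred from the expectation derived above, since formula (9) is not reproduced here), and the function name and counts are hypothetical.

```python
def ordinal_variance_components(counts):
    """Variance components on an ordinal K-category scale.

    counts[m][k]: number of results of laboratory m in ordered category k.
    Every laboratory must report the same number n > 1 of results.
    """
    M, K = len(counts), len(counts[0])
    n = sum(counts[0])
    F = []  # cumulative relative frequencies for categories 1..K-1 (assumed definition)
    for row in counts:
        if sum(row) != n:
            raise ValueError("all laboratories must have the same number of results")
        acc, cum = 0, []
        for c in row[:-1]:
            acc += c
            cum.append(acc / n)
        F.append(cum)
    F_bar = [sum(F[m][k] for m in range(M)) / M for k in range(K - 1)]
    scale = 4 / (K - 1)
    h_t = scale * sum(f * (1 - f) for f in F_bar)
    s_b = scale * sum(
        sum((F[m][k] - F_bar[k]) ** 2 for m in range(M)) / M for k in range(K - 1)
    )
    h_w = sum(scale * sum(f * (1 - f) for f in F[m]) for m in range(M)) / M
    return {
        "h_t": h_t,
        "s_b": s_b,
        "h_w": h_w,
        "h_w_unbiased": n / (n - 1) * h_w,                     # assumed form of (9)
        "s_b_unbiased": s_b - (M - 1) / (M * (n - 1)) * h_w,   # formula (12)
    }


# Hypothetical data: 3 laboratories, 4 ordered categories, 8 results each
table = [[5, 2, 1, 0], [1, 3, 3, 1], [0, 1, 2, 5]]
comp = ordinal_variance_components(table)
print(comp)
```

For K = 2 the sketch reduces to the binary quantities of "Appendix 1", with the cumulative frequency of the first category playing the role of \( \hat{p}_{i} \).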

Cite this article

Gadrich, T., Bashkansky, E. & Kuselman, I. Comparison of biased and unbiased estimators of variances of qualitative and semi-quantitative results of testing. Accred Qual Assur 18, 85–90 (2013). https://doi.org/10.1007/s00769-012-0939-6
