Improvements in the application and reporting of advanced Bland–Altman methods of comparison

Original Research · Journal of Clinical Monitoring and Computing

Abstract

Bland and Altman have developed a measure called “limits of agreement” to assess correspondence of two methods of clinical measurement. In many circumstances, comparisons are made using several paired measurements in each individual subject. If such measurements are considered as statistically independent pairs, rather than as sets of measurements from separate individuals, limits of agreement will be too narrow. In addition, the confidence intervals for these limits will also be too narrow. Suitable software to compute valid limits of agreement and their confidence intervals is not readily available. Therefore, we set out to provide a freely available implementation accompanied by a formal description of the more advanced Bland–Altman comparison methods. We validate the implementation using simulated data, and demonstrate the effects caused by failing to take the presence of multiple paired measurements per individual properly into account. We propose a standard format of reporting that would improve analysis and interpretation of comparison studies.


Fig. 1
Fig. 2
Fig. 3

References

1. Altman DG, Bland JM. Measurement in medicine: the analysis of method comparison studies. Statistician. 1983;32:307–17.

2. Biancofiore G, Critchley LAH, Lee A, Yang X, Bindi LM, Esposito M, Bisà M, Meacci L, Mozzo R, Filipponi F. Evaluation of a new software version of the Flotrac/Vigileo (version 3.02) and a comparison with previous data in cirrhotic patients undergoing liver transplant surgery. Anesth Analg. 2011;113:515–22.

3. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;i:307–10.

4. Bland JM, Altman DG. Measuring agreement in method comparison studies. Stat Methods Med Res. 1999;8:135–60.

5. Bland JM, Altman DG. Applying the right statistics: analyses of measurement studies. Ultrasound Obstet Gynecol. 2003;22:85–93.

6. Bland JM, Altman DG. Agreement between methods of measurement with multiple observations per individual. J Biopharm Stat. 2007;17:571–82.

7. Bland JM, Altman DG. Agreed statistics: measurement method comparison. Anesthesiology. 2012;116:182–5.

8. Burdick RK, Graybill FA. Confidence intervals on linear combinations of variance components in the unbalanced one-way classification. Technometrics. 1984;26:131–6.

9. Columb MO. Clinical measurement and assessing agreement. Curr Anaesth Crit Care. 2008;19:328–9.

10. Donner A, Zou GY. Closed-form confidence intervals for functions of the normal mean and standard deviation. Stat Methods Med Res. 2012;21:347–59.

11. Efron B, Tibshirani RJ. An introduction to the bootstrap. New York: Chapman and Hall; 1993.

12. Hamilton C, Lewis S. The importance of using the correct bounds on the Bland–Altman limits of agreement when multiple measurements are recorded per patient. J Clin Monit Comput. 2010;24:173–5.

13. Hamilton C, Stamey J. Using Bland–Altman to assess agreement between two medical devices—don’t forget the confidence intervals! J Clin Monit Comput. 2007;21:331–3.

14. Mantha S, Roizen MF, Fleisher LA, Thisted R, Foss J. Comparing methods of clinical measurement: reporting standards for Bland and Altman analysis. Anesth Analg. 2000;90:593–602.

15. Mozilla Foundation: JavaScript. https://developer.mozilla.org/en-US/docs/JavaScript. Accessed April 2013.

16. Myles PS, Cui J. Using the Bland–Altman method to measure agreement with repeated measures. Br J Anaesth. 2007;99:309–11.

17. Oldham PD. A note on the analysis of repeated measurements of the same subjects. J Chronic Dis. 1962;15:969–77.

18. Schnur D, et al. Flot: attractive JavaScript plotting for jQuery. http://www.flotcharts.org/. Accessed April 2013.

19. Sjöstrand F, Rodhe P, Berglund E, Lundström N, Svensen C. The use of a noninvasive hemoglobin monitor for volume kinetic analysis in an emergency room setting. Anesth Analg. 2013;116:337–42.

20. The jQuery Foundation: jQuery. http://jquery.com/. Accessed April 2013.

21. Thomas JD, Hultquist RA. Interval estimation for the unbalanced case of the one-way random effects model. Ann Stat. 1978;6:582–7.

22. Uemura K, Kawada T, Inagaki M, Sugimachi M. A minimally invasive monitoring system of cardiac output using aortic flow velocity and peripheral arterial pressure profile. Anesth Analg. 2013;116:1006–17.

23. UncertWeb: jStat: a JavaScript statistical library. http://www.jstat.org/. Accessed April 2013.

24. Zou GY. Confidence interval estimation for the Bland–Altman limits of agreement with multiple observations per individual. Stat Methods Med Res. 2013;22:630–42.


Acknowledgments

The statistical properties of the simulated data for the example application were inspired by a real data set kindly provided by Prof. L.A.H. Critchley.

Author information

Correspondence to Erik Olofsen.

Appendix: Derivations

This section summarizes all theory needed to compute the limits of agreement and their 95 % confidence intervals for the situations described by Bland and Altman [6].

1.1 The model

The model for differences \(D _{{i j }}\) is the sum of a constant bias \(B\) and independent random variables [4],

$$\begin{aligned} D _{{i j }} = B + I _{{i }} + E _{{i j }} , \end{aligned}$$

where \(I\) and \(E\) have zero means and variances \(\sigma _{{d I }}^{2 }\) and \(\sigma _{{d w }}^{2 }\), respectively; \(\sigma _{{d I }}^{2 }\) denotes between-subject variance, \(\sigma _{{d w }}^{2 }\) denotes within-subject variance, and \(\sigma _{{d I }}^{2 } + \sigma _{{d w }}^{2 } = \sigma _{{d }}^{2 }\) is the total variance of the differences. There is a total of \(n _{{\text {obs}}}\) observations from \(n\) individuals \((i = 1, \ldots , n )\), with \(m _{i }\) observations for individual \(i\;( j = 1, \ldots , m _{i } )\).

1.2 The mean of the differences

Let the “grand” mean of the \(D _{{i j }}\) be estimated by

$$\begin{aligned} \hat{B} = \frac{1 }{{n _{{\text {obs}}} }}\sum _{{i = 1 }}^{{n }} \sum _{{j = 1 }}^{{m _{i } }} D _{{i j }} . \end{aligned}$$
(1)

The expected value of \(\hat{B}\) is \(B\); the variance of \(\hat{B}\) is

$$\begin{aligned} {\text{VAR}}\left\{ \hat{B} \right\} = {\text{E}} \left\{ {\hat{B} }^{2 } \right\} - \left( {\text{E}} \left\{ \hat{B} \right\} \right) ^{2 } . \end{aligned}$$

The \(D _{{i j }}\) contribute the within-subject variance \(\sigma _{{d w }}^{2 }\) only for equal subscripts \(i = k\) and \(j = l\), and the autocovariance \(\sigma _{{d I }}^{2 }\) for all \(j , l = 1, \ldots , m _{i }\) within each subject \(i = k\), so that

$$\begin{aligned} {\text {E}}\left\{ \left( \sum _{{i = 1 }}^{{n }} \sum _{{j = 1 }}^{{m _{i } }} D _{{i j }} \right) \cdot \left( \sum _{{k = 1 }}^{{n }} \sum _{{l = 1 }}^{{m _{k } }} D _{{k l }} \right) \right\} = {n}_{{\text {obs}}} \cdot \sigma _{{dw}}^{2 } + \left( \sum _{{i = 1 }}^{{n }} m _{i }^{2 } \right) \cdot \sigma _{{d I }}^{2 } \end{aligned}$$

So

$$\begin{aligned} {\text{VAR}}\left\{ \hat{B} \right\} = \frac{{\sigma _{{d w }}^{2 } }}{{n _{{\text {obs}}} }}+ \frac{{\sum _{{i = 1 }}^{{n }} m _{i }^{2 } }}{{n _{{\text {obs}}}^{2 } }}\cdot \sigma _{{d I }}^{2 } . \end{aligned}$$
(2)

For equal \(m _{i } = n _{{\text {obs}}} / n\), this reduces to

$$\begin{aligned} {\text{VAR}}\left\{ \hat{B} \right\} = \frac{{\sigma _{{d w }}^{2 } }}{{n _{{\text {obs}}} }}+ \frac{{\sigma _{{d I }}^{2 } }}{{n }}. \end{aligned}$$
(3)

An alternative estimator of \(B\) is

$$\begin{aligned} \hat{B}_a = \frac{1}{n}\sum _{{i = 1 }}^{{n }} \frac{1}{m_i} \sum _{{j = 1 }}^{{m _{i } }} D _{{i j }} . \end{aligned}$$
(4)

The expected value of \(\hat{B}_a\) is \(B\); the variance of \(\hat{B}_a\) is

$$\begin{aligned} {\text{VAR}}\left\{ \hat{B}_a \right\} = \left( \frac{1}{n^2}\sum _{{i = 1 }}^{{n }} \frac{1}{m_i} \right) \cdot \sigma _{{d w }}^{2 } + \frac{ \sigma _{{d I }}^{2 } }{n}. \end{aligned}$$
(5)

For equal \(m _{i } = n _{{\text {obs}}} / n\), this variance also reduces to Eq. (3). It can be shown that with unequal \(m_i,\; \frac{1}{n^2}\sum _{{i = 1 }}^{{n }} \frac{1}{m_i} > \frac{1}{ n _{{\text {obs}}} }\), and \(\frac{{\sum _{{i = 1 }}^{{n }} m _{i }^{2 } }}{{n _{{\text {obs}}}^{2 } }} > \frac{1}{n}\). Using these results, it can be seen that \(B\) is more precisely estimated by \(\hat{B}\) [variance given by Eq. (2)] when \(\sigma _{{d I }}^{2 }\) is small, and by \(\hat{B}_a\) [variance given by Eq. (5)] when \(\sigma _{{d w }}^{2 }\) is small.
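As a numeric illustration (our own example, not from the paper), the two variance formulas can be compared for a hypothetical unbalanced design:

```python
# Hypothetical unbalanced design and assumed variance components
# (illustrative values only).
m = [2, 3, 5, 8, 12]          # m_i: paired measurements per subject
n = len(m)                    # number of subjects
n_obs = sum(m)                # total number of observations
var_dw, var_dI = 1.0, 0.5     # within- and between-subject variances

# Eq. (2): variance of the grand-mean estimator B-hat
var_B = var_dw / n_obs + sum(mi**2 for mi in m) / n_obs**2 * var_dI
# Eq. (5): variance of the mean-of-subject-means estimator B-hat_a
var_Ba = (1 / n**2) * sum(1 / mi for mi in m) * var_dw + var_dI / n

# The coefficient inequalities quoted in the text hold:
assert (1 / n**2) * sum(1 / mi for mi in m) > 1 / n_obs
assert sum(mi**2 for mi in m) / n_obs**2 > 1 / n
```

With these values the between-subject variance dominates, so \(\hat{B}_a\) is the more precise estimator, consistent with the statement above.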

An estimate of \({\text{VAR}}\left\{ \hat{B} \right\}\), or \({\text{VAR}}\left\{ \hat{B}_a \right\}\), can be obtained by substituting estimates of \(\sigma _{{d w }}^{2 }\) and \(\sigma _{{d I }}^{2 }\) in Eq. (2) or Eq. (5). Estimators of these variances are derived below.

A 95 % confidence interval for \(\hat{B}\) or \(\hat{B}_a\) may be obtained by assuming these have a Student’s \(t\)-distribution with \(n-1\) degrees of freedom.

1.3 Limits of agreement and their confidence intervals

The limits of agreement (LoA) are estimated by

$$\begin{aligned} {\text{LoA}} = {\hat{B}} \pm 1.96 \cdot {\hat{\sigma }}_{d } . \end{aligned}$$
(6)

To obtain the variance of the LoA, the variances of \(\hat{B}\) and \({\hat{\sigma }}_{d }\) are needed:

$$\begin{aligned} {\text{VAR}}\{{\text{LoA}}\} = {\text{VAR}}\{{\hat{B}}\} + 1.96 ^{2 } \cdot {\text{VAR}}\{{\hat{\sigma }}_{d }\} \end{aligned}$$
(7)

The former is given by Eq. (2) above; the latter can be obtained by assuming that sums of squares are \(\chi ^{2 }\) distributed, and using

$$\begin{aligned} {\text{VAR}}\left\{ {\hat{\sigma }}_{d } \right\} \approx \frac{{\text{VAR}}\left\{ {\hat{\sigma }}_{d }^{2 } \right\} }{4 \cdot {\text{E}}\left\{ {\hat{\sigma }}_{d }^{2 } \right\} }. \end{aligned}$$

Expressions for \(\text {VAR}\left\{ {{\hat\sigma }}_{d }^{2 } \right\}\) are derived below. Confidence intervals around the LoA can finally be constructed by taking 1.96 times the square root of \(\text {VAR}\{ \text {LoA}\}\), assuming the LoA are normally distributed. This procedure was described by Bland and Altman [4].
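This construction can be condensed into a short routine (a Python sketch rather than the paper's JavaScript; the function and argument names are ours):

```python
import math

def loa_with_ci(b_hat, sd_hat, var_b, var_sd2):
    """95% limits of agreement (Eq. 6) with normal-approximation CIs.

    b_hat   : estimated bias
    sd_hat  : estimated SD of the differences
    var_b   : variance of the estimated bias, e.g. Eq. (2)
    var_sd2 : variance of the estimated variance of the differences
    """
    # delta method: VAR{sd_hat} ~ VAR{sd_hat^2} / (4 * E{sd_hat^2})
    var_sd = var_sd2 / (4 * sd_hat**2)
    var_loa = var_b + 1.96**2 * var_sd            # Eq. (7)
    half = 1.96 * math.sqrt(var_loa)              # CI half-width
    lower, upper = b_hat - 1.96 * sd_hat, b_hat + 1.96 * sd_hat
    return (lower - half, lower + half), (upper - half, upper + half)
```

With `var_b` and `var_sd2` taken from the expressions derived in the sections below, this reproduces the procedure of Bland and Altman [4].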

1.3.1 Confidence interval estimation by the MOVER

Donner and Zou described the application of the MOVER (Method of Variance Estimates Recovery) to the estimation of confidence intervals for the LoA [10, 24]. We implemented Eqs. (5) and (6) from the paper by Zou [24] in our JavaScript library. The method combines confidence intervals for the estimated bias and for the standard deviation of the differences in Eq. (6). The latter CI is based on percentiles of the \(\chi ^{2 }\) distributions assumed for the sums of squares used to compute the standard deviation. The confidence interval for the mean used here is based on the normal approximation (see [24]) rather than on Student's \(t\)-distribution; otherwise, the CIs of the LoA would be too wide.
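The core MOVER step for a sum of two parameters can be sketched as follows (this is the generic form of the recovery formulas; Zou [24] gives the LoA-specific versions that the library implements):

```python
import math

def mover_sum(t1, l1, u1, t2, l2, u2):
    """MOVER 95% CI for t1 + t2, given separate 95% CIs (l1, u1) and
    (l2, u2) for the two components.  For the upper LoA, t1 would be
    the estimated bias and t2 the 1.96-fold estimated SD, with the SD
    interval taken from chi-square percentiles."""
    lower = t1 + t2 - math.sqrt((t1 - l1)**2 + (t2 - l2)**2)
    upper = t1 + t2 + math.sqrt((u1 - t1)**2 + (u2 - t2)**2)
    return lower, upper
```

When both component intervals are symmetric, the result reduces to the usual quadrature combination of the two half-widths; asymmetric component intervals yield an asymmetric interval for the LoA.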

1.3.2 Parametric bootstrap-t confidence intervals

When the LoA are not normally distributed, their confidence intervals may be asymmetric, for example when the estimated value of a statistic scales with its estimation error. A bootstrap procedure [11] may then be used to construct better confidence intervals. In the parametric bootstrap, data sets are simulated using the established model; in the present context, the model consists of an overall bias, the partitioning of the total variance into between-subjects and within-subject variance, and their estimated values. From each bootstrap data set the LoA are calculated, and their 2.5 and 97.5 % percentiles are determined. The bootstrap-\(t\) interval “studentizes” the bootstrap-estimated LoA using their associated standard deviation. The need for such a standard deviation is often a disadvantage of this method, but in our case expressions for the standard error of the LoA can be derived (see below) that have sufficient accuracy. Furthermore, the bootstrap-\(t\) is particularly applicable to location statistics such as percentiles [11]. A nonparametric procedure based on resampling of individuals gave confidence intervals that were both too symmetric and too narrow.
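A percentile version of the parametric bootstrap can be sketched as follows (Python, with our own names; the paper's procedure additionally studentizes each bootstrap LoA):

```python
import random
import statistics

def bootstrap_upper_loa(bias, sd_between, sd_within, m, n_boot=400, seed=1):
    """Parametric bootstrap of the upper limit of agreement under the
    model D_ij = B + I_i + E_ij; m is the list of per-subject counts.
    Returns the 2.5 and 97.5 percentiles over the bootstrap replicates."""
    rng = random.Random(seed)
    uppers = []
    for _ in range(n_boot):
        diffs = []
        for mi in m:
            subj = rng.gauss(0.0, sd_between)            # I_i, drawn once
            diffs.extend(bias + subj + rng.gauss(0.0, sd_within)
                         for _ in range(mi))             # E_ij per pair
        b_hat = statistics.fmean(diffs)
        sd_hat = statistics.pstdev(diffs)
        uppers.append(b_hat + 1.96 * sd_hat)
    uppers.sort()
    return uppers[int(0.025 * n_boot)], uppers[int(0.975 * n_boot) - 1]
```

Note that each bootstrap data set redraws the subject effects, so the between-subject component of the uncertainty is propagated into the interval.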

1.4 The pooled data method

1.4.1 The variance of the differences

Let the variance of the \(D _{{i j }}\) be estimated by

$$\begin{aligned} {\hat{\sigma }}_{d }^{2 } = \frac{1 }{{n _{{\text {obs}}} }}\sum _{{i = 1 }}^{{n }} \sum _{{j = 1 }}^{{m _{i } }} \left( D _{{i j }} - \hat{B} \right) ^{2 } , \end{aligned}$$
(8)

with \(\hat{B}\) as defined in section “The mean of the differences”. Expanding the sum of squares yields

$$\begin{aligned} \sum _{{i = 1 }}^{{n }} \sum _{{j = 1 }}^{{m _{i } }} D _{{i j }}^{2 } -2 \cdot \left( \sum _{{i = 1 }}^{{n }} \sum _{{j = 1 }}^{{m _{i } }} D _{{i j }} \right) \cdot \left( \frac{1 }{{n _{{\text {obs}}} }}\sum _{{k = 1 }}^{{n }} \sum _{{l = 1 }}^{{m _{k } }} D _{{k l }} \right) + \sum _{{i = 1 }}^{{n }} \sum _{{j = 1 }}^{{m _{i } }} \left( \frac{1 }{{n _{{\text {obs}}} }}\sum _{{k = 1 }}^{{n }} \sum _{{l = 1 }}^{{m _{k } }} D _{{k l }} \right) ^{2 } . \end{aligned}$$

The expected value of the first part is

$$\begin{aligned} {\text{E}}\left\{ \sum _{{i = 1 }}^{{n }} \sum _{{j = 1 }}^{{m _{i } }} D _{{i j }}^{2 } \right\} = n _{{\text {obs}}} \cdot \left( \sigma _{{d w }}^{2 } + \sigma _{{d I }}^{2 } \right) , \end{aligned}$$

and the expected values of the second and third parts can be found using the result in section “The mean of the differences”. We then find

$$\begin{aligned} {\text{E}} \left\{ {\hat{\sigma }}_{d }^{2 } \right\} = \left( 1 - \frac{1 }{{n _{{\text {obs}}} }}\right) \cdot \sigma _{{d w }}^{2 } + \left( 1 - \frac{{\sum _{{i = 1 }}^{{n }} m _{i }^{2 } }}{{n _{{\text {obs}}}^{2 } }}\right) \cdot \sigma _{{d I }}^{2 } . \end{aligned}$$

For equal \(m _{i } = n _{{\text {obs}}} / n\)

$$\begin{aligned} {\text{E}} \left\{ {\hat{\sigma }}_{d }^{2 } \right\} = \left( 1 - \frac{1 }{{n _{{\text {obs}}} }}\right) \cdot \sigma _{{d w }}^{2 } + \left( 1 - \frac{1 }{{n }}\right) \cdot \sigma _{{d I }}^{2}. \end{aligned}$$
(9)

At this point, only the estimator of Eq. (8) is available; there is no separate estimator of \(\sigma _{{d I }}^{2 }\). The commonly used unbiased estimator of the variance, which divides the sum of squares by \(n _{{\text {obs}}} -1\) instead of \(n _{{\text {obs}}}\) [in Eq. (8)], gives

$$\begin{aligned} {\text{E}} \left\{ {\hat{\sigma }}_{{d c }}^{2 } \right\} = \sigma _{{d w }}^{2 } + \frac{{n _{{\text {obs}}} }}{{n _{{\text {obs}}} -1 }}\cdot \frac{{n -1 }}{{n }}\cdot \sigma _{{d I }}^{2 } , \end{aligned}$$
(10)

which is an unbiased estimator (of \(\sigma _{{d w }}^{2 }\)) only if \(\sigma _{{d I }}^{2 } = 0\).

1.4.2 The variance of the limits of agreement

For now, \(\sigma _{{d I }}^{2 }\) is taken to be zero. The sum of squares

$$\begin{aligned} \sum _{{i = 1 }}^{{n }} \sum _{{j = 1 }}^{{m _{i } }} \left( D _{{i j }} - \hat{B} \right) ^{2 } , \end{aligned}$$

normalized by \(\sigma _{{d w }}^{2 }\), is assumed to have a \(\chi ^{2 }\) distribution with \(n _{{\text {obs}}} -1\) degrees of freedom, so the variance of this quantity is \(2 ( n _{{\text {obs}}} -1 )\). The variance of the unbiased estimator of \(\sigma _{d }^{2 }\) [cf. Eq. (8)] is then

$$\begin{aligned} {\text{VAR}}\left\{ {\hat{\sigma }}_{d }^{2 } \right\} = \frac{{2 \sigma _{{d w }}^{4 } }}{{n _{{\text {obs}}} -1 }}, \end{aligned}$$

and

$$\begin{aligned} {\text{VAR}}\left\{ {\hat{\sigma }}_{d } \right\} \approx \frac{{\text{VAR}}\left\{ {\hat{\sigma }}_{d }^{2 } \right\} }{4 \, {\text{E}}\left\{ {\hat{\sigma }}_{d }^{2 } \right\} }= \frac{\sigma _{{d w }}^{2 } }{2 ( n _{{\text {obs}}} -1 ) }. \end{aligned}$$

Next, the variance of the LoA is given by the sum of the variance of the mean of the differences [Eq. (2)] and \(1.96 ^{2 }\) times the variance of the standard deviation of the differences, so

$$\begin{aligned} {\text{VAR}}\{ \text {LoA}\} = \left( \frac{1 }{{n_{{\text {obs}}} }}+ \frac{1.96 ^{2 } }{2 ( n_{{\text {obs}}} -1 ) }\right) \cdot \sigma _{{d w }}^{2 } . \end{aligned}$$

Confidence intervals around the LoA can then be constructed by using the procedure described in section “Limits of agreement and their confidence intervals”.
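The pooled-method variance of the LoA can be evaluated directly; as a small worked instance (illustrative numbers of our own):

```python
def pooled_var_loa(var_dw, n_obs):
    """VAR{LoA} for the pooled-data method (sigma_dI^2 taken as zero)."""
    return (1 / n_obs + 1.96**2 / (2 * (n_obs - 1))) * var_dw

# e.g. 60 observations with an assumed within-subject variance of 1
v = pooled_var_loa(1.0, 60)
```

As expected, the variance shrinks as \(n_{\text{obs}}\) grows, but it ignores the between-subject component entirely, which is why the pooled method understates the uncertainty when \(\sigma_{dI}^2 > 0\).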

1.5 The standard true value varies method

The ANOVA calculates two mean sums of squares:

$$\begin{aligned} {\text{MSSR}} = \frac{1 }{{n _{{\text {obs}}} - n }}\cdot \sum _{{i = 1 }}^{{n }} \sum _{{j = 1 }}^{{m _{i } }} \left( D _{{i j }} - {\hat{B} }_{i } \right) ^{2 } , \end{aligned}$$
(11)

and

$$\begin{aligned} {\text{MSSI}} = \frac{1 }{{n -1 }}\cdot \sum _{{i = 1 }}^{{n }} m _{i } \cdot \left( {\hat{B} }_{i } - \hat{B} \right) ^{2 } , \end{aligned}$$

where

$$\begin{aligned} {\hat{B} }_{i } = \frac{1 }{{m _{i } }}\sum _{{k = 1 }}^{{m _{i } }} D _{{i k }} . \end{aligned}$$

Their expected values are

$$\begin{aligned} {\text{E}} \{ {\text {MSSR}}\} = \sigma _{{dw}}^{2 } , \end{aligned}$$

and

$$\begin{aligned} {\text{E}} \{{\text {MSSI}}\} = \sigma _{{dw}}^{2 } + \frac{1 }{{n -1 }}\cdot \frac{{n _{{\text {obs}}}^{2 } - \sum _{{i = 1 }}^{{n }} m _{i }^{2 } }}{{n _{{\text {obs}}} }}\cdot \sigma _{{d I }}^{2 } = \sigma _{{d w }}^{2 } + \lambda \cdot \sigma _{{dI}}^{2 } , \end{aligned}$$

where \(\lambda\) abbreviates the factor multiplying \(\sigma _{{d I }}^{2 }\). Estimators of the components of variance are therefore

$$\begin{aligned} {\hat{\sigma}}_ {dw}^{2} = {\text {MSSR}}, \end{aligned}$$

and

$$\begin{aligned} {\hat{\sigma}}_ {d I}^{2} = \left( {\text{MSSI}}- {\hat{\sigma}}_{dw}^{2} \right) \cdot \frac{( n -1 ) \cdot n _{{\text {obs}}} }{{n _{{\text {obs}}}^{2 } - \sum _{{i = 1 }}^{{n }} m _{i }^{2 } }}= \left( {\text{MSSI}} - {\hat{\sigma}}_ {d w}^{2} \right) / \lambda . \end{aligned}$$

Next the variance of the differences can be estimated by

$$\begin{aligned} {\hat{\sigma}}_{d}^{2} = {\hat{\sigma}}_ {d w}^{2} + {\hat{\sigma}}_ {dI}^{2} = ( 1 - 1 / \lambda ) \cdot \text {MSSR}+ ( 1 / \lambda ) \cdot \text {MSSI}. \end{aligned}$$

Now \(( n _{{\text {obs}}} - n ) \cdot \text {MSSR}/ \sigma _{{d w }}^{2 }\) is approximately \(\chi ^{2 }\) distributed with \(n _{{\text {obs}}} - n\) degrees of freedom, and \(( n -1 ) \cdot \text {MSSI}/ ( \sigma _{{d w }}^{2 } + \lambda \sigma _{{d I }}^{2 } )\) is approximately \(\chi ^{2 }\) distributed with \(n -1\) degrees of freedom, at least with balanced data, that is, when the \(m _{i }\) are equal [8]. The properties of the \(\chi ^{2 }\) distributed quantities can be used to derive an expression for the variance of \({\hat{\sigma }}_{d }^{2 }\):

$$\begin{aligned} {\text{VAR}}\left\{ {\hat{\sigma }}_{d }^{2 } \right\} = \frac{{2 \left( ( 1 -1 / \lambda ) \sigma _{{d w }}^{2 } \right) ^{2 } }}{{n _{{\text {obs}}} - n }}+ \frac{{2 \left( \sigma _{{d w }}^{2 } / \lambda + \sigma _{{d I }}^{2 } \right) ^{2 } }}{{n -1 }}. \end{aligned}$$

This result can be transformed and added to the variance of the mean to obtain approximate confidence intervals as in section “Limits of agreement and their confidence intervals”. The MOVER (see section “Confidence interval estimation by the MOVER”) may be applied, using percentiles of the \(\chi ^{2 }\) distributions of \(\text {MSSR}\) and \(\text {MSSI}\), to obtain better confidence intervals.

1.5.1 The modified true value varies method

An alternative mean sum of squares was studied by Thomas and Hultquist [21]. They used the expression

$$\begin{aligned} {\text{MSSIa}} = \frac{1 }{{n -1 }}\cdot \sum _{{i = 1 }}^{{n }} \left( {\hat{B} }_{i } - {\hat{B} }_{a } \right) ^{2 } . \end{aligned}$$
(12)

Note that \({\hat{B} }_{a }\) is used here (Eq. (4)), which may be different from \({\hat{B} }\) (Eq. (1)) with unbalanced data. The expectation of MSSIa is

$$\begin{aligned} {\text{E}} \{ \text{MSSIa}\} = \frac{1 }{{n }}\cdot \sum _{{i = 1 }}^{{n }} \frac{1 }{{m _{i } }}\cdot \sigma _{{d w }}^{2 } + \sigma _{{d I }}^{2 } = \lambda \cdot \sigma _{{d w }}^{2 } + \sigma _{{d I }}^{2 } , \end{aligned}$$
(13)

where \(\lambda\) here abbreviates \(\frac{1 }{{n }}\sum _{{i = 1 }}^{{n }} \frac{1 }{{m _{i } }}\) (note that this differs from the \(\lambda\) of the previous section). Next the variance of the differences can be estimated by

$$\begin{aligned} {\hat{\sigma }}_{d }^{2 } = ( 1 - \lambda ) \cdot \text {MSSR}+ \text {MSSIa}. \end{aligned}$$

Thomas and Hultquist showed that \(( n -1 ) \cdot {\text{MSSIa}}/ ( \lambda \cdot \sigma_{{d w}}^{2} + \sigma _{{dI}}^{2} )\) is close to \(\chi ^{2 }\) distributed with \(n -1\) degrees of freedom for \(\sigma _{{d I }}^{2 } \ge \sigma _{{d w }}^{2 } / 4\). Burdick and Graybill [8] gave a method to obtain confidence intervals for \({\hat{\sigma }}_{d }^{2 }\), which may also be used to obtain confidence intervals for \(1.96 \cdot {\hat{\sigma }}_{d }\), but this method was not investigated given the accuracy of the MOVER (see section “Confidence interval estimation by the MOVER”). The properties of the \(\chi ^{2 }\) distributed quantities can be used to derive an expression for the variance of \({\hat{\sigma }}_{d }^{2 }\):

$$\begin{aligned} {\text{VAR}}\left\{ {\hat{\sigma }}_{d }^{2 } \right\} = \frac{{2 \left( ( 1 - \lambda ) \sigma _{{d w }}^{2 } \right) ^{2 } }}{{n _{{\text {obs}}} - n }}+ \frac{{2 \left( \lambda \sigma _{{d w }}^{2 } + \sigma _{{d I }}^{2 } \right) ^{2 } }}{{n -1 }}. \end{aligned}$$
(14)

This result can be transformed and added to the variance of the mean to obtain approximate confidence intervals as in section “Limits of agreement and their confidence intervals”. Zou showed how to apply the MOVER with the distributions of \(\text {MSSR}\) and \(\text {MSSIa}\) (see [24] and section “Confidence interval estimation by the MOVER”) to obtain better confidence intervals.
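The modified estimator can be sketched as follows (our own code; for balanced data it coincides with the standard estimator of the previous section, which the accompanying check exploits):

```python
import statistics

def modified_tvv_var(groups):
    """sigma_d^2 estimated as (1 - lambda) * MSSR + MSSIa, with
    lambda = (1/n) * sum(1/m_i) as in Eq. (13); `groups` holds the
    per-subject lists of differences D_ij."""
    n = len(groups)
    m = [len(g) for g in groups]
    n_obs = sum(m)
    means = [statistics.fmean(g) for g in groups]          # B-hat_i
    b_a = statistics.fmean(means)                          # Eq. (4)
    mssr = sum((d - mu)**2 for g, mu in zip(groups, means)
               for d in g) / (n_obs - n)
    mssia = sum((mu - b_a)**2 for mu in means) / (n - 1)   # Eq. (12)
    lam = sum(1 / mi for mi in m) / n                      # Eq. (13)
    return (1 - lam) * mssr + mssia
```

Unlike the standard method, this estimator weights every subject mean equally in \(\text{MSSIa}\), which is what makes its \(\chi^2\) approximation behave well when \(\sigma_{dI}^2\) is not small.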

1.6 The true value constant method

To obtain expressions for the 95 % confidence intervals for the True value constant method, the approach of Bland and Altman [4] was followed, but generalized for unbalanced data. In this method, sums of squares are calculated from the measurements themselves rather than from their differences. These sums of squares have the same properties concerning their \(\chi ^{2 }\) distributions as described in section “The modified true value varies method”. In the derivation of Bland and Altman, the wrong expression was used for the variance of the mean: the estimated variance of the differences (\({\hat{\sigma }}_{d }^{2 }\)) was substituted instead of the variance of the differences between the within-subject means (\(s _{{\bar{d} }}^{2 }\); for equal \(m _{i }\), the ratio of Eq. (13) and \(n\) equals Eq. (3)). Because the variance of the differences is larger than the variance of the differences between the within-subject means, the resulting 95 % CI will be too wide.

The True value constant method calculates the means of the \(X\) and \(Y\) data per subject, \(\bar{X} _ i = \frac{1 }{{m _{i } }}\sum _{{j = 1 }}^{{m _{i } }} X _{{i j }}\), and likewise for \(Y\). Next, the variance of the differences between the within-subject means is calculated, i.e., the variance of \(\bar{X} _ i - \bar{Y} _ i\). This equals the variance of \(\hat{B} _ i\), so it is identical to Eq. (12) from the modified True value varies method. Furthermore, the True value constant method calculates sums of squares as given by Eq. (11), but separately for the \(X\) and \(Y\) data, and subsequently adds these; the expectation of that result is \(\sigma _{{d w }}^{2 } = \sigma _{{x w }}^{2 } + \sigma _{{y w }}^{2 }\), which is identical to the expectation of Eq. (11). Thus, when the true value is indeed constant, the True value constant method has properties identical to those of the modified True value varies method, except that the latter requires the number of measurements from both measurement devices to be equal \(( m _{{x i }} = m _{{y i }})\) in order to calculate the \(D _{{i j }}\). Furthermore, the separate estimates of \(\sigma _{{x w }}^{2 }\) and \(\sigma _{{y w }}^{2 }\) from the True value constant method may be used to assess the repeatability of the measurement devices [4].

1.7 Generation of simulation data

The simulated data for the validation study and the example analysis were generated using variants of the following pseudocode:

figure a

Here n is the number of subjects, and m[i] is the number of paired measurements for subject i. sxI and syI are the standard deviations of the between-subjects variabilities, and sxw and syw are the standard deviations of the within-subject variabilities. normal() is a function that generates normally distributed numbers with zero mean and unit variance. Ix and Iy are the subject-specific biases, and X and Y are data arrays. With the above pseudocode, the expected overall bias is \(\mathtt{Bx} - \mathtt{By}\), the true value is constant and zero, and the expected between-subjects and within-subject variances are \(\sigma _{dI}^2 = \mathtt{sxI}^2 + \mathtt{syI}^2\) and \(\sigma _{dw}^2 = \mathtt{sxw}^2 + \mathtt{syw}^2\), respectively.
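The pseudocode figure does not reproduce here; based on the variable description above, the generator can be sketched as follows (a Python stand-in with the text's names; using `random.gauss` for `normal()` is our assumption):

```python
import random

def simulate(n, m, Bx, By, sxI, syI, sxw, syw, seed=0):
    """Generate paired (X, Y) data per subject: a fixed method bias,
    a subject-specific bias drawn once per subject, and within-subject
    noise drawn per measurement."""
    rng = random.Random(seed)
    X, Y = [], []
    for i in range(n):
        Ix = rng.gauss(0.0, sxI)    # subject-specific bias, method X
        Iy = rng.gauss(0.0, syI)    # subject-specific bias, method Y
        X.append([Bx + Ix + rng.gauss(0.0, sxw) for _ in range(m[i])])
        Y.append([By + Iy + rng.gauss(0.0, syw) for _ in range(m[i])])
    return X, Y
```

The per-subject differences X[i][j] - Y[i][j] then follow the model of section “The model”, with \(B = \mathtt{Bx} - \mathtt{By}\).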


Cite this article

Olofsen, E., Dahan, A., Borsboom, G. et al. Improvements in the application and reporting of advanced Bland–Altman methods of comparison. J Clin Monit Comput 29, 127–139 (2015). https://doi.org/10.1007/s10877-014-9577-3
