Introduction

It is typically assumed that the uncertainty of results from many measurements can be described by a normal distribution, i.e., a symmetric distribution. This assumption is stated in the GUM document (Evaluation of measurement data – Guide to the expression of uncertainty in measurement) [1, 2], which is a fundamental reference document for evaluation of measurement uncertainty. The document is based on a linear combination of random variables giving a normal (symmetric) distribution of results according to the central limit theorem. However, in many instrumental techniques (for instance in chemical analysis), results are typically generated by multiplicative combination of random variables, i.e. distributions of results are driven towards a log-normal distribution (according to the multiplicative version of the central limit theorem). At small or modest relative standard uncertainties (< 15 % to 20 %), normal and log-normal distributions are so similar that the normal distribution can serve as a suitable approximation. When the relative standard uncertainty becomes larger than 15 % to 20 %, asymmetry, or skewness, needs to be handled when calculating uncertainty intervals [3,4,5]. This is outside the scope of GUM [1, 2], and is commonly handled by transforming data using \(\log_{10} x\) or \(\log_{e} x\) [3, 5,6,7,8] prior to calculation of uncertainty intervals. Transformation is also widely used in microbiological enumeration methods [9, 10]. Hence, it is typically assumed that the distribution of measurement results can be approximated with either a normal distribution or a log-normal distribution. Recently, an approach based on a power transformation for handling a broader spectrum of skewness has been proposed [4].

Here, the approach of power transformation is discussed further, and the purpose is twofold: first, to explain and clearly describe the approach for calculating uncertainty intervals, and second, to compare, for different types of chemical analysis, the resulting uncertainty intervals with those obtained from other expressions (assuming either a normal or a log-normal distribution of measurement results).

List of symbols

\(B\) Parameter in transformation (\(x^{B}\))
\(B_{{{\text{opt}}}}\) Optimized \(B\)
\({\text{CV}}\) Coefficient of variation (in %)
\(k\) Coverage factor for a given probability
\(n\) Number of data
\(s\) Standard deviation*
\(s_{{{\text{rel}}}}\) Relative standard deviation in original space
\(s_{{{\text{trans}}}}\) Standard deviation in transformed space using \(x^{{B_{opt} }}\) transformation
\(s_{{\text{rel,trans}}}\) Relative standard deviation in transformed space using \(x^{{B_{opt} }}\) transformation
\(s_{{\log_{10} }}\) Standard deviation in transformed space using log10 transformation
\(x\) Measurement result in the original space
\(x_{{{\text{trans}}}}\) Measurement result in transformed space using \(x^{B}\) transformation
\(x_{{\log_{10} }}\) Measurement result in transformed space using log10 transformation
*In this paper, when calculating uncertainty intervals, \(s\) is assumed to be equal to the combined standard uncertainty

Transformation based on \({\varvec{x}}^{{\varvec{B}}}\) when evaluating measurement uncertainties

Transformation based on \({\varvec{x}}^{{\varvec{B}}}\)

Many data sets with asymmetric distributions can be transformed to data sets with symmetric and approximated normal distributions using a power transformation according to

$$x_{{{\text{trans}}}} = x^{B}$$
(1)

where \(x_{{{\text{trans}}}}\) and \(x\) are the transformed and original data, respectively, and \(B\) is a parameter that is optimized with the goal that transformed data should have a skewness close to zero, i.e., become symmetric. Equation 1 can then be written as

$$x_{{{\text{trans}}}} = x^{{B_{{{\text{opt}}}} }}$$
(2)

where \(B_{{{\text{opt}}}}\) is the optimized \(B\). Different values of \(B\) will transform different distributions to symmetric distributions as shown in Fig. 1.

Fig. 1 Illustration of \(B\) values that will transform different distributions to a symmetric distribution using the transformation \(x^{B}\)

This has been studied using Monte Carlo simulations [4]. As illustrated in the figure, there are two values that can serve as reference points on a “\(B\)-scale”, 0 and 1. With \(B = 1\), no transformation will occur, i.e. original distributions that can be approximated with a normal distribution will have an optimized \(B\) value equal to 1. Using \(B\) close to 0 is analogous to taking the logarithm of the values, and original distributions that can be approximated with a log-normal distribution will be transformed to normal distribution approximations. Note that \(B = 0\) will transform all values to 1. To avoid this, \(B\) values between − 0.0001 and 0.0001 are not used. With \(B\) values somewhere between 0 and 1, distributions with other types of positive skewness can be transformed to approximately symmetric distributions. These distributions are here referred to as being “between” normal and log-normal distributions. Note that transformations using the square root, i.e. \(B_{{{\text{opt}}}}\) equal to 0.5, will transform distributions that are “between” normal and log-normal distributions to approximated normal distributions. For original distributions with negative skewness, \(B\) values larger than 1 will transform the distributions to approximately normal distributions. Finally, for distributions far from zero with a positive skewness, \(B\) values less than 0 will transform distributions to approximately normal distributions. Such distributions can be obtained for instance by adding a constant number to values with positive skewness, or by summing values with different distributions.

After finding an appropriate \(B_{{{\text{opt}}}}\), a confidence interval can be calculated in the transformed space and then back-transformed according to

$$x = x_{{{\text{trans}}}}^{{1/B_{{{\text{opt}}}} }}$$
(3)

Note that for \(B_{{{\text{opt}}}} < 0\), the order of data in the transformed space will be opposite to the order of data in the original space. Hence, when calculating a confidence interval, the lower limit of the interval in the transformed space corresponds to the upper limit in the original space.
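Equations 1 to 3 and the order reversal for negative \(B_{{{\text{opt}}}}\) can be illustrated with a minimal Python sketch. The function names and the numbers in the example are illustrative assumptions and are not taken from the data discussed in this paper.

```python
def power_transform(x, b):
    """Eq. 1: transform a positive value x to x**b."""
    if abs(b) < 1e-4:
        # B values between -0.0001 and 0.0001 are not used (x**0 would be 1 for all x)
        raise ValueError("B between -0.0001 and 0.0001 is not used")
    return x ** b


def back_transform(x_trans, b):
    """Eq. 3: return to the original space, x = x_trans**(1/b)."""
    return x_trans ** (1.0 / b)


def back_transform_interval(lower_trans, upper_trans, b):
    """Back-transform an interval and return it as (lower, upper) in the original space.

    For b < 0 the order of the data is reversed by the transformation,
    so the two limits are sorted after back-transformation.
    """
    limits = (back_transform(lower_trans, b), back_transform(upper_trans, b))
    return min(limits), max(limits)


# Hypothetical example: negative B and an arbitrary interval in the transformed space
b = -0.31
lo_t, hi_t = 0.15, 0.19
lo, hi = back_transform_interval(lo_t, hi_t, b)
print(f"original space: {lo:.0f} to {hi:.0f}")
```

In this sketch the lower limit in the original space comes from the upper limit in the transformed space, which is the reversal described above.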

Calculation of uncertainty intervals

In the following text, the standard deviation, \(s\), is assumed to be equal to the combined standard uncertainty. Different expressions for calculating uncertainty intervals are given in Fig. 2 for a measurement result, \(x\), when the standard deviation increases in proportion to the measurand level, i.e. when the coefficient of variation (CV) and the relative standard deviation (\(s_{{{\text{rel}}}}\)) are independent of the measurand level, which is often the case, for instance, in instrumental analysis [11].

Fig. 2 Overview of different expressions for calculating uncertainty intervals for a measurement result depending on the distribution of the measurement result. See text for explanation of the different symbols

Expressions for uncertainty intervals are given in Fig. 2 for measurement results whose distribution (1) can be approximated with a normal distribution, (2) can take various forms, or (3) can be approximated with a log-normal distribution. Derivations of the different equations (Eqs. 4–10) are given below.

If the distribution of the measurement results can be approximated with a normal distribution and the relative standard deviation, \(s_{{{\text{rel}}}}\), is independent of the measurand, an uncertainty interval for a measurement result, \(x\), will be asymmetric since the standard uncertainty, \(s\), will be different at the lower and the upper limit. For small \(s_{{{\text{rel}}}} ,\) the difference in \(s\) at the lower and upper limit can be neglected, and the interval can be obtained as

$$x - k \times s_{{{\text{rel}}}} \times x\;{\text{to}}\;x + k \times s_{{{\text{rel}}}} \times x$$
(4)

where \(k\) is the coverage factor. However, if the difference in \(s\) at the lower and upper limit is taken into account, the uncertainty interval will be asymmetric and can be calculated as [3, 4]

$$\frac{x}{{1 + k \times s_{{{\text{rel}}}} }}\;{\text{to}}\;\frac{x}{{1 - k \times s_{{{\text{rel}}}} }}$$
(5)

This equation is valid for \(k \times s_{{{\text{rel}}}} < 1\).
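As a worked illustration of Eqs. 4 and 5, the following Python sketch evaluates both intervals. The function names are assumptions made for this example; the inputs (\(x = 10\), \(s_{{{\text{rel}}}} = 15\) %, \(k = 2\)) are chosen only to show the size of the difference.

```python
def interval_symmetric(x, s_rel, k=2.0):
    """Eq. 4: neglects that s differs at the lower and upper limit."""
    half_width = k * s_rel * x
    return x - half_width, x + half_width


def interval_asymmetric(x, s_rel, k=2.0):
    """Eq. 5: takes into account that s is proportional to the level at each limit."""
    if k * s_rel >= 1.0:
        raise ValueError("Eq. 5 is only valid for k * s_rel < 1")
    return x / (1.0 + k * s_rel), x / (1.0 - k * s_rel)


lo4, hi4 = interval_symmetric(10.0, 0.15)
lo5, hi5 = interval_asymmetric(10.0, 0.15)
print(f"Eq. 4: {lo4:.2f} to {hi4:.2f}")
print(f"Eq. 5: {lo5:.2f} to {hi5:.2f}")
```

With these inputs Eq. 4 gives 7.00 to 13.00 and Eq. 5 gives approximately 7.69 to 14.29, so the asymmetry is already visible at a relative standard deviation of 15 %.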

If the distribution of the measurement results can be approximated with a log-normal distribution, the standard deviation in the transformed space after transformation using log10 \(x\) or ln \(x\) will be independent of the measurand level. An uncertainty interval in the transformed space for a measurement result, \(x_{{\log_{10} }}\), will then be given by

$$x_{{\log_{10} }} - k \times s_{{\log_{10} }} \;{\text{to}}\;x_{{\log_{10} }} + k \times s_{{\log_{10} }}$$
(6)

where \(s_{{\log_{10} }}\) is the standard deviation in the transformed space. This will give an uncertainty interval in the original space that is

$$\frac{x}{{10^{{k \times s_{{\log_{10} }} }} }}\;{\text{to}}\;x \times 10^{{k \times s_{{\log_{10} }} }}$$
(7)

Equation (7) can also be written as

$$\frac{x}{{{}_{{}}^{F} U}}\;{\text{to}}\;x \times {}_{{}}^{F} U$$
(8)

where \({}_{{}}^{F} U\) is called the expanded uncertainty factor, which here is calculated as \(10^{{k \times s_{{\log_{10} }} }}\) [5].
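Equations 6 to 8 can be sketched in Python as follows. The function names are illustrative assumptions, and the helper that estimates \(s_{{\log_{10} }}\) from raw data simply takes the sample standard deviation of the log10-transformed values.

```python
import math
import statistics


def s_log10_from_data(values):
    """Sample standard deviation of log10-transformed (positive) values."""
    return statistics.stdev([math.log10(v) for v in values])


def expanded_uncertainty_factor(s_log10, k=2.0):
    """Expanded uncertainty factor used in Eq. 8: 10**(k * s_log10)."""
    return 10.0 ** (k * s_log10)


def interval_lognormal(x, s_log10, k=2.0):
    """Eqs. 7 and 8: x / FU to x * FU in the original space."""
    fu = expanded_uncertainty_factor(s_log10, k)
    return x / fu, x * fu


# Illustrative input: s_log10 = 0.1 and a measurement result of 100
fu = expanded_uncertainty_factor(0.1)
lo, hi = interval_lognormal(100.0, 0.1)
print(f"FU = {fu:.3f}, interval {lo:.1f} to {hi:.1f}")
```

The interval is symmetric on a ratio scale: the product of the two limits equals \(x^{2}\).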

For many distributions including log-normal, transformation using \(x^{{B_{opt} }}\) will result in symmetric distributions that can be approximated with normal distributions. In the transformed space, the relative standard deviation of the transformed data, \(s_{{\text{rel,trans}}}\), will be independent of the measurand level [4]. Hence, an uncertainty interval in the transformed space for a measurement result, \(x_{trans}\), can be obtained as

$$\frac{{x_{{{\text{trans}}}} }}{{1 + k \times s_{{\text{rel,trans}}} }}\;{\text{to}}\;\frac{{x_{{{\text{trans}}}} }}{{1 - k \times s_{{\text{rel,trans}}} }}$$
(9)

where \(s_{{\text{rel,trans}}}\) is the relative standard deviation of transformed data. An uncertainty interval in the original space will then be given by

$$\frac{x}{{\left( {1 + k \times s_{{\text{rel,trans}}} } \right)^{{1/B_{{{\text{opt}}}} }} }}\;{\text{to}}\;\frac{x}{{\left( {1 - k \times s_{{\text{rel,trans}}} } \right)^{{1/B_{{{\text{opt}}}} }} }}$$
(10)

Equation 10 is valid for \(k \times s_{{\text{rel,trans}}}\) < 1, i.e. for large \(s_{{\text{rel,trans}}} ,\) Eq. 10 will not be valid. However, distributions in the original space having large standard deviations will typically be asymmetric. In the transformed space, these distributions will ideally be symmetric having small standard deviations not restricting Eq. 10. This is demonstrated in Example 4.
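A minimal Python sketch of Eq. 10 is given below. The function name is an assumption, and sorting the two limits is a convenience added here so that the interval is returned as (lower, upper) also for negative \(B_{{{\text{opt}}}}\), where the order is reversed.

```python
def interval_power(x, s_rel_trans, b_opt, k=2.0):
    """Eq. 10: uncertainty interval in the original space after x**B_opt transformation."""
    if abs(b_opt) < 1e-4:
        raise ValueError("B between -0.0001 and 0.0001 is not used")
    if k * s_rel_trans >= 1.0:
        raise ValueError("Eq. 10 is only valid for k * s_rel_trans < 1")
    first = x / (1.0 + k * s_rel_trans) ** (1.0 / b_opt)
    second = x / (1.0 - k * s_rel_trans) ** (1.0 / b_opt)
    # For b_opt < 0 the first expression is the upper limit, so sort the pair
    return min(first, second), max(first, second)


# Illustrative inputs only: x = 1, s_rel_trans = 10 %, B_opt = 0.3
lo, hi = interval_power(1.0, 0.10, 0.3)
print(f"{lo:.3f} to {hi:.3f}")
```

With \(B_{{{\text{opt}}}} = 1\) the function returns the same limits as Eq. 5, which gives a quick consistency check.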

As shown in Fig. 2, intervals calculated using Eqs. 5 and 10 will be identical when \(B_{{{\text{opt}}}}\) approaches 1. In addition, though less obvious, intervals calculated using Eqs. 7 and 10 will be identical when \(B_{{{\text{opt}}}}\) approaches 0 (for instance, using \(B_{{{\text{opt}}}}\) equal to 0.0001) [4]. This shows that Eq. 10, when \(B_{{{\text{opt}}}}\) is known, provides a way of expressing uncertainty intervals that handles a broad spectrum of asymmetry when the relative standard deviation (\(s_{\text {rel}}\)) is independent of the measurand level. Hence, the uncertainty intervals calculated according to Eq. 10 can be considered to be more correct than those from Eqs. 4, 5 or 7 if an appropriate value of \(B_{{{\text{opt}}}}\) is used, although the improvements are generally small. As discussed below, however, estimation of a proper value needs a measurement model or extensive raw data.
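The limit \(B_{{{\text{opt}}}} \to 0\) can be made plausible with a short sketch. It assumes (this is an assumption made for the illustration) that \(x\) is log-normally distributed, so that \(\ln x\) has standard deviation \(\sigma = \ln (10) \times s_{{\log_{10} }}\), and that \(B\) is small enough for a first-order approximation. Then \(x^{B} = e^{B\ln x}\) is log-normal with log-scale standard deviation \(B\sigma\), and its relative standard deviation is

$$s_{{\text{rel,trans}}} = \sqrt {e^{{B^{2} \sigma^{2} }} - 1} \approx B\sigma$$

Inserting this into the two denominators of Eq. 10 and letting \(B\) approach 0 gives

$$\left( {1 \pm k \times B\sigma } \right)^{1/B} \to e^{ \pm k\sigma } = 10^{{ \pm k \times s_{{\log_{10} }} }}$$

so the lower limit tends to \(x/10^{{k \times s_{{\log_{10} }} }}\) and the upper limit tends to \(x \times 10^{{k \times s_{{\log_{10} }} }}\), in agreement with Eq. 7. For \(B_{{{\text{opt}}}} = 1\) no approximation is needed, since Eq. 10 then has the same form as Eq. 5 with \(s_{{\text{rel,trans}}} = s_{{{\text{rel}}}}\).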

Note that transformation based on log10 \(x\), loge \(x\) and \(x^{{B_{opt} }}\) will result in symmetry around the median. Hence, an interval (in the original space) calculated using Eqs. 7 or 10 will result in an interval that will cover the median with a given probability (95 % when using \(k\) equal to 2). In many applications, it is sensible to use the median as what is intended to be measured when skewness originates from the measurement process.

Estimation of \(B_{{{\text{opt}}}}\)

A value of \(B_{{{\text{opt}}}}\) in Eq. 10 for a given data set that will result in a skewness close to zero in the transformed space can be obtained using mathematical tools available in many calculation software (including the widely used Microsoft Excel [4]). Although skewness is here utilized to optimize \(B\), other distribution characteristics can also be considered. A detailed discussion of this is beyond the scope of this work. For the purpose of this study, skewness is considered to be appropriate for optimization.
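The optimization can be sketched in Python as a root search for zero skewness. This is not the Excel Solver procedure used in this work but a plain bisection substituted for illustration; it assumes positive data and that the skewness of the transformed data changes sign within the search range. The function names, search range and simulated test data are all assumptions.

```python
import random


def skewness(values):
    """Moment-based sample skewness, m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def skewness_after_transform(values, b):
    """Skewness of the data after the transformation x**b."""
    return skewness([v ** b for v in values])


def find_b_opt(values, b_low=1e-4, b_high=10.0, tol=1e-6):
    """Bisection on B until the skewness of x**B is close to zero."""
    f_low = skewness_after_transform(values, b_low)
    f_high = skewness_after_transform(values, b_high)
    if f_low * f_high > 0:
        raise ValueError("no sign change of skewness in the search range")
    while b_high - b_low > tol:
        b_mid = 0.5 * (b_low + b_high)
        f_mid = skewness_after_transform(values, b_mid)
        if f_low * f_mid <= 0:
            b_high = b_mid
        else:
            b_low, f_low = b_mid, f_mid
    return 0.5 * (b_low + b_high)


# Simulated test data: squares of normal values, so the square root (B = 0.5)
# should make the data approximately symmetric again
rng = random.Random(7)
data = [rng.gauss(1.0, 0.1) ** 2 for _ in range(20000)]
b_opt = find_b_opt(data)
print(f"estimated B_opt = {b_opt:.2f}")
```

Even with 20 000 simulated values the estimate scatters around 0.5 from one random seed to the next, which is in line with the observation below that very large data sets are needed.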

However, in reality, it is difficult to find an optimized value of \(B\) based on a set of experimental data [4]. The accuracy of \(B_{{{\text{opt}}}}\) will improve with an increasing number of data in the data set and with an increasing coefficient of variation (CV) of the data set [4]. However, the extremely large numbers of experimental data that are needed (typically > \(10^{3}\) to \(10^{4}\)) are rarely available. This is not surprising, since it is well known that the departure from normality has to be quite large in order to demonstrate non-normality [12]. It has been suggested that, without any other information on a proper value for \(B_{{{\text{opt}}}}\) [4], it is sensible to assume \(B_{{{\text{opt}}}}\) equal to 1 when CV is less than approx. 15 % (i.e. no transformation of the data is performed). For such low CV, the value of \(B_{{{\text{opt}}}}\) is not critical, and different values of \(B_{{{\text{opt}}}}\) will result in similar measurement uncertainty intervals.

For CV > approx. 15 % to 20 %, it is often sensible to assume \(B_{{{\text{opt}}}}\) close to zero (for instance 0.0001), i.e. to assume a log-normal distribution. A proper value of \(B_{{{\text{opt}}}}\) can be obtained from Monte Carlo simulations if a relevant model equation is available. Alternatively, a general agreed value of \(B_{{{\text{opt}}}}\) could be used for a specific measurement procedure. For instance, in microbiology, transformations using the square root, i.e. \(B_{{{\text{opt}}}}\) equal to 0.5, are sometimes used [13].

Control samples and control charts, which are important tools in quality work [14] and describe within-laboratory reproducibility, can contain data in the order of \(10^{2}\) to \(10^{3}\). However, this is typically not enough, or barely enough, to obtain a relevant estimate of \(B_{{{\text{opt}}}}\). Some examples of within-laboratory reproducibility data from different types of chemical analysis are given in Table 2.

Combination of uncertainty components with different asymmetry

Above it has been assumed that the standard deviation, \(s\), is equal to the standard uncertainty. However, it is of interest to be able to combine uncertainty components with different distributions having different asymmetry, for instance when adding a component related to bias or sampling to a component describing analytical precision. This can be performed as described previously [4] by:

  1. finding \(B_{{{\text{opt}}}}\) for each uncertainty component
  2. determining the mean and standard deviation in the transformed space for each uncertainty component
  3. generating a large random, normally distributed data set (using the determined mean and standard deviation in the transformed space) for each of the two uncertainty components in its transformed space
  4. back-transforming the two large data sets to the original space
  5. combining the two data sets in the original space (by multiplication or addition, depending on judgement)
  6. finding a new \(B_{{{\text{opt}}}}\) for the combined data set.
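The steps above can be sketched in Python as follows. Steps 1 and 2 are replaced by hypothetical values (the \(B_{{{\text{opt}}}}\), mean and standard deviation assigned to each component are assumptions, not results from this work), the components are combined by multiplication, and a simple bisection on skewness stands in for the Solver optimization.

```python
import random


def skewness(values):
    """Moment-based sample skewness, m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def simulate_component(b_opt, mean_trans, sd_trans, n, rng):
    """Steps 3 and 4: normal data in the transformed space, back-transformed with Eq. 3."""
    return [rng.gauss(mean_trans, sd_trans) ** (1.0 / b_opt) for _ in range(n)]


def find_b_opt(values, b_low=1e-4, b_high=10.0, tol=1e-4):
    """Step 6: bisection on B until the skewness of x**B is close to zero."""
    def f(b):
        return skewness([v ** b for v in values])

    f_low = f(b_low)
    if f_low * f(b_high) > 0:
        raise ValueError("no sign change of skewness in the search range")
    while b_high - b_low > tol:
        b_mid = 0.5 * (b_low + b_high)
        f_mid = f(b_mid)
        if f_low * f_mid <= 0:
            b_high = b_mid
        else:
            b_low, f_low = b_mid, f_mid
    return 0.5 * (b_low + b_high)


rng = random.Random(1)
n = 20000
# Hypothetical outcome of steps 1 and 2 for two components
precision = simulate_component(1.0, 1.0, 0.05, n, rng)  # normal, CV about 5 %
sampling = simulate_component(0.5, 1.0, 0.10, n, rng)   # skewed, CV about 20 %
# Step 5: combination in the original space, here by multiplication
combined = [p * s for p, s in zip(precision, sampling)]
b_new = find_b_opt(combined)
print(f"B_opt of the combined data set = {b_new:.2f}")
```

In this hypothetical case the combined data set is dominated by the skewed component, so the new \(B_{{{\text{opt}}}}\) is expected to lie close to the value of that component.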

Calculations

Calculations were performed using Excel software (Office 365, Microsoft). Finding \(B_{{{\text{opt}}}}\) was performed using Solver (an Excel add-in program) with settings given in Table 1.

Table 1 Settings used in Solver (Excel add-in program) when finding \(B_{opt}\)

The constraint \(B \ge 0.0001\) was used to prevent \(B\) from reaching 0 in the optimization. If optimization resulted in \(B = 0.0001\), a second optimization step was performed with the constraint \(B \le - 0.0001\). A start value of 0.5 was used (− 0.5 if a second optimization step was performed) but the value is not critical. Random data with a normal probability distribution were generated using NORM.INV(RAND();mean;standard deviation) and with a rectangular distribution using RAND().
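For readers working outside Excel, the two random number formulas have close counterparts in the Python standard library. The sketch below mirrors NORM.INV(RAND();mean;standard deviation) by inverse transform sampling; scaling RAND() to a rectangular distribution with a given mean and halfwidth is an assumption made here, since the exact scaling is not stated above.

```python
import random
from statistics import NormalDist


def normal_value(mean, sd, rng):
    """Counterpart of NORM.INV(RAND();mean;sd): inverse normal CDF of a uniform number."""
    u = rng.random()
    while u == 0.0:  # the inverse CDF is undefined at exactly 0
        u = rng.random()
    return NormalDist(mean, sd).inv_cdf(u)


def rectangular_value(mean, halfwidth, rng):
    """Counterpart of RAND() scaled to mean +/- halfwidth (assumed scaling)."""
    return mean + halfwidth * (2.0 * rng.random() - 1.0)


rng = random.Random(3)
normal_data = [normal_value(10.0, 2.0, rng) for _ in range(20000)]
rect_data = [rectangular_value(1.0, 0.05, rng) for _ in range(20000)]

normal_mean = sum(normal_data) / len(normal_data)
print(f"mean of normal data: {normal_mean:.2f}")
print(f"range of rectangular data: {min(rect_data):.3f} to {max(rect_data):.3f}")
```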

Analysis of variance (ANOVA) was performed using RANOVA2 (a stand-alone program running in Microsoft Excel) available from Royal Society of Chemistry (RSC) website [15].

Results and discussions

Implementation of transformation using \(x^{{B_{opt} }}\) when calculating expanded uncertainty intervals (i.e. using Eq. 10) on experimental data from different types of chemical analysis is given in Table 2.

Table 2 Expanded uncertainty intervals (95 %) calculated (1) using transformation based on \(x^{{B_{opt} }}\), (2) without transformation and (3) using transformation based on log10 \(x\) for data from different types of chemical analysis

Also included are expanded uncertainty intervals (95 %) calculated without transformation of data (using Eqs. 4 and 5), and using transformation based on log10 \(x\) (i.e. using Eq. 7).

Examples 1–5 in Table 2 are discussed below.

Example 1 Determination of sulphur in gas samples using gas chromatography and chemiluminescence detection:

The within-laboratory reproducibility is 15 % at two different concentration levels, which is on the border when asymmetry needs to be considered. New control samples were prepared when the previous control sample was finished, and the measured concentrations have been corrected to account for the difference in nominal concentrations between the control samples. The data indicate that a “true” \(B_{opt}\) is around 0.5, i.e. the example illustrates a case when the distribution of measurement results is “between” a normal distribution and a log-normal distribution. The calculated expanded uncertainty intervals (95 %) around a nominal measured value of 10 and 20 based on within-laboratory reproducibility are compared in Fig. 3(a) and (b).

Fig. 3 Expanded uncertainty intervals (95 %) for determination of sulphur in gas samples using gas chromatography and chemiluminescence detection calculated around a nominal measured value of (a) 10 mg/kg and (b) 20 mg/kg. (A) Using transformation based on \(x^{{B_{{{\text{opt}}}} }}\) (Eq. 10), (B) without transformation neglecting that \(s\) will be different at lower and upper limit (Eq. 4), (C) without transformation taking into account that \(s\) will be different at lower and upper limit (Eq. 5), and (D) using transformation based on log10 \(x\) (Eq. 7)

The intervals are fairly similar, and this example illustrates a case that is on the border when asymmetry needs to be considered (the within-laboratory reproducibility is 15 %). The shape of interval A is “between” interval C (that corresponds to using \(B_{{{\text{opt}}}}\) equal to 1) and interval D (that corresponds to using \(B_{{{\text{opt}}}}\) equal to 0) which is reasonable since \(B_{{{\text{opt}}}}\) for the data sets is found to be around 0.5.

Example 2 Determination of nitrogen (N) using an elemental analyser for C, H and N:

The within-laboratory reproducibility is 6 %, which is in the range when asymmetry is typically considered negligible. However, \(B_{opt}\) is around 5 to 6, indicating a negative skewness, which in this case has an impact on the uncertainty interval even at a within-laboratory reproducibility as low as 6 %. The calculated expanded uncertainty intervals (95 %) around a nominal measured mass fraction of 0.10 % based on within-laboratory reproducibility data are compared in Fig. 4.

Fig. 4 Expanded uncertainty intervals (95 %) for determination of nitrogen (N) using an elemental analyser for C, H and N calculated around a nominal measured mass fraction of 0.10 %. (a) Using transformation based on \(x^{{B_{{{\text{opt}}}} }}\) (Eq. 10), (b) without transformation neglecting that \(s\) will be different at lower and upper limit (Eq. 4), (c) without transformation taking into account that \(s\) will be different at lower and upper limit (Eq. 5), and (d) using transformation based on log10 \(x\) (Eq. 7)

Interval A has a somewhat different shape compared to the other intervals, and has a shape that is “outside” the shape of interval C (that corresponds to using \(B_{{{\text{opt}}}}\) equal to 1) and interval D (that corresponds to using \(B_{{{\text{opt}}}}\) equal to 0). Note that although the measurement results will have a negative skewness, the fact that the relative standard deviation is independent of the measurand level (i.e. the standard deviation increases with the measurand level) will cause the uncertainty interval to have a positive skewness.

Example 3 Determination of biochemical oxygen demand (BOD) using electrochemical detection of oxygen:

The within-laboratory reproducibility is 5 %. This is another example where \(B_{{{\text{opt}}}}\) is above 1, i.e. the measurement results have a negative skewness. The results indicate that a “true” \(B_{{{\text{opt}}}}\) is around 4. BOD is determined as the dissolved oxygen concentration before incubation minus dissolved oxygen concentration after incubation. Hence, if results for measurement of the oxygen concentration after incubation have positive skewness, results for measurement of BOD can have a negative skewness since it is calculated as a difference. Similar to example 2, this has an impact on the uncertainty intervals. The calculated expanded uncertainty intervals (95 %) around a nominal measured value of 200 mg/l based on within-laboratory reproducibility data are compared in Fig. 5.
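The sign change of the skewness when a result is calculated as a difference can be illustrated with a small simulation. All numbers below are hypothetical (they are not BOD data from this work), and dilution factors are ignored; the sketch only shows that subtracting a positively skewed quantity from a symmetric one gives a negatively skewed difference.

```python
import math
import random


def skewness(values):
    """Moment-based sample skewness, m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(0)
n = 50000
# Hypothetical oxygen concentrations (mg/l): symmetric before incubation,
# positively skewed (log-normal) after incubation
before = [rng.gauss(9.0, 0.1) for _ in range(n)]
after = [rng.lognormvariate(math.log(4.0), 0.2) for _ in range(n)]
difference = [b - a for b, a in zip(before, after)]

print(f"skewness after incubation: {skewness(after):+.2f}")
print(f"skewness of difference:    {skewness(difference):+.2f}")
```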

Fig. 5 Expanded uncertainty intervals (95 %) for determination of biochemical oxygen demand (BOD) calculated around a nominal measured value of 200 mg/l. (a) Using transformation based on \(x^{{B_{{{\text{opt}}}} }}\) (Eq. 10), (b) without transformation neglecting that \(s\) will be different at lower and upper limit (Eq. 4), (c) without transformation taking into account that \(s\) will be different at lower and upper limit (Eq. 5), and (d) using transformation based on log10 \(x\) (Eq. 7)

As in the previous example, the shape of interval A is not “between” the shapes of interval C (that corresponds to using \(B_{{{\text{opt}}}}\) equal to 1) and interval D (that corresponds to using \(B_{{{\text{opt}}}}\) equal to zero). The difference in shape between intervals A and C is smaller than in example 2, reflecting that the within-laboratory reproducibility and \(B_{opt}\) are both somewhat smaller than in example 2. As in example 2, the uncertainty interval will have a positive skewness even though the measurement results have a negative skewness.

Example 4 Determination of lead (Pb) in contaminated soil:

Here, the measurement uncertainty is based on repeatability data that is obtained from measurements of duplicate samples. Repeatability is calculated for the sampling step and the analysis step using ANOVA. This is often referred to as the “duplicate method” and is described in the Eurachem/CITAC Guide Measurement uncertainty arising from sampling—A guide to methods and approaches [6, 7]. Using results for determination of lead in contaminated top soil in the Eurachem/CITAC Guide (Example A2) describing between target variability, a \(B_{{{\text{opt}}}}\) value of − 0.31 was obtained. This \(B_{{{\text{opt}}}}\) value was then used to transform results for duplicate samples (assuming that sampling variability has the same distribution as the between target variability) followed by calculation of repeatability for the sampling step and the analysis step using ANOVA. In the original literature, it was instead assumed that the between target variability had a close to log-normal distribution and that the sampling variability had the same distribution as the between target variability. A more detailed description of the calculations is available in the literature [4, 6]. The calculated expanded uncertainty intervals (95 %) around a nominal measured value of 300 mg/kg based on repeatability data are compared in Fig. 6 for the sampling step (Fig. 6a), the analysis step (Fig. 6b), and the whole measurement (Fig. 6c).

Fig. 6 Expanded uncertainty intervals (95 %) for determination of the mass fraction of lead (Pb) in contaminated soil calculated around a nominal measured value of 300 mg/kg for (a) sampling step, (b) analysis step, and (c) total measurement. (A) Using transformation based on \(x^{{B_{opt} }}\) (Eq. 10), (B) without transformation neglecting that \(s\) will be different at lower and upper limit (Eq. 4), (C) without transformation taking into account that \(s\) will be different at lower and upper limit (Eq. 5), and (D) using transformation based on log10 \(x\) (Eq. 7)

The shape of interval A is “outside” the shapes of interval C (that corresponds to using \(B_{{{\text{opt}}}}\) equal to 1) and D (that corresponds to using \(B_{{{\text{opt}}}}\) equal to 0), but very similar in shape to interval D since \(B_{{{\text{opt}}}}\) for the data set is found to be just below zero (− 0.31). It is also apparent that for the sampling step, intervals B and C, where it is assumed that measurement results have a normal distribution, do not work well. In particular, the upper limit of the interval for the sampling step for interval C is unreasonably high. This is due to the wrong assumption that the distribution of measurement results has a normal distribution when using Eq. 5. Transforming data to obtain an approximate normal distribution prior to calculation of the interval will solve this issue (see interval A). For the analysis step where CV is small, the differences between the intervals are small. The uncertainty for the sampling step is dominating the whole analysis, and the uncertainty for the analysis step is almost negligible.

Example 5 Determination of organophosphorus pesticides in bread:

In this example, Monte Carlo simulations were used to generate data sets (with \(10^{6}\) data) for a model equation describing the combined standard uncertainty for a calculated concentration \(C\). A suitable model equation is taken from the literature [11]:

$$C = \frac{{I_{{\text{p}}} \times C_{{{\text{ref}}}} \times V_{{{\text{dil}}}} }}{{I_{{{\text{ref}}}} \times m \times R}} \times F_{{{\text{hom}}}} \times F_{I}$$
(11)

where the input quantities are defined in Table 3. Four different data sets were generated, denoted (a), (b), (c), and (d). The probability distribution of the input quantities and the parameters describing the distribution (mean set to 1, and standard deviation or halfwidth) are also given in Table 3. The Monte Carlo simulations for (a) were aimed at generating a data set with CV similar to the relative standard uncertainty reported in the original literature (34 %).
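A Monte Carlo simulation of Eq. 11 can be sketched in Python as shown below. The choice of distribution for each input quantity and every standard deviation and halfwidth in the sketch are placeholders invented for illustration; they are not the values of Table 3, and the sample size is reduced from \(10^{6}\) to keep the example fast.

```python
import random


def normal(rng, sd):
    """Normally distributed input quantity with mean 1."""
    return rng.gauss(1.0, sd)


def rectangular(rng, halfwidth):
    """Rectangularly distributed input quantity with mean 1."""
    return rng.uniform(1.0 - halfwidth, 1.0 + halfwidth)


def simulate_concentration(rng):
    """One Monte Carlo draw of C according to Eq. 11 (placeholder uncertainties)."""
    i_p = normal(rng, 0.05)
    c_ref = normal(rng, 0.02)
    v_dil = rectangular(rng, 0.01)
    i_ref = normal(rng, 0.05)
    m = rectangular(rng, 0.01)
    r = normal(rng, 0.10)
    f_hom = normal(rng, 0.10)
    f_i = normal(rng, 0.05)
    return (i_p * c_ref * v_dil) / (i_ref * m * r) * f_hom * f_i


rng = random.Random(11)
n = 50000
c = [simulate_concentration(rng) for _ in range(n)]

mean_c = sum(c) / n
sd_c = (sum((v - mean_c) ** 2 for v in c) / (n - 1)) ** 0.5
cv = sd_c / mean_c
m3 = sum((v - mean_c) ** 3 for v in c) / n
skew = m3 / (sum((v - mean_c) ** 2 for v in c) / n) ** 1.5
print(f"mean = {mean_c:.3f}, CV = {100 * cv:.1f} %, skewness = {skew:+.2f}")
```

Scaling all placeholder standard deviations and halfwidths up or down changes the CV of the simulated data set, which corresponds to how data sets with different CV are obtained in this example.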

Table 3 Input quantities for modelling the procedure for determination of pesticides in bread, their distribution, mean value, and standard deviation or halfwidth

The \(B_{{{\text{opt}}}}\) value was found to be in the range of 0.26 to 0.32 (see Table 2). The calculated expanded uncertainty intervals (95 %) around a nominal measured value of 1 based on combined standard uncertainty for the model equation are compared in Fig. 7 for the four data sets.

Fig. 7 Expanded uncertainty intervals (95 %) for determination of pesticides in bread calculated around a nominal measured value of 1 for four simulated data sets with CV in the original space equal to (a) 33 %, (b) 20 %, (c) 11 %, and (d) 2.1 %. (A) Using transformation based on \(x^{{B_{{{\text{opt}}}} }}\) (Eq. 10), (B) without transformation neglecting that \(s\) will be different at lower and upper limit (Eq. 4), (C) without transformation taking into account that \(s\) will be different at lower and upper limit (Eq. 5), and (D) using transformation based on log10 \(x\) (Eq. 7)

In Fig. 7a, the shape of interval A is “between” interval C (that corresponds to using \(B_{{{\text{opt}}}}\) equal to 1) and interval D (that corresponds to using \(B_{{{\text{opt}}}}\) equal to 0) which is reasonable since \(B_{{{\text{opt}}}}\) for the data sets is found to be around 0.32. As in example 4, intervals B and C, where it is assumed that measurement results have a normal distribution, do not work well. The data sets denoted (b), (c), and (d) were generated to have CV equal to 20 %, 11 %, and 2.1 %, respectively, by scaling down the standard deviation and halfwidth of the input quantities (see Table 3). Clearly, the difference between intervals calculated in different ways (A, B, C, and D) vanishes with decreasing CV of the data set. This illustrates that when CV is smaller than 15 % to 20 %, positive skewness in the measurement results can be neglected when calculating measurement uncertainty intervals. Furthermore, when CV is smaller than 10 % to 15 %, neglecting that \(s\) will be different at lower and upper limit will work fine.

It is also possible to calculate a 95 % coverage interval for the output quantity of the Monte Carlo generated data sets using the 0.025- and 0.975-quantiles as endpoints [16]. For data set (a) (CV of 33 %), the interval will be 0.47 to 1.79. An identical interval will be obtained by calculating an interval in the \(x^{{B_{{{\text{opt}}}} }}\) transformed space as \(\overline{x}_{{{\text{trans}}}} - 1.96 \times s_{{{\text{trans}}}}\) to \(\overline{x}_{{{\text{trans}}}} + 1.96 \times s_{{{\text{trans}}}} ,\) where \(\overline{x}_{{{\text{trans}}}}\) is the mean in the transformed space, followed by back-transformation to the original space. This demonstrates how well the transformation using \(x^{{B_{{{\text{opt}}}} }}\) works.
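The agreement between the two ways of calculating the interval can be checked with a small Python sketch. It does not use the data set of this example: the data are constructed so that \(x^{B}\) is exactly normal for an assumed \(B = 0.3\), and the mean and standard deviation in the transformed space are likewise assumptions.

```python
import random


def quantile(sorted_values, p):
    """Empirical quantile by linear interpolation between order statistics."""
    idx = p * (len(sorted_values) - 1)
    low = int(idx)
    high = min(low + 1, len(sorted_values) - 1)
    frac = idx - low
    return sorted_values[low] * (1.0 - frac) + sorted_values[high] * frac


rng = random.Random(42)
b = 0.3          # assumed B_opt
n = 100000
# Data whose x**b transform is normal with mean 1 and standard deviation 0.1
x = sorted(rng.gauss(1.0, 0.1) ** (1.0 / b) for _ in range(n))

# (i) coverage interval from the 0.025- and 0.975-quantiles in the original space
q_low, q_high = quantile(x, 0.025), quantile(x, 0.975)

# (ii) mean +/- 1.96 s in the transformed space, followed by back-transformation
x_trans = [v ** b for v in x]
mean_t = sum(x_trans) / n
s_t = (sum((v - mean_t) ** 2 for v in x_trans) / (n - 1)) ** 0.5
t_low = (mean_t - 1.96 * s_t) ** (1.0 / b)
t_high = (mean_t + 1.96 * s_t) ** (1.0 / b)

print(f"quantiles:        {q_low:.2f} to {q_high:.2f}")
print(f"back-transformed: {t_low:.2f} to {t_high:.2f}")
```

In this constructed case the two intervals agree apart from random sampling error, because the data are normal in the transformed space by design.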

Conclusions

Several conclusions can be drawn from the above:

  1. Uncertainty intervals calculated using

    $$\frac{x}{{\left( {1 + k \times s_{{\text{rel,trans}}} } \right)^{{1/B_{{{\text{opt}}}} }} }}\;{\text{to}}\;\frac{x}{{\left( {1 - k \times s_{{\text{rel,trans}}} } \right)^{{1/B_{{{\text{opt}}}} }} }}$$
    (12)

    can handle many types of asymmetry in the measurement results when the coefficient of variation is independent of the measurand level. Here, \(k\) is the coverage factor, and \(s_{{\text{rel,trans}}}\) is the relative standard deviation of measurement results in the transformed space. The parameter \(B_{{{\text{opt}}}}\) is optimized to get a symmetric distribution. Equation 12, which is identical to Eq. 10, includes cases where the distribution of the measurement results can be approximated with a normal distribution (using \(B\) equal to 1) and a log-normal distribution (using \(B\) close to 0, for instance \(B\) equal to 0.0001).

  2. Several of the examples indicate that \(B_{{{\text{opt}}}}\) for many types of chemical analysis is close neither to 1, which corresponds to normally distributed measurement results, nor to 0, which corresponds to log-normally distributed measurement results.

  3. A value for \(B_{{{\text{opt}}}}\) can be estimated based on experimental results, modelling of results, or judgement.

  4. When the coefficient of variation of measurement results is less than 15 % to 20 %, approximation with a normal distribution for the measurement results is often “good enough” for most applications.