Introduction

The first proficiency tests organized by the Institute for Chemical Processing of Coal (IChPW) in Zabrze, Poland, were conducted in 1999. The structure of the Institute has included a proficiency testing provider (PTP) since 2008. Since December 2010, the IChPW PTP has operated a management system conforming to the requirements of standard PN-EN ISO/IEC 17043:2010 [1], as confirmed by a certificate issued by the Polish Centre for Accreditation. The scope of accreditation comprises proficiency testing programs for the analysis of hard coal, brown coal, biomass for power purposes, coke from hard coal and solid combustion by-products.

The organization of the proficiency test (PT) proceeded according to ISO 17043:2010 [1], and the statistical treatment of data was performed according to ISO 13528:2005 [2] and ISO 5725-2 [3] standards.

The PT was designed to assess the analytical performance of laboratories in charge of confirmatory analysis as well as screening analysis, since most of the laboratories belong to Polish heat and power plants.

This paper presents the organization of a PT performed in 2012 for hard coal analysis: the determination of the mass fraction of ash, w(A), total sulfur, w(S), and total carbon, w(C), and the gross calorific value, Q.

Experimental

Preparation of hard coal samples

All of the hard coal samples were prepared in accordance with standard PN-90/G-04502 [4]. First, the material was collected and adequately conditioned. It was then air-dried and ground to a particle size of <2 mm before homogenization and sample division. Samples weighing 50 g were separated from the material and placed in boxes.

Determination of homogeneity and stability of PT

The homogeneity of the samples of hard coal was tested by analyzing the ash mass fraction, and parallel determinations from 15 subsamples were performed. Investigations of the coal properties were carried out, and a method was developed for the selection and conditioning of this material, which was distributed to participants as part of the PT.

Materials distributed in PTs must be sufficiently stable over the period in which the assigned values are valid. The stability of the samples of hard coal was tested by analyzing the ash mass fraction (as in the homogeneity test), and parallel determinations were performed on the 10 subsamples that remained after all of the samples had been sent to the participants.

Sample homogeneity and stability were assessed by analyzing different measurands, and in all cases the results demonstrated homogeneity and stability. It was therefore concluded that analysis of the ash mass fraction is sufficient for the evaluation of sample homogeneity and stability.

The computational methods used to assess the homogeneity and stability of the test objects complied with the ISO 13528 standard [2].

Proficiency testing

The organization of the PT in hard coal analysis proceeded according to ISO 17043:2010 [1]. At the beginning, all instructions and relevant information for participants were made available on the IChPW website. After registration, in accordance with confidentiality rules, each participant was assigned a code. All of the results, tables and performance assessments in this paper are reported according to this code. In total, forty-four participants from Polish heat and power plants, research entities and independent laboratories were included in the testing round.

The samples were sent to the participants, who measured the mass fraction of ash, w(A), total sulfur, w(S), and total carbon, w(C), and the gross calorific value, Q, in triplicate. The values were reported to the PTP, along with the mean value of the results and the expanded uncertainty of the measurements (for k = 2). All of the participants calculated the uncertainties according to the European co-operation for Accreditation guidelines EA-04/16 (2003) [5]. The results were converted to a dry basis and are presented on that basis. Additional information regarding the techniques and instruments was also requested. Participants determined the investigated measurands according to specified standards and their own procedures. The methods and a description of their main principles are shown in Table 1.

Table 1 Description of methods applied to measure the investigated measurands

After the participants returned their results, the statistical treatment of the data and the evaluation of laboratory performance were carried out according to the recommendations of the ISO 17043 [1] and ISO 13528 [2] standards.

Statistical analysis of results

The assigned values X of the measurands, the corresponding standard uncertainties u X and the standard deviations σ were determined from the results obtained by the participants, using a robust method in accordance with the ISO 13528 standard [2]. The expanded uncertainty of the assigned value, U X, is calculated as U X = k·u X, where k is the coverage factor, determined from the Student’s t distribution for the appropriate degrees of freedom and % confidence for each measurand (k = 2).
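The text names only "a robust method in accordance with the ISO 13528 standard". As an illustration, the sketch below assumes that this refers to Algorithm A of that standard (iterative winsorization at x* ± 1.5 s*) and that the standard uncertainty of the assigned value is estimated as u X = 1.25·s*/√p for p participants. The function names, tolerance and iteration limit are hypothetical choices, not taken from the paper.

```python
import math
import statistics


def algorithm_a(results, tol=1e-6, max_iter=100):
    """Robust mean x* and robust SD s* (assumed: ISO 13528 Algorithm A)."""
    values = list(results)
    # Initial estimates: median and scaled median absolute deviation.
    x_star = statistics.median(values)
    s_star = 1.483 * statistics.median([abs(v - x_star) for v in values])
    for _ in range(max_iter):
        delta = 1.5 * s_star
        low, high = x_star - delta, x_star + delta
        # Winsorize: pull results outside [low, high] back to the limits.
        clipped = [min(max(v, low), high) for v in values]
        new_x = statistics.mean(clipped)
        new_s = 1.134 * statistics.stdev(clipped)
        converged = abs(new_x - x_star) < tol and abs(new_s - s_star) < tol
        x_star, s_star = new_x, new_s
        if converged:
            break
    return x_star, s_star


def assigned_value_uncertainty(s_star, n_participants, k=2.0):
    """Return (u_X, U_X), assuming u_X = 1.25 * s* / sqrt(p) and U_X = k * u_X."""
    u_X = 1.25 * s_star / math.sqrt(n_participants)
    return u_X, k * u_X
```

Because extreme results are pulled back to the limits rather than discarded, a single deviating laboratory shifts the assigned value far less than it would shift an arithmetic mean.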

To use performance statistics in the assessment of participant performance, the arithmetic means of the studied measurands must be normally distributed, as described in the ISO 13528 standard [2]. Before the normality of the distribution of the results is analyzed, outliers and warning signals must be identified using the Grubbs test, in accordance with the ISO 5725-2 standard [3]. These values are excluded from the subsequent calculations. The normality of the mean values of the data obtained from the participants was verified using the Shapiro–Wilk test. The test probability p was taken as the test result, with 0.05 as the limiting value of the test probability.

To select an appropriate performance statistic, one must determine whether the uncertainty of the assigned value is lower than the critical value (0.3 σ). The proficiency standard deviation σ was assumed to be equal to the robust standard deviation s*, which was determined from the results obtained by the participants in the current round. If the condition is satisfied, the uncertainty of the assigned value is small enough to be neglected in the performance statistic calculations, and the participant’s performance is expressed by the z score. If the condition is not satisfied, the uncertainty of the assigned value must be considered in the calculation, and the z′ score should be used.
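The selection rule described above can be sketched as follows. The z′ formula is not reproduced in this paper; the form used here, z′ = (x − X)/√(σ² + u X²), is the one given in ISO 13528 and is stated as an assumption about what the authors would have applied. The function names are hypothetical.

```python
import math


def choose_score(u_X, sigma):
    """Return 'z' when u_X is below the critical value 0.3 * sigma, else 'z_prime'."""
    return "z" if u_X < 0.3 * sigma else "z_prime"


def z_prime_score(x, X, sigma, u_X):
    """z' score (assumed ISO 13528 form): the denominator also includes u_X."""
    return (x - X) / math.sqrt(sigma ** 2 + u_X ** 2)
```

With u X below 0.3 σ, the denominator of z′ exceeds σ by less than about 5 %, which is why the simpler z score is considered adequate in that case.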

In the current round of proficiency testing, the above conditions were satisfied; thus, the z score was used, according to the following equation:

$$ z = \frac{x - X}{\sigma } $$
(1)

where x is the participant’s result, X is the assigned value and σ is the robust standard deviation for proficiency assessment [2].

The test results of the participants were interpreted as follows:

$$ \begin{array}{ll} \left| z \right| < 2 & \text{satisfactory result} \\ 2 \le \left| z \right| < 3 & \text{questionable result} \\ \left| z \right| \ge 3 & \text{unsatisfactory result} \end{array} $$
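As a minimal sketch, Eq. (1) and the interpretation limits can be written as two small functions. The function names are hypothetical, and assigning the boundary value |z| = 2 to the "questionable" class is an assumption made here for completeness.

```python
def z_score(x, X, sigma):
    """z score from Eq. (1): x is the participant's result, X the assigned value."""
    return (x - X) / sigma


def interpret_z(z):
    """Classify a z score using the limits 2 and 3."""
    magnitude = abs(z)
    if magnitude < 2:
        return "satisfactory"
    if magnitude < 3:  # assumption: |z| == 2 is treated as questionable
        return "questionable"
    return "unsatisfactory"
```

For example, a hypothetical result of 10.3 against an assigned value of 10.0 with σ = 0.25 gives z ≈ 1.2, which is classified as satisfactory.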

The performance statistics that account for the uncertainty of the participants’ measurements are the E n , ζ (zeta) and E z scores.

In the PT report, which was sent to the participants, the E z score was used to evaluate the measurement uncertainty according to the applicable standard ISO 13528:2005 [2]. Other statistics, such as the E n and ζ (zeta) scores, may be used when the assigned value is not calculated from the results reported by the participants. In the current round, the assigned value was calculated using a robust method and was therefore correlated with the results reported by the participants. However, because the E n and ζ (zeta) scores are commonly used by National Metrology Institutes and are widely accepted, the E n score was used in this paper to evaluate the measurement uncertainty [2], using the equation:

$$ E_{n} = \frac{x - X}{{\sqrt {U_{x}^{2} + U_{X}^{2} } }} $$
(2)

where x is the participant’s result, X is the assigned value, U x is the expanded measurement uncertainty (k = 2) and U X is the expanded uncertainty of the assigned value (k = 2). When no uncertainty was reported, it was set to zero (U x = 0).

Using the E n score, it is possible to compare the “range” reported by the laboratory, x ± U x (k = 2), with the assigned “range”, X ± U X (k = 2). A value of |E n | < 1 provides objective evidence that the estimate of uncertainty is consistent with the definition of expanded uncertainty given in the GUM [14].
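Eq. (2) translates directly into code. The sketch below follows the convention stated in the text that a missing participant uncertainty is replaced by zero; the function names and the use of `None` to mark a missing value are hypothetical.

```python
import math


def en_score(x, X, U_X, U_x=None):
    """E_n score from Eq. (2); U_x and U_X are expanded uncertainties (k = 2)."""
    if U_x is None:
        U_x = 0.0  # the paper sets U_x = 0 when no uncertainty was reported
    denominator = math.sqrt(U_x ** 2 + U_X ** 2)
    if denominator == 0.0:
        raise ValueError("at least one of U_x and U_X must be non-zero")
    return (x - X) / denominator


def en_is_satisfactory(en):
    """|E_n| < 1: the result and its uncertainty are consistent with X ± U_X."""
    return abs(en) < 1.0
```

Setting U x = 0 leaves only U X in the denominator, so a laboratory that reports no uncertainty is judged against the uncertainty of the assigned value alone.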

Results and discussion

Evaluation of homogeneity and stability of PT

In the homogeneity test, the standard deviation between the unit samples was lower than the critical value (0.3 σ), indicating that the sample was homogeneous.

In the stability test, the difference between the results obtained at the beginning of the round and those obtained after 4 weeks was lower than the critical value (0.3 σ), indicating that the samples were stable during the time frame of the analyses.

The proficiency standard deviation σ was assumed to be equal to the robust standard deviation s*, which was determined from the results obtained by the participants in the previous round in the case of homogeneity test and in the current round in the case of stability test, in accordance with the ISO 13528 standard [2].
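The two checks can be sketched as below. The paper does not spell out the computation, so the sketch assumes the scheme of ISO 13528 with duplicate determinations per sample: the between-sample standard deviation is obtained from the standard deviation of the sample means and the within-sample standard deviation, and stability is judged from the difference of two general means. All function names are hypothetical.

```python
import math
import statistics

CRITICAL_FACTOR = 0.3  # critical value is 0.3 * sigma, as stated in the text


def between_sample_sd(duplicates):
    """Between-sample SD from (first, second) duplicate results per sample."""
    g = len(duplicates)
    sample_means = [(a + b) / 2 for a, b in duplicates]
    ranges = [abs(a - b) for a, b in duplicates]
    s_x = statistics.stdev(sample_means)  # SD of the sample means
    s_w = math.sqrt(sum(w * w for w in ranges) / (2 * g))  # within-sample SD
    variance = s_x ** 2 - s_w ** 2 / 2
    return math.sqrt(max(variance, 0.0))  # a negative estimate is set to zero


def is_homogeneous(duplicates, sigma):
    """True when the between-sample SD is below 0.3 * sigma."""
    return between_sample_sd(duplicates) < CRITICAL_FACTOR * sigma


def is_stable(initial_results, later_results, sigma):
    """True when the shift between the two general means is below 0.3 * sigma."""
    shift = abs(statistics.mean(initial_results) - statistics.mean(later_results))
    return shift < CRITICAL_FACTOR * sigma
```

Both checks use the same factor of 0.3, so a between-sample spread or a drift over time that passes contributes little to the spread of the participants' results.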

Statistical analysis of results

Of the 44 laboratories registered for the PT, all reported measurement results for w(A) and Q, while two and four laboratories did not report results for w(S) and w(C), respectively. However, three laboratories that reported results (code numbers 25, 28 and 37) did not submit uncertainty budgets for any of the measurands investigated.

The specifications of the calculated assigned values X and the corresponding uncertainties U X and relative standard deviations σ are presented in Table 2.

Table 2 Assigned value (X), associated expanded uncertainty (U X , k = 2) and relative standard deviation for PT assessment (σ) for investigated measurands

The distribution of the arithmetic means of the studied measurands was found to be normal according to the Shapiro–Wilk test (probability p values ranged from 0.05 to 0.88). In the current round of proficiency testing, the z and E n scores were used. The values of the z and E n scores obtained by the participants for the determined measurands and for the evaluation of measurement uncertainty are presented in Table 3.

Table 3 Specification of the obtained z and E n score values

The laboratories’ performance appears to be good for all measurands, with at least 84 % of the z and E n scores being satisfactory (Table 3). The percentage of satisfactory z scores is 84, 95, 90 and 90 % for w(A), w(S), Q and w(C), respectively. The percentage of satisfactory E n scores is 86, 88, 89 and 85 % for w(A), w(S), Q and w(C), respectively.

In seven cases, participants obtained a questionable z score together with an unsatisfactory E n score [for w(A) and w(C)], and two participants obtained unsatisfactory z and E n scores [for w(S) and Q].

Figure 1 presents the mean values of the investigated measurands of hard coal x, the expanded measurement uncertainties U x and the assigned values X, their expanded uncertainties U X and standard deviation for PT σ. The solid line represents the assigned value X, the dashed lines delimit the assigned interval X ± U X (k = 2), which is relevant for the E n score evaluation and the dotted lines delimit the target interval (X ± 2σ), which is relevant for the z score evaluation.

Fig. 1
Measurement results and uncertainties (vertical bars) of determined measurands: a w(A), b w(S), c Q, d w(C). The solid line represents the assigned value X, the dashed lines delimit the assigned interval X ± U X (k = 2) and the dotted lines delimit the target interval (X ± 2σ)

Fig. 1 allows both the determined measurands and the estimated measurement uncertainties to be evaluated. If a measurand was determined correctly, its value lies within the region defined as X ± 2σ, and the participant obtained a satisfactory z score.

Additionally, in accordance with the ISO 13528 standard [2], the measurands and the estimated measurement uncertainties were properly determined if the reported values covered the region defined as X ± U X . Most participants (85 % or more) carried out a valid calculation of the expanded uncertainties of their results. There are also cases in which the uncertainties are formally correct but their values are too high (for example, participant 8 for the sulfur mass fraction or participant 21 for the gross calorific value). Some participants estimated the measurement uncertainties correctly, but the mean values of the investigated measurand are too high or too low (for example, participant 34 for the ash mass fraction, participants 6 and 23 for the gross calorific value, or participants 26, 39 and 44 for the carbon mass fraction).

The E n score states whether the laboratory result agrees with the assigned value within the respective uncertainties. It combines the measurement result x, the assigned value X, the expanded uncertainty of the assigned value U X (k = 2) and the expanded measurement uncertainty U x (k = 2).

An additional assessment was provided to each laboratory, based on the evaluation of reported measurement uncertainties, similar to what is implemented by the IMEP program [15]. Figure 2 presents for each measurand the number of participants having underestimated, overestimated or estimated correctly their measurement uncertainties.

Fig. 2
Uncertainty evaluation: u x , measurement uncertainty, u X , uncertainty of assigned value, calculated from expanded uncertainties for k = 2, σ, standard deviation for PT assessment

The minimum and maximum acceptable uncertainties were set equal to the uncertainty of assigned value (u X ) and to the standard deviation for PT assessment (σ), respectively. The measurement uncertainty reported by a laboratory (u x ) was considered as:

  • acceptable when ranging between u X and σ,

  • underestimated when smaller than u X or

  • overestimated when larger than σ.

In the last two cases, laboratories were advised to review their uncertainty budgets. More than half of the participants seem to have overestimated their measurement uncertainties for w(A), w(S) and Q. Nevertheless, a few laboratories that overestimated their measurement uncertainties obtained unsatisfactory E n scores, probably because of a large analytical bias that needs to be investigated.
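The three-way classification of the reported uncertainties can be sketched as below. Standard uncertainties are obtained from the expanded ones by dividing by k = 2, as indicated in the caption of Fig. 2; the function names are hypothetical.

```python
def standard_from_expanded(U, k=2.0):
    """Convert an expanded uncertainty U (coverage factor k) to a standard one."""
    return U / k


def classify_uncertainty(u_x, u_X, sigma):
    """Compare a reported standard uncertainty u_x with the limits u_X and sigma."""
    if u_x < u_X:
        return "underestimated"  # smaller than the uncertainty of the assigned value
    if u_x > sigma:
        return "overestimated"   # larger than the standard deviation for PT assessment
    return "acceptable"          # u_X <= u_x <= sigma
```

The lower limit reflects the expectation that a routine laboratory cannot measure with a smaller uncertainty than that of the assigned value, and the upper limit reflects the spread accepted among participants.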

Specification of methods

Participants used the methods described in Polish and international standards, as well as proprietary testing procedures (single-laboratory validated methods). The main methods and their descriptions are shown in Table 1.

Figure 3 presents the methods applied to measure the investigated measurands, together with the number of participants that obtained satisfactory, questionable and unsatisfactory z scores with each method.

Fig. 3
Methods applied to measure the investigated measurands and the number of participants that obtained satisfactory, questionable and unsatisfactory z scores (proprietary procedure: single-laboratory validated method)

Most of the participants that reported satisfactory results applied the following methods for the determination of the various measurands:

  • thermogravimetry according to ISO 1171:2010 for w(A),

  • SO2 detection by infrared (IR) according to PN-G-04584:2001 for w(S),

  • calorimetry according to PN-81/G-04513 for Q, and

  • CO2 detection by IR according to PN-G-04571:1998 for w(C).

Many other single-laboratory validated methods (proprietary procedures) were used for w(C). The performance of the laboratories that reported questionable or unsatisfactory results will be monitored in our next proficiency testing rounds in order to follow the improvement of their measurement capability.

Summary

The Institute for Chemical Processing of Coal in Zabrze (Poland) organized a PT exercise in 2012 to determine the mass fractions of ash, total sulfur and total carbon and the gross calorific value of hard coal. For each measurand, at least 84 % of the results reported by the 44 participants from Polish heat and power plants, research institutes and independent laboratories were satisfactory. Several Polish and/or international standard methods were successfully applied. However, most of the participants reported overestimated measurement uncertainties, which should be carefully reevaluated. Only a few laboratories were identified as having a significant bias. Their analytical performance will be monitored in the next PT rounds to ensure the improvement of the measurement capabilities of the Polish laboratories in the field.