Background

An increasing number of countries are establishing threshold values to monitor and evaluate the content of contaminants in soil. These values are subsequently applied to protect the environment and human health by restricting the reuse of soil and soil-like materials or by classifying them into landfill categories. Compliance control requires reliable and reproducible methods of sampling, sample pre-treatment prior to analysis and analytical measurement itself to produce results valid for legal purposes [1, 2]. The development of validated standards for analytical methods can thus be considered as an important part of the implementation of standards and threshold values into related regulations.

Often, several standards have been developed for one group of analytes, with varying scope because different matrices are considered. Furthermore, analytical standards have been developed in parallel in various countries in the past. Organisations such as the International Organization for Standardization (ISO) and particularly the European Committee for Standardization (CEN) therefore endeavour to develop harmonised standards that are accepted on a broader European or international level. The use of harmonised standards in the investigation of environmental compatibility of materials enables stakeholders to demonstrate compliance with relevant EU legislation. Additionally, so-called horizontal methods are preferred in Europe, meaning that one standard can be used for a range of different matrices, which is beneficial from an economic point of view.

In the framework of the project HORIZONTAL, financed by the European Commission under the 6th Framework Program, such standards were developed to support EU regulations such as the revised version of the Sewage Sludge Directive (91/692/EEC), the Landfill Directive (1999/31/EC), the EU Waste Framework Directive (2008/98/EC, including biological treatment of biodegradable waste) and the initiative on a European framework for soil protection. The project therefore focused on the matrices sludge, biowaste and soil [3]. It included the determination of several hygienic, inorganic and organic parameters in these areas of application. The group of strongly adsorbing non-volatile organic compounds includes the hydrophobic organic contaminants considered in this validation study: polyaromatic hydrocarbons (PAH), polychlorinated biphenyls (PCB), polychlorinated dibenzo dioxins (PCDD), polychlorinated dibenzo furans (PCDF) and dioxin-like polychlorinated biphenyls (dl-PCB) [4].

All these compounds are well-studied and long-known contaminants that are still relevant today because they are ubiquitously present and may accumulate in soils and soil-like materials [5,6,7], which can serve as a secondary source of pollution [8].

Polyaromatic hydrocarbons are formed during the incomplete combustion of organic materials and are widely distributed by atmospheric deposition [9]. Another important source of pollution is the use of petroleum products [10]. PAH comprise a large group of several hundred compounds, some of which are carcinogenic, toxic and mutagenic. Often 16 PAH chosen as indicator substances by the Environmental Protection Agency of the United States (16 EPA PAH) are analysed, but the priority PAH may vary depending on the legal context [10]. Although there is a need to update the key substances [11], the 16 EPA priority list is still widely employed [12]. For threshold values, both the concentration of single PAH and summarised concentrations are used, depending on the matrix [13,14,15].

PCB originate exclusively from anthropogenic sources [5]. The technical conditions during the production processes determine the composition of the PCB mixture generated [16]. The group of PCB consists of 209 congeners, numbered by Ballschmiter in accordance with the substituent positions of the chlorine atoms [17]. Because of the ubiquitous occurrence of PCB, a reasonable approach for determining the total PCB content is to quantify six marker PCB (PCB 28, PCB 52, PCB 101, PCB 138, PCB 153 and PCB 180) and multiply their sum by 5 [18, 19]. Many European governmental and regulatory bodies additionally require the determination of PCB 118 as an indicator for the presence of dioxin-like PCB [19].
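
The marker-congener estimate described above can be sketched in a few lines. This is only an illustration: the function name, the dictionary layout and the example concentrations are hypothetical and do not come from the standards.

```python
# Six marker congeners named in the text; PCB 118 is reported separately
# and is not part of the "times 5" estimate.
MARKER_PCB = ("PCB 28", "PCB 52", "PCB 101", "PCB 138", "PCB 153", "PCB 180")


def estimate_total_pcb(conc_mg_per_kg, factor=5.0):
    """Estimate total PCB as factor * sum of the six marker congeners.

    conc_mg_per_kg: dict mapping congener name to concentration (mg/kg).
    """
    missing = [name for name in MARKER_PCB if name not in conc_mg_per_kg]
    if missing:
        raise ValueError(f"missing marker congeners: {missing}")
    return factor * sum(conc_mg_per_kg[name] for name in MARKER_PCB)


# Hypothetical concentrations, for illustration only.
example = {
    "PCB 28": 0.01, "PCB 52": 0.02, "PCB 101": 0.03,
    "PCB 138": 0.05, "PCB 153": 0.06, "PCB 180": 0.03,
    "PCB 118": 0.02,  # ignored by the estimate
}
total_pcb = estimate_total_pcb(example)
```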

PCB have been banned worldwide from production and use by the Stockholm Convention on persistent organic pollutants [20]. PCB are toxic, carcinogenic and teratogenic [21]. In the past, PCB mixtures were used in several applications, e.g. as hydraulic fluids and dielectric fluids in transformers and as additives in paints, paper, sealants and plastics [22]. Today, PCB are formed unintentionally in combustion processes, released from PCB-containing sealants and paints still present in buildings or mobilised from contaminated sites [23].

PCDD/F are unintended and undesired by-products, mainly of combustion processes in waste incineration, power plants and other industrial activities such as metallurgical processes. Emissions from industrial processes have decreased due to better flue gas treatment technologies, which has increased the relevance of other, unregulated sources such as wood burning [24]. PCDD/F cause adverse health effects including immune suppression, carcinogenicity, tumour promotion and disruption of the endocrine system in laboratory animals and cell cultures [25]. Toxic equivalency factors (TEFs) using 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) as a reference have been introduced to account for the large differences in toxicity among PCDD/F and dioxin-like PCB [6].
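
A toxicity equivalent (TEQ) is the TEF-weighted sum of the congener concentrations. The sketch below shows the arithmetic for a handful of congeners. It assumes the WHO 2005 TEF values; the text does not state which TEF scheme was applied in the trial, and the example concentrations are hypothetical.

```python
# Assumption: WHO 2005 TEFs for a few congeners only (not the full list).
# The study itself may have used a different TEF scheme.
TEF_WHO_2005 = {
    "2,3,7,8-TCDD": 1.0,
    "1,2,3,7,8-PeCDD": 1.0,
    "2,3,7,8-TCDF": 0.1,
    "OCDD": 0.0003,
    "PCB 126": 0.1,
}


def toxic_equivalent(conc_ng_per_kg, tef=TEF_WHO_2005):
    """Return the TEQ (ng TEQ/kg) as sum(concentration * TEF)."""
    unknown = [name for name in conc_ng_per_kg if name not in tef]
    if unknown:
        raise ValueError(f"no TEF available for: {unknown}")
    return sum(conc * tef[name] for name, conc in conc_ng_per_kg.items())


# Hypothetical congener concentrations in ng/kg, for illustration only.
sample = {"2,3,7,8-TCDD": 2.0, "2,3,7,8-TCDF": 10.0, "OCDD": 1000.0}
teq = toxic_equivalent(sample)
```

The example illustrates why the weighting matters: the congener with by far the highest concentration (OCDD) contributes least to the TEQ.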

The aim of this work is to provide performance characteristics for three analytical standards for the determination of PAH, PCB and PCDD/F developed in the framework of the European project HORIZONTAL as the basis of their final validation. The data were obtained in an international round robin trial organised by the Federal Institute for Materials Research and Testing (BAM). The standards allow the application of various extraction procedures prior to analysis. The influence of these options on the results was evaluated to assess their equivalence.

Materials and methods

Processing of test materials

The adsorption and binding of contaminants is a crucial issue that might differ between samples spiked in a laboratory and naturally aged samples; the latter should therefore be preferred as test materials [26]. However, blending highly contaminated with uncontaminated material, or even spiking samples, was unavoidable because suitable starting materials were lacking, especially in the case of treated biowaste. Furthermore, test samples with either very high or very low concentration levels were avoided so that the performance characteristics of the methods would not be unnecessarily influenced.

Sludge contaminated with PAH originated from a waste-water treatment plant near Berlin, Germany (SL PAH), whereas sludge contaminated with PCB and PCDD/F (SL DIO) was sampled from two former disposal sites in North Rhine-Westphalia, Germany. All sludge samples were freeze-dried and ground using a cross hammer mill (Retsch, Haan, Germany), resulting in a grain size < 500 µm to facilitate homogenisation. Because of its high PCB content (7.8 mg/kg), the PCB-containing sludge was diluted with sludge from the waste-water treatment plant to obtain SL PCB.

Treated biowaste with a grain size < 15 mm was delivered from a composting plant near Berlin, Germany. The air-dried biowaste, which had a residual water content of 5%, was milled using a cross hammer mill, resulting in a grain size < 250 µm. This material was directly used to produce the test material containing PAH (CO PAH). To spike with PCB, this biowaste was mixed with the sludge containing PCB, sieved to a grain size < 250 µm, homogenised for 2 h using a gyro-wheel mixer and equilibrated for 6 weeks in a cooling chamber at 4 °C (CO PCB). Dioxin-containing biowaste was obtained by blending and equilibrating dioxin-containing soil with the biowaste. Additionally, biowaste spiked with some dioxins was added as a third component to change the contamination pattern in comparison with the soil sample, leading to CO DIO.

The basis for all soil samples was an uncontaminated reference soil (RefeSol 01-A, a dystric cambisol) distributed by Fraunhofer IME (Schmallenberg, Germany) [27]. One aliquot of this soil was blended (diluted) with another soil of similar texture containing PAH, sampled at a plant for impregnating railway sleepers in Brandenburg, Germany, and sieved to < 2 mm (SO PAH). Another part of this reference soil was blended with the 63-µm fraction of PCB-contaminated soil material excavated during a remedial action to obtain soil SO PCB. A PCDD/F-containing soil was sieved to < 63 µm and blended with the same reference soil and the aforementioned sludge (SO DIO).

In total, nine test materials were processed as described above. All of them were homogenised on a gyro-wheel mixer and divided into representative subsamples of 100 g (PAH and PCB) or 50 g (PCDD/F) using a cross-riffling scheme [28, 29]. Depending on the amount of starting material, 50–80 subsamples of each test material were obtained.

Content analysis and homogeneity check

Eight subsamples of each test material were randomly chosen to determine the solid matter content and to serve as the basis for the homogeneity check. This number fulfils the ASTM E826-14 [30] criterion of a minimum of 8% of subsamples. From every subsample, two replicates were analysed in accordance with the corresponding standards: PCB [18] and dioxins [31]. Analyses of dioxin content were performed by a commercial analytical laboratory as subcontractor. Diverging from the standards, Soxhlet extraction was carried out with n-hexane before dioxin analysis. PAH content was analysed following a method published by LANUV NRW (state office for nature and environment North Rhine-Westphalia, Germany) [32]. Methodological details are given in Additional file 1. Homogeneity was evaluated using an F test in accordance with DIN 53803-1 [33]. All processed materials passed the homogeneity tests and could be used for the round robin test.
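
The idea behind such a homogeneity check can be sketched as a generic one-way analysis of variance on the duplicate results: the variation between subsamples is compared with the variation between replicates of the same subsample. The sketch below is not a reproduction of DIN 53803-1, whose details are not given here; the function name and the data are hypothetical, and the critical value has to be taken from an F table for the returned degrees of freedom.

```python
from statistics import mean


def homogeneity_f(replicates):
    """One-way ANOVA F ratio for a balanced design.

    replicates: list of per-subsample lists, each holding the same
    number of replicate results (e.g. 8 subsamples x 2 replicates).
    Returns (F, df_between, df_within).
    """
    k = len(replicates)
    n = len(replicates[0])
    if k < 2 or n < 2 or any(len(r) != n for r in replicates):
        raise ValueError("need a balanced design with k >= 2 and n >= 2")
    sub_means = [mean(r) for r in replicates]
    grand_mean = mean(sub_means)
    # Mean square between subsamples
    ms_between = n * sum((m - grand_mean) ** 2 for m in sub_means) / (k - 1)
    # Mean square within subsamples (between replicates)
    ms_within = sum(
        (x - m) ** 2 for r, m in zip(replicates, sub_means) for x in r
    ) / (k * (n - 1))
    return ms_between / ms_within, k - 1, k * (n - 1)


# Hypothetical duplicate results for three subsamples (mg/kg).
f_ratio, df1, df2 = homogeneity_f([[10.0, 12.0], [11.0, 13.0], [12.0, 14.0]])
# The material is regarded as homogeneous if f_ratio does not exceed the
# tabulated critical F value for (df1, df2) at the chosen significance level.
```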

Round robin trial

The final validation step during the standardisation of an analytical method is the evaluation of performance through an interlaboratory ring test to demonstrate that the method is fit for purpose and to determine the performance characteristics of the method (i.e. mainly standard deviations of repeatability and reproducibility for relevant analytes).

Interlaboratory comparisons were conducted based on the given draft standard methods: FprCEN/TS 16181 for PAH [34], EN 16167 for PCB [18] and CEN/TS 16190 for PCDD/F and dl-PCB [31].

The participants in the interlaboratory trial were requested to strictly follow the given standards. This is important because, in contrast to round robin tests for proficiency testing, validation trials aim at evaluating the performance of a procedure and not the analytical abilities of the participating laboratories. Unfortunately, laboratories frequently tend to deviate from the specified procedure to rank their own method. Furthermore, there are several options within the draft standards, e.g. for extraction, clean-up and analytical method, which influence the final measurement result of individual laboratories. This aspect was evaluated additionally.

The validation trial was announced internationally via standardisation bodies and other networks dealing with environmental analysis. The aim was to acquire a sufficient number of participating laboratories, to ensure a large number of individual test results for the evaluation even after outlier rejection, and to have both analytical methods represented for the determination of PAH (HPLC and GC-MS) and PCB (GC-MS and GC-ECD).

Applications were received from 34 participating laboratories from 11 European and 2 Asian countries, whereby the number of participants varied depending on the analyte and the matrix (17 PAH GC-MS, 9 PAH HPLC, 17 PCB GC-MS, 9 PCB GC-ECD, 12 PCDD/PCDF/dl-PCB). All participants received one sample of sludge, biowaste and soil for each of the analytes they had registered for. As a quality control measure, every set of samples was accompanied by control samples for each analyte group produced from commercially available standard solutions by BAM. The control samples were stored in CERTAN® vials (Promochem, Wesel, Germany) to minimise evaporation. Details on preparation of the control samples can be found in Additional file 1.

Data evaluation

The participants reported their results to BAM using prepared reporting sheets, which could be read and processed directly by commercially available software [35]. Standard deviations and variation coefficients of repeatability and reproducibility were calculated for each analyte and matrix.
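
For a balanced design, the repeatability and reproducibility statistics follow from the within-laboratory and between-laboratory variances as defined in ISO 5725-2. The following is a minimal sketch under that assumption (equal numbers of replicates per laboratory, outliers already removed); it is not the evaluation software used in the study, and the function name and data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def precision_iso5725(lab_results):
    """Repeatability and reproducibility for a balanced design.

    lab_results: list of per-laboratory replicate lists (same length n).
    Returns mean, s_r, s_R and the variation coefficients in percent.
    """
    n = len(lab_results[0])
    if len(lab_results) < 2 or n < 2 or any(len(r) != n for r in lab_results):
        raise ValueError("need a balanced design with >= 2 labs and n >= 2")
    lab_means = [mean(r) for r in lab_results]
    grand_mean = mean(lab_means)
    # Repeatability variance: pooled within-laboratory variance
    s_r2 = mean([variance(r) for r in lab_results])
    # Between-laboratory mean square
    s_d2 = n * variance(lab_means)
    # Between-laboratory variance component, set to zero if negative
    s_l2 = max((s_d2 - s_r2) / n, 0.0)
    # Reproducibility variance
    s_big_r2 = s_r2 + s_l2
    return {
        "mean": grand_mean,
        "s_r": sqrt(s_r2),
        "s_R": sqrt(s_big_r2),
        "CV_r": 100.0 * sqrt(s_r2) / grand_mean,
        "CV_R": 100.0 * sqrt(s_big_r2) / grand_mean,
    }


# Hypothetical duplicate results from three laboratories (mg/kg).
stats = precision_iso5725([[9.0, 11.0], [19.0, 21.0], [29.0, 31.0]])
```

Because the reproducibility variance contains the repeatability variance as one of its two components, CV_R can never be smaller than CV_r in this scheme.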

Values below the limit of quantification (LOQ) were not considered for further evaluation, because this would have resulted in a distortion of the evaluations. Thus, for some analytes, the number of results was actually lower. PAH analysis using HPLC with fluorescence detection does not allow the detection of one of the 16 EPA PAH (acenaphthylene). Thus, a sum parameter of 15 EPA PAH was calculated for laboratories using this method. This compound was also excluded for the calculation of the sum parameter for GC-MS measurements, enabling the comparison of the two methods.
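
The two data-handling rules above, discarding results below the LOQ and forming a sum parameter without acenaphthylene, can be sketched as follows. The helper names and the example numbers are hypothetical; how individual laboratories reported values below the LOQ is not specified here.

```python
# The 16 EPA priority PAH; acenaphthylene is left out of the sum because
# it cannot be determined by HPLC with fluorescence detection.
EPA_PAH_16 = (
    "naphthalene", "acenaphthylene", "acenaphthene", "fluorene",
    "phenanthrene", "anthracene", "fluoranthene", "pyrene",
    "benzo[a]anthracene", "chrysene", "benzo[b]fluoranthene",
    "benzo[k]fluoranthene", "benzo[a]pyrene", "indeno[1,2,3-cd]pyrene",
    "dibenzo[a,h]anthracene", "benzo[ghi]perylene",
)


def sum_15_epa_pah(conc_mg_per_kg):
    """Sum of the EPA PAH without acenaphthylene (same for HPLC and GC-MS)."""
    return sum(
        conc_mg_per_kg[name] for name in EPA_PAH_16 if name != "acenaphthylene"
    )


def drop_below_loq(values, loq):
    """Keep only results at or above the limit of quantification."""
    return [v for v in values if v >= loq]


# Hypothetical data, for illustration only.
conc = {name: 1.0 for name in EPA_PAH_16}
conc["acenaphthylene"] = 100.0  # present in the data but excluded from the sum
pah_sum = sum_15_epa_pah(conc)
kept = drop_below_loq([0.01, 0.2, 0.5], loq=0.05)
```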

The results of the quality control solution were not used as a criterion for exclusion of participants, but to evaluate the contribution of the analysis to the variance of the statistical results.

Sample preparation methods that are allowed optionally in the draft standard for PAH and PCB were grouped in accordance with Table 1 to facilitate the evaluation of the influence of the chosen test options by the laboratories.

Table 1 Grouping of sample preparation methods for analysis of PAH and PCB according to extraction method and polarity of solvent

The performance characteristics were calculated using the software PROLab Plus (QuoData GmbH, Dresden, Germany) [35] in accordance with DIN ISO 5725-2 [36]. The following outliers were removed from the data set: type A, a deviating single value within the replicates of a certain laboratory (Grubbs test); type B, a deviating laboratory mean (Grubbs test); and type C, an excessive standard deviation (poor repeatability) of a single laboratory (Cochran test).
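
The test statistics behind these outlier types are simple to compute. The sketch below shows a two-sided form of the Grubbs statistic for a single suspect value and the Cochran statistic for the largest within-laboratory variance. It is an illustration, not the PROLab Plus implementation: ISO 5725-2 applies the Grubbs test separately to the highest and the lowest value, and the critical values must be taken from the tables in the standard. Function names and data are hypothetical.

```python
from statistics import mean, stdev, variance


def grubbs_statistic(values):
    """Largest absolute deviation from the mean in units of the
    sample standard deviation (two-sided form of the Grubbs statistic)."""
    m = mean(values)
    s = stdev(values)
    return max(abs(v - m) for v in values) / s


def cochran_statistic(lab_results):
    """Largest within-laboratory variance divided by the sum of all
    within-laboratory variances."""
    variances = [variance(r) for r in lab_results]
    return max(variances) / sum(variances)


# Hypothetical laboratory means: the last value looks suspicious.
g = grubbs_statistic([1.0, 2.0, 3.0, 4.0, 10.0])

# Hypothetical duplicates: the third laboratory has poor repeatability.
c = cochran_statistic([[1.0, 3.0], [2.0, 2.0], [0.0, 4.0]])

# A value or laboratory is flagged when the statistic exceeds the
# tabulated critical value for the given number of laboratories/replicates.
```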

Results and discussion

Mandel’s statistic and outliers

Graphic plots of Mandel’s statistics, which reveal differences between laboratories [37], are shown in Additional file 1. A comparison of the diagrams for reference solutions and extracts of solid samples does not indicate that certain laboratories consistently show high relative deviations from the mean value (h value) or insufficient precision compared with the standard deviation among all the laboratories (k value). Thus, the results of the reference solution were not used to exclude laboratories from further data evaluation. The evaluation shows that few outliers appear for most analytes and matrices, but for individual compounds such as acenaphthylene and dibenzo[ah]anthracene in the SOPAH sample and several PCDD/F in the RSDIO and SODIO samples, the number of valid results was clearly lower. However, the minimum number of participating laboratories required for a successful validation [38] was achieved.
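
Mandel's h compares each laboratory mean with the overall mean, and Mandel's k compares each within-laboratory standard deviation with the pooled one. The following sketch assumes a balanced design, in which the mean of the laboratory means equals the overall mean; the function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mandel_h_k(lab_results):
    """Return (h, k) lists for a balanced set of laboratory results.

    h_i = (lab mean - grand mean) / standard deviation of lab means
    k_i = lab standard deviation / pooled within-lab standard deviation
    """
    lab_means = [mean(r) for r in lab_results]
    lab_sds = [stdev(r) for r in lab_results]
    grand_mean = mean(lab_means)
    sd_of_means = stdev(lab_means)
    pooled_sd = sqrt(mean([s * s for s in lab_sds]))
    h = [(m - grand_mean) / sd_of_means for m in lab_means]
    k = [s / pooled_sd for s in lab_sds]
    return h, k


# Hypothetical duplicate results from three laboratories.
h_values, k_values = mandel_h_k([[9.0, 11.0], [19.0, 21.0], [29.0, 31.0]])
```

Large positive or negative h values point to a laboratory bias, whereas k values well above 1 point to poor within-laboratory precision.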

Repeatability and reproducibility

A summary of the sum parameters of the substance groups is shown in Table 2. Detailed results for individual compounds of the interlaboratory comparisons are given as tables in Additional file 1 (reference solutions: Tables S2, S4 and S6, matrix samples: Tables S3, S5 and S7) or in the final report [39].

Table 2 Performance characteristics for the analysis of PAH following FprCEN/TS 16181 (sum parameters), of PCB following EN 16167 (sum parameters) and of dioxins, furans and dl-PCB following CEN/TS 16190 (sum parameters)

It must be considered that the uncertainties of both sample preparation and the chemical analysis itself contribute to the overall variation. The size of this contribution can be estimated by comparing the results of the quality control solutions with those of the samples themselves. The variation coefficient for PCB and some of the PCDD/F and dl-PCB is in the same order of magnitude for all samples, independent of the matrix, while for PAH the variation is usually higher for samples containing a larger amount of organic matter.

The variation coefficients of repeatability CVr of the sum parameters range between 3.1 and 7.6% for all parameters and show no trend considering the substance groups or matrices (Table 2). The CVr for the individual substances are listed in Additional file 1: Tables S3, S5 and S7.

The concentrations of the individual substances varied by several orders of magnitude. Usually the variation coefficients for substances with low concentrations are higher [40]. Thus, their contribution to the sum parameter may be small and the sum parameter may exhibit a better variation coefficient than the individual substances, as was already observed in other studies, e.g. [41].

The two reference solutions for PAH measurement (Table 2) show that the variation coefficient for repeatability is smaller for the HPLC method (2.9%) than for the measurement using GC-MS (4.01%). The differences become even more visible if individual PAH are evaluated (Additional file 1: Table S3): anthracene (1.52% HPLC and 7.31% GC-MS) and benzo[a]anthracene (1.73% HPLC and 6.78% GC-MS) are the most prominent examples. The same diagnostic ions are always used to analyse anthracene and phenanthrene, as well as benzo[a]anthracene and chrysene, and chromatographic separation depends mainly on the length of the chromatographic column. Shorter columns hamper reliable peak integration. PAH with higher molecular weight have broader peaks due to the longer retention time and have a lower response factor in mass spectrometry, resulting in higher variance in results. Incomplete separation between dibenzo[ah]anthracene and indeno[123-cd]pyrene further increases the observed variance. This general chromatographic phenomenon also influences the performance for samples containing matrix. Thus, it can be concluded that, at least for samples in the concentration range used in the validation trial, the HPLC method seems to be more robust. Gratz et al. [42] found that GC-MS methods are more robust for the analysis of PAH, but because the concentration level was different (approx. 100-fold higher than in this study), their findings are not directly comparable with this study.

The CVr of the PCB measurements is slightly higher (2.49–8.62%) and does not differ much between individual PCB congeners. The CVr for PCDD/F and dl-PCB (6.42–19.79%) is higher than for PAH and PCB but remains below 10% with the exception of 1,2,3,7,8,9-HxCDF (19.79%) and 2,3,7,8-TCDF (10.05%). This is probably caused by the high chromatographic requirements for this group of substances and the development of specific chromatographic phases [43]. The CVr of the matrix samples is higher, but still less than 10% for three-quarters of the samples. Despite the higher variation for PCDD/F and dl-PCB, the results are satisfactory, taking into account that the clean-up for the analysis of PCDD/F and dl-PCB is challenging [44].

The variation coefficient of reproducibility CVR for the three substance groups and three matrices shows larger differences between the substance groups than between the matrices (sewage sludge, treated biowaste and soil). The greatest CVR was found for the analysis of 15 EPA PAH, which was between 26 and 35%, depending on the matrix. For the sum of the 7 key PCB, the CVR ranged between 19 and 23%. The lowest CVR (7 to 17%) were obtained for the calculated toxicity equivalents based on selected PCDD/F and dl-PCB.

The higher value for CVR for the analysis of PAH in sludge meets the expectations, as this matrix requires a laborious clean-up procedure. The same chromatographic issues as discussed above concerning repeatability lead to higher values for individual PAH. Additionally, the separation between benzo[b]fluoranthene and benzo[k]fluoranthene is often insufficient and the sum of both analytes is used. Benzo[j]fluoranthene, which is not part of the EPA PAH, elutes between the two analytes and can lead to higher concentrations than expected if it is not sufficiently excluded from the sum of the two benzofluoranthenes [45]. Reproducibility is, in general, concentration-dependent and is higher for lower analyte content. For an analyte content < 0.05 mg/kg, reproducibility is > 50%. Especially PAH with low molecular weight are affected, due to higher volatility.

In the case of PCB, sludge also shows the highest CVR (34.92%), while the CVR is lower for treated biowaste (25.26%) and soil (22.68%). In contrast to the PAH, the values for individual congeners vary less, which is probably caused by a more uniform concentration pattern and a better chromatographic separation of individual compounds. PCDD/F and dl-PCB show similar values for the reference solution and the samples with matrix, except for individual extreme values. The concentrations of the individual compounds cover a wide range (1–106 ng/kg), and their reproducibility shows a clear dependence on the concentration [40]. The high values for 2,3,7,8-TCDD, 1,2,3,7,8-PeCDD and 1,2,3,4,7,8-HxCDD in particular can be explained by this relation. Co-elution of 1,2,3,7,8,9-HxCDF and PCB may cause higher values for these compounds, but the wide variety of chromatographic columns used by participants means that general conclusions cannot be drawn [43].

Overall, the CVR were in the usual range for the analysis of organic substances (25–50%) [41, 46,47,48]. The higher variation for PAH and PCB might be attributed to the different analytical methods admitted in each case. This factor probably increased the variance of the measured values.

Repeatability (within-lab variation) CVr is always better (lower values) than reproducibility (between-lab variation) CVR. Typically, reproducibility ranges between 2 and 4 times repeatability, as found in other studies, e.g. [49].

Overall, the results confirm that the procedures described in the draft standards are comparably applicable for all three matrices and thereby confirm the feasibility of the HORIZONTAL approach.

Comparison of analytical methods and sample pre-treatment procedures for PAH and PCB

Results of the reference solution for PAH cannot be evaluated for this purpose, as two reference solutions with different solvents were used. The results for the different matrices shown in Figs. 1 and 2 are normalised to one of the analytical methods (PAH to HPLC and PCB to GC-MS) and indicate that the concentration determined using GC-MS is usually higher than that determined using HPLC (PAH) or GC-ECD (PCB). However, it must be considered that extraction and clean-up have a significant influence on the results. For both groups of analytes, four extraction and four (PAH) or eleven (PCB) clean-up methods are included in the standard. This large number of possible treatments results in a data set that cannot be statistically evaluated, as not only the measurement method but also the sample preparation may vary between laboratories.

Fig. 1

Comparison of analytical methods for determining PAH in different matrices. a Soil, b treated biowaste, c sludge; HPLC—black columns, GC-MS—grey columns

Fig. 2

Comparison of analytical methods for determining PCB in different matrices. a Soil, b treated biowaste, c sludge; GC-ECD—black columns, GC-MS—grey columns

Figure 3 shows data for selected analytes in more detail; for results of all analytes, refer to the report [39]. The examples of fluoranthene, benzo[a]pyrene, sum PCB and PCB 180 show no trend for the sample preparation method. For the analytical method, a slight trend towards higher concentration values for GC-MS analysis is observed for benzo[a]pyrene and PCB 180, while the analytical method does not influence the results for the other two examples. Aside from the trend to higher concentrations when GC-MS is used, there is no statistical justification to prefer a particular analytical method.

Fig. 3

Selected examples for the results of the validation trials. Selected substance/matrix combination: a fluoranthene/COPAH, b benzo[a]pyrene/SLPAH, c sum PCB/RSPCB and d PCB 180/SOPCB. Individual results of laboratories (L01–L34) for the analysis of substances are displayed with increasing concentration along the x-axis. The analytical methods used are illustrated by white bars for HPLC or GC-ECD and grey bars for GC-MS. Outliers are highlighted by red frames. The error bars represent the within-lab standard deviation. Different extraction techniques and solvents are indicated by numbers: 1 agitation mixture polar/non-polar, 2 agitation non-polar, 3 pressurised liquid extraction non-polar, 5 soxhlet non-polar, 7 ultrasonic bath polar, 8 agitation + ultrasonic bath non-polar

Conclusions

This study considered the analytical standards for environmentally relevant organic contaminants in sludge, treated biowaste and soil that were drafted in the framework of the EU project HORIZONTAL but were not accepted as European Standards because validation data were missing. The aim of providing validation data for all matrices and selected substance groups was achieved for CEN/TS 16190:2012 “Sludge, treated biowaste and soil - determination of dioxins and furans and dioxin-like polychlorinated biphenyls by gas chromatography with high-resolution mass selective detection (HR GC-MS)” [31], FprCEN/TS 16181:2012 “Sludge, treated biowaste and soil - determination of polycyclic aromatic hydrocarbons (PAH) by gas chromatography (GC) and high-performance liquid chromatography (HPLC)” [34] and EN 16167:2012 “Sludge, treated biowaste and soil - determination of polychlorinated biphenyls (PCB) by gas chromatography with mass selective detection (GC-MS) and gas chromatography with electron-capture detection (GC-ECD)” [18, 19].

The specifications for the analysis of the selected organic substance groups comprise instructions on the sample preparation steps (extraction and clean-up of the extract) as well as on the analytical measurement. Several options are available for both the extraction and the clean-up of the extract. Within the validation trial, the number of participants was too low to cover and evaluate all possible combinations of extraction, clean-up and measurement methods.

The results of the interlaboratory comparison confirm that the technical specifications considered in this study are appropriate for the analysis of the differing matrices sludge, treated biowaste and soil. It can be assumed that the results give a realistic impression of the currently feasible performance of the analytical protocols evaluated. This also applies to the substances for which the performance data are less satisfactory.

Outlook regarding the implementation of results in standardisation

The obtained performance data have been accepted by the Technical Committee 444 “Test methods for environmental characterisation of solid matrices” and will be published in the following three standards: EN 16190, EN 16181 and EN 16167. In addition, the Technical Committee is working on harmonising these validated standards for PAH and PCB applicable for sludge, treated biowaste and soil with standards for different types of waste, aiming at reducing the number of very similar standards and enhancing the efficiency of laboratory work. The first result is the publication of prEN 17322 “Environmental Solid Matrices—Determination of polychlorinated biphenyls (PCB) by gas chromatography—mass selective detection (GC-MS) or electron-capture detection (GC-ECD)” [50] applicable for the determination of PCB in sludge, soil, biowaste and waste. A corresponding standard for PAHs is currently under revision and will be published by the end of 2019.

For a subsequent adoption of the EN standards as ISO standards, a project registration will be submitted to ISO TC 190 SC 3 “Chemical and physical characterization”.