
1 Introduction

This compendium comprises a series of articles, among them the overview of the World Health Organization (WHO) and the United Nations Environment Programme (UNEP)-coordinated exposure studies on persistent organic pollutants (POPs) in human milk and their link to the Stockholm Convention (Malisch et al. 2023a); the findings and discussion of results of polychlorinated biphenyls (PCB), polychlorinated dibenzo-p-dioxins (PCDD), and polychlorinated dibenzofurans (PCDF) (Malisch et al. 2023b); and the assessment of time trends derived from countries with repeated participation for PCB and PCDD/PCDF (Malisch et al. 2023c).

The references used for the review on findings of POPs in human milk (Fürst 2023) show a wide range of analytical methods for determination of PCB and PCDD/PCDF. This chapter describes the analytical methods and quality control used for the determination of PCDD/PCDF, dioxin-like polychlorinated biphenyls (DL-PCB), and non-dioxin-like polychlorinated biphenyls (NDL-PCB) in human milk samples obtained from the WHO/UNEP-coordinated exposure studies performed between 2000 and 2019.

2 Materials and Methods

The protocols for the collection of samples, preparation of pooled samples considered to be representative for a country or a subgroup and their submission to dedicated Reference Laboratories (for chlorinated and brominated POPs between 2000 and 2019: CVUA Freiburg, Germany) and the overview on the participating countries with respect to regional distribution and the temporal differentiation of the collected samples are given in Part I (Malisch et al. 2023a).

2.1 Analytical Procedure and Analytes

After freeze-drying of a 100 ml human milk sample, the lipid portion containing the contaminants of interest was extracted for eight hours with ethanol/toluene (70/30, v/v), using a hot extraction device (Twisselmann extractor). After evaporation of the solvent, polar co-extractives in the crude extract were removed by dissolving the residue in tert-butyl methyl ether and re-evaporating to provide a purified fat extract.

A 3 g aliquot of this extract was spiked with 13C12-labelled standards. Tables 12–16 (in the appendix) list the 55 determined native congeners and the 51 13C12-labelled standards that were used, as follows:

  • Table 12: All 17 native and 13C12-labelled PCDD/PCDF with 2,3,7,8-chlorine substitution (therefore with attribution of Toxic Equivalency Factors [TEF], see Sect. 2.2 “Toxic Equivalents [TEQ]”)

  • Table 13: 5 native and 13C12-labelled non-ortho substituted PCB, which includes the 4 dioxin-like congeners with attribution of TEFs

  • Table 14: 17 native and 10 13C12-labelled mono-ortho substituted PCB (including the 8 dioxin-like mono-ortho PCB with attributed TEFs)

  • Table 15: 16 native and 11 13C12-labelled di/tri/tetra-ortho substituted PCB (including the 6 non-dioxin-like PCB), see Sect. 2.3 “Sum parameter for non-dioxin-like PCB (“indicator PCB”)”

  • Table 16: 8 13C12-labelled PCDD/PCDF without 2,3,7,8-substitution, used as recovery standards

To the lipid aliquot, the 13C12-labelled standards listed in Tables 12–15 were added as internal standards for calculation of the recoveries for these congeners. The 13C12-labelled PCDD/PCDF listed in Table 16 (without 2,3,7,8-substitution) were used as recovery standards (also called “injection standards”; for extract reconstitution before GC/HRMS analysis) and added after clean-up before the final determination step.

Gel permeation chromatography on Bio Beads S-X3 was used to remove fat (in four runs with a maximum of 0.75 g fat each; 50 g Bio Beads S-X3; eluent ethyl acetate/cyclohexane [1/1, v/v]). Small amounts of remaining lipid and oxidizable substances were removed using a mixed column loaded with layers of 1 g sulfuric acid (96%) impregnated silica gel and 1 g NaOH-impregnated silica gel (eluent: 20 ml heptane). A Florisil™ column (deactivated with 3% water) was used to separate the PCB (eluted with heptane containing 0.2% of toluene; as the first fraction) from the PCDD/PCDF (eluted with toluene; as the second fraction). The PCDD/PCDF-fraction was purified on a Carbopack B-column (automated version; Carbopack B/Celite mixture; the first fraction eluted with hexane contained potentially interfering substances, the second fraction was a reverse elution of the PCDD/PCDF with toluene) or on a Carbopack C-column (manual version; Carbopack C/Celite mixture; washed with heptane; then eluted with toluene).

The second fraction was evaporated to a final volume of 20 μl after the addition of the recovery standards. Initially only 1,2,3,4-13C12-TCDD was used. Since 2018, other 13C12-labelled PCDD/PCDF without 2,3,7,8-substitution have been used as additional recovery standards (see Table 16); this was the most recent improvement to the method during the 2016–2019 round.

The PCB eluted in the first fraction were further separated into three fractions by elution through a Carbopack B-column, firstly with hexane to yield di/tri/tetra-ortho PCB, then with hexane/toluene (92.5/7.5, v/v) for the mono-ortho PCB and finally reverse eluted with toluene to yield the non-ortho PCB. Since 2016, the separation of the PCB was improved to yield just two fractions, the first of which contained all the ortho-substituted PCB, with non-ortho-substituted PCB in the second fraction. After addition of 13C12-PCB 80 as recovery standard, the fractions were evaporated to a final volume of 60 μl (non-ortho PCB) or 500 μl (ortho PCB).

The measurements for these three groups of analytes (PCDD/PCDF, non-ortho PCB, and ortho PCB) were carried out using HRGC/HRMS (initially on a Fisons Autospec, later on a Thermo Fisher MAT95XP and Thermo Scientific DFS) at a resolution of 10,000, and quantified against a 5-point calibration curve. PCDD/PCDF (PTV; 5 μl injection volume) were separated on a DB5-MS GC column, whereas the PCB (1 μl splitless injection) were separated using one of the following columns: STX-500 (Crossbond® carborane/dimethyl polysiloxane; Restek), HT-8 PCB (8% Phenyl-Polycarboran-Siloxan; SGE/Trajan™), or MXT-500 (Crossbond® carborane/dimethyl polysiloxane; Restek).
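The quantification against 13C12-labelled internal standards can be illustrated with a minimal sketch. The chapter does not spell out the calculation, so the formulas below are an assumption following the usual isotope dilution scheme (as in EPA method 1613): a relative response factor (RRF) is averaged over the calibration levels and then used to convert the native/labelled peak-area ratio of a sample into a concentration. All peak areas, spiked amounts, and function names are hypothetical.

```python
def relative_response_factor(area_native, area_labelled, conc_native, conc_labelled):
    """RRF of one calibration level: (A_native * C_labelled) / (A_labelled * C_native)."""
    return (area_native * conc_labelled) / (area_labelled * conc_native)


def mean_rrf(calibration_points):
    """Average the RRF over all calibration levels (five levels in this sketch)."""
    rrfs = [relative_response_factor(*point) for point in calibration_points]
    return sum(rrfs) / len(rrfs)


def native_concentration(area_native, area_labelled, spiked_labelled_pg, rrf, lipid_g):
    """Concentration of the native congener in pg/g lipid by isotope dilution."""
    return (area_native * spiked_labelled_pg) / (area_labelled * rrf * lipid_g)


# Hypothetical 5-point calibration:
# (area_native, area_labelled, conc_native, conc_labelled)
calibration = [
    (1100.0, 10000.0, 0.1, 1.0),
    (4500.0, 10000.0, 0.5, 1.0),
    (10000.0, 10000.0, 1.0, 1.0),
    (52500.0, 10000.0, 5.0, 1.0),
    (95000.0, 10000.0, 10.0, 1.0),
]
rrf = mean_rrf(calibration)

# Hypothetical sample: 100 pg labelled standard spiked to a 3 g lipid aliquot
concentration = native_concentration(
    area_native=2400.0, area_labelled=8000.0,
    spiked_labelled_pg=100.0, rrf=rrf, lipid_g=3.0,
)
print(f"mean RRF = {rrf:.3f}, concentration = {concentration:.2f} pg/g lipid")
```

Because the labelled standard is added before clean-up, losses during clean-up affect native and labelled congeners alike and cancel in the area ratio; this is the reason for spiking at the start of the procedure.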

2.2 Toxic Equivalents (TEQ)

Toxic Equivalents (TEQ) were calculated as the sum of the products of the concentration of each compound (17 PCDD/PCDF congeners with 2,3,7,8-substitution and 12 dioxin-like PCB congeners) multiplied by the corresponding toxic equivalency factors (TEF), and provided an estimate of the summed 2,3,7,8-TCDD-like activity for both analyte groups.

The TEF values for PCDD, PCDF, and DL-PCB that were initially proposed by the 1997 WHO expert group for calculation of WHO-TEQs (Van den Berg et al. 1998) were used for the results obtained during the first two sampling periods (2000–2003; 2004–2007). These TEFs were re-evaluated at a further WHO expert meeting held in 2005 (Van den Berg et al. 2006) with changes to some values, and these revised values were used for WHO-TEQ calculations for the later sampling periods. In order to facilitate comparison with other human milk surveys (in particular with previous studies, thus allowing conclusions on time trends), the results have been calculated using both sets of TEF values and these are compared in Part III (Malisch et al. 2023b).

Three summarizing parameters can be calculated: “WHO-PCDD/PCDF-TEQ” comprising PCDD/PCDF, “WHO-PCB-TEQ” for dioxin-like PCB, and “Total TEQ” or “WHO2005-TEQ” comprising PCDD/PCDF and dioxin-like PCB.
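The TEQ calculation and the effect of the two TEF sets can be sketched as follows. The TEF values shown are the published WHO 1998 and WHO 2005 factors for a small subset of congeners; the concentrations are hypothetical and the function and variable names are chosen for illustration only. A full calculation would cover all 17 PCDD/PCDF and 12 dioxin-like PCB congeners.

```python
# TEF values for an illustrative subset of congeners
TEF_WHO1998 = {
    "2,3,7,8-TCDD": 1.0,
    "2,3,4,7,8-PeCDF": 0.5,
    "PCB 126": 0.1,
    "PCB 169": 0.01,
    "PCB 118": 0.0001,
}
TEF_WHO2005 = {
    "2,3,7,8-TCDD": 1.0,
    "2,3,4,7,8-PeCDF": 0.3,
    "PCB 126": 0.1,
    "PCB 169": 0.03,
    "PCB 118": 0.00003,
}
PCDD_PCDF = {"2,3,7,8-TCDD", "2,3,4,7,8-PeCDF"}
DL_PCB = {"PCB 126", "PCB 169", "PCB 118"}


def teq(concentrations, tefs, congeners):
    """Sum of concentration * TEF over the selected congeners (pg TEQ/g lipid)."""
    return sum(concentrations[c] * tefs[c] for c in congeners)


# Hypothetical concentrations, all converted to pg/g lipid
sample = {
    "2,3,7,8-TCDD": 1.0,
    "2,3,4,7,8-PeCDF": 4.0,
    "PCB 126": 30.0,
    "PCB 169": 20.0,
    "PCB 118": 10000.0,  # 10 ng/g lipid expressed in pg/g
}

results = {}
for label, tefs in (("WHO1998", TEF_WHO1998), ("WHO2005", TEF_WHO2005)):
    dioxin_teq = teq(sample, tefs, PCDD_PCDF)
    pcb_teq = teq(sample, tefs, DL_PCB)
    results[label] = (dioxin_teq, pcb_teq, dioxin_teq + pcb_teq)
    print(f"{label}: PCDD/PCDF-TEQ={dioxin_teq:.2f}, "
          f"PCB-TEQ={pcb_teq:.2f}, total TEQ={dioxin_teq + pcb_teq:.2f}")
```

In this hypothetical example the same concentrations give a lower total TEQ with the 2005 factors, which illustrates why results need to be calculated with the same TEF set before time trends are compared.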

2.3 Sum Parameter for Non-dioxin-like PCB (“Indicator PCB”)

Concentrations of NDL-PCB are expressed as the sum of six NDL-PCB (“indicator PCB”; ΣPCB6) including the congeners PCB 28, PCB 52, PCB 101, PCB 138, PCB 153, and PCB 180 (UNEP 2019).
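As a minimal sketch, the sum parameter can be computed as shown below. The congener list is taken from the text; the concentrations and the function name are hypothetical. Congeners outside the indicator set are ignored, and a missing indicator congener is treated as an error rather than silently skipped.

```python
INDICATOR_PCB = ("PCB 28", "PCB 52", "PCB 101", "PCB 138", "PCB 153", "PCB 180")


def sum_pcb6(concentrations):
    """Sum of the six indicator PCB in ng/g lipid."""
    missing = [c for c in INDICATOR_PCB if c not in concentrations]
    if missing:
        raise ValueError(f"missing indicator congeners: {missing}")
    return sum(concentrations[c] for c in INDICATOR_PCB)


# Hypothetical results in ng/g lipid; PCB 47 is not an indicator PCB
sample = {
    "PCB 28": 2.0, "PCB 52": 0.5, "PCB 101": 0.5,
    "PCB 138": 20.0, "PCB 153": 30.0, "PCB 180": 12.0,
    "PCB 47": 1.5,
}
total = sum_pcb6(sample)
print(f"Sum of six indicator PCB = {total:.1f} ng/g lipid")
```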

3 Development of Regulations and Standards for Analytical Criteria over Time

The determination of PCDD/PCDF and PCB in the human milk samples has followed the analytical criteria for these analytes in food and feed as specified in EU legislation since 2002 (European Commission 2002a), and for the target error as required by the Global Monitoring Plan (GMP) (UNEP 2013; UNEP 2019).

3.1 Historical Background (2000–2001)

In the absence of internationally harmonized analytical criteria for the determination of PCDD/PCDF and PCB, a comprehensive quality control programme was initiated at the start of the third round of WHO/UNEP-coordinated exposure studies in 2000 (Malisch and van Leeuwen 2002). In 2001, general acceptance criteria for PCDD/PCDF analyses in feed and food samples for the control of maximum levels (whose introduction in the EU was discussed at that time) were developed as a contribution to the Joint FAO/WHO Expert Committee on Food Additives (JECFA) evaluation of PCDD/PCDF and dioxin-like PCB (Canady et al. 2002). These included quality criteria for methods applying GC/MS determination (Malisch et al. 2001) as well as bioassays (Behnisch et al. 2001) and became the basis of the EU regulations of 2002 for the control of food (European Commission 2002a) and feed (European Commission 2002b).

3.2 Limits of Quantification

The limit of detection (LOD) and/or limit of quantification (LOQ) are important parameters for the evaluation of the reliability of analytical results. For the determination of PCDD/PCDF and PCB in food, the European Commission has provided a definition and specific requirements for the LOQ (European Commission 2004), as follows:

The accepted specific LOQ of an individual congener is the concentration of an analyte in the extract of a sample which produces an instrumental response at two different ions, to be monitored with an S/N (signal/noise) ratio of 3:1 for the less sensitive signal and fulfillment of the basic requirements such as, e.g., retention time, isotope ratio according to the determination procedures as described in EPA method 1613 revision B.

This definition of the LOQ for the individual congeners was slightly modified in 2012 (European Commission 2012a; European Commission 2012b) and is one of the two pillars of the presently valid analytical criteria in the EU legislation for determination of PCDD/PCDF and dioxin-like PCB in food (European Commission 2017a) and feed (European Commission 2017b). The regulations now also allow the use of the lowest point on the calibration curve under defined conditions as an alternative method for determining the LOQ. These regulations refer to the “Guidance Document on the Estimation of LOD and LOQ for Measurements in the Field of Contaminants in Feed and Food” (EU Reference Laboratories for contaminants 2016), which provides practical advice for laboratories carrying out these determinations.

In the human milk studies described here, these conditions for the estimation of LOQ were followed for determination of all 17 PCDD/PCDF congeners with 2,3,7,8-substitution, as well as the twelve dioxin-like PCB congeners which are collectively required for the calculation of WHO-TEQ.
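The signal-based LOQ definition quoted above can be read as a per-congener acceptance check, sketched below. The S/N threshold of 3:1 for the less sensitive of the two monitored ions comes from the quoted definition. The ±15% tolerance on the isotope ratio is an assumption taken from the EPA method 1613 conventions that the definition refers to, and the function name, parameter names, and example values are hypothetical.

```python
def meets_loq_criteria(sn_ion_1, sn_ion_2, measured_ratio, theoretical_ratio,
                       retention_time_ok, min_sn=3.0, ratio_tolerance=0.15):
    """Return True if a congener signal fulfils the signal-based LOQ criteria.

    min_sn: required S/N for the less sensitive of the two monitored ions.
    ratio_tolerance: allowed relative deviation of the isotope ratio
                     (assumed value, not stated in the chapter).
    """
    less_sensitive_sn = min(sn_ion_1, sn_ion_2)
    sn_ok = less_sensitive_sn >= min_sn
    ratio_ok = (abs(measured_ratio - theoretical_ratio)
                <= ratio_tolerance * theoretical_ratio)
    return sn_ok and ratio_ok and retention_time_ok


# Hypothetical signals for a tetrachlorinated congener (theoretical ratio 0.77)
accepted = meets_loq_criteria(5.0, 3.2, 0.80, 0.77, retention_time_ok=True)
too_noisy = meets_loq_criteria(5.0, 2.5, 0.80, 0.77, retention_time_ok=True)
bad_ratio = meets_loq_criteria(5.0, 3.2, 0.95, 0.77, retention_time_ok=True)
print(accepted, too_noisy, bad_ratio)
```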

3.3 Upper-bound and Lower-bound Results for WHO-TEQ and ∑PCB6 and Acceptable Differences

For calculation of the WHO-TEQ value, the results of each of the relevant congeners are multiplied by the specific TEF and then summed. In most cases, it is normal for the concentrations of a few congeners to fall below the LOQ. However, the interpretation of the results may be affected if many congeners are not quantifiable, in particular, those congeners with higher TEF values.

In order to facilitate the interpretation of the data, different imputation approaches for handling “non-detects” (more exactly: not quantified congeners) were tested using the limit of detection, among them: (i) calculation of the contribution of each non-detected congener to the TEQ as zero (lower-bound concentrations); (ii) calculation of the contribution of each non-detected congener to the TEQ as the limit of detection (upper-bound concentrations); (iii) calculation of the contribution of each non-detected congener to the TEQ as half of the limit of detection (Hoogerbrugge and Liem 2000). Later, these proposed definitions of lower- and upper-bound concentrations were used as pillars of the analytical criteria, but they were based on the LOQ rather than the detection limit.

These distinctions (upper and lower bounds) have important implications for the interpretation of the analytical results. If the contribution of non-detected congeners to the TEQ is calculated as “0”, the resulting lower-bound TEQ concentrations could be interpreted as the detection of low levels of PCDD/PCDF and dioxin-like PCB in samples, whereas in effect the low TEQ content would really be the result of inadequate (not low enough) limits of quantitation. Particularly in the following cases, the measured analyte concentrations could lie near or below the limit of quantification: (i) low concentration ranges, (ii) use of mass spectrometers with insufficient sensitivity (low resolution mass spectrometers), (iii) limited sample amount available for analysis, in particular for biological samples with low lipid content, (iv) low sample aliquot weight (for quick and easy analyses). It was therefore proposed that for food analysis, the difference between upper- and lower-bound TEQ should be in the range of 10 to 20% in defined cases (Malisch et al. 2001).

These harmonized quality criteria were included in the safety evaluation of PCDD/PCDF and dioxin-like PCB by the Joint FAO/WHO Expert Committee on Food Additives (JECFA) (Canady et al. 2002) and became the basis of the EU regulations for feed and food (for food as an example, see the 2002 regulation [European Commission 2002a] and its amendments until 2012 [European Commission 2012a, European Commission 2012b]). According to these requirements, the difference between the upper-bound and lower-bound WHO-TEQ level shall not exceed 20% for foodstuffs with a contamination of about 1 pg WHO-TEQ/g fat (based on the sum of PCDD/PCDF and dioxin-like PCB). Comparable requirements were laid down for the determination of PCDD, PCDF, and dioxin-like PCB in feed as well, however on a product basis (at 12% moisture). The current regulations demand that the difference between upper- and lower-bound levels shall not exceed 20% for confirmation of the exceedance of maximum levels (European Commission 2017a; European Commission 2017b). The same requirement (maximum difference below 20%) was also set as one of the criteria for the sum of NDL-PCB at the maximum level, determined by Isotope Dilution Mass Spectrometry as well as by other techniques (European Commission 2017a; European Commission 2017b).

The acceptable difference between upper- and lower-bound values is also of particular importance for the analysis of samples that are used to derive time trends in contaminant concentration, as an evaluation of the effectiveness of the Stockholm Convention: differences that are too wide might actually be caused by inadequate analytical sensitivity and not by changes in the real levels of PCDD/PCDF and dioxin-like PCB in the samples. Therefore, a similar request was included in the guidance on the Global Monitoring Plan for persistent organic pollutants (UNEP 2013 and later amendments, including the most recent version UNEP 2019), which recommends the following reporting format (cit.): “The upper-bound (ND=LOQ) and the lower-bound (ND=0) values should be given. As a QA/QC measure, the difference between these two should be less than 20%”. As a comment, “ND” (not detected) would be better replaced by “not quantified”.
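The lower-, medium-, and upper-bound approaches and the 20% criterion can be sketched as follows. Non-quantified congeners are represented by `None` and are replaced by 0, LOQ/2, or LOQ. The regulations quoted above do not state which value the percentage difference refers to, so expressing it relative to the upper-bound result is an assumption of this sketch. Concentrations, LOQs, and names are hypothetical; the TEF values are the WHO 2005 factors for the three congeners shown.

```python
def bounded_teq(concentrations, loqs, tefs, substitution_factor):
    """TEQ with non-quantified congeners (None) replaced by factor * LOQ.

    substitution_factor: 0.0 = lower bound, 0.5 = medium bound, 1.0 = upper bound.
    """
    total = 0.0
    for congener, tef in tefs.items():
        value = concentrations.get(congener)
        if value is None:
            value = substitution_factor * loqs[congener]
        total += value * tef
    return total


def bound_difference_percent(lower, upper):
    """Difference between bounds in % of the upper bound (assumed convention)."""
    return 100.0 * (upper - lower) / upper


tefs = {"2,3,7,8-TCDD": 1.0, "1,2,3,7,8-PeCDD": 1.0, "2,3,4,7,8-PeCDF": 0.3}
sample = {"2,3,7,8-TCDD": None, "1,2,3,7,8-PeCDD": 0.8, "2,3,4,7,8-PeCDF": 2.0}

# Same sample evaluated with an adequate and an inadequate LOQ (pg/g lipid)
for loq_tcdd in (0.1, 0.5):
    loqs = {"2,3,7,8-TCDD": loq_tcdd}
    lower = bounded_teq(sample, loqs, tefs, 0.0)
    medium = bounded_teq(sample, loqs, tefs, 0.5)
    upper = bounded_teq(sample, loqs, tefs, 1.0)
    diff = bound_difference_percent(lower, upper)
    verdict = "acceptable" if diff <= 20.0 else "LOQ not low enough"
    print(f"LOQ={loq_tcdd}: lower={lower:.2f}, medium={medium:.2f}, "
          f"upper={upper:.2f}, difference={diff:.1f}% ({verdict})")
```

The second case shows the situation described in the text: the lower-bound TEQ is unchanged, but the wide gap to the upper bound signals that the result is limited by analytical sensitivity.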

3.4 Amendments of EU Regulations; EU Guidance Documents

Various amendments were developed by the network of the EU Reference Laboratory and National Reference Laboratories for dioxins and PCB for food (European Commission 2017a) and feed (European Commission 2009) in order to improve the interpretation of the criteria. Additionally, for data obtained using confirmatory methods the regulations require that the expanded measurement uncertainty should be taken into account as described in the “Guidance Document on Measurement Uncertainty for Laboratories performing PCDD/PCDF and PCB Analysis using Isotope Dilution Mass Spectrometry” (Eppe et al. 2017) and that the limit of quantification should be estimated as described in the “Guidance Document on the Estimation of LOD and LOQ for Measurements in the Field of Contaminants in Feed and Food” (EU Reference Laboratories for contaminants 2016).

3.5 Global Monitoring Plan

To provide reliable monitoring information for the Parties to the Stockholm Convention, the guidance document for the Global Monitoring Plan (GMP) proposed that a quantified objective for temporal studies should be stated, e.g. “to detect a 50% decrease in the levels of POPs within a 10 year period” (UNEP 2013; UNEP 2019). The statistical model used in the Bi-ennial Global Interlaboratory Assessment on Persistent Organic Pollutants is based on a target error of 25% as the decisive criterion to assess the performance of each laboratory for each analyte in each matrix (UNEP 2017).
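The two figures above can be made concrete with a short calculation. The first function converts the example objective (a 50% decrease within 10 years) into a constant annual decrease, assuming a first-order (exponential) decline; this assumption is not part of the guidance text. The second function is a simplified reading of the 25% target error as a plain deviation check against an assigned value and is not the statistical model of the interlaboratory assessment itself. All names and example values are hypothetical.

```python
def annual_decrease_fraction(total_decrease, years):
    """Constant yearly decrease that yields `total_decrease` after `years`.

    Assumes an exponential (first-order) decline.
    """
    return 1.0 - (1.0 - total_decrease) ** (1.0 / years)


def within_target_error(measured, assigned, target_error=0.25):
    """True if the result deviates from the assigned value by at most 25%."""
    return abs(measured - assigned) <= target_error * assigned


rate = annual_decrease_fraction(0.50, 10)
print(f"A 50% decrease in 10 years corresponds to about {100 * rate:.1f}% per year")

print(within_target_error(11.0, 10.0))  # deviation of 10%
print(within_target_error(13.0, 10.0))  # deviation of 30%
```

Under this assumption the yearly change is much smaller than the target error of a single measurement, which is why long-term analytical stability matters for trend detection.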

4 Quality Control

Human milk samples were received at the reference laboratory over 20 years between 2000 and 2019, in five rounds, each covering approximately four years. Therefore, a rigid quality control programme was run to ensure that any differences in measured concentrations over this period did not arise from analytical variations. This approach was already applied during the third round of WHO-coordinated exposure studies (2000–2003) when the first sample was sent by one country (New Zealand) in 2000 and the last sample was received in 2003 (USA), and was continuously used in the following rounds, as well.

Accuracy depends on systematic errors and random components. “Trueness” (closeness of agreement between the expectation of a test result or a measurement result and a true value) and “precision” (closeness of agreement between independent test/measurement results obtained under stipulated conditions), both as defined in ISO 3534-2:2006, are used to describe accuracy and are therefore important criteria for assessment of reliability of analytical methods (Eppe et al. 2017).

The comprehensive quality control programme included procedural blank samples, various kinds of in-house reference material (vegetable oil samples spiked at different levels and different kinds of quality control samples), and confirmation of certain results by duplicate analysis. Possible systematic errors were checked by the analysis of reference material or participation in numerous interlaboratory studies. This validation should guarantee a very high degree of accuracy and is part of the general quality control programme applied in the daily routine for analysis of all kinds of samples. Therefore, comprehensive validation data are available, showing the accuracy for WHO/UNEP human milk samples relative to the accuracy achieved in general routine analysis. As a result, the validation of the results gives a complex picture. The statistical evaluation of these pillars of the quality control reflects a “worst case scenario”: analyses were performed by different operators using different chemicals over a long period (more than 20 years) with data collected in separate runs. These quality control data, collected under intermediate conditions, are therefore much more robust than data from a single validation in which one technician performs repeated analyses under the same conditions using the same chemicals in one sequence. The approach and detailed results for the first years were presented earlier (Malisch and van Leeuwen 2002).

The concentrations of the relevant congeners (17 PCDD/PCDF, 12 dioxin-like PCB, 6 non-dioxin-like PCB) and four main summarizing parameters (TEQ for PCDD/PCDF, dioxin-like PCB and total TEQ; ΣPCB6) were determined with a rigid quality control as described in the next sections. 19 additional PCB congeners, which are not included in regulatory listings and therefore cannot be validated externally as part of any proficiency tests, were included in the routine analysis for a broader picture of the PCB spectrum.

4.1 Procedural Blank Samples

For PCDD/PCDF, the median of 434 procedural blank samples analysed between 2000 and 2019 is 0.04 pg WHO-PCDD/PCDF-TEQ/g (upper-bound LOQ, estimated on a lipid basis, using an aliquot equivalent to 3 g lipid for determination, see Sect. 2.1 “Analytical procedure and analytes”) (Table 1). In most cases, tetra- through hexa-substituted congeners were below the LOQ. Therefore, the upper-bound calculated procedural blank is more an indication of the LOQ than a reagent blank which could be considered for possible subtraction. For dioxin-like PCB, the median value of 401 reagent blank samples was 0.01 pg WHO-PCB-TEQ/g lipid (upper-bound), and 0.05 ng/g lipid for the sum of the 6 NDL-PCB (ΣPCB6). The median of these 434 procedural blank samples was about two orders of magnitude below the median of the 232 pooled human milk samples analysed for the WHO/UNEP-coordinated exposure studies. For samples of these studies, the maximum of procedural blank samples run in sequences together with the human milk samples had to be about one order of magnitude below the concentration of these samples. Therefore, the influence of procedural blank samples was negligible for human milk samples.

Table 1 WHO-PCDD/PCDF-TEQ (pg/g lipid), WHO-PCB-TEQ (pg/g lipid), and ΣPCB6 (ng/g lipid) levels of reagent blank samples analysed together with human milk and fatty food samples between 2000 and 2019 (TEQ results with use of WHO1998-TEF or WHO2005-TEF; all results as upper-bound concentrations)
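The acceptance rule for procedural blanks described above can be sketched as a simple check: the highest blank in a sequence should lie about one order of magnitude below the sample concentration. Treating "about one order of magnitude" as a fixed factor of 10 is an assumption of this sketch, and the blank and sample values are hypothetical.

```python
import statistics


def blank_is_negligible(blank_values, sample_concentration, factor=10.0):
    """True if the maximum blank is at least `factor` times below the sample."""
    return max(blank_values) * factor <= sample_concentration


# Hypothetical upper-bound blank results of one sequence (pg WHO-TEQ/g lipid)
blanks = [0.03, 0.04, 0.05, 0.04]
median_blank = statistics.median(blanks)

ok_for_human_milk = blank_is_negligible(blanks, 5.0)   # typical human milk level
ok_for_low_sample = blank_is_negligible(blanks, 0.3)   # low-level sample
print(median_blank, ok_for_human_milk, ok_for_low_sample)
```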

4.2 Freeze Drying and Particular PCB-related Aspects

PCB 47, PCB 51, and PCB 68 were identified as a major non-Aroclor source in residential homes. It was hypothesized that these congeners were inadvertent byproducts of polymer sealant manufacturing for finished cabinetry and produced from the decomposition of 2,4-dichlorobenzoyl peroxide used as an initiator in free-radical polymerization of polyester resins (Herkert et al. 2018).

These congeners belong neither to the group of the six indicator PCB nor to the twelve dioxin-like PCB and are therefore not relevant parameters for the WHO/UNEP-coordinated exposure studies. However, when other PCB congeners were also included in the applied analytical method for a more complete picture, the findings of elevated PCB 47 concentrations after freeze-drying could be explained by the hypothesis of formation from 2,4-dichlorobenzoyl peroxide during the production of sealants. Tests of freeze-drying of cow’s milk with three different kinds of cables used for a freeze-dryer (Beta 1–8, Martin Christ) point to the insulation of cables as the source of PCB 47 concentrations exceeding the ranges of PCB 138, PCB 153, and PCB 180, which are usually the highest PCB congeners in food of animal origin and human milk (Table 2). If a method also required the determination of PCB 47, such a possible contamination could be avoided, e.g., by liquid/liquid distribution or use of adsorbents for the extraction of lipids.

Table 2 Variation of PCB 47 concentrations in comparison to PCB 138, PCB 153, and PCB 180 in freeze-dried whole milk using different types of cables in the freeze-dryer (ng/g lipid)

An obvious contamination of submitted samples, in particular with lower chlorinated PCB, was detected in the cases of two countries that submitted freeze-dried samples instead of deep-frozen samples, as requested by the protocols. Seven samples from Egypt from 2001 were freeze-dried before shipment and were apparently contaminated with lower chlorinated PCB during freeze-drying: concentrations of the indicator PCB 28, PCB 52, and PCB 101 and of the non-ortho PCB 77 and PCB 81 were an order of magnitude or more higher than usual. Hence, two additional pooled samples were submitted by Egypt in 2002, which were shipped frozen and not freeze-dried and showed the normal PCB pattern (Table 3). Furthermore, the sample from Cuba (2011) was freeze-dried before shipment. Here too, a contamination with lower chlorinated PCB occurred during freeze-drying and resulted in an unusual PCB pattern (Malisch et al. 2023b). It is unknown what exactly might have caused this contamination in these submitted samples, e.g. sealants or pump oil, as freeze-drying with suitable instruments is a well-proven contamination-free technique. In any case, samples have to be shipped deep-frozen or, in cases when frozen receipt by the reference laboratory cannot be guaranteed, after addition of a small amount of potassium dichromate (K2Cr2O7) to the sample for stabilization (Malisch et al. 2023a).

Table 3 PCB concentrations in pooled human milk samples from Egypt (pools 1–7 freeze-dried before shipment in 2001; pools 8–9 sent deep-frozen in 2002; marked in italics: extremely high concentrations of PCB 28, PCB 52, PCB 101, PCB 77, and PCB 81 caused by freeze-drying before shipment; ng/g lipid for the six indicator PCB [PCB 28–PCB 180] and the 8 mono-ortho dioxin-like PCB [PCB 105–PCB 189]; pg/g lipid for the four non-ortho dioxin-like PCB [PCB 77–PCB 169])

4.3 Vegetable Oil Samples Fortified at Different Levels as in-house Reference Material

Fortification of refined vegetable oil (sunflower oil) with different levels of native PCDD/PCDF and PCB is a well-established procedure to check the recovery of native analytes and variation at various levels. Therefore, starting with the analysis of the human milk samples for the “third round” performed between 2000 and 2003, the fortification experiments for control of the usual contamination of food and feed were expanded to also cover the higher levels that are found in human milk in order to check the linearity of the response for PCDD/PCDF (range of about 0.6 to 25 pg WHO-PCDD/PCDF-TEQ1998/g lipid and 1 to 40 pg WHO-PCB-TEQ1998/g lipid). Table 4 summarizes the recoveries and relative standard deviation (RSD) of fortification tests performed between 1994 and 2003 for WHO-PCDD/PCDF-TEQ1998. The mean recoveries at six fortification levels (range 0.6–24.3 pg/g) were in the range between 93 and 103% with an RSD between 2 and 9%.

Table 4 Mean recovery and RSD (%) of samples fortified at various WHO-PCDD/PCDF-TEQ1998 concentrations (pg/g lipid) analysed together with human milk and food samples between 1994 and 2003

Table 5 summarizes the results for WHO-PCB-TEQ1998 of fortified samples which were performed between 2000 and 2003 in combination of analyses of human milk samples of the third round (2000–2003). The mean recoveries at 4 fortification levels were between 91 and 105% with an RSD between 1.7 and 11.5%.

Table 5 Mean recovery and RSD (%) of samples fortified at various WHO-PCB-TEQ1998 concentrations (pg/g lipid) analysed together with human milk samples between 2000 and 2003

As a result, the quality parameters are comparable at different fortification levels including the lower concentrations that were usually found in food and the higher concentrations found in human milk.
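The evaluation of a fortification level can be sketched as follows: each measured TEQ is expressed as a percentage of the fortified TEQ, and the mean recovery and its RSD are then calculated. The measured values are hypothetical, the sample standard deviation (n − 1) is assumed, and the function name is chosen for illustration.

```python
import statistics


def recovery_statistics(measured_values, fortified_level):
    """Return (mean recovery in %, RSD of the recoveries in %)."""
    recoveries = [100.0 * m / fortified_level for m in measured_values]
    mean_recovery = statistics.mean(recoveries)
    rsd = 100.0 * statistics.stdev(recoveries) / mean_recovery
    return mean_recovery, rsd


# Hypothetical replicates of oil fortified at 10.0 pg WHO-TEQ/g lipid
mean_recovery, rsd = recovery_statistics([9.5, 10.0, 10.5], fortified_level=10.0)
print(f"mean recovery = {mean_recovery:.1f}%, RSD = {rsd:.1f}%")
```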

4.4 Quality Control Samples as in-house Reference Material and Precision

Numerous quality control samples have been used for monitoring the precision of PCDD/PCDF analysis since 1994. Initially, samples of butter and extracted lipids from eggs that were contaminated at different levels were used. Table 6 summarizes the results obtained for these samples between 1994 and 2003 as an indication of the variation of results (expressed as RSD [%]) at the time of the analyses of human milk during the 2000–2003 round.

Table 6 RSD (%) of quality control samples in the range between about 0.7 pg/g and 5 pg/g WHO-PCDD/PCDF-TEQ1998 used between 1994 and the end of the 2000–2003 round

Figure 1 illustrates the quality control chart for butter A over the whole period of its use from 1994 until 2007 (thus, including the third and fourth round performed between 2000 and 2007). During this time, 100 replicates were analysed by numerous technicians under varying conditions (various batches of chemicals, instrumental conditions, etc.). Around the mean (M), warning levels are set at two standard deviations (lower warning level at M − 2s, upper warning level at M + 2s) and control levels at three standard deviations (lower control level at M − 3s, upper control level at M + 3s). In the 13 years of use of this quality control sample, an RSD of nearly 10% was observed for a mean level of 0.67 pg WHO-PCDD/PCDF-TEQ1998/g lipid. The lower warning level of 0.54 pg/g was reached by one of the 100 replicates, and the upper warning level of 0.80 pg/g was exceeded by three samples. None of the 100 replicates exceeded the lower (0.44 pg/g) or upper (0.90 pg/g) control level, which would have required a thorough follow-up to identify the reason for the deviation. Note that butter A with a concentration of 0.67 pg WHO-PCDD/PCDF-TEQ1998/g lipid was on average about a factor of 10 lower than the concentrations found in human milk between 2000 and 2003 (range 3.3–22.3 pg WHO-PCDD/PCDF-TEQ1998/g lipid) and therefore demonstrates the performance even at the much lower concentration ranges that were found in many food samples of animal origin (cow’s milk, poultry, beef, veal, hen’s eggs) at that time (EU Scientific Committee on Food 2000; EFSA 2012).

Fig. 1 Quality control chart for “butter A” (0.67 pg WHO-PCDD/PCDF-TEQ1998/g lipid) used from 1994 to 2007 (UCL = upper control level, UWL = upper warning level, LWL = lower warning level, LCL = lower control level)
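The logic of such a control chart can be sketched in a few lines: warning and control levels are derived from the mean and standard deviation of earlier results, and each new result is classified against them. The historical values below are hypothetical and are not the butter A data; the function names and the three result categories are chosen for illustration.

```python
import statistics


def control_limits(history):
    """Warning (M ± 2s) and control (M ± 3s) levels from historical results."""
    m = statistics.mean(history)
    s = statistics.stdev(history)
    return {"LCL": m - 3 * s, "LWL": m - 2 * s, "mean": m,
            "UWL": m + 2 * s, "UCL": m + 3 * s}


def classify(value, limits):
    """Classify a new quality control result against the chart levels."""
    if value < limits["LCL"] or value > limits["UCL"]:
        return "out of control"   # requires a thorough follow-up
    if value < limits["LWL"] or value > limits["UWL"]:
        return "warning"
    return "in control"


# Hypothetical historical results of a quality control sample (pg TEQ/g lipid)
history = [0.9, 1.0, 1.1, 1.0, 1.0]
limits = control_limits(history)

for new_result in (1.00, 1.18, 1.30):
    print(new_result, classify(new_result, limits))
```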

Over time, the number of quality control samples was expanded considerably: 10 different quality control samples (mixed fat, milk fat, egg fat, fish oil, and pork fat) were used between 2003 and 2019 for control of food and human milk samples (see Table 7) covering a range between about 2 and 12 pg WHO-PCDD/F-PCB-TEQ2005/g lipid, and 7 to 80 ng/g ΣPCB6. For all samples and all sum parameters, the RSD was between 3 and 11% (median about 7 to 8%).

Table 7 Quality control samples used between 2003 and 2019

In conclusion, based on the clean-up and GC-HRMS determination of these quality control samples, the methodology achieved a long-term precision of below 15% over the 2000–2019 period.

4.5 Duplicate Analysis and Precision

PCDD/PCDF and PCB concentrations of most samples were determined by duplicate analyses. In the study performed between 2000 and 2003, this approach was optimized in a way which can best be described as the “overlapping sandwich method”. A large portion of the samples was analysed in duplicate with the second confirmatory analysis being performed in sequences with samples from other countries and with different quality control samples. This ensured that the results of all samples from different countries had the same reliability despite receipt of the various samples by the reference laboratory over a period of three years: the combination of the “overlapping sandwich method” and use of quality control samples ensured that the sample results in this round of the study of the first country sending samples (New Zealand, 2000) had the same reliability as those of the last country (USA, 2003). Any differences in levels could therefore be attributed to real concentration differences and not to analytical variation. In addition, it is possible to calculate the repeatability standard deviation from duplicate analyses.

As an example, from 102 samples received from 26 countries between 2000 and 2003, 64 were analysed as duplicates and additionally, 10 were analysed as triplicates. These 74 samples were sorted into five groups with different ranges of WHO-PCDD/PCDF-TEQ1998 concentrations. The repeatability standard deviation, expressed as relative standard deviation (RSD), was calculated from the differences between the individual results and the mean result of each sample. Table 8 summarizes the results for the RSD (%) for the mean concentration in each range. At all levels (from <5 pg WHO-PCDD/PCDF-TEQ/g lipid to >15 pg WHO-PCDD/PCDF-TEQ/g lipid), the RSD was between 2 and 6%.

Table 8 RSD (%) for determination of WHO-PCDD/PCDF-TEQ1998 based on duplicate analyses of 74 human milk samples received between 2000 and 2003 with differentiation into five concentration ranges

Table 9 summarizes the RSD (%) for WHO-PCB-TEQ1998 for the 74 samples analysed in duplicate or triplicate, divided into four groups (from “<5 pg WHO-PCB-TEQ/g lipid” to “>17 pg WHO-PCB-TEQ/g lipid”). The RSD of the mean WHO-PCB-TEQ concentrations in the respective groups ranged from 3 to 7%. Table 10 gives the mean of the corresponding NDL-PCB results in these four groups (range between 35 and 430 ng ΣPCB6/g lipid); the RSD for the NDL-PCB determinations was between 2 and 3%.

Table 9 RSD (%) of determination of WHO-PCB-TEQ1998 based on duplicate analyses of 74 human milk samples received between 2000 and 2003
Table 10 RSD (%) of determination of the sum of the 6 non-dioxin-like PCB (ΣPCB6) based on duplicate analyses of 74 human milk samples received between 2000 and 2003

4.6 Pooled Human Milk Samples Remaining from WHO Interlaboratory Assessment Study 1995–1996 as Quality Control Samples 2000–2003

Following the fourth quality assessment study conducted by WHO on levels of PCB, PCDD, and PCDF in human milk in 1995 and 1996 (WHO 2000), the CVUA Freiburg was designated as the WHO Reference Laboratory for the subsequent WHO exposure study. To check the analytical reliability for the WHO-coordinated exposure study 2000–2003, four pools of human milk samples remaining from the 1995–1996 WHO interlaboratory assessment study were provided to CVUA Freiburg for inclusion as quality control samples. This made it possible to check whether the performance achieved in the quality assessment study could be reproduced about six years later, when human milk samples from different countries were analysed. Three of the four pools contained enough material for repeated use in two different analytical sequences. Thus, seven replicate analyses could be performed between 2000 and 2003. These quality control data were obtained under intermediate conditions: different operators, different chemicals, and separate runs over a long period.

A main criterion for the evaluation of the fourth quality assessment study conducted by WHO was the long-term reliability of the analytical performance. Trueness and precision of measurements were assessed for compounds classified as group I (the most important congeners with dioxin-like properties: 2,3,7,8-TCDD, 1,2,3,7,8-PeCDD, 1,2,3,6,7,8-HxCDD, 2,3,4,7,8-PeCDF, PCB 126, PCB 118, PCB 156, PCB 157) and group II (the most abundant non-dioxin-like PCB: PCB 138, PCB 153, PCB 180).

Table 11 summarizes the mean recoveries for the seven replicates of the remaining human milk samples measured between 2000 and 2003, relative to the results submitted for the 1995–1996 WHO quality assessment study, together with the corresponding RSDs (with WHO-PCB-TEQ split into the two subgroups of non-ortho PCB and mono-ortho PCB). The RSDs for the decisive individual congeners were 7 to 12% for PCDD/PCDF (2,3,7,8-TCDD, 1,2,3,7,8-PeCDD, 1,2,3,6,7,8-HxCDD, 2,3,4,7,8-PeCDF), 14% for PCB 126, 10 to 14% for mono-ortho PCB (PCB 118, PCB 156, PCB 157), and 10 to 11% for non-dioxin-like PCB (PCB 138, PCB 153, PCB 180). These measurements between 2000 and 2003 were thus in line with the results submitted in 1995–1996, when the criteria for acceptance in the fourth quality assessment study were met.
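As an illustration of how such figures can be derived, the following sketch expresses each replicate as a recovery (%) relative to the value submitted earlier for the same pool and then reports the mean recovery and its RSD. This is an assumed reading of the comparison, not the documented calculation; the function name and all numbers are hypothetical.

```python
from statistics import mean, stdev


def recovery_summary(replicates):
    """Return (mean recovery %, RSD %) for a set of replicate measurements.

    replicates: list of (measured, reference) pairs, where `reference` is the
    value reported earlier for the same pool. Assumption: recovery is defined
    as 100 * measured / reference, and the RSD is the sample standard
    deviation of the recoveries relative to their mean.
    """
    if len(replicates) < 2:
        raise ValueError("at least two replicates are required for an RSD")
    recoveries = [100.0 * measured / reference for measured, reference in replicates]
    avg = mean(recoveries)
    rsd = 100.0 * stdev(recoveries) / avg
    return avg, rsd


# Hypothetical values for one congener: three pools measured twice, one pool once
pairs = [(2.9, 3.0), (3.2, 3.0), (5.4, 5.0), (4.7, 5.0),
         (1.1, 1.0), (0.95, 1.0), (7.6, 8.0)]
avg, rsd = recovery_summary(pairs)
print(f"mean recovery = {avg:.0f}%, RSD = {rsd:.0f}%")
```

Normalizing to the reference value of each pool allows replicates from pools with different concentration levels to be combined in a single RSD.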

Table 11 Mean recoveries (%) and RSDs (%) of seven replicates of remaining human milk samples used as quality control samples between 2000 and 2003 in comparison to the results submitted for the 1995–1996 WHO quality assessment study

4.7 Participation in Proficiency Tests and Trueness

Between 2000 and 2019, the CVUA Freiburg successfully participated in 32 proficiency tests covering 81 test samples of food of animal origin or human milk (Table 17 in the appendix).

To summarize the evaluation of all 81 samples of these proficiency tests performed between 2000 and 2019, the results for 52 samples were calculated on a lipid basis and those for 29 on a product basis (following the specifications of the EU regulations for maximum levels for WHO-PCDD/PCDF-PCB-TEQ and for ΣPCB6), regardless of the reporting requirements of the individual proficiency tests. Figure 2 depicts the deviation (%) of the results of CVUA Freiburg from the consensus values for WHO-PCDD/PCDF-PCB-TEQ, Fig. 3 for WHO-PCDD/PCDF-TEQ, Fig. 4 for WHO-PCB-TEQ, and Fig. 5 for the sum of the 6 NDL-PCB. In all figures, deviations for results calculated on a lipid basis and on a product basis are marked as blue and red squares, respectively. Green dotted lines mark the lower and upper end of the range relevant for human milk samples (minimum and maximum found in all 232 submitted pooled samples). No distinction was made between TEQ results based on WHO-TEF1998 and those based on WHO-TEF2005. The general idea of this summarizing evaluation was to check how the results reported by CVUA Freiburg deviated from the respective consensus value, and this deviation was always calculated on the same basis (same TEF, same basis “lipid” or “product”).

Fig. 2 Deviation (%) of results of CVUA Freiburg for WHO-PCDD/PCDF-PCB-TEQ from the assigned values of proficiency tests performed between 2000 and 2019 (blue squares: results on lipid base as pg WHO-PCDD/PCDF-PCB-TEQ/g lipid; red squares: results on product base as pg WHO-PCDD/PCDF-PCB-TEQ/g product; green dotted line: range relevant for human milk samples [about 1–50 pg/g lipid])

Fig. 3 Deviation (%) of results of CVUA Freiburg for WHO-PCDD/PCDF-TEQ from the assigned values of proficiency tests performed between 2000 and 2019 (blue squares: results on lipid base as pg WHO-PCDD/PCDF-TEQ/g lipid; red squares: results on product base as pg WHO-PCDD/PCDF-TEQ/g product; green dotted line: range relevant for human milk samples [about 1–40 pg/g lipid])

Fig. 4 Deviation (%) of results of CVUA Freiburg for WHO-PCB-TEQ from the assigned values of proficiency tests performed between 2000 and 2019 (blue squares: results on lipid base as pg WHO-PCB-TEQ/g lipid; red squares: results on product base as pg WHO-PCB-TEQ/g product; green dotted line: range relevant for human milk samples [about 0.3–15 pg/g lipid])

Fig. 5 Deviation (%) of results of CVUA Freiburg for the sum of the 6 NDL-PCB (ΣPCB6) from the assigned values of proficiency tests performed between 2000 and 2019 (blue squares: results on lipid base as ng ΣPCB6/g lipid; red squares: results on product base as ng ΣPCB6/g product; green dotted line: range relevant for human milk samples [about 1–1000 ng/g lipid])

As a result, in the range relevant for human milk above 1 pg WHO-TEQ/g lipid, about 90% of the results differed by less than 10% from the assigned values for WHO-PCDD/PCDF-PCB-TEQ and WHO-PCDD/PCDF-TEQ, and by less than about 15% for WHO-PCB-TEQ. In the wider range relevant for the sum of the 6 NDL-PCB (1–1000 ng/g lipid), about 90% of the results differed by less than 15% from the assigned values. The maximum deviation at these relevant concentrations was about 15–20% for the TEQ-based parameters and about 25% for the sum of the 6 NDL-PCB.
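The summary statistics above can be reproduced in principle with a short calculation: the percentage deviation of each reported result from its assigned value, and the share of results within a given tolerance, restricted to assigned values in the range relevant for human milk. The following is a minimal sketch; the function names and the data are hypothetical and are not taken from the proficiency tests.

```python
def percent_deviation(reported, assigned):
    """Deviation (%) of a reported result from the assigned value."""
    return 100.0 * (reported - assigned) / assigned


def share_within(pairs, tolerance_pct, lower, upper):
    """Fraction of results deviating by less than `tolerance_pct`.

    pairs: list of (reported, assigned) values on the same basis and unit.
    Only pairs whose assigned value lies in [lower, upper] are counted,
    mirroring the restriction to the range relevant for human milk.
    """
    in_range = [(r, a) for r, a in pairs if lower <= a <= upper]
    if not in_range:
        raise ValueError("no assigned values fall within the requested range")
    hits = sum(1 for r, a in in_range
               if abs(percent_deviation(r, a)) < tolerance_pct)
    return hits / len(in_range)


# Hypothetical (reported, assigned) results in pg WHO-TEQ/g lipid
results = [(1.05, 1.00), (4.8, 5.0), (10.0, 12.5), (0.5, 0.4), (21.0, 20.0)]
share = share_within(results, tolerance_pct=10.0, lower=1.0, upper=50.0)
print(f"{100 * share:.0f}% of in-range results deviate by less than 10%")
```

Filtering on the assigned value rather than on the reported result keeps the selection independent of the laboratory's own measurement.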

The observed tendency towards higher deviations in some proficiency tests for the sum of the 6 NDL-PCB raises questions about analytical performance. NDL-PCB are found at much higher concentrations than PCDD/PCDF or DL-PCB, so NDL-PCB analysis might be expected to be easier and to provide more accurate results than PCDD/PCDF analysis. In contrast to the observations in proficiency tests, the internal quality control data gathered over about two decades give no reason for such a higher deviation: the RSDs of the quality control samples (see Sects. 4.3 and 4.4) and of the replicates of leftover WHO quality assessment samples (see Sect. 4.6) show no difference between the TEQ-based results and the sum of the 6 NDL-PCB. The duplicate analyses of samples of these studies even give slightly better RSDs for the sum of the 6 NDL-PCB than for the TEQ-based results (see Sect. 4.5). It is therefore assumed that the deviations in some proficiency tests might result from a higher uncertainty of the assigned values in certain tests rather than from internal analytical deficiencies. The analytical methodology for the determination of NDL-PCB at CVUA Freiburg has always used 13C-labelled internal standards and HRGC-HRMS determination (see Sect. 2.1). For many years, however, a number of laboratories have also used other techniques, often without controlling the quantification of the 6 NDL-PCB by 13C-labelled internal standards. In some cases, the calculation of the assigned value for NDL-PCB might therefore have been based on a wider distribution of results.

Figure 6 demonstrates the consistency of the performance over time by plotting the average deviation per year (%) of the results of CVUA Freiburg from the assigned values of proficiency tests performed between 2000 and 2019 for WHO-PCDD/PCDF-PCB-TEQ, WHO-PCDD/PCDF-TEQ, WHO-PCB-TEQ, and the sum of the 6 NDL-PCB, covering the respective relevant ranges.

Fig. 6 Average deviation per year in % of results of CVUA Freiburg from assigned values of proficiency tests performed between 2000 and 2019 for WHO-PCDD/PCDF-PCB-TEQ, WHO-PCDD/PCDF-TEQ, WHO-PCB-TEQ, and sum of 6 NDL-PCB covering the respective relevant ranges

4.8 Accreditation

In 1993, new quality standards were introduced for laboratories entrusted with the official control of foodstuffs by the Member States of the European Economic Community. By November 1998 at the latest, laboratories had to comply with the general criteria for the operation of testing laboratories laid down in European Standard EN 45001, supplemented by standard operating procedures and random audits of their compliance by quality assurance personnel (Council Directive 93/99/EEC (1993)). A revision of the regulations on official controls in 2004 (EU Regulation 882/2004) stipulated that laboratories designated for official control should operate and be assessed and accredited in accordance with European Standard EN ISO/IEC 17025, “General requirements for the competence of testing and calibration laboratories” (European Standard EN ISO/IEC 17025 (2017)). Therefore, the CVUA Freiburg was accredited in 1998 and has been re-accredited continuously since then.

As a result, all analyses performed by CVUA Freiburg for the WHO/UNEP-coordinated exposure studies since 2000 followed the strict rules of the accreditation system and the general criteria for the operation of testing laboratories as laid down in EN ISO/IEC 17025.

5 Summary and Conclusions

In support of the WHO/UNEP-coordinated exposure studies on PCDD/PCDF and PCB in human milk, comprehensive quality control was performed on the received samples over two decades. As analytical criteria for human milk are not specified, the criteria for the analyses of PCDD/PCDF and PCB in feed and food required by EU legislation were used. The analytical methodology and validation data used were applicable for the analysis of the human milk samples and met the regulatory requirements for acceptance of results for foods including milk. Supporting this, the CVUA Freiburg participated successfully in 32 proficiency tests between 2000 and 2019, covering 81 test samples of food of animal origin or human milk, including two UNEP-interlaboratory studies on POPs.

As a key parameter supporting the reliability of these studies, trueness was estimated from the proficiency test samples in the relevant range above 1 pg WHO-TEQ/g lipid. The deviation was less than 10% from assigned values for WHO-PCDD/PCDF-PCB-TEQ and WHO-PCDD/PCDF-TEQ and less than about 15% for WHO-PCB-TEQ, for about 90% of the results. In the wider range relevant for the sum of 6 NDL-PCB (1–1000 ng/g lipid), about 90% of the results differed by less than 15% from assigned values.

Furthermore, it has been shown that the results meet an important criterion for monitoring data intended to identify trends in the levels of these contaminants. The guidance document for the Global Monitoring Plan recommends, as a quantitative objective for temporal studies, the detection of a 50% decrease in the levels of POPs within a time period of 10 years. Therefore, UNEP-coordinated interlaboratory assessments are evaluated based on a target error of 25%. Long-term fulfilment of the criterion of a variation of ±25% for the determination of PCDD/PCDF and PCB was demonstrated by participation in the 2000–2019 proficiency tests. Additionally, a long-term precision of below 15% was achieved, based on quality control samples analysed during this period.
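To put the quantitative objective into perspective, a 50% decrease over 10 years can be converted into an average annual decrease. The sketch below assumes a constant relative (first-order) decline, which is an assumption made here for illustration and is not stated in the guidance document; the function name is hypothetical.

```python
def annual_decrease_pct(total_decrease_pct, years):
    """Average annual decrease (%) equivalent to a total decrease over `years`.

    Assumption: levels fall by the same relative amount every year
    (exponential decline), so remaining = (1 - annual) ** years.
    """
    if not 0.0 <= total_decrease_pct < 100.0 or years <= 0:
        raise ValueError("total decrease must be in [0, 100) and years > 0")
    remaining = 1.0 - total_decrease_pct / 100.0
    return 100.0 * (1.0 - remaining ** (1.0 / years))


# 50% decrease within 10 years, the objective named in the text
print(f"{annual_decrease_pct(50.0, 10):.1f}% per year")
```

Under this assumption, the objective corresponds to a decrease of roughly 6.7% per year.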

Collectively, these controls ensured that any differences in concentration levels over the 20 years between 2000 and 2019, e.g. lower levels found in later years, did not arise from analytical variation but reflected decreasing time trends.