Introduction

Polycyclic aromatic hydrocarbons (PAHs) are a large class of organic compounds that are composed of two or more fused aromatic rings and are ubiquitous environmental pollutants which can be toxic and carcinogenic depending on the actual compound [1, 2]. PAHs are primarily formed by incomplete combustion or pyrolysis of organic matter (forest fires and volcanoes) and from anthropogenic sources such as motor vehicle exhaust, industrial processes, domestic heating, waste incineration and tobacco smoke [35]. Thus, the natural and anthropogenic sources of PAHs in the environment are numerous. Major routes of exposure for non-smokers are food ingestion and air inhalation. Food can be contaminated by PAHs that are present in air, soil or water, by industrial food processing methods (e.g. heating, drying and smoking processes) and during domestic food preparation (e.g. grilling and roasting processes) [6]. As PAHs generally do not occur individually but in mixtures, a set of 16 PAHs were considered by the US Environmental Protection Agency (EPA) as priority pollutants. The 16 EPA PAHs have been commonly used to characterise the PAH load in different samples [7, 8]. However, this list does not include some of the more toxic PAHs as listed by the Scientific Committee on Food (SCF), e.g. 5-methyl chrysene (5-MC), cyclopenta[cd]pyrene (CPP), most of the dibenzopyrenes and benzo[j]fluoranthene (BjF) [9]. In 2005, the European Commission (EC) listed 15 + 1 PAHs to be monitored by the EU Member States [10], including 15 PAHs identified as genotoxic and carcinogenic by the SCF, together with benzo[c]fluorene, which was highlighted by the Joint FAO/WHO Expert Committee on Food Additives (JECFA) [11]. In 2008, the European Food Safety Authority (EFSA) concluded that the sum of either four or eight specific PAHs would be the most suitable indicator for monitoring PAHs in food (noting that monitoring the sum of eight PAHs would not provide a significant added value compared with the sum of four PAHs) [9]. In line with the EFSA opinion, current EU legislation sets maximum concentration levels for the sum of four PAHs (benzo[a]pyrene, benzo[a]anthracene, benzo[b]fluoranthene and chrysene) to minimise the health risk from dietary PAHs exposure [12].

Liquid chromatography coupled to fluorescence detection (LC-FLD) and gas chromatography coupled to mass spectrometry (GC-MS) are the most commonly used techniques for instrumental determination of PAHs in complex food and environmental samples [5, 6]. In the past, LC with ultraviolet (UV) or photo-diode array (PDA) detector [13, 14] and GC with flame ionisation detector (FID) were widely applied [9], although they lack of sufficient sensitivity and selectivity. It is possible by using LC-FLD to separate and determine some PAH isomers which are difficult to separate by GC, such as benzo-fluoranthenes or dibenzo-anthracenes [6, 8]. A better performance for the heaviest compounds is achieved as well, since it avoids discrimination observed for these compounds during GC injection at high temperatures [15, 16]. Nevertheless, GC has become the preferred technique for PAH analysis as its coupling with MS detection provides superior resolution capabilities, improved sensitivity and selectivity as well as the possibility of using isotopically labelled internal standards [16, 17]. Sample preparation for PAH determination in complex solid food matrices usually involves solid–liquid extraction (SLE) and a subsequent clean-up step to avoid interfering compounds and to minimise the maintenance of the chromatographic system, especially when using GC [6]. Different techniques such as Soxhlet extraction, pressurised liquid extraction (PLE) and ultrasound-assisted extraction have been successfully applied for a wide range of matrices, while gel permeation chromatography (GPC), solid phase extraction (SPE) and combinations thereof are the most popular strategies to clean the extracts [6].

Certified reference materials (CRMs) are crucial quality assurance tools for laboratories to validate and operate measurement procedures, in order to deliver reliable and accurate results. Guidelines for the production and certification of RMs are provided in ISO Guide 34 [18] and ISO Guide 35 [19] and are adopted by corresponding CRMs producers, which are in some cases also accredited for the activity. Several challenges must be considered for the preparation of CRMs. For example, the between-bottle heterogeneity must be quantified and sufficiently low as well as the short- and long-term stability of the target parameters must be investigated. Moreover, suitable and representative analyte/matrix combinations that are similar to routine samples with respect to mass fraction levels and material properties have to be selected. Incurred materials are generally preferred as starting materials because of their proximity to real samples, however they are often difficult to obtain. Spiked materials could be a suitable alternative as long as sufficient representativeness, equivalence, homogeneity and stability are achieved for the candidate CRM. During the material processing, the minimisation of analyte and matrix modifications is desirable in order to preserve the material properties as much as possible. The production of CRMs is a very complex process and requires a thorough planning. Therefore, prior feasibility studies are frequently required to investigate the best way of preparing the material to ensure a sufficiently homogeneous and stable CRM, which will be suitable for its purpose [19].

This work assesses the feasibility of producing a matrix RM for the 15 + 1 EU priority PAHs, at a concentration level of 1 μg/kg, using a commercially available baby food as starting material. Different processing procedures such as freezing and freeze drying, traditionally employed for preparing matrix reference materials, and the more novel autoclaving procedure were considered with three main objectives: (i) to obtain materials with minimal manipulation (i.e. as close as possible to real samples) avoiding PAH degradation; (ii) sufficient material homogeneity; and (iii) sufficient stability of the target parameters upon storage and shipment of the material. In order to evaluate the homogeneity and stability of the three resulting materials, a fit-for-purpose method for the determination of the 15 + 1 EU PAHs has been validated in-house. It was especially relevant to achieve highly repeatable results since they increase the sensitivity of the tests applied to detect potential heterogeneous distribution of PAHs in the baby food materials. The same is true during the assessment of the PAHs stability via isochronous studies, where analytical measurements are performed under repeatability conditions.

Experimental

Chemicals and materials

PAHs as neat solids used in this study (Table 1) were either from BCR- or ERM-certified reference materials provided by the Institute for Reference Materials and Measurements (IRMM; Geel, Belgium), except cyclopenta[cd]pyrene (CPP), obtained from the Biochemical Institute for Environmental Carcinogens (Grosshansdorf, Germany), benzo[c]fluorine (BcF) and dibenzo[a,i]pyrene (DaiP) which were obtained from Dr. Ehrenstorfer (Augsburg, Germany). Solutions used for spiking the material were gravimetrically prepared using acetonitrile as solvent. All other solutions, i.e. individual stock solutions, a mixed stock solution containing all PAHs and subsequent working solutions for the analytical calibration were gravimetrically prepared using toluene as solvent. Isotopically labelled compounds with stated chemical purity benzo[a]anthracene-D12 (BaA-D12; 99.5 %), chrysene-D12, (Chr-D12; 99.5 %), benzo[b]fluoranthene-D12 (BbF-D12; 99.0 %), benzo[k]fluoranthene-D12 (BkF-D12; 99.5 %), benzo[a]pyrene-D12 (BaP-D12; 99.0 %), indeno[1,2,3-cd]pyrene-D12 (IcdP-D12; 100 %), dibenzo[a,h]anthracene-D14 (DahA-D14; 99.0 %) benzo[g,h,i]perylene-D12 (BghiP-D12; 98.5) and dibenzo[a,i]pyrene-D14 (DaiP-D14; 99.0 %) were purchased as neat crystals from Dr. Ehrenstorfer (Augsburg, Germany) and used as internal standards.

Table 1 15 + 1 EU PAHs used in the feasibility study

Acetonitrile (MeCN), toluene, dichloromethane (DCM) and n-hexane used in this study were of SupraSolv grade from Merck KGaA (Darmstadt, Germany). Sodium sulphate (≥99.0 %) was obtained from Sigma, and Accubond™ AminoII SPE cartridges (6 mL, 1 g) were obtained from Agilent Technologies (PaloAlto, CA, USA).

Preparation of the study materials

Figure 1 shows a flow diagram for the preparation of the three materials considered in this study. The content of 202 jars of commercially available baby food (40 kg of material) was placed in a stainless steel mixing vessel of a paste mixer (IKA-Janke Kunkel, Staufen, Germany) and mixed at full speed for 3 h. The baby food had a composition of 40 % carrots, 18 % potatoes, 18 % tomatoes, 13 % white beans, 10 % meat and 1 % maize oil, as stated by the product label, and was supplied as a very fine homogeneous paste. The direction of mixing was changed every 30 min. Next, 3.7 kg of the material were taken aside in a stainless steel container for blank samples. From the remaining material, a portion of 2.2 kg was placed into a glass beaker and thoroughly mixed with 22.8 g of a spiking solution, containing the 15 + 1 priority PAHs in MeCN at a mass fraction of approximately 2.2 μg/g each. The mixture was stirred manually with a glass rod and subsequently added to the rest of the baby food and homogenised for 6.5 h. The target mass fraction was 1 μg/kg for each PAH, as it is the maximum level established by the current EU regulation for BaP in baby food [12]. Once the material was thoroughly mixed it was divided in three parts for further treatment resulting in three different matrix presentations.

Fig. 1
figure 1

Flow diagram for the preparation of the three materials considered in this study

Frozen material

Forty-gramme portions of the homogenised material were manually filled into 100 mL glass jars (140 spiked plus 10 blank units), closed with screw cap and subsequently introduced in aluminium pouches thermally sealed to avoid light exposure. The material was stored at −20 °C until analysis.

Autoclaved material

Forty-gramme portions of the homogenised material were manually filled into 50 mL glass jars (140 spiked plus 10 blank units), which were closed with screw cap and introduced in an autoclave system (Matachana B4023, Webeco, Ober-Ramstadt, Germany ) for sterilisation at 121 °C for a period of 20 min, with a total cycle time of 1 h. Once the cycle was finished and room temperature was reached, the jars were put into aluminium pouches thermally sealed, in order to avoid light, and stored at 4 °C until analysis.

Freeze-dried material

Approximately 8 kg of the homogenised material was spread on Teflon-coated trays (1.0 kg/tray) and freeze dried in an Epsilon 2-85D freeze-dryer (Martin Christ, Osterode, Germany) for 70 h. The initial starting temperature was −30 °C, and the sublimation step lasted about 60 h with step-wise increments of shelf temperature by 10 °C at a time. Following a secondary drying step at 20 °C for about 8 h, the material was sufficiently dry. The freeze-dried material was manually crushed with a PFTE pestle and then milled with a heavy duty cutting mill (Retsch, Haan, Germany) with a 1.0 mm sieve insert. In order to perform a second milling with a 0.5-mm sieve insert, a prior cooling of both the material and the mill with liquid N2 was necessary. The powder obtained after milling was homogenised in a Dyna-MIX CM200 mixer (WAB, Basel, Switzerland). Ten-gramme portions were filled into 100 mL amber glass vials (140 spiked plus 10 blank units) and closed with an insert and aluminium crimp caps. The material was stored at 4 °C until analysis.

Analytical methodology for determination of PAHs

Preparation of calibration standards

Individual standard stock solutions of both native and isotopically labelled compounds were prepared gravimetrically with a PAH mass fraction of approximately 50 μg/g in toluene and kept in 100 mL capped amber vials at −20 °C until use. Two intermediate stock solutions, one containing the native compounds and the other containing the isotopically labelled compounds, were prepared by mixing appropriate amounts of each individual stock solutions and toluene to reach a mass fraction of approximately 1500 μg/kg of each PAH. Appropriate amounts of each intermediate stock solution were mixed and, once again, diluted gravimetrically with toluene to obtain calibration working solutions covering the range 50–150 μg/kg of each native compound and containing 100 μg/kg of each isotopically labelled compound.

Sample treatment procedure

Sample treatment was based on procedures described elsewhere with some modifications [2023]. It comprised an ultrasound-assisted solid–liquid extraction and a subsequent thorough clean-up based on GPC followed by SPE. As soon as the frozen (FR) and autoclaved (AC) materials reached room temperature, they were homogenised and 10-g aliquots were weighed into 50-mL glass tubes. For the freeze-dried (FD) material, 2 g were weighed into the glass tubes (the mass reduction because of water loss during freeze-drying process is 81.2 %, i.e. a pre-concentration of about five times), and 8 g of MilliQ water was gravimetrically added. In both cases, the contents were thoroughly mixed before and after addition of an appropriate amount of the IS solution to reach a mass fraction of approximately 1 μg/kg in the baby food. The samples were allowed to equilibrate for at least 1 h at room temperature and in the dark before extraction. Then, 10 mL of DCM were added, sonicated on an ultrasonic cleaner (USC600TH, VWR, Leuven, Belgium) for 5 min and then centrifuged (Heraeus Megafuge 1.0RS, Kendro Laboratory Products GmbH, Hanau, Germany) at 4000 rpm for 5 min at 18 °C. After centrifugation, three different layers were obtained. The upper aqueous layer and the middle solid layer were discarded, and the organic layer, located at the bottom of the tube, was transferred to a 50-mL glass tube with the help of a Pasteur pipette. The extract was mixed with two spoons of Na2SO4, filtered through a 0.45-μm PTFE syringe filter directly into a 10-mL GPC vial and concentrated to approximately 2.5 mL under a gentle stream of nitrogen before injection into a GPC system (AccuPrep MPS™ GPC system, J2 Scientific, Columbia, MO, USA, equipped with a Express™ column (J2 Scientific). DCM was used as organic mobile phase at a flow rate of 3 mL/min. The fraction from 24 to 36 min was collected (36 mL) in a round flask and evaporated down to approximately 1 mL by means of a rotary evaporator (40 °C, 750 mbar). After that, 1 mL of n-hexane was added, and a second clean-up step was performed loading the extract into SampliQ Amino (NH2) SPE cartridge (6 mL, 1000 mg, Agilent). The SPE cartridge was previously conditioned with 20 mL of n-hexane/DCM (1:1, v/v). Twenty-five millilitres of n-hexane/DCM (1:1, v/v) were passed through the cartridge for analyte elution. The fraction was collected and subsequently pre-concentrated under a gentle nitrogen stream. Finally, 100 μL of toluene were added, and the extract was transferred to an amber glass vial for injection into the chromatographic system.

GC-MS analysis

The GC-MS system consisted of a Thermo Trace GC system (Thermo-Finnigan, Milan, Italy) coupled to a Trace DSQ single quadrupole mass spectrometric detector (Thermo-Finnigan). Injections (injection volume, 1 μL) were automatically performed by means of a PAL autosampler (CTC Analytics, Zwingen, Switzerland), into a split/splitless programmed-temperature vaporising injector (PTV) using a 10 μl syringe. The mass spectrometer was operated in the electron ionisation (EI) mode (electron energy, 70 eV). GC-MS settings were the same as reported elsewhere [17]. The GC separation was performed using a fused silica DB-17HT (50 % phenyl methyl polysiloxane, 30 m length, 0.32 mm i.d., 0.15 μm film thickness) from Agilent Technologies. Helium at 99.9997 % (AirLiquide, Belgium) was used as carrier gas at a flow rate of 1.5 mL/min. The PTV injector temperature was held at 110 °C for 0.01 min, ramped to 330 °C at 8 °C/s and held at this temperature for 5 min. Subsequently, the injector was purged during 8 min at a temperature of 350 °C and a flow rate of 100 mL/min. The oven temperature was programmed as follows: 80 °C held for 1.5 min, ramped to 250 °C at 10 °C/min, held for 5 min and finally ramped to 320 °C at 10 °C/min and held for 10 min. The GC-MS ion source temperature was 250 °C and the transfer line temperature was 320 °C. The quadrupole temperature was set to 150 °C. Each target compound was monitored in the selective ion monitoring mode (SIM) according to their retention time, using one quantitative ion (QT) and one qualitative ion (QL). The programmed time-segmented group and related selected ions monitored for each selected PAHs are shown in Table 2.

Table 2 GC-MS parameters for the determination of PAHs

Method validation

The analytical method was applied for homogeneity and stability studies and was validated in-house. The performance parameters evaluated included: selectivity, linearity and working range, detection (LOD) and quantification limits (LOQ), trueness, method repeatability and stability of the extracts. Each day of the validation, a method blank (no matrix) was prepared following the procedure described in “Sample treatment procedure” and analysed in order to check for any possible reagent or material cross-contamination.

Blank samples of each material were also checked for the presence/absence of any potential interfering peaks. A set of solvent standards at mass fraction levels of 50, 75, 100, 125 and 150 μg/kg of each PAH, and 100 μg/kg of each internal standard, was prepared every day and injected five times along the measurement sequence. The quantification of analytes in the samples was carried out with solvent standards according to Eq. (1). A relative response factor (RRF) was applied as average value of the RRFs, obtained from the three calibration data sets for each concentration level (5 concentration levels × 5 instrumental replicates × 3 days of the validation study) according to Eq. (2).

$$ {w}_{{\mathrm{PAH}}_x}=\frac{A_{{\mathrm{PAH}}_x}*{w}_{{\mathrm{d}\hbox{-} \mathrm{P}\mathrm{A}\mathrm{H}}_x}*{m}_{{\mathrm{d}\hbox{-} \mathrm{P}\mathrm{A}\mathrm{H}}_x}}{A_{{\mathrm{d}\hbox{-} \mathrm{P}\mathrm{A}\mathrm{H}}_x}*\mathrm{R}\mathrm{R}\mathrm{F}*{m}_{\mathrm{sample}}} $$
(1)
$$ \mathrm{R}\mathrm{R}\mathrm{F}=\frac{A_{{\mathrm{PAH}}_x}*{w}_{{\mathrm{d}\hbox{-} \mathrm{P}\mathrm{A}\mathrm{H}}_x}}{A_{{\mathrm{d}\hbox{-} \mathrm{P}\mathrm{A}\mathrm{H}}_x}*{w}_{{\mathrm{PAH}}_x}} $$
(2)

Where \( {w}_{{\mathrm{PAH}}_x} \) is mass fraction of the PAH x (μg/kg); \( {A}_{{\mathrm{PAH}}_x} \) is area of the PAH x ; \( {A}_{{\mathrm{d}\hbox{-} \mathrm{P}\mathrm{A}\mathrm{H}}_x} \) is area of the deuterated PAH x ; \( {w}_{{\mathrm{d}\hbox{-} \mathrm{P}\mathrm{A}\mathrm{H}}_x} \) is mass fraction of deuterated PAH x (μg/kg); \( {m}_{{\mathrm{d}\hbox{-} \mathrm{P}\mathrm{A}\mathrm{H}}_x} \) is mass of the deuterated mix solution added to the vial (g); and m sample is mass of sample taken for analysis (g).

In order to study method linearity and working range for each analyte, calibration curves were constructed by plotting the peak area ratio (native/isotopically labelled) versus the mass fraction ratios (native/isotopically labelled), obtained from standard solutions. Linearity was firstly evaluated by visual inspection of the plotted data (area ratio and mass fraction ratio of analytes and isotopically labelled standards), and then by the residual plots and calculation of the correlation coefficient (r). Linearity was also assessed through the RSD of the average RRFs obtained each day. In addition, p values for the lack-of-fit test at 95 % confidence interval (α = 0.05) were obtained using Statgraphics 5.0. Software [24]. LOD and LOQ were estimated through the calibration data according to the German Standard DIN 32645 [25]. Trueness was assessed by spiking experiments at two different mass fraction levels, i.e. 1 and 1.5 μg/kg for frozen and autoclaved materials and five times higher mass fraction for the freeze-dried material (5 and 7.5 μg/kg) due to the pre-concentration because of water loss during freeze drying. For this purpose, three different blank samples of each material were spiked at each concentration level, processed according to the abovementioned sample preparation procedure and injected twice. Then, the recovery values were calculated by comparing both the measured and the theoretical mass fractions of PAHs. The repeatability of the method was assessed for each material type by analysing six samples per material in duplicate (n = 12). Each subsample was injected twice on the GC. The stability of the extracts was investigated by injecting spiked samples (at 1 μg/kg) at the day of preparation and re-analysing them after 2 weeks. During that period, the extracts were kept in closed (capped) vials and stored at +4 °C in darkness.

Homogeneity studies

Equivalence between all the bottles produced in a batch is an essential requirement for any reference material (RM). Therefore, RM producers are required to quantify the between-bottle heterogeneity of the certified parameters [19]. For each material, eight bottles were chosen using a random stratified sample selection scheme, thus ensuring that the complete batch was covered. Three independent replicate determinations per sample were performed for their PAH content by applying the in-house-validated analytical methodology. The measurements were carried out under repeatability conditions (in one analytical run) and in a random order to allow to separate analytical drift and a possible trend in the filling sequence. The data were evaluated using analysis of variance (ANOVA) via the SoftCRM software [26]. Linear regression analyses were performed for all the PAH results to detect any potential trends in the filling sequence. Grubbs tests at 99 % confidence levels were employed to identify potential outliers among individual replicate results or bottle averages.

The distribution of results was checked using normal probability plots and histograms. Finally, ANOVA was performed to calculate the within-bottle standard deviation (s wb) and the between-bottle standard deviation (s bb). As s wb and s bb are estimates of the true standard deviations subject to random fluctuations, the mean square between groups (MSbetween) could be smaller than the mean squares within groups (MSwithin). In these cases, s bb cannot be calculated, and the maximum heterogeneity that could be hidden by method repeatability (u * bb), is an alternative to estimate the uncertainty contribution due to possible heterogeneity. u * bb was estimated as described elsewhere [27].

Stability studies

Stability testing is necessary to (i) establish suitable conditions for material dispatch to the end user (short-term stability study) and to (ii) establish suitable conditions for the storage of a CRM (long-term stability study). Knowledge of the stability during transport, covering the possible worst-case scenarios, is necessary in order to ensure that the potential instability of the material during dispatch is negligible [19].

Stability studies were conducted following isochronous schemes [28]. The samples were moved to a reference temperature after different periods of time at the temperatures evaluated. The design allows the simultaneous analysis of the samples at various exposure times, i.e. under repeatability conditions, thus greatly improving the sensitivity of the study to detect degradation and increasing the significance of the result because of the reduction of the analytical variation. For both studies (short- and long-term), six independent sub-samples (2 bottles × 3 replicates) were analysed for each time/temperature point. Storage temperatures for the short-term stability study were −20, 4 and 18 °C (FR batch) and 4, 18 and 60 °C (AC and FD batches). For the long-term stability study, temperatures were −20 °C (FR batch) and 4 and 18 °C (AC and FD batches). The reference temperature was −70 °C for the frozen batch and −20 °C for the autoclaved and freeze-dried batches. Storage times were 0, 1, 2 and 4 weeks (short-term stability study) and 0, 6, 12 and 18 months (long-term stability study). Measurements were carried out by applying the in-house-validated analytical procedure. The results were grouped and evaluated per time point and temperature and screened for outliers. Linear regression analysis as a function of time was performed to check for significant trends which may indicate degradation of the analytes. The slopes were tested for significance using a t test with t α, df being the critical t value (two-tailed) for a significance level α = 0.01 (99 % confidence interval). The uncertainties associated to short-term stability (u sts) and long-term stability (u lts) were estimated according to the following equation [29], for periods of 4 weeks (short-term study) and 18 months (long-term study), respectively:

$$ {u}_{\mathrm{rel}\ \left(\mathrm{s}\mathrm{t}\mathrm{s}\kern0.5em \mathrm{or}\ \mathrm{l}\mathrm{t}\mathrm{s}\right)}=\frac{{\mathrm{RSD}}_{\mathrm{stab}}}{\sqrt{{\displaystyle \sum \Big({x}_{\mathrm{i}} -}\overline{x}\Big){}^2}}\ast {t}_{\mathrm{sl}} $$
(3)

where RSDstab is the relative standard deviation of all results of the stability study, x i is the time point for each replicate, \( \overline{x} \) is the average of all time points of the study and t sl is the shelf life.

Results and discussion

Characteristics of the materials produced

Freeze-dried material

The average water content in the final bottles of the FD material was 6.1 ± 0.7 % (m/m) as determined using volumetric Karl Fischer titration system (Metrohm, Herisau, CH). For a dry mass determination of the materials, the water content must be checked on a separate portion of the sample. The top particle size (X 90) of the resulting powder was <400 μm as determined by laser diffraction (Symptec Helos Clausthal-Zellerfeld, DE). The fat content in the dry material is close to 15 % (as declared by the manufacturer considering the dry mass in the material). The mass loss in the freeze-dryer corresponds to about 81 % (m/m) which basically corresponds to the water content in the commercially available baby food.

Frozen material

The FR material had the characteristics of the commercially supplied material with respect to water content and particle size. The supplied material is essentially identical to routine samples with the exception of 9.5 h of very gentle mixing in a paste mixer with the addition of the spike in 29 mL acetonitrile. The fat content in the FR is 3.4 % as declared by the manufacturer. For both the FR and AC materials, it is essential that the water content is constant over a production series since any analyte should be certified as is.

Autoclaved material

The AC material is the same as the frozen material except for the final thermal treatment. Therefore, water content and particle size distribution are the same in both materials. The supplied material is identical to routine samples with the exception of 9.5 h of very gentle mixing in a paste mixer with the addition of the spike in 29 mL acetonitrile. Tests have shown that no mass loss took place in the autoclavation step. The fat content in the AC material is 3.4 % as declared by the manufacturer.

Performance of the analytical procedure

It was concluded during the development of the method that optimal results are obtained by the use of a DB-17HT column. This column provided sufficient separation of the target analytes. Closely eluting peaks such as CPP/Chr, BbF/BkF and IcdP/DahA were separated by a valley equal or minor to 20 % of the height of the smallest peak within the pair. The analytes were identified and confirmed by the following criteria: relative retention time (t R in the sample/t R in calibration solution) ≥0.99, presence of the quantification ion as well as one confirmation ion (Table 2) and relative qualifying ion abundance in sample within ±15 % of the analogous compound in the calibration solution. The criteria were met for all analytes down to a level comparable with the estimated LODs, except for 5-MC, for which an interference was observed, possibly because of the presence and co-elution of another MC isomer. Thus, the validation criterion regarding selectivity was not entirely fulfilled for this compound, and therefore it was not considered for subsequent studies. Figure 2 shows typical chromatograms from processed FR, AC and FD samples.

Fig. 2
figure 2

GC-MS chromatogram corresponding to frozen, autoclaved and freeze-dried materials. Identification of peaks: 1, BcF; 2, BaA; 3, Chr; 4, CPP; 5, 5-MC; 6, BbF; 7, BkF; 8, BjF; 9, BaP; 10, IcdP; 11, DahA; 12, BghiP; 13, DalP; 14, DaeP; 15, DaiP; 16, DahP

Calibration curves were obtained as described in “Method validation”. The statistical parameters calculated from least-square regression are presented in the Electronic supplementary material (ESM) in Table S1. The correlation coefficients (r) were >0.99, and the p values for the lack-of-fit test (α = 0.05) were larger than 0.05. In addition, the average RSD of RRFs obtained each day for each calibration level was lower than 5 % for all the analytes. Thus, the linearity of the analyte responses is confirmed over the tested range. LOD and LOQ values were extrapolated from the calibration curves to calculate absolute mass fraction values in a sample: 10 g wet sample for AC and FR materials and 2 g dry sample for the FD material. Blank samples of each material were spiked at these levels and analysed in order to verify that PAHs could easily be detected, identified and quantified at these mass fraction levels. As shown in ESM Table S1, LOQs for all considered PAHs were in the range 0.1–0.9 μg/kg for the AC and FR materials. In the case of the FD material, higher values, in the range 0.6–4.5 μg/kg, where obtained because of the water loss and pre-mass fraction after the freeze-drying process. The repeatability of the method ranged from 1.6 to 11.7 % in the three materials. Table 3 summarises the results obtained from recovery studies carried out at two mass fraction levels of each individual PAH in the three different materials. Satisfactory results in terms of recovery and RSD were obtained in most cases, although some differences were found depending on the material, especially for dibenzopyrenes. These high molecular mass PAHs (m/z 302) normally yield low analytical signal intensities [17]. For FR material, recoveries in the range 86–101 % with RSDs equal or lower than 9.0 % were obtained. Exceptions were CPR and CHR with recoveries of 112 % at 1 μg/kg and 113 and 108 % at 1.5 μg/kg, respectively. However, the excellent RSD obtained for his compounds in FR material (≤5.6 %) suggest that the method is suitable for the current CRM production feasibility study. In the case of the AC material, recoveries were in the range 83–101 % for all analytes, except for CPP and Chr for which a higher value (112 and 109 % respectively at 1 μg/kg; 113 and 117 %, respectively at 1.5 μg/kg) were obtained. Similarly than for the FR material, the RSD for these two compounds were considered good enough for the present study (≤6.1 %). RSDs for the rest of the compounds in the AC material were better than 9.8 %, except for DahP (12.7 and 16.7 % at 1 and 1.5 μg/kg concentration levels, respectively). As an exception, poor results were obtained for BghiP at 1 μg/kg mass fraction level (recovery of 113 % with an RSD of 51 %); however, they were assumed to be caused by experimental systematic errors as good results were obtained for the same compound in the same material at 1.5 μg/kg (recovery of 93 %, RSD 1.6 %). The FD material showed recoveries in the range 90–109 % for all the PAHs with some exceptions. For DalP and DaeP at 5 μg/kg mass fraction level, a yield higher than 120 % was found, while for DahP, it was 83 % at 5 μg/kg and 65 % at 7.5 μg/kg mass fraction levels (these results could be explained by the relatively higher RSD obtained for this compound). As mentioned before, the low analytical signal yielded by these late eluting compounds with high molecular mass, have a major influence on these results as they are peaks difficult to integrate. Recoveries of 117 and 110 % were obtained for Chr at low and high concentration level, respectively. However, good RSD (≤9.5 %) was obtained for this compound at both concentration levels, and therefore the method is assumed to provide reliable information for the purpose of the present study. Regarding the RSDs obtained for the rest of PAHs in the FD material, they were in the range 1.6–12.8 %, except for dibenzopyrene isomers (late eluting heavy compounds) with RSD varying from 14.2 to 22.8 %. To complete the studies, the stability of extracts was tested over a period of 4 weeks, showing no signs of degradation. Between the injections, the extracts were kept at +4 °C in the dark.

Table 3 Mean recoveries (%) and RSD (%, n = 6) obtained for the 15 + 1 EU priority PAHs at two different mass fraction levels in the three treated materials

Homogeneity

The evaluation of the data for the homogeneity study (not shown) revealed that for all considered analytes, both the individual results and bottle means showed unimodal and, in most cases, normal distribution. A small number of bottle mean outliers (maximum one per compound) were identified for the different processed materials. As no technical reason could be identified for the occurrence of these outliers, all data were retained for statistical analysis in order to provide a conservative estimate of inhomogeneity for the materials.

A regression analysis was performed to detect possible trends regarding the filling or the analytical sequence. With the exception of BaA in the frozen matrix, no statistically significant trend was detected for PAHs in the filling sequence of neither FD, FR nor AC material. The degree of possible heterogeneity in each of the materials was evaluated through the estimation of uncertainties as described earlier. The outcome, summarised in ESM Table S2, does not show any major influence of the processing strategy concerning the homogeneity of the material, as most of the u bb values are around or below 5 %. Regarding the variability within-bottle (s wb), values close to or even above 10 % were obtained in some cases. To reduce further this variability, an improved homogenisation step during processing or an increase of the sample intake during analysis could be envisaged prior to processing of a candidate reference material. Despite this effect, the overall results show the suitability of the three processing strategies tested for the preparation of sufficiently homogeneous candidate materials for PAHs in baby food.

Stability

Analytical data obtained for each temperature and time scheme of the isochronous stability studies (results not shown) were evaluated as described earlier. Some outliers were found but retained since no technical reason for their exclusion could be justified. The slope of the regression line of the PAH mass fraction vs. time was evaluated for statistical significance in order to detect any trend of instability in the material. Table S3 in the ESM summarises the evaluation of the short-term stability study for the three materials. None of the analyte/matrix combinations investigated showed a statistically significant trend at low or medium temperatures (4 and 18 °C for the AC and FR materials and −20 and 4 °C for the FR material). For the AC material at 4 °C, BjF data showed a significant slope (α = 0.01), which was possibly an analytical artefact as no significance was observed at 18 °C or even at 60 °C. A significant slope was obtained for BkF at 60 °C in the AT material and Chr in the FD material. From the short-term stability study, it can be concluded that AC and FD materials should be shipped with cooling elements and the FR material in dry ice.

The slopes and statistical parameters associated with the long-term stability study are presented in Table 4. In this case, the AC material tested at 4 and 18 °C for 18 months, showed degradation only for DahP at both temperatures (α = 0.01). A significant slope was detected for DahA at 18 °C, indicating losses for this analyte when stored under these conditions. For the FD material, five PAHs (CPP, BaP, DalP, DaeP, DahP) showed a statistically significant slope at 18 °C. Furthermore, DahP exhibited instability as well at 4 °C. In the case of BbF, although showing a significant slope at 4 °C, results suggested stability of this compound when the FD material is stored at 18 °C. None of the analytes showed significant trend in the frozen material at −20 °C (α = 0.01). In order to provide a more general picture about the stability of the analytes between the different materials, data were re-evaluated at 95 % of confidence levels. No significant changes were found compared with results at 99 % confidence interval except in very few cases. BjF and DaiP exhibited significant trend in the AC material at 4 °C (α = 0.05). However, results for BjF proved stability for this compound when the AC material is stored at 18 °C. On the other hand, DaiP showed instability at 18 °C in the AC material, similarly to the results obtained when data were scrutinised at 99 % confidence level. In the case of the FD material, the only difference found when data were evaluated at 95 % confidence level concerned only DaiP, which showed significant trends at both temperatures (4 and 18 °C), therefore these results show losses of this compound in the FD material. No differences were found for FR at −20 °C when results for the LTS study were scrutinised at 95 % confidence interval and were compared with those scrutinised at 99 % confidence interval. For the sake of comparison, the approximate shelf lives for each considered PAH in the FR, AC and FD materials were predicted for a target uncertainty of 5 %. As can be seen in Fig. 3, for the majority of the analytes, adequate stability was obtained for the freeze-dried material stored at 4 °C. For a CRM producer, higher temperatures (ideally above zero) facilitate the storage and distribution of the material. However, freeze-dried materials are provided in a physical form that requires reconstitution and which could potentially change the properties of the matrix after reconstitution.

Table 4 Linear regression and statistical parameters associated to long-term stability
Fig. 3
figure 3

Shelf lives (months) predicted for a target uncertainty of 5 % for each considered PAH in the frozen, autoclaved and freeze-dried materials

One purpose of this study was to be able to prepare a sufficiently stable material that can be stored at a temperature above 0 °C. It would also be advantageous if the material could be supplied in a physical form as close as possible to routine samples. Results shown in Fig. 3 for the AC material stored at 4 °C indicate a notable improvement on the stability for a number of compounds, compared with the other two processing techniques. Particularly relevant is the good stability of BaP and BbF, both compounds are included in the EU legislation for PAHs in foodstuff [12]. On the other hand, the stability of the lighter compounds, e.g. BcF, BaA, CPP and Chr, is reduced, or in the best case similar in comparison with the other two processing techniques.

Conclusions

During the feasibility study for the production of a CRM for 15 + 1 EU PAHs in baby food, the potential of autoclaving has been investigated and compared with two processing techniques classically employed for the production of food CRMs, namely freezing and freeze drying. Analyses performed with a GC-MS method developed and validated in-house, indicated an acceptable level of heterogeneity for the PAHs investigated in each of the three materials. 5-MC was excluded from the studies as the validation criterion regarding the selectivity was not completely fulfilled for this compound. At storage temperatures of 4 °C (for autoclaved and freeze-dried) and at −20 °C (for the frozen material), PAH degradation was not statistically significant for a period of 18 months, with a few exceptions. However, the extrapolation of the results for the estimation of the shelf life for the baby food preparations indicate an increased stability of the PAHs in the autoclaved material compared with the freeze-dried material, with the exception of the lighter PAHs (BcF, BaA, CPP and Chr). The stability of these compounds is negatively influenced by the autoclaving process. Both autoclaved and freeze-dried materials exhibit increased stability in comparison with the frozen material with exception of the two lightest compounds. In summary, freeze drying and autoclaving were found to be suitable processing techniques for the production of a candidate CRM for PAHs in baby food. For the lightest PAHs, autoclaving conditions would require further investigation for improved results. Finally, it is important to realise that an autoclaved wet paste resembles routine samples to a much higher degree than a freeze-dried powder. Therefore, processing methodologies involving thermal sterilisation are very promising for the development of a matrix reference material for PAHs in a processed food matrix similar to routine samples.