PFΔScreen — an open-source tool for automated PFAS feature prioritization in non-target HRMS data

Zweigle, Jonathan; Bugsel, Boris; Fabregat-Palau, Joel; Zwiener, Christian

doi:10.1007/s00216-023-05070-2

PFΔScreen — an open-source tool for automated PFAS feature prioritization in non-target HRMS data

Paper in Forefront
Open access
Published: 30 November 2023

Volume 416, pages 349–362, (2024)
Cite this article

Download PDF

You have full access to this open access article

Analytical and Bioanalytical Chemistry Aims and scope Submit manuscript

PFΔScreen — an open-source tool for automated PFAS feature prioritization in non-target HRMS data

Download PDF

Jonathan Zweigle ORCID: orcid.org/0000-0002-7194-1567¹,
Boris Bugsel¹,
Joel Fabregat-Palau² &
…
Christian Zwiener¹

2611 Accesses
1 Citation
6 Altmetric
Explore all metrics

Abstract

Per- and polyfluoroalkyl substances (PFAS) are a huge group of anthropogenic chemicals with unique properties that are used in countless products and applications. Due to the high stability of their C-F bonds, PFAS or their transformation products (TPs) are persistent in the environment, leading to ubiquitous detection in various samples worldwide. Since PFAS are industrial chemicals, the availability of authentic PFAS reference standards is limited, making non-target screening (NTS) approaches based on high-resolution mass spectrometry (HRMS) necessary for a more comprehensive characterization. NTS usually is a time-consuming process, since only a small fraction of the detected chemicals can be identified. Therefore, efficient prioritization of relevant HRMS signals is one of the most crucial steps. We developed PFΔScreen, a Python-based open-source tool with a simple graphical user interface (GUI) to perform efficient feature prioritization using several PFAS-specific techniques such as the highly promising MD/C-m/C approach, Kendrick mass defect analysis, diagnostic fragments (MS²), fragment mass differences (MS²), and suspect screening. Feature detection from vendor-independent MS raw data (mzML, data-dependent acquisition) is performed via pyOpenMS (or custom feature lists) with subsequent calculations for prioritization and identification of PFAS in both HPLC- and GC-HRMS data. The PFΔScreen workflow is presented on four PFAS-contaminated agricultural soil samples from south-western Germany. Over 15 classes of PFAS (more than 80 single compounds with several isomers) could be identified, including four novel classes, potentially TPs of the precursors fluorotelomer mercapto alkyl phosphates (FTMAPs). PFΔScreen can be used within the Python environment and is easily automatically installable and executable on Windows. Its source code is freely available on GitHub (https://github.com/JonZwe/PFAScreen).

Graphical abstract

Efficient PFAS prioritization in non-target HRMS data: systematic evaluation of the novel MD/C-m/C approach

Article Open access 24 February 2023

patRoon: open source software platform for environmental mass spectrometry based non-target screening

Article Open access 06 January 2021

In silico MS/MS spectra for identifying unknowns: a critical examination using CFM-ID algorithms and ENTACT mixture samples

Article Open access 22 January 2020

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Introduction

Per- and polyfluoroalkyl substances (PFAS) are a large group of anthropogenic chemicals characterized by containing multiple C-F bonds [1, 2]. Due to their unique properties, they are used in a wide array of daily products and different industrial applications [3]. Their high chemical resistance and water and oil repellency lead to the production of PFAS with a variety of different chemistries [4]. Due to the high stability of C-F bonds, the perfluoroalkyl chains of PFAS exhibit an intrinsic persistence that leads to a worldwide distribution of PFAS and their terminal transformation products (TPs) such as perfluoroalkyl acids (PFAAs) which were extensively produced and used in the past [5,6,7,8]. Nowadays, the number of known PFAS ranges from thousands to millions, depending on the definition and source of information. According to the updated OECD definition, all chemicals containing a CF₃ or isolated CF₂ group are considered PFAS, which has increased the number of PFAS considerably [9, 10]. Global regulatory efforts restricted the production of selected longer-chain PFAAs such as perfluorooctanoic acid (PFOA) and perfluorooctanesulfonic acid (PFOS) due to their persistence, bioaccumulation potential, and adverse effects on humans and the environment [11]. This resulted in the production of replacement compounds of rather similar persistence, increasing the number of different PFAS on the global market that are also eventually emitted into the environment [12]. Therefore, PFAS are considered to be regulated as a chemical class in the European Union in the future [13].

Several studies have shown that considerable fractions of organically bound fluorine (e.g., extractable organic fluorine) in environmental and human samples cannot be explained sufficiently by routinely analyzed PFAS (target screening), which usually include less than 50 analytes [14,15,16,17]. Since almost no fluorinated organic compounds occur naturally, unknown fractions of organically bound fluorine are clear indications of anthropogenic chemicals [18].

Due to the sheer number of different PFAS that transform into an even larger number of unknown TPs, a comprehensive use of authentic reference standards is usually not possible and most likely will not be soon [19, 20]. The fact that PFAS are industrial chemicals that often underlie the trade secrets even complicates the availability of standards. Therefore, non-target screening (NTS) based on high-resolution mass spectrometry (HRMS) is necessary for a more comprehensive characterization of PFAS [21, 22]. Several studies have shown that target analysis is insufficient to capture PFAS present in complex samples, which can easily result in the overlooking of important compounds even when present in high concentrations [23]. NTS approaches led to the identification of more than 750 novel PFAS in various samples in the past worldwide, showing their high relevance in analytical approaches [22, 24]. Since NTS is typically a time-consuming and often partially manual process, efficient prioritization techniques are needed to separate detected matrix components from the analytes of interest (often a data reduction from ~ 5000 detected compounds to 10–100 identified analytes or even less) [25].

The intrinsic properties of PFAS (with a certain fluorine percentage) allow the use of several techniques for their prioritization [21, 26]: The chemical mass defect (MD) of PFAS is typically lower (MD_F = − 0.0016 Da) than the one of hydrocarbons (MD_H = + 0.0078 Da) and has been used to remove detected features outside a predefined MD range (e.g., − 0.25 to + 0.1 Da) [27,28,29]. However, this range is not fixed, and depending on the structure, it is important to know that hydrocarbons of higher mass that exceed a MD of + 0.75 Da can also fall into the same range. Similarly, polyfluorinated PFAS with a high H content may bear a positive MD exceeding + 0.1 Da. Recently, a promising approach based on the MD normalized to the carbon number (MD/C) vs. the mass normalized to the carbon number (m/C) was proposed to separate PFAS much more efficiently from other hydrocarbon features in HRMS data which was further systematically evaluated for ~ 200,000 PFAS from chemical databases [26, 30]. The carbon number can be easily estimated for all HRMS features by using the relative abundance of the M+1 isotope (¹³C). PFAS have a much higher m/C when their mass is dominated by fluorine (e.g., m/C ~ 50), while hydrocarbons of similar mass are dominated by carbon (m/C ~ 14), allowing a convenient separation. Details on the MD/C-m/C approach are summarized in Zweigle et al. [26]. Especially, the m/C dimension can be used to remove large fractions of non-PFAS features when applied appropriately. This is illustrated in Fig. 1 with a 2D histogram of the MD/C-m/C locations of over 50,000 features from previous HRMS measurements of PFAS-contaminated soils and grease-repelling papers, where a clear separation of potentially highly fluorinated compounds is observed (region around m/C ≈ 40, MD/C = − 0.002). It is important to note, however, that the MD/C-m/C separation works better the higher the percentage of fluorine in a molecule is, with an accordingly higher F/C and a lower H/F ratio [26]. Like the MD, the MD/C-m/C approach cannot separate, for instance, hydrocarbons with one or two CF₃ groups from other hydrocarbons.

Besides the MD and MD/C-m/C approach, the Kendrick mass defect (KMD) analysis to detect homologous series of PFAS (e.g., with CF₂ or CF₂O as repeating units) is of great relevance since it allows the grouping of structurally related PFAS, simplifying their identification [27, 33]. In the MS² data, lists of PFAS-specific diagnostic fragments (DFs) as well as fragment mass differences and neutral losses can be used to prioritize fragmentation spectra [28, 31, 34]. These techniques are often combined with suspect screening by matching accurate mass (or further evidence) with PFAS lists [22, 35].

KMD, DFs, fragment mass differences, and especially suspect screening with large lists (e.g., PFASMASTER, gathering over 12,000 compounds [36]) in combination with complex samples (thousands of features) are prone to a high number of false-positive detections (depending on mass tolerance) that often need to be excluded manually, which is a time-consuming process. Even with extremely high mass resolution, naturally occurring compounds can still mimic certain PFAS-specific repeating units such as CF₂, complicating KMD analysis and making retention time shifts a necessary criterion [37]. Therefore, if the number of features can be preliminarily reduced by the MD/C-m/C approach before applying those techniques, a faster and more accurate NTS workflow can be performed, decreasing both computational and manual effort regarding the further inspection of the features. Although many of the above discussed PFAS-specific techniques for prioritization and identification are applied, they are often not performed in a systematic way using open-source software [38]. Therefore, it is important to combine the data processing in a more systematic step-wise procedure.

To facilitate the non-targeted screening of PFAS in complex samples, we developed PFΔScreen, an open-source Python-based software tool with a simple graphical user interface (GUI) that combines the discussed techniques to efficiently prioritize PFAS in LC- or GC-HRMS data acquired with electrospray (ESI) or atmospheric pressure chemical ionization (APCI). PFΔScreen can be applied vendor-independently either on mass spectrometric raw data (mzML, automated feature finding via pyOpenMS) or on custom feature lists (external feature finding by other software tools). The PFΔScreen workflow presented here is then applied to four PFAS-contaminated agricultural soil extracts from south-western Germany (Rastatt case [27, 39]), where several PFAS classes, including novel PFAS, were identified. The advantages of the combined workflow are discussed in detail. The source code is available via GitHub and can be easily automatically installed and executed via batch files on Windows within the Python environment. The Python source code can also be executed on other operating systems within the Python environment (without automatic installation).

Materials and methods

PFΔScreen workflow

PFΔScreen is a fully automated tool for detection and prioritization of potential PFAS features (LC- or GC-HRMS with ESI or APCI source) in raw mass spectrometric data written in Python (3.9.13) (Fig. 2). PFΔScreen is structured in several individual Python functions that are executed from one main file that allows data and parameter input via a simple GUI programmed with the tkinter library (Fig. S1). It can easily be automatically installed and executed on Windows using batch files. Detailed instructions on installation and functionality are provided in the SI. Input MS raw data can be converted vendor-independently from data-dependent acquisition (ddMS²) files into the mzML data format (.mzML) by using the MSConvert software from ProteoWizard [40, 41]. Only mzML files with centroided spectra and one collision energy (CE) should be used. If profile data was acquired and MS² spectra from several different CEs per precursor m/z are present, the peak picking (for centroiding) and subset functions (to keep only one desired CE) from MSConvert can be used to generate the correct mzML input files.

In the following, the three main functionalities of PFΔScreen are explained in the same order as they can be executed in the GUI (Fig. 2 and Fig. S1).

FeatureFinding

The first step usually performed in NTS is detection of features in the MS raw data characterized by chromatographic peak shapes of coeluting isotopes, resulting in a list of m/z, retention time (RT), and peak area. This task is performed with pyOpenMS, a Python interface to the C++ OpenMS library [42,43,44,45,46]. For feature detection, the FeatureFinderMetabo algorithm is used, which is designed for metabolites and small molecules [47,48,49]. Three parameters (mass error (ppm), intensity threshold, and an isotope model for more accurate detection of coeluting isotopologues) can be specified. The most important parameter is the intensity threshold, which is highly dependent on the instrument used, sample, and the underlying NTS question. After feature finding in the MS¹ data, MS² spectra can be aligned to their respective precursors by specifying an m/z and RT tolerance. Only one unique MS² spectrum with the highest precursor intensity is assigned to the respective MS¹ precursor.

With PFΔScreen, a single sample with a corresponding (optional) blank can be processed at a time. Blank correction is performed by setting an m/z and RT tolerance as well as a fold change with the desired increase of abundance in the sample compared to the blank. Features appearing in both sample and blank within the specified criteria are removed from the dataset. After preprocessing, the raw data is ready for specific PFAS prioritization. If feature finding by an external software is desired (e.g., vendor software), the following steps can also be performed by loading a feature table (.csv, that requires m/z, RT, and intensities of the [M] and [M+1] isotopes) into PFΔScreen without feature detection via OpenMS. However, the raw mzML files are still needed to assign MS² data to the features in the feature table (see SI). Besides pyOpenMS, the mass spectrometric Python library Pyteomics is used for selected calculations [50, 51].

PFASPrioritization

The PFAS prioritization workflow is intended in an iterative manner: after feature detection, the MD/C-m/C plot should firstly be manually inspected to determine reasonable boundaries to remove most of the detected features (e.g., ~ 90%) that cannot be PFAS due to their MD/C-m/C locations (depending on the underlying question). After determination of these cutoffs, the PFAS feature prioritization can be executed again focused on a subset of features, which will strongly decrease false positives in KMD analysis, fragment matching, and suspect screening where the respective parameters can be adjusted accordingly without a strong increase of wrong assignments. Since the execution time of PFΔScreen is usually below 1 min (e.g., for ~ 4000 spectra per sample), input parameters can easily be varied to test their influence on the outcome. After execution, a folder is generated named after the sample file where important results are saved, including a summary in an Excel sheet which is formatted as a table that can be easily inspected, sorted, and subset for a faster overview of the results as well as an additional CSV file that includes the same data (Fig. S2). Important plots are saved in the interactive HTML format which can easily be opened in any browser, allowing zooming and data inspection with interactive tooltips (Fig. S3).

In the workflow to prioritize features according to their likelihood of being PFAS, several pieces of evidence are calculated individually for all detected features in the first place. For all MS¹ features, the number of carbon atoms, MD, and both MD/C and m/C dimensions are determined. To detect homologues series (HS), the KMD (with a predefined repeating unit required; e.g., CF₂) is calculated and corresponding features belonging to a certain HS are aligned by providing a unique HS number (parameters: mass tolerance, minimum number of homologues).

For all MS² spectra, fragment mass differences are calculated comprehensively. Therefore, all fragment differences within each MS² spectrum are calculated and matched against a predefined list of PFAS typical mass differences (e.g., ΔCF₂, ΔC₂F₄, ΔHF, ΔC₁₀H₃F₁₇, more details can be found in [31]). This allows an efficient detection of fragments indicative for PFAS without prior knowledge on their actual mass [23, 31]. Furthermore, a list of typical PFAS diagnostic fragments (DFs, approximately 900 fragments) from literature are automatically matched with all fragmentation spectra (which is easily extendable) [52, 53]. Both negative and positive fragments are considered depending on the measurement polarity which can be specified in the GUI. The most important parameter is the MS² noise threshold, used to specify the lowest MS² intensity to be considered for DF, and mass difference matching. It is important to select a suitable instrument-specific threshold as a too low input value may result in a high number of false-positive annotations. Besides a mass tolerance for fragment matching, a minimal number of positive DFs or mass differences can be specified to flag a MS² spectrum as potential hit.

To enhance annotation in the MS², fragments that have a defined mass difference to another already annotated fragment (accurate mass match and therefore also a chemical formula) are also annotated by subtraction or addition of the respective mass difference (e.g., ΔC₂F₄) to an annotated chemical formula (e.g., C₁₂H₅F₁₂O₄S + ΔC₂F₄). This allows the calculation of unknown chemical formulas for fragment masses that are not present in the list of DFs (see Fig. S6).

In the third step, suspect screening by accurate mass match (with mass tolerance) can be performed. We used a custom PFAS suspect list format (.csv) which currently includes PFAS from the NIST suspect list [54]. This list which includes compound name, SMILES, chemical formula, and exact mass can easily be modified or extended with data from other suspect lists (e.g., from NORMAN or the CompTox Dashboard). For suspect screening, three adducts can be chosen which are [M–H]⁻ for negative polarity, and both [M+H]⁺ and [M]⁺ for positive polarity (compounds such as betaines present in various AFFF formulations are often detected as M⁺ ions) [55].

RawDataVisualization

After feature finding or the complete workflow, the MS raw data can be directly visualized via the PFΔScreen GUI (Fig. 2 and S4).

EIC extractor

Extracted ion chromatograms (EICs) can be generated by accurate m/z (e.g., from the Excel or CSV results file) and inspected in an external window. Several masses can be extracted together (comma separated) to investigate coelution or RT shifts. To verify the systematic RT shifts of detected HS, a repeating unit can be specified (e.g., CF₂) and n EICs are extracted at once (Fig. 2 and S4), allowing fast checking for reasonable peak shapes and elution order of suspected masses.

MS¹ extractor

To visualize single MS¹ spectra, a certain RT of interest can be specified. Theoretical isotope patterns of chemical formulas from suspect hits can then be plotted on top of the experimental MS¹ isotope pattern (Fig. 2 and S4).

MS² extractor

MS² spectra can also be directly accessed via the GUI by inputting the accurate m/z value. If DFs and fragment mass differences were detected, they are displayed within the respective MS² spectrum (Fig. S4).

EIC correlator

To detect potential in-source fragments (e.g., [M-HF]⁻) or adducts (e.g., [M+Br]⁻ or [M+Acetate]⁻) by coelution correlation, an m/z of interest can be specified and all detected features within a certain RT range are correlated (EICs) and only highly correlating ions can be visualized (e.g., correlation of R² > 0.95). This can greatly enhance understanding of ionization processes and helps to find related ions that were not grouped during feature detection (more detailed explanation in the “Results and discussion” section, Fig. 6 and S9).

Soil collection and extraction

To present the feature prioritization procedure via PFΔScreen, four different PFAS-contaminated composite agricultural topsoil samples from Rastatt (R1 and R2) and Mannheim (M1, M2) regions (Germany) were extracted and measured by HPLC-QTOF-MS (see sampling details and soil physicochemical properties in the SI (S3)). The R1, R2, S1, and S2 soil names correspond to soils B, A, D, and H from Röhler et al. [56], respectively. Agricultural fields in these regions were subjected to contaminated paper sludge in the past and found to be highly contaminated with several PFAS classes [27, 32, 56]. Information on all chemicals used can be found in SI (S4). Soil extraction was adapted from existing procedures [27]. Briefly, 5 g of dried soil (40 °C) was weighed in 50-mL polypropylene (PP) tubes and combined with 10 mL of methanol (MeOH). The suspension was sonicated for 1 h and overhead shaken for 16 h. After centrifugation (10 min @ 4000 rcf), the supernatant was transferred into a 20-mL glass vessel, and extraction was repeated. The combined extracts (20 mL) were evaporated under a gentle stream of N₂ until dryness at 40 °C and reconstituted in 1 mL of MeOH, sonicated for 10 min, and thoroughly vortexed for 1 min. In the last step, the enriched extract was filtered through a 0.2-µm regenerated cellulose syringe filter, transferred into PP HPLC vials, and stored in the fridge (4°C) until analysis. As quality control, an extraction blank following the identical extraction procedure but without adding any soil was prepared to account for background contamination.

LC-HRMS measurements and data acquisition

Soil extracts were analyzed with an Agilent 1260 Infinity HPLC system (Poroshell 120 EC-C₁₈ column; 2.1 mm × 100 mm; 2.7 µm particles at 40 °C) at a flow rate of 0.3 mL/min coupled to an Agilent 6550 QTOF-mass spectrometer. For compound separation, a 23-min gradient program was used (A: 95/5 H₂O/MeOH + 2 mM NH₄Ac; B: 5/95 H₂O/MeOH + 2 mM NH₄Ac) and both negative and positive measurements were performed (details in Table S1–S2). Data acquisition was performed in the data-dependent mode (ddMS²) using 3 scans/s (MS¹ range: m/z 100–1700 and MS² range m/z 70–1700) with a static exclusion list (resulting from prior MeOH blank injections) to avoid fragmentation of background signals. Furthermore, a rolling exclusion list was used to iteratively exclude previously triggered precursor masses from previous measurements (three injections) of the same sample to maximize the MS² coverage. The threshold for precursor selection was set to 1000 counts, and each precursor was excluded for 0.5 min after collection of three MS² spectra. For collision-induced dissociation, a linear m/z-dependent collision energy (CE) according to the following equation was used: \(\mathrm{CE(}\mathrm{m/z) = 3}\frac{\mathrm{m/z}}{100}+ \mathrm{15 eV}\). To prevent sample cross contamination, a threefold needle wash in MeOH was performed in-between each injection. Each measurement sequence included several blanks and quality controls (PFAS reference standard mixture) to monitor instrument drift.

Results and discussion

PFAS prioritization and identification with PFΔScreen are aimed to be performed in an iterative process. This means that the program is executed multiple times allowing parameter adjustment to generate reasonable results. PFΔScreen runtimes are usually below 1 min (e.g., for ~ 4000 spectra per sample) for the whole workflow. When changing specific input parameters (e.g., tolerances, thresholds, mass differences), their effect on the output can directly be observed. In this way, input parameters can be conveniently adjusted depending on end-user needs and sample types. After feature detection, blank correction, and a short inspection of the results, the data can be reduced by the MD/C-m/C approach by setting an appropriate m/C cutoff value. Subsequent KMD analysis, fragment mass differences, DF matching, and suspect screening then result in a detailed table of a manageable size.

To demonstrate the PFΔScreen workflow, it was applied here to four contaminated agricultural topsoils. The iterative identification process started with the soil extract of M1. After data preprocessing and application of prioritization techniques, the identified PFAS (including adducts and in-source fragments) were manually added to the suspect list, and the same workflow was applied to the next soil sample. In the following, the whole workflow starting from data reduction to final identification is discussed in detail.

Data preprocessing

After data-dependent acquisition (DDA), the raw MS data (.d files, Agilent) were converted into mzML with MSConvert [40]. For each soil, PFΔScreen was executed individually together with the extraction blank to remove background signals originating from both the extraction procedure and the HPLC system. The mass error for feature detection was set to 10 ppm, the MS¹ intensity threshold was set to 2000 counts and the metabolites (5% RMS) isotope model from OpenMS was used to exclude features with unusual peak shapes of isotopic traces. Peaks reported after feature detection have to have a full width at half maximum (FWHM) above 1 s and below 1 min, and at least two isotopic traces. MS² spectra were aligned with a mass tolerance of 5 mDa and an RT tolerance of 0.2 min (these tolerances can be verified by an interactive m/z vs. RT plot (Fig S3a)). Features detected in both sample and extraction blank that deviated by < 2 mDa at a RT difference of < 0.1 min and were not at least fivefold more abundant in the sample were removed. Exemplified on soil M1, 4209 features were detected, which were reduced to 3750 features after blank correction in the ESI⁻ mode. A total of 1026 out of 2450 acquired MS² spectra corresponded to detected features, from which 417 unique spectra remained (~ 11% MS² coverage in first iteration).

Data reduction by m/C and MD/C

After these feature preprocessing steps, the m/C and MD/C dimensions were used for data reduction. When looking at the MD/C-m/C plot of all soils together (containing more than 12,000 features), a clear separation of three groups of compounds can be observed (Fig. 3a). Most features were located below m/C 30, which are a wide variety of different hydrocarbon molecules. A theoretical molecule exclusively consisting of (CH₂)_n groups would be located at m/C = 14, while for the four soil extracts a clear peak distribution ranging from m/C ≈ 10–25 and reaching a maximum around m/C ≈ 16 was observed (Fig. 3c). The determination of the carbon number strongly depends on the peak picking algorithm, since it is based on robustly integrated EICs from the monoisotopic mass and its corresponding M+1 isotope (C ≈ I_M+1/I_M/0.011145). Therefore, a certain uncertainty should always be expected, which increases with decreasing ion abundance. Nonetheless, ~ 92% of all detected features are clearly located below m/C = 30 (e.g., humic substances) (Fig. 3c). Therefore, here a cutoff at m/C = 30 was chosen since PFAS that are dominated by fluorine usually have a higher m/C (e.g., m/C_{6:2 diPAP} ≈ 49; m/C_PFOA ≈ 51; m/C_6:2 FTAB ≈ 38). 6:2 FTAB is an AFFF constituent which already has a considerable fraction of hydrogen (C₁₅H₁₉F₁₃N₂O₄S) compared to other PFAS, while other organic compounds containing less fluorine (compared to hydrogen, high H/F ratio) such as the pharmaceutical fluoxetine with only three fluorine atoms (C₁₇H₁₈F₃NO, m/C ≈ 18) fall below the applied cutoff. Depending on the underlying NTS question, this cutoff can be adjusted accordingly. Attempting to remove further features, an MD/C cutoff of < + 0.003 was set, although as seen in Fig. 3a the m/C dimension was much more effective for data reduction. The MD/C-m/C approach was more efficient to reduce features compared to the MD, as shown in Fig. 3b and d. When applying a MD range from − 0.25 to + 0.1 Da, which would include 92% of the PFAS in the PFASOECDNA list (CompTox Dashboard [36, 57]), 17% of the features remained, while the combined m/C and MD/C cutoffs led to only 7.4% of remaining features. It is very important to note here that the number of features that strongly exceed a MD of + 0.5 is not negligible, since a conventional calculation of the MD would result in a negative MD (e.g., − 0.2 Da for a saturated hydrocarbon with 60 carbon atoms (H(CH₂)₆₀H), whereas the true MD would be + 0.8 Da). As can be seen from the carbon number, a considerable number of features has more than 60 carbon atoms (up to 80 carbons) which are in a PFAS typical MD range (Fig. 3b). Therefore, setting an appropriate m/C cutoff is highly recommended, since these features are easily removed by this additional criterion. Eventually, when combining both m/C and MD/C cutoffs, only 949 features (7.4%) remain in all four soils together. This is an appropriate number of features for further PFAS-specific calculations such as KMD analysis, DFs, fragment mass differences, and suspect screening. It should be noted in particular that due to the removal of ~ 90% of the initial features, the false-positive rate decreases drastically (especially with large lists) and allows adjustment of selected tolerances with smaller effect on false positives.

KMD analysis, fragment differences, DFs, and suspect screening

For further prioritization and tentative identification, repeating units representative for PFAS such as (CF₂)_n and CF₂O were applied to detect HS (mass tolerance was set at ± 2 mDa, with at least 3 homologues). Without any m/C cutoff, in soil M1, 74 (CF₂)_n-based HS were detected, likely including numerous false positives (Fig. 4a) evidenced by a random RT pattern (no RT shift in linked KMD m/z vs. RT plot). The KMD analysis in PFΔScreen is performed without checking the systematic RT shift, but the interactive KMD plot (HTML) allows a fast verification of RT shifts. Each HS can be highlighted individually by clicking on it, and the respective m/z vs. RT correlation is visualized (Fig. S5). Obviously, many hydrocarbon features were detected in the soil extract that are mimicking CF₂-repeating units, which is a common issue of complex matrices [31, 37]. These compounds have a higher CF₂-based KMD (e.g., 0.2 to 0.5, or lower if their MD strongly exceeds + 0.5 Da) compared to that of PFAS (Fig. 4a). If the combined MD/C-m/C cutoff is applied, the number of detected HS in soil M1 is reduced to 26 (~ 65% data reduction, see Fig. 4b) which confirms the utility of this approach.

For detection of fragment mass differences and DFs in the MS² data, preliminary ΔCF₂, ΔC₂F₄, ΔHF, and the list of DFs were used (later specific mass differences were searched). This resulted in the detection of 30 MS² spectra that contained the specified mass differences, and 47 spectra with DF hits out of a total number of 373 unique MS² spectra at a mass tolerance set to ± 2 mDa and an MS² intensity threshold of 2000 counts in the M1 soil extract (first iteration).

In the suspect screening process, the hits by accurate mass (tolerance of 4 mDa) were reduced from 217 to 176 by the MD/C-m/C cutoff in soil M1.

Manual identification process with the PFΔScreen results table

The verification and (partially manual) identification process of prioritized features from the PFΔScreen results table (Excel or CSV) was performed by sorting the table according to decreasing intensity, after removing features based on defined MD/C-m/C cutoffs. For soil M1, this resulted in a feature list with 305 potential compounds. Note that some features appear multiple times in the list due to structural isomerism, resulting in multiple features at multiple distinct RTs depending on the degree of separation and the peak finding algorithm. Each feature was verified manually for occurrence in the extraction blank and reasonable peak shape (until < 1% of the most abundant feature). Although a blank correction was performed, typical contaminations from the LC system with long tailing peaks can be integrated multiple times at different RTs. Therefore, they are not always correctly removed depending on the specified parameters. By using the RawDataVisualization tool of PFΔScreen, EICs of every m/z belonging to one HS (using the integrated HS extrapolator) can be verified for RT shift and peak shape, eventually resulting in identification of homologues with very low abundances that were missed in the feature finding process due to the MS¹ intensity threshold. The chemical formulas from suspect hits were used to check for reasonable isotope patterns with the RawDataVisualization of PFΔScreen. SMILES codes were used to verify at least one candidate per HS by an MS² spectrum.

In total, nine PFAS classes could be identified via PFΔScreen in the four soils that exhibited at least one suspect hit per HS or compound (Fig. 5). Perfluoroalkyl carboxylic acids (PFCAs, C₄–C₂₀), fluorotelomer alkyl phosphate diesters (diPAPs, 4:2/6:2–12:2/12:2), n:3 fluorotelomer carboxylic acids (FTCAs, 5:3–13:3), fluorotelomer sulfonic acids (FTSAs, 6:2–16:2), perfluorosulfonic acids (PFSAs, C₄–C₁₀), perfluorooctane sulfonamide (PFOSA), N-ethylperfluoro-1-octanesulfonamidoacetic acid (N-EtFOSAA), and N-ethyl perfluorooctane sulfonamide ethanol–based phosphate diester (diSAmPAP) were identified in all four soils. Different chain length distributions and abundances were observed (Fig. 5). diPAPs were detected as complex mixtures of several structural isomers depending on their chain length (e.g., 6:2/10:2 and 8:2/8:2, shown by MS/MS). Additionally, their EICs showed peaks at much later RTs corresponding to in-source fragments of triPAPs (Fig. S7). While all telomer-based PFAS were detected as linear chains, the PASF-based PFAS (PFSAs, N-EtFOSAA, PFOSA, and diSAmPAP) showed typical chromatographic peak shapes of mixtures of branched and linear isomers [58]. In these cases, the dominance of a C₈-based chemistry can be observed (see PFSAs in Fig. 5).

All four soils had a similar contamination pattern. However, for soils M1 and M2 (Mannheim region), another very abundant precursor class, namely FTMAPs, was detected (including isomeric profiles ranging 6:2/6:2 to 10:2/12:2), as well the previously identified TPs FTMAP-sulfoxides [31].

The 6:2 fluorotelomer mercapto alkyl phosphate esters (6:2/6:2 FTMAP) could be confirmed with an in-house synthesized reference standard, leading to identification levels of 1 for 6:2 FTMAP and 2a for the further homologues due to clear MS/MS evidence [59]. In general, all identified PFAS are in good agreement with previous studies including biotransformation that characterized other soil samples from both Rastatt and Mannheim [27, 31, 32, 39, 56].

The PFΔScreen results table also revealed several unknown HS that were detected but did not have an accurate mass match with the suspect list. Their identification with the help of the EIC correlator of PFΔScreen is discussed in the following.

EIC correlator: coelution correlation analysis for identification of unknowns

After identification of the PFAScreen results, there were several C₂F₄-based HS left without any hit in the suspect list. When looking at several MS¹ spectra of different homologues, many coeluting ions were observed, often characterized by HF losses and other mass differences (Fig. S8). This is an indication of in-source fragmentation of these classes [60, 61]. To be able to efficiently group corresponding in-source fragments and potential adduct ions together, the EIC correlator from the raw data visualization tools of PFΔScreen was used to correlate the EICs of suspected features (from a given HS) with the EICs of all detected features that coelute within a given RT range of ± 25 s. Strong correlation of EICs can be used to detect related ions and allows their isolation from other ions in consecutive MS¹ spectra without knowing their mass differences [62,63,64,65]. This is exemplified on the unknown m/z 966.9944 which is a member of a suspected HS. When correlating the EIC of m/z = 966.9944 with the EICs of all coeluting features within a RT range of 50 s, 12 out of 368 EICs correlated with an R² > 0.96 at an extraction width of 5 mDa (see Fig S9 for more details). The result is an MS¹ spectrum that only contains coeluting ions (correlation spectrum) of several in-source fragments and adducts (Fig. 6). Since well-known mass differences such as ΔC₂F₄ and ΔHF were found in this MS¹ spectrum, a telomer-based PFAS with potentially two telomer chains (e.g., 6:2/8:2) was suspected [31]. When looking at the mass differences of detected coeluting ions, [M+Cl]⁻, [M+Br]⁻, and [M+Ac]⁻ adducts and several other in-source fragments could be observed. The detection of [M+Cl]⁻ and [M+Br]⁻ ions was of great importance since they allowed the determination of [M] rather easily which then also allowed the identification of other adducts and the molecular formula. The m/z = 966.9944 (in-source fragment) corresponds to a FTMAP-related substance, which was tagged FTMA-diol-sulfone-sulfoxide or FTMA-diol-O₃ (see Figs. 5 and 6). With this correlation technique, several tens of unknown HS could be grouped into four novel FTMAP-related compound classes (Fig. 6). They were identified with one oxygen (sulfoxide) and up to 4 oxygens (disulfone) and to the best of our knowledge not reported in literature before. They could be microbial or photochemical FTMAP TPs and close the unknown gap in a previous FTMAP-related transformation study [66], or they could be used intentionally or as side-products in PFAS-coated papers that contaminate these soils. These kinds of correlation spectra made identification possible since the MS² spectra of the adducts ([M-H]⁻ ions of the FTMA-diols were not detected at all which makes sense with ESI) barely formed useful fragments except for Br⁻ which made them hard to interpret. The use of in-source fragments for identification has the advantage that isotope patterns are available for all ions (features), which is often not the case in MS² spectra depending on the isolation width of the precursor ion. All these FTMAP-related substances form multiple in-source fragments (and adducts), all could be confirmed with rather high confidence (identification level of 2b). They all could be grouped by C₂F₄- and O-based KMD (for O-KMD, see Fig. S10) with systematic RTshifts, besides eluting at higher RT than FTMAPs due to their lower polarity attributed to the loss of the phosphoric acid group.

Conclusions

PFΔScreen can be used efficiently for prioritizing features in both LC- and GC-HRMS (ESI and APCI) raw data in all kinds of samples independent of the vendor of the mass spectrometer used. Especially, the MD/C-m/C approach is a powerful tool to drastically decrease the number of features and thus reduce false-positive assignments, overcoming a common issue during NTS. Due to the short computational time of PFΔScreen (less than 1 min for 4000 spectra), input parameters can be conveniently adjusted depending on the tested sample, instrument used, and end-user needs. Since the number of unknown PFAS in complex environmental and technical samples is still unknown, NTS approaches that combine several data reduction techniques for an efficient workflow are of importance to comprehensively elucidate the identity occurrence and fate of organic pollutants such as PFAS.

Code availability and license

The Python source code of PFΔScreen is available on GitHub (https://github.com/JonZwe/PFAScreen) together with example files. It is published under the LGPL-2.1 license.

References

Evich MG, Davis MJB, McCord JP, Acrey B, Awkerman JA, Knappe DRU, Lindstrom AB, Speth TF, Tebes-Stevens C, Strynar MJ, Wang Z, Weber EJ, Henderson WM, Washington JW. Per- and polyfluoroalkyl substances in the environment. Science 2022;375(6580):eabg9065. https://doi.org/10.1126/science.abg9065.
Lindstrom AB, Strynar MJ, Libelo EL. Polyfluorinated compounds: past, present, and future. Environ Sci Technol. 2011;45(19):7954–61. https://doi.org/10.1021/es2011622.
Article PubMed CAS Google Scholar
Ng C, Cousins IT, DeWitt JC, Glüge J, Goldenman G, Herzke D, Lohmann R, Miller M, Patton S, Scheringer M, Trier X, Wang Z. Addressing urgent questions for PFAS in the 21st century. Environ Sci Technol. 2021. https://doi.org/10.1021/acs.est.1c03386.
Article PubMed PubMed Central Google Scholar
Glüge J, Scheringer M, Cousins IT, DeWitt JC, Goldenman G, Herzke D, Lohmann R, Ng CA, Trier X, Wang Z. An overview of the uses of per- and polyfluoroalkyl substances (PFAS). Environ Sci Process Impacts. 2020;22(12):2345–73. https://doi.org/10.1039/d0em00291g.
Article PubMed PubMed Central CAS Google Scholar
Cousins IT, DeWitt JC, Glüge J, Goldenman G, Herzke D, Lohmann R, Ng CA, Scheringer M, Wang Z. The high persistence of PFAS is sufficient for their management as a chemical class. Environ Sci Process Impacts. 2020;22(12):2307–12. https://doi.org/10.1039/d0em00355g.
Article PubMed PubMed Central CAS Google Scholar
Wang Z, Cousins IT, Scheringer M, Buck RC, Hungerbuhler K. Global emission inventories for C4–C14 perfluoroalkyl carboxylic acid (PFCA) homologues from 1951 to 2030, Part I: production and emissions from quantifiable sources. Environ Int. 2014;70:62–75. https://doi.org/10.1016/j.envint.2014.04.013.
Article PubMed CAS Google Scholar
Wang Z, Cousins IT, Scheringer M, Buck RC, Hungerbuhler K. Global emission inventories for C4–C14 perfluoroalkyl carboxylic acid (PFCA) homologues from 1951 to 2030, part II: the remaining pieces of the puzzle. Environ Int. 2014;69:166–76. https://doi.org/10.1016/j.envint.2014.04.006.
Article PubMed CAS Google Scholar
Cousins IT, Johansson JH, Salter ME, Sha B, Scheringer M. Outside the safe operating space of a new planetary boundary for per- and polyfluoroalkyl substances (PFAS). Environ Sci Technol. 2022;56(16):11172–9. https://doi.org/10.1021/acs.est.2c02765.
Article PubMed PubMed Central CAS Google Scholar
Wang Z, Buser AM, Cousins IT, Demattio S, Drost W, Johansson O, Ohno K, Patlewicz G, Richard AM, Walker GW, White GS, Leinala E. A new OECD definition for per- and polyfluoroalkyl substances. Environ Sci Technol. 2021;55(23):15575–8. https://doi.org/10.1021/acs.est.1c06896.
Article PubMed CAS Google Scholar
Schymanski EL, Zhang J, Thiessen PA, Chirsir P, Kondic T, Bolton EE. Per- and polyfluoroalkyl substances (PFAS) in PubChem: 7 million and growing. Environ Sci Technol. 2023;57(44):16918–28. https://doi.org/10.1021/acs.est.3c04855.
Article PubMed PubMed Central CAS Google Scholar
Stockholm Convention. The new POPs under the Stockholm Convention. 2022. http://www.pops.int/TheConvention/ThePOPs/TheNewPOPs/tabid/2511/Default.aspx. Accessed 21.03.2023
Kwiatkowski CF, Andrews DQ, Birnbaum LS, Bruton TA, DeWitt JC, Knappe DRU, Maffini MV, Miller MF, Pelch KE, Reade A, Soehl A, Trier X, Venier M, Wagner CC, Wang Z, Blum A. Scientific basis for managing PFAS as a chemical class. Environ Sci Technol Lett. 2020;7(8):532–43. https://doi.org/10.1021/acs.estlett.0c00255.
Article PubMed PubMed Central CAS Google Scholar
ECHA. ECHA publishes PFAS restriction proposal. 2023. https://echa.europa.eu/de/-/echa-publishes-pfas-restriction-proposal. Accessed 21.09.2023
Aro R, Carlsson P, Vogelsang C, Karrman A, Yeung LW. Fluorine mass balance analysis of selected environmental samples from Norway. Chemosphere. 2021;283: 131200. https://doi.org/10.1016/j.chemosphere.2021.131200.
Article PubMed CAS Google Scholar
Aro R, Eriksson U, Kärrman A, Chen F, Wang T, Yeung LWY. Fluorine mass balance analysis of effluent and sludge from Nordic countries. ACS ES&T Water. 2021;1(9):2087–96. https://doi.org/10.1021/acsestwater.1c00168.
Article CAS Google Scholar
Simon F, Gehrenkemper L, Becher S, Dierkes G, Langhammer N, Cossmer A, von der Au M, Gockener B, Fliedner A, Rudel H, Koschorreck J, Meermann B. Quantification and characterization of PFASs in suspended particulate matter (SPM) of German rivers using EOF, dTOPA, (non-)target HRMS. Sci Total Environ. 2023;885: 163753. https://doi.org/10.1016/j.scitotenv.2023.163753.
Article PubMed CAS Google Scholar
Koch A, Aro R, Wang T, Yeung LWY. Towards a comprehensive analytical workflow for the chemical characterisation of organofluorine in consumer products and environmental samples. Trac-Trend Anal Chem. 2020;123: 115423. https://doi.org/10.1016/j.trac.2019.02.024.
Article CAS Google Scholar
Aro R, Eriksson U, Karrman A, Yeung LWY. Organofluorine mass balance analysis of whole blood samples in relation to gender and age. Environ Sci Technol. 2021;55(19):13142–51. https://doi.org/10.1021/acs.est.1c04031.
Article PubMed PubMed Central CAS Google Scholar
Ruan T, Jiang G. Analytical methodology for identification of novel per- and polyfluoroalkyl substances in the environment. TrAC Trends Anal Chem. 2017;95:122–31. https://doi.org/10.1016/j.trac.2017.07.024.
Article CAS Google Scholar
Jia S, Marques Dos Santos M, Li C, Snyder SA. Recent advances in mass spectrometry analytical techniques for per- and polyfluoroalkyl substances (PFAS). Anal Bioanal Chem. 2022;414(9):2795–807. https://doi.org/10.1007/s00216-022-03905-y.
Article PubMed CAS Google Scholar
Strynar M, McCord J, Newton S, Washington J, Barzen-Hanson K, Trier X, Liu Y, Dimzon IK, Bugsel B, Zwiener C, Munoz G. Practical application guide for the discovery of novel PFAS in environmental samples using high resolution mass spectrometry. J Expo Sci Environ Epidemiol. 2023. https://doi.org/10.1038/s41370-023-00578-2.
Article PubMed Google Scholar
Joerss H, Menger F. The complex ‘PFAS world’ - how recent discoveries and novel screening tools reinforce existing concerns. Curr Opin Green Sustain Chem. 2023. https://doi.org/10.1016/j.cogsc.2023.100775.
Article Google Scholar
Zweigle J, Bugsel B, Röhler K, Haluska AA, Zwiener C. PFAS-contaminated soil site in germany: nontarget screening before and after direct TOP assay by Kendrick mass defect and FindPFΔS. Environ Sci Technol. 2023;57(16):6647–55. https://doi.org/10.1021/acs.est.2c07969.
Article PubMed CAS Google Scholar
Liu Y, D'Agostino LA, Qu G, Jiang G, Martin JW. High-resolution mass spectrometry (HRMS) methods for nontarget discovery and characterization of poly- and per-fluoroalkyl substances (PFASs) in environmental and human samples. TrAC Trends Anal Chem. 2019;121. https://doi.org/10.1016/j.trac.2019.02.021.
Hulleman T, Turkina V, O’Brien JW, Chojnacka A, Thomas KV, Samanipour S. Critical assessment of the chemical space covered by LC-HRMS non-targeted analysis. Environ Sci Technol. 2023. https://doi.org/10.1021/acs.est.3c03606.
Article PubMed PubMed Central Google Scholar
Zweigle J, Bugsel B, Zwiener C. Efficient PFAS prioritization in non-target HRMS data: systematic evaluation of the novel MD/C-m/C approach. Anal Bioanal Chem. 2023. https://doi.org/10.1007/s00216-023-04601-1.
Article PubMed PubMed Central Google Scholar
Bugsel B, Zwiener C. LC-MS screening of poly- and perfluoroalkyl substances in contaminated soil by Kendrick mass analysis. Anal Bioanal Chem. 2020;412(20):4797–805. https://doi.org/10.1007/s00216-019-02358-0.
Article PubMed PubMed Central CAS Google Scholar
Koelmel JP, Paige MK, Aristizabal-Henao JJ, Robey NM, Nason SL, Stelben PJ, Li Y, Kroeger NM, Napolitano MP, Savvaides T, Vasiliou V, Rostkowski P, Garrett TJ, Lin E, Deigl C, Jobst K, Townsend TG, Godri Pollitt KJ, Bowden JA. Toward comprehensive per- and polyfluoroalkyl substances annotation using FluoroMatch software and intelligent high-resolution tandem mass spectrometry acquisition. Anal Chem. 2020;92(16):11186–94. https://doi.org/10.1021/acs.analchem.0c01591.
Article PubMed CAS Google Scholar
Dickman RA, Aga DS. Efficient workflow for suspect screening analysis to characterize novel and legacy per- and polyfluoroalkyl substances (PFAS) in biosolids. Anal Bioanal Chem. 2022. https://doi.org/10.1007/s00216-022-04088-2.
Article PubMed PubMed Central Google Scholar
Kaufmann A, Butcher P, Maden K, Walker S, Widmer M. Simplifying nontargeted analysis of PFAS in complex food matrices. J AOAC Int. 2022. https://doi.org/10.1093/jaoacint/qsac071.
Article PubMed Google Scholar
Zweigle J, Bugsel B, Zwiener C. FindPFΔS: non-target screening for PFAS - comprehensive data mining for MS2 fragment mass differences. Anal Chem. 2022;94(30):10788–96. https://doi.org/10.1021/acs.analchem.2c01521.
Article PubMed PubMed Central CAS Google Scholar
Bugsel B, Bauer R, Herrmann F, Maier ME, Zwiener C. LC-HRMS screening of per- and polyfluorinated alkyl substances (PFAS) in impregnated paper samples and contaminated soils. Anal Bioanal Chem. 2022;414(3):1217–25. https://doi.org/10.1007/s00216-021-03463-9.
Article PubMed CAS Google Scholar
Munoz G, Michaud AM, Liu M, Vo Duy S, Montenach D, Resseguier C, Watteau F, Sappin-Didier V, Feder F, Morvan T, Houot S, Desrosiers M, Liu J, Sauve S. Target and nontarget screening of PFAS in biosolids, composts, and other organic waste products for land application in France. Environ Sci Technol. 2022;56(10):6056–68. https://doi.org/10.1021/acs.est.1c03697.
Article PubMed CAS Google Scholar
Liu L, Lu M, Cheng X, Yu G, Huang J. Suspect screening and nontargeted analysis of per- and polyfluoroalkyl substances in representative fluorocarbon surfactants, aqueous film-forming foams, and impacted water in China. Environ Int. 2022;167. https://doi.org/10.1016/j.envint.2022.107398.
Ng K, Alygizakis N, Androulakakis A, Galani A, Aalizadeh R, Thomaidis NS, Slobodnik J. Target and suspect screening of 4777 per- and polyfluoroalkyl substances (PFAS) in river water, wastewater, groundwater and biota samples in the Danube River Basin. J Hazard Mater. 2022;436: 129276. https://doi.org/10.1016/j.jhazmat.2022.129276.
Article PubMed CAS Google Scholar
Grulke CM, Williams AJ, Thillanadarajah I, Richard AM. EPA’s DSSTox database: history of development of a curated chemistry resource supporting computational toxicology research. Comput Toxicol. 2019;12. https://doi.org/10.1016/j.comtox.2019.100096.
Young RB, Pica NE, Sharifan H, Chen H, Roth HK, Blakney GT, Borch T, Higgins CP, Kornuc JJ, McKenna AM, Blotevogel J. PFAS analysis with ultrahigh resolution 21T FT-ICR MS: suspect and nontargeted screening with unrivaled mass resolving power and accuracy. Environ Sci Technol. 2022;56(4):2455–65. https://doi.org/10.1021/acs.est.1c08143.
Article PubMed CAS Google Scholar
Bugsel B, Zweigle J, Zwiener C. Nontarget screening strategies for PFAS prioritization and identification by high resolution mass spectrometry: a review. Trends Environ Anal Chem. 2023;40. https://doi.org/10.1016/j.teac.2023.e00216.
Nürenberg G, Nödler K, T LF, Schäfer C, Huber K, Scheurer M. Nachweis von polyfluorierten Alkylphosphatestern (PAP) und Perfluoroktansulfonamidoethanol-basierten Phosphatestern (SAmPAP) in Böden. Mitt Umweltchem Ökotox; 2018.
Chambers MC, Maclean B, Burke R, Amodei D, Ruderman DL, Neumann S, Gatto L, Fischer B, Pratt B, Egertson J, Hoff K, Kessner D, Tasman N, Shulman N, Frewen B, Baker TA, Brusniak MY, Paulse C, Creasy D, Flashner L, Kani K, Moulding C, Seymour SL, Nuwaysir LM, Lefebvre B, Kuhlmann F, Roark J, Rainer P, Detlev S, Hemenway T, Huhmer A, Langridge J, Connolly B, Chadick T, Holly K, Eckels J, Deutsch EW, Moritz RL, Katz JE, Agus DB, MacCoss M, Tabb DL, Mallick P. A cross-platform toolkit for mass spectrometry and proteomics. Nat Biotechnol. 2012;30(10):918–20. https://doi.org/10.1038/nbt.2377.
Article PubMed PubMed Central CAS Google Scholar
Martens L, Chambers M, Sturm M, Kessner D, Levander F, Shofstahl J, Tang WH, Rompp A, Neumann S, Pizarro AD, Montecchi-Palazzi L, Tasman N, Coleman M, Reisinger F, Souda P, Hermjakob H, Binz PA, Deutsch EW (2011) mzML--a community standard for mass spectrometry data. Mol Cell Proteomics. 2011;10(1):R110 000133. https://doi.org/10.1074/mcp.R110.000133.
Röst HL, Schmitt U, Aebersold R, Malmstrom L. pyOpenMS: a Python-based interface to the OpenMS mass-spectrometry algorithm library. Proteomics. 2014;14(1):74–7. https://doi.org/10.1002/pmic.201300246.
Article PubMed CAS Google Scholar
Röst HL, Sachsenberg T, Aiche S, Bielow C, Weisser H, Aicheler F, Andreotti S, Ehrlich HC, Gutenbrunner P, Kenar E, Liang X, Nahnsen S, Nilse L, Pfeuffer J, Rosenberger G, Rurik M, Schmitt U, Veit J, Walzer M, Wojnar D, Wolski WE, Schilling O, Choudhary JS, Malmstrom L, Aebersold R, Reinert K, Kohlbacher O. OpenMS: a flexible open-source software platform for mass spectrometry data analysis. Nat Methods. 2016;13(9):741–8. https://doi.org/10.1038/nmeth.3959.
Article PubMed CAS Google Scholar
Sturm M, Bertsch A, Gropl C, Hildebrandt A, Hussong R, Lange E, Pfeifer N, Schulz-Trieglaff O, Zerck A, Reinert K, Kohlbacher O. OpenMS - an open-source software framework for mass spectrometry. BMC Bioinformatics. 2008;9:163. https://doi.org/10.1186/1471-2105-9-163.
Article PubMed PubMed Central CAS Google Scholar
Sachsenberg T, Pfeuffer J, Bielow C, Wein S, Jeong K, Netz E, Walter A, Alka O, Nilse L, Colaianni P, McCloskey D, Kim J, Rosenberger G, Bichmann L, Walzer M, Veit J, Boudaud B, Bernt M, Patikas N, Pilz M, Startek MP, Kutuzova S, Heumos L, Charkow J, Sing J, Feroz A, Siraj A, Weisser H, Dijkstra T, Perez-Riverol Y, Röst H, Kohlbacher O. OpenMS 3 expands the frontiers of open-source computational mass spectrometry. Preprint. 2023. https://doi.org/10.21203/rs.3.rs-3286368/v1.
Pfeuffer J, Sachsenberg T, Alka O, Walzer M, Fillbrunn A, Nilse L, Schilling O, Reinert K, Kohlbacher O. OpenMS - a platform for reproducible analysis of mass spectrometry data. J Biotechnol. 2017;261:142–8. https://doi.org/10.1016/j.jbiotec.2017.05.016.
Article PubMed CAS Google Scholar
Kenar E, Franken H, Forcisi S, Wormann K, Haring HU, Lehmann R, Schmitt-Kopplin P, Zell A, Kohlbacher O. Automated label-free quantification of metabolites from liquid chromatography-mass spectrometry data. Mol Cell Proteomics. 2014;13(1):348–59. https://doi.org/10.1074/mcp.M113.031278.
Article PubMed CAS Google Scholar
Helmus R, Ter Laak TL, van Wezel AP, de Voogt P, Schymanski EL. patRoon: open source software platform for environmental mass spectrometry based non-target screening. J Cheminform. 2021;13(1):1. https://doi.org/10.1186/s13321-020-00477-w.
Article PubMed PubMed Central CAS Google Scholar
Kontou EE, Walter A, Alka O, Pfeuffer J, Sachsenberg T, Mohite OS, Nuhamunada M, Kohlbacher O, Weber T. UmetaFlow: an untargeted metabolomics workflow for high-throughput data processing and analysis. J Cheminform. 2023;15(1):52. https://doi.org/10.1186/s13321-023-00724-w.
Article PubMed PubMed Central Google Scholar
Goloborodko AA, Levitsky LI, Ivanov MV, Gorshkov MV. Pyteomics–a Python framework for exploratory data analysis and rapid software prototyping in proteomics. J Am Soc Mass Spectrom. 2013;24(2):301–4. https://doi.org/10.1007/s13361-012-0516-6.
Article PubMed CAS Google Scholar
Levitsky LI, Klein JA, Ivanov MV, Gorshkov MV. Pyteomics 4.0: five years of development of a Python proteomics framework. J Proteome Res. 2019;18(2):709–714. https://doi.org/10.1021/acs.jproteome.8b00717.
Koelmel JP, Stelben P, McDonough CA, Dukes DA, Aristizabal-Henao JJ, Nason SL, Li Y, Sternberg S, Lin E, Beckmann M, Williams AJ, Draper J, Finch JP, Munk JK, Deigl C, Rennie EE, Bowden JA, Godri Pollitt KJ. FluoroMatch 2.0-making automated and comprehensive non-targeted PFAS annotation a reality. Anal Bioanal Chem. 2022;414(3):1201–1215. https://doi.org/10.1007/s00216-021-03392-7.
Barzen-Hanson KA, Roberts SC, Choyke S, Oetjen K, McAlees A, Riddell N, McCrindle R, Ferguson PL, Higgins CP, Field JA. Discovery of 40 classes of per- and polyfluoroalkyl substances in historical aqueous film-forming foams (AFFFs) and AFFF-impacted groundwater. Environ Sci Technol. 2017;51(4):2047–57. https://doi.org/10.1021/acs.est.6b05843.
Article PubMed Google Scholar
Place B. Suspect list of possible per- and polyfluoroalkyl substances (PFAS). National Institute of Standards and Technology; 2021. https://data.nist.gov/od/id/mds2-2387. Accessed 14 Nov 2023.
Xiao F, Golovko SA, Golovko MY. Identification of novel non-ionic, cationic, zwitterionic, and anionic polyfluoroalkyl substances using UPLC-TOF-MS(E) high-resolution parent ion search. Anal Chim Acta. 2017;988:41–9. https://doi.org/10.1016/j.aca.2017.08.016.
Article PubMed CAS Google Scholar
Röhler K, Susset B, Grathwohl P. Production of perfluoroalkyl acids (PFAAs) from precursors in contaminated agricultural soils: batch and leaching experiments. Sci Total Environ. 2023;902: 166555. https://doi.org/10.1016/j.scitotenv.2023.166555.
Article PubMed CAS Google Scholar
OECD. Toward a new comprehensive global database of per- and polyfluoroalkyl substances (PFASs): summary report on updating the OECD 2007 list of per- and polyfluoroalkyl substances (PFASs). OECD; 2018.
Londhe K, Lee C-S, McDonough CA, Venkatesan AK. The need for testing isomer profiles of perfluoroalkyl substances to evaluate treatment processes. Environ Sci Technol. 2022. https://doi.org/10.1021/acs.est.2c05518.
Article PubMed PubMed Central Google Scholar
Charbonnet JA, McDonough CA, Xiao F, Schwichtenberg T, Cao D, Kaserzon S, Thomas KV, Dewapriya P, Place BJ, Schymanski EL, Field JA, Helbling DE, Higgins CP. Communicating confidence of per- and polyfluoroalkyl substance identification via high-resolution mass spectrometry. Environ Sci Technol Lett. 2022;9(6):473–81. https://doi.org/10.1021/acs.estlett.2c00206.
Article PubMed PubMed Central CAS Google Scholar
Berger U, Langlois I, Oehme M, Kallenborn R. Comparison of three types of mass spectrometers for HPLC/MS analysis of perfluoroalkylated substances and fluorotelomer alcohols. Eur J Mass Spectrom (Chichester). 2004;10(5):579–88. https://doi.org/10.1255/ejms.679.
Article PubMed CAS Google Scholar
Trier X, Granby K, Christensen JH. Tools to discover anionic and nonionic polyfluorinated alkyl surfactants by liquid chromatography electrospray ionisation mass spectrometry. J Chromatogr A. 2011;1218(40):7094–104. https://doi.org/10.1016/j.chroma.2011.07.057.
Article PubMed CAS Google Scholar
Kuhl C, Tautenhahn R, Bottcher C, Larson TR, Neumann S. CAMERA: an integrated strategy for compound spectra extraction and annotation of liquid chromatography/mass spectrometry data sets. Anal Chem. 2012;84(1):283–9. https://doi.org/10.1021/ac202450g.
Article PubMed CAS Google Scholar
Godzien J, Armitage EG, Angulo S, Martinez-Alcazar MP, Alonso-Herranz V, Otero A, Lopez-Gonzalvez A, Barbas C. In-source fragmentation and correlation analysis as tools for metabolite identification exemplified with CE-TOF untargeted metabolomics. Electrophoresis. 2015;36(18):2188–95. https://doi.org/10.1002/elps.201500016.
Article PubMed CAS Google Scholar
Seitzer PM, Searle BC. Incorporating in-source fragment information improves metabolite identification accuracy in untargeted LC-MS data sets. J Proteome Res. 2019;18(2):791–6. https://doi.org/10.1021/acs.jproteome.8b00601.
Article PubMed CAS Google Scholar
Tada I, Chaleckis R, Tsugawa H, Meister I, Zhang P, Lazarinis N, Dahlen B, Wheelock CE, Arita M. Correlation-based deconvolution (CorrDec) to generate high-quality MS2 spectra from data-independent acquisition in multisample studies. Anal Chem. 2020;92(16):11310–7. https://doi.org/10.1021/acs.analchem.0c01980.
Article PubMed CAS Google Scholar
Bugsel B, Schussler M, Zweigle J, Schmitt M, Zwiener C. Photocatalytical transformation of fluorotelomer- and perfluorosulfonamide-based PFAS on mineral surfaces and soils in aqueous suspensions. Sci Total Environ. 2023;894: 164907. https://doi.org/10.1016/j.scitotenv.2023.164907.
Article PubMed CAS Google Scholar

Download references

Acknowledgements

The authors would like to thank Axel Walter for his GitHub repository ion-chromatogram-extractor which inspired some functions related to building EICs in PFΔScreen, Klaus Röhler for sampling the composite soil samples, and the DBU (Deutsche Bundesstiftung Umwelt) for the scholarship of JZ.

Funding

Open Access funding enabled and organized by Projekt DEAL. Open Access funding enabled and organized by Projekt DEAL. Deutsche Bundesstiftung Umwelt (DBU) funded the PhD scholarship of JZ.

Author information

Authors and Affiliations

Environmental Analytical Chemistry, Department of Geosciences, University of Tübingen, Schnarrenbergstraße 94-96, 72076, Tübingen, Germany
Jonathan Zweigle, Boris Bugsel & Christian Zwiener
Hydrogeochemistry, Department of Geosciences, University of Tübingen, Schnarrenbergstraße 94-96, 72076, Tübingen, Germany
Joel Fabregat-Palau

Authors

Jonathan Zweigle
View author publications
You can also search for this author in PubMed Google Scholar
Boris Bugsel
View author publications
You can also search for this author in PubMed Google Scholar
Joel Fabregat-Palau
View author publications
You can also search for this author in PubMed Google Scholar
Christian Zwiener
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

JZ conceptualized the structure of PFΔScreen, wrote most of the Python source code, and wrote the first draft of the manuscript. BB was part of writing and designing PFΔScreen processes and reviewed the manuscript. JFP performed the soil extractions and reviewed the manuscript. CZ supervised the study and reviewed the manuscript.

Corresponding authors

Correspondence to Jonathan Zweigle or Christian Zwiener.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 2.59 MB)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Zweigle, J., Bugsel, B., Fabregat-Palau, J. et al. PFΔScreen — an open-source tool for automated PFAS feature prioritization in non-target HRMS data. Anal Bioanal Chem 416, 349–362 (2024). https://doi.org/10.1007/s00216-023-05070-2

Download citation

Received: 26 September 2023
Revised: 17 November 2023
Accepted: 21 November 2023
Published: 30 November 2023
Issue Date: January 2024
DOI: https://doi.org/10.1007/s00216-023-05070-2

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

PFΔScreen — an open-source tool for automated PFAS feature prioritization in non-target HRMS data

Abstract

Graphical abstract

Similar content being viewed by others

Efficient PFAS prioritization in non-target HRMS data: systematic evaluation of the novel MD/C-m/C approach

patRoon: open source software platform for environmental mass spectrometry based non-target screening

In silico MS/MS spectra for identifying unknowns: a critical examination using CFM-ID algorithms and ENTACT mixture samples

Introduction

Materials and methods

PFΔScreen workflow

FeatureFinding

PFASPrioritization

RawDataVisualization

EIC extractor

MS1 extractor

MS2 extractor

EIC correlator

Soil collection and extraction

LC-HRMS measurements and data acquisition

Results and discussion

Data preprocessing

Data reduction by m/C and MD/C

KMD analysis, fragment differences, DFs, and suspect screening

Manual identification process with the PFΔScreen results table

EIC correlator: coelution correlation analysis for identification of unknowns

Conclusions

Code availability and license

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding authors

Ethics declarations

Conflict of interest

Additional information

Publisher's Note

Supplementary Information

Supplementary file1 (PDF 2.59 MB)

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation

MS¹ extractor

MS² extractor