Aggressive dereplication using UHPLC–DAD–QTOF: screening extracts for up to 3000 fungal secondary metabolites

In natural-product drug discovery, finding new compounds is the main task, and thus fast dereplication of known compounds is essential. This is usually performed by manual interpretation of liquid chromatography-ultraviolet/visible-mass spectrometry (LC-UV/Vis-MS) data for detected peaks, often assisted by automated identification of previously identified compounds. We used a 15 min ultra-high-performance liquid chromatography–diode array detection (UHPLC–DAD)–high-resolution MS method (electrospray ionization (ESI)+ or ESI−), followed by 10–60 s of automated data analysis for up to 3000 relevant elemental compositions. By overlaying automatically generated extracted-ion chromatograms from detected compounds on the base peak chromatogram, all major potentially novel peaks could be visualized. Peaks corresponding to compounds available as reference standards, previously identified compounds, and major contaminants from solvents, media, filters, etc. were labeled to differentiate these from compounds identified only by elemental composition. This enabled fast manual evaluation of both known peaks and potential novel-compound peaks, by manual verification of the adduct pattern, UV–Vis spectrum, retention time compared with log D, co-identified biosynthetically related compounds, and elution order. System performance, including adduct patterns, in-source fragmentation, and ion-cooler bias, was investigated on reference standards, and the overall method was used on extracts of Aspergillus carbonarius and Penicillium melanoconidium, revealing new nitrogen-containing biomarkers for both species.
Electronic supplementary material: The online version of this article (doi:10.1007/s00216-013-7582-x) contains supplementary material, which is available to authorized users.


Section 1. Construction of compound database
The database was constructed in ACD Chemfolder (Advanced Chemistry Development, Toronto, Canada) from: i) our in-house collection of reference standards (~1500 compounds) [1]; ii) compounds tentatively identified during the last 30 years (~500 compounds) [2][3][4][5]; iii) compound peaks appearing in blank samples; iv) putative biosynthetic intermediates mainly from A. niger and A. nidulans and PKS pathways; and v) all compounds in AntiBase2012 that were listed as originating from Aspergillus, Fusarium, Trichoderma, Penicillium, Chaetomium, Stachybotrys, Alternaria and Cladosporium, as well as their teleomorphic genera.
Records of compounds reported from studies where the fungal culture was considered incorrectly identified were corrected before addition to the compound database. When obtained from our own data or the literature, the full UV/Vis spectrum was linked to the record.
From our work on metabolite profiling of genera such as Aspergillus, Fusarium, Penicillium, Alternaria and Cladosporium, approximately 400 unknown compounds were added to the database as "unknowns" and registered by their elemental composition and the species in which they were detected.
For each compound the known or suspected major adducts, based on analysis of reference standards, were

Creating search lists for Target Analysis (TA)
A Microsoft Excel application was created so that the whole Chemfolder database (without structures) could be copied into one of the Excel sheets and then sorted to include one or more genera, subspecies, known impurities, or all compounds with unknown retention time (RT). These data were transferred to a search list for TA containing: RT (if known), elemental composition and charge state of the desired adduct, and name of the compound.
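The filtering step described above can be sketched in Python. This is only an illustration of the selection logic, not the Excel application itself: the record fields, the function name `build_search_list`, and the example records (including the retention time and the two placeholder compounds) are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CompoundRecord:
    """One database row; all field names here are hypothetical."""
    name: str
    formula: str                      # elemental composition, e.g. "C24H32O6"
    adduct: str                       # desired adduct, e.g. "[M+H]+"
    charge: int                       # charge state of that adduct
    genera: Tuple[str, ...] = ()      # genera the compound is reported from
    rt_min: Optional[float] = None    # retention time in minutes, if known
    is_impurity: bool = False         # True for known impurities


def build_search_list(records, genera=(), include_impurities=True,
                      include_unknown_rt=False):
    """Select records by genus, impurity flag, or missing RT, and return
    the columns a TA search list needs."""
    wanted = {g.lower() for g in genera}
    rows = []
    for rec in records:
        from_genus = any(g.lower() in wanted for g in rec.genera)
        impurity = include_impurities and rec.is_impurity
        unknown_rt = include_unknown_rt and rec.rt_min is None
        if from_genus or impurity or unknown_rt:
            rows.append({
                "RT": rec.rt_min,
                "formula": rec.formula,
                "adduct": rec.adduct,
                "charge": rec.charge,
                "name": rec.name,
            })
    return rows


# Illustrative records only; RT values and the two placeholder entries
# are invented for the example and are not measured data.
records = [
    CompoundRecord("verrucosidin", "C24H32O6", "[M+H]+", 1,
                   ("Penicillium",), 9.5),
    CompoundRecord("hypothetical compound X", "C10H12O3", "[M+H]+", 1,
                   ("Fusarium",), None),
    CompoundRecord("hypothetical filter contaminant", "C16H22O4", "[M+H]+", 1,
                   (), 3.2, is_impurity=True),
]

search_list = build_search_list(records, genera=("Penicillium",))
```

Here `search_list` holds the Penicillium compound and the flagged impurity, mirroring a genus-specific list that still includes known contaminants.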
For labelling of peaks in Bruker DataAnalysis 4.0 (DA) (Bruker Daltonics, Bremen, Germany), compounds that were available as reference standards were labelled with "S-x" in front of the name, where x is the reference standard number in our database. Compounds observed in sample blanks were labelled with "Bl-" in front of the name. Finally, compounds that were not tentatively identified were labelled "Unknown", followed by the producing species and a running number within that species, e.g. "Unknown-Aspergillus nidulans No. 3".
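The labelling convention can be expressed as a small helper. This is a minimal sketch: the function name `peak_label` and its parameters are hypothetical, and the space between "S-x" and the compound name is an assumption, since the text does not specify the separator.

```python
def peak_label(name, standard_no=None, in_blank=False,
               species=None, unknown_no=None):
    """Build a peak label following the convention described in the text.

    standard_no : reference standard number, if a standard is available
    in_blank    : True if the compound was observed in sample blanks
    species     : producing species, used for unidentified compounds
    unknown_no  : running number of the unknown within that species
    """
    if standard_no is not None:
        # Assumed separator: a single space after "S-x".
        return f"S-{standard_no} {name}"
    if in_blank:
        return f"Bl-{name}"
    if species is not None and unknown_no is not None:
        return f"Unknown-{species} No. {unknown_no}"
    return name


label = peak_label("", species="Aspergillus nidulans", unknown_no=3)
```

With these arguments `label` reproduces the example from the text, "Unknown-Aspergillus nidulans No. 3".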
Automated screening of fungal samples
TA 1.2 (Bruker Daltonics, Bremen, Germany) was used to process data files with the following typical parameters: A) retention time (if known) within ± 1.2 min (broad range), ± 0.8 min (medium range), and ± 0.3 min (narrow range); B) SigmaFit of 1000 (broad range; isotope fit not used), 40 (medium range), and 20 (narrow range); and C) mass accuracy of the peak of 4 ppm (broad range), 2.5 ppm (medium range), and 1.5 ppm (narrow range). The area cut-off was set to 3000 counts by default, but was often adjusted for very concentrated or dilute samples.
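The three ranges can be read as nested acceptance windows. The sketch below uses the thresholds listed above, but how TA combines them internally is not described here, so the rule used (report the tightest range for which RT, SigmaFit, and mass error all pass, with lower SigmaFit meaning a better isotope fit) is an assumption, as are all function and variable names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    rt_window_min: float   # allowed |RT - expected RT| in minutes
    sigma_fit_max: float   # maximum accepted SigmaFit value
    ppm_max: float         # maximum accepted absolute mass error in ppm


# Thresholds from the text, ordered from tightest to loosest.
TIERS = (
    Tier("narrow", 0.3, 20, 1.5),
    Tier("medium", 0.8, 40, 2.5),
    Tier("broad", 1.2, 1000, 4.0),
)
AREA_CUTOFF = 3000  # default area cut-off in counts


def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def classify_hit(measured_mz, theoretical_mz, sigma_fit, area,
                 rt=None, expected_rt=None, area_cutoff=AREA_CUTOFF):
    """Return the tightest tier name the hit satisfies, or None.

    Assumption: a hit must pass all three criteria of a tier; the RT
    check is skipped when no expected RT is known.
    """
    if area < area_cutoff:
        return None
    ppm = abs(ppm_error(measured_mz, theoretical_mz))
    for tier in TIERS:
        rt_ok = (rt is None or expected_rt is None
                 or abs(rt - expected_rt) <= tier.rt_window_min)
        if rt_ok and sigma_fit <= tier.sigma_fit_max and ppm <= tier.ppm_max:
            return tier.name
    return None


# Example with invented measurement values for a C24H32O6 [M+H]+ ion.
tier = classify_hit(417.2277, 417.2272, sigma_fit=15, area=5000,
                    rt=8.1, expected_rt=8.0)
```

In this example the mass error is about 1.2 ppm, so the hit falls in the narrow range; raising the SigmaFit value to 35 would move it to the medium range.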
The DA software was used for manual comparison of all the extracted-ion chromatograms (EICs) generated by TA with the base peak chromatograms (BPCs) in order to identify major peaks that had not been detected.
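The comparison was done manually in DA, but its logic can be illustrated in a few lines: any major BPC peak with no EIC hit at the same retention time is a candidate for a compound missing from the search list. The function name, the peak representation, the retention time tolerance, and the example values are all assumptions made for this sketch.

```python
def unexplained_peaks(bpc_peaks, hit_rts, rt_tol=0.05, min_intensity=0.0):
    """Return BPC peaks that no EIC hit accounts for.

    bpc_peaks     : list of (rt_min, intensity) tuples from the BPC
    hit_rts       : retention times (min) of EIC peaks generated from hits
    rt_tol        : assumed tolerance (min) for calling two peaks the same
    min_intensity : only BPC peaks at or above this intensity are "major"
    """
    return [
        (rt, intensity)
        for rt, intensity in bpc_peaks
        if intensity >= min_intensity
        and not any(abs(rt - hit) <= rt_tol for hit in hit_rts)
    ]


# Invented example: three BPC peaks, one of which is explained by a hit.
bpc = [(2.10, 5e5), (5.42, 2e6), (8.00, 1e4)]
candidates = unexplained_peaks(bpc, hit_rts=[2.11], min_intensity=1e5)
```

Here `candidates` contains only the peak at 5.42 min: the peak at 2.10 min is covered by a hit, and the peak at 8.00 min is below the intensity threshold.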
The extract was examined by the aggressive dereplication (AD) method, searching for a subset of ~1700 Penicillium compounds and an additional 700 compounds, and was found to contain a large number of secondary metabolites (Fig. S5, Tables S1 and S2).
Previously detected metabolites, along with additional families of secondary metabolites, are listed in Table S1, and the full search results list can be seen in Table S2. Twenty-five secondary metabolites could be assigned with a high degree of confidence. Chrysogine, 6-oxopiperidine-2-carboxylic acid, and 8-(methoxycarbonyl)-1-hydroxy-9-oxo-9H-xanthene-3-carboxylic acid were detected for the first time in P. melanoconidium, but have been found in related Penicillium species [41;42]. Eight members of the roquefortine biosynthetic family (end products oxalines) were found and further confirmed by UV spectra and retention times. Concerning the penitrems, taxonomic and biosynthetic considerations, in connection with polarity and literature data, were used to verify the presence of penitrems A-F. Furthermore, the UV spectrum and RT matched those of the authentic standard of penitrem A. Isomers of penitrem A, such as pennigritrem and the acid hydrolysis product thomitrem A [43], could be excluded because their UV spectra differ from that of penitrem A or because they are minor compounds (pennigritrem) compared with the main product penitrem A [44;45]. PF1101A and B had the penitrem A UV spectrum, which differs from those of the shearinines and janthitrems [46]; penitrem-type molecules were therefore much more likely candidates.
Biosynthetic and taxonomic considerations also dictate that it must be the penitrems that are produced by P. melanoconidium.
The polyketides penicillic acid and verrucosidins were also found in P. melanoconidium. Verrucosidin has the same molecular formula as atranone A (C24H32O6) [12], but the UV spectrum readily identified the compound as verrucosidin. The finding of normethylverrucosidin and deoxyverrucosidin [47] also supports the verrucosidin assignment, as P. aurantiogriseum also produces these compounds [40]. A metabolite with the formula C24H32O4 was annotated as 6-farnesyl-5,7-dihydroxy-4-methylphthalide. However, this metabolite has a mycophenolic acid chromophore, which has never been found in P. melanoconidium. The formula could instead correspond to a hypothetical "dideoxyverrucosidin", but this remains to be confirmed.
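Because the search is based on elemental composition, two compounds with the same formula give identical theoretical m/z values and cannot be separated by accurate mass alone, which is why the UV spectrum was needed for verrucosidin versus atranone A. The sketch below computes the monoisotopic mass and a few common adduct m/z values for a formula. The function names are hypothetical, the parser handles only simple formulas without brackets or isotope labels, and only the elements and adducts listed in the tables are supported.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503223,
    "N": 14.00307400443,
    "O": 15.99491461957,
    "P": 30.97376199842,
    "S": 31.9720711744,
    "Cl": 34.968852682,
}
PROTON = 1.007276466879
ELECTRON = 0.000548579909

# Mass shift of singly charged adducts relative to the neutral molecule.
ADDUCT_SHIFT = {
    "[M+H]+": PROTON,
    "[M+Na]+": 22.98976928 - ELECTRON,
    "[M-H]-": -PROTON,
}


def parse_formula(formula):
    """Parse a simple formula such as 'C24H32O6' (spaces are ignored)."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)",
                                      formula.replace(" ", "")):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def monoisotopic_mass(formula):
    """Monoisotopic mass of the neutral molecule."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


def adduct_mz(formula, adduct="[M+H]+"):
    """m/z of a singly charged adduct of the given formula."""
    return monoisotopic_mass(formula) + ADDUCT_SHIFT[adduct]


# Verrucosidin and atranone A share C24H32O6, so both give the same value.
mz_c24h32o6 = adduct_mz("C 24 H 32 O 6")
```

For C24H32O6 the neutral monoisotopic mass is about 416.2199 and the [M+H]+ ion about 417.2272, regardless of which of the two structures is present.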
Primary metabolites were few and included choline-O-sulfate, linoleic acid, phenylalanine and 1,2-dilinoleoyl-sn-glycero-3-phosphocholine, which could be annotated based on reference standards. In conclusion, several new families of compounds were found, some of which are highly toxic, especially the verrucosidins, as well as chrysogine, a compound often detected in cereal-infecting fungi, e.g. Fusarium. Such information is valuable for future comparative genomics aimed at revealing biosynthetic pathways.