Reflection on the status quo of dereplication

To advance knowledge, one has to distinguish new from old. This is a difficult task, especially when analyzing complex mixtures that contain thousands of compounds and are subject to natural variation. Natural samples have an intrinsic dynamic complexity: some parts stay the same, and other parts change. Currently, the view on complex natural samples is limited, and important active compounds may be overlooked, for example, due to tedious sample preparation (only a part of the sample is analyzed) or the selected detection principle (not every compound is ionizable and detectable by mass spectrometry). Nevertheless, it is easy to discover new metabolites in natural samples containing many thousands of compounds, as the separation and detection capability of analytical methods increases steadily. Many thousands more will therefore be discovered in the future: the better the technology, the more we will see. Natural product databases, such as the Natural Products Atlas (www.npatlas.org, covering bacterial and fungal natural products), grow larger and larger. At the same time, quickly detecting and excluding compounds that have already been discovered and studied is an ever-increasing challenge.

Current analysis and dereplication strategies for complex natural samples (Abbas-Mohammadi et al. 2022; Bayona et al. 2022; Bhattarai et al. 2022; Poyer et al. 2022; Safriani et al. 2022; Shishido et al. 2019) show potential but also limitations. On the one hand, they trigger a cascade of tedious verification efforts, either trying to consider as many signals as possible or focusing on target compounds and related ones with extensive sample preparation that is prone to substance loss. On the other hand, they depend on molecules that can be ionized and fragmented under default settings, as is the case for the Global Natural Products Social Molecular Networking (https://gnps.ucsd.edu). In-depth analysis is complicated by natural variations and xenogenic entries. The instruments used, such as high-performance liquid chromatography–diode array detection–electrospray ionization high-resolution mass spectrometry (HPLC−DAD−ESI-HRMS) systems, require sufficient data points per peak and dwell time for an ever-increasing number of compounds, or acceptable sensitivity for samples that are highly diluted (Steiner et al. 2021) to reduce system contamination and matrix effects (Raposo and Barceló 2021). Currently overlooked low-signal, discriminated, or non-ionizable compounds can have important biological effects and be important targets in a complex sample. If no current mainstream technology is able to detect them, critical questions about methodological bubbles are not even raised. Such fallacies can be ignored, but they impede progress, understanding, and safety. Statistical models first need to be established and then repeatedly adjusted to the purpose, with limited proof. Large data sets and compound databases need to be handled. Further limitations of targeted and untargeted strategies using comprehensive chromatographic techniques have recently been discussed elsewhere (Morlock 2021).

Through all these undertakings, the complexity of analytical data is increasing and drifting away from rational human comprehension toward reliance on algorithms, predictions, and computational virtual assessments. This makes it all the more necessary to develop disruptive dereplication strategies that deliver experimental real-world results, prioritize the important compounds, and return to efficiency. According to Leonardo da Vinci, simplicity is the ultimate sophistication. Considering multicomponent natural mixtures, we should at least know the important active compounds in them, and not merely what happens to be analyzable. Even if all target compounds were in focus (metabolites, conversion products, degradation products, derivatives, residues, contaminants, adulterants, etc.), the largest portion of the sample would remain uncertain and unidentified. Among the unknown unknowns, important molecules may be low in concentration but highly active. Molecular networking can help to capture even more relationships, but much like looking at the universe, the more we see, the humbler we become and the more we have to prioritize what matters. It is far better to focus on active compounds in complex samples than to give a precise answer on selected facets, as we currently do. Is the idea of the Mona Lisa painting captured better if one takes only the dark brown pixels or the important pixels? In both cases, we can only hypothesize what the painting might mean, but the success rate (discovering the smile) is higher with the important pixels. Increasing the manageable number of brown pixels does not help much if the important ones are missing. So how do we find important new compounds in addition to those already known, and regain efficiency through disruptive thinking?

Principle of planar chromatographic super-hyphenations

Natural samples are complex and contain thousands of different compounds. Since every sample preparation step can alter the sample composition and discriminate constituents, samples need to be kept as original as possible. Thus, sample preparation has to be kept minimal so as not to lose any important compound. This is an important precondition for being reasonably confident that a complex sample can be evaluated comprehensively. The omnipresent accurate analysis of cleaned-up samples may not be the appropriate tool to provide results for important decisions. Minimal sample preparation, orthogonal separations (reducing matrix interference or co-elution), and multi-detection including effect-detection (capturing the structural diversity of the complex sample as far as possible, so as not to overlook compounds) are important prerequisites for evaluating a complex sample. Currently, most of the separated compounds in a complex sample are unknown in terms of their structure and toxicological profile. Combining the two disciplines of chemistry and biology is therefore extremely helpful in picking out and prioritizing the important compounds that are active, whether beneficial or harmful. Separating the mixtures, detecting the bioactive compounds by effect-directed detection on the same surface, and then identifying them leads to a straightforward dereplication technique. The workflow (Fig. 1A) consists of (I) parallel screening and separation of multiple complex mixtures using imaging high-performance thin-layer chromatography (HPTLC), (II) planar multiplex bioassay for non-targeted detection of important active compounds, and (III) heart-cut elution of active zones of interest directly out of the bioautogram into orthogonal HPLC−DAD−ESI-HRMS for targeted characterization. Only interesting bioactive zones are isolated (by the cutting edge of the elution head-based interface), eluted, and transferred. This is straightforward and also minimizes matrix contamination of the HRMS system.

Fig. 1

Workflow steps of the 12 D super-hyphenation (A) with parallel HPTLC screening, planar multiplex bioassay detection, and heart-cut elution to HPLC−DAD−ESI-HRMS. Principle of the planar multiplex bioassay and effect differentiation shown for estrogen-like compound detection (B). Example of dose response curve for quantification of individual effects (C) and zone assignment (D)

The hyphenation can include multiple but different (orthogonal) separation/detection dimensions depending on the integrated tools chosen, as shown for the straightforward 12 D hyphenation (Fig. 1A). Such super-hyphenation reduces thousands of compounds to the important compounds, i.e. all sample compounds that show a distinct biological reaction. The great advantage of this dereplication strategy is that it prioritizes compounds worth identifying or elucidating in their structure, whether known or unknown (Fig. 2). On the one hand, the obtained results may confirm existing literature data (considered as proof of the super-hyphenation). On the other hand, unexpected bioactive compounds that were not previously in focus (unknown unknowns) are revealed, which is important new knowledge. Especially these bioactive unknowns provide important health-related information for the risk assessment of complex samples. In addition, super-hyphenation significantly reduces the amount of data to be evaluated for a complex sample, which is a not insignificant contribution to efficiency. In comparison to column-based hyphenations, many further advantages of planar hyphenations were reported (Fig. 2) (Morlock and Schwack 2010). For example, samples can be analyzed in parallel under the same chromatographic conditions on always fresh adsorbent, system parameters can be adjusted so that no microliter of an expensive sample is lost, a wide range of sample volumes (0.1 to 1000 µL, area application for larger volumes) is usable for analysis, and the sample solution can simply be concentrated by solvent evaporation during spray-on application.

Fig. 2

Source: Modified from Anal Chim Acta 1180 (2021) 338644 with permission from Elsevier

Dereplication strategy through prioritization of important compounds via effect-directed detection (A); three key elements of the straightforward hyphenation and advantageous features (B).

Super-hyphenation workflow

Liquid samples are applied directly or only diluted because the HPTLC separation is matrix-robust, whereas solid samples need to be brought into a liquid form and thus need to be extracted. For example, an extraction solvent mixture consisting of water–ethanol–ethyl acetate 1:1:1 (V/V/V) is a good compromise for extracting compounds of differing polarities in one step (Morlock et al. 2021a). All bioactive compounds observed with the different individual solvents studied (covering almost the whole polarity range) were also extracted with this mixture. Many samples (as crude or raw as possible) are applied in parallel and simultaneously separated on the normal phase (NP)-HPTLC plate. Multi-imaging via ultraviolet (UV), visible (Vis), and fluorescence detection (FLD) follows, which displays the result in the form of an image (chromatogram). Image evaluation and comparison of the sample compounds separated side by side are easy and intuitive to perform. Favorably, an image is understandable across all languages. Then the planar bioassay is applied on the surface for effect-directed detection. Selected active zones of interest are eluted from the bioautogram and guided by heart-cut first onto an orthogonal reversed phase (RP)-precolumn used as an analyte trap for desalting (Kirchert and Morlock 2020; Schreiner and Morlock 2021). After the removal of the bioassay medium salts, the valve is switched to elute the trapped molecule(s) to the main endcapped RP-HPLC column for orthogonal separation, followed by DAD–HRMS detection.

The separation efficiency of HPTLC is optimal for the heart-cut zone or fraction transfer to the second orthogonal HPLC dimension. It is advantageous that the organic solvents are readily evaporated before the bioassay detection and that the separated compounds are directly accessible for the assay, without the difficulties observed for in vitro assays, such as incompatibility of sample solvents or insolubility of compounds with regard to the polar assay medium. The proposed workflow is efficient and suited for routine use. The steps are automated, and only a few minutes are required for each zone transfer. The parallel analysis of 20 samples reduces the consumable costs per sample to 0.5−0.9 Euro and the analysis time to 5−25 min per sample, depending on the selected planar assay (some assays have a longer incubation time, and multiplex assays need the additional application of the stripes). The targeted identification of selected bioactive zones then follows, which is also highly efficient, as already mentioned. The operating time of the expensive HRMS system is reduced to a minimum. The requirements for data storage and data handling are comparatively small. The aforementioned biological prioritization of compounds results in less contamination of the expensive system because the non-active sample portion and matrix, which are not important and thus inefficient to record, are not recorded. The versatile information provided by the 12 D hyphenation on an active compound zone supports its straightforward and sound assignment to a molecular formula (Fig. 1).
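The per-sample figures quoted above follow from sharing one plate among the samples run in parallel. The short sketch below illustrates this arithmetic. The per-plate totals are not stated in the text; they are hypothetical values back-calculated from the quoted per-sample ranges, and the function name is chosen for illustration only.

```python
def per_sample(total_per_plate: float, samples_per_plate: int) -> float:
    """Share a per-plate total (cost or time) equally among parallel samples."""
    if samples_per_plate <= 0:
        raise ValueError("samples_per_plate must be positive")
    return total_per_plate / samples_per_plate


SAMPLES_PER_PLATE = 20  # parallel samples, as stated in the text

# Assumption: per-plate totals back-calculated from the per-sample ranges
# given in the text (0.5-0.9 Euro and 5-25 min per sample).
PLATE_COST_EUR = (10.0, 18.0)
PLATE_TIME_MIN = (100.0, 500.0)

cost_range = tuple(per_sample(c, SAMPLES_PER_PLATE) for c in PLATE_COST_EUR)
time_range = tuple(per_sample(t, SAMPLES_PER_PLATE) for t in PLATE_TIME_MIN)

print("Cost per sample (Euro):", cost_range)
print("Time per sample (min):", time_range)
```

Running fewer samples on the same plate raises both per-sample values proportionally, which is why the parallel format matters for routine use.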

Power of planar multiplex bioassays

An image is worth a thousand words, and an effect image even more so. This fact cannot be repeated often enough, as the power of planar assays is still little known and overlooked (Morlock 2021). The bioautogram illustrates mechanistic data obtained in reality through real experiments, and not via computational virtual prognoses or activity prediction. Multiplex bioassays on the surface of the chromatographic separation enable the differentiation of opposing signals/effects of individual compounds in complex natural sample mixtures analyzed in parallel (Ronzheimer et al. 2022). They provide a deeper understanding of prevailing effects and potential mechanistic interactions (Fig. 1B). This is a considerable advantage compared to microtiter plate assays. The latter in vitro assays provide a sum parameter for complex samples, in which opposing signals/effects can cancel each other out so that wrong conclusions may be drawn. The reason is that, with everything together in the same well, antagonists or false-positive antagonists hinder the detection of agonists, because only the sum of opposing signals is measured (e.g., a minus signal and a plus signal add up to a zero signal, so the sample is judged as safe although it is not).
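The cancellation of opposing signals in a sum parameter can be illustrated with a toy calculation. The sketch below is a simplified model, not a description of any real assay readout: the compound names and signed signal values are hypothetical, and real responses need not add linearly.

```python
from typing import Dict


def well_signal(effects: Dict[str, float]) -> float:
    """Microtiter-plate view: one summed signal for the whole mixture."""
    return sum(effects.values())


def zone_signals(effects: Dict[str, float]) -> Dict[str, float]:
    """Planar view: each separated compound zone is read individually."""
    return {name: signal for name, signal in effects.items() if signal != 0.0}


# Hypothetical mixture: signed signals in arbitrary units
# (plus = agonistic response, minus = antagonistic response).
sample = {
    "compound A (agonist)": +1.0,
    "compound B (antagonist)": -1.0,
    "compound C (inactive)": 0.0,
}

print("Sum parameter in the well:", well_signal(sample))
print("Individual zones on the plate:", zone_signals(sample))
```

In this toy case the well reports a zero signal, whereas the separated zones still show one agonistic and one antagonistic compound.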

As an example, the power of planar multiplex bioassays is explained for detecting compounds with hormonal effects. The workflow is used analogously for the human estrogen receptor (Ronzheimer et al. 2022) or androgen receptor (Schreiner et al. 2022). On the chromatogram containing the separated samples, a stripe of a well-known agonist is sprayed along each separated sample track (needed for the detection of antagonists in the later bioautogram). In addition, another stripe of the end-product chemical formed by the intended enzyme–substrate reaction is sprayed along each separated sample track (needed for the detection of false-positives and thus for the verification of true antagonists in the later bioautogram). Then the genetically modified yeast cells with the incorporated human receptor and reporter gene encoding for β-D-galactosidase are applied on the plate. After incubation (hormonally active sample compounds bind to the responsive element in the yeast cell, which finally leads to the release of galactosidase), the fluorogenic substrate 4-methylumbelliferyl-β-D-galactoside is applied and the plate is incubated again (now the released galactosidase cleaves the substrate to the blue fluorescent 4-methylumbelliferone). In the resulting bioautogram, an agonist is detected as usual as a blue fluorescent band (Fig. 3A), whereas an antagonist causes a fluorescence reduction of the applied agonist stripe (Fig. 1B). When the fluorescence of the end-product chemical stripe is not reduced at the same band position, the antagonist is verified to be true. In contrast, a false-positive antagonist would also show a fluorescence reduction on the end-product chemical stripe due to physico-chemical fluorescence reduction (Fig. 1B). A synergistic effect is detected by comparing the fluorescence intensity of the agonist stripe on the sample track versus the solvent blank track. If the intensity is enhanced on the sample stripe and an agonist band is detected too, it can be an additive effect (1 + 1 = 2). If the intensity is enhanced, but no agonist band is observed, it is a synergistic effect (0 + 1 > 1), i.e. a non-active compound enhances the agonist effect.
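The decision rules described above can be summarized as a small lookup. The sketch below is only one possible encoding of these rules: the function and enum names are hypothetical, the three inputs are assumed to be already judged per zone by the analyst, and readout combinations that the text does not describe are returned as unclassified.

```python
from enum import Enum


class StripeChange(Enum):
    """Fluorescence of the agonist stripe relative to the solvent blank track."""
    REDUCED = "reduced"
    UNCHANGED = "unchanged"
    ENHANCED = "enhanced"


def classify_zone(agonist_band: bool,
                  agonist_stripe: StripeChange,
                  end_product_reduced: bool) -> str:
    """Map the readouts at one band position to the effect named in the text."""
    if agonist_stripe is StripeChange.REDUCED:
        if agonist_band:
            # Combination not described in the text.
            return "unclassified (mixed readout)"
        if end_product_reduced:
            # Physico-chemical fluorescence reduction, not a receptor effect.
            return "false-positive antagonist"
        return "true antagonist"
    if agonist_stripe is StripeChange.ENHANCED:
        # 1 + 1 = 2 if the zone is itself an agonist, otherwise 0 + 1 > 1.
        return "possibly additive" if agonist_band else "synergistic"
    return "agonist" if agonist_band else "no effect detected"


print(classify_zone(True, StripeChange.UNCHANGED, False))
print(classify_zone(False, StripeChange.REDUCED, False))
print(classify_zone(False, StripeChange.REDUCED, True))
print(classify_zone(False, StripeChange.ENHANCED, False))
```

In practice, each of the three inputs would come from comparing zone intensities against the solvent blank track, with thresholds that have to be set for the individual assay.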

Fig. 3

Estrogenic bioassay profiling of wine samples (detected at FLD 366 nm before and after the bioassay) reveals various estrogenic compound zones as blue fluorescent zones due to the enzyme–substrate reaction end-product 4-methylumbelliferone formed upon receptor binding (A); examples of different types of planar assays already applied on the adsorbent surface, as illustrated for the genetically modified Saccharomyces cerevisiae (B). Source: modified from Schreiner and Morlock (2022, in press), and Anal Chim Acta 1180 (2021) 338644, both with permission from Elsevier

It was surprising that many of the 68 studied botanicals showed synergistic effects for the human estrogen receptor (Ronzheimer et al. 2022) but not for the androgen receptor (Schreiner et al. 2022). Changes in the fluorescence signal (enhancement or reduction) observed in the bioautogram can also be measured (video)densitometrically, which provides not only the detection and differentiation but also the quantification of individual effects (Fig. 1C). The concept of a planar multiplex bioassay can generally be applied and transferred to any other assay. Unfortunately, the terminology is often not used properly, although it is important for scientific communication (Morlock 2018). The umbrella term effect-directed assays (EDA) includes biological, biochemical, and microchemical assays, whereby the latter must be effect-directed (Fig. 3B). The term biological assay (bioassay) implies the application of cells or at least cell organelles. The term biochemical assay involves the use of enzymes. The term microchemical effect-directed assay, such as a radical scavenging assay, means the use of a chemical compound that indicates a specific effect. If a sample component (and not the released enzyme) reacts with the substrate (Rhee et al. 2003), the substrate should be replaced by another one (Morlock et al. 2021b).
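To make the idea of quantification via a dose response curve (Fig. 1C) more concrete, the sketch below reads an equivalent dose from a measured densitometric signal. The four-parameter logistic model is an assumption chosen for illustration, since the curve model is not specified here; all parameter values and units are hypothetical.

```python
def four_pl(dose: float, bottom: float, top: float,
            ec50: float, hill: float) -> float:
    """Four-parameter logistic response for a dose > 0 (assumed model)."""
    return bottom + (top - bottom) / (1.0 + (ec50 / dose) ** hill)


def inverse_four_pl(signal: float, bottom: float, top: float,
                    ec50: float, hill: float) -> float:
    """Dose that gives the measured signal; valid only between the plateaus."""
    if not (min(bottom, top) < signal < max(bottom, top)):
        raise ValueError("signal lies outside the range of the curve")
    ratio = (top - bottom) / (signal - bottom)
    return ec50 / (ratio - 1.0) ** (1.0 / hill)


# Hypothetical curve parameters (signal in arbitrary units, dose in pg/zone).
params = dict(bottom=50.0, top=1050.0, ec50=2.0, hill=1.5)

zone_signal = four_pl(4.0, **params)        # simulated reading of a zone
equivalent_dose = inverse_four_pl(zone_signal, **params)
print(f"signal {zone_signal:.1f} -> equivalent dose {equivalent_dose:.2f} pg/zone")
```

With real data, the curve parameters would first have to be fitted to the reference standard applied at several doses on the same plate.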

New adherent human cell lines as on-surface assays

The latest development of planar adherent human cell line assays is of interest for the whole medical and toxicological field. As first examples, the human osteosarcoma (cancer) U2OS cell line (Cytotox CALUX) and the peroxisome proliferator-activated receptor γ (PPARγ CALUX) cell line (Klingelhöfer et al. 2021) as well as a transfected human embryonic kidney cell line (HEK 293 T-CMV-ELuc) and HeLa (cervical carcinoma) cells (Mügge and Morlock 2022) were proven to be viable on the porous adsorbent (Fig. 3B). The detection via an incorporated luciferase-expressing reporter gene was straightforward. The bioluminescence detection was shown in two ways, i.e. not only the bioluminescence reduction via a cytotoxic compound effect on the cells that continuously produce luciferase but also the bioluminescence production via the expression of luciferase triggered by the binding of ligands in the PPARγ responsive element (Klingelhöfer et al. 2021). The obtained bioprofiles pointed to individual active compounds with an impact on the selected adherent cell line. The further characterization of the active zones was successful (Klingelhöfer et al. 2021; Mügge and Morlock 2022) due to the mentioned super-hyphenation potential.

Added value through on-surface metabolization

Both the gastrointestinal and liver metabolism can activate or inactivate molecules, which may generate beneficial or harmful compounds. To study such mechanisms of action, a simulated metabolization can be applied on the same adsorbent surface. Recently, a harmonized in vitro protocol (Minekus et al. 2014) was transferred to a planar nanomolar in situ protocol, termed nanoGIT+active. The functioning of the miniaturized all-in-one system was demonstrated for basic food samples and plant extracts using amylolytic, proteolytic, lipolytic, and S9 liver enzymes (Morlock et al. 2021c). The S9 mixture is an extract of the enzyme-rich tissue from the mammalian liver used for metabolic simulation of the activation or inactivation of molecules, mainly through the phase 1 cytochrome P450 pathway. Thus, the on-surface metabolization in combination with immediate separation and bioassay detection enables a side-by-side comparison of metabolized and non-metabolized samples and thus the study of changes in active compounds due to digestion and de-/toxification (Azadniya et al. 2020).

2LabsToGo system for citizen science

The chemistry and biology laboratories are combined and miniaturized in the latest 2LabsToGo system (Fig. 4) (Sing et al. 2022). It was developed as an open-source, fully solvent-resistant all-in-one system for wider use, including Citizen Science. The software code (https://github.com/OfficeChromatography/OC-Manager3), firmware, 3D print files for self-construction (https://github.com/OfficeChromatography/OCLab3), and bill of materials are freely available. The system combines sample application, separation, sample derivatization, plate heating, bioassay application, bioassay incubation, biological effect detection, and multi-imaging (UV/Vis/FLD and bioluminescence). In particular, biological effect detection is key for prioritizing compounds among the many thousands of compounds in complex samples (Fig. 2). As an example, the screening of water samples (50 µL each, directly applied without sample clean-up) via the Aliivibrio fischeri bioassay, which is frequently used in the environmental field, showed that drinking water typically exhibited no biologically active compounds in the bioautogram, whereas landfill leachate samples or biogas plant water did (Fig. 4, observed as dark zones). The dark zone in the rainwater sample collected unprotected in an open pot overnight (with leaves and insects fallen into it) was surprising. The comparison of patterns or profiles is easy to understand globally across all languages. The lean 2LabsToGo system is compact (26 cm × 31 cm × 34 cm), lightweight (6.8 kg), and affordable (€ 1717 bill of materials). It also supports instrumental eco-friendliness, minimalism in material consumption, and method greenness.
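How dark zones in a bioluminescence bioautogram could be flagged from a track profile is sketched below. This is a minimal illustration and not the evaluation used by the 2LabsToGo software: the function name, the median baseline, the 50% threshold, and the intensity values are all assumptions.

```python
from statistics import median
from typing import List, Sequence, Tuple


def find_dark_zones(profile: Sequence[float],
                    threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Return (start, end) index ranges where the luminescence drops
    below threshold * median of the track profile."""
    cutoff = threshold * median(profile)
    zones: List[Tuple[int, int]] = []
    start = None
    for index, value in enumerate(profile):
        if value < cutoff:
            if start is None:
                start = index          # a dark zone begins here
        elif start is not None:
            zones.append((start, index - 1))
            start = None
    if start is not None:              # zone runs to the end of the track
        zones.append((start, len(profile) - 1))
    return zones


# Hypothetical luminescence intensities along two tracks (arbitrary units).
clean_track = [100, 99, 101, 100, 98, 102]
contaminated_track = [100, 98, 101, 40, 35, 99, 100, 97, 20, 102]

print("clean:", find_dark_zones(clean_track))
print("contaminated:", find_dark_zones(contaminated_track))
```

A median baseline is used so that a few dark zones do not pull the reference level down, but it would fail for a track that is dark over most of its length.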

Fig. 4

Source: Modified from Anal Chem 10.1021/acs.analchem.2c02339 with permission from ACS

Miniaturizing the chemistry and biology laboratory: solvent-resistant open-source all-in-one 2LabsToGo system suited for Citizen Science, consisting of syringe pump (1), valve − nozzle system (2), electronics (3), miniature cabinet with camera and LEDs (4), multi-functional heatable plate holder (5) and motors for positioning (6) as well as bioluminescent Aliivibrio fischeri bioautogram of nine different water samples (applied directly), differentiating clean versus contaminated ones (dark bioactive zones I − III).

Chances and limitations for dereplication

Much remains to be discovered. To date, the mode of action of complex natural samples, such as microbial cultures (Kruse et al. 2021a, b), traditional and herbal medicines, or phytomedicines (Liu et al. 2020), is poorly understood. The same applies to food or feed enriched with functional ingredients, which might be able to gently modulate our well-being when consumed on a regular (daily) basis (Morlock et al. 2021b). It is increasingly important to also detect xenogenic compounds that are active and often overlooked in natural samples (Morlock et al. 2021a). This highlights the need to combine two different disciplines for an efficient non-target strategy and straightforward dereplication technique that accelerates knowledge gain with a focus on the important bioactive compounds. The proposed matrix-robust super-hyphenation focuses on the relevant sample part (bioactive hotspots), and thus underpins and eases the process of decision-making (Fig. 2). This prioritization strategy is a disruptive alternative to current dereplication strategies. However, it can also complement safety and risk assessment and provide in-depth insights into complex natural sample mixtures (Schreiner et al. 2021).

The ability to detect active compounds depends on the response intensity of the planar bioassay (Fig. 3). For higher sample volumes applied and a sensitive bioassay response, such as for the planar estrogen screen bioassay (Klingelhöfer and Morlock 2015) or planar genotoxicity bioassay (Meyer et al. 2021), the ultra-trace level (ng per L or kg) can be reached directly without a sample concentration step. Although not all bioassays are equally sensitive, in all comparisons of planar bioassays with modern microtiter plate bioassays, the planar bioassay was equally sensitive or even more sensitive in response (Azadniya et al. 2020; Klingelhöfer et al. 2021; Meyer et al. 2021). Additional advantages are matrix-robustness, separation of mixtures, effect differentiation, and straightforward heart-cut transfer to further hyphenation dimensions, which support the assignment of molecular formulae (Klingelhöfer et al. 2020; Ronzheimer et al. 2022; Schreiner et al. 2021; Schreiner et al. 2022).
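The link between applied sample volume and reachable concentration level is a simple unit conversion, sketched below. The detection limit of 0.5 pg per zone is a hypothetical value chosen for illustration and is not taken from the cited studies; the volumes lie within the application range of 0.1 to 1000 µL mentioned earlier.

```python
def min_detectable_conc_ng_per_l(lod_pg_per_zone: float,
                                 applied_volume_ul: float) -> float:
    """Lowest sample concentration (ng/L) that still places the assay's
    detection limit (pg per zone) on the plate.

    Unit conversion: 1 pg/µL = 1000 ng/L.
    """
    if applied_volume_ul <= 0:
        raise ValueError("applied_volume_ul must be positive")
    return lod_pg_per_zone * 1000.0 / applied_volume_ul


LOD_PG_PER_ZONE = 0.5  # assumption: hypothetical bioassay detection limit

for volume_ul in (1.0, 50.0, 1000.0):
    conc = min_detectable_conc_ng_per_l(LOD_PG_PER_ZONE, volume_ul)
    print(f"{volume_ul:7.1f} µL applied -> {conc:g} ng/L detectable")
```

Under these assumptions, applying 1000 µL instead of 1 µL lowers the reachable concentration by three orders of magnitude without any separate concentration step.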

A clear limitation of the super-hyphenation is that volatile minor constituents (not retained by adsorption energy) may be lost, or that constituents may be altered by (photo)oxidation due to the open planar format, although most sample constituents remain safely stored on the adsorbent. Photooxidation is judged to be less critical, as most processed products are hardly shielded from this influence either. Nevertheless, a speedy (uninterrupted) execution of the work steps is recommended. The super-hyphenation may detect a biologically active compound zone, but the compound need not necessarily be ionizable, even with different ionization sources and instrumental settings, which hampers straightforward identification. In addition, a bioactive compound detected by a very sensitive bioassay can be present in very low amounts (Klingelhöfer and Morlock 2015), generally too low for successful HRMS recording. However, the important bioactive compound is at least recognized to be present as such, in contrast to dereplication strategies based solely on HPLC−HRMS. The detection of synergy (Fig. 1B) is not comprehensive due to the abundance of possible combinations. Synergistic effects have already been discovered in many plant extracts by the demonstrated multiplex bioassay, but still need to be studied in depth to understand the underlying mechanism. Sophisticated supervised and unsupervised image evaluation offers great potential (Fichou et al. 2016, 2018; Fichou and Morlock 2018), but advanced software for planar chromatographic hyphenations is still underdeveloped and needs to be improved in the future. Emerging planar technologies open new avenues with new possibilities and potentials. Finally, it must be recognized that planar chromatography separates compounds but easily combines disciplines due to its open planar format, which makes the technique powerful.