Introduction

Medicinal plants play significant roles in public health care (Avoseh et al. 2015). With an increasing interest in the use of herbal drugs, 60% of the world population is estimated to rely on medicinal plants as therapeutic agents (Balakrishnan et al. 2014). Among medicinal herbs with an outstanding record in ayurvedic therapy are Cymbopogon (C.) species; for example, C. citratus (lemongrass) is widely used in folk medicine for the treatment of inflammation, digestive disorders, fever, and diabetes (Shah et al. 2011).

The genus Cymbopogon (Poaceae), comprising 54 species, is dispersed in the tropical and subtropical regions of the world of which C. citratus, C. pendulus, and C. flexuosus are widely distributed (Avoseh et al. 2015). The genus name originated from the Greek word “kymbe-pogon” which means boat-beard referring to its many-awned inflorescences and boat-shaped spathes (Shah et al. 2011).

Cymbopogon species are mostly recognized for their enriched essential oil that has been exploited in perfumes, cosmetics, food flavors, and pharmaceuticals (Khanuja et al. 2005). Among the most important aromatic species are C. citratus (West Indian lemongrass), C. flexuosus (East Indian lemongrass), C. martini (palmarosa), and C. nardus (citronella grass) (Akhila 2009). The essential oil contains varying volatile constituents among the different Cymbopogon chemotypes, e.g., neral, geranial, citronellol, geraniol, and piperitone (Avoseh et al. 2015). Moreover, phytochemical investigation of different species revealed the presence of terpenoids, alkaloids, phenolic acids, tannins, and carotenoids, and the genus Cymbopogon was reported to be a rich source of C-glycosyl flavonoids (Asaolu et al. 2009; Cheel et al. 2005; Rizk et al. 1995).

Most phytochemical reports on Cymbopogon have focused on C. citratus, with less emphasis on other species within the genus. Most studies have also addressed essential oil composition (Cerceau et al. 2020), with limited research on other secondary bioactives likely to mediate the biological effects of the genus. Numerous pharmacological activities have been reported for the genus, including anti-inflammatory, antitumor, antioxidant, antimicrobial, cardio-protective, and antirheumatic effects. Effects attributed to its essential oil include antiprotozoal, insecticidal, antidiabetic, antitumor, and anti-inflammatory activities (Avoseh et al. 2015; Ekpenyong et al. 2015).

To better correlate these biological effects with the metabolite profile, a holistic approach is warranted to reveal metabolite composition and levels for further standardization and quality control measures.

For functional food analysis, several modern approaches are increasingly employed, including large-scale metabolomics aimed at the detailed characterization of metabolites within food specimens. Metabolomics is considered a fast and reproducible method to gain insight into the metabolome of different biological materials (Chen et al. 2013; Fiehn 2002). It is extensively applied for exploring the effect of growth stage, processing method, seasonal variation, or storage conditions on the plant metabolome; detecting biomarkers of cultivars from different geographical origins; or verifying the quality and safety of new food products (Huo et al. 2017; Klockmann et al. 2017; Patiño-Rodríguez et al. 2018).

Mass spectrometry and nuclear magnetic resonance (NMR) are the most common analytical techniques utilized in metabolomics. NMR spectroscopy offers a powerful tool for rapid, simple, reproducible, and non-destructive measurements, and unlike other techniques, it shows high accuracy and selectivity (Girelli et al. 2018; Yuan et al. 2017). It has been broadly used in metabolomic analysis of ginseng roots, Passiflora leaves, and date palm by-products (Farag et al. 2016; Otify et al. 2019; Yang et al. 2012). Moreover, NMR has been satisfactorily utilized for quantification in different food products without the need for standard response measurements (Farag et al. 2018). Despite the high potential of this technique, NMR-based metabolomic investigations on Cymbopogon species are still scarce (Abdelsalam et al. 2017).

Our goal is to investigate variations in primary and secondary metabolites among different Cymbopogon species in terms of genotype to provide the first insight into chemotaxonomic relatedness. To achieve this aim, metabolite profiling and fingerprinting using an NMR approach were employed for the first-time analysis of the official C. citratus shoots from 2 geographical origins as well as 4 other Cymbopogon species (a total of 13 samples, Table 1).

Table 1 Origin of Cymbopogon shoot samples used in this study

1H-NMR was further utilized to determine the absolute concentrations of the identified metabolites in sample extracts for future standardization purposes. The main advantage of NMR quantification is that the signal integral of a compound is directly proportional to its molar concentration, making the technique well suited for quality control or adulteration detection.

Due to the complexity of the acquired data, represented by a large number of specimens and variables (i.e., chemical shifts, as is typical in NMR metabolomics), multivariate statistical analyses are usually performed to organize NMR datasets. The combination of multivariate analysis and NMR provides a more powerful technique for comparing the chemical profiles of different Cymbopogon species.

Such information is deemed to be of interest for the quality evaluation of Cymbopogon in the future and to determine metabolite heterogeneity among its different accessions. Additionally, quantification of the major metabolites detected using NMR is presented for standardization of its extracts.

Materials and Methods

Plant Material and Chemicals

Shoots from 5 different species (C. citratus, C. nardus, C. martini, C. flexuosus, and C. procerus) within genus Cymbopogon (13 samples) were collected from 2 geographical origins, Germany and Egypt (Table 1). Plant names were checked against The Plant List (2013), http://www.theplantlist.org/, accessed on May 20, 2020. The collected material was kept at −20 °C until further analysis.

Hexamethyldisiloxane (HMDS) and methanol-d4 (99.80% D) were supplied from Deutero GmbH (Kastellaun, Germany). All other chemicals were obtained from Sigma Aldrich (St. Louis, MO, USA). For NMR quantification and calibration of chemical shift, HMDS was added to a final concentration of 0.94 mM.

Sample Preparation for NMR Analysis

Sample extraction followed the protocol described in Farag et al. (2015). Briefly, 130 mg of dried shoot powder was homogenized with 5 ml 100% methanol using a Turrax mixer (11,000 RPM) for five 20-s periods, with 1-min interval to prevent warming. Extracts were then vigorously vortexed and centrifuged (3000 g for 30 min) to remove plant debris. Three milliliters were then aliquoted by a syringe, and the solvent was evaporated under nitrogen to dryness. Dried extracts were re-suspended with 700 µl 100% methanol-d4 containing 0.94 mM hexamethyldisiloxane (HMDS) as an internal chemical shift NMR standard. The supernatant was centrifuged (13,000 g for 1 min) and transferred to a 5-mm NMR tube. All 1H-NMR spectra for multivariate data analysis were obtained successively within a 48-h time interval with samples prepared directly before data acquisition. Repeated control experiments after 48 h showed no additional variation. Three biological replicates for each specimen were extracted and analyzed in parallel under the same conditions to assess for biological variance.

NMR Data Acquisition

All 1H-NMR spectra were recorded on an Agilent VNMRS 600 NMR spectrometer operating at a proton NMR frequency of 599.83 MHz equipped with a 5-mm inverse detection cryoprobe, digital resolution 0.367 Hz/point (32 k complex data points), pulse width (pw) = 2.1 μs (30°), relaxation delay = 18 s, acquisition time = 2.0 s, number of transients = 160, and temperature = 297 K. Zero filling up to 128 k and an exponential window function with lb = 0.4 were used prior to Fourier transformation. 2D-NMR spectra were recorded at a frequency of 599.83 MHz using standard CHEMPACK 6.2 pulse sequences (COSY, HSQC, HMBC) implemented in standard VNMRJ 4.0A spectrometer software. The HSQC experiment was optimized for 1JCH = 146 Hz with DEPT-like editing and 13C-decoupling during acquisition time. The HMBC experiment was optimized for a long-range coupling of 8 Hz; a 2-step 1JCH filter was used (130–165 Hz).

NMR Data Processing and Multivariate Data Analyses

Spectra were imported to ACD/NMR Manager lab version 10.0 software (Toronto, Canada) and automatically Fourier transformed to ESP files. The spectra were referenced to internal HMDS at 0.062 ppm for 1H-NMR and 1.96 ppm for 13C-NMR. Spectral intensities were reduced to integrated regions, referred to as buckets, of equal width (0.04 ppm) for the full spectral (δ 0.4–11.0 ppm) and aromatic (δ 5.5–11.0 ppm) regions. The spectral regions corresponding to the residual solvent signals, δ 4.90–4.80 (water) and δ 3.33–3.28 ppm (methanol), were removed before multivariate analyses. This binning allowed evaluation of the absolute quantification of the identified metabolites. The table of buckets was imported into SIMCA-P version 13.0 (Umetrics, Umeå, Sweden), and hierarchical cluster analysis (HCA), principal component analysis (PCA), and orthogonal projection to latent structure-discriminant analysis (OPLS-DA) were performed with all variables mean centered and Pareto scaled. A seven-fold cross-validation method was applied for selecting the optimum number of principal components required for modeling the data, and the distance to the model (DModX) test was used for verifying the presence of outliers. The developed OPLS-DA models were validated using permutation tests, the receiver operating characteristic (ROC) curve, and CV-ANOVA (ANOVA of cross-validated residuals). Metabolite markers were then recognized by analyzing the S-plot, which was described by covariance (p) and correlation (pcor), in addition to the variable influence in the projection (VIP).
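The bucketing and scaling steps described above can be illustrated with a minimal sketch. It assumes a spectrum supplied as parallel lists of chemical shifts and intensities; the function names `bucket_spectrum` and `pareto_scale` are hypothetical, and the code is not the ACD/NMR Manager or SIMCA-P implementation. Points falling inside the solvent windows are simply dropped, so a bucket that straddles a window edge keeps only its remaining points.

```python
import math

def bucket_spectrum(ppm, intensity, lo=0.4, hi=11.0, width=0.04,
                    exclude=((4.80, 4.90), (3.28, 3.33))):
    """Sum intensities into equal-width buckets, skipping solvent regions."""
    n = int(round((hi - lo) / width))          # 265 buckets for 0.4-11.0 ppm
    sums = [0.0] * n
    for x, y in zip(ppm, intensity):
        if not (lo <= x < hi):
            continue                           # outside the analysed window
        if any(a <= x <= b for a, b in exclude):
            continue                           # residual water / methanol
        idx = min(int((x - lo) / width), n - 1)
        sums[idx] += y
    centers = [lo + (i + 0.5) * width for i in range(n)]
    return centers, sums

def pareto_scale(rows):
    """Mean-center each column, then divide by the square root of its SD."""
    n = len(rows)
    scaled_cols = []
    for col in zip(*rows):
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        denom = math.sqrt(sd) if sd > 0 else 1.0   # constant columns stay 0
        scaled_cols.append([(v - mean) / denom for v in col])
    return [list(r) for r in zip(*scaled_cols)]

# Hypothetical toy input: five points, two of them in solvent regions.
centers, buckets = bucket_spectrum([0.41, 0.43, 0.45, 4.85, 3.30],
                                   [1.0, 2.0, 3.0, 100.0, 50.0])
scaled = pareto_scale([[1.0, 10.0], [3.0, 10.0]])
```

Pareto scaling divides by the square root of the standard deviation rather than the standard deviation itself, which reduces the dominance of intense signals while retaining more of the original data structure than unit-variance scaling.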

Quantification of Major Metabolites via 1H-NMR

For the quantification of metabolites listed in Table S2 using NMR spectroscopy, the peak area of selected proton signals belonging to the target compounds and the peak area of the internal standard (HMDS) were integrated manually for all the samples. The following equation was applied for the calculations:

$$\mathrm{m_{T}}= \mathrm{M_{T}}\times \frac{\mathrm{I}_{\mathrm{T}}}{\mathrm{I}_{\mathrm{St}}} \times \frac{{\mathrm{X}}_{\mathrm{St}}}{{\mathrm{X}}_{\mathrm{T}}} \times \mathrm{C_{St}} \times \mathrm{V_{St}}$$
where:

$\mathrm{m_{T}}$: Mass of the target compound in the solution used for 1H-NMR measurement (μg).

$\mathrm{M_{T}}$: Molecular weight of the target compound (g/mol).

$\mathrm{I_{T}}$: Relative integral value of the 1H-NMR signal of the target compound.

$\mathrm{I_{St}}$: Relative integral value of the 1H-NMR signal of the standard compound.

$\mathrm{X_{St}}$: Number of protons belonging to the 1H-NMR signal of the standard compound.

$\mathrm{X_{T}}$: Number of protons belonging to the 1H-NMR signal of the target compound.

$\mathrm{C_{St}}$: Concentration of the standard compound in the solution used for 1H-NMR measurement (mmol/l).

$\mathrm{V_{St}}$: Volume of solution used for 1H-NMR measurement (ml).
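As a worked illustration of the equation above, the following sketch evaluates it for one signal. The integral ratio is a hypothetical value, not a measured one. The example assumes the betaine N-CH3 singlet (9 protons, molecular weight 117.15 g/mol) integrated against the HMDS singlet (18 equivalent protons) at 0.94 mM in 0.7 ml. With the units given above (g/mol × mmol/l × ml), the result comes out directly in μg.

```python
def target_mass_ug(M_T, I_T, I_St, X_St, X_T, C_St_mM, V_St_ml):
    """Mass of the target compound (ug) from relative 1H-NMR integrals.

    g/mol * mmol/l * ml = 1e-6 g = ug, so no extra unit factor is needed.
    """
    return M_T * (I_T / I_St) * (X_St / X_T) * C_St_mM * V_St_ml

# Hypothetical integrals: target signal is half the area of the HMDS signal.
m_betaine = target_mass_ug(
    M_T=117.15,    # betaine, g/mol
    I_T=0.5,       # illustrative integral of the N-CH3 singlet
    I_St=1.0,      # HMDS integral set to 1
    X_St=18,       # HMDS: six equivalent methyl groups
    X_T=9,         # betaine: three equivalent N-methyl groups
    C_St_mM=0.94,  # HMDS concentration used in this study
    V_St_ml=0.7,   # volume of methanol-d4 used in this study
)
print(f"{m_betaine:.1f} ug betaine in the NMR sample")
```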

Statistical Analysis

NMR quantification data were analyzed using the Co-Stat version 8 software (Monterey, CA, USA). Data are expressed as mean ± SD of the groups. One-way ANOVA followed by the Student–Newman–Keuls test was used to determine significant differences among Cymbopogon sample groups at the 95% confidence level. Differences were considered statistically significant when p ≤ 0.05.
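For orientation, the F statistic underlying a one-way ANOVA can be sketched as follows. This is a generic textbook calculation, not the Co-Stat procedure; the function name and the example values are hypothetical, and the sketch covers neither the p value nor the Student–Newman–Keuls post hoc comparison.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of replicates around their own group mean
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical triplicate values (ug/mg) for three accessions.
f_stat, df_b, df_w = one_way_anova_f([[1.0, 2.0, 3.0],
                                      [2.0, 3.0, 4.0],
                                      [5.0, 6.0, 7.0]])
```

The F statistic is then compared against the F distribution with (df_between, df_within) degrees of freedom to obtain the p value.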

Results and Discussion

Analysis of Cymbopogon Extract NMR Spectra

The present study provides the first NMR-based metabolomic profiling and fingerprinting of Cymbopogon species. Metabolite identity was assigned in accordance with the literature (Abdelsalam et al. 2017; Campos et al. 2014; Chen and Beattie 2008; Francisco et al. 2014; Goad and Akihisa 1997; Mannina et al. 2012; Ragasa et al. 2008) and by comparison with publicly accessible databases, such as the Human Metabolome Database (HMDB; http://www.hmdb.ca/) and the Biological Magnetic Resonance Data Bank (BMRB; http://www.bmrb.wisc.edu). Signal assignments were also verified using 2D-NMR experiments including 1H-1H COSY, 1H-13C HSQC, and 1H-13C HMBC (Figs. S1–S14). Twenty-four metabolites were identified and listed along with their chemical shifts and distribution among the different Cymbopogon species in Table S1, and their corresponding structures are shown in Fig. 1. Representative 1H-NMR spectra from C. citratus and C. flexuosus methanol extracts are shown in Fig. 2. Two main regions appeared in the 1H-NMR spectrum: an up-field region (δ 0.5 to 5.5 ppm) with high-intensity signals belonging mostly to primary metabolites (Fig. 2a) and a down-field region (δ 5.5 to 9.5 ppm) ascribed to secondary metabolites (Fig. 2b). The assigned metabolites comprised fatty acids (N1–N3), sugars (N4–N6), amino acids and nitrogenous compounds (N7–N13), organic acids (N14 and N15), sterols and triterpenes (N16 and N17), flavonoids (N18), phenolic acids (N19 and N20), alkaloids (N21), and monoterpenes (N22 and N23).

Fig. 1

Structure of the major primary and secondary metabolites detected in Cymbopogon species methanol extracts. Carbon numbering system for each compound is based on analogy rather than IUPAC rules. Metabolite numbers follow those listed in Table S1 for metabolite identification using 1D- and 2D-NMR

Fig. 2

1H-NMR spectra of (a) C. citratus methanol extract at δ 0.8–6.0 ppm and (b) C. flexuosus at δ 6.0–8.0 ppm showing characteristic signals for primary and secondary metabolites. Peaks assigned in the spectra are labeled as follows: ω-9 fatty acid (N1), ω-6 fatty acid (N2), ω-3 fatty acid (N3), sucrose (N4), α-glucose (N5), β-glucose (N6), asparagine (N7), alanine (N8), valine (N9), threonine (N10), choline (N11), betaine (N12), phenylalanine (N13), (E)-aconitic acid (N14), tartaric acid (N15), cymbopogonol (N16), β-sitosterol (N17), isoorientin (N18), caffeic acid (N19), gallic acid (N20), trigonelline (N21), neral (N22), and geranial (N23). The identities of NMR peaks are listed in Table S1

Fatty Acids

Key indicators of unsaturated fatty acids in 1H-NMR spectra are the signals of olefinic protons at δ 5.30–5.37 ppm (Fig. 2a), which show HSQC cross-peak correlations with 13C signals at δ 128–133 ppm (Table S1). Such unsaturation was further confirmed by the allylic methylenes resonating at δ 2.05–2.10 ppm and correlated with the 13C signals at δ 28.0 ppm (Fig. 2a and Fig. S1). In detail, ω-9 fatty acid (N1) was detected as evident from the terminal methyl signal resonating at δ 0.90 ppm, whereas 2 triplets at δ 2.77 (J = 6.6 Hz) and 2.81 (J = 6.0 Hz) ppm were readily assigned to the bis-allylic methylenes of ω-6 (N2) and ω-3 (N3) fatty acids, respectively (Fig. 2a). All these assignments were further confirmed from 2D-NMR spectral data, viz., COSY, HSQC, and HMBC (Figs. S1 and S2).

Sugars

Sugars were mostly represented by sucrose (N4), α-glucose (N5), and β-glucose (N6) as evident from their corresponding anomeric protons at δ 5.39 (d, J = 3.8 Hz), 5.10 (d, J = 3.7 Hz), and 4.47 (d, J = 7.8 Hz) ppm, respectively (Fig. 2a), and showing HSQC cross-peak correlation at δ 93.9, 94.2, and 98.8 ppm, respectively (Fig. S3).

Amino Acids/Nitrogenous Compounds

A total of 7 metabolites were identified as dominant amino acids and nitrogenous compounds (Table S1) including asparagine (N7), alanine (N8), valine (N9), threonine (N10), choline (N11), betaine (N12), and phenylalanine (N13) (Abdelsalam et al. 2017). In detail, the chemical shifts of the two well-resolved H-3 signals at δ 2.68 (dd, J = 9.5, 17.1 Hz) and 2.94 (dd, J = 3.9, 17.1 Hz) ppm are distinctive of asparagine (N7) (Figs. S4 and S5). Alanine (N8), valine (N9), and threonine (N10) exhibited clear methyl group signals at δ 1.46 (d, J = 7.1 Hz), δ 1.02 and 1.06 (d, J = 7.0 Hz), and δ 1.31 (overlapped) ppm, respectively, and showed further distinct HMBC correlations with carbons at δ 52.2 (C-2) ppm for alanine (N8), δ 30.5 (C-3) and 62.1 (C-2) ppm for valine (N9), and δ 62.2 (C-2) and 67.5 (C-3) ppm for threonine (N10) (Figs. S4 and S5). Choline (N11) and betaine (N12) could be unequivocally detected via their highly intense singlets at δ 3.21 and 3.27 ppm, respectively (Fig. 2a), assigned to their corresponding N-CH3 groups and confirmed using HSQC and HMBC spectra (Figs. S1 and S4). High choline and betaine content is implicated in plant protection (Chen and Beattie 2008). Three unique proton chemical shifts of the phenyl ring at δ 7.28 (H-2′/H-6′), 7.32 (H-3′/H-5′), and 7.34 (H-4′) ppm, showing HSQC cross peaks with 13C signals at δ 129–131 ppm, confirmed phenylalanine (N13), as further supported by long-range HMBC correlations (Figs. S6 and S7).

Organic Acids

The appearance of 2 singlets at δ 6.87 (H-1) and 3.67 (H-3) ppm revealed the presence of (E)-aconitic acid (N14) (Fig. 2); both signals showed long-range HMBC correlations with 13C resonances at δ 139.3 (C-2) and 173.1 and 176.8 (carboxylate carbons) ppm (Fig. S5). Among other organic acids detected was tartaric acid (N15), showing a singlet at δ 4.22 (H-2/H-3) ppm correlated via HMBC with 13C signals at δ 75.2 (C-2/C-3) and 175.8 (carboxylate carbons) ppm (Table S1 and Fig. S5). Tartaric acid contributes to the overall acidic taste well known for Cymbopogon species (Abdelsalam et al. 2017; Zoecklein et al. 1990).

Triterpene and Sterol

A major lanostane triterpene (N16) was detected in all shoots identified from key exocyclic methylene protons at δ 4.76 and 4.80 (each d, J = 1.6 Hz) ppm; an allylic oxy-methine proton at δ 4.23 (br. s) ppm; 5 methyl singlets at δ 1.22, 0.95, 0.90, 0.88, and 0.96 ppm; and 2 isopropyl methyl signals (overlapped) at δ 0.89 and 0.91 ppm (Table S1). These resonances indicated a pentacyclic triterpene nucleus with an oxy-methine and an exocyclic double bond annotated for lemongrass cymbopogonol triterpenoid (Ragasa et al. 2008). Assignments for N16 were further supported by HSQC, while connectivities were confirmed by HMBC correlations, among which was that between exocyclic methylene proton (H-23) at δ 4.80 ppm and oxy-carbon (C-3) at δ 75.8 ppm and quaternary carbon (C-5) at δ 41.4 ppm, thus confirming ring A structure (Figs. S1, S9, and S10). β-Sitosterol (N17) was also detected showing characteristic weak signals at δ 0.71 (H-18), 0.83 (H-29), and 5.36 (H-6) ppm and confirmed via 2D-NMR spectra (Table S1 and Fig. S1) (Goad and Akihisa 1997).

Flavonoids

In Cymbopogon, flavonoids occur mostly as luteolin glycosides (Avoseh et al. 2015). Isoorientin (luteolin-6-C-glucoside, N18) was identified in the down-field region of the 1H-NMR spectrum via its characteristic ring A and B proton signals (Table S1). In detail, the spectrum indicated the presence of an ABX system of the B-ring at δ 7.38 (d, J = 2.2 Hz), 6.91 (d, J = 8.7 Hz), and 7.41 (dd, J = 8.7, 2.2 Hz) ppm corresponding to H-2′, H-5′, and H-6′, respectively (Fig. 2b). Moreover, the spectrum revealed a glucose anomeric proton at δ 4.91 (d, J = 9.6 Hz) ppm, appearing up-field within the chemical shift range of C-glycosides and confirmed from its HSQC 13C signal at δ 74.3 ppm (Table S1). Finally, the presence of 2 singlets at δ 6.55 (H-3) and 6.50 (H-8) ppm and HMBC correlations from the anomeric proton to carbons at δ 110.0 (C-6), 162.4 (C-5), and 165.9 (C-7) ppm confirmed the trisubstituted A-ring structure and glucose attachment at the C-6 position. Altogether, these signals led to the identification of N18 as isoorientin (Figs. S7 and S8), in agreement with the reported literature (Francisco et al. 2014).

Phenolic Acids

Phenolic acids represent a common class of plant secondary metabolites, and like flavonoids, they contribute mainly to the antioxidant activity of food materials aside from their taste. NMR analysis revealed the presence of 2 phenolic acids (N19 and N20). In detail, caffeic acid (N19) was identified from the characteristic doublets of the olefinic protons H-2 and H-3, which exhibited distinct cross-peaks in the 1H–1H COSY experiment and appeared at δ 6.29 and 6.57 (each d, J = 16.0 Hz) ppm, indicative of E-oriented olefinic protons correlated with the 13C signals at δ 115.8 and 147.7 ppm, respectively (Table S1). Moreover, the 1H-NMR spectrum also revealed an ABX ring system from signals resonating at δ 7.05 (d, J = 2.0 Hz), 6.78 (overlapped), and 6.95 (dd, J = 8.0, 2.0 Hz) ppm corresponding to H-2′, H-5′, and H-6′, respectively (Fig. 2b). A distinct aromatic singlet at δ 6.99 ppm (H-2′/H-6′), showing an HSQC cross peak with 13C at δ 117.1 ppm and a long-range HMBC correlation with the carboxylate carbon at δ 166.0 ppm (Figs. S6 and S7), identified compound N20 as gallic acid. Assignments of both phenolic acids (N19 and N20) were confirmed via HMBC spectral data (Fig. S7).

Alkaloids

Trigonelline (N21) was identified in the down-field region (δ 8.0–9.5 ppm) showing 4 aromatic signals: a doublet of doublets at δ 8.05 (overlapped) ppm, two doublets at δ 8.86 and 8.90 (J = 6.3 Hz) ppm, and a singlet at δ 9.18 ppm (Fig. S11). A final singlet resonating at δ 4.43 ppm ascribed to the N-methyl group confirmed the structure of trigonelline. All values were supported via COSY and HMBC correlations and were consistent with those reported in the literature (Campos et al. 2014) (Figs. S12 and S13).

Monoterpenes

The key aroma compound of Cymbopogon was detected in nearly all shoots identified as citral (a mixture of neral and geranial). In the region from δ 9.5 to 10.0 ppm and from 5.0 to 6.0 ppm, signals were assigned to neral (N22) and geranial (N23), two unsaturated aldehydic monoterpenes characteristic of Cymbopogon species (Mannina et al. 2012). The two compounds could be readily distinguished by chemical shifts of the aldehyde protons at δ 9.84 (d, J = 9.9 Hz) and 9.92 (d, J = 8.1 Hz) ppm, respectively, and the methyl groups in cis or trans position at δ 2.01 and 2.20 ppm, respectively (Table S1 and Figs. S11, S13, and S14).

Compared to previous NMR studies directed mainly at essential oil analysis, quality, or detection of its adulteration, this study provides the first comprehensive NMR metabolite fingerprinting of 5 Cymbopogon species. The employed NMR technique enabled the identification of 23 metabolites of important nutritional, flavoring, and biological significance belonging to various classes, i.e., fatty, amino, and organic acids, sugars, triterpenes and sterols, phenolic compounds, alkaloids, and volatile monoterpenes.

Multivariate Data Analysis of NMR Dataset

A total of 39 samples were employed in this study (13 accessions, each with 3 biological replicates); this warranted the use of multivariate data analysis for detecting trends among the specimens. PCA and HCA were conducted to obtain a global overview of all investigated Cymbopogon samples and to distinguish between their corresponding metabolic profiles. PCA was performed on the NMR datasets in two attempts, using the full spectral width (δ 0.4–11.0 ppm) and the aromatic region only (δ 5.5–11.0 ppm), the latter to focus on secondary metabolites in Cymbopogon, i.e., phenolic acids, alkaloids, terpenoids, and flavonoids (Farag et al. 2012).

In the first PCA model (δ 0.4–11.0 ppm), several Cymbopogon samples showed considerable overlap in the NMR score plot, with PC1 and PC2 accounting for 45% of the total variance (Fig. S15). This model showed weak discrimination power, with even biological replicates of the same sample failing to group together. Hence, a second PCA model considering only the aromatic region (δ 5.5–11.0 ppm) was applied, yielding a model described by two components (PC1 and PC2) and explaining 54% of the total variance (Fig. 3). Again, no clear variation could be seen with respect to the sample cultivars, as evident from the PCA score plot (Fig. 3a), with C. citratus samples from 2 different origins (Germany and Egypt) failing to cluster together and spreading along PC1. Similar results were also observed from the HCA dendrogram (Fig. 3c). These results confirmed the inability of the unsupervised models to determine the effect of growing habitat on Cymbopogon chemical composition. Nevertheless, the Egyptian C. citratus appeared most distant from all other species, with negative PC1 score values. Additionally, a distinct cluster of C. martini shoots was also observed as an outlier with negative scores on both PC1 and PC2 (Fig. 3a).

Fig. 3

1H-NMR (δ 5.5–11.0 ppm) based principal component analysis (PCA) and hierarchical cluster analysis (HCA) of Cymbopogon species (n = 3). (a) Score plot of PC1 vs. PC2; colors follow those displayed in Table 1 for origin of Cymbopogon shoot samples. (b) Loading plot for PC1 and PC2 showing signals of isoorientin and caffeic acid as the major signals contributing to sample discrimination. (c) HCA of Cymbopogon species; the model is colored according to marked groups

Such partial segregation observed in the second PCA model scores could be attributed to some discriminatory NMR signals, as evident from the loading plot (Fig. 3b). In detail, high levels of neral and geranial (δ 5.84 ppm) and (E)-aconitic acid (δ 6.87 ppm) were observed in most Cymbopogon species, whereas high isoorientin (δ 6.50, 6.55, and 7.38 ppm) and caffeic acid (δ 6.78 ppm) content was detected in Egyptian C. citratus specimens. The signals discriminating the C. martini samples could not be assigned to a confirmed structure; nevertheless, they seem to belong to an unknown apigenin derivative, most probably a C-glycoside, a characteristic class in genus Cymbopogon (Avoseh et al. 2015). This is evident from the singlet resonating at δ 6.62 ppm (H-3), the 2 ortho-coupled doublets at δ 6.94 (H-3′, H-5′) and 7.95 (H-2′, H-6′) ppm of ring B, and the absence of H-6 and H-8 protons, suggesting their substitution by two sugar moieties (Fig. S16). However, the type of sugars and the exact structure need further confirmation using other spectral tools after isolation to enrich its levels.

OPLS-DA models were also attempted to achieve the sample discrimination that failed during unsupervised analysis, with special emphasis on modeling C. citratus versus other Cymbopogon samples. OPLS-DA is a supervised classification technique with great potential for finding the maximum separation among overlapping sample groups and identifying differential biomarkers (Bylesjö et al. 2006). The first OPLS-DA model was built for the whole NMR range (δ 0.4–11.0 ppm) with all C. citratus samples modeled against other Cymbopogon species (Figs. S17 and S18). Interestingly, a clear separation of C. citratus from other Cymbopogon samples was observed along the predictive component (Fig. S17a). The metabolites that influence this pattern can be identified from the loadings S-plot (Fig. S17b) and VIP plot (Fig. S17c), where C. citratus samples were found to be particularly enriched in (E)-aconitic acid and betaine as the major characterizing metabolites compared to other Cymbopogon species. In addition, a ROC curve was constructed to demonstrate the classification ability of the model. The area under the ROC curve (AUC) for the C. citratus group was taken as a validation criterion for its classification and was found to be 0.996 (Fig. S17d), indicating an effective classification model. Other validation procedures included permutation tests, which showed a negative Q2 intercept value (Fig. S18c), and CV-ANOVA with a p value below 0.05 (Fig. S18d), both typical of valid models. This predictive model also has a goodness of fit R2(X) of 76% and a cumulative sum of variation R2(Y) of 89%, with a predictive ability Q2 of 78% (Fig. S18a). However, one of the major pitfalls of this model is the large number of components, i.e., one predictive and 5 orthogonal components, required to obtain such validity based on the cross-validation results, with a root mean square error of cross-validation (RMSECV) of 0.23 (Fig. S18b).
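The AUC criterion mentioned above can be illustrated with a short sketch. It uses the rank-based (Mann–Whitney) definition: the AUC equals the fraction of positive/negative sample pairs in which the positive sample receives the higher score. The function name and the scores are hypothetical and are not taken from the SIMCA-P output.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correctly ranked pair.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical predicted class scores: C. citratus (positive) vs. others.
auc_overlap = roc_auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])   # one pair misranked
auc_perfect = roc_auc([0.9, 0.8, 0.7], [0.5, 0.3, 0.2])   # full separation
```

An AUC of 1 corresponds to complete separation of the two groups, while 0.5 corresponds to a classifier performing no better than chance.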

A second OPLS-DA model for C. citratus versus other Cymbopogon samples was also developed considering only the aromatic region (δ 5.5–11.0 ppm) and focusing on the plant secondary metabolites (Figs. S19 and S20). Interestingly, one predictive and two orthogonal components were sufficient to model the data based on the cross-validation results, with R2(X) of 60%, R2(Y) of 81%, Q2 of 72%, and RMSECV of 0.26 (Figs. S20a and S20b). Increasing the number of orthogonal components to three led to overfitting of the developed model, with a decrease in Q2 and an increase in RMSECV (Figs. S20a and S20b), confirming that one predictive and two orthogonal components were sufficient for data modeling. The score plot derived from this OPLS-DA model showed a clear segregation between both sample groups along the predictive component (Fig. S19a), while the corresponding S-plot loadings (Fig. S19b) and VIP plot (Fig. S19c) highlighted the enrichment of C. citratus samples in neral, geranial, (E)-aconitic acid, isoorientin, and caffeic acid compared to other Cymbopogon species. The high essential oil and phenolic content in C. citratus highlights its valuable antioxidant activity and rationalizes its commercial and medicinal use among the Cymbopogon genus. In addition, the ROC curve obtained for this model passes through the upper left corner (100% specificity, 100% sensitivity) with an AUC value of 1 for the C. citratus group, indicative of a better classification model (Fig. S19d). A permutation test with 200 permutations and CV-ANOVA were conducted to evaluate whether the model is overfitted (Figs. S20c and S20d), with a negative Q2 intercept value and a p value below 0.05 supporting the model validity.

Quantification of Major Metabolites via 1H-NMR
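The cross-validation metrics used to choose the number of components can be sketched as follows, assuming the commonly used definitions Q2 = 1 − PRESS/SS_total and RMSECV = sqrt(PRESS/n), where PRESS is the sum of squared differences between the class labels and their cross-validated predictions. The function name and values are hypothetical, and SIMCA-P may compute these quantities with additional conventions.

```python
import math

def q2_and_rmsecv(y_true, y_cv_pred):
    """Q2 and RMSECV from held-out (cross-validated) predictions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    # Prediction error sum of squares over all held-out samples
    press = sum((y - p) ** 2 for y, p in zip(y_true, y_cv_pred))
    # Total sum of squares of the response around its mean
    ss_total = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - press / ss_total, math.sqrt(press / n)

# Hypothetical dummy class labels (1 = C. citratus, 0 = other species)
# and their cross-validated predictions.
q2, rmsecv = q2_and_rmsecv([1, 1, 0, 0], [0.8, 0.9, 0.2, 0.1])
```

Under these definitions, adding a component that lowers Q2 while raising RMSECV indicates that the extra component fits noise rather than class-related variation.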

To ensure quality control of different Cymbopogon products, precise measurement of metabolites is warranted. 1H-NMR was further used to measure the absolute levels of identified metabolites in Cymbopogon species via integration of their well-resolved corresponding peaks in the NMR spectra (see Materials and Methods). The concentration of metabolites was calculated as μg/mg dry powder in the different samples, as shown in Table S2.
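One way to convert the mass found in the NMR sample into μg/mg dry powder is sketched below. It is an assumption about the normalization, not a procedure stated in the text: it takes the sample preparation values (130 mg powder extracted in 5 ml, of which 3 ml was dried and redissolved for measurement) and assumes no losses during evaporation and transfer. The function name is hypothetical.

```python
def ug_per_mg_dry_powder(m_T_ug, powder_mg=130.0, extract_ml=5.0,
                         aliquot_ml=3.0):
    """Normalize the mass in the NMR sample to the powder it represents.

    Assumes the dried aliquot is fully recovered in the NMR sample.
    """
    # Powder equivalent contained in the aliquot: 130 mg * 3/5 = 78 mg
    powder_equiv_mg = powder_mg * aliquot_ml / extract_ml
    return m_T_ug / powder_equiv_mg

# Hypothetical example: 780 ug of a metabolite found in the NMR sample.
content = ug_per_mg_dry_powder(780.0)
print(f"{content:.1f} ug/mg dry powder")
```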

Sugars were the major metabolites in all species, with maximal levels found in C. citratus extracts (104.1 ± 4.3 μg/mg total sugars) and with glucose (in its α- and β-forms) representing the major sugar in most samples. The high sugar content adds to the nutritional value and palatable taste of C. citratus.

The abundance of unsaturated fatty acids in all Cymbopogon species (up to 26.8 ± 2.2 and 25.3 ± 3.1 μg/mg in C. martini and C. nardus, respectively) positions the genus as a healthy functional food product owing to their potential to decrease serum LDL cholesterol (Liang and Liao 1992; O'Brien 2009).

Concerning non-essential amino acids, alanine was almost equal in all shoots (1.4 ± 0.2 to 2.9 ± 0.8 μg/mg), while asparagine was found at comparable levels in the examined species (8.1 ± 0.2 to 23.3 ± 3.2 μg/mg). Phenylalanine, an essential amino acid, was also quantified at nearly the same level in the different samples (1.1 ± 0.1 to 1.5 ± 0.4 μg/mg). However, valine could not be quantified in most examined specimens owing to signal overlap.

The total choline and betaine level was quantified in all sample extracts, reaching ca. 6.3 ± 3.7 μg/mg in C. citratus and its lowest concentration in C. flexuosus (0.9 ± 0.1 μg/mg). Both compounds have anti-inflammatory and antidiabetic actions, and their high levels in C. citratus may rationalize its use in inflammation and diabetes (Chung et al. 2015; Gao et al. 2017; Zhao et al. 2018).

(E)-Aconitic acid, a potential antifeedant, reached its highest levels in C. nardus (22.2 ± 1.7 μg/mg) and C. citratus (17.8 ± 4.1 μg/mg) emphasizing the traditional use of both species as insect repellent (Kim et al. 1976; Saxena 1986).

Regarding the phenolic metabolites, which contribute mainly to the antioxidant potential of food, gallic acid was found at nearly equal content in all species (1.0 ± 0.1 to 2.4 ± 0.1 μg/mg). However, isoorientin and caffeic acid could not be quantified in all samples due to signal overlap.

The key aroma compounds neral and geranial, responsible for the lemony flavor of Cymbopogon species, were quantified in nearly all shoots, presenting them as important agents in the food and flavor industry (Carlson et al. 2001). A previous study reported that the potential antibacterial action of C. citratus essential oil resides mainly in these two components (Onawunmi et al. 1984). Both volatiles were found at maximal levels (8.6 ± 1.0 and 10.6 ± 1.0 μg/mg, respectively) in C. citratus, yet were absent in C. martini and the Egyptian C. citratus cultivar, which explains why these two samples appeared most distant from all other Cymbopogon species in the foregoing multivariate analysis results (Fig. 3a).

Cymbopogonol, an important antimicrobial non-volatile terpenoid, was quantified in all samples, reaching its highest levels in C. martini and C. citratus (20.3 ± 0.8 and 19.2 ± 5.0 μg/mg, respectively). This high cymbopogonol content, together with other essential oil constituents, may explain the strong antimicrobial activity of C. martini despite its poor neral and geranial content (Gemeda et al. 2018; Ragasa et al. 2008).

Trigonelline was found at comparable levels (1.0 ± 0.9 to 3.1 ± 0.8 μg/mg) in nearly all examined species. Previous studies revealed that trigonelline exhibits significant hypoglycemic activity in both animal models (50 mg/kg) and humans (500 mg oral daily dose) (Mishkinsky et al. 1967; Sharma et al. 1990; Zhou et al. 2012). Moreover, the antidiabetic action of fenugreek seed, well known for its high trigonelline content (0.13–0.36%), has been attributed to trigonelline as its major hypoglycemic agent (Barnes et al. 2002; Mishkinsky et al. 1974; O'Neil 2001). Trigonelline levels quantified in Cymbopogon species were close to those found in fenugreek seed, strongly suggesting that trigonelline contributes substantially to the antidiabetic activity reported for many Cymbopogon species (Kamble et al. 2020; Kouame et al. 2016).
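Because the Cymbopogon values are given in μg/mg and the fenugreek content in percent, the comparison is easier to follow once both are on the same scale (1 μg/mg corresponds to 0.1% w/w). The short sketch below performs that conversion on the mean values only. The helper name is hypothetical, and the sketch assumes that the fenugreek percentages are weight-based and on a comparable dry-weight basis, which the text does not state explicitly.

```python
def ug_per_mg_to_percent(ug_per_mg: float) -> float:
    """Convert a content in ug/mg to % w/w (1 ug/mg = 1 part per 1000 = 0.1%)."""
    return ug_per_mg / 10.0


# Mean trigonelline range reported for the Cymbopogon shoots (ug/mg);
# the standard deviations are deliberately ignored in this sketch.
cymbopogon_percent = (ug_per_mg_to_percent(1.0), ug_per_mg_to_percent(3.1))

# Trigonelline content cited for fenugreek seed (%), assumed to be w/w.
fenugreek_percent = (0.13, 0.36)

# Two closed intervals overlap when each one starts before the other ends.
ranges_overlap = (
    cymbopogon_percent[0] <= fenugreek_percent[1]
    and fenugreek_percent[0] <= cymbopogon_percent[1]
)

print(f"Cymbopogon: {cymbopogon_percent[0]:.2f}-{cymbopogon_percent[1]:.2f} % w/w")
print(f"Fenugreek:  {fenugreek_percent[0]:.2f}-{fenugreek_percent[1]:.2f} %")
print(f"Ranges overlap: {ranges_overlap}")
```

Under these assumptions the mean Cymbopogon range corresponds to roughly 0.10 to 0.31% w/w, which overlaps the range cited for fenugreek seed.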

To the best of our knowledge, and in contrast to previous NMR studies directed mainly at essential oil analysis, this study provides the first comprehensive NMR metabolite fingerprinting and standardization of 5 Cymbopogon species from 2 different biological origins.

Conclusions

This study reports comprehensive NMR metabolic profiling of 5 Cymbopogon species that are used extensively in folk medicine, yet have had limited phytochemical characterization or standardization. NMR coupled with multivariate data analyses was further employed for sample classification and suggested that geographical origin cannot be revealed from the NMR dataset in any of the examined Cymbopogon specimens. Differential metabolites could be identified only for C. citratus and C. martini, with neral, geranial, (E)-aconitic acid, isoorientin, and caffeic acid as the major discriminating metabolites for C. citratus, while an unknown apigenin derivative appeared to distinguish C. martini. The enrichment of C. citratus in essential oil, phenolic compounds, and (E)-aconitic acid justifies its preferred commercial use in food and cosmetics and further highlights its medicinal potential as an antioxidant and insecticidal agent in comparison to other Cymbopogon species.

Although the results here represent a preliminary report in terms of finding potential specific biomarkers related to Cymbopogon species, the insights gained by the present work, the first based on NMR metabolic profiling of the genus Cymbopogon, are useful for studying further species or cultivars and for guiding other analytical investigations. Further research should enlarge the sample size and cover more cultivation areas to validate these exploratory results. In addition, an extended approach utilizing liquid chromatography–mass spectrometry and gas chromatography–mass spectrometry can be applied to pinpoint differences in bioactive secondary metabolite profiles among Cymbopogon accessions and to rationalize their flavor composition and health effects in more depth. Quantification of key metabolites using NMR provides the first basis for Cymbopogon shoot standardization, which can be used further for their estimation when present in complex drug mixtures. Moreover, the effects of seasonal variation, drying method, and storage conditions of Cymbopogon shoots on their metabolite composition have yet to be addressed using this platform.