Identification of bacterial lipo-amino acids: origin of regenerated fatty acid carboxylate from dissociation of lipo-glutamate anion

The identification of bacterial metabolites produced by the microbiota is key to understanding its role in human health. Among them, lipo-amino acids (LpAA), which are able to cross the epithelial barrier and to act on the host, remain poorly identified. The structures of a few of them were elucidated by high-resolution tandem mass spectrometry based on electrospray ionization combined with selective ion dissociations achieved by collision-induced dissociation (CID). Negative ions were used because they yield only a few fragment ions, which are nevertheless sufficient to specify each part of the LpAA with good sensitivity. To find specific processes that help structural assignment, the negative ion dissociations were scrutinized for one LpAA: N-palmitoyl glutamic acid (C16Glu). Under CID, [C16Glu-H]¯ showed a singular behavior: of the ten product ions observed, eight were the expected fragment ions. In contrast, instead of the expected product ions due to CONH-CH bond cleavage, an abundant complementary pair of dehydrated glutamic acid and fatty acid anions was observed. Specific to the glutamic moiety, these ions were formed by a stepwise dissociation via molecular isomerization into an ion–dipole complex prior to dissociation. This complex dissociated by partner splitting, either directly or after inter-partner proton transfer. By this pathway, a surprising regeneration of the deprotonated fatty acid takes place. Such regeneration is comparable to that occurring in the dissociation of peptides containing an acidic amino acid. Modeling confirmed the proposed mechanisms explaining the unexpected behavior of this glutamate conjugate.


Introduction
Among intestinal microbiota metabolites, lipopeptides (Pérez-Berezo et al. 2017) and lipo-amino acids (LpAA) (Vizcaino et al. 2014) have been described for their capacity to regulate host homeostasis. The major technical barrier to the identification of bacterial LpAA or lipopeptides is the difficulty, or even the impossibility, of isolating enough sample for analysis by NMR and/or X-ray, which would require more than several tens of μg and several mg, respectively. In a previous study, LpAA and lipopeptides produced by Escherichia coli Nissle 1917, such as N-lauroyl asparagine (C12Asn), were identified in negative ESI mode, which yielded abundant [C12Asn-H]¯ anions (Pérez-Berezo et al. 2017).
Analysis of these metabolites by hyphenated methods such as liquid chromatography coupled to high-resolution tandem mass spectrometry (LC-HRMS/MS) is especially well suited to complex mixtures. This method offers high selectivity, specificity and sensitivity when combined with a desorption/ionization technique such as electrospray (ESI). Indeed, this mode is suitable for the ionization of a broad variety of more or less polar metabolites. Furthermore, it can be a powerful method for de novo structural elucidation of compounds (Thomas et al. 2019), especially when using collisional activation of ions (i.e. collision-induced dissociation, CID). First, using low-resolution instrumentation based on a linear quadrupole ion trap, an encouraging study of ornithine and glutamine lipids (from Rhodobacter sphaeroides) proposed mechanisms (Zhang et al. 2009), or a possible rationalization, of ion dissociations under low-energy collision conditions in resonant excitation mode.
In this mode, only the precursor ion is excited, at its own frequency. Since the product ions are not excited, their consecutive dissociations are inhibited, unlike competitive dissociations (Bennaceur et al. 2013). Consecutive dissociations nevertheless occur when the dissociative process is exothermic and/or when the residual internal energy carried by the product ions is sufficient to drive them (Darii et al. 2021). A previous study (Boukerche et al. 2016) investigated the fragment ions detected from protonated LpAA ([LpAA + H]+) resonantly excited under low-energy collisional activation conditions. The authors interpreted the formation of these product ions exclusively through charge-promoted dissociation mechanisms. As a consequence, in certain cases, molecular isomerization into an ion-dipole complex as an intermediate of stepwise dissociation pathways is considered (Afonso et al. 2010). Such an isomerization explains competitive dissociations, mainly based on internal proton transfer within the ion/neutral complex, which is responsible for the complementary product ions. This approach is in fact an alternative to the one previously proposed (Zhang et al. 2009) for some fragmentations in which the charge remains a spectator, as happens in collision processes with energies in the keV range (Wysocki et al. 1988; Gross et al. 1992). Interestingly, to test the hypothesis that many lipo-amino acids originate in the brain, Tan et al. (2010) identified new LpAAs by LC/HRMS/MS and explained the observed fragmentations by charge-driven dissociations after CID of selected ions.
With the future aim of identifying more metabolites from bacteria of the microbiota, and based on the results above, LpAA characterization will be extended to different amino acids conjugated to various fatty acyl groups using LC-ESI/HRMS/MS. To this end, a broader study will be conducted to build a database of product ion spectra (or fragment ion spectra (Murray et al. 2013)) of various LpAA ions showing specific and clear fragmentations of [LpAA-H]¯, in order to detect, identify and elucidate de novo the molecular structure of LpAA (Thomas et al. 2019). The negative ESI mode was chosen because it provides abundant deprotonated LpAA that deliver, after CID, fragment ions characteristic of both the amino acid and the fatty acyl moieties.
For our purpose, dissociations of deprotonated LpAA standards were investigated under low collision energy conditions (i.e., non-resonant mode), taking previous studies into account. In our previous study (Pérez-Berezo et al. 2017), the major dissociative processes of [C12Asn-H]¯ from N-lauroyl-asparagine occurred, in addition to the water loss (corresponding to the base peak), through competitive covalent bond cleavages around the amide linkage, with competitive formation of complementary ion pairs. This takes place through stepwise dissociation via isomerization into an ion/neutral complex, resulting in the [(C12Asn-H)-(Asn-NH3)]¯ and [(Asn-H)-NH3]¯ product ion pair through competitive proton transfer, which allows the two parts of the LpAA to be defined. In this study, the behavior of the system constituted by the glutamic residue and the C16 fatty acyl moiety (Scheme 1) was closely scrutinized. It is compared with LpAA containing a polar amino acid, asparagine (C12Asn), and a non-polar amino acid, leucine (C12Leu) (Scheme 1), to reach a pertinent interpretation of the [C16Glu-H]¯ fragmentations based on the elemental compositions of the precursor and product ions, using a hybrid Qq/TOF tandem mass spectrometer.
Our study focuses on the formation of the observed product ions, especially the unexpected ones, i.e., product ions generated by rearrangement of the LpAA skeleton, and in particular the carboxylate of the fatty acid. This ion has also been observed and described by Tan et al. (2010). Their work can be considered a landmark for its proposed interpretation of the regeneration of the fatty acid carboxylate from [LpGlu-H]¯. In their interpretation, the proposed tetrahedral intermediate of dissociating [LpGlu-H]¯ evolves directly towards the fatty acid carboxylate product ion without involving the second carboxylic acid group. However, dissociation of [LpGABA-H]¯ (containing only one carboxylic acid group) did not lead to regeneration of the carboxylate product ion (Tan et al. 2010). This suggests that the second carboxylic acid in [LpGlu-H]¯ must play an important role in the regeneration of the unexpected fatty acid carboxylate product ion. In addition, they suggested that the [(Glu-H)-H2O]¯ ion was generated from [Glu-H]¯ by water loss. However, the product ion spectrum of [C16Glu-H]¯ (m/z 384) under resonant excitation conditions displayed, in addition to the H2O and CO2 losses (ions at m/z 366 and m/z 340), only the pair of deprotonated fatty acid and dehydrated glutamate product ions (Figure S1), which is evidence that the latter ion is formed directly from the precursor ion rather than by dehydration of a glutamate anion intermediate.
To rationalize the formation of this particular species, the product ion spectrum of the deprotonated [C16GluMe-H]¯ monoester (Scheme 1) was investigated and compared with that of [C16Glu-H]¯. To better describe the origin of the unexpected regeneration of the deprotonated C16 fatty acid, the respective product ion spectra of the C16Glu and C16GluMe anions were investigated under different collision energy conditions. Finally, this study provides an analytical approach to determine unambiguously the structure of lipo-amino acids and lipopeptides by mass spectrometry, which is essential for bacterial LpAA identification. For this purpose, the dissociation processes of these compounds must be better accounted for by means of well-defined, systematic mechanisms. The described mechanisms must explain the formation of both the expected and the unexpected product ions in a similar way, based on chemical reactivity.
MS/MS experiments
Product ion spectra of the precursor [C16Glu-H]¯ ion were acquired in a segmented RF-only quadrupole (Waters, Milford, MA) under collision-induced dissociation conditions (i.e., CID, using non-resonant excitation mode), with a full width at half maximum (FWHM) resolution of 16 000 at m/z 400 for product ion analysis. Precursor ions were selected within ± 5 m/z. The product ion spectra (Murray et al. 2013) (also called CID spectra) were recorded at two collision energies (noted E_lab): 17 eV and 30 eV. These CID spectra were inspected manually using MassLynx software (Waters) to confirm annotations (vide infra). Only signals higher than 5% of the base peak were annotated in the CID spectra. In the supplementary Table S1, the reported signals are those with relative abundances ≥ 0.4% of the total ion current (TIC) at at least one of the E_lab values. Ion abundances are given as a percentage of the TIC, which includes precursor and product ions. Relative and absolute ion abundances are reported in the supplementary Table S1.
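The two abundance thresholds described above can be illustrated with a minimal Python sketch. It assumes a single spectrum stored as a hypothetical m/z-to-intensity mapping. The function names and the intensity values are illustrative only and are not taken from the study. The sketch also treats one collision energy at a time, whereas the reporting criterion in the text applies across both E_lab values.

```python
def relative_abundances(intensities):
    """Return {m/z: (% of TIC, % of base peak)} for one product ion spectrum.

    `intensities` maps m/z to absolute abundance and must include the
    precursor ion, because the TIC used here covers precursor and products.
    """
    tic = sum(intensities.values())
    base = max(intensities.values())
    return {mz: (100.0 * i / tic, 100.0 * i / base)
            for mz, i in intensities.items()}


def select_peaks(intensities, base_peak_cutoff=5.0, tic_cutoff=0.4):
    """Apply the two thresholds: annotation (> 5% of base peak) and
    reporting (>= 0.4% of TIC)."""
    rel = relative_abundances(intensities)
    annotated = sorted(mz for mz, (_, pct_base) in rel.items()
                       if pct_base > base_peak_cutoff)
    reported = sorted(mz for mz, (pct_tic, _) in rel.items()
                      if pct_tic >= tic_cutoff)
    return annotated, reported


# Hypothetical intensities in arbitrary units (NOT measured data).
spectrum = {384: 500.0, 366: 120.0, 340: 90.0, 255: 1000.0,
            128: 250.0, 102: 30.0, 322: 10.0}
annotated, reported = select_peaks(spectrum)
print("annotated:", annotated)
print("reported:", reported)
```

With these placeholder values, the weak signals at m/z 102 and m/z 322 pass the TIC criterion but fall below the annotation threshold, which is the situation the two-level rule is meant to handle.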

Modeling
DFT calculations were performed using the Gaussian16 package, employing the B3LYP functional with the 6-31+G(d,p) basis set. Zero-point energies (ZPE) were added as computed at the optimization level. The nature of all stationary points was confirmed by analyzing the harmonic vibrational frequencies. The energies reported in this work are energies E in kcal/mol, corrected for ZPE. To limit computational time, and under the reasonable assumption that the length of the aliphatic chain does not significantly influence the nature of the pathways studied, the structures were modeled with an ethyl group (for computational details see Supplementary Information S1).
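As an illustration of how ZPE-corrected relative energies in kcal/mol are typically obtained from electronic energies and zero-point energies in hartree, a minimal Python sketch follows. The function names and the numerical inputs are placeholders chosen for the example. They are not values from this work, and the sketch is not the authors' workflow.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095  # standard conversion factor


def zpe_corrected(e_elec_hartree, zpe_hartree):
    """Electronic energy plus zero-point energy, in hartree."""
    return e_elec_hartree + zpe_hartree


def relative_energy_kcal(species, reference):
    """ZPE-corrected energy of `species` relative to `reference`.

    Both arguments are (E_elec, ZPE) tuples in hartree; the result is
    in kcal/mol.
    """
    delta = zpe_corrected(*species) - zpe_corrected(*reference)
    return delta * HARTREE_TO_KCAL_PER_MOL


# Placeholder energies (hartree), for illustration only.
reference_minimum = (-100.000000, 0.100000)
transition_state = (-99.950000, 0.098000)
barrier = relative_energy_kcal(transition_state, reference_minimum)
print(f"barrier = {barrier:.1f} kcal/mol")
```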

Nomenclature and notation of product anions
The notations used to describe and discuss the mass spectrometry results are based on IUPAC recommendations (Murray et al. 2013) to keep the text accurate and consistent throughout the manuscript. In addition, a simple nomenclature applicable to LpAA anion dissociation has been proposed (Tan et al. 2010). In this nomenclature, only some fragmentations are described, and they concern only the loss of small neutrals (H2O, NH3, CO2 …), possibly as consecutive losses (H2O + CO2), from the amino acid (AA) residue. Such an annotation suggests that these ions are formed through consecutive fragmentations of the AA residue generated by dissociation of the LpAA anion. This is not the case, since some of these fragment ions (the abundant ones) are generated by direct cleavages of the deprotonated lipo-amino acid and not from the amino acid as an intermediate. Moreover, this nomenclature cannot be applied to deprotonated lipopeptide dissociations, since product ions from peptide bond cleavages cannot be attributed to the N-terminus (linked to the fatty acyl moiety) or to the C-terminus (related to the peptide moiety). To meet this requirement, the nomenclature for deprotonated peptide dissociations (Chu et al. 2015), itself based on the fragmentations of protonated peptide ions (with (i + j) residues in the sequence), was adapted, yielding, e.g., fragment ions such as [b_i-2H]¯ and c_i¯ for the N-terminus and y_j¯ and [z_j-2H]¯ for the C-terminus. It derives from the nomenclature of product ions generated under keV collision energy conditions introduced by Roepstorff and Fohlman (Roepstorff and Fohlman 1984) and Biemann (Biemann 1988). This nomenclature, as reported in Scheme 2, allows both lipo-amino acid and lipopeptide fragmentations to be described.
Indeed, the annotation distinguishes between (i) fragment ions containing the fatty acyl moiety (related to the N-terminus of the peptide sequence), which are annotated with the Lp prefix attached to the peptide fragment label (e.g. [Lpb_i-2H]¯, Lpc_i¯ …), and (ii) fragment ions related to the C-terminus of the peptide sequence, whose annotation contains exclusively a peptide fragment label (e.g. y_j¯, [z_j-2H]¯ …).
Scheme 2. Annotation of the observed product ions from dissociation of deprotonated lipoamino acids

Results and discussion
The mass spectrum of the C16Glu compound (Mw 385 u) displayed essentially the [C16Glu-H]¯ anions (base peak at m/z 384, Figure S1). Under the source conditions used, adduct ions [C16Glu + anion]¯ (anions of inorganic or organic acids) as well as deprotonated […] (Fig. 1a, b) was characterized by a product ion spectrum (at E_lab = 17 eV) displaying ten peaks corresponding to product ions. Peaks with intensities lower than 5% of the base peak at E_lab = 17 eV were annotated in the CID spectrum only when their intensities exceeded 5% of the base peak at E_lab = 30 eV, and conversely for peaks with intensities lower than 5% of the base peak at E_lab = 30 eV but higher than 5% at E_lab = 17 eV (Table S1). In our previous study (Pérez-Berezo et al. 2017), two intense peaks at m/z 198 (ion lpc¯) and m/z 114 (ion [z-2H]¯) emerged, in addition to the base peak at m/z 295 (related to H2O release), in the CID spectrum of [C12Asn-H]¯ (m/z 313, Figure S2). The complementary lpc¯/[z-2H]¯ ion pair was interpreted as being formed via an ion/dipole intermediate, which allows the production of these complementary fragment ions. By analogy with C12Asn, the product ion spectrum of the [C16Glu-H]¯ anion (m/z 384) (Fig. 1a, b) should display fragment ions at m/z 254.2483 (ion lpc¯ as C16H32NO, with an error of 2.5 ppm) and m/z 129.0193 (ion [z-2H]¯ as C5H5O4, with an error of 151.6 ppm). Given this large error, the m/z 129 product ion is better matched by the first 13C isotopologue of the m/z 128 ion (ion [(y-H2O)-1]¯ as 12C4 13C1 H6NO3, with an error of 1.7 ppm). Surprisingly, the abundance of this lpc¯/[z-2H]¯ ion pair was very low. Instead, a pair of product ions corresponding to [lpc + 1]¯ (at m/z 255) and [(z-2H)-1]¯ (at m/z 128) appeared. Less abundant fragment ions, corresponding either to small neutral losses or to cleavage of bonds close to the amide linkage, were also displayed in the CID spectra (Fig. 1a, b).
The latter corresponded to product ions with charge retention on the amino acid moiety or on the fatty acyl part, as confirmed by high-resolution measurements (Table S1; interpretations are reported in Schemes S1 and S2).
These signals were detected at: (i) m/z 366 and m/z 340, which are abundant species, accompanied by weaker ions at m/z 322 (less than 5% of TIC at E_lab = 17 eV, increasing to 1.8 at 30 eV) and m/z 296 (~ 1% at 17 eV and 21% at 30 eV), due, respectively, to the expected small neutral losses of H2O, CO2, (H2O + CO2) and 2CO2 (proposed fragmentation mechanisms in Scheme S1); (ii) m/z 146 (i.e., y¯) and m/z 102 (i.e., [y-CO2]¯, < 1% at 17 eV and 13% at 30 eV), resulting from the loss of the fatty acyl moiety (proposed fragmentation mechanisms reported in Scheme S2). The detected m/z 129 ion, which was the natural 13C isotopologue of the abundant m/z 128 ion (i.e., [y-H2O]¯) and not the isobaric [y-NH3]¯ ion, is not reported in Table S1.
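The neutral-loss assignments listed above can be checked with simple nominal-mass bookkeeping, sketched here in Python. The sketch uses integer nominal masses only (H2O = 18, CO2 = 44) and the m/z values quoted in the text. The helper name product_mz is introduced for illustration.

```python
PRECURSOR_MZ = 384             # nominal m/z of [C16Glu-H]-
NEUTRAL_MASS = {"H2O": 18, "CO2": 44}  # nominal masses of the neutrals


def product_mz(start_mz, losses):
    """Nominal m/z after removing the listed neutral species from an ion."""
    return start_mz - sum(NEUTRAL_MASS[name] for name in losses)


# Neutral losses from the precursor, with the m/z values quoted in the text.
quoted = {
    ("H2O",): 366,
    ("CO2",): 340,
    ("H2O", "CO2"): 322,
    ("CO2", "CO2"): 296,
}
for losses, mz in quoted.items():
    print("+".join(losses), "->", product_mz(PRECURSOR_MZ, losses), "quoted", mz)

# Decarboxylation of the y ion (m/z 146) gives the [y-CO2] ion at m/z 102.
print("y - CO2 ->", product_mz(146, ("CO2",)))
```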
Since the formation mechanisms of the product ions described above have already been studied (Pérez-Berezo et al. 2017), we focused on the origin and formation mechanism of the abundant, unexpected m/z 255 and m/z 128 product ions displayed in the CID spectra recorded at 17 eV and 30 eV. These complementary ions are considered together rather than separately, as Tan et al. (2010) proposed. The elemental compositions attributed to the ions at m/z 255.2321 and m/z 128.0363 were, respectively, C16H31O2 (error of 3.3 ppm) and C5H6NO3 (error of 2.4 ppm) (Table S1). Thus, they may be annotated, respectively, as the [lpb + O]¯ and [y-H2O]¯ ions […] (Afonso et al. 2010). The first step is a regioselective nucleophilic attack of the amino acid side-chain carboxylate site on the amide linkage, resulting in a cyclic tetravalent intermediate. Migration of the negative alkoxide site induces C-N bond cleavage and ring opening, concomitant with proton transfer to the nitrogen atom, and results in an anhydride linkage. The terminal carboxylate then interacts with the nearer anhydride C=O group, yielding a 6-membered tetravalent system which isomerizes into an ion/dipole complex consisting of a neutral α-amino anhydride and a long-chain carboxylate. This complex can evolve either by direct partner splitting to yield m/z 255 (i.e., the fatty acid carboxylate as [lpb + O]¯) or, after internal proton transfer from the neutral amino anhydride to the C16 carboxylate partner, by release of the neutral fatty acid to give the deprotonated 6-membered amino anhydride (m/z 128, annotated as [y-H2O]¯) (Scheme 3).
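The elemental compositions above are assigned on the basis of ppm mass errors. As a hedged illustration of how such an error can be computed, the Python sketch below evaluates the theoretical m/z of a singly charged anion from standard monoisotopic masses and compares it with a quoted m/z. It assumes that the quoted value is the measured one, that the electron mass is included in the theoretical m/z, and that the error is defined as (measured − theoretical)/theoretical. The function names are illustrative.

```python
# Standard monoisotopic masses (u) and the electron mass (u).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503223,
    "N": 14.00307400443,
    "O": 15.99491461957,
}
ELECTRON = 0.000548579909


def anion_mz(composition):
    """Theoretical m/z of a singly charged anion: atom masses plus one electron."""
    return sum(MONOISOTOPIC[el] * n for el, n in composition.items()) + ELECTRON


def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


# m/z 255.2321 assigned as C16H31O2 in the text.
theoretical = anion_mz({"C": 16, "H": 31, "O": 2})
error = ppm_error(255.2321, theoretical)
print(f"theoretical m/z = {theoretical:.4f}, error = {error:.1f} ppm")
```

Under these assumptions the magnitude of the computed error, about 3.3 ppm, agrees with the value quoted for the C16H31O2 assignment.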
The regioselectivity of the first step was confirmed by the CID spectrum of [C16GluMe-H]¯ (m/z 398), in which the peak at m/z 255 was strongly reduced (Fig. 1c, d). Indeed, the peak at m/z 255, which was the base peak in the CID spectra of [C16Glu-H]¯, did not reach 1% of the base peak at E_lab = 17 eV and 9.3% at E_lab = 30 eV for [C16GluMe-H]¯. This confirmed that the nucleophilic reactivity of the carboxylic acid in the α position relative to the amide linkage is significantly lower than that of the side-chain carboxylic group. In addition, the CID spectra of [C12Leu-H]¯ (m/z 312), a non-polar LpAA (Scheme S2 and Table S1), did not display a peak at m/z 199 corresponding to the [lpb + O]¯ ion, confirming the very weak reactivity of the α-carboxyl group of the amino acid. This also explains why the dissociation of the [LpGABA-H]¯ ion did not yield the lpc¯ product ion (Tan et al. 2010).
To test the validity of the mechanism proposed above, modeling was performed. A first investigation was carried out starting from structure A, corresponding to LpGlu (pathway in black, Fig. 2). As stated in the Modeling section, the aliphatic chain was replaced by an ethyl group to limit computational time, under the reasonable assumption that the chain length does not significantly influence the nature of the pathways studied.
A transition state (TS A->B) leading to cyclization into a tetrahedral intermediate B could be located. It is situated 40.8 kcal/mol above A, and the transformation is endothermic by 36.1 kcal/mol. This cyclization is assisted by the proton of the COOH group of the amino acid, which stabilizes the alkoxide produced by neutralizing it. In a second step, we explored the possibility of C-N bond cleavage by proton migration from the OH group of the tetrahedral intermediate to the neighboring nitrogen atom. This migration creates an ammonium group, which should promote the anticipated cleavage to open the ring and form an anhydride group (i.e., the C' intermediate). A transition state (TS B->C) for such a proton transfer was found (25.4 kcal/mol above B). However, the ring-opened product C to which it connects does not result from the desired C-N bond cleavage. Indeed, this proton transfer leads to the opening of the anhydride, immediately followed by migration of the proton initially on the -NH- group, which is captured by the -CO2¯ group (Scheme 4).
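The relative energies quoted above can be placed on a common scale referenced to A by simple addition, as in the short Python sketch below. Only the three quoted values are inputs. The two derived quantities (the height of TS B->C above A and the reverse barrier from B back to A) are arithmetic consequences computed here and are not stated in the text. The variable names are illustrative.

```python
# Values quoted in the text (kcal/mol, ZPE-corrected).
TS_AB_ABOVE_A = 40.8   # TS A->B relative to A
B_ABOVE_A = 36.1       # intermediate B relative to A
TS_BC_ABOVE_B = 25.4   # TS B->C relative to B

# Derived values (not stated in the text).
ts_bc_above_a = round(B_ABOVE_A + TS_BC_ABOVE_B, 1)
reverse_barrier_b_to_a = round(TS_AB_ABOVE_A - B_ABOVE_A, 1)

print(f"TS B->C lies {ts_bc_above_a} kcal/mol above A")
print(f"Reverse barrier B -> A: {reverse_barrier_b_to_a} kcal/mol")
```

On this scale the second transition state is the higher of the two on the unsolvated pathway, and B can revert to A over a small barrier.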
To strongly hinder protonation of the carboxylate group by an ammonium proton, and thus favor C-N cleavage and ring opening, a water molecule was introduced into the calculations (A' + H2O) to check whether it redirects the reaction toward formation of the (C' + H2O) intermediate rather than the C deprotomer. In a similar way, a first cyclization […] (Fig. 3), the EtCOO¯ group is favorably positioned to abstract a mobile proton from the enolizable CH2 site of the carbonyl group. This deprotonation costs 9.1 kcal/mol and forms an intermediate E located very slightly (1.4 kcal/mol) above TS D->E. This is probably due to a persistent interaction between the OH group of the carboxylic acid and the CH, which prevents full formation of the enolate form. This is evidenced by the value of the OC-CH dihedral angle in this structure, 20.2°, whereas a value very close to 0° would be expected for an enolate. The system gains energy by migration of the acid towards the oxygen atom of the enolate, developing a hydrogen-bond-type interaction. In structure F, the enolate is now completely formed, the dihedral angle discussed above being close to 0° (−1.0°). The changes in the C-C and C-O distances between E and F are consistent with this picture.
Finally, the competitive dissociations of the D and F intermediates are only desolvation steps (i.e., cleavage of a hydrogen bond), with energy levels lower than that of the highest transition state (i.e., TS B'->C') of these stepwise dissociations of the mono-hydrated model system of lipoglutamic acid.

Conclusion
In this study, LC-ESI-HRMS/MS performed in negative ion mode showed advantages in terms of sensitivity and specificity for LpAA analysis. Under low-energy collision-induced dissociation (CID) conditions, [LpAA-H]¯ anions dissociate mainly into a complementary product ion pair, which allows the two conjugated components of the LpAA to be identified unambiguously, except for non-polar AA residues, which present only one characteristic product ion. […]¯ product anions appear, due to regeneration of the fatty acid carboxylate and formation of dehydrated glutamate via nucleophilic attack on the amide group by one of the two carboxylic groups. This annotation is the one used for deprotonated peptide dissociation. Interestingly, dissociation of deprotonated C16GluMe, the monomethyl ester at the glutamic side chain, leads to a reduction of the [lpb + O]¯ ion abundance, which evidences the strong regioselectivity for the δ carboxylic acid in the nucleophilic attack on the amide linkage, resulting in a 7-membered ring with a tetrahedral reactive site. This intermediate evolves through a stepwise process via isomerization into an ion-dipole intermediate, from which internal proton transfer and complex splitting yield, in competition, the [lpb + O]¯ and [y-H2O]¯ product anions. The specific behavior of the glutamate residue allows it to be distinguished from other amino acid residues independently of the fatty acid amide chain length. Modeling confirmed the pertinence of the stepwise mechanism proposed above, which leads to the complementary product ions and especially to the regeneration of the fatty acid carboxylate. Together, these results improve the identification of LpAA by mass spectrometry, which is essential for the study of these low-concentration bacterial metabolites.

Research involving human participants and/or animals and informed consent
No biological or human material was used in this study.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Fig. 3 Structures D, E and F. Distances in Å