1 Introduction

Animal venoms represent a very large group of bioactive peptide toxins. Considering that around 44,000 terrestrial venomous species (scorpions, snakes, spiders), 20,000 marine species (cone snails, turrids, and cnidarians), and 103,000 flying species (hymenopterans) have already been reported, and that each of their venoms is composed of approximately 250 different toxins, this natural pool consists of more than 40 million peptide toxins that are biochemically stable and have particular pharmacologic properties [1, 2]. Of this impressive number of toxins, only 1600 have been identified, corresponding to less than 0.01% of the whole bank. The ability to sequence toxins directly by mass spectrometric analysis of venoms [3], combined with the sequencing of the genomes of venomous animals [4, 5], will certainly help to increase the rate at which peptide toxins are discovered and characterized over the next few years. The pharmacologic activity of peptide toxins is mostly related to their binding to ion channels, such as voltage-gated and ligand-gated channels, which are highly involved in controlling the mobility of the prey [4, 6–8], but also to other kinds of targets such as G-protein-coupled receptors [9], integrin receptors [10], or enzymes such as acetylcholinesterase [11]. Another characteristic of animal toxins is the remarkable number and variety of post-translational modifications (PTMs). The most common PTMs are disulfide bonds [12]. They play an essential role in stabilizing the structure of peptides and proteins, holding the toxin in the correct conformation to bind its pharmacologic target with high selectivity and affinity (pM to μM). They also reduce immunogenic reactions of the prey animals, which increases the efficiency with which the toxin reaches its receptor. C-terminal amidation is also frequently found in toxins. More unusual PTMs have also been characterized, especially within the cone snail family.
Hydroxylation of prolines [13], bromination of tryptophans [13, 14], γ-carboxylation of glutamic acids [15, 16], sulfation of tyrosines [17], and N-terminal pyroglutamic acid formation [18] are PTMs observed more or less frequently in the toxin world. D-amino acids are also present in toxins [19]. Phosphorylation of animal toxins has not been described in the literature so far, in contrast to glycosylation, on which a few articles have been published [20]. Considering snake venoms only, several high-mass enzymes have been found to be glycosylated. For example, L-amino acid oxidase from the venom of the Malayan pit viper Calloselasma rhodostoma is remarkably homogeneously N-glycosylated at Asn172 and Asn361 [20, 21]. Many of these glycosylated snake toxins belong to the serine protease group. The mass of the predicted mature protein of Bothrops protease A, a trypsin-like serine peptidase, is 25.4 kDa. Sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) experiments reveal, however, a mass of approximately 67.4 kDa because ~62% of the total toxin mass is due to glycosylation [22]. Most of the glycosylated snake toxins described in the literature are enzymes or high-mass toxins, between 20 and 120 kDa. The smallest glycosylated toxins found in snake venom belong to the large family of three-fingered toxins. The glycosylated cytotoxin discovered in the venom of the Thai cobra Naja kaouthia weighs around 9–10 kDa instead of 7 kDa because of its carbohydrate microheterogeneity [23]. Other glycosylated snake toxins were reported in a 2009 review by Soares and Oliveira [20].
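The orders of magnitude quoted in this introduction can be re-derived with a few lines of arithmetic. The sketch below is only a consistency check on the figures given in the text (species counts, about 250 toxins per venom, 1600 identified toxins, and the 25.4 and 67.4 kDa masses of Bothrops protease A); the function names are illustrative and not taken from any cited work.

```python
# Figures quoted in the introduction; nothing here is new data.
SPECIES = {"terrestrial": 44_000, "marine": 20_000, "flying": 103_000}
TOXINS_PER_VENOM = 250
IDENTIFIED_TOXINS = 1_600


def toxin_pool(species=SPECIES, per_venom=TOXINS_PER_VENOM):
    """Estimated number of peptide toxins across all reported venomous species."""
    return sum(species.values()) * per_venom


def identified_percent(identified=IDENTIFIED_TOXINS):
    """Share of the estimated pool that has been identified, in percent."""
    return 100.0 * identified / toxin_pool()


def glycan_mass_fraction(apparent_kda, predicted_kda):
    """Fraction of the apparent mass not explained by the predicted protein mass."""
    return (apparent_kda - predicted_kda) / apparent_kda


if __name__ == "__main__":
    print(f"pool size:       {toxin_pool():,} toxins")
    print(f"identified:      {identified_percent():.4f} %")
    print(f"glycan fraction: {glycan_mass_fraction(67.4, 25.4):.1%}")
```

Under these inputs the pool exceeds 40 million toxins, the identified share stays below 0.01%, and the glycan share of Bothrops protease A comes out close to 62%, in line with the values stated above.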

In this study, we investigate the venom of the green mamba, Dendroaspis angusticeps, to detect new peptide families and activities. Mamba venoms mainly consist of high-mass toxins such as neurotoxins (dendrotoxins) [24], acetylcholinesterase inhibitors (fasciculins) [11], muscarinic toxins [25], and adrenergic toxins [9]. Looking at the low-mass range, we detected an unusual family of small peptides of 1–2 kDa. This family of peptides is characterized by a very high proportion of proline residues (~30%) and a glycosylation site. Although proline residues and glycosylation complicate the study, we succeeded in determining the peptide sequences, the glycan moiety, and the glycosylation site by the combined use of collision-induced dissociation (CID) and electron transfer dissociation (ETD) experiments. The present study reports the smallest glycosylated peptides described from snake venom to date. Interestingly, the sequences of the peptides do not match any published sequence or part of a sequence, which indicates that these peptides constitute a new family of snake peptides expressed in the venom.

2 Materials and Methods

2.1 Venom Fractionation

One gram of Dendroaspis angusticeps venom (Latoxan, Valence, France) was separated into 13 fractions by ion exchange chromatography (2 × 15 cm) on Source 15S using the protocol previously described [26]. Fraction B (2 mg) was purified by reversed-phase chromatography (Waters 600) on a semipreparative column (C18, 15 μm, 1.6 × 20 cm; Vydac, Sigma-Aldrich, Saint Quentin Fallavier, France) using a linear gradient from 10% to 50% acetonitrile (ACN) with 0.1% trifluoroacetic acid (TFA) in 50 min at 4 mL/min (eluent A: 0.1% TFA + 5% ACN; eluent B: 0.1% TFA in ACN).

2.2 Mass Spectrometry Analysis

MALDI mass spectrometry experiments were carried out with an Ultraflex II MALDI-TOF/TOF mass spectrometer (Bruker Daltonics, Bremen, Germany) equipped with a Nd:YAG Smartbeam laser (MLN 202, LTB) [27]. 2,5-Dihydroxybenzoic acid (2,5-DHB) at 20 mg/mL in acetonitrile/0.1% formic acid 50/50 (vol/vol) was used as the matrix for the fingerprinting of the fractions and for TOF/TOF experiments. Electron transfer dissociation experiments were carried out with an amaZon ion trap mass spectrometer (Bruker Daltonics, Bremen, Germany). Two μL of the analyzed fraction were diluted into 9 μL of 0.1% formic acid/acetonitrile 50/50 (vol/vol). After the addition of 5% of i-PrOH, the sample was directly infused using the Advion Triversa Nanomate system (Advion, Harlow, United Kingdom). CID and ETD were performed on the triply charged ions of the glycosylated peptides. Spectra were then processed with FlexAnalysis (ver. 3.0), DataAnalysis (ver. 4.0), BioTools (ver. 3.2), and SequenceEditor (ver. 3.2) software (Bruker Daltonics, Bremen, Germany).

3 Results and Discussion

3.1 Fractionation of the Crude Venom

The green mamba venom was separated into 13 fractions by cation exchange chromatography over 16 h (Figure 1a). As the venom consists of hundreds of compounds, each of these fractions was sub-fractionated by reversed-phase chromatography. In the case of fraction B, which represents the most intense chromatographic peak in the first dimension, eleven secondary fractions were collected and named Da1BA to Da1BK (Figure 1b).

Figure 1

Two-step chromatographic purification of Dendroaspis angusticeps crude venom. (a) Preparative cation-exchange chromatography of the crude venom. Thirteen fractions were collected (labeled A to M). (b) Reversed-phase chromatography of fraction B on a Vydac C18 semipreparative column. Eleven fractions were collected (labeled A to K). Blue letters indicate the fractions in which glycopeptides were detected.

MALDI-TOF analysis reveals that Da1BB and Da1BE to Da1BK contain only masses that fit classic snake toxins (6–9 kDa and 13–15 kDa). However, when analyzing Da1BA, Da1BC, and Da1BD, intense signals were detected from peptides of around 2 kDa (Figure 2). Snake toxins of around 2 kDa have already been described in several species. For example, sarafotoxins are highly toxic peptides present in the venom of Atractaspididae snakes [28]. These vasoconstrictor peptides, similar to mammalian endothelins in terms of structure and biological activity, consist of 15 to 30 amino acids, which corresponds to masses between 1.8 and 4 kDa [29]. Other low-mass toxins, such as bradykinin-potentiating peptides [30] and natriuretic peptides [31], have also been described in snake venoms. This mass range nevertheless remains atypical for Dendroaspis venoms. The peptides were therefore selected and fragmented to determine their sequences.

Figure 2

MALDI-TOF mass spectra of (a) Da1BA, (b) Da1BC, and (c) Da1BD sub-fractions. The samples were spotted using 2,5-DHB as the matrix. The four detected glycopeptides are highlighted in bold red.

3.2 Peptide Sequencing

All of the small peptides were fragmented by MALDI-TOF/TOF to determine their sequences by de novo sequencing. Interestingly, four of them displayed atypical fragmentation patterns compared with classical peptides (Figure 3). Intense neutral losses of 162 and 203 Da were observed in each spectrum, corresponding to losses of hexose (Hex) and N-acetylhexosamine (HexNAc), respectively. These markers clearly reveal a carbohydrate moiety in each peptide.
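The search for such glycan markers can be sketched as a scan for pairs of peaks separated by the mass of a Hex residue, a HexNAc residue, or both. The code below is a minimal illustration, assuming singly charged MALDI ions and a hand-picked tolerance; the function name and the example peak list (theoretical values, not measured data) are hypothetical.

```python
# Monoisotopic residue masses of the two monosaccharides (Da).
GLYCAN_LOSSES = {
    "Hex": 162.0528,          # hexose, C6H10O5
    "HexNAc": 203.0794,       # N-acetylhexosamine, C8H13NO5
    "Hex+HexNAc": 365.1322,   # both residues lost together
}


def find_neutral_loss_pairs(peaks, tol=0.05):
    """Return (higher m/z, lower m/z, label) for peak pairs separated by a glycan mass.

    Assumes singly charged ions, so an m/z difference equals a mass difference.
    """
    ordered = sorted(peaks)
    pairs = []
    for i, low in enumerate(ordered):
        for high in ordered[i + 1:]:
            for label, delta in GLYCAN_LOSSES.items():
                if abs((high - low) - delta) <= tol:
                    pairs.append((high, low, label))
    return pairs


if __name__ == "__main__":
    # Illustrative peak list: a precursor and two hypothetical neutral-loss fragments.
    example = [2078.10, 1916.05, 1712.97, 1500.00]
    for high, low, label in find_neutral_loss_pairs(example):
        print(f"{high:.2f} -> {low:.2f}: loss of {label}")
```

A pairwise scan is quadratic in the number of peaks, which is acceptable for a single MS/MS spectrum; the tolerance would need to be adapted to the mass accuracy of the instrument actually used.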

Figure 3

MALDI-TOF/TOF mass spectra of the four glycopeptides characterized in D. angusticeps venom. Peptides (a) and (b) were sequenced from the Da1BA fraction, (c) and (d) from Da1BC and Da1BD, respectively. (Full-size spectra are available in Online Resources 1 and 2.)

The remainder of each spectrum mainly shows low-abundance peaks from the peptide backbone. These signals are, however, sufficiently intense to perform a full manual de novo sequencing. In the figure, leucines have been labeled arbitrarily for better readability, but because leucine and isoleucine have identical masses, they cannot be differentiated at this stage. The four sequences may correspond to the same peptide that has been degraded, for example by an endopeptidase. This question cannot be settled at present, as no biological target has been determined for these new peptides so far. One remarkable point, however, is that the four sequences comprise a high proportion of proline residues, representing around one-third of the total amino acid sequence (Table 1). Proline is known to make peptide sequencing by mass spectrometry difficult [32, 33]. In CID, fragmentation N-terminal to proline is frequently favored, giving rise to abundant peaks, whereas C-terminal fragmentation is seldom detected. Fortunately, in this case, the peptide sequence contains the basic amino acid lysine at (or close to) the N-terminus, which explains why b-type ions are more abundant than y-type ions.
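The proline content and the backbone positions affected by the proline effect can be read directly from a sequence. The following sketch uses the two sequences given in the Figure 4 caption; the helper names are illustrative, and the list of b-ions is only where cleavage N-terminal to a proline would fall, not a prediction of measured intensities.

```python
# The two sequences quoted in the Figure 4 caption.
PEPTIDES = {
    "17-mer": "KSPPQALNKPLPAPSAP",
    "19-mer": "KSPPQALNKPLPAPSAPSL",
}


def residue_fraction(sequence, residue="P"):
    """Fraction of positions in `sequence` occupied by `residue`."""
    return sequence.count(residue) / len(sequence)


def b_ions_before_proline(sequence):
    """Indices n of the b_n ions produced by cleavage N-terminal to each proline.

    A proline at 0-based index i is preceded by i residues, so the
    N-terminal fragment of that cleavage is b_i.
    """
    return [i for i, aa in enumerate(sequence) if aa == "P" and i > 0]


if __name__ == "__main__":
    for name, seq in PEPTIDES.items():
        share = residue_fraction(seq)
        sites = ", ".join(f"b{n}" for n in b_ions_before_proline(seq))
        print(f"{name}: {share:.0%} proline; cleavage before Pro gives {sites}")
```

Both sequences contain six prolines, i.e. roughly one residue in three, which is consistent with the proportion stated above.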

Table 1 Sequences of the four glycosylated peptides reveal a high proportion of proline residues (underlined). Potential glycosylated residues are indicated in bold. (Masses include the observed glycan part)

From a biological point of view, proline-rich snake toxins have already been described as bradykinin-potentiating peptides and angiotensin-converting enzyme inhibitors [34]. Although this information could hint at the activity of our new peptides, the poor sequence homology between the known toxins and the new peptides precludes any conclusion of this type. In addition, none of the previously identified toxins is glycosylated, which is another important structural difference. In our study, the glycan moiety is composed only of a hexose (Hex) linked to an N-acetylhexosamine (HexNAc). The HexNAc of this glycan is linked to the side chain of an amino acid, as revealed by the MS/MS spectra (Figure 3). Determining the nature of the two monosaccharides (for example, Hex could be glucose, galactose, or mannose, and HexNAc could be N-acetylglucosamine or N-acetylgalactosamine) as well as the configuration of the glycosidic linkage between them (1–3, 1–4, 1–6) is a complex task. It is usually achieved by GC-MS analysis of the released and derivatized monosaccharides [35]. However, the low amount of available material did not allow us to go further in the glycan characterization.

Localization of the glycosylation site is an important but nontrivial task in the characterization of the new peptides. Each of them includes three or four hypothetical modification sites: an asparagine, which could be N-glycosylated, and two or three serine residues, which could be O-glycosylated. Among the different strategies used to localize peptide glycosylation, the most suitable is based on electron-mediated fragmentation techniques, particularly electron transfer dissociation (ETD) [36]. Whereas CID and related techniques mainly lead to the fragmentation of the glycan moiety, ETD (like ECD) preserves the glycosylation on the peptide backbone. ETD cleaves the backbone between amino acids and thereby provides peptide sequence information [37]. The two fragmentation techniques are therefore complementary for the analysis of glycopeptides. However, the large number of proline residues in the sequence could considerably lower the power of ETD fragmentation. Because of the cyclic structure of proline, no fragments are observed from cleavage at this residue: even when the N–Cα bond is cleaved, the resulting fragments remain linked via the proline ring. Figure 4 displays the ETD mass spectra obtained for peptides 3 and 4 (Table 1) after fragmentation of the triply charged precursor ions (at m/z 693.4 and 760.1, respectively). Unfortunately, the two other peptides could not be studied by ETD because not enough sample was available for these experiments. However, as the peptides have a conserved sequence, we assumed that the glycosylation of peptides 1 and 2 is localized on the same amino acid as in peptides 3 and 4.

Figure 4

Electron transfer dissociation mass spectra of the ions at (a) m/z 693.4 and (b) m/z 760.1 corresponding to the triply charged species of the glycosylated peptides KSPPQALNKPLPAPSAP and KSPPQALNKPLPAPSAPSL, respectively. The experiment determines the glycosylation at serine 15 without any ambiguity
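The two precursor m/z values in this caption can be cross-checked from the sequences and the Hex-HexNAc glycan. The sketch below uses standard monoisotopic residue masses for the seven amino acids that occur in these peptides; the function name is illustrative, and leucine is used for every Leu/Ile position, which does not affect the mass.

```python
# Monoisotopic residue masses (Da) for the amino acids present in the two peptides.
RESIDUE = {
    "K": 128.09496, "S": 87.03203, "P": 97.05276, "Q": 128.05858,
    "A": 71.03711, "L": 113.08406, "N": 114.04293,
}
WATER = 18.01056     # added once per peptide for the free N- and C-termini
PROTON = 1.007276
HEX = 162.05282      # hexose residue
HEXNAC = 203.07937   # N-acetylhexosamine residue


def precursor_mz(sequence, charge, glycan=0.0):
    """m/z of [M + zH]z+ for a peptide carrying an optional glycan mass."""
    neutral = sum(RESIDUE[aa] for aa in sequence) + WATER + glycan
    return (neutral + charge * PROTON) / charge


if __name__ == "__main__":
    for seq in ("KSPPQALNKPLPAPSAP", "KSPPQALNKPLPAPSAPSL"):
        mz = precursor_mz(seq, charge=3, glycan=HEX + HEXNAC)
        print(f"{seq}: [M+3H]3+ = {mz:.2f}")
```

With one Hex and one HexNAc attached, the computed triply charged ions fall at about m/z 693.37 and 760.08, in agreement with the values of 693.4 and 760.1 reported for the two precursors.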

Despite the presence of several proline residues, the ETD spectra display numerous signals, including the precursor [M + 3H]3+, which did not react with the ETD reagent, the charge-reduced species [M + 3H]2+• and [M + 3H]+••, deriving from electron transfer with no dissociation (ETnoD), and, finally, an extensive series of c- and z-fragment ions. The glycosylation site can be deduced in different ways. For example, in the spectrum of Figure 4a, ions from c4 to c14 match the expected masses of the non-glycosylated fragments, indicating that the glycosylation is not localized on the first 14 amino acids. Consequently, the only remaining possibility is glycosylation of the serine in position 15. This hypothesis is validated (i) by the annotation of the c15 and c16 ions, which include the mass of the modified serine, and (ii) by the signals observed for the z-ions. All the z-ions (from z9 to z15) are in complete agreement with the glycosylated fragments, which indicates that the glycosylation is localized between amino acids 9 and 17. These concordant conclusions unambiguously establish that serine 15 is the modified amino acid. The same strategy was applied to the second peptide and led to the same conclusion (Figure 4b). Serine 15 is also glycosylated in the second peptide, which is additionally validated by the difference observed between the z4 and z5 ions.
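The localization logic can be illustrated by computing theoretical c- and z-ion masses with the glycan placed on a chosen residue. This is a minimal sketch, assuming singly charged fragments, the radical z• (z+1) convention for z-ions, and a Hex-HexNAc glycan on the serine at position 15 of the sequences in the Figure 4 caption; the function names and the `site` parameter are illustrative.

```python
# Monoisotopic masses (Da); residues limited to those present in the two peptides.
RESIDUE = {
    "K": 128.09496, "S": 87.03203, "P": 97.05276, "Q": 128.05858,
    "A": 71.03711, "L": 113.08406, "N": 114.04293,
}
PROTON = 1.007276
HYDROGEN = 1.007825
AMMONIA = 17.026549
WATER = 18.010565
GLYCAN = 162.05282 + 203.07937  # Hex + HexNAc


def c_ion(sequence, n, site=None):
    """Singly charged c_n ion; `site` is the 1-based glycosylated position, if any."""
    mz = sum(RESIDUE[aa] for aa in sequence[:n]) + AMMONIA + PROTON
    if site is not None and site <= n:
        mz += GLYCAN  # the fragment spans residues 1..n
    return mz


def z_ion(sequence, n, site=None):
    """Singly charged z•_n ion (z+1 convention, an assumption of this sketch)."""
    mz = sum(RESIDUE[aa] for aa in sequence[-n:]) + WATER - AMMONIA + HYDROGEN + PROTON
    if site is not None and site > len(sequence) - n:
        mz += GLYCAN  # the fragment spans the last n residues
    return mz


if __name__ == "__main__":
    short, long_ = "KSPPQALNKPLPAPSAP", "KSPPQALNKPLPAPSAPSL"
    for n in (14, 15):
        print(f"c{n}: {c_ion(short, n, site=15):.3f}")
    for n in (4, 5):
        print(f"z{n}: {z_ion(long_, n, site=15):.3f}")
```

With the glycan on serine 15, c14 keeps its unmodified mass while c15 is shifted by the serine residue plus 365.13 Da, and in the 19-residue peptide the same shift appears between z4 and z5. Placing the glycan on serine 2 or asparagine 8 instead would shift c4 or c8 onward, which is what the unmodified c4 to c14 series rules out.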

3.3 Leucine or Isoleucine?

Each new peptide contains one, two, or even three leucine or isoleucine residues. To distinguish between leucine and isoleucine, Edman degradation was performed on the longest peptide. The results (not shown) clearly indicate that all three ambiguous residues are leucines. The experiment additionally confirmed the sequence of the peptide as well as the glycosylated residue, for which no signal was observed after chemical degradation. Database searches by sequence homology did not highlight any particular family of proteins or toxins, confirming that this kind of peptide had never been described before.
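The reason mass measurement alone cannot settle this question is that leucine and isoleucine residues share the same elemental composition, C6H11NO. The short sketch below recomputes the residue mass from that formula and counts the sequence variants left open by the ambiguous positions; the helper names are illustrative.

```python
from itertools import product

# Monoisotopic masses of the elements (Da).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Leu and Ile residues have the same elemental composition, C6H11NO.
RESIDUE_FORMULA = {
    "Leu": {"C": 6, "H": 11, "N": 1, "O": 1},
    "Ile": {"C": 6, "H": 11, "N": 1, "O": 1},
}


def residue_mass(formula):
    """Monoisotopic mass of a residue from its elemental composition."""
    return sum(MONO[element] * count for element, count in formula.items())


def leu_ile_variants(sequence):
    """All sequences obtained by writing each 'L' as either L (Leu) or I (Ile)."""
    choices = [("L", "I") if aa == "L" else (aa,) for aa in sequence]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    for name, formula in RESIDUE_FORMULA.items():
        print(f"{name}: {residue_mass(formula):.5f} Da")
    print(len(leu_ile_variants("KSPPQALNKPLPAPSAPSL")), "candidate sequences")
```

For a peptide with three ambiguous positions there are eight isobaric candidates, which is why a chemical method such as Edman degradation is needed to decide between them.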

4 Conclusion

This study describes the structural characterization of a novel family of glycosylated peptides from Dendroaspis angusticeps venom. Mass spectrometry was used extensively to characterize four new peptides. Combined CID and ETD experiments were used to sequence the peptides, to characterize the carbohydrate moiety, and to localize the modification. Additionally, Edman degradation was carried out to complete the sequences by determining the nature of the isobaric residues (leucine or isoleucine). The high proportion of proline residues and the glycosylated serine are particularly atypical in comparison with other snake peptides. The next task is to determine their biological activity in order to understand their role in the venom. Initial tests have already been performed on several G-protein-coupled receptors (GPCRs). None of the receptors tested showed affinity for these new compounds. Further experiments are, however, under way on other types of GPCRs and on ion channels.