1 Introduction

Tandem mass spectrometry (MS/MS) is a powerful technique in peptide sequencing and protein identification, with collisionally activated dissociation (CAD) [1, 2] being the most routinely used method. In CAD, energy transfer through collisions with inert gas molecules can lead to internal excitation and eventual dissociation of the precursor ions, usually via backbone amide bond cleavages, producing b- and y-type ions. The mass difference between fragment ions of the same type resulting from cleavages of adjacent amide bonds can be used to deduce the identity of the amino acid residue in between. When such a series of fragment ion masses is present in the tandem MS spectrum, the peptide sequence may be reconstructed. CAD has been extended to large biological ions [3, 4], and in-source CAD of intact protein ions up to 200 kDa in mass has been reported [5]. Because of CAD’s tendency to preferentially break the weakest bond in peptides [6–8], high sequence coverage is often difficult to achieve, particularly in the presence of labile amide bonds, such as the Asp/Glu-Pro sequence [9, 10]. Multiple stages of CAD may be performed to improve the sequence coverage. However, in such MSn experiments, sequence rearrangements of b-ions may occur, as evidenced by the observation of non-direct sequence fragment ions with loss of internal amino acid residues, which can lead to erroneous peptide sequence assignments [11, 12]. On the other hand, the recently developed electron capture dissociation (ECD) method tends to cleave, nonselectively, many more inter-residue N–Cα bonds along the peptide backbone, and has proved to be an extremely valuable tool in post-translational modification (PTM) characterization because of its ability to preserve labile modifications while breaking backbone bonds.
In general, though, as the size of the protein increases, ECD efficiency decreases, often due to the existence of extensive noncovalent interactions which both restrict the conformational space and prevent fragment ion separation and detection. The ECD efficiency and sequence coverage can be improved by precursor ion activation, via either infrared irradiation or collisions with inert gases, as employed in activated ion (AI)-ECD [13]. It is often best to perform CAD and ECD in duet, where the complementary information provided by each method individually can be combined to allow higher confidence in sequence assignment and protein identification, either de novo or through database searching [14]. Conceivably, sequence coverage may also be increased by performing CAD and ECD in tandem, where a large protein ion is first fragmented by CAD, and a fragment ion that is both smaller in size and somewhat activated is then subjected to ECD [15]. Because non-direct sequence ions have been observed in b-ion CAD experiments, it is important to investigate whether they are also produced when ECD is used in the second tandem MS stage. Since the fragmentation behavior of an ion is influenced by its structure, a better understanding of the fragment ion structure is beneficial for spectral interpretation and protein sequencing using the MS3 approach.
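The ladder-reading idea described above, in which the mass difference between adjacent fragment ions of the same type identifies the intervening residue, can be sketched in a few lines of Python. This is only an illustration: the function names, the 0.01 Da tolerance, and the simulated b-ion ladder are assumptions made for the example and are not part of the original work. The residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses (Da) of the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276  # mass of a proton (Da)


def b_ion_mz(sequence, charge=1):
    """m/z of the b-ion with the given sequence, assuming all charges are protons."""
    neutral = sum(RESIDUE_MASS[aa] for aa in sequence)
    return (neutral + charge * PROTON) / charge


def read_ladder(mz_values, charge=1, tol=0.01):
    """Assign a residue to each gap between consecutive same-type fragment ions."""
    ordered = sorted(mz_values)
    residues = []
    for low, high in zip(ordered, ordered[1:]):
        delta = (high - low) * charge  # convert the m/z gap back to a mass gap
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) <= tol]
        residues.append("/".join(sorted(matches)) if matches else "?")
    return residues


# Simulated b2..b9 ladder for the neurokinin A b9 sequence (illustrative only).
ladder = [b_ion_mz("HKTDSFVGL"[:n]) for n in range(2, 10)]
print(read_ladder(ladder))
```

Leu and Ile are isobaric, so the sketch reports them together as "I/L". A gap that matches no single residue is reported as "?", which is how a non-direct sequence ion would typically show up when a ladder is read this way.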

Of the major product ions produced by CAD, the y-ions generally have the same structure as the protonated truncated peptides [16], but the b-ions may exist as several structural variants. The structure of peptide b-ions has been the subject of a long-standing debate. Originally, the b-ion structure was proposed to be an acylium [17], although the oxazolone is now generally regarded as the kinetically and energetically favored product based on its fragmentation behavior, a conclusion that is also supported by both theoretical modeling [18] and infrared (IR) spectroscopy [19]. There has been growing evidence recently that a b-ion may also assume a macro-cyclic form, which can lead to extensive sequence scrambling upon CAD [11, 12, 20–27]. Most of the macro-cyclic structures are formed via nucleophilic attack on the C-terminus by the N-terminal amino group, while in some lysyl and ornithyl peptides, the ε-amino group on the basic amino acid side chain may also be involved [28, 29]. Upon further collisional activation, these cyclic b-ions can reopen at various positions to form a mixture of linear oxazolones, giving rise to non-direct sequence ions. A recent review presented an extensive discussion of the structures of small b1–b4 ions and medium-sized bn ions generated in CAD [30].

Factors influencing b-ion cyclization have been investigated, although no obvious correlation was established. Medium-sized b-ions are generally considered to have a greater potential to form a macro-cyclic structure than large b-ions or small b2 to b4 ions, which follow stepwise degradation via “the oxazolone rule” [20, 25, 31, 32]. The effect of amino acid composition was also investigated. The aliphatic amino acids, such as Ile, Leu, Val, and Ala, were found to be prone to internal elimination [12]. Selective ring opening of protonated cyclic peptides [24] may be influenced by the Pro [33] and the Asn/Gln effects [34], as well as certain acidic, basic, and amide side chains [35]. Doubly charged b-ions have a higher tendency to form cyclic structures than their singly charged counterparts [12]. Further, activation methods and collisional cooling rates may also play important roles in CAD chemistry [32], and the experimental conditions, such as the collisional activation energy and activation time, can affect the formation of the macro-cyclic structure as well [27].

Very few ECD studies on b-ions have been performed. In one ECD study, abundant CO loss was observed and used as evidence to support the acylium structure [36], although it was later demonstrated that CO may also be lost from the oxazolone form [37]. In the current study, ECD was performed on various doubly charged medium-sized bn ions (n = 5–10) from several tachykinin peptides with multiple basic residues near their N-termini to investigate whether sequence scrambling presents a problem in tandem CAD/ECD experiments, and to probe the b-ion structures based on their fragmentation behaviors. Theoretical calculations were also performed to identify the low energy conformers of selected b-ions.

2 Experimental

All experiments were performed on a solariX 12 T hybrid Qh-Fourier transform ion cyclotron resonance (FTICR) MS instrument (Bruker Daltonics, Billerica, MA, USA). Substance P (RPKPQQFFGLM-NH2) and neurokinin A (HKTDSFVGLM-NH2) were purchased from Sigma-Aldrich (St. Louis, MO, USA), and eledoisin (pEPSKDAFIGLM-NH2) was purchased from AnaSpec (Fremont, CA, USA); all were used without further purification. Peptides were dissolved in standard electrospray solution containing 49.5:49.5:1 methanol:water:formic acid (vol/vol/vol) to a concentration of ~2 μM. Micro-electrospray was used to introduce the sample into the mass spectrometer at a flow rate of ~100 μL/h. In-source dissociation in the funnel-skimmer region was used to fragment the precursor ions, and the collision energy was adjusted to maximize the abundance of the specific fragment ion of interest for further ECD analysis. All fragment ions were focused by and pre-stored in the source octopole, and the doubly charged b-ions of interest were isolated by the mass filtering quadrupole with an isolation window of ~2 m/z units, and accumulated in the collision cell hexapole for up to 8 ms before being transferred to the ICR cell. Low energy electrons (~1 eV) were generated by a hollow cathode dispenser mounted on the rear side of the ICR cell, with an extraction lens located between the cathode and the cell to help guide the electrons into the ICR cell. The spectra were internally calibrated using the precursor ion and charge reduced species, analyzed using the DataAnalysis software (Bruker Daltonics, Billerica, MA, USA), and manually interpreted, with a typical mass accuracy of around 0.5 ppm.
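The mass accuracy quoted above is a relative error in parts per million. A minimal sketch of that arithmetic follows; the function names and the m/z values are hypothetical and were chosen only to make the calculation easy to follow, they are not measurements from this study.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=0.5):
    """True if the assignment falls inside the given ppm window."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Hypothetical values: a 0.0005 m/z deviation at m/z 500 is a 1 ppm error.
theoretical = 500.0
observed = 500.0005
print(f"{ppm_error(observed, theoretical):.2f} ppm")
print(within_tolerance(observed, theoretical))
```

At m/z 500, a 0.5 ppm window corresponds to only 0.00025 m/z units, which is why a low signal-to-noise monoisotopic peak can noticeably degrade an assignment.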

Low energy conformers of selected b-ions were investigated theoretically. Molecular dynamics (MD) simulations were performed to generate the energetically favored conformer candidates. The CHARMm force field was applied in the MD simulations, using the canonical (NVT) ensemble. In order to explore the conformational space as comprehensively as possible, the system temperature was set to 1000 K. The integration time step was 1 fs, and each trajectory lasted 200 ps. For each trajectory, geometries were recorded every 5 ps, followed by temperature-independent energy minimization using molecular mechanics. The MD simulations were performed using the Discovery Studio 5.0 software (Accelrys, San Diego, CA, USA).
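The sampling parameters above imply a fixed number of integration steps and recorded geometries per trajectory. The short sketch below just works through that bookkeeping; the function name is hypothetical, and whether the starting geometry is also recorded (which would add one frame) is an assumption that the text does not settle.

```python
FS_PER_PS = 1000  # femtoseconds per picosecond


def md_bookkeeping(time_step_fs=1, trajectory_ps=200, save_interval_ps=5):
    """Return (integration steps, steps between saved frames, saved frames)."""
    n_steps = trajectory_ps * FS_PER_PS // time_step_fs
    steps_between_saves = save_interval_ps * FS_PER_PS // time_step_fs
    # Assumes frames are written at 5, 10, ..., 200 ps (initial frame not counted).
    n_frames = n_steps // steps_between_saves
    return n_steps, steps_between_saves, n_frames


steps, stride, frames = md_bookkeeping()
print(steps, stride, frames)
```

With the stated settings, each trajectory therefore contributes a few dozen candidate geometries to the subsequent energy minimization.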

The energies of the favorable conformer candidates generated by the MD simulations were further calculated using quantum chemistry methods. Due to the relatively large sizes of the peptide ions studied, geometry optimizations were performed using the ab initio restricted Hartree-Fock (RHF) self-consistent field approach with the small 3-21G basis set. To minimize the energy inaccuracy resulting from the omission of the correlation energy, density functional theory (DFT)-based single point energy calculations were performed on the optimized HF/3-21G geometries using the hybrid of Becke’s exchange and Lee-Yang-Parr’s correlation functionals (B3LYP) with the 6-31G(d) basis set. The B3LYP/6-31G(d) electronic energies were used as zero Kelvin enthalpies without zero-point vibrational energy correction. All calculations were carried out using the Gaussian 03 program suite [38] at the supercomputing facility at Boston University.
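Relative conformer energies of the kind reported here are obtained by subtracting the lowest single point energy from each of the others and converting from Hartree to kcal/mol. A minimal sketch follows, assuming the energies are available as plain numbers; the conformer labels and energy values are hypothetical placeholders, not results from this work.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095  # standard conversion factor


def relative_energies(energies_hartree):
    """Map each conformer to its energy above the lowest one, in kcal/mol."""
    lowest = min(energies_hartree.values())
    return {
        name: (energy - lowest) * HARTREE_TO_KCAL_PER_MOL
        for name, energy in energies_hartree.items()
    }


# Hypothetical single point energies (Hartree), for illustration only.
example = {"cyclic": -2500.010, "oxazolone": -2500.000}
for name, rel in relative_energies(example).items():
    print(f"{name}: {rel:.2f} kcal/mol")
```

Because no zero-point correction is applied, the numbers produced this way are differences in electronic energy only.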

3 Results and Discussion

Three tachykinin neuropeptides [39–41] (neurokinin A, eledoisin, and Substance P) were used as model peptides. The tachykinin peptides are neurotransmitters and can rapidly induce the contraction of gut tissue [42]. They have a common C-terminal sequence, Phe-Xxx-Gly-Leu-Met-NH2, which is considered the message domain, with Phe important for receptor binding and Xxx important for receptor selectivity [43]. The ECD spectra of b-ions of various sizes from neurokinin A, eledoisin, and Substance P are shown in Figures 1, 2, 3, and 4. The calculated low energy conformers of the b8²⁺ and b6²⁺ ions of Substance P are shown in Figures 5 and 6. The major peaks are labeled in the spectra, and the detailed peak lists can be found in Supplemental Tables 1–11.

Fig. 1

ECD spectra of the doubly charged b9 (a), b8 (b), and b7 (c) ions from neurokinin A (HKTDSFVGLM-NH2). The z-a, z-b, and z-c-ions are fragment ions with their N-termini resembling that of a z-ion, and C-termini resembling that of an a-, b-, and c-ion, respectively. Partial side chain loss and small molecule loss are labeled with the molecular composition of the leaving group(s). Peaks marked with “-amino acid residue” represent the entire side chain loss. *Marks noise peaks, and ω marks harmonic peaks

Fig. 2

ECD spectra of the doubly charged b10 (a) and b9 (b) ions from eledoisin (pEPSKDAFIGLM-NH2). Peak labeling follows the same convention as that in Fig. 1

Fig. 3

ECD spectra of the doubly charged b10 (a), b9 (b), and b8 (c) ions from Substance P (RPKPQQFFGLM-NH2). Peak labeling follows the same convention as that in Fig. 1

Fig. 4

ECD spectra of the doubly charged b7 (a), b6 (b), and b5 (c) ions from Substance P (RPKPQQFFGLM-NH2). Peak labeling follows the same convention as that in Fig. 1

Fig. 5

Lowest energy conformers of the Substance P b8²⁺ ion and their relative energies obtained at the B3LYP/6-31G(d)//RHF/3-21G level of theory. (a) The FK-linked cyclic structure, (b) the oxazolone structure, and (c) the FR-linked cyclic structure. Numbers are in units of kcal/mol

Fig. 6

Lowest energy conformers of the Substance P b6²⁺ ion and their relative energies obtained at the B3LYP/6-31G(d)//RHF/3-21G level of theory. (a) The QK-linked cyclic structure, (b) the oxazolone structure, and (c) the QR-linked cyclic structure. Numbers are in units of kcal/mol

3.1 ECD of b-Ions from Neurokinin A and Eledoisin

Neurokinin A (HKTDSFVGLM-NH2), also known as Substance K, is a homologue of Substance P (RPKPQQFFGLM-NH2) [41]. ECD of the b9²⁺ and b8²⁺ ions from neurokinin A generated a nearly complete series of c-ions (Figure 1a and b), which can be produced from the linear oxazolone structure. However, these c-ions comprise only a small portion of the product ions. In the ECD spectrum of the b9²⁺ (HKTDSFVGL²⁺) ion of neurokinin A, the most intense peaks correspond to a series of non-direct sequence ions of the z-b-type with HK as their C-terminal sequence: LHK, GLHK, VGLHK, FVGLHK, SFVGLHK, and DSFVGLHK. A z-b-ion is a fragment ion that resembles a normal z•-ion at its N-terminus and a normal b-ion at its C-terminus. Other types of non-direct sequence ions are also present, labeled as the z-a- and z-c-ions in the spectrum. These ions do not necessarily terminate with HK at their C-termini, and are in general of lower abundance. The ECD spectrum of the b8²⁺ (HKTDSFVG²⁺) ion of neurokinin A showed a cleavage pattern similar to that of the b9²⁺ ion, with the major peaks corresponding to the normal c-ions and the sequence-scrambled GHK, VGHK, FVGHK, SFVGHK, and DSFVGHK (z-b-ion) fragments. The z-a- and z-c-ions are also observed at lower abundances, again not necessarily with an HK C-terminal sequence. ECD of the smaller b7²⁺ (HKTDSFV²⁺) ion from neurokinin A produced predominantly non-direct sequence ions, VHK, FVHK, and SFVHK (z-b-ions), with no c-ions (Figure 1c).

The abundant z-b-ions observed may be explained by the formation of a macro-ring structure of the original b-ion, connecting its C-terminal carbonyl group to its N-terminal amino group, followed by ring reopening to form a new, sequence-rearranged linear oxazolone, preferentially with lysine as its C-terminus (Scheme 1). The preference is likely driven by the favorable formation of an oxazolone with histidine as the amino acid residue next to the C-terminus, characteristic of an ergodic process [44]. ECD of the sequence-rearranged linear b-ions can produce a series of z-b fragment ions as observed in Figure 1a and b. As the charge carrying histidine and lysine residues were located near the C-terminus of the newly formed linear oxazolone, no non-direct sequence c-ions were observed. Meanwhile, ring reopening by cleaving the N–Cα or Cα–C(=O) bond is energetically unfavorable, so that the z-a- and z-c-ions observed were likely formed via a different mechanism. Note that the b-ions, if assuming a macro-ring structure, are essentially the same as cyclic peptide ions. It has been reported that ECD of doubly charged cyclic peptides can generate numerous internal fragment ions, which were proposed to be formed via the free radical cascade (FRC) mechanism [45]. Likewise, the z• radical produced by the N–Cα bond cleavage within the macro-ring of the b-ions studied here can also initiate a free radical cascade, leading to the formation of the z-a- and z-c-ions via apparent loss of one or multiple internal amino acid residues (Scheme 1). The major difference between the formation of the z-b-ions and that of the z-a- and z-c-ions is that the former proceeds via ring-opening prior to ECD and the latter via ECD before ring-opening, followed by FRC.
Thus, z-b-ions were selectively formed with HK as their C-terminal sequence, as expected from an ergodic ring-opening process; while z-a- and z-c-ions tend to contain more random sequences, as expected from the nonstatistical behavior of ECD. These results indicated that b-ions can exist as a mixture of linear and cyclic structures, and their interconversion in the gas phase occurred prior to ECD. Since sequence scrambling occurred before ECD, peptide sequencing using the MS3 data could be erroneous, even when ECD was used in the second tandem MS stage.
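The ring-opening pathway proposed above can be illustrated with simple string operations: join the N- and C-termini, reopen the ring so that lysine becomes the new C-terminus, and list the C-terminal pieces of the rearranged sequence. This is only a sketch of the sequence bookkeeping. The function names and the minimum fragment length of three residues are assumptions made for the example, and no masses or abundances are modeled.

```python
def reopen_at(sequence, new_c_terminus_index):
    """Cyclize head-to-tail, then reopen so the indexed residue is C-terminal."""
    i = new_c_terminus_index
    return sequence[i + 1:] + sequence[:i + 1]


def c_terminal_fragments(sequence, min_len=3):
    """C-terminal pieces from min_len up to len(sequence) - 1 residues, shortest first."""
    return [sequence[-k:] for k in range(min_len, len(sequence))]


for b_ion in ("HKTDSFVGL", "HKTDSFVG"):  # neurokinin A b9 and b8 sequences
    rearranged = reopen_at(b_ion, b_ion.index("K"))  # lysine becomes C-terminal
    print(b_ion, "->", rearranged, c_terminal_fragments(rearranged))
```

For the b9 and b8 sequences, the listed pieces coincide with the z-b-type sequences reported in the text, all ending in HK.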

Scheme 1

Proposed ECD fragmentation pathways of the neurokinin A b9²⁺ ion. FRC stands for the free radical cascade

Although eledoisin (pEPSKDAFIGLM-NH2) has a C-terminal sequence similar to that of neurokinin A, non-direct sequence ions were not observed in the ECD spectra of its b10²⁺ and b9²⁺ ions, as shown in Figure 2a and b, respectively. This is perhaps not surprising in light of the presence of the N-terminal pyroglutamic acid residue, as its N-terminal amino group is blocked by the succinimide formation, thus preventing its conversion to the cyclic form via the N- and C-termini connection. This phenomenon is similar to the previous observation that the N- and C-termini cyclization of b-ions can be effectively blocked by acetylation of the N-terminal amino group [12, 21, 22, 26].

3.2 ECD of b-Ions from Substance P

In addition to the normal c-ions, ECD of the doubly charged bn²⁺ ions (n = 5–10) from Substance P (RPKPQQFFGLM-NH2) produced a series of fragment ions that correspond to c-ions with additional lysine side-chain loss, labeled as cm-Lys, m = 4 to (n–1) (Figures 3 and 4). This is at first surprising, because c-ions do not usually contain a radical that is capable of initiating further side-chain losses as in the case of z• ions [46–49]. The formation of these unusual fragments may be explained by the presence of another type of cyclic b-ion structure, where the C-terminal carbonyl forms an amide bond with the lysine side-chain amino group. In ECD of this C-terminus-Lys linked cyclic b-ion, the initial N–Cα bond cleavage will not result in the formation of individual c and z• fragments, as they are still held together by covalent bonds. The radical on the α-carbon of the z• moiety can migrate through space to induce the lysine side-chain loss and produce the c-Lys fragment (Scheme 2). Because the initial N–Cα bond cleavage within the ring can occur at various positions, a series of cm-Lys fragments can be generated. Similar cyclization involving the ε-NH2 group of the Lys side chain to form a caprolactam structure has been reported previously [50].

Scheme 2

Proposed mechanism for the formation of cm-Lys side chain ions in ECD of the Substance P b10²⁺ ion

The abundance of the cm-Lys fragments relative to that of the cm-ions depends on the size of the precursor bn ions: it increases from n = 10 to 8 (Figure 3) and then decreases from n = 7 to 5 (Figure 4). The c4-Lys fragment ion in the ECD spectrum of the b5²⁺ ion was assigned with a relatively large error (~2 ppm) compared with the ~0.5 ppm mass accuracy of all other data generated in this study. This may be due to the very low signal-to-noise ratio (S/N) of the monoisotopic peak of the c4-Lys fragment in the b5²⁺ ion ECD spectrum, as it was difficult to isolate and accumulate the very low abundance b5²⁺ precursor ions generated by funnel-skimmer dissociation. The ECD spectrum of the b8²⁺ ion was dominated by the cm-Lys ions, with almost no c-ions. Thus, the majority of the b8²⁺ ions were likely present in the C-terminus-Lys cyclized form, which is consistent with the previous study of oligoglycine b8-ions by another group [32]. For most other bn²⁺ ions, both the cm- and cm-Lys-ions were abundantly present in the ECD spectra, indicating that these b-ions likely existed as a mixture of the two forms, both in appreciable quantities. Non-direct sequence ions were seldom observed in any Substance P b-ion ECD spectrum. Thus, the N- and C-termini connected macro-ring structure was either not formed, or formed but preferentially reopened at its original position.

In addition, an electron transfer dissociation (ETD) [51] study of the doubly charged b10-ion of Substance P was performed on an ion-trap instrument as described previously [46]. Although a series of peaks corresponding to the cm-Lys fragment ions were also observed, they were in general present at much lower intensities (data not shown). The lower abundance of the cm-Lys fragment ions in ETD might be due to the lower amount of energy available compared with ECD, reducing the likelihood of secondary fragmentation.

Cyclic structure formation involving the Lys side chain has been proposed previously. In a CAD study of the doubly charged b10, b9, and b7 ions of Substance P, transfer of one (L) or two (GL) residues from the C-terminus of the b-ions to the lysine side chain was observed, suggesting the formation of a cyclic structure connecting the lysine side chain to the C-terminus [28, 29]. For other peptides, fragmentation after ring opening at one of several ring positions was also observed. The presence of a proline residue close to the lysine or ornithine residue on the C-terminal side was considered a feature promoting the ring structure formation.

To better understand the structure and ECD behavior of these doubly charged b-ions, geometry optimization and calculation of the relative energies of low energy conformers of selected b-ions were carried out at the B3LYP/6-31G(d)//RHF/3-21G level. Since no transition states were considered in these calculations, prediction of the fragmentation behavior can be difficult, as some of the calculated thermodynamically favorable structures may not be readily accessible kinetically.

The optimized structures and relative stabilities of the lowest energy linear (oxazolone), C-terminus-Lys linked, and C-N termini linked cyclic structures of the b8²⁺ and b6²⁺ ions generated from Substance P are shown in Figures 5 and 6, respectively. For the b8²⁺ (RPKPQQFF²⁺) ion, the lowest energy oxazolone structure, which has the protonated guanidine and oxazolone groups solvated by the N-terminal amino nitrogen and backbone carbonyl oxygen, respectively, is 6.6 kcal/mol higher in energy compared to its FK-linked cyclic counterpart that has both positive charge sites solvated by backbone carbonyl oxygen atoms. Consequently, cm-Lys-ions are expected to be the dominant fragments in the ECD spectrum of the b8²⁺ ion, in agreement with the experimental observation.

For the b6²⁺ (RPKPQQ²⁺) ion, the QK-linked cyclic structure (Figure 6a) is 12.7 kcal/mol more preferred energetically over the oxazolone structure (Figure 6b). If energetics is the only factor taken into consideration, one would expect the predominance of the QK-linked cyclic structure over the linear oxazolone one, which would lead to the predominant formation of cm-Lys fragment ions as well. The coexistence of both cm- and cm-Lys-ions with comparable abundance in the b6²⁺ ion ECD spectrum indicates that other factors may also play important roles in the ECD process. In particular, this low energy conformer of the QK-linked b6²⁺ ion has a compact and most likely inflexible structure, with the charge on the lysine side chain involved in hydrogen bonding to the carbonyl groups of the Lys3 and Gln5 residues. Electron capture at the protonated Lys3 carbonyl will not lead to backbone cleavage due to the presence of its C-terminal proline residue, while electron capture at the protonated Gln5 residue can produce the c5-Lys side chain ion, which is the only abundantly formed cm-Lys-ion observed in the ECD spectrum of the b6²⁺ ion. On the contrary, the linear oxazolone structure is much more flexible, which affords the conformational heterogeneity to produce various sized cm-ions, as observed experimentally.
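To put the "energetics only" expectation above in numbers, the population ratio implied by an energy gap can be estimated with a Boltzmann factor. The sketch below rests on strong assumptions that are not established in this work: the ions are treated as thermally equilibrated, the temperature of 298.15 K is an arbitrary choice, and electronic energy differences without zero-point or entropic corrections are used in place of free energies. The function name is hypothetical.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal/(mol K)


def boltzmann_ratio(delta_e_kcal_per_mol, temperature_k=298.15):
    """Equilibrium population of the higher-energy form relative to the lower one."""
    return math.exp(-delta_e_kcal_per_mol / (GAS_CONSTANT * temperature_k))


# Energy gaps quoted in the text for the b8 and b6 ions of Substance P.
for label, gap in (("b8, oxazolone vs. FK-linked", 6.6),
                   ("b6, oxazolone vs. QK-linked", 12.7)):
    print(f"{label}: {boltzmann_ratio(gap):.1e}")
```

Under these assumptions the oxazolone form would be a negligible fraction of the b6 population, so the comparable abundance of cm- and cm-Lys-ions observed experimentally points to kinetic or conformational factors rather than equilibrium energetics alone.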

As can be seen from Figures 5 and 6, the lowest energy N- and C-termini linked cyclic structures, FR- for the b8²⁺ ion (Figure 5c) and QR- for the b6²⁺ ion (Figure 6c), are considerably higher in energy compared with the lysine linked cyclic structures. This is perhaps not all that surprising, as the localized charge on the N-terminal arginine is often stabilized by bridging to the N-terminus (Figure 5b), providing no free N-terminal amino group for macrocycle formation. Thus, the N- and C-termini linked cyclic structures are not expected to comprise an appreciable fraction of the total ion population and, consequently, no non-direct sequence ions were observed in the ECD spectra of the b6²⁺ and b8²⁺ ions.

With the sole exception of the b-ions from eledoisin, which does not contain a free N-terminal amino group, macro-ring formation was observed in all b-ions studied here, either via the N- and C-termini linkage or via the Lys side chain and C-terminus connection. The identity of the C-terminal amino acid residue of the b-ions, on the other hand, did not seem to play an important role in the cyclic structure formation in the limited set of samples investigated. The macro-cyclic structure appeared to be abundantly formed in medium-sized b-ions, with their sizes ranging from b6²⁺ to b10²⁺. The abundance of the macro-cyclic structure in smaller b-ions (e.g., the b5²⁺ ion from Substance P) is significantly lower, possibly due to the ring strain present in smaller macrocyclic b-ions containing only trans-amide bonds [18]. Meanwhile, the trend observed in ECD of the Substance P bn ions (n = 8–10) indicated that the cyclic structure formation may also be disfavored as the size of the b-ions increases. This could be due to the entropic factor, i.e., it is more difficult for larger b-ions to find the low-energy conformation that brings the N-terminal or side-chain amino group close to the C-terminal oxazolone in the increased conformational space. ECD of even larger bn ions (n > 10) is currently under investigation.

Finally, it is important to note that while sequence scrambling seems to be a common feature among small bn ions (n ≤ 10), ECD of small b-ions is rarely needed in peptide sequencing and protein identification, as the N-terminal sequence of a protein is usually adequately covered in MS2 experiments. A recent study from this laboratory [15] reported that the CAD/ECD approach can be a valuable tool to provide sequence information in regions that CAD or ECD alone or CAD based MSn cannot access. In that study, no sequence scrambled fragment ions were observed in ECD of large bn ions (n > 30) from the medium-sized proteins hemoglobin and transthyretin (TTR). In addition to the entropic factor mentioned above, the lack of cyclization in these very large b-ions may also be due to the extensive charge solvation that stabilizes the oxazolone structure.

4 Conclusion

Both the ECD experiments and the theoretical calculations demonstrated that a peptide b-ion may exist as a mixture of several different forms in the gas phase, with their propensities influenced by its size, N-terminus, and possibly the side chains of basic amino acid residues. Blockage of the N-terminal amino group eliminated the formation of the N- and C-termini linked cyclic structure. When a lysine residue is present in the sequence, cyclization via the C-terminus-Lys linkage may also occur. The possibility of cyclization is the highest in medium-sized b-ions, and becomes lower in very small b-ions due to the ring strain, as well as in large b-ions due to the decreased likelihood of finding the geometry necessary for ring formation. Non-direct sequence ions can be formed from the cyclized structure after ring reopening at positions different from the original linkage site, with ring-opening following the common preference observed in CAD experiments. When the C-terminus-Lys linkage was formed, unusual fragments corresponding to lysine side-chain loss from the c-ions were observed. The presence of these non-direct sequence ions and unusual secondary fragment ions can complicate the spectral interpretation and may also lead to errors in peptide sequencing. The current results underscore a potential problem in protein identification and peptide sequencing based on the data generated from multiple stages of tandem mass spectrometry. However, such a problem may not be as serious when using the CAD/ECD approach to sequence large proteins, where MS3 is needed the most.