Characterization of the Cysteine Content in Proteins Utilizing Cysteine Selenylation with 266 nm Ultraviolet Photodissociation (UVPD)
Characterization of the cysteine content of proteins is a key aspect of proteomics. Defining both the total number of cysteines and their bound/unbound state significantly constrains the number of candidate proteins considered in database searches. Herein we present a methodology that utilizes 266 nm UVPD to count the number of free and bound cysteines in intact proteins. To this end, proteins were derivatized with N-(phenylseleno)phthalimide (NPSP) to install a Se–S bond that is selectively cleaved upon 266 nm UVPD. The number of Se–S bonds cleaved upon UVPD, a process that releases SePh moieties, corresponds to the number of cysteine residues per protein.
Keywords: Cysteine, Photodissociation, Selenylation
Cysteine is involved both in stabilizing tertiary protein structure through disulfide bonding and in modulating protein redox activity. Numerous studies have aimed to characterize the redox states of cysteines in proteins owing to the inherent importance of cysteine-mediated chemistry in countless biological processes. Methods such as UV absorption spectroscopy, fluorescent labeling, and X-ray absorption spectroscopy have been used to quantify or characterize the cysteine content of proteins. Characterizing cysteine content by mass spectrometry has also become a popular option as a consequence of the development of well-established bottom-up proteomics approaches [5, 6, 7], often in combination with various cysteine-selective derivatization methods. MS strategies for determination of cysteines have been based on utilization of mass tags or isotopic labels [10, 11], differential monitoring of ESI mass spectra after cysteine-selective reactions, proteolysis in isotopically heavy solvents, characterization by high-resolution top-down MS/MS, or utilization of selective ion/ion reactions [15, 16], all of which have proven to be versatile methods for either qualitative or quantitative characterization of cysteine content of proteins and peptides. Electrochemical tagging reactions of cysteines have also been implemented by coupling an electrochemical cell on-line to a mass spectrometer [17, 18, 19, 20, 21]. The latter methodologies have even shown promise for mapping reactivities of cysteine residues in different locations of proteins. Chemical derivatization of cysteine residues in peptides and proteins by reaction with quinone has been shown to differentiate free thiols from disulfide bonds.
In that prior study, 266 nm UVPD was used to promote homolytic cleavage of C–S bonds of quinone-derivatized proteins and peptides, generating neutral loss products that were subjected to CID to achieve radical-directed dissociation (RDD). Selective cleavage of disulfide bonds by 266 nm photodissociation has also been shown to be effective for elucidating disulfide-linked peptide pairs.
Proteins containing up to eight cysteines were analyzed in three ways: (1) as intact proteins prior to NPSP derivatization, (2) after derivatization with NPSP without reduction of intrinsic disulfide bonds, and (3) after derivatization with NPSP following reduction of disulfide bonds. For both (1) and (2), proteins were suspended at a concentration of 10 μM in 50:50 H2O:acetonitrile and 1% formic acid. For (2), the proteins were reacted with 10 mM NPSP dissolved in dry acetonitrile (typically using 200 μL of protein solution and 2 μL of NPSP solution) for 30 s prior to analysis by mass spectrometry (without clean-up). For (3), proteins were reduced by incubation with 10 mM DTT in 150 mM NH4HCO3 for 3 h at 55 °C (typically using 200 μL of protein solution and 2 μL of DTT solution). Following reduction, proteins were buffer-exchanged three times into 1% formic acid using Amicon Ultra Centrifugal Filters (Merck Millipore, Billerica, MA, USA). Proteins were then diluted by addition of one volume of 1% formic acid in acetonitrile and incubated with 10 mM NPSP for 30 s. The final protein solutions were analyzed without further clean-up.
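As a rough check on the reagent excess in preparation (2), the final concentrations after mixing can be estimated from the stated volumes. The sketch below assumes that the volumes are additive and that the protein stock is exactly 10 μM; the function and variable names are illustrative only and are not part of the original protocol.

```python
def final_conc_uM(stock_uM, stock_uL, other_uL):
    """Concentration after mixing, assuming additive volumes."""
    return stock_uM * stock_uL / (stock_uL + other_uL)

# Stated typical volumes: 200 uL protein solution (10 uM) + 2 uL NPSP (10 mM)
npsp_uM = final_conc_uM(10_000.0, 2.0, 200.0)    # NPSP after dilution, ~99 uM
protein_uM = final_conc_uM(10.0, 200.0, 2.0)     # protein after dilution, ~9.9 uM

excess_per_protein = npsp_uM / protein_uM        # 10-fold molar excess
excess_per_cys_8 = excess_per_protein / 8        # per thiol if 8 cysteines were free

print(f"NPSP: {npsp_uM:.1f} uM, protein: {protein_uM:.2f} uM")
print(f"molar excess per protein: {excess_per_protein:.1f}")
print(f"molar excess per cysteine (8 Cys): {excess_per_cys_8:.2f}")
```

Under these assumptions the reagent is in roughly tenfold molar excess over the protein, which falls to only a small excess per thiol as the number of reactive cysteines grows.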
Mass spectrometry experiments were undertaken in positive mode on a Thermo Scientific Velos Pro dual-pressure linear ion trap mass spectrometer (San Jose, CA, USA) equipped with CID, HCD, ETD, and 266 nm UVPD capabilities. Each spectrum was an average of three microscans. The applied ESI spray voltage was 3 kV. Tandem mass spectrometry was carried out in the high pressure trap for all activation strategies using the most abundant ion of the charge state envelope. For CID and HCD, a normalized collision energy (NCE) of 35 was used (during an activation period of 10 ms for CID and 2 ms for HCD), whereas for ETD a reaction time from 40 to 120 ms was utilized. UVPD was implemented in a manner described previously and was performed using the fourth harmonic of a Continuum Minilite Nd:YAG laser (San Jose, CA, USA) with an energy output of approximately 6 mJ per pulse. Ions were subjected to an increasing number of 266 nm laser pulses until either the precursor was completely eliminated or no additional neutral losses were observed. Post-acquisition, data analysis was assisted by the charge state deconvolution software MagTran. Bovine β-lactoglobulin, bovine α-lactalbumin, chicken lysozyme, bovine ribonuclease A, bovine aprotinin, horse cytochrome c, and bovine serum albumin (BSA) were used in this study.
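For orientation, the photon energy at 266 nm and the number of photons in a 6 mJ pulse follow directly from E = hc/λ. The short calculation below is an illustrative sketch: it takes the Nd:YAG fundamental as 1064 nm and treats the full 6 mJ as the pulse energy, although the fraction of the pulse that actually overlaps the ion cloud is not stated in the text.

```python
H_PLANCK = 6.62607015e-34    # Planck constant, J*s
C_LIGHT = 2.99792458e8       # speed of light, m/s
E_CHARGE = 1.602176634e-19   # joules per electronvolt
AVOGADRO = 6.02214076e23     # 1/mol

fundamental_nm = 1064.0              # Nd:YAG fundamental wavelength
wavelength_nm = fundamental_nm / 4   # fourth harmonic -> 266 nm

photon_J = H_PLANCK * C_LIGHT / (wavelength_nm * 1e-9)
photon_eV = photon_J / E_CHARGE
photon_kJ_per_mol = photon_J * AVOGADRO / 1000.0

pulse_J = 6e-3                       # ~6 mJ per pulse, as stated in the text
photons_per_pulse = pulse_J / photon_J

print(f"{wavelength_nm:.0f} nm photon: {photon_eV:.2f} eV ({photon_kJ_per_mol:.0f} kJ/mol)")
print(f"photons per pulse: {photons_per_pulse:.2e}")
```

Each 266 nm photon carries about 4.7 eV (roughly 450 kJ/mol), and a 6 mJ pulse corresponds to about 8 × 10^15 photons before any losses in the beam path.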
Results and Discussion
Table 1. Comparison of Theoretical and Experimental Cysteine Counting Results
Protein (AC) [Species] | Free cys (UniProt) | Free cys (exp) | Total cys (UniProt) | Total cys (exp) | % Proteome match
(28 out of 24113)
(6 out of 17691)
8 by mass
7 by UVPD
(36 out of 24113)
S3, S7, S10
(74 out of 24113)
(36 out of 24113)
S5, S8, S11
(19 out of 22718)
(2 out of 24113)
The NPSP tagging method indicated that α-lactalbumin had four disulfide-bound cysteines and no free cysteines and that cytochrome c had two disulfide-bound cysteines and no free cysteines (Table 1). The NPSP strategy was unsuccessful for characterization of BSA because of incomplete tagging, an outcome that is not unexpected given that BSA has 35 cysteines. The method showed mixed success for lysozyme. Lysozyme has eight cysteines, all engaged in disulfide bonds. Upon reaction with NPSP, the resulting mass spectrum displayed a mass shift corresponding to attachment of one SePh tag, which suggested the presence of one free cysteine. This odd result may correspond to partial degradation of lysozyme, leading to cleavage of one disulfide bond in the intact protein. Although addition of two SePh tags might be expected for this scenario, it is possible that one cysteine remains inaccessible owing to the three other disulfide bonds (six disulfide-bound cysteines) in the protein. Upon reduction of lysozyme and reaction with NPSP, the resulting mass shift was consistent with eight cysteines, as expected.
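The conversion of an MS1 mass shift into a number of SePh tags can be sketched as below. This is an illustration under stated assumptions, not the authors' data-processing procedure: it uses rounded average atomic masses (appropriate for low-resolution ion trap data), assumes that each tag replaces the hydrogen of a free thiol (S–H to S–SePh), and uses a hypothetical 14,000 Da protein. The function name and the masses in the example are invented for illustration.

```python
# Rounded standard average atomic masses in Da
C_MASS, H_MASS, SE_MASS = 12.011, 1.008, 78.971

SEPH_MASS = 6 * C_MASS + 5 * H_MASS + SE_MASS   # C6H5Se group, ~156.08 Da
SHIFT_PER_TAG = SEPH_MASS - H_MASS              # S-H -> S-SePh, ~155.07 Da

def tags_from_mass_shift(untagged_mass, tagged_mass, per_tag=SHIFT_PER_TAG):
    """Return (nearest whole number of tags, residual mass error in Da)."""
    delta = tagged_mass - untagged_mass
    n_tags = round(delta / per_tag)
    return n_tags, delta - n_tags * per_tag

# Hypothetical protein of 14,000.0 Da carrying one or eight tags
one_tag, one_residual = tags_from_mass_shift(14000.0, 14000.0 + SHIFT_PER_TAG)
eight_tags, _ = tags_from_mass_shift(14000.0, 14000.0 + 8 * SHIFT_PER_TAG)
print(one_tag, eight_tags)
```

For reduced samples, the untagged reference mass should be that of the reduced protein, since each reduced disulfide bond adds two hydrogens relative to the native form. A large residual would flag a shift that is not a whole number of tags.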
Analysis by 266 nm UVPD
Sulfur-selenium bonds have been shown previously to be photolytically cleavable in solution, producing radical products. Implementation of this type of photoreaction for ions in the gas phase using 266 nm photons in the present study resulted in exclusive cleavage of the S–Se bonds and loss of the SePh tags. Monitoring the loss of SePh tags was used as a second facile means to count the number of cysteine residues per protein (1) prior to reduction and (2) post-reduction.
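Counting tag losses in a UVPD spectrum amounts to looking for a ladder of peaks spaced by the SePh mass divided by the charge. The sketch below is a hypothetical illustration rather than the procedure used in the study: it assumes each SePh is lost as a neutral radical so the charge state is unchanged, uses the average SePh mass, requires the losses to be consecutive, and applies an arbitrary 0.5 m/z tolerance. The precursor m/z and charge in the example are invented.

```python
SEPH_MASS = 6 * 12.011 + 5 * 1.008 + 78.971   # average mass of C6H5Se, ~156.08 Da

def loss_series(precursor_mz, charge, n_losses):
    """Expected m/z values after 0..n_losses neutral SePh losses."""
    return [precursor_mz - k * SEPH_MASS / charge for k in range(n_losses + 1)]

def count_seph_losses(observed_mz, precursor_mz, charge, max_losses=40, tol=0.5):
    """Count consecutive SePh losses that have a matching observed peak."""
    count = 0
    for expected in loss_series(precursor_mz, charge, max_losses)[1:]:
        if any(abs(mz - expected) <= tol for mz in observed_mz):
            count += 1
        else:
            break   # stop at the first missing member of the ladder
    return count

# Hypothetical 10+ precursor at m/z 1500 with seven loss peaks present
peaks = loss_series(1500.0, 10, 7)
n_lost = count_seph_losses(peaks, 1500.0, 10)
print(n_lost)
```

Requiring a consecutive ladder is a conservative design choice: an isolated peak at the position of, say, the eighth loss would not be counted unless the first seven are also present.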
Among the proteins examined, ribonuclease A yielded a discrepancy between the cysteine content determined by the UVPD method and that obtained from the mass shift observed in the MS1 mass spectrum. For ribonuclease A, only seven neutral losses were observed with high confidence upon UVPD (Figure 4b), whereas the NPSP-modified protein contains eight tags and is thus expected to lose eight tags. One hypothesis to explain why only seven out of eight Se–S bonds were cleaved in ribonuclease A arises from the presence of the aromatic amino acids tyrosine, tryptophan, and phenylalanine, all known to absorb 266 nm photons (with tryptophan and tyrosine having significantly larger photoabsorption cross-sections than phenylalanine). Consider the comparison of 266 nm UVPD spectra for NPSP-tagged β-lactoglobulin, lysozyme, ribonuclease A, and α-lactalbumin (Figures 3, 4, and 5). Each of these proteins has a number of aromatic residues that may absorb 266 nm photons; the tryptophan/tyrosine/phenylalanine counts are 2/4/4 (β-lactoglobulin), 6/3/3 (lysozyme), 0/6/3 (ribonuclease A), and 4/4/4 (α-lactalbumin). Based on absorbance profiles of amino acids in solution, it is anticipated that the photoabsorption cross-section for tryptophan in the gas phase is likely greater than that of tyrosine at 266 nm, and the photoabsorption cross-section for phenylalanine is expected to be rather low at 266 nm (these remarks are derived from solution profiles, not the gas phase). Excitation energy transfer has been shown to occur between tryptophan or tyrosine and disulfide bonds, ultimately resulting in homolytic cleavage of the disulfide bond via an excited state. We speculate that a similar phenomenon may occur for the NPSP-tagged proteins.
For example, lysozyme and α-lactalbumin contain multiple tryptophan residues, which may enhance S–Se bond cleavage from an excited state induced by absorption of 266 nm photons (similar to that shown for disulfide bonds). The lack of tryptophan residues in ribonuclease A may explain the inhibition of tag loss for that protein. Additionally, lysozyme appears to have a less efficient SePh tag loss series than α-lactalbumin, yet both proteins contain the same number of cysteine residues. Lysozyme has six tryptophans in its primary sequence, whereas α-lactalbumin has only four, and this may contribute to a greater absorption cross-section for lysozyme and may lead to fragmentation by other pathways. While our results indicate that the photoabsorption cross-section of the benzeneselenol group is significantly greater than that of the aromatic side-chains at 266 nm, the availability of other absorbing moieties may inhibit S–Se cleavage by affording access to other fragmentation pathways, specifically pathways initiated by photoabsorption by the aromatic side-chains. The availability of other fragmentation pathways for lysozyme was further supported by activation of non-reduced SePh-tagged lysozyme (Figure 3b). For this protein, in addition to the characteristic SePh loss, an unexpected loss of 76 Da was also observed, suggesting an alternative fragmentation route.
Other Activation Methods (HCD, ETD, CID)
It was previously reported that NPSP-derivatized peptides undergo Se–S cleavage upon ETD or CID. Thus, for comparative purposes, collision-based (CID and HCD) and electron-based (ETD) methods were also used to activate the SePh-tagged proteins in the present study. Examples of the resulting MS/MS spectra are shown in Supplementary Figure S10 for aprotinin (with six SePh tags) and Supplementary Figure S11 for cytochrome c (with two SePh tags). None of HCD, CID, or ETD resulted in efficient Se–S cleavage. ETD promoted up to two S–Se cleavages in conjunction with charge reduction for tagged aprotinin; the analogous MS/MS spectra for cytochrome c were not readily interpretable. Based on this comparison, 266 nm UVPD showed remarkably high efficiency and selectivity for Se–S cleavage relative to the other activation methods.
The NPSP-derivatization strategy and 266 nm UVPD proved to be successful as a new means to count free and bound cysteines in proteins. Proteins containing up to eight cysteine residues were successfully characterized. The SePh tag served as an excellent chromophore for absorption of 266 nm photons, and the selective cleavage of the Se–S bond was striking. Tracking free and bound cysteines has numerous applications in proteomics and offers opportunities for incorporation in informatics engines. For example, the last column of Table 1 shows the percentages of proteins in each proteome that have the same cysteine content (free versus bound cysteines) as each protein included in this study. On average, about 0.1% of all possible proteins match each combination of free and bound cysteines, thus illustrating that characterizing cysteine content offers a significant way to constrain protein identification in database searches.
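The "about 0.1%" figure can be checked against the match counts that survive in the last column of Table 1. The sketch below simply converts each surviving "n out of N" entry to a percentage and averages them. The counts are copied from the table as given; which protein each entry belongs to is not recoverable from the extracted text, so no protein labels are attached.

```python
# (matching proteins, proteome size) pairs as they appear in Table 1
match_counts = [
    (28, 24113),
    (6, 17691),
    (36, 24113),
    (74, 24113),
    (36, 24113),
    (19, 22718),
    (2, 24113),
]

percentages = [100.0 * hits / total for hits, total in match_counts]
mean_percentage = sum(percentages) / len(percentages)

for (hits, total), pct in zip(match_counts, percentages):
    print(f"{hits:>3} of {total}: {pct:.3f}%")
print(f"mean: {mean_percentage:.2f}%")
```

The mean of these seven entries is about 0.12%, consistent with the stated average of roughly 0.1%.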
The authors acknowledge funding from the NSF (CHE-1402753) and the Welch Foundation (F-1155).
4. Bellacchio, E., McFarlane, K.L., Rompel, A., Robblee, J.H., Cinco, R.M., Yachandra, V.K.: Counting the number of disulfides and thiol groups in proteins and a novel approach for determining the local pKa for cysteine groups in proteins in vivo. J. Synchrotron Radiat. 8, 1056–1058 (2001)
25. Nicolaou, K.C., Claremon, D.A., Barnette, W.E., Seitz, S.P.: N-phenylselenophthalimide (N-PSP) and N-phenylselenosuccinimide (N-PSS). Two versatile carriers of the phenylseleno group. Oxyselenation of olefins and a selenium-based macrolide synthesis. J. Am. Chem. Soc. 101, 3704–3706 (1979)