Original Paper

Analytical and Bioanalytical Chemistry, Volume 398, Issue 7, pp 2867-2881

Open Access This content is freely available online to anyone, anywhere at any time.

Optimization of parameters for coverage of low molecular weight proteins

  • Stephan A. Müller (Department of Proteomics, UFZ, Helmholtz-Centre for Environmental Research)
  • Tibor Kohajda (Department of Metabolomics, UFZ, Helmholtz-Centre for Environmental Research)
  • Sven Findeiß (Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig)
  • Peter F. Stadler (Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig; Institute for Theoretical Chemistry, University of Vienna; Max Planck Institute for Mathematics in the Sciences; Nomics Group, Fraunhofer Institute for Cell Therapy and Immunology; Santa Fe Institute)
  • Stefan Washietl (Computer Science and Artificial Intelligence Laboratory, Broad Institute, Massachusetts Institute of Technology)
  • Manolis Kellis (Computer Science and Artificial Intelligence Laboratory, Broad Institute, Massachusetts Institute of Technology)
  • Martin von Bergen (Department of Proteomics, UFZ, Helmholtz-Centre for Environmental Research; Department of Metabolomics, UFZ, Helmholtz-Centre for Environmental Research)
  • Stefan Kalkhof (Department of Proteomics, UFZ, Helmholtz-Centre for Environmental Research; corresponding author)

Abstract

Proteins with molecular weights of <25 kDa are involved in major biological processes such as ribosome formation, stress adaptation (e.g., to temperature reduction) and cell cycle control. Despite their importance, the coverage of smaller proteins in standard proteome studies is rather sparse. Here we investigated biochemical and mass spectrometric parameters that influence coverage and validity of identification. The underrepresentation of low molecular weight (LMW) proteins may be attributed to the low numbers of proteolytic peptides formed by tryptic digestion as well as their tendency to be lost in protein separation and concentration/desalting procedures. In a systematic investigation of the LMW proteome of Escherichia coli, a total of 455 LMW proteins (27% of the 1672 listed in the SwissProt protein database) were identified, corresponding to a coverage of 62% of the known cytosolic LMW proteins. Of these proteins, 93 had not yet been functionally classified, and five had not previously been confirmed at the protein level. In this study, the influences of protein extraction (with either urea or TFA), proteolytic digestion (trypsin alone or in combination with AspN as endoproteases) and protein separation (gel- or non-gel-based) were investigated. Compared to the standard procedure based solely on the use of urea lysis buffer, in-gel separation and tryptic digestion, the complementary use of TFA for extraction or of the endoprotease AspN for proteolysis permits the identification of an extra 72 (32%) and 51 proteins (23%), respectively. Regarding mass spectrometry analysis with an LTQ Orbitrap mass spectrometer, collision-induced fragmentation (CID and HCD) and electron transfer dissociation (ETD), using either the linear ion trap (IT) or the Orbitrap as the analyzer, were compared. IT-CID was found to yield the best identification rate, whereas IT-ETD provided almost comparable results in terms of LMW proteome coverage.
The high overlap between the proteins identified with IT-CID and IT-ETD allowed the validation of 75% of the identified proteins using this orthogonal fragmentation technique. Furthermore, a new approach to evaluating and improving the completeness of protein databases that utilizes the program RNAcode was introduced and examined.
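
The abstract attributes the underrepresentation of LMW proteins partly to the small number of proteolytic peptides a short sequence yields, and reports that adding AspN to trypsin recovers extra identifications. The following is a minimal sketch of that idea as an in silico digest, not the authors' workflow. It assumes simplified textbook cleavage rules (trypsin cleaves C-terminal to K or R unless followed by P; AspN cleaves N-terminal to D), no missed cleavages, and an arbitrary minimum peptide length of 6 residues as a stand-in for detectability. The sequence `EXAMPLE` and the function name `digest` are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical 21-residue sequence, used only to illustrate the digest.
EXAMPLE = "MKTAYIAKQRDLGSPEDKAAR"


def digest(sequence, enzyme="trypsin", min_len=6):
    """Return fully cleaved peptides with at least min_len residues.

    Simplified rules (an assumption of this sketch):
      trypsin: cleave C-terminal to K or R, unless the next residue is P
      aspn:    cleave N-terminal to D
    """
    if enzyme == "trypsin":
        # m.end() is the position just after the K/R residue.
        sites = [m.end() for m in re.finditer(r"[KR](?!P)", sequence)]
    elif enzyme == "aspn":
        # m.start() is the position just before the D residue.
        sites = [m.start() for m in re.finditer(r"D", sequence)]
    else:
        raise ValueError(f"unsupported enzyme: {enzyme}")

    # Add both termini and drop duplicate boundaries so that no empty
    # peptide is produced when a cleavage site coincides with a terminus.
    bounds = sorted(set([0] + sites + [len(sequence)]))
    peptides = [sequence[a:b] for a, b in zip(bounds, bounds[1:])]
    return [p for p in peptides if len(p) >= min_len]


tryptic = digest(EXAMPLE, "trypsin")
aspn = digest(EXAMPLE, "aspn")
```

For this toy sequence, each enzyme alone leaves only two peptides above the length cutoff, and the two sets differ, which is the sense in which a second protease with a different specificity can be complementary for short proteins.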

Keywords

LTQ Orbitrap; Nano-HPLC; Nano-ESI-MS/MS; Proteomics; Low molecular weight proteome; Escherichia coli