Label-Free Protein Quantitation Using Weighted Spectral Counting

  • Christine Vogel
  • Edward M. Marcotte
Protocol
Part of the Methods in Molecular Biology book series (MIMB, volume 893)

Abstract

Mass spectrometry (MS)-based shotgun proteomics allows protein identifications even in complex biological samples. Protein abundances can then be estimated from the counts of MS/MS spectra attributable to each protein, provided that one corrects for differential MS-detectability of the contributing peptides. We describe the use of a method, APEX, which calculates Absolute Protein EXpression levels based on learned correction factors, MS/MS spectral counts, and each protein’s probability of correct identification.

The APEX-based calculations consist of three parts: (1) Using training data, peptide sequences and their sequence properties, a model is built that can be used to estimate MS-detectability (O_i) for any given protein. (2) Absolute abundances of proteins measured in an MS/MS experiment are calculated with information from spectral counts, identification probabilities and the learned O_i-values. (3) Simple statistics allow for significance analysis of differential expression in two distinct biological samples, i.e., measuring relative protein abundances. APEX-based protein abundances span more than four orders of magnitude and are applicable to mixtures of hundreds to thousands of proteins from any type of organism.
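Parts (2) and (3) can be illustrated with a minimal sketch. It assumes that each protein's spectral count is weighted by its identification probability, divided by its O_i-value, and normalized over all proteins, and that differential expression is scored with a two-proportion z-statistic on spectral-count fractions. The abstract does not spell out either formula, so both forms, along with the function and parameter names (apex_abundances, two_sample_z, total_molecules), are illustrative assumptions rather than the chapter's own implementation.

```python
from math import sqrt


def apex_abundances(counts, probs, detectability, total_molecules=1.0):
    """Estimate abundances from spectral counts (assumed weighting scheme).

    counts:          protein -> observed MS/MS spectral count
    probs:           protein -> probability of correct identification
    detectability:   protein -> learned O_i-value (must be > 0)
    total_molecules: hypothetical scaling constant; 1.0 yields fractions
    """
    # Weight each count by identification probability, correct by O_i.
    weighted = {
        prot: counts[prot] * probs[prot] / detectability[prot]
        for prot in counts
    }
    total = sum(weighted.values())
    # Normalize so that all abundances sum to total_molecules.
    return {prot: total_molecules * w / total for prot, w in weighted.items()}


def two_sample_z(n1, total1, n2, total2):
    """Two-proportion z-statistic for one protein in two samples (assumed test).

    n1, n2:         weighted spectral counts of the protein in each sample
    total1, total2: summed counts of all proteins in each sample
    """
    f1 = n1 / total1
    f2 = n2 / total2
    f0 = (n1 + n2) / (total1 + total2)  # pooled fraction under the null
    se = sqrt(f0 * (1 - f0) * (1 / total1 + 1 / total2))
    return (f1 - f2) / se


if __name__ == "__main__":
    counts = {"A": 10, "B": 10}
    probs = {"A": 1.0, "B": 1.0}
    detectability = {"A": 2.0, "B": 1.0}
    print(apex_abundances(counts, probs, detectability, total_molecules=3000))
    print(two_sample_z(100, 1000, 50, 1000))
```

In this toy input both proteins have the same spectral count, but protein A is assumed to be twice as MS-detectable as protein B, so the correction assigns A half of B's abundance. This is the sense in which raw counts need to be corrected for differential detectability.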

Key words

Quantitative proteomics; Protein expression; Label-free mass spectrometry; Spectral counting

Abbreviations

APEX: Absolute Protein EXpression

MS: Mass spectrometry

MS/MS: Tandem mass spectrometry

Notes

Acknowledgments

C.V. acknowledges support by the International Human Frontier Science Program. We thank John Braisted and Srilatha Kuntumalla from JCVI for many useful discussions regarding the APEX calculations. This work was supported by grants from the Welch (F-1515) and Packard Foundations, the National Science Foundation, and National Institutes of Health (to E.M.M.).

Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  1. Department of Biology, Center for Genomics and Systems Biology, New York University, New York, USA
  2. Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, USA
