Summary
Shotgun proteomics is based on the identification and quantification of peptides from digested proteins using tandem mass spectrometry. In this chapter, we discuss computational methods for analyzing tandem mass spectra of peptides, including database searching, de novo peptide sequencing, hybrid approaches, library searching, and unrestricted modification search. Special focus is given to database search programs because they are the most widely used. The process of inferring proteins from identified peptides is then discussed. We also describe the key steps in the quantitative analysis of mass spectrometry proteomics data.
Copyright information
© 2009 Humana Press, a part of Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Lu, B., Xu, T., Park, S.K., Yates, J.R. (2009). Shotgun Protein Identification and Quantification by Mass Spectrometry. In: Reinders, J., Sickmann, A. (eds) Proteomics. Methods in Molecular Biology™, vol 564. Humana Press. https://doi.org/10.1007/978-1-60761-157-8_15
DOI: https://doi.org/10.1007/978-1-60761-157-8_15
Publisher Name: Humana Press
Print ISBN: 978-1-60761-156-1
Online ISBN: 978-1-60761-157-8
eBook Packages: Springer Protocols