UniNovo : A Universal Tool for de Novo Peptide Sequencing

  • Conference paper
Research in Computational Molecular Biology (RECOMB 2013)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 7821)

Abstract

Mass spectrometry (MS) instruments and experimental protocols are rapidly advancing, but de novo peptide sequencing algorithms to analyze tandem mass (MS/MS) spectra are lagging behind. While existing de novo sequencing tools perform well on certain types of spectra (e.g., Collision Induced Dissociation (CID) spectra of tryptic peptides), their performance often deteriorates on other types of spectra, such as Electron Transfer Dissociation (ETD), Higher-energy Collisional Dissociation (HCD) spectra, or spectra of non-tryptic digests. Thus, rather than developing a new algorithm for each type of spectra, we develop a universal de novo sequencing algorithm called UniNovo that works well for all types of spectra or even for spectral pairs (e.g., CID/ETD spectral pairs). The performance of UniNovo is compared with PepNovo+, PEAKS, and pNovo using various types of spectra. The results show that the performance of UniNovo is superior to other tools for ETD spectra and superior or comparable to others for CID and HCD spectra. UniNovo also estimates the probability that each reported reconstruction is correct, using simple statistics that are readily obtained from a small training dataset. We demonstrate that the estimation is accurate for all tested types of spectra (including CID, HCD, ETD, CID/ETD, and HCD/ETD spectra of trypsin, LysC, or AspN digested peptides). The appendix is available online at http://proteomics.ucsd.edu/Software/UniNovo.html

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jeong, K., Kim, S., Pevzner, P.A. (2013). UniNovo : A Universal Tool for de Novo Peptide Sequencing. In: Deng, M., Jiang, R., Sun, F., Zhang, X. (eds) Research in Computational Molecular Biology. RECOMB 2013. Lecture Notes in Computer Science, vol 7821. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37195-0_9

  • DOI: https://doi.org/10.1007/978-3-642-37195-0_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37194-3

  • Online ISBN: 978-3-642-37195-0

  • eBook Packages: Computer Science, Computer Science (R0)
