Pyteomics—a Python Framework for Exploratory Data Analysis and Rapid Software Prototyping in Proteomics

  • Anton A. Goloborodko
  • Lev I. Levitsky
  • Mark V. Ivanov
  • Mikhail V. Gorshkov
Application Note

Abstract

Pyteomics is a cross-platform, open-source Python library providing a rich set of tools for MS-based proteomics. It includes modules for reading LC-MS/MS data, search engine output, and protein sequence databases; for theoretical prediction of retention times and electrochemical properties of polypeptides; and for mass and m/z calculations and sequence parsing. Pyteomics is available under the Apache license. Release versions are available at the Python Package Index (http://pypi.python.org/pyteomics), the source code repository is at http://hg.theorchromo.ru/pyteomics, and documentation is at http://packages.python.org/pyteomics. Pyteomics.biolccc documentation is available at http://packages.python.org/pyteomics.biolccc/. Questions on installation and usage can be addressed to the Pyteomics mailing list: pyteomics@googlegroups.com.
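
To make the "mass and m/z calculations" mentioned in the abstract concrete, the following is a minimal standard-library sketch of that kind of computation. It does not use or reproduce the Pyteomics API: the names RESIDUE_MASS, peptide_mass, and mz are hypothetical and chosen for illustration only, and the residue table covers just a handful of amino acids with commonly tabulated monoisotopic masses.

```python
# Illustrative sketch only; this is NOT the Pyteomics API.
# Monoisotopic residue masses in Da for a small subset of amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "I": 113.08406,
    "D": 115.02694, "E": 129.04259, "K": 128.09496, "R": 156.10111,
}
WATER = 18.010565   # mass of H2O added for the free N- and C-termini
PROTON = 1.007276   # mass of one proton (charge carrier)


def peptide_mass(sequence):
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mz(sequence, charge):
    """m/z of the peptide carrying `charge` protons (charge >= 1)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON) / charge


if __name__ == "__main__":
    for z in (1, 2, 3):
        print(f"PEPTIDE, z={z}: m/z = {mz('PEPTIDE', z):.4f}")
```

A full library would additionally handle modifications, arbitrary ion types, and complete amino acid and isotope tables, which this sketch deliberately omits.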

Key words

Data processing, Bioinformatics, Proteomics

Copyright information

© American Society for Mass Spectrometry 2013

Authors and Affiliations

  • Anton A. Goloborodko (1, 2, 3)
  • Lev I. Levitsky (2, 3)
  • Mark V. Ivanov (2, 3)
  • Mikhail V. Gorshkov (2, 3)
  1. Department of Physics, Massachusetts Institute of Technology, Boston, USA
  2. Institute for Energy Problems of Chemical Physics, Russian Academy of Sciences, Moscow, Russia
  3. Moscow Institute of Physics and Technology (State University), Dolgoprudny, Russia
