Proteome Analysis Pipeline
A proteome analysis pipeline analyzes the data acquired during a proteomics experiment and produces results that are interpretable and accessible to the researcher and that can be used for publication or deposited in a proteomics resource.
A proteome data analysis pipeline interprets proteomics data and makes it available for further scrutiny. The interpretation uses several computer applications: some involve only a single experiment and can be executed in parallel, whereas others rely on data from several experiments and are applied only once per analysis. The analysis often involves very large amounts of data (GB range), uses nontrivial amounts of computing power (on a compute cluster), and may be organized as a workflow. The workflow depends on the question the experiment is designed to answer, is subject to frequent modification, and may involve parameter sweeps. It is advantageous to specify the computational protocol as a workflow, with each workflow designed for a single purpose. Parts of a workflow can be reused in other workflows.
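The structure described above can be illustrated with a minimal Python sketch. It is not taken from any of the packages named in this entry. The functions search_experiment, combine_results, run_workflow, and parameter_sweep are hypothetical, as is the min_score parameter, and the "search" is a placeholder that filters a list of scores. The sketch assumes only that per-experiment steps are independent of one another and that the combining step needs the output of all of them.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def search_experiment(experiment, params):
    """Hypothetical per-experiment step; stands in for a database search."""
    # Placeholder logic: keep scores at or above the threshold.
    hits = [s for s in experiment["scores"] if s >= params["min_score"]]
    return {"name": experiment["name"], "hits": hits}


def combine_results(per_experiment):
    """Hypothetical analysis-wide step; applied once over all experiments."""
    return {r["name"]: len(r["hits"]) for r in per_experiment}


def run_workflow(experiments, params, max_workers=4):
    # Per-experiment steps do not depend on each other, so run them in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_experiment = list(
            pool.map(lambda e: search_experiment(e, params), experiments)
        )
    # The combining step depends on every experiment and runs only once.
    return combine_results(per_experiment)


def parameter_sweep(experiments, grid):
    """Run the whole workflow once per combination of parameter values."""
    keys = sorted(grid)
    results = {}
    for values in product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        results[values] = run_workflow(experiments, params)
    return results


if __name__ == "__main__":
    # Made-up input data, for illustration only.
    experiments = [
        {"name": "run1", "scores": [0.2, 0.8, 0.9]},
        {"name": "run2", "scores": [0.5, 0.95]},
    ]
    print(parameter_sweep(experiments, {"min_score": [0.5, 0.9]}))
```

Keeping each step as a separate function mirrors the point about reuse: the same per-experiment step could be called from a different workflow with a different combining step. On a compute cluster, the thread pool would typically be replaced by a job scheduler.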
The most commonly used packages in protein identification are MASCOT (Perkins et al. 1999), SEQUEST (Eng et al. 1994), OMSSA (Geer et al. 2004), X!Tandem, SpectrumMill (Agilent 2011), OLAV (Colinge et al. 2003), and TPP.
- Agilent (2011) www.agilent.com. Accessed 25 May 2011
- Eng JK, McCormack AJ, Yates JR III (1994) An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J Am Soc Mass Spectrom 5:976–989
- Kunszt P, Espona Pernas L, Quandt A, Schmid E, Hunt E, Malmstrom L (2011) The Swiss Grid Proteomics Portal. PARENG’11: The Second International Conference on Parallel, Distributed, Grid and Cloud Computing for Engineering. Civil-Comp Press, Stirlingshire, UK, Paper 81
- mzXML (2011) http://sashimi.sourceforge.net/schema_revision/mzXML_2.1. Accessed 25 May 2011
- TPP (2011) Trans-Proteomics Pipeline. http://tools.proteomecenter.org/wiki/index.php?title=Software:TPP. Accessed 25 May 2011
- Vizcaíno JA, Côté R, Reisinger F, Foster JM, Mueller M, Rameseder J, Hermjakob H, Martens L (2009) A guide to the Proteomics Identifications Database proteomics data repository. Proteomics 9(18):4276–4283. http://www.ebi.ac.uk/pride