Data Mining in Proteomics pp 123-145
Tranche Distributed Repository and ProteomeCommons.org
Tranche is a distributed repository designed to redundantly store and disseminate data sets for the proteomics community. It offers several features important to researchers, including support for large data files, prepublication access controls, licensing options, and safeguards for both data provenance and integrity. Tranche integrates tightly with ProteomeCommons.org, an online community resource that offers a variety of tools for proteomics researchers, including project management and data annotation. In this chapter, we discuss the development of Tranche and ProteomeCommons.org, paying particular attention to why data should be publicly available and unrestricted, and to the challenges facing data archiving and open access. We then provide a technical overview of Tranche and ProteomeCommons.org, followed by step-by-step instructions for using these resources through the graphical user interface (GUI), the command-line tools, and the Application Programmer Interface (API). We end with a brief discussion of current and future development efforts and collaborations.