Skip to main content

Proteogenomics: Key Driver for Clinical Discovery and Personalized Medicine

  • Chapter
  • First Online:
Proteogenomics

Abstract

Proteogenomics is a multi-omics research field that has the aim to efficiently integrate genomics, transcriptomics and proteomics. With this approach it is possible to identify new patient-specific proteoforms that may have implications in disease development, specifically in cancer. Understanding the impact of a large number of mutations detected at the genomics level is needed to assess the effects at the proteome level. Proteogenomics data integration would help in identifying molecular changes that are persistent across multiple molecular layers and enable better interpretation of molecular mechanisms of disease, such as the causal relationship between single nucleotide polymorphisms (SNPs) and the expression of transcripts and translation of proteins compared to mainstream proteomics approaches. Identifying patient-specific protein forms and getting a better picture of molecular mechanisms of disease opens the avenue for precision and personalized medicine. Proteogenomics is, however, a challenging interdisciplinary science that requires the understanding of sample preparation, data acquisition and processing for genomics, transcriptomics and proteomics. This chapter aims to guide the reader through the technology and bioinformatics aspects of these multi-omics approaches, illustrated with proteogenomics applications having clinical or biological relevance.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 129.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 169.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info
Hardcover Book
USD 169.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  • Aviner, R., Geiger, T., & Elroy-Stein, O. (2013). PUNCH-P for global translatome profiling: Methodology, insights and comparison to other techniques. Translation (Austin), 1(2), e27516. doi:10.4161/trla.27516

    Google Scholar 

  • Bantscheff, M., Schirle, M., Sweetman, G., Rick, J., & Kuster, B. (2007). Quantitative mass spectrometry in proteomics: A critical review. Analytical and Bioanalytical Chemistry, 389(4), 1017–1031. doi:10.1007/s00216-007-1486-6.

    Article  CAS  PubMed  Google Scholar 

  • Bantscheff, M., Lemeer, S., Savitski, M. M., & Kuster, B. (2012). Quantitative mass spectrometry in proteomics: Critical review update from 2007 to the present. Analytical and Bioanalytical Chemistry, 404(4), 939–965. doi:10.1007/s00216-012-6203-4.

    Article  CAS  PubMed  Google Scholar 

  • Barrett, T., Wilhite, S. E., Ledoux, P., Evangelista, C., Kim, I. F., Tomashevsky, M., Marshall, K. A., Phillippy, K. H., Sherman, P. M., Holko, M., Yefanov, A., Lee, H., Zhang, N., Robertson, C. L., Serova, N., Davis, S., & Soboleva, A. (2013). NCBI GEO: Archive for functional genomics data sets–update. Nucleic Acids Research, 41(Database issue), D991–D995. doi:10.1093/nar/gks1193.

    Article  CAS  PubMed  Google Scholar 

  • Bensimon, A., Heck, A. J., & Aebersold, R. (2012). Mass spectrometry-based proteomics and network biology. Annual Review of Biochemistry, 81, 379–405. doi:10.1146/annurev-biochem-072909-100424.

    Article  CAS  PubMed  Google Scholar 

  • Bertsch, A., Gropl, C., Reinert, K., & Kohlbacher, O. (2011). OpenMS and TOPP: Open source software for LC-MS data analysis. Methods in Molecular Biology, 696, 353–367. doi:10.1007/978-1-60761-987-1_23.

    Article  CAS  PubMed  Google Scholar 

  • Besemer, J., Lomsadze, A., & Borodovsky, M. (2001). GeneMarkS: A self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions. Nucleic Acids Research, 29(12), 2607–2618.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Bischoff, R., & Schlüter, H. (2012). Amino acids: Chemistry, functionality and selected non-enzymatic post-translational modifications. Journal of Proteomics, 75(8), 2275–2296. doi:10.1016/j.jprot.2012.01.041.

    Article  CAS  PubMed  Google Scholar 

  • Bischoff, R., Permentier, H., Guryev, V., & Horvatovich, P. (2015). Genomic variability and protein species – Improving sequence coverage for proteogenomics. Journal of Proteomics. doi:10.1016/j.jprot.2015.09.021.

    PubMed  Google Scholar 

  • Bjornson, R. D., Carriero, N. J., Colangelo, C., Shifman, M., Cheung, K. H., Miller, P. L., & Williams, K. (2008). X!!Tandem, an improved method for running X! Tandem in parallel on collections of commodity computers. Journal of Proteome Research, 7(1), 293–299. doi:10.1021/pr0701198.

    Article  CAS  PubMed  Google Scholar 

  • Bolger, A. M., Lohse, M., & Usadel, B. (2014). Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics, 30(15), 2114–2120. doi:10.1093/bioinformatics/btu170.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Chambers, M. C., Maclean, B., Burke, R., Amodei, D., Ruderman, D. L., Neumann, S., Gatto, L., Fischer, B., Pratt, B., Egertson, J., Hoff, K., Kessner, D., Tasman, N., Shulman, N., Frewen, B., Baker, T. A., Brusniak, M. Y., Paulse, C., Creasy, D., Flashner, L., Kani, K., Moulding, C., Seymour, S. L., Nuwaysir, L. M., Lefebvre, B., Kuhlmann, F., Roark, J., Rainer, P., Detlev, S., Hemenway, T., Huhmer, A., Langridge, J., Connolly, B., Chadick, T., Holly, K., Eckels, J., Deutsch, E. W., Moritz, R. L., Katz, J. E., Agus, D. B., MacCoss, M., Tabb, D. L., & Mallick, P. (2012). A cross-platform toolkit for mass spectrometry and proteomics. Nature Biotechnology, 30(10), 918–920. doi:10.1038/nbt.2377.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Chang, C., Li, L., Zhang, C., Wu, S., Guo, K., Zi, J., Chen, Z., Jiang, J., Ma, J., Yu, Q., Fan, F., Qin, P., Han, M., Su, N., Chen, T., Wang, K., Zhai, L., Zhang, T., Ying, W., Xu, Z., Zhang, Y., Liu, Y., Liu, X., Zhong, F., Shen, H., Wang, Q., Hou, G., Zhao, H., Li, G., Liu, S., Gu, W., Wang, G., Wang, T., Zhang, G., Qian, X., Li, N., He, Q. Y., Lin, L., Yang, P., Zhu, Y., He, F., & Xu, P. (2014). Systematic analyses of the transcriptome, translatome, and proteome provide a global view and potential strategy for the C-HPP. Journal of Proteome Research, 13(1), 38–49. doi:10.1021/pr4009018.

    Article  CAS  PubMed  Google Scholar 

  • Christin, C., Bischoff, R., & Horvatovich, P. (2011). Data processing pipelines for comprehensive profiling of proteomics samples by label-free LC-MS for biomarker discovery. Talanta, 83(4), 1209–1224. doi:10.1016/j.talanta.2010.10.029.

    Article  CAS  PubMed  Google Scholar 

  • Chuh, K. N., & Pratt, M. R. (2015). Chemical methods for the proteome-wide identification of posttranslationally modified proteins. Current Opinion in Chemical Biology, 24, 27–37. doi:10.1016/j.cbpa.2014.10.020.

    Article  CAS  PubMed  Google Scholar 

  • Cock, P. J., Fields, C. J., Goto, N., Heuer, M. L., & Rice, P. M. (2010). The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Research, 38(6), 1767–1771. doi:10.1093/nar/gkp1137/ConsortiumN.

    Article  CAS  PubMed  Google Scholar 

  • Consortium U. (2015). UniProt: A hub for protein information. Nucleic Acids Research, 43(Database issue), D204–D212. doi:10.1093/nar/gku989.

    Article  Google Scholar 

  • Cote, R. G., Griss, J., Dianes, J. A., Wang, R., Wright, J. C., van den Toorn, H. W., van Breukelen, B., Heck, A. J., Hulstaert, N., Martens, L., Reisinger, F., Csordas, A., Ovelleiro, D., Perez-Rivevol, Y., Barsnes, H., Hermjakob, H., & Vizcaino, J. A. (2012). The PRoteomics IDEntification (PRIDE) Converter 2 framework: An improved suite of tools to facilitate data submission to the PRIDE database and the ProteomeXchange consortium. Molecular & Cellular Proteomics, 11(12), 1682–1689. doi:10.1074/mcp.O112.021543.

    Article  Google Scholar 

  • Cox, J., & Mann, M. (2008). MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nature Biotechnology, 26(12), 1367–1372. doi:10.1038/nbt.1511.

    Article  CAS  PubMed  Google Scholar 

  • Craig, R., Cortens, J. C., Fenyo, D., & Beavis, R. C. (2006). Using annotated peptide mass spectrum libraries for protein identification. Journal of Proteome Research, 5(8), 1843–1849. doi:10.1021/pr0602085.

    Article  CAS  PubMed  Google Scholar 

  • Deutsch, E. W., Lam, H., & Aebersold, R. (2008). PeptideAtlas: A resource for target selection for emerging targeted proteomics workflows. EMBO Reports, 9(5), 429–434. doi:10.1038/embor.2008.56.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Deutsch, E. W., Mendoza, L., Shteynberg, D., Farrah, T., Lam, H., Tasman, N., Sun, Z., Nilsson, E., Pratt, B., Prazen, B., Eng, J. K., Martin, D. B., Nesvizhskii, A. I., & Aebersold, R. (2010). A guided tour of the trans-proteomic pipeline. Proteomics, 10(6), 1150–1159. doi:10.1002/pmic.200900375.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Deutsch, E. W., Mendoza, L., Shteynberg, D., Slagel, J., Sun, Z., & Moritz, R. L. (2015). Trans-proteomic pipeline, a standardized data processing pipeline for large-scale reproducible proteomics informatics. Proteomics Clinical Applications, 9(7–8), 745–754. doi:10.1002/prca.201400164.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Dobin, A., Davis, C. A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P., Chaisson, M., & Gingeras, T. R. (2013). STAR: Ultrafast universal RNA-seq aligner. Bioinformatics, 29(1), 15–21. doi:10.1093/bioinformatics/bts635.

    Article  CAS  PubMed  Google Scholar 

  • Domon, B., & Aebersold, R. (2006). Mass spectrometry and protein analysis. Science, 312(5771), 212–217. doi:10.1126/science.1124619.

    Article  CAS  PubMed  Google Scholar 

  • Elias, J. E., & Gygi, S. P. (2010). Target-decoy search strategy for mass spectrometry-based proteomics. Methods in Molecular Biology, 604, 55–71. doi:10.1007/978-1-60761-444-9_5.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Eng, J. K., Searle, B. C., Clauser, K. R., & Tabb, D. L. (2011). A face in the crowd: Recognizing peptides through database search. Molecular & Cellular Proteomics, 10(11), R111.009522. doi:10.1074/mcp.R111.009522.

    Article  Google Scholar 

  • Eng, J. K., Jahan, T. A., & Hoopmann, M. R. (2013). Comet: An open-source MS/MS sequence database search tool. Proteomics, 13(1), 22–24. doi:10.1002/pmic.201200439.

    Article  CAS  PubMed  Google Scholar 

  • Farrah, T., Deutsch, E. W., Omenn, G. S., Campbell, D. S., Sun, Z., Bletz, J. A., Mallick, P., Katz, J. E., Malmstrom, J., Ossola, R., Watts, J. D., Lin, B., Zhang, H., Moritz, R. L., & Aebersold, R. (2011). A high-confidence human plasma proteome reference set with estimated concentrations in PeptideAtlas. Molecular & Cellular Proteomics, 10(9), M110 006353. doi:10.1074/mcp.M110.006353.

    Article  Google Scholar 

  • Fiume, M., Williams, V., Brook, A., & Brudno, M. (2010). Savant: Genome browser for high-throughput sequencing data. Bioinformatics, 26(16), 1938–1944. doi:10.1093/bioinformatics/btq332.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Frank, A., & Pevzner, P. (2005). PepNovo: De novo peptide sequencing via probabilistic network modeling. Analytical Chemistry, 77(4), 964–973.

    Article  CAS  PubMed  Google Scholar 

  • Gawron, D., Gevaert, K., & Van Damme, P. (2014). The proteome under translational control. Proteomics, 14(23–24), 2647–2662. doi:10.1002/pmic.201400165.

    Article  CAS  PubMed  Google Scholar 

  • Geer, L. Y., Markey, S. P., Kowalak, J. A., Wagner, L., Xu, M., Maynard, D. M., Yang, X., Shi, W., & Bryant, S. H. (2004). Open mass spectrometry search algorithm. Journal of Proteome Research, 3(5), 958–964. doi:10.1021/pr0499491.

    Article  CAS  PubMed  Google Scholar 

  • Grabherr, M. G., Haas, B. J., Yassour, M., Levin, J. Z., Thompson, D. A., Amit, I., Adiconis, X., Fan, L., Raychowdhury, R., Zeng, Q., Chen, Z., Mauceli, E., Hacohen, N., Gnirke, A., Rhind, N., di Palma, F., Birren, B. W., Nusbaum, C., Lindblad-Toh, K., Friedman, N., & Regev, A. (2011). Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nature Biotechnology, 29(7), 644–652. doi:10.1038/nbt.1883.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Griss, J., Jones, A. R., Sachsenberg, T., Walzer, M., Gatto, L., Hartler, J., Thallinger, G. G., Salek, R. M., Steinbeck, C., Neuhauser, N., Cox, J., Neumann, S., Fan, J., Reisinger, F., Xu, Q. W., Del Toro, N., Perez-Riverol, Y., Ghali, F., Bandeira, N., Xenarios, I., Kohlbacher, O., Vizcaino, J. A., & Hermjakob, H. (2014). The mzTab data exchange format: Communicating mass-spectrometry-based proteomics and metabolomics experimental results to a wider audience. Molecular & Cellular Proteomics, 13(10), 2765–2775. doi:10.1074/mcp.O113.036681.

    Article  CAS  Google Scholar 

  • Gstaiger, M., & Aebersold, R. (2009). Applying mass spectrometry-based proteomics to genetics, genomics and network biology. Nature Reviews Genetics, 10(9), 617–627. doi:10.1038/nrg2633.

    Article  CAS  PubMed  Google Scholar 

  • Herrero, J., Muffato, M., Beal, K., Fitzgerald, S., Gordon, L., Pignatelli, M., Vilella, A. J., Searle, S. M., Amode, R., Brent, S., Spooner, W., Kulesha, E., Yates, A., & Flicek, P. (2016). Ensembl comparative genomics resources. Database: The Journal of Biological Databases and Curation. doi:10.1093/database/bav096.

    Google Scholar 

  • Hoopmann, M. R., & Moritz, R. L. (2013). Current algorithmic solutions for peptide-based proteomics data generation and identification. Current Opinion in Biotechnology, 24(1), 31–38. doi:10.1016/j.copbio.2012.10.013.

    Article  CAS  PubMed  Google Scholar 

  • Horvatovich, P. L., & Bischoff, R. (2010). Current technological challenges in biomarker discovery and validation. European Journal of Mass Spectrometry, 16(1), 101–121. doi:10.1255/ejms.1050.

    Article  CAS  PubMed  Google Scholar 

  • Horvatovich, P., Govorukhina, N., & Bischoff, R. (2006). Biomarker discovery by proteomics: Challenges not only for the analytical chemist. The Analyst, 131(11), 1193–1196. doi:10.1039/b607833h.

    Article  CAS  PubMed  Google Scholar 

  • Horvatovich, P., Hoekman, B., Govorukhina, N., & Bischoff, R. (2010). Multidimensional chromatography coupled to mass spectrometry in analysing complex proteomics samples. Journal of Separation Science, 33(10), 1421–1437. doi:10.1002/jssc.201000050.

    Article  CAS  PubMed  Google Scholar 

  • Horvatovich, P., Lundberg, E. K., Chen, Y. J., Sung, T. Y., He, F., Nice, E. C., Goode, R. J., Yu, S., Ranganathan, S., Baker, M. S., Domont, G. B., Velasquez, E., Li, D., Liu, S., Wang, Q., He, Q. Y., Menon, R., Guan, Y., Corrales, F. J., Segura, V., Casal, J. I., Pascual-Montano, A., Albar, J. P., Fuentes, M., Gonzalez-Gonzalez, M., Diez, P., Ibarrola, N., Degano, R. M., Mohammed, Y., Borchers, C. H., Urbani, A., Soggiu, A., Yamamoto, T., Salekdeh, G. H., Archakov, A., Ponomarenko, E., Lisitsa, A., Lichti, C. F., Mostovenko, E., Kroes, R. A., Rezeli, M., Vegvari, A., Fehniger, T. E., Bischoff, R., Vizcaino, J. A., Deutsch, E. W., Lane, L., Nilsson, C. L., Marko-Varga, G., Omenn, G. S., Jeong, S. K., Lim, J. S., Paik, Y. K., & Hancock, W. S. (2015). Quest for missing proteins: Update 2015 on chromosome-centric human proteome project. Journal of Proteome Research, 14(9), 3415–3431. doi:10.1021/pr5013009.

    Article  CAS  PubMed  Google Scholar 

  • Hughes, C., Ma, B., & Lajoie, G. A. (2010). De novo sequencing methods in proteomics. Methods in Molecular Biology, 604, 105–121. doi:10.1007/978-1-60761-444-9_8.

    Article  CAS  PubMed  Google Scholar 

  • Jeong, K., Kim, S., & Pevzner, P. A. (2013). UniNovo: A universal tool for de novo peptide sequencing. Bioinformatics, 29(16), 1953–1962. doi:10.1093/bioinformatics/btt338.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Kall, L., Canterbury, J. D., Weston, J., Noble, W. S., & MacCoss, M. J. (2007). Semi-supervised learning for peptide identification from shotgun proteomics datasets. Nature Methods, 4(11), 923–925. doi:10.1038/nmeth1113.

    Article  PubMed  Google Scholar 

  • Kapp, E., & Schutz, F. (2007). Overview of tandem mass spectrometry (MS/MS) database search algorithms. Current protocols in protein science / editorial board, John E Coligan [et al] Chapter 25:Unit25 22. doi:10.1002/0471140864.ps2502s49.

  • Keller, A., Nesvizhskii, A. I., Kolker, E., & Aebersold, R. (2002). Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Analytical Chemistry, 74(20), 5383–5392.

    Article  CAS  PubMed  Google Scholar 

  • Kertesz-Farkas, A., Keich, U., & Noble, W. S. (2015). Tandem mass spectrum identification via cascaded search. Journal of Proteome Research, 14(8), 3027–3038. doi:10.1021/pr501173s.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Kessner, D., Chambers, M., Burke, R., Agus, D., & Mallick, P. (2008). ProteoWizard: Open source software for rapid proteomics tools development. Bioinformatics, 24(21), 2534–2536. doi:10.1093/bioinformatics/btn323.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Khan, Z., Bloom, J. S., Garcia, B. A., Singh, M., & Kruglyak, L. (2009). Protein quantification across hundreds of experimental conditions. Proceedings of the National Academy of Sciences of the United States of America, 106(37), 15544–15548. doi:10.1073/pnas.0904100106.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Kim, S., & Pevzner, P. A. (2014). MS-GF+ makes progress towards a universal database search tool for proteomics. Nature Communications, 5, 5277. doi:10.1038/ncomms6277.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Kim, D., Pertea, G., Trapnell, C., Pimentel, H., Kelley, R., & Salzberg, S. L. (2013). TopHat2: Accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biology, 14(4), R36. doi:10.1186/gb-2013-14-4-r36.

    Article  PubMed  PubMed Central  Google Scholar 

  • Kirchner, M., Steen, J. A., Hamprecht, F. A., & Steen, H. (2010). MGFp: An open Mascot Generic Format parser library implementation. Journal of Proteome Research, 9(5), 2762–2763. doi:10.1021/pr100118f.

    Article  CAS  PubMed  Google Scholar 

  • Lam, H. (2011). Building and searching tandem mass spectral libraries for peptide identification. Molecular & Cellular Proteomics, 10(12), R111.008565. doi:10.1074/mcp.R111.008565.

    Article  Google Scholar 

  • Lander, E. S., Linton, L. M., Birren, B., Nusbaum, C., Zody, M. C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., Funke, R., Gage, D., Harris, K., Heaford, A., Howland, J., Kann, L., Lehoczky, J., LeVine, R., McEwan, P., McKernan, K., Meldrim, J., Mesirov, J. P., Miranda, C., Morris, W., Naylor, J., Raymond, C., Rosetti, M., Santos, R., Sheridan, A., Sougnez, C., Stange-Thomann, Y., Stojanovic, N., Subramanian, A., Wyman, D., Rogers, J., Sulston, J., Ainscough, R., Beck, S., Bentley, D., Burton, J., Clee, C., Carter, N., Coulson, A., Deadman, R., Deloukas, P., Dunham, A., Dunham, I., Durbin, R., French, L., Grafham, D., Gregory, S., Hubbard, T., Humphray, S., Hunt, A., Jones, M., Lloyd, C., McMurray, A., Matthews, L., Mercer, S., Milne, S., Mullikin, J. C., Mungall, A., Plumb, R., Ross, M., Shownkeen, R., Sims, S., Waterston, R. H., Wilson, R. K., Hillier, L. W., McPherson, J. D., Marra, M. A., Mardis, E. R., Fulton, L. A., Chinwalla, A. T., Pepin, K. H., Gish, W. R., Chissoe, S. L., Wendl, M. C., Delehaunty, K. D., Miner, T. L., Delehaunty, A., Kramer, J. B., Cook, L. L., Fulton, R. S., Johnson, D. L., Minx, P. J., Clifton, S. W., Hawkins, T., Branscomb, E., Predki, P., Richardson, P., Wenning, S., Slezak, T., Doggett, N., Cheng, J. F., Olsen, A., Lucas, S., Elkin, C., Uberbacher, E., Frazier, M., Gibbs, R. A., Muzny, D. M., Scherer, S. E., Bouck, J. B., Sodergren, E. J., Worley, K. C., Rives, C. M., Gorrell, J. H., Metzker, M. L., Naylor, S. L., Kucherlapati, R. S., Nelson, D. L., Weinstock, G. M., Sakaki, Y., Fujiyama, A., Hattori, M., Yada, T., Toyoda, A., Itoh, T., Kawagoe, C., Watanabe, H., Totoki, Y., Taylor, T., Weissenbach, J., Heilig, R., Saurin, W., Artiguenave, F., Brottier, P., Bruls, T., Pelletier, E., Robert, C., Wincker, P., Smith, D. R., Doucette-Stamm, L., Rubenfield, M., Weinstock, K., Lee, H. M., Dubois, J., Rosenthal, A., Platzer, M., Nyakatura, G., Taudien, S., Rump, A., Yang, H., Yu, J., Wang, J., Huang, G., Gu, J., Hood, L., Rowen, L., Madan, A., Qin, S., Davis, R. W., Federspiel, N. A., Abola, A. P., Proctor, M. J., Myers, R. M., Schmutz, J., Dickson, M., Grimwood, J., Cox, D. R., Olson, M. V., Kaul, R., Shimizu, N., Kawasaki, K., Minoshima, S., Evans, G. A., Athanasiou, M., Schultz, R., Roe, B. A., Chen, F., Pan, H., Ramser, J., Lehrach, H., Reinhardt, R., McCombie, W. R., de la Bastide, M., Dedhia, N., Blocker, H., Hornischer, K., Nordsiek, G., Agarwala, R., Aravind, L., Bailey, J. A., Bateman, A., Batzoglou, S., Birney, E., Bork, P., Brown, D. G., Burge, C. B., Cerutti, L., Chen, H. C., Church, D., Clamp, M., Copley, R. R., Doerks, T., Eddy, S. R., Eichler, E. E., Furey, T. S., Galagan, J., Gilbert, J. G., Harmon, C., Hayashizaki, Y., Haussler, D., Hermjakob, H., Hokamp, K., Jang, W., Johnson, L. S., Jones, T. A., Kasif, S., Kaspryzk, A., Kennedy, S., Kent, W. J., Kitts, P., Koonin, E. V., Korf, I., Kulp, D., Lancet, D., Lowe, T. M., McLysaght, A., Mikkelsen, T., Moran, J. V., Mulder, N., Pollara, V. J., Ponting, C. P., Schuler, G., Schultz, J., Slater, G., Smit, A. F., Stupka, E., Szustakowki, J., Thierry-Mieg, D., Thierry-Mieg, J., Wagner, L., Wallis, J., Wheeler, R., Williams, A., Wolf, Y. I., Wolfe, K. H., Yang, S. P., Yeh, R. F., Collins, F., Guyer, M. S., Peterson, J., Felsenfeld, A., Wetterstrand, K. A., Patrinos, A., Morgan, M. J., de Jong, P., Catanese, J. J., Osoegawa, K., Shizuya, H., Choi, S., & Chen, Y. J. (2001). Initial sequencing and analysis of the human genome. Nature, 409(6822), 860–921. doi:10.1038/35057062.

    Article  CAS  PubMed  Google Scholar 

  • Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., & Durbin, R. (2009). The sequence alignment/Map format and SAMtools. Bioinformatics, 25(16), 2078–2079. doi:10.1093/bioinformatics/btp352.

    Article  PubMed  PubMed Central  Google Scholar 

  • Low, T. Y., van Heesch, S., van den Toorn, H., Giansanti, P., Cristobal, A., Toonen, P., Schafer, S., Hubner, N., van Breukelen, B., Mohammed, S., Cuppen, E., Heck, A. J., & Guryev, V. (2013). Quantitative and qualitative proteome characteristics extracted from in-depth integrated genomics and proteomics analysis. Cell Reports, 5(5), 1469–1478. doi:10.1016/j.celrep.2013.10.041.

    Article  CAS  PubMed  Google Scholar 

  • Markiv, A., Rambaruth, N. D., & Dwek, M. V. (2012). Beyond the genome and proteome: Targeting protein modifications in cancer. Current Opinion in Pharmacology, 12(4), 408–413. doi:10.1016/j.coph.2012.04.003.

    Article  CAS  PubMed  Google Scholar 

  • Martin, J. A., & Wang, Z. (2011). Next-generation transcriptome assembly. Nature Reviews Genetics, 12(10), 671–682. doi:10.1038/nrg3068.

    Article  CAS  PubMed  Google Scholar 

  • McKenna, A., Hanna, M., Banks, E., Sivachenko, A., Cibulskis, K., Kernytsky, A., Garimella, K., Altshuler, D., Gabriel, S., Daly, M., & DePristo, M. A. (2010). The genome analysis toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research, 20(9), 1297–1303. doi:10.1101/gr.107524.110.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Menschaert, G., & Fenyo, D. (2015). Proteogenomics from a bioinformatics angle: A growing field. Mass Spectrometry Reviews. doi:10.1002/mas.21483.

    PubMed  Google Scholar 

  • Metzker, M. L. (2010). Sequencing technologies – The next generation. Nature Reviews Genetics, 11(1), 31–46. doi:10.1038/nrg2626.

    Article  CAS  PubMed  Google Scholar 

  • Muth, T., Weilnbock, L., Rapp, E., Huber, C. G., Martens, L., Vaudel, M., & Barsnes, H. (2014). DeNovoGUI: An open source graphical user interface for de novo sequencing of tandem mass spectra. Journal of Proteome Research, 13(2), 1143–1146. doi:10.1021/pr4008078.

    Article  CAS  PubMed  Google Scholar 

  • Nesvizhskii, A. I. (2007). Protein identification by tandem mass spectrometry and sequence database searching. Methods in Molecular Biology, 367, 87–119. doi:10.1385/1-59745-275-0:87.

    CAS  PubMed  Google Scholar 

  • Nesvizhskii, A. I. (2010). A survey of computational methods and error rate estimation procedures for peptide and protein identification in shotgun proteomics. Journal of Proteomics, 73(11), 2092–2123. doi:10.1016/j.jprot.2010.08.009.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Nesvizhskii, A. I. (2014). Proteogenomics: Concepts, applications and computational strategies. Nature Methods, 11(11), 1114–1125. doi:10.1038/nmeth.3144.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Nesvizhskii, A., & Avtonomov, D. http://www.batmass.org/

  • Nesvizhskii, A. I., & Aebersold, R. (2005). Interpretation of shotgun proteomic data: The protein inference problem. Molecular & Cellular Proteomics, 4(10), 1419–1440. doi:10.1074/mcp.R500012-MCP200.

    Article  CAS  Google Scholar 

  • Nesvizhskii, A. I., Keller, A., Kolker, E., & Aebersold, R. (2003). A statistical model for identifying proteins by tandem mass spectrometry. Analytical Chemistry, 75(17), 4646–4658.

    Article  CAS  PubMed  Google Scholar 

  • Orchard, S., Taylor, C., Hermjakob, H., Zhu, W., Julian, R., & Apweiler, R. (2004). Current status of proteomic standards development. Expert Review of Proteomics, 1(2), 179–183. doi:10.1586/14789450.1.2.179.

    Article  CAS  PubMed  Google Scholar 

  • Patel, R. K., & Jain, M. (2012). NGS QC Toolkit: A toolkit for quality control of next generation sequencing data. PloS One, 7(2), e30619. doi:10.1371/journal.pone.0030619.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Pearson, W. R., Wood, T., Zhang, Z., & Miller, W. (1997). Comparison of DNA sequences with protein sequences. Genomics, 46(1), 24–36. doi:10.1006/geno.1997.4995.

    Article  CAS  PubMed  Google Scholar 

  • Pedrioli, P. G., Eng, J. K., Hubley, R., Vogelzang, M., Deutsch, E. W., Raught, B., Pratt, B., Nilsson, E., Angeletti, R. H., Apweiler, R., Cheung, K., Costello, C. E., Hermjakob, H., Huang, S., Julian, R. K., Kapp, E., McComb, M. E., Oliver, S. G., Omenn, G., Paton, N. W., Simpson, R., Smith, R., Taylor, C. F., Zhu, W., & Aebersold, R. (2004). A common open representation of mass spectrometry data and its application to proteomics research. Nature Biotechnology, 22(11), 1459–1466. doi:10.1038/nbt1031.

    Article  CAS  PubMed  Google Scholar 

  • Robinson, J. T., Thorvaldsdottir, H., Winckler, W., Guttman, M., Lander, E. S., Getz, G., & Mesirov, J. P. (2011). Integrative genomics viewer. Nature Biotechnology, 29(1), 24–26. doi:10.1038/nbt.1754.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Rost, H. L., Rosenberger, G., Navarro, P., Gillet, L., Miladinovic, S. M., Schubert, O. T., Wolski, W., Collins, B. C., Malmstrom, J., Malmstrom, L., & Aebersold, R. (2014). OpenSWATH enables automated, targeted analysis of data-independent acquisition MS data. Nature Biotechnology, 32(3), 219–223. doi:10.1038/nbt.2841.

    Article  PubMed  Google Scholar 

  • Ruggles, K. V., Tang, Z., Wang, X., Grover, H., Askenazi, M., Teubl, J., Cao, S., McLellan, M. D., Clauser, K. R., Tabb, D. L., Mertins, P., Slebos, R., Erdmann-Gilmore, P., Li, S., Gunawardena, H. P., Xie, L., Liu, T., Zhou, J. Y., Sun, S., Hoadley, K. A., Perou, C. M., Chen, X., Davies, S. R., Maher, C. A., Kinsinger, C. R., Rodland, K. D., Zhang, H., Zhang, Z., Ding, L., Townsend, R. R., Rodriguez, H., Chan, D., Smith, R. D., Liebler, D. C., Carr, S. A., Payne, S., Ellis, M. J., & Fenyo, D. (2015). An analysis of the sensitivity of proteogenomic mapping of somatic mutations and novel splicing events in cancer. Molecular & Cellular Proteomics. doi:10.1074/mcp.M115.056226.

    Google Scholar 

  • Ruiz-Orera, J., Messeguer, X., Subirana, J. A., & Alba, M. M. (2014). Long non-coding RNAs as a source of new peptides. eLife, 3, e03523. doi:10.7554/eLife.03523.

    Article  PubMed  PubMed Central  Google Scholar 

  • Sajic, T., Liu, Y., & Aebersold, R. (2015). Using data-independent, high-resolution mass spectrometry in protein biomarker research: Perspectives and clinical applications. Proteomics Clinical Applications, 9(3–4), 307–321. doi:10.1002/prca.201400117.

    Article  CAS  PubMed  Google Scholar 

  • Sanger, F., Air, G. M., Barrell, B. G., Brown, N. L., Coulson, A. R., Fiddes, C. A., Hutchison, C. A., Slocombe, P. M., & Smith, M. (1977). Nucleotide sequence of bacteriophage phi X174 DNA. Nature, 265(5596), 687–695.

    Article  CAS  PubMed  Google Scholar 

  • Schwanhausser, B., Busse, D., Li, N., Dittmar, G., Schuchhardt, J., Wolf, J., Chen, W., & Selbach, M. (2011). Global quantification of mammalian gene expression control. Nature, 473(7347), 337–342. doi:10.1038/nature10098.

    Article  PubMed  Google Scholar 

  • Schwanhausser, B., Busse, D., Li, N., Dittmar, G., Schuchhardt, J., Wolf, J., Chen, W., & Selbach, M. (2013). Corrigendum: Global quantification of mammalian gene expression control. Nature, 495(7439), 126–127. doi:10.1038/nature11848.

    Article  PubMed  Google Scholar 

  • Shanmugam, A. K., & Nesvizhskii, A. I. (2015). Effective leveraging of targeted search spaces for improving peptide identification in tandem mass spectrometry based proteomics. Journal of Proteome Research, 14(12), 5169–5178. doi:10.1021/acs.jproteome.5b00504.

    Article  CAS  PubMed  Google Scholar 

  • Sheynkman, G. M., Shortreed, M. R., Frey, B. L., & Smith, L. M. (2013). Discovery and mass spectrometric analysis of novel splice-junction peptides using RNA-Seq. Molecular & Cellular Proteomics, 12(8), 2341–2353. doi:10.1074/mcp.O113.028142.

    Article  CAS  Google Scholar 

  • Simpson, J. T., Wong, K., Jackman, S. D., Schein, J. E., Jones, S. J., & Birol, I. (2009). ABySS: A parallel assembler for short read sequence data. Genome Research, 19(6), 1117–1123. doi:10.1101/gr.089532.108.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Sturm, M., & Kohlbacher, O. (2009). TOPPView: An open-source viewer for mass spectrometry data. Journal of Proteome Research, 8(7), 3760–3763. doi:10.1021/pr900171m.

    Article  CAS  PubMed  Google Scholar 

  • Tang, S., Lomsadze, A., & Borodovsky, M. (2015). Identification of protein coding regions in RNA transcripts. Nucleic Acids Research, 43(12), e78. doi:10.1093/nar/gkv227.

    Article  PubMed  PubMed Central  Google Scholar 

  • Tay, A. P., Pang, C. N., Twine, N. A., Hart-Smith, G., Harkness, L., Kassem, M., & Wilkins, M. R. (2015). Proteomic validation of transcript isoforms, including those assembled from RNA-Seq data. Journal of Proteome Research, 14(9), 3541–3554. doi:10.1021/pr5011394.

    Article  CAS  PubMed  Google Scholar 

  • Teleman, J., Rost, H. L., Rosenberger, G., Schmitt, U., Malmstrom, L., Malmstrom, J., & Levander, F. (2015). DIANA–algorithmic improvements for analysis of data-independent acquisition MS data. Bioinformatics, 31(4), 555–562. doi:10.1093/bioinformatics/btu686.

    Article  CAS  PubMed  Google Scholar 

  • Ternent, T., Csordas, A., Qi, D., Gomez-Baena, G., Beynon, R. J., Jones, A. R., Hermjakob, H., & Vizcaino, J. A. (2014). How to submit MS proteomics data to ProteomeXchange via the PRIDE database. Proteomics, 14(20), 2233–2241. doi:10.1002/pmic.201400120.

    Article  CAS  PubMed  Google Scholar 

  • Trapnell, C., Williams, B. A., Pertea, G., Mortazavi, A., Kwan, G., van Baren, M. J., Salzberg, S. L., Wold, B. J., & Pachter, L. (2010). Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology, 28(5), 511–515. doi:10.1038/nbt.1621.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Trevisiol, S., Ayoub, D., Lesur, A., Ancheva, L., Gallien, S., & Domon, B. (2015). The use of proteases complementary to trypsin to probe isoforms and modifications. Proteomics. doi:10.1002/pmic.201500379.

    Google Scholar 

  • Turewicz, M., & Deutsch, E. W. (2011). Spectra, chromatograms, metadata: mzML-the standard data format for mass spectrometer output. Methods in Molecular Biology, 696, 179–203. doi:10.1007/978-1-60761-987-1_11.

    Article  CAS  PubMed  Google Scholar 

  • Tyanova, S., Temu, T., Carlson, A., Sinitcyn, P., Mann, M., & Cox, J. (2015). Visualization of LC-MS/MS proteomics data in MaxQuant. Proteomics, 15(8), 1453–1456. doi:10.1002/pmic.201400449.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Vaudel, M., Barsnes, H., Berven, F. S., Sickmann, A., & Martens, L. (2011). SearchGUI: An open-source graphical user interface for simultaneous OMSSA and X! Tandem searches. Proteomics, 11(5), 996–999. doi:10.1002/pmic.201000595.

    Article  CAS  PubMed  Google Scholar 

  • Vaudel, M., Burkhart, J. M., Zahedi, R. P., Oveland, E., Berven, F. S., Sickmann, A., Martens, L., & Barsnes, H. (2015). PeptideShaker enables reanalysis of MS-derived proteomics data sets. Nature Biotechnology, 33(1), 22–24. doi:10.1038/nbt.3109.

    Article  CAS  PubMed  Google Scholar 

  • Volders, P. J., Helsens, K., Wang, X., Menten, B., Martens, L., Gevaert, K., Vandesompele, J., & Mestdagh, P. (2013). LNCipedia: A database for annotated human lncRNA transcript sequences and structures. Nucleic Acids Research, 41(Database issue), D246–D251. doi:10.1093/nar/gks915.

    Article  CAS  PubMed  Google Scholar 

  • Volders, P. J., Verheggen, K., Menschaert, G., Vandepoele, K., Martens, L., Vandesompele, J., & Mestdagh, P. (2015). An update on LNCipedia: A database for annotated human lncRNA sequences. Nucleic Acids Research, 43(Database issue), D174–D180. doi:10.1093/nar/gku1060.

    Article  PubMed  Google Scholar 

  • Walsh, C. T., Garneau-Tsodikova, S., & Gatto, G. J., Jr. (2005). Protein posttranslational modifications: The chemistry of proteome diversifications. Angewandte Chemie International Edition, 44(45), 7342–7372. doi:10.1002/anie.200501023.

    Article  CAS  Google Scholar 

  • Walzer, M., Qi, D., Mayer, G., Uszkoreit, J., Eisenacher, M., Sachsenberg, T., Gonzalez-Galarza, F. F., Fan, J., Bessant, C., Deutsch, E. W., Reisinger, F., Vizcaino, J. A., Medina-Aunon, J. A., Albar, J. P., Kohlbacher, O., & Jones, A. R. (2013). The mzQuantML data standard for mass spectrometry-based quantitative studies in proteomics. Molecular & Cellular Proteomics, 12(8), 2332–2340. doi:10.1074/mcp.O113.028506.

    Article  CAS  Google Scholar 

  • Walzer, M., Pernas, L. E., Nasso, S., Bittremieux, W., Nahnsen, S., Kelchtermans, P., Pichler, P., van den Toorn, H. W., Staes, A., Vandenbussche, J., Mazanek, M., Taus, T., Scheltema, R. A., Kelstrup, C. D., Gatto, L., van Breukelen, B., Aiche, S., Valkenborg, D., Laukens, K., Lilley, K. S., Olsen, J. V., Heck, A. J., Mechtler, K., Aebersold, R., Gevaert, K., Vizcaino, J. A., Hermjakob, H., Kohlbacher, O., & Martens, L. (2014). qcML: An exchange format for quality control metrics from mass spectrometry experiments. Molecular & Cellular Proteomics, 13(8), 1905–1913. doi:10.1074/mcp.M113.035907.

    Article  CAS  Google Scholar 

  • Weisser, H., Nahnsen, S., Grossmann, J., Nilse, L., Quandt, A., Brauer, H., Sturm, M., Kenar, E., Kohlbacher, O., Aebersold, R., & Malmstrom, L. (2013). An automated pipeline for high-throughput label-free quantitative proteomics. Journal of Proteome Research, 12(4), 1628–1644. doi:10.1021/pr300992u.

    Article  CAS  PubMed  Google Scholar 

  • Zhang, J., Xin, L., Shan, B., Chen, W., Xie, M., Yuen, D., Zhang, W., Zhang, Z., Lajoie, G. A., & Ma, B. (2012). PEAKS DB: De novo sequencing assisted database search for sensitive and accurate peptide identification. Molecular & Cellular Proteomics, 11(4), M111 010587. doi:10.1074/mcp.M111.010587.

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Peter Horvatovich .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Barbieri, R., Guryev, V., Brandsma, CA., Suits, F., Bischoff, R., Horvatovich, P. (2016). Proteogenomics: Key Driver for Clinical Discovery and Personalized Medicine. In: Végvári, Á. (eds) Proteogenomics. Advances in Experimental Medicine and Biology, vol 926. Springer, Cham. https://doi.org/10.1007/978-3-319-42316-6_3

Download citation

Publish with us

Policies and ethics