Identification of Small Novel Coding Sequences, a Proteogenomics Endeavor

Part of the book series: Advances in Experimental Medicine and Biology ((AEMB,volume 926))

Abstract

The identification of small proteins and peptides has consistently proven challenging. However, technological advances and multi-omics efforts now facilitate the identification of novel small coding sequences, leading to new insights. Specifically, integrating next-generation sequencing (NGS) technologies, which provide accurate and sample-specific transcriptome and translatome information, into proteomics has led to more comprehensive results and new discoveries. This book chapter focuses on the inclusion of RNA-Seq and RIBO-Seq, the latter also known as ribosome profiling, an RNA-Seq-based technique that sequences the approximately 30 nucleotide long mRNA fragments protected by translating ribosomes. We emphasize the identification of micropeptides and neo-antigens, two distinct classes of small translation products that are advancing our current understanding of biology. RNA-Seq can capture sample-specific genomic variations, enabling focused neo-antigen identification. RIBO-Seq can identify translation events in small open reading frames previously considered non-coding, leading to the discovery of micropeptides. The identification of small translation products requires the integration of multi-omics data, which underscores the importance of proteogenomics in this novel research area.

Volodimir Olexiouk and Gerben Menschaert contributed equally to this book chapter as first authors.

Conflict of Interest Statement

None declared.

Corresponding author

Correspondence to Volodimir Olexiouk.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Olexiouk, V., Menschaert, G. (2016). Identification of Small Novel Coding Sequences, a Proteogenomics Endeavor. In: Végvári, Á. (eds) Proteogenomics. Advances in Experimental Medicine and Biology, vol 926. Springer, Cham. https://doi.org/10.1007/978-3-319-42316-6_4
