Understanding the proteome encoded by “non-coding RNAs”: new insights into human genome

  • Review
  • Science China Life Sciences

Abstract

Non-coding RNAs (ncRNAs) are numerous and account for the majority of the genome. Their translation has been noted but seriously underestimated because of both technological and theoretical limitations. With the development of ribosome profiling (Ribo-seq), full-length translating RNA analysis (RNC-seq), and mass spectrometry, more and more ncRNAs are being found to be translated in different organisms, and some of them produce functional peptides. Recently, not only individual new functional proteins but also an entire new proteome has been experimentally shown to be encoded by endogenous long non-coding RNAs (lncRNAs) and circular RNAs (circRNAs). These new proteins are biologically significant, which links the translation of ncRNAs to human physiology and disease. An in-depth and systematic understanding of the coding capabilities of ncRNAs is therefore necessary for basic biology and medicine. In this review, we summarize advances in the discovery of this new proteome, i.e., “ncRNA-coded” proteins.

Acknowledgements

This work was supported by the National Key Research and Development Program (2017YFA0505100, 2017YFA0505001, 2018YFC0910202) and the National Natural Science Foundation of China (81372135 to T.W.; 81322028 and 31300649 to G.Z.; 31570828 and 31770888 to Q.Y.H.).

Author information

Corresponding authors

Correspondence to Tong Wang, Gong Zhang or Qing-Yu He.

Ethics declarations

Compliance and ethics: The author(s) declare that they have no conflict of interest.

About this article

Cite this article

Lu, S., Wang, T., Zhang, G. et al. Understanding the proteome encoded by “non-coding RNAs”: new insights into human genome. Sci. China Life Sci. 63, 986–995 (2020). https://doi.org/10.1007/s11427-019-1677-8

  • DOI: https://doi.org/10.1007/s11427-019-1677-8
