Abstract
Tea is one of the most popular beverages, and its leaves are rich in catechins, which contribute to its diverse flavors and are beneficial to human health. However, the post-transcriptional regulatory mechanisms affecting catechin synthesis remain insufficiently studied. Here, we sequenced the tea plant transcriptome using PacBio sequencing technology and obtained 63,111 full-length, high-quality isoforms, including 1302 potential novel genes and 583 highly reliable fusion transcripts. We also identified 1204 high-quality lncRNAs, comprising 188 known and 1016 novel lncRNAs. In addition, 311 mis-annotated genes were corrected based on the high-quality Iso-Seq reads. We identified a large number of alternative splicing (AS) events (3784) and alternative polyadenylation (APA) genes (18,714), accounting for 8.84% and 43.7% of the total annotated genes, respectively. We also found that 2884 genes with AS and APA features exhibited higher expression levels than other genes. These genes are mainly involved in amino acid biosynthesis, carbon fixation in photosynthetic organisms, phenylalanine, tyrosine, and tryptophan biosynthesis, and pyruvate metabolism, suggesting that they play an essential role in determining the catechin content of tea polyphenols. Our results improve the genome annotation and indicate that post-transcriptional regulation plays a crucial part in catechin synthesis.
Acknowledgements
This work was supported by the Key-Area Research and Development Program of Guangdong Province (2020B020220004), the Natural Science Foundation of Fujian Province, China (Grant Number 2019J05066), and the Key Projects of the Science and Technology Bureau of Fuzhou, Fujian, China (Grant Number 2021-N-119). We thank the anonymous reviewers and the editor for their insightful comments and valuable suggestions.
Author information
Contributions
XZ designed and performed the entire project. DM was involved in the analysis of the entire study. DM and JF drafted the manuscript. YL and LZ helped prepare the result figures. XZ, QD and LW participated in manuscript revision. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by Bing Yang.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Ma, D., Fang, J., Ding, Q. et al. A survey of transcriptome complexity using full-length isoform sequencing in the tea plant Camellia sinensis. Mol Genet Genomics 297, 1243–1255 (2022). https://doi.org/10.1007/s00438-022-01913-2
DOI: https://doi.org/10.1007/s00438-022-01913-2