Abstract
Mulberry (Morus atropurpurea) is an economically important tree with a long history of extensive cultivation in Asia because it is the exclusive food source for the silkworm (Bombyx mori). Recently, mulberry has gained additional commercial value as a source of medicinal compounds, as animal fodder, and for landscaping. In the present work, the mulberry transcriptome was sequenced using the Illumina paired-end sequencing technology. A total of 105 million 90-bp paired-end reads were generated, and 60,069 unigenes were assembled with an N50 of 1219 bp. Based on a sequence similarity search with known proteins, 40,121 genes were identified. Among these genes, 31,548 were annotated with 55 gene ontology functional categories, 7790 had a Cluster of Orthologous Groups classification, and 23,188 mapped to 128 biological pathways in the Kyoto Encyclopedia of Genes and Genomes pathway database. In addition, 10,268 microsatellites were developed and characterized as potential molecular markers. These data will accelerate the understanding of mulberry growth and development mechanisms and facilitate gene discovery and functional genomic studies in mulberry.
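The N50 reported above is a standard assembly contiguity statistic: the length L such that sequences of length L or longer together contain at least half of all assembled bases. The sketch below shows one common way to compute it from a list of unigene lengths. The function name `n50` and the example lengths are illustrative assumptions and are not taken from the study's pipeline.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths (hypothetical helper)."""
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    if not ordered:
        raise ValueError("at least one sequence length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first length where the cumulative sum covers half the bases.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    # Toy lengths for illustration only, not data from the mulberry assembly.
    toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(n50(toy_lengths))
```

Applied to the real unigene lengths (for example, parsed from the deposited TSA FASTA records), the same calculation would be expected to give a value comparable to the reported 1219 bp, provided the same N50 convention was used.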
Acknowledgments
We thank the Beijing Genomics Institute for assistance in raw data processing and related bioinformatics analysis. This work was supported by funds from the Natural Science Foundation of Guangdong Province, China (No. S2013040016490) and the President Foundation of Guangdong Academy of Agricultural Sciences, China (No. 201314).
Data archiving statement
The raw sequencing data from this study are available in the NCBI Sequence Read Archive (http://www.ncbi.nlm.nih.gov/Traces/sra) under accession number SRP026705. The assembled sequences have been deposited in the NCBI Transcriptome Shotgun Assembly (TSA) database (http://www.ncbi.nlm.nih.gov/genbank/tsa) under accession number GBZO00000000. The version described in this study is the first version, GBZO01000000.
Additional information
Communicated by J. L. Wegrzyn
Electronic supplementary material
Below is the link to the electronic supplementary material.
Supplementary Figure 1
Transcriptome CDSs predicted by blastx and ESTScan. (A) Length distribution of CDSs predicted by blastx. (B) Length distribution of CDSs predicted by ESTScan. (DOC 100 kb)
Supplementary Table 1
Primer sequences for SSR markers. (XLS 40 kb)
Supplementary Table 2
Estimated full-length genes of assembled unigenes. (DOCX 16 kb)
Supplementary Table 3
Comparison of all unigenes against the mulberry genome sequence. (DOCX 16 kb)
Cite this article
Dai, F., Tang, C., Wang, Z. et al. De novo assembly, gene annotation, and marker development of mulberry (Morus atropurpurea) transcriptome. Tree Genetics & Genomes 11, 26 (2015). https://doi.org/10.1007/s11295-015-0851-4
DOI: https://doi.org/10.1007/s11295-015-0851-4