Introduction

Oaks belong to the genus Quercus, which comprises several hundred diploid and highly heterozygous species distributed throughout the northern hemisphere, from tropical to boreal regions (Abrams 1990). Among the oak species, Q. liaotungensis, whose leaves are the main food source for Antheraea pernyi, is an important drought-resistant tree species in the northern warm temperate zone. Its leaves are relatively small, a trait that may help the species adapt to the dry climate of Northeast China. However, a detailed genomic characterization of how Q. liaotungensis leaves cope with drought is lacking.

Recent transcriptomic and gene expression profiling studies in oaks have led to the construction of large cDNA libraries (Ueno et al. 2010; Kremer et al. 2012; Tarkka et al. 2013; Torre et al. 2014), and RNA-Seq studies have made it possible to identify genes involved in drought resistance (Torre et al. 2014). In this study, leaves of Q. liaotungensis were subjected to transcriptome analysis to expand the genetic resources available for this species. In addition, a series of candidate drought-related factors in Q. liaotungensis leaves were screened. The results may provide a theoretical basis for further research on the drought resistance mechanism of Q. liaotungensis.

Materials and methods

Sample preparation and RNA extraction

The Q. liaotungensis plants used in this study were grown at the research base of Shenyang Agricultural University under natural conditions at 23 ± 2 °C and 70 ± 5% relative humidity. Juvenile and mature leaves (Fig. 1A) from 30 individual 4-year-old plants were collected, immediately frozen in liquid nitrogen, and stored at −80 °C until processing. Total RNA was extracted using Trizol reagent (Invitrogen) according to the manufacturer's instructions. The quantity and purity of the RNA were measured using a NanoDrop ND-1000 (Wilmington, DE, USA). RNA integrity was assessed with a Bioanalyzer 2100 (Agilent, CA, USA).

Fig. 1

A The juvenile leaf (top) and mature leaf (bottom) of Q. liaotungensis. B Number of unigenes annotated in the NCBI_nr, GO, KEGG, Pfam, Swiss-Prot, and eggNOG databases. C The number of different putative candidate factors involved in drought avoidance in Q. liaotungensis leaf

cDNA library construction and transcriptome sequencing

High-quality RNA with concentration > 50 ng/μL, RIN > 7.0, OD260/280 > 1.8, and total amount > 1 μg was used for cDNA library construction and RNA-Seq, both of which were carried out by Lianchuan Biotechnology Co., Ltd (Hangzhou, China). Briefly, approximately 10 μg of total RNA was subjected to poly(A) mRNA isolation with poly-T oligo-attached magnetic beads (Invitrogen). Following purification, the mRNA was fragmented into small pieces using divalent cations at elevated temperature. The cleaved RNA fragments were then reverse-transcribed to create the final cDNA library in accordance with the protocol for the mRNA-seq sample preparation kit (Illumina, San Diego, USA). The average insert size of the paired-end libraries was 300 bp (± 50 bp). Paired-end sequencing was performed on an Illumina HiSeq 4000 (LC-Bio, China).
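The RNA acceptance thresholds stated above can be written as a simple filter. The following is a minimal sketch only: the `RnaSample` class, the `passes_qc` function, and the sample values are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class RnaSample:
    """One RNA extraction; field names are illustrative assumptions."""
    name: str
    conc_ng_per_ul: float   # concentration in ng/uL
    rin: float              # RNA integrity number (Bioanalyzer)
    od260_280: float        # purity ratio (NanoDrop)
    total_ug: float         # total RNA amount in ug


def passes_qc(s: RnaSample) -> bool:
    """Apply the four thresholds given in the text (all strict inequalities)."""
    return (
        s.conc_ng_per_ul > 50
        and s.rin > 7.0
        and s.od260_280 > 1.8
        and s.total_ug > 1.0
    )


# Made-up example values, for illustration only.
samples = [
    RnaSample("leaf_A", 120.0, 8.2, 2.01, 6.0),
    RnaSample("leaf_B", 45.0, 8.0, 1.95, 2.3),   # concentration too low
]
passed = [s.name for s in samples if passes_qc(s)]
```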

De novo transcriptome assembly

Adaptor sequences were removed from the raw reads using Cutadapt (https://cutadapt.readthedocs.io/en/stable/) (Martin 2011). Clean data were obtained by removing low-quality and repeated reads. Sequence quality, including Q20, Q30, and GC content, was verified with FastQC (http://www.bioinformatics.babraham.ac.uk/projects/FastQC/). The clean data were assembled de novo using Trinity (Grabherr et al. 2011). Trinity groups transcripts into clusters based on shared sequence content. Each transcript cluster was referred to as a ‘unigene’, and the longest transcript in the cluster was chosen as the sequence of that unigene.
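The selection of the longest transcript per cluster can be sketched as follows. This is an illustration rather than the pipeline actually used: it assumes Trinity-style identifiers in which the isoform suffix `_i<N>` follows the gene identifier, and the function names and toy sequences are hypothetical.

```python
import re
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def longest_isoform_per_gene(
    records: Iterable[Tuple[str, str]]
) -> Dict[str, Tuple[str, str]]:
    """Map each gene ID to its longest (transcript ID, sequence)."""
    best: Dict[str, Tuple[str, str]] = {}
    for tid, seq in records:
        gene = re.sub(r"_i\d+$", "", tid)   # strip the isoform suffix
        if gene not in best or len(seq) > len(best[gene][1]):
            best[gene] = (tid, seq)
    return best


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s) if s else 0.0


# Toy input with two isoforms of one gene and a single isoform of another.
toy = """>TRINITY_DN1_c0_g1_i1
ACGTAC
>TRINITY_DN1_c0_g1_i2
ACGTACGTAA
>TRINITY_DN2_c0_g1_i1
GGCC
""".splitlines()
unigenes = longest_isoform_per_gene(read_fasta(toy))
```

With real data, the same functions could be applied to the lines of the assembled FASTA file, and `gc_content` averaged over the selected sequences.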

Unigene annotation

The unigenes from the assembled transcriptome were annotated by aligning them against the following databases using DIAMOND (Buchfink et al. 2015): Gene Ontology (http://www.geneontology.org), Kyoto Encyclopedia of Genes and Genomes (http://www.genome.jp/kegg/), eggNOG (http://eggnogdb.embl.de/), NCBI_nr (http://www.ncbi.nlm.nih.gov/), and Pfam (http://pfam.xfam.org/).
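As an illustration of how alignment results of this kind can be reduced to one annotation per unigene, the sketch below keeps the highest-scoring hit for each query. It assumes the standard 12-column tabular output (BLAST-style format 6). The E-value cutoff of 1e-5 and the toy rows are assumptions, since the study does not state its cutoff.

```python
import csv
import io
from typing import Dict, TextIO, Tuple

# Column order of the standard 12-column tabular alignment output.
COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(handle: TextIO, max_evalue: float = 1e-5) -> Dict[str, Tuple[str, float]]:
    """Return {query: (subject, bitscore)} for the top-scoring hit per query.

    max_evalue is an assumed cutoff, not one reported in the text.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        hit = dict(zip(COLUMNS, row))
        if float(hit["evalue"]) > max_evalue:
            continue
        bitscore = float(hit["bitscore"])
        query = hit["qseqid"]
        if query not in best or bitscore > best[query][1]:
            best[query] = (hit["sseqid"], bitscore)
    return best


# Hypothetical alignment rows (not real results).
toy = (
    "DN1_c0_g1\tprotA\t85.0\t200\t30\t0\t1\t600\t1\t200\t1e-50\t300.0\n"
    "DN1_c0_g1\tprotB\t60.0\t150\t60\t0\t1\t450\t1\t150\t1e-20\t120.0\n"
    "DN2_c0_g1\tprotC\t40.0\t50\t30\t0\t1\t150\t1\t50\t0.5\t30.0\n"
)
hits = best_hits(io.StringIO(toy))
```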

Validation of RNA-Seq by quantitative RT-PCR

Ten unigenes were selected for qRT-PCR to validate the transcriptome data. The total RNA used for qRT-PCR was the same as that used for RNA-Seq. Gene-specific primers were designed using the predicted CDSs as reference sequences. qRT-PCR was performed on a CFX Connect™ Real-Time System (Bio-Rad) in a 20-μL reaction volume with the following program: 95 °C for 30 s, followed by 39 cycles of 95 °C for 5 s and 60 °C for 30 s. Melting curves were generated after each run to confirm a single PCR product. Each reaction was run in triplicate. The mRNA quantity of each gene was calculated with the 2^−ΔCT method (Livak and Schmittgen 2001).
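For clarity, the 2^−ΔCT calculation can be sketched as below, where ΔCT is the mean CT of the target gene minus the mean CT of a reference gene across the technical triplicates. The reference gene is not named in the text, so its use here is an assumption, and the CT values are invented for illustration.

```python
from statistics import mean
from typing import Sequence


def relative_expression(ct_target: Sequence[float], ct_reference: Sequence[float]) -> float:
    """Relative quantity by the 2^-dCT method.

    dCT = mean(CT of target) - mean(CT of reference); replicates are averaged first.
    """
    delta_ct = mean(ct_target) - mean(ct_reference)
    return 2.0 ** (-delta_ct)


# Hypothetical triplicate CT values (not measured data).
target_ct = [24.0, 24.2, 23.8]      # mean 24.0
reference_ct = [21.0, 21.1, 20.9]   # mean 21.0
rel = relative_expression(target_ct, reference_ct)   # dCT = 3, so about 2^-3
```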

Results and discussion

RNA-Seq generated 54,153,182 raw reads. After filtering out low-quality reads, 53,021,436 clean reads were obtained. Trinity assembled these into 41,207 transcripts with a mean length of 704 bp and a GC content of 42.17%, and 25,593 unigenes with a mean length of 687 bp and a GC content of 42.31% (Table 1). The raw sequence reads were deposited in the Gene Expression Omnibus (GEO) under accession number GSE125798. Of the unigenes in the Q. liaotungensis leaf transcriptome, 16,021, 17,935, 9290, 14,476, 19,806, and 20,529 were matched to the Pfam, GO, KEGG, Swiss-Prot, eggNOG, and NCBI_nr databases, respectively (Table 2, Fig. 1B). The species distribution of the NCBI_nr matches is shown in Fig. 2. The species that contributed most to the annotation was Juglans regia (59.43%), followed by Ziziphus jujuba (4.04%), Vitis vinifera (3.52%), Theobroma cacao (3.3%), Prunus persica (2.41%), and Prunus mume (2.07%), which suggests that Q. liaotungensis may be closely related to J. regia. The qRT-PCR results showed that the expression patterns of the candidate genes were consistent with those from RNA-Seq (Fig. 3), confirming the expression of the unigenes identified in the deep sequencing analysis.
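A species distribution such as the one reported above can be tallied from the top-hit descriptions, which in NCBI_nr conventionally end with the source organism in square brackets. The sketch below is illustrative only. The function names are hypothetical, and the four example descriptions follow the style of the annotations listed in the Fig. 3 caption.

```python
import re
from collections import Counter
from typing import Iterable, List, Optional, Tuple


def species_of(description: str) -> Optional[str]:
    """Extract the organism name from a trailing '[Genus species]' tag."""
    match = re.search(r"\[([^\[\]]+)\]\s*$", description)
    return match.group(1) if match else None


def species_distribution(descriptions: Iterable[str]) -> List[Tuple[str, int, float]]:
    """Return (species, count, percent) tuples sorted by descending count."""
    counts = Counter(s for s in map(species_of, descriptions) if s)
    total = sum(counts.values())
    return [(sp, n, 100.0 * n / total) for sp, n in counts.most_common()]


# Small illustrative input, not the full annotation table.
descriptions = [
    "superoxide dismutase 5 [Betula platyphylla]",
    "putative peroxidase 48 [Juglans regia]",
    "PREDICTED: basic leucine zipper 61 [Juglans regia]",
    "PREDICTED: basic leucine zipper 23 [Vitis vinifera]",
]
dist = species_distribution(descriptions)
```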

Table 1 Statistical analysis of the transcriptome sequence data
Table 2 Unigenes annotated in different databases
Fig. 2

Species distribution of the unigenes based on matches against the NCBI_nr database. Different colors represent different species. The numbers give the proportion of unigenes matched to each species

Fig. 3

Verification of the selected unigenes by qRT-PCR as compared with RNA-Seq data. X-axis represents the 10 genes selected for qRT-PCR validation. Y-axis represents the relative expression. DN6399_c0_g1, a superoxide dismutase 5 [Betula platyphylla]. DN9704_c1_g3, putative peroxidase 48 [Juglans regia]. DN9174_c2_g8, PREDICTED: basic leucine zipper 61 [Juglans regia]. DN10208_c2_g3, PREDICTED: putative peroxidase 48 [Juglans regia]. DN5663_c0_g1, PREDICTED: putative dehydration-responsive element-binding protein 2H [Juglans regia]. DN10418_c0_g5, PREDICTED: probable WRKY transcription factor 23 [Juglans regia]. DN7168_c0_g1, PREDICTED: probable WRKY transcription factor 75 [Juglans regia]. DN809_c0_g1, PREDICTED: bZIP transcription factor 53 [Juglans regia]. DN6101_c0_g1, PREDICTED: basic leucine zipper 23 [Vitis vinifera]. DN9760_c2_g6, PREDICTED: transcription factor MYB44 [Juglans regia]

The unigenes were classified into three categories including cellular component, molecular function and biological process via GO analysis, with 675, 1905 and 2849 GO terms corresponding to each category, respectively. The top 25 significantly clustered GO terms of the unigenes for each category are shown in Fig. 4. Among the subcategories of cellular component, assignments were mostly given to nucleus, integral component of membrane, plasma membrane, cytoplasm, chloroplast, mitochondrion, and cytosol. The majority of the annotated unigenes were assigned to protein binding, ATP binding, protein serine/threonine kinase activity and DNA binding in molecular function. Dominant GO terms of the biological process subcategories were grouped into regulation of transcription, defense response, protein phosphorylation, oxidation–reduction process, and signal transduction.

Fig. 4

The top 25 enriched GO terms in cellular component, molecular function and biological process categories of the unigenes from Q. liaotungensis leaf transcriptome

KEGG pathway enrichment analysis was conducted for the unigenes obtained from the transcriptome. The unigenes of Q. liaotungensis could be assigned to 138 pathways. Among the top 25 KEGG pathways with the highest representation of unigenes, the most abundant were endocytosis, plant hormone signal transduction, starch and sucrose metabolism, RNA transport, biosynthesis of amino acids, carbon metabolism, and amino sugar and nucleotide sugar metabolism (Fig. 5).
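Ranking pathways by the number of unigenes they contain can be sketched as follows. This is a simple count-based illustration, not a statistical enrichment test, and the unigene identifiers and pathway assignments in the example are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def top_pathways(unigene_to_pathways: Dict[str, List[str]], n: int = 25) -> List[Tuple[str, int]]:
    """Rank pathways by the number of distinct unigenes assigned to them."""
    counts: Counter = Counter()
    for pathways in unigene_to_pathways.values():
        counts.update(set(pathways))   # count each unigene once per pathway
    return counts.most_common(n)


# Made-up assignments for illustration only.
toy = {
    "DN1_c0_g1": ["Endocytosis", "RNA transport"],
    "DN2_c0_g1": ["Endocytosis"],
    "DN3_c0_g1": ["Plant hormone signal transduction", "Endocytosis"],
    "DN4_c0_g1": ["RNA transport"],
}
ranked = top_pathways(toy, n=2)
```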

Fig. 5

The top 25 enriched pathways of the unigenes from Q. liaotungensis leaf transcriptome

We screened a series of candidate genes potentially involved in drought adaptation via GO and KEGG analysis from those unigenes, including the genes encoding superoxide dismutase (SOD), peroxidase (POD), catalase (CAT), and several drought resistance-related transcription factors (TFs) such as dehydration-responsive element binding (DREB), v-myb avian myeloblastosis viral oncogene homolog (MYB), WRKY, basic leucine zipper (bZIP), and NAC (Fig. 1C, Table 3).
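One simple way to approximate such a screen is to search the annotation descriptions for family-specific keywords. The sketch below is an assumption-laden illustration and not the authors' GO- and KEGG-based procedure. The patterns are hypothetical choices, and a broad term such as "peroxidase" would also match related enzymes (for example glutathione peroxidase), so real use would need manual curation.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Hypothetical keyword patterns for each candidate family.
PATTERNS = {
    "SOD": r"superoxide dismutase",
    "POD": r"peroxidase",
    "CAT": r"catalase",
    "DREB": r"dehydration-responsive element-binding",
    "MYB": r"\bMYB",
    "WRKY": r"\bWRKY",
    "bZIP": r"\bbZIP\b|basic leucine zipper",
    "NAC": r"\bNAC\b",
}
COMPILED = {family: re.compile(p, re.IGNORECASE) for family, p in PATTERNS.items()}


def screen_candidates(annotations: Dict[str, str]) -> Dict[str, List[str]]:
    """Group unigene IDs by the family keyword found in their description."""
    hits: Dict[str, List[str]] = defaultdict(list)
    for unigene, description in annotations.items():
        for family, pattern in COMPILED.items():
            if pattern.search(description):
                hits[family].append(unigene)
    return dict(hits)


# Descriptions taken from the Fig. 3 caption.
annotations = {
    "DN6399_c0_g1": "superoxide dismutase 5 [Betula platyphylla]",
    "DN9704_c1_g3": "putative peroxidase 48 [Juglans regia]",
    "DN5663_c0_g1": "PREDICTED: putative dehydration-responsive element-binding protein 2H [Juglans regia]",
    "DN10418_c0_g5": "PREDICTED: probable WRKY transcription factor 23 [Juglans regia]",
    "DN809_c0_g1": "PREDICTED: bZIP transcription factor 53 [Juglans regia]",
    "DN9760_c2_g6": "PREDICTED: transcription factor MYB44 [Juglans regia]",
}
candidates = screen_candidates(annotations)
```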

Table 3 The putative candidate genes involved in drought avoidance

Drought has several effects on plant growth and development, one of which is oxidative damage. We identified 41 genes encoding superoxide dismutase (SOD), peroxidase (POD) and catalase (CAT) (Table 3), which together make up the antioxidant defense system that protects plant cells against damage from reactive oxygen species. Seven candidate genes encoding DREBs (Table 3), including DREB1 and DREB2, were obtained. Overexpression of DREB1 increased the tolerance of Malus baccata to low temperature, drought, and salt stresses (Yang et al. 2010). The expression of DREB2A increased in Arabidopsis thaliana under dehydration (Sakuma et al. 2007). Several drought resistance-related TFs, such as MYB and bZIP, which were found in Quercus pubescens leaves (Torre et al. 2014), were also identified in this study. We screened 5 genes encoding MYB TFs, including MYB108, MYB24, MYB44, and MYB12 (Table 3). The MYB family is one of the largest transcription factor families and plays regulatory roles in developmental processes and defense responses in plants. The bZIP and NAC TFs are known to play crucial roles in various plant processes as well as in responses to abiotic and biotic stresses such as drought (Xiang et al. 2008; Huang et al. 2015). WRKY proteins are newly identified TFs that are also involved in drought tolerance in plants (Ren et al. 2010). Overall, we screened 10, 42, and 6 candidate genes encoding bZIPs, WRKYs, and NACs, respectively (Table 3).

High-throughput sequencing makes it possible to obtain candidate target genes quickly. The results of this study expand the genetic resources of Q. liaotungensis and provide useful information for further research on its drought resistance mechanisms.

Author contribution statement

GW and LQ conceived the project; GW performed the experiments; GW analyzed the data and wrote the paper.