De Novo Sequencing and Global Transcriptome Analysis of Nannochloropsis sp. (Eustigmatophyceae) Following Nitrogen Starvation
- Cite this article as:
- Liang, C., Cao, S., Zhang, X. et al. Bioenerg. Res. (2013) 6: 494. doi:10.1007/s12155-012-9269-0
Nannochloropsis sp. is an economically and nutritionally important microalga. Recently it has been demonstrated that Nannochloropsis sp. has significant potential for biofuel production. To determine the mechanisms of lipid formation and accumulation during nitrogen starvation, a transcriptomic study was performed to compare gene expression during growth with and without nitrogen. Digital expression analysis identified 1,855 differentially expressed genes between cells grown under nitrogen-replete and nitrogen-deprived conditions; this provided novel insights into the molecular mechanisms of lipid formation by Nannochloropsis sp. under stress. As expected, nitrogen deprivation induced genes involved in nitrogen metabolism and lipid biosynthesis. Although the chlorophyll content decreased following nitrogen deprivation, a subset of genes putatively encoding light-harvesting complex (LHC) proteins was upregulated. These upregulated LHCs may play a role in photoprotection. The sequence data were confirmed using reverse transcription polymerase chain reaction (RT-PCR) and quantitative real-time RT-PCR. The expression of a number of genes involved in acetyl-CoA metabolism was also affected under nitrogen deprivation, which may change fatty acids indirectly. Overall, we found low gene expression levels for fatty acid synthesis, suggesting that the buildup of precursors for the acetyl-CoA carboxylases may play a more significant role in TAG synthesis compared with the actual enzyme levels of acetyl-CoA carboxylases per se. The changes in transcript abundance in Nannochloropsis sp. following nitrogen deprivation provide a potential source for exploration of the molecular mechanisms of lipid formation and accumulation. Furthermore, a set of simple sequence repeat motifs was identified from the expressed sequence tags, providing useful genetic markers for further genetic analysis.
Keywords: Nannochloropsis sp.; Transcriptome; Nitrogen starvation
The Nannochloropsis species are important in marine aquaculture due to their potential for wide-scale production for nutritional purposes. They are a rich source of protein, carotenoids, lipids, carbohydrates, vitamins, and minerals, and could help fill the “protein gap” and play a role in feeding an ever-expanding world population [1–3]. Recently, several research groups have suggested that these species could be a promising source of oil for biofuel production [4–6] due to the high levels of saturated and monounsaturated fatty acids produced under certain stressful conditions.
However, the production of neutral lipids from naturally occurring microalgal strains has often resulted in much lower yields than the theoretical maximum, making microalgae-based biofuel production prohibitively expensive. The limited success of algal lipid production stems primarily from the lack of understanding of the metabolic pathways regulating algal lipid metabolism in general and neutral lipid synthesis and accumulation in particular [8, 9]. The accumulation of lipids under stress conditions in Nannochloropsis usually involves changes in the gene expression profile. Therefore, exploring the molecular basis of responses to nitrogen starvation in Nannochloropsis sp. will aid in the understanding of the mechanisms of lipid metabolism. Abundant transcriptome data are required for this purpose.
Although the genomics approach has already been demonstrated to be effective in understanding the biological pathways of many algae [10–17], expressed sequence tag (EST) sequencing represents a rapid and relatively economical method for analyzing the transcribed regions of the genome. Indeed, EST analyses have identified many genes involved in plant secondary metabolism. Recent advances in next-generation sequencing technologies, such as 454 sequencing, have allowed the generation of large-scale ESTs efficiently and cost-effectively [19, 20]. The 454 sequencing technology has experienced a rapid improvement in throughput, read length, and accuracy. The newest 454 sequencing platform, the GS 20 Titanium, can generate one million reads with an average length of 400 bases at 99.5 % accuracy per run. This sequencing method, which has been widely applied to many species [17, 22, 23], holds great potential for the discovery of genes and genetic markers in unconventional model species through de novo transcriptome sequence assembly.
Previous studies have shown that nitrogen deprivation affects lipid formation and accumulation in many microalgae [9, 24–28]. Previous studies [17, 26, 28] and our analysis (the data shown in this study) show that Nannochloropsis sp. increases the number of lipid droplets under nitrogen-deprived conditions. The goal of this study was to investigate the major changes in gene expression following nitrogen deprivation. We expected the pattern of the induced levels of transcript to reflect the metabolic changes. The results were able to provide valuable information to further understand the molecular mechanisms of lipid formation and accumulation in algae.
Materials and Methods
Microalga and Growth Conditions
The Nannochloropsis sp. strain NL117 was obtained from the alga culture collection of the Ocean University of China and was maintained in our laboratory. Sequence analysis of the 18S rRNA gene showed that the strain is closely related (97 % identity) to Nannochloropsis mantima. The strain was cultivated on f/2 medium. Nannochloropsis cells were grown photoautotrophically at 23 °C with 50 μmol m−2 s−1 white light in a 1-L triangle flask containing 600 mL f/2 medium. When the optical density (OD) reached 0.4 at a wavelength of 720 nm, 100 mL of culture was harvested for lipid analysis. To induce nitrogen deprivation, 500 mL of culture was collected by vacuum filtration using 0.45-μm fiber filters and resuspended in 500 mL nitrogen-free f/2 medium in a 1-L triangle flask without shaking. The cells were harvested 24 and 48 h after transfer into the nitrogen-free f/2 medium. From these samples, 100 mL was collected and immediately frozen in liquid nitrogen for RNA extraction. One hundred fifty milliliters of culture was harvested for analysis of lipid accumulation, fatty acid composition and concentrations, or chlorophyll content. All cultures were grown in triplicate.
Lipid Analysis of Nannochloropsis sp. in Nitrogen-Free Nutrient Medium
The accumulation of neutral lipids was measured using the fluorescent dye BODIPY505/515 (Invitrogen, USA), a lipid-soluble fluorescent probe that possesses several characteristics advantageous for in situ screening [30, 31]. After the algal cells were stained with BODIPY505/515, a Nikon Eclipse 80i microscope (Nikon, Japan), with blue light (488 nm) as the excitation wavelength, was used to image and quantify the lipid bodies in the algal cells. A Nikon CCD DS-file digital camera (Nikon, Japan) was used to capture the images. We also measured the relative neutral lipid content of the cells cultured in complete and nitrogen-free medium. When the OD reached 0.4 at a wavelength of 720 nm, the cells were harvested for the experiment. The culture conditions were as described above. From each sample, 5 mL of the algal suspension was stained with 1 μL of 10 mM BODIPY505/515 dissolved in anhydrous dimethyl sulfoxide (DMSO) (final concentration, 2 μM), and then excited at 475 nm before measuring the emission at 510 nm using a spectrofluorimeter (ISS Inc., Champaign, IL). Cell concentrations were determined using a hemacytometer.
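The stated final dye concentration follows from a simple dilution calculation. The short Python sketch below reproduces that arithmetic; the function name and its arguments are illustrative and not part of the original protocol.

```python
def final_concentration_uM(stock_mM, stock_volume_uL, sample_volume_mL):
    """Final dye concentration (micromolar) after adding stock to a sample."""
    stock_uM = stock_mM * 1000.0          # mM -> uM
    sample_uL = sample_volume_mL * 1000.0  # mL -> uL
    total_uL = sample_uL + stock_volume_uL
    return stock_uM * stock_volume_uL / total_uL


# Values from the staining step: 1 uL of 10 mM stock into 5 mL of suspension.
conc = final_concentration_uM(10.0, 1.0, 5.0)
print(round(conc, 2))  # approximately 2 uM
```

The added 1 μL changes the total volume by only 0.02 %, so the result is effectively 2 μM.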
Samples of the cells from the two different conditions were lyophilized and analyzed to determine the fatty acid composition and concentrations by GC-MS, using the protocol described by Chi et al. The experiments were repeated in triplicate. Fatty acid concentrations were normalized against the measured biological dry mass (BDM) values to eliminate any slight differences in the BDM values. The average and standard deviation were calculated from the triplicate samples.
Chlorophyll Content Analysis
cDNA Library Construction and Sequencing
The frozen cells were ground using a pestle and mortar prior to RNA extraction. Total RNA was extracted using the TRIzol kit (Invitrogen). The quality of the total RNA was verified on a 1.4 % (w/v) agarose–MOPS–formaldehyde denaturing gel and by assessing the A260/280 and A260/230 ratios using the NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, USA). RNA was purified to remove tRNA and rRNA and to enrich the mRNA, using an mRNA purification kit (Promega) according to the manufacturer's instructions. mRNA was reverse transcribed by Powerscript™ II (Takara) using the PCR primer SMART IV™ oligonucleotide (5′-AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGG-3′) and CDS III/3′ PCR primer (5′-ATTCTAGAGGCCGAGGCGGCCGACATG-d(T)30N-1N-3′). Long-distance PCR for double-strand cDNA amplification was performed with LA Taq polymerase (Takara) using 25 PCR cycles (95 °C for 30 s, 68 °C for 8 min) according to the SMART™ cDNA library construction kit user manual. Finally, the double-stranded cDNAs were purified using a DNA purification kit (Qiagen) to generate clean cDNA. The cDNA samples were normalized using the Trimmer Direct cDNA normalization kit (Evrogen) according to the manufacturer's instructions.
Approximately 10 μg of sheared cDNA was used for GS20–454 sequencing. The cDNA sample was end repaired and adapter ligated according to the method described by Margulies et al. Streptavidin bead enrichment, DNA denaturation, and emulsion PCR were also performed as described by Margulies et al. A half-plate sequencing run was performed for each sample at the Virginia Bioinformatics Institute Core Laboratory Facility according to the manufacturer's instructions (Roche-454 Life Sciences, Branford, CT, USA) at SeqWright (USA).
Sequence Processing and Data Assembly
Following 454 sequencing, the raw data were processed to remove low-quality regions, nonsense sequences (adaptor and primer sequences), and poly A sequences, using a Perl program. The high-quality reads were assembled with CAP3 to construct unique consensus sequences. The parameters used by CAP3 were as follows: -o 40 -p 95 -y 50.
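For readers who want to reproduce the assembly step, the following Python sketch shows how the reported CAP3 options could be assembled into a command line. The input file name `cleaned_reads.fasta` and the helper function are hypothetical; only the `-o`, `-p`, and `-y` values come from the text. In CAP3 these options set the overlap length cutoff, the overlap percent identity cutoff, and the clipping range, respectively. The command is built but not executed here.

```python
import shlex
from typing import List


def cap3_command(reads_fasta: str,
                 overlap_len: int = 40,
                 percent_identity: int = 95,
                 clip_range: int = 50) -> List[str]:
    """Build a CAP3 argument list using plain ASCII hyphens for the options."""
    return [
        "cap3", reads_fasta,
        "-o", str(overlap_len),       # overlap length cutoff
        "-p", str(percent_identity),  # overlap percent identity cutoff
        "-y", str(clip_range),        # clipping range
    ]


# Hypothetical file name for the quality-filtered reads.
cmd = cap3_command("cleaned_reads.fasta")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` on a machine where CAP3 is installed; CAP3 writes contigs and singlets to files named after the input.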
Functional Annotation and Pathway Prediction Classification
Unigenes were compared with the NCBI non-redundant nucleotide and non-redundant protein databases using BLASTN and BLASTX, respectively, with the same E value cutoff of ≤1e−5. All the unigenes were functionally annotated by sequence similarity searches against the Swiss-Prot and Clusters of Orthologous Groups of proteins (COG) databases using BLAST at an E value cutoff of ≤1e−10. A Perl script was written to assign the functional class of each unigene. Putative InterPro protein domains were annotated by InterProScan release 16.0, and functional assignments were mapped onto Gene Ontology (GO). WEGO was used for GO classification and to construct a GO tree. The annotations of the unigenes were compared with the Kyoto Encyclopedia of Genes and Genomes database (KEGG, release 50). A Perl script was used to retrieve KEGG information from the BLAST results and to establish biochemical pathway associations between the unigenes and the database.
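The E value filtering described above can be illustrated with a small Python sketch. It assumes BLAST results in the standard 12-column tabular format, where column 11 holds the E value, and keeps the best hit per unigene at or below the cutoff. The function name and example rows are invented for illustration; the authors used their own Perl scripts.

```python
import csv
import io


def best_hits(tabular_text, max_evalue=1e-5):
    """Return {query: (subject, evalue)} for the lowest-E-value hit per query."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue > max_evalue:
            continue  # hit does not pass the cutoff
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Invented example rows (12 tab-separated columns each).
rows = [
    ["unigene1", "subjA", "98.0", "200", "4", "0", "1", "200", "1", "200", "1e-50", "380"],
    ["unigene1", "subjB", "90.0", "180", "18", "0", "1", "180", "5", "184", "1e-20", "150"],
    ["unigene2", "subjC", "80.0", "60", "12", "0", "1", "60", "1", "60", "0.001", "40"],
    ["unigene3", "subjD", "85.0", "90", "13", "0", "1", "90", "1", "90", "1e-5", "60"],
]
example = "\n".join("\t".join(r) for r in rows)
hits = best_hits(example, max_evalue=1e-5)
print(sorted(hits))
```

Passing `max_evalue=1e-10` would mimic the stricter cutoff used for the Swiss-Prot and COG searches.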
Identification of Differential Gene Expression
Similar to credibility interval approaches reported for the analysis of SAGE data, we employed IDEG6 to identify differentially expressed mRNAs based on their relative abundance, which is reflected by the total count of individual sequence reads in the two libraries. The general chi-squared test was employed for statistical analyses; it has been shown to be one of the most efficient tests. Genes with a P value ≤0.01 were considered to be significantly differentially expressed between the two libraries.
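To make the count-based comparison concrete, the sketch below runs a plain Pearson chi-squared test on a 2×2 table (reads for one gene versus all other reads, in each library). This is an illustration under simplifying assumptions, not IDEG6's implementation: no continuity or multiple-testing correction is applied, and the read counts and library sizes are invented.

```python
import math


def chi2_two_libraries(count_a, total_a, count_b, total_b):
    """Pearson chi-squared statistic and P value (1 d.f.) for one gene."""
    a, b = count_a, total_a - count_a  # library A: gene reads, other reads
    c, d = count_b, total_b - count_b  # library B: gene reads, other reads
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0  # degenerate table, nothing to test
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of chi-squared with 1 d.f. is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Invented example: 10 reads in one library and 60 in the other,
# with 100,000 total reads in each library.
chi2, p = chi2_two_libraries(10, 100_000, 60, 100_000)
print(p <= 0.01)
```

Because the test uses both the gene count and the library total, libraries of unequal size are compared on relative abundance rather than raw counts.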
EST-SSR Detection and Primer Designing
The EST resource is an effective and feasible approach to develop SSR markers. EST-SSR detection was performed using the Perl program MISA. Since it was difficult to distinguish real mononucleotide repeats and single nucleotide stretch errors generated by 454 sequencing, mononucleotide repeats were excluded from this study. The parameters were designed for identifying perfect di-, tri-, tetra-, penta-, and hexanucleotide motifs with a minimum of six, five, five, five, and five repeats, respectively. Primer3 software was used for primer design. The major parameters for designing the primers were as follows: primer length of 18 to 28 bases with 20 bases being the optimum, PCR product size ranging from 100 to 300 bp, optimum annealing temperature of 60 °C, and GC content of 40 to 70 %, with 50 % being the optimum.
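The repeat thresholds above can be sketched with regular expressions. The Python example below is a simplified stand-in for MISA, not its actual algorithm: it scans for perfect repeats only, uses the minimum repeat counts given in the text, and skips motifs that are themselves repeats of a shorter unit (which also excludes mononucleotide runs). The function name and test sequence are invented.

```python
import re

# Motif length -> minimum number of repeats, as stated in the text.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return (motif, repeat_count, start) for perfect SSRs in a sequence."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip motifs built from a shorter unit (e.g. AAA, or ATAT = AT x 2).
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((unit, len(match.group(0)) // unit_len, match.start()))
    return hits


# Invented sequence with one dinucleotide and one trinucleotide repeat.
sequence = "GG" + "AT" * 7 + "CCT" + "CAG" * 5 + "TT"
ssrs = find_ssrs(sequence)
print(ssrs)
```

A run of only five AT units, or a homopolymer stretch of any length, yields no hit under these thresholds.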
Experimental Validation of Selected Transcripts
Subsets of transcripts, including photosynthesis-related genes, were found to have significantly different expression between the two libraries. Therefore, we focused our attention on the photosynthesis-related genes, especially members of the light-harvesting complex (LHC) family. Validation was performed by quantitative real-time PCR (qPCR). The RNAs derived from the samples used for the construction of the cDNA libraries were used. Reverse transcription was performed using an oligo(dT) primer and SuperScript™ II Reverse Transcriptase (Invitrogen Inc., Carlsbad, CA) according to the manufacturer's instructions. The qPCR was performed with an ABI StepOne Plus Real-Time PCR System (Applied Biosystems, USA) using SYBR Green (Takara) according to the manufacturer's instructions. The primers used in qPCR are listed in Additional File 1 (The primers used in the qPCR).
Results and Discussion
Neutral Lipid Analysis of Nannochloropsis sp. in Nitrogen-Free Nutrient Medium
Table: Percentage value of all detected fatty acids (percent w/w TFA) of Nannochloropsis sp. Each row is one fatty acid group; the group labels and the heading of the first value column did not survive.

(first column)     N− 24 h          N− 48 h
0.32 ± 0.03        0.22 ± 0.01      0.21 ± 0.03
5.36 ± 0.4         4.88 ± 0.15      4.01 ± 0.15
33.76 ± 2.3        35.08 ± 3.05     36.78 ± 1.64
27.51 ± 1.56       29.63 ± 2.03     30.34 ± 1.90
0.83 ± 0.02        0.89 ± 0.03      0.85 ± 1.0
6.87 ± 0.43        8.88 ± 1.01      9.2 ± 0.4
1.35 ± 0.01        1.36 ± 0.09      1.04 ± 0.08
0.09 ± 0.00        0.11 ± 0.00      0.08 ± 0.01
1.59 ± 0.03        0.98 ± 0.01      0.78 ± 0.03
18.56 ± 0.3        14.04 ± 0.34     12.03 ± 1.0
2.06 ± 0.2         1.88 ± 0.05      2.52 ± 0.08
1.7 ± 0.1          2.05 ± 0.06      2.16 ± 0.15
Generally, under nitrogen deprivation, microalgae favor the synthesis of neutral lipids more than polar lipids [28, 46, 48]. These neutral lipids usually serve as structural components of the cytoplasm for the maintenance of cells under nitrogen deprivation. It is also believed that fatty acids, which are incorporated into TAG, are used as an electron sink by the cell and as a way to restore the pool of NADP+ when cell growth and division are impaired due to nutrient limitation.
Sequence Generation and Assembly
Table: Sequencing result summary for Nannochloropsis sp. NL117 (cluster size in unigene, number of unigenes, percent of unigenes (%), and maximum unigene size).
The average length of our ESTs was shorter than that obtained from other species [21, 22, 50], despite using the same sequencing technology, but similar to that observed for Nannochloropsis gaditana. More than 80 % were between 100 and 300 bp, and the results of a replicate experiment were in a similar range. This indicated that the result was due not to technical factors but to the specific organism.
Functional Annotation of the Nannochloropsis Transcriptome by Sequence Comparison with Public Databases
To infer putative function, unique sequences were first compared with the sequences in the NCBI non-redundant nucleotide database (NT) using the BLASTN algorithm (http://www.ncbi.nlm.nih.gov/) and the non-redundant protein database (NR) (http://www.ncbi.nlm.nih.gov/) using the BLASTX algorithm. With the E value cutoff set at 1e−5, of the 34,097 unigenes, 5,761 (16.9 %) had significant matches in the NT database, and 3,851 (11.3 %) had significant hits in the NR database (shown in Additional File 2).
Among the unigenes with database hits, the most abundant transcripts included those encoding glyceraldehyde-3-phosphate dehydrogenase (GAPDH). GAPDH, a Calvin cycle enzyme involved in carbon fixation, has been investigated in a wide range of algal groups. Transcripts encoding the chloroplast photosystem II 33-kDa oxygen-evolving protein PsbO and coproporphyrinogen III oxidase were also highly expressed, and these proteins may play a role in photosynthesis.
Comparison of Transcriptomes
Identification of Differential Gene Expression
Furthermore, genes with significantly different expression levels between the two libraries were also investigated by searching against the NR and NT databases. A large number of the genes had no known function or no similarity to known genes. This was mainly due to the relatively small amount of data from Eustigmatophytes available in the public databases; alternatively, the unmatched unigenes may represent species-specific novel genes. Further studies are required to identify novel genes used in extreme conditions. Generally, the photosynthetic efficiency decreases following nitrogen deprivation in microalgae [7, 53]. In addition, the abundance of transcripts encoding photosynthesis-related proteins has been reported to be substantially reduced following nitrogen deprivation. Interestingly, our analysis revealed that several genes associated with the photosystems showed a greater than 2-fold upregulation in nitrogen-deprived cells, including those encoding the light-harvesting complex protein (greater than 6-fold), photosystem II oxygen-evolving protein PsbO (2.6-fold), and violaxanthin/chlorophyll a binding protein precursor (NANVCP) (2.7-fold). These genes may play a protective role in dissipating excess excitation energy due to the lowered photosynthetic efficiency under stress conditions.
We also found that several transcripts encoding enzymes involved in nitrogen metabolism were upregulated, such as ferredoxin-nitrite reductase (greater than 29-fold), glutamine synthetase (greater than 1.9-fold), glutamine amidotransferase, carbon–nitrogen ligase activity, and glutamine phosphoribosylpyrophosphate amidotransferase. Nitrate reductase is activated in response to elevated sugar and hexose phosphate levels during photosynthesis [54, 55]. Nitrite reductase catalyzes the reduction of nitrite to ammonium; glutamine synthetase is involved in the ATP-dependent conversion of ammonium and glutamate to glutamine; and ferredoxin-dependent glutamine oxoglutarate amino transferase is required for the conversion of glutamine and 2-oxoglutarate to glutamate. These reactions occur within the chloroplast. Previous studies and our study have shown that nitrogen deprivation results in the accumulation of lipids [24, 26, 27]. However, in this study, only a small fraction of the transcripts encoding the enzymes involved in lipid synthesis (fatty acid elongase, acetyl-CoA carboxylase, N-acetyltransferase, long-chain-fatty-acid-CoA ligase) were upregulated under nitrogen deprivation. An acetyl-CoA carboxylase (ACCase) is generally considered to catalyze the first reaction of the fatty acid biosynthetic pathway, the formation of malonyl-CoA from acetyl-CoA and CO2. N-Acetyltransferase is an enzyme that catalyzes the transfer of acetyl groups from acetyl-CoA to arylamines. These enzymes were involved in the fatty acid de novo synthesis pathway in chloroplasts. However, the transcript levels of many lipid synthesis genes were not upregulated. Based on a previous report, lipase expression or activity may also be controlled at the post-transcriptional level, including translational regulation and post-translational modifications of the encoded proteins. Some of the lipases may have a similar regulatory pattern.
Thus, no significant changes were observed when the cells were under stress caused by nitrogen deprivation. Other genes encoding enzymes of primary metabolism also showed changes in transcript abundance. The transcript abundances of glyceraldehyde-3-phosphate dehydrogenase and pyruvate dehydrogenase, which are involved in the synthesis of acetyl-CoA, a precursor of fatty acid biosynthesis, increased. Therefore, these enzymes could contribute to some extent to the changing fatty acid profiles of Nannochloropsis sp. following nitrogen deprivation. Some transcripts encoding transcription factors were also highly expressed, and these proteins may play a role in resistance to nitrogen starvation stress.
Among the downregulated genes, except for a number of genes with unknown function, several genes associated with protein biosynthesis were downregulated following nitrogen deprivation. For example, a subset of ribosomal proteins (L26, L10, L17, L26, and S16) were significantly decreased, which may imply that protein synthesis was influenced following nitrogen deprivation.
Biological pathway assignments were performed according to KEGG mapping. First, the 34,097 unique sequences were compared using BLASTX with an E value cutoff of <1e−5 against the KEGG database. Of these unique sequences, 5,364 had significant matches in the database. Among these, 1,578 unique sequences having enzyme commission numbers were assigned to metabolic pathways. The KEGG metabolic pathways that were significantly represented by Nannochloropsis unique sequences were energy metabolism, amino acid metabolism, translation, metabolism of cofactors and vitamins, transport and catabolism, nucleotide metabolism, and lipid metabolism. In the subclass of energy metabolism, the greatest number of unigenes was mapped to oxidative phosphorylation and carbon fixation in photosynthetic organisms. In addition, in the subclass of metabolism of cofactors and vitamins, the greatest number of unigenes was mapped to porphyrin and chlorophyll metabolism (shown in Additional File 3). These pathways were related to photosynthesis. Because of the lipid production by Nannochloropsis sp. NL117, we focused on lipid metabolic genes. Some transcripts encoding acetyl/propionyl-CoA carboxylase, acyl-CoA dehydrogenase, and ACP dehydratase were involved in the fatty acid de novo synthesis pathway in chloroplasts. Generally, genes involved in fatty acid synthesis had low expression levels. The results also suggest that the buildup of precursors to acetyl-CoA may play a more significant role in TAG synthesis than the actual enzyme levels of acetyl-CoA carboxylases per se. A similar conclusion was reached for Phaeodactylum tricornutum by Valenzuela et al.
Photosynthetic Response to Nitrogen Deprivation
Identification of Simple Sequence Repeat Markers
The identified microsatellite markers derived from ESTs are described as functional markers and offer more utility than markers from anonymous genomic regions [65–67]. In this study, we performed a general screen on the Nannochloropsis unigene data set for the presence of SSRs. A total of 3,397 SSRs were identified (shown in Additional File 4: SSR markers and designed primers). We excluded mononucleotide SSRs from our analysis because of the common homopolymer errors that occur in 454 sequencing data. The majority of the SSRs were trinucleotide (2,086) and dinucleotide (929) repeats, followed by tetranucleotide (295), pentanucleotide (38), and hexanucleotide (49) repeats. These SSR markers offer a valuable resource for further genetic investigations. To the best of our knowledge, this is the first attempt to develop large numbers of SSR markers using the EST database for Nannochloropsis. Since these markers were developed based on conserved expressed sequences across the Nannochloropsis genus, they may be valuable for functional analysis of candidate genes because part of these EST sequences was derived from nitrogen-deprived cells, which can accumulate oils. Also, molecular markers have great potential to speed up the process of developing improved cultivated strains. Our work has established a biotechnological platform for future research.
In conclusion, we have described the global analysis of the Nannochloropsis transcriptome during nitrogen starvation using massively parallel pyrosequencing. By comparing the induced transcript levels, we expected to capture the metabolic changes leading to neutral lipid accumulation. Undoubtedly, the genes involved were not completely identified. Also, the limited algal gene resources made this work more challenging. Functional characterization of selected genes is currently being performed. This is only a first step in understanding the molecular mechanisms of lipid formation and accumulation in algae. We expect other groups will also mine this data set, and all EST contigs obtained in this study are available through a file in the supplemental data (Additional File 5: the sequence data of all the unigenes).
The authors thank the Beijing Institutes of Life Science, Chinese Academy of Sciences (BIOLS), for kind assistance in bioinformatic analysis. This work was supported by the Shandong Science and Technology plan project (2011GHY11528), the Specialized Fund for the Basic Research Operating Expenses Program (20603022012004), the National Natural Science Foundation of China (41176153, 31000135, 40972162), the Natural Science Foundation of Shandong Province (2009ZRA02075), the Qingdao Municipal Science and Technology plan project (11-3-1-5-hy, 11-2-4-3-(5)-jch), and the National Marine Public Welfare Research Project (200805069).