BioEnergy Research, Volume 6, Issue 2, pp 494–505

De Novo Sequencing and Global Transcriptome Analysis of Nannochloropsis sp. (Eustigmatophyceae) Following Nitrogen Starvation

Authors

  • Chengwei Liang
    • Qingdao University of Science and Technology
  • Shaona Cao
    • Qingdao Agricultural University
  • Xiaowen Zhang
    • Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences
  • Baohua Zhu
    • Ocean University of China
  • Zhongliang Su
    • Qingdao University of Science and Technology
  • Dong Xu
    • Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences
  • Xiangyu Guang
    • School of Ocean Sciences, China University of Geosciences
    • Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences
Article

DOI: 10.1007/s12155-012-9269-0

Cite this article as:
Liang, C., Cao, S., Zhang, X. et al. Bioenerg. Res. (2013) 6: 494. doi:10.1007/s12155-012-9269-0

Abstract

Nannochloropsis sp. is an economically and nutritionally important microalga. Recently it has been demonstrated that Nannochloropsis sp. has significant potential for biofuel production. To determine the mechanisms of lipid formation and accumulation during nitrogen starvation, a transcriptomic study was performed to compare gene expression during growth with and without nitrogen. Digital expression analysis identified 1,855 differentially expressed genes between cells grown under nitrogen-replete and nitrogen-deprived conditions; this provided novel insights into the molecular mechanisms of lipid formation by Nannochloropsis sp. under stress. As expected, nitrogen deprivation induced genes involved in nitrogen metabolism and lipid biosynthesis. Although the chlorophyll content decreased following nitrogen deprivation, a subset of genes putatively encoding light-harvesting complex (LHC) proteins was upregulated. These upregulated LHCs may play a role in photoprotection. The sequence data were confirmed using reverse transcription polymerase chain reaction (RT-PCR) and quantitative real-time RT-PCR. The expression of a number of genes involved in acetyl-CoA metabolism was also affected by nitrogen deprivation, which may indirectly change the fatty acid profile. Overall, we found low gene expression levels for fatty acid synthesis, suggesting that the buildup of precursors for the acetyl-CoA carboxylases may play a more significant role in TAG synthesis than the actual enzyme levels of acetyl-CoA carboxylases per se. The changes in transcript abundance in Nannochloropsis sp. following nitrogen deprivation provide a potential resource for exploring the molecular mechanisms of lipid formation and accumulation. Furthermore, a set of simple sequence repeat motifs was identified from the expressed sequence tags, providing useful markers for further genetic analysis.

Keywords

Nannochloropsis sp.; Transcriptome; Nitrogen starvation

Introduction

The Nannochloropsis species are important in marine aquaculture due to their potential for wide-scale production for nutritional purposes. They are a rich source of protein, carotenoids, lipids, carbohydrates, vitamins, and minerals, and could help fill the “protein gap” and play a role in feeding an ever-expanding world population [1–3]. Recently, several research groups have suggested that these species could be a promising source of oil for biofuel production [4–6] due to the high levels of saturated and monounsaturated fatty acids produced under certain stressful conditions [7].

However, the production of neutral lipids from naturally occurring microalgal strains has often resulted in much lower yields than the theoretical maximum [8], making microalgae-based biofuel production prohibitively expensive. The limited success of algal lipid production stems primarily from the lack of understanding of the metabolic pathways regulating the algal lipid metabolism in general and the neutral lipid synthesis and accumulation in particular [8, 9]. The accumulation of lipids under stress conditions in Nannochloropsis usually involves changes in the expression profile. Therefore, exploring the molecular basis of responses to nitrogen starvation in Nannochloropsis sp. will aid in the understanding of the mechanisms of lipid metabolism. Abundant transcriptome data are required for this purpose.

Although the genomics approach has already been demonstrated to be effective in understanding the biological pathways of many algae [10–17], expressed sequence tag (EST) sequencing represents a rapid and relatively economical method for analyzing the transcribed regions of the genome [18]. Indeed, EST analyses have identified many genes involved in plant secondary metabolism [18]. Recent advances in next-generation sequencing technologies, such as 454 sequencing, have allowed the generation of large-scale ESTs efficiently and cost-effectively [19, 20]. The 454 sequencing technology has experienced a rapid improvement in throughput, read length, and accuracy. The newest 454 sequencing platform, the GS 20 Titanium, can generate one million reads with an average length of 400 bases at 99.5 % accuracy per run [21]. This sequencing method, which has been widely applied to many species [17, 22, 23], holds great potential for the discovery of genes and genetic markers in non-model species through de novo transcriptome sequence assembly.

Previous studies have shown that nitrogen deprivation affects lipid formation and accumulation in many microalgae [9, 24–28]. Previous studies [17, 26, 28] and the data presented in this study show that Nannochloropsis sp. increases the number of lipid droplets under nitrogen-deprived conditions. The goal of this study was to investigate the major changes in gene expression following nitrogen deprivation. We expected the pattern of induced transcript levels to reflect the metabolic changes. The results provide valuable information for further understanding the molecular mechanisms of lipid formation and accumulation in algae.

Materials and Methods

Microalga and Growth Conditions

The Nannochloropsis sp. strain NL117 was obtained from the alga culture collection of the Ocean University of China and was maintained in our laboratory. Sequence analysis of the 18S rRNA gene showed that the strain is closely related (97 % identity) to Nannochloropsis maritima. The strain was cultivated on f/2 medium [29]. Nannochloropsis cells were grown photoautotrophically at 23 °C with 50 μmol m−2 s−1 white light in a 1-L triangle flask containing 600 mL f/2 medium. When the optical density (OD) reached 0.4 at a wavelength of 720 nm, 100 mL of culture was harvested for lipid analysis. To induce nitrogen deprivation, 500 mL of culture was collected by vacuum filtration using 0.45-μm fiber filters and resuspended in 500 mL nitrogen-free f/2 medium in a 1-L triangle flask without shaking. The cells were harvested 24 and 48 h after transfer into the nitrogen-free f/2 medium. From these samples, 100 mL was collected and immediately frozen in liquid nitrogen for RNA extraction. A further 150 mL of culture was harvested for analysis of lipid accumulation, fatty acid composition and concentrations, or chlorophyll content. All cultures were grown in triplicate.

Lipid Analysis of Nannochloropsis sp. in Nitrogen-Free Nutrient Medium

The accumulation of neutral lipids was measured using the fluorescent dye BODIPY505/515 (Invitrogen, USA), a lipid-soluble fluorescent probe that possesses several characteristics advantageous for in situ screening [30, 31]. After the algal cells were stained with BODIPY505/515, a Nikon Eclipse 80i microscope (Nikon, Japan), with blue light (488 nm) as the excitation wavelength, was used to image and quantify the lipid bodies in the algal cells. A Nikon CCD DS-file digital camera (Nikon, Japan) was used to capture the images. We also measured the relative neutral lipid content of the cells cultured in complete and nitrogen-free medium. When the OD reached 0.4 at a wavelength of 720 nm, the cells were harvested for the experiment under the culture conditions described above. From each sample, 5 mL of the algal suspension was stained with 1 μL of 10 mM BODIPY505/515 dissolved in anhydrous dimethyl sulfoxide (DMSO), giving a final concentration of 2 μM, and then excited at 475 nm before measuring the emission at 510 nm using a spectrofluorimeter (ISS Inc., Champaign, IL). Cell concentrations were determined using a hemacytometer.
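The stated final dye concentration follows from a simple dilution calculation. The sketch below reproduces that arithmetic; the function name is hypothetical, and the only inputs are the stock concentration and volumes given in the text.

```python
def final_concentration_uM(stock_mM: float, stock_volume_uL: float,
                           sample_volume_mL: float) -> float:
    """Final dye concentration (in µM) after adding a small stock volume to a sample."""
    stock_uM = stock_mM * 1000.0                 # mM -> µM
    stock_volume_mL = stock_volume_uL / 1000.0   # µL -> mL
    total_volume_mL = sample_volume_mL + stock_volume_mL
    return stock_uM * stock_volume_mL / total_volume_mL


# Values from the staining protocol: 1 µL of 10 mM stock added to 5 mL of suspension.
conc = final_concentration_uM(stock_mM=10.0, stock_volume_uL=1.0, sample_volume_mL=5.0)
print(f"{conc:.2f} µM")
```

The result rounds to the 2 μM quoted in the protocol; the added 1 μL changes the total volume by only 0.02 %.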

Samples of the cells from the two different conditions were lyophilized and analyzed to determine the fatty acid composition and concentrations by GC-MS, using the protocol described by Chi et al. [32]. The experiments were repeated in triplicate. Fatty acid concentrations were normalized against the measured biological dry mass (BDM) values to eliminate any slight differences in the BDM values. The average and standard deviation were calculated from the triplicate samples.

Chlorophyll Content Analysis

To measure the effect of nitrogen starvation on photosynthesis, we analyzed the chlorophyll content. Chlorophyll a was extracted from Nannochloropsis cells (OD = 0.4) using acetone. The chlorophyll content was determined with a UV–vis spectrophotometer (Purkinje General, China) using the coefficients given by Solovchenko et al. [33]:
$$ C_{\mathrm{chl}\,a}\left(\mathrm{mg\,L^{-1}}\right) = 13.34\,A_{666} - 4.85\,A_{650} $$
$$ C_{\mathrm{chl}\,b}\left(\mathrm{mg\,L^{-1}}\right) = 24.58\,A_{650} - 6.65\,A_{666} $$
$$ \mathrm{Total}\ C_{\mathrm{chl}} = C_{\mathrm{chl}\,a} + C_{\mathrm{chl}\,b} $$
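As an illustration, the three expressions can be evaluated directly from the absorbances at 666 and 650 nm. This is a minimal sketch; the function name and the absorbance readings in the example are hypothetical and are not measurements from this study.

```python
def chlorophyll_mg_per_L(a666: float, a650: float) -> dict:
    """Apply the coefficients quoted above to absorbances at 666 nm and 650 nm."""
    chl_a = 13.34 * a666 - 4.85 * a650
    chl_b = 24.58 * a650 - 6.65 * a666
    return {"chl_a": chl_a, "chl_b": chl_b, "total": chl_a + chl_b}


# Hypothetical absorbance readings, for illustration only.
example = chlorophyll_mg_per_L(a666=0.50, a650=0.20)
for name, value in example.items():
    print(f"{name}: {value:.3f} mg/L")
```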

cDNA Library Construction and Sequencing

The frozen cells were ground using a pestle and mortar prior to RNA extraction. Total RNA was extracted using the TRIzol kit (Invitrogen). The quality of the total RNA was verified on a 1.4 % (w/v) agarose–MOPS–formaldehyde denaturing gel and by assessing the A260/280 and A260/230 ratios using the NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, USA). RNA was purified to remove tRNA and rRNA and to enrich the mRNA, using an mRNA purification kit (Promega) according to the manufacturer's instructions. mRNA was reverse transcribed by Powerscript™ II (Takara) using the PCR primer SMART IV™ oligonucleotide (5′-AAGCAGTGGTATCAACGCAGAGTGGCCATTACGGCCGGG-3′) and CDS III/3′ PCR primer (5′-ATTCTAGAGGCCGAGGCGGCCGACATG-d(T)30N-1N-3′). Long-distance PCR for double-strand cDNA amplification was performed with LA Taq polymerase (Takara) using 25 PCR cycles (95 °C for 30 s, 68 °C for 8 min) according to the SMART™ cDNA library construction kit user manual. Finally, the double-stranded cDNAs were purified using a DNA purification kit (Qiagen) to generate clean cDNA. The cDNA samples were normalized using the Trimmer Direct cDNA normalization kit (Evrogen) according to the manufacturer's instructions.

Approximately 10 μg of sheared cDNA was used for GS20–454 sequencing. The cDNA sample was end repaired and adapter ligated according to the method described by Margulies et al. [34]. Streptavidin bead enrichment, DNA denaturation, and emulsion PCR were also performed as described by Margulies et al. [34]. A half-plate sequencing run was performed for each sample at the Virginia Bioinformatics Institute Core Laboratory Facility according to the manufacturer's instructions (Roche-454 Life Sciences, Branford, CT, USA) at SeqWright (USA).

Sequence Processing and Data Assembly

Following 454 sequencing, the raw data were processed with a Perl program to remove low-quality regions, nonsense sequences (adaptor and primer sequences), and poly(A) sequences. The high-quality reads were assembled with CAP3 [35] to construct unique consensus sequences. The parameters used for CAP3 were as follows: -o 40 -p 95 -y 50.
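The cleaning step can be pictured with a short sketch. The Perl program used in the study is not described in detail, so the following is only an assumed illustration: the function names and the poly(A) threshold of 10 bases are hypothetical, and the 100 bp minimum length follows the definition of high-quality ESTs given with Table 2. Quality trimming and adaptor removal are not shown.

```python
import re

MIN_LENGTH = 100   # reads shorter than 100 bp are discarded (definition given with Table 2)
MIN_POLY_A = 10    # hypothetical threshold for calling a trailing poly(A) run

POLY_A_TAIL = re.compile(r"A{%d,}$" % MIN_POLY_A)


def clean_read(seq: str):
    """Strip a trailing poly(A) run; return None if the read is then too short."""
    trimmed = POLY_A_TAIL.sub("", seq.upper())
    return trimmed if len(trimmed) >= MIN_LENGTH else None


def clean_reads(reads: dict) -> dict:
    """Keep only reads that pass the length filter after trimming."""
    cleaned = {}
    for name, seq in reads.items():
        trimmed = clean_read(seq)
        if trimmed is not None:
            cleaned[name] = trimmed
    return cleaned


# Toy reads, for illustration only.
reads = {
    "read1": "ACGT" * 30 + "A" * 20,  # 120 bp body plus poly(A) tail: kept, tail removed
    "read2": "ACGT" * 10,             # 40 bp: discarded
}
kept = clean_reads(reads)
print({name: len(seq) for name, seq in kept.items()})
```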

Functional Annotation and Pathway Prediction Classification

Unigenes were compared with the NCBI non-redundant nucleotide and non-redundant protein databases using BLASTN and BLASTX, respectively, with the same E value cutoff of ≤1e−5. All the unigenes were functionally annotated by sequence similarity searches against the Swiss-Prot [36] and Clusters of Orthologous Groups of proteins (COG) databases [37] using BLAST at an E value cutoff of ≤1e−10. A Perl script was written to assign the functional class of each unigene. Putative InterPro protein domains [38] were annotated by InterProScan [39] release 16.0, and functional assignments were mapped onto Gene Ontology (GO) [40]. WEGO [41] was used for GO classification and to construct a GO tree. The annotations of the unigenes were compared with the Kyoto Encyclopedia of Genes and Genomes database (KEGG, release 50) [42]. A Perl script was used to retrieve Kyoto information from blast results and to establish biochemical pathway associations between the unigenes and the database.
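Applying an E value cutoff to BLAST results amounts to a simple filter over the hit table. The sketch below assumes the 12-column tabular output of BLAST+ (`-outfmt 6`, where column 11 is the E value and column 12 the bit score); the paper does not state which output format was used, and the function name and sequence identifiers are hypothetical.

```python
import csv
import io


def best_hits(tabular_text: str, max_evalue: float = 1e-5) -> dict:
    """Return the highest-scoring hit per query among hits passing the E value cutoff."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Toy hit table with hypothetical identifiers.
table = (
    "unigene1\tsubjA\t85.0\t120\t18\t0\t1\t120\t10\t129\t1e-30\t120.0\n"
    "unigene1\tsubjB\t70.0\t100\t30\t0\t1\t100\t5\t104\t1e-8\t60.0\n"
    "unigene2\tsubjC\t40.0\t50\t30\t0\t1\t50\t1\t50\t0.5\t25.0\n"
)
hits = best_hits(table, max_evalue=1e-5)
print(hits)
```

In this toy table only `unigene1` is retained, because the single hit for `unigene2` has an E value above the cutoff.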

Identification of Differential Gene Expression

Similar to credibility interval approaches reported for the analysis of SAGE data [43], we employed IDEG6 [44] to identify differentially expressed mRNAs based on their relative abundance, which is reflected by the total count of individual sequence reads in the two libraries. The general chi-squared test was employed for statistical analyses. This has been proven to be one of the most efficient tests [44]. Expressions of genes with a P value ≤0.01 were considered to be significantly different between the two libraries.
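To make the idea concrete, the following sketch applies a Pearson chi-squared test to a 2 × 2 table of read counts (reads from one unigene versus all other reads, in each library). It is not the IDEG6 implementation and may differ from it in detail; the function name and the example counts of 10 and 60 reads are hypothetical, while the library sizes are the numbers of high-quality ESTs reported in Table 2.

```python
import math


def chi_squared_2x2(count_a: int, total_a: int, count_b: int, total_b: int):
    """Pearson chi-squared test (1 d.f., no continuity correction) on read counts."""
    a, b = count_a, count_b
    c, d = total_a - count_a, total_b - count_b   # reads from all other unigenes
    n = total_a + total_b
    gene_reads, other_reads = a + b, c + d
    if gene_reads == 0 or other_reads == 0:
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / (gene_reads * other_reads * total_a * total_b)
    # For one degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


N_REPLETE = 237_932    # high-quality ESTs, nitrogen-replete library (Table 2)
N_DEPRIVED = 213_211   # high-quality ESTs, nitrogen-deprived library (Table 2)

# Hypothetical unigene with 10 reads in the replete and 60 in the deprived library.
chi2, p_value = chi_squared_2x2(10, N_REPLETE, 60, N_DEPRIVED)
print(f"chi2 = {chi2:.1f}, significant at P <= 0.01: {p_value <= 0.01}")
```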

EST-SSR Detection and Primer Designing

ESTs are an effective and practical resource for developing SSR markers. EST-SSR detection was performed using the Perl program MISA [45]. Since it was difficult to distinguish real mononucleotide repeats from single-nucleotide stretch errors generated by 454 sequencing, mononucleotide repeats were excluded from this study. The parameters were set to identify perfect di-, tri-, tetra-, penta-, and hexanucleotide motifs with a minimum of six, five, five, five, and five repeats, respectively. Primer 3 software [46] was used for primer design. The major parameters for designing the primers were as follows: primer length of 18 to 28 bases with 20 bases being the optimum, PCR product size ranging from 100 to 300 bp, optimum annealing temperature of 60 °C, and GC content of 40 to 70 %, with 50 % being the optimum.
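The repeat thresholds above can be expressed as a small regular-expression search. This sketch is not MISA itself and ignores details such as compound repeats; the function names and the toy sequence are hypothetical. Motifs that are themselves repeats of a shorter unit (for example `AA` or `ATAT`) are skipped, which also excludes mononucleotide runs.

```python
import re

# Minimum repeat number per motif length (di- to hexanucleotide), as stated above.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif: str) -> bool:
    """True if the motif is not a repetition of a shorter unit."""
    n = len(motif)
    for k in range(1, n):
        if n % k == 0 and motif[:k] * (n // k) == motif:
            return False
    return True


def find_ssrs(seq: str) -> list:
    """Return (motif, repeat_count, start) for each perfect SSR found."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((motif, len(match.group(0)) // size, match.start()))
    return hits


# Toy sequence: (AT)7, (CAG)5, and a mononucleotide run that must be ignored.
seq = "GGC" + "AT" * 7 + "CCGT" + "CAG" * 5 + "TTG" + "A" * 12
print(find_ssrs(seq))
```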

Experimental Validation of Selected Transcripts

Subsets of transcripts, including photosynthesis-related genes, were found to have significantly different expression between the two libraries. Therefore, we focused our attention on the photosynthesis-related genes, especially members of the light-harvesting complex (LHC) family. Validation was performed by quantitative real-time PCR (qPCR) using the same RNA samples from which the cDNA libraries were constructed. Reverse transcription was performed using an oligo(dT) primer and SuperScript™ II Reverse Transcriptase (Invitrogen Inc., Carlsbad, CA) according to the manufacturer's instructions. The qPCR was performed with an ABI StepOne Plus Real-Time PCR System (Applied Biosystems, USA) using SYBR Green (Takara) according to the manufacturer's instructions. The primers used in qPCR are listed in Additional File 1 (The primers used in the qPCR).

Results and Discussion

Neutral Lipid Analysis of Nannochloropsis sp. in Nitrogen-Free Nutrient Medium

Nannochloropsis sp. is considered promising for mass cultivation for biofuel production due to its high capacity to produce storage triacylglycerols (TAG) under conditions of nutrient starvation [6, 20]. In this study, we examined the responses of Nannochloropsis sp. to nitrogen starvation. The lipophilic fluorescent dye BODIPY505/515 was used to determine the algal lipid content. Light (a) and fluorescence microscopy (b, c) images of Nannochloropsis sp. cells are shown in Fig. 1. After staining with BODIPY505/515, the lipid bodies in the algal cells had a characteristic green fluorescence (Fig. 1b, c) and could be clearly identified. The nitrogen-starved cells had larger lipid droplets (Fig. 1c) than the non-stressed cells (Fig. 1b). The relative lipid content of Nannochloropsis sp. cells grown under nitrogen-deficient and nitrogen-sufficient conditions was also determined by spectrofluorimetry. The peak emission intensity of BODIPY505/515 in anhydrous DMSO occurs near 510 nm when excited at 475 nm. Using these excitation and emission conditions, we measured the relative neutral lipid content of cells under the two different conditions. The cells grown under nitrogen-deficient conditions exhibited two times more fluorescence than cells grown under nitrogen-sufficient conditions after 24 h (Fig. 2). This result was consistent with the images obtained using the microscope (Fig. 1). Together, the results demonstrate that nitrogen deficiency significantly promoted lipid accumulation after 24 h of cultivation in the nitrogen-deficient medium.
https://static-content.springer.com/image/art%3A10.1007%2Fs12155-012-9269-0/MediaObjects/12155_2012_9269_Fig1_HTML.gif
Fig. 1

Wide-field epifluorescence images of Nannochloropsis sp. cells infused with BODIPY505/515. a Light micrograph; b and c fluorescent micrograph of algae stained with BODIPY505/515; b shows cells cultured in the nitrogen-replete medium, c shows cells cultured in the nitrogen-deprived medium after 24 h. Scale bars = 5 μm

https://static-content.springer.com/image/art%3A10.1007%2Fs12155-012-9269-0/MediaObjects/12155_2012_9269_Fig2_HTML.gif
Fig. 2

Relative emission intensity of BODIPY505/515 excited at 475 nm and recorded at 510 nm after addition to a suspension of live Nannochloropsis sp. cells (OD = 0.4). 0 h refers to Nannochloropsis sp. NL117 cells sampled immediately after transfer into the nitrogen-deprived medium; 24 h refers to Nannochloropsis sp. NL117 cells cultured in the nitrogen-deprived medium for 24 h

To identify changes in the fatty acid composition of the stressed Nannochloropsis sp. cells, lipids were extracted from lyophilized cells using hexane; a typical profile is given in Table 1. C16:0 and C16:1 represent about 60–70 % of the total fatty acids. Although nitrogen limitation clearly increased the lipid concentration (Figs. 1 and 2), it had only a small influence on the fatty acid composition. C16:0, C16:1ω-7, and C18:1ω-9 showed a slight increase, whereas EPA and C20:4ω-3 decreased under nitrogen-deprived conditions. The content of the other groups of fatty acids remained nearly constant. EPA is a fatty acid located in the chloroplast membrane. Under nutritional limitations, such as nitrogen limitation, cells are unable to resynthesize such membrane components and/or keep their concentration constant [7]. Thus, we inferred that nitrogen limitation would influence the photosystem.
Table 1

Percentage value of all detected fatty acids (percent w/w TFA) of Nannochloropsis sp.

| Fatty acid group / cultivation conditions | N+ | N− 24 h | N− 48 h |
| --- | --- | --- | --- |
| C12:0 | 0.32 ± 0.03 | 0.22 ± 0.01 | 0.21 ± 0.03 |
| C14:0 | 5.36 ± 0.4 | 4.88 ± 0.15 | 4.01 ± 0.15 |
| C16:0 | 33.76 ± 2.3 | 35.08 ± 3.05 | 36.78 ± 1.64 |
| C16:1ω-7 | 27.51 ± 1.56 | 29.63 ± 2.03 | 30.34 ± 1.90 |
| C18:0 | 0.83 ± 0.02 | 0.89 ± 0.03 | 0.85 ± 1.0 |
| C18:1ω-9 | 6.87 ± 0.43 | 8.88 ± 1.01 | 9.2 ± 0.4 |
| C18:2ω-6 | 1.35 ± 0.01 | 1.36 ± 0.09 | 1.04 ± 0.08 |
| C18:3ω-6 | 0.09 ± 0.00 | 0.11 ± 0.00 | 0.08 ± 0.01 |
| C20:4ω-3 | 1.59 ± 0.03 | 0.98 ± 0.01 | 0.78 ± 0.03 |
| C20:5 (EPA) | 18.56 ± 0.3 | 14.04 ± 0.34 | 12.03 ± 1.0 |
| Other (sat) | 2.06 ± 0.2 | 1.88 ± 0.05 | 2.52 ± 0.08 |
| Other (unsat) | 1.7 ± 0.1 | 2.05 ± 0.06 | 2.16 ± 0.15 |

Values are means ± standard deviations, calculated from three independent biological replicates

sat, saturated; unsat, unsaturated

Generally, under nitrogen deprivation, microalgae favor the synthesis of neutral lipids more than polar lipids [28, 46, 48]. These neutral lipids usually serve as structural components of the cytoplasm for the maintenance of cells under nitrogen deprivation [49]. It is also believed that fatty acids, which are incorporated into TAG, are used as an electron sink by the cell and as a way to restore the pool of NADP+ when cell growth and division are impaired due to nutrient limitation [8].

Sequence Generation and Assembly

Two cDNA libraries constructed by SMART technology from the total RNA of Nannochloropsis sp. grown with and without nitrogen were subjected to a one-plate run using the 454 GS 20 Titanium platform. This produced 525,891 raw reads with an average sequence length of 200 bases, corresponding to a total of 103.52 Mb. After removing low-quality regions, adaptors, and any possible contamination, we obtained a total of 451,143 high-quality ESTs with an average length of 183 bp. From these reads, 237,932 were from control cells and 213,211 were from the nitrogen-deprived cells (Table 2). The distribution of the high-quality read lengths is shown in Fig. 3.
Table 2

Sequencing result summary

| Sample | Raw: number | Raw: average length | Raw: median length | Raw: total length | High-quality ESTs (a): number | High-quality ESTs (a): average length | High-quality ESTs (a): median length | High-quality ESTs (a): total length |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| N replete | 283,808 | 200.3 | 152 | 56,847,996 | 237,932 | 187.6 | 157.7 | 4,636,048 |
| N deprived | 242,088 | 192.8 | 170 | 46,673,830 | 213,211 | 177.5 | 156.8 | 37,844,955 |

(a) High-quality ESTs: ESTs left after removal of low-quality ends of the sequence traces and other noninformative data, such as short sequences (less than 100 bp) and contaminants
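The effect of the cleaning step can be summarized as the fraction of raw reads retained per library. The short sketch below computes it from the read numbers in Table 2; the variable names are hypothetical.

```python
# Read numbers taken from Table 2.
raw = {"N replete": 283_808, "N deprived": 242_088}
high_quality = {"N replete": 237_932, "N deprived": 213_211}

# Fraction of raw reads kept as high-quality ESTs in each library.
retention = {sample: high_quality[sample] / raw[sample] for sample in raw}
total_high_quality = sum(high_quality.values())

for sample, fraction in retention.items():
    print(f"{sample}: {100 * fraction:.1f} % of raw reads retained")
print(f"Total high-quality ESTs: {total_high_quality:,}")
```

The two high-quality counts add up to the 451,143 ESTs reported in the text.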

https://static-content.springer.com/image/art%3A10.1007%2Fs12155-012-9269-0/MediaObjects/12155_2012_9269_Fig3_HTML.gif
Fig. 3

The distribution of 454 read lengths. a Cells cultured in the nitrogen-replete medium; b cells cultured in the nitrogen-deprived medium

These high-quality reads were assembled into 11,002 contigs and 23,095 singletons using CAP3 [35]. The contigs and the singletons are collectively referred to as unigenes. The distributions of the length and number of the unigenes are summarized in Fig. 4 and Table 3.
https://static-content.springer.com/image/art%3A10.1007%2Fs12155-012-9269-0/MediaObjects/12155_2012_9269_Fig4_HTML.gif
Fig. 4

The distribution of the length of unigenes

Table 3

Cluster size in unigene

| Cluster size (a) | Number of unigenes (Nannochloropsis sp. NL117) | Percent of unigenes (%) |
| --- | --- | --- |
| 1 | 23,095 | 67.7 |
| 2 | 3,002 | 8.8 |
| 3 | 1,554 | 4.6 |
| 4–5 | 1,607 | 4.7 |
| 6–10 | 1,821 | 5.3 |
| 11–20 | 1,398 | 4.1 |
| 21–50 | 1,143 | 3.4 |
| 51–100 | 327 | 1.0 |
| >100 | 150 | 0.4 |
| Max unigene size | 1,093 | |

(a) Cluster size: the number of ESTs that assembled into a unigene
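The counts in Table 3 can be cross-checked against the assembly summary: unigenes with a cluster size of 1 are the singletons, and all larger clusters are contigs. The sketch below recomputes the totals and percentages from the table values; the variable names are hypothetical.

```python
# Number of unigenes per cluster-size class, taken from Table 3.
cluster_counts = {
    "1": 23_095, "2": 3_002, "3": 1_554, "4-5": 1_607, "6-10": 1_821,
    "11-20": 1_398, "21-50": 1_143, "51-100": 327, ">100": 150,
}

total_unigenes = sum(cluster_counts.values())
singletons = cluster_counts["1"]          # unigenes built from a single EST
contigs = total_unigenes - singletons     # unigenes built from two or more ESTs
percent = {size: round(100 * n / total_unigenes, 1) for size, n in cluster_counts.items()}

print(f"unigenes: {total_unigenes:,}; contigs: {contigs:,}; singletons: {singletons:,}")
print(percent)
```

The totals agree with the 11,002 contigs and 23,095 singletons reported for the CAP3 assembly.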

The average length of our ESTs was shorter than that obtained from other species [21, 22, 50], despite using the same sequencing technology, but similar to that observed for Nannochloropsis gaditana [17]. More than 80 % were between 100 and 300 bp, and the results of a replicate experiment were of a similar range. This indicated that the result was not due to technical factors but due to the specific organism.

Functional Annotation of the Nannochloropsis Transcriptome by Sequence Comparison with Public Databases

To infer putative function, unique sequences were first compared with the sequences in the NCBI non-redundant nucleotide database (NT) using the BLASTN algorithm (http://www.ncbi.nlm.nih.gov/) and the non-redundant protein database (NR) (http://www.ncbi.nlm.nih.gov/) using the BLASTX algorithm. With the E value cutoff set at 1e−5, of the 34,097 unigenes, 5,761 (16.9 %) had significant matches in the NT database, and 3,851 (11.3 %) had significant hits in the NR database (shown in Additional File 2).

Among the unigenes with database hits, the most abundant transcript encoded glyceraldehyde-3-phosphate dehydrogenase (GAPDH). GAPDH, a Calvin cycle enzyme involved in carbon fixation, has been investigated in a wide range of algal groups [51]. Transcripts encoding the chloroplast photosystem II 33-kDa oxygen-evolving protein PsbO and coproporphyrinogen III oxidase were also highly expressed, and these proteins may play a role in photosynthesis.

Based on the alignments, unigenes were also functionally annotated by sequence similarity comparison against the database of the Clusters of Orthologous Group proteins. In total, 1,002 unigenes were mapped in this way. The largest proportion of COG-assigned sequences fell into the “post-translational modification, protein turnover, and chaperones” functional classification. The entire data set of amino acid sequences was also compared against the InterPro database of protein families and functional domains [38], and 3,156 were identified as bearing conserved protein domains. The same set of sequences was annotated with GO terms (Fig. 5), resulting in 2,381 functional assignments. This classification scheme was used to assign Nannochloropsis contigs to one of the major GO annotation domains—biological processes, cellular components, and molecular functions, in a species-independent manner. Although this method yielded a significant amount of data, there were fewer database matches compared to when the same method was applied to other species [17, 22, 52]. This result may be due to the relatively small amount of data from eustigmatophytes available in the public databases, or the unmatched unigenes may represent species-specific novel genes.
https://static-content.springer.com/image/art%3A10.1007%2Fs12155-012-9269-0/MediaObjects/12155_2012_9269_Fig5_HTML.gif
Fig. 5

The percentages and counts of Nannochloropsis sp. NL117 unigenes in each functional category. Nannochloropsis sp. unigenes were classified into different functional groups based on a set of plant-specific GO Slims

Comparison of Transcriptomes

Nannochloropsis sp. is a eustigmatophyte alga that is closely related to Phaeophyceae (brown algae) [17]. Among other species of Nannochloropsis, Nannochloropsis sp. NL 117 is closely related to N. maritima and Nannochloropsis oceanica (Fig. 6). To determine which Nannochloropsis sp. NL 117 genes are homologous to those of other algal species, all of the unigenes from Nannochloropsis sp. NL 117 were compared with the NCBI non-redundant protein database using BLASTX with the same E value cutoff of ≤1e−5. Unexpectedly, the most frequent top hits were not to Nannochloropsis sequences. More homologs were found in the brown alga (Ectocarpus siliculosus) and diatoms (Phaeodactylum tricornutum and Thalassiosira pseudonana) (Fig. 7). This analysis is consistent with the results reported by Radakovits et al. [17]. A possible explanation for this result is that relatively few ESTs for Nannochloropsis sp. are available in the public databases.
https://static-content.springer.com/image/art%3A10.1007%2Fs12155-012-9269-0/MediaObjects/12155_2012_9269_Fig6_HTML.gif
Fig. 6

Phylogenetic analysis of Nannochloropsis sp. The tree indicates the relationship between different strains of Nannochloropsis based on the 18S ribosomal RNA sequences

https://static-content.springer.com/image/art%3A10.1007%2Fs12155-012-9269-0/MediaObjects/12155_2012_9269_Fig7_HTML.gif
Fig. 7

Nannochloropsis genes were compared with other previously sequenced algal genomes in the non-redundant protein database using BLASTX. The number of times an organism was the top BLASTX hit (E value less than 1e−5) of a Nannochloropsis gene is indicated

To investigate the change in the expression profile under nitrogen-deprived conditions, we also compared the two transcriptomes from nitrogen-deprived and normal cells, and the results are summarized in Fig. 8. In total, 31.8 % of the unigenes were detected in nitrogen-deprived cells. Only 14.3 % of the unigenes were found in cells grown under both culture conditions. In contrast, 44.2 % of the contigs were present in samples from both growth conditions. The discrepancy between these values for unigenes and contigs is due to the presence of a large number of low-copy singletons.
https://static-content.springer.com/image/art%3A10.1007%2Fs12155-012-9269-0/MediaObjects/12155_2012_9269_Fig8_HTML.gif
Fig. 8

The comparison of transcriptomes between the normal cells and nitrogen-starved cells using digital expression

Identification of Differential Gene Expression

Digital expression profiling has been shown to be a powerful and efficient approach for gene expression analysis at the level of the whole genome [52]. In this study, using statistical tests, a total of 1,855 unigenes were found to be significantly differentially expressed, of which 923 showed higher expression in nitrogen-starved cells. We further identified GO terms for the biological processes encoded by these differentially expressed genes (Fig. 9). The results indicated the broad pathways that were represented in the list of differentially expressed genes based on the KEGG (Additional File 2: the list of the differentially expressed genes). Notably, within the biological process domain, genes in the cellular process and metabolic process groups were enriched in both lists. Furthermore, the matches within the molecular function domain were highest in the binding and catalytic activity groups. Finally, for the cellular component domain, the most frequent matches were within the cell part and cell terms. Each functional category showed similar numbers of up- and downregulated genes. The results indicated that the same biological processes may require different sets of genes.
https://static-content.springer.com/image/art%3A10.1007%2Fs12155-012-9269-0/MediaObjects/12155_2012_9269_Fig9_HTML.gif
Fig. 9

Analysis of differentially expressed genes by GO terms. The left Y-axis of the bar plot represents the percentage of genes, and the right Y-axis represents the gene count within each GO category. All processes listed had enrichment P values <0.01

Furthermore, genes with significantly different expression levels between the two libraries were also investigated by searching against the NR and NT databases. A large number of the genes had no known function or no similarity to known genes. This may be due to the relatively small amount of data from eustigmatophytes available in the public databases, or the unmatched unigenes may represent species-specific novel genes. Further studies are required to identify novel genes used in extreme conditions. Generally, the photosynthetic efficiency decreased following nitrogen deprivation in microalgae [7, 53]. In addition, the abundance of transcripts encoding photosynthesis-related proteins was substantially reduced following nitrogen deprivation [54]. Interestingly, our analysis revealed that several genes associated with the photosystems showed a greater than 2-fold upregulation in nitrogen-deprived cells, including those that encoded the light-harvesting complex protein (greater than 6-fold), photosystem II oxygen-evolving protein PsbO (2.6-fold), and violaxanthin/chlorophyll a-binding protein precursor (NANVCP) (2.7-fold). These genes may play a protective role in dissipating excess excitation energy due to the lowered photosynthetic efficiency under stress conditions.

We also identified several upregulated transcripts encoding enzymes involved in nitrogen metabolism, such as ferredoxin-nitrite reductase (greater than 29-fold), glutamine synthetase (greater than 1.9-fold), glutamine amidotransferase, carbon–nitrogen ligase activity, and glutamine phosphoribosylpyrophosphate amidotransferase. Nitrate reductase is activated in response to elevated sugar and hexose phosphate levels during photosynthesis [54, 55]. Nitrite reductase catalyzes the reduction of nitrite to ammonium; glutamine synthetase is involved in the ATP-dependent conversion of ammonium and glutamate to glutamine; and ferredoxin-dependent glutamine oxoglutarate amino transferase is required for the conversion of glutamine and 2-oxoglutarate to glutamate [56]. These reactions occur within the chloroplast. Previous studies and our study have shown that nitrogen deprivation results in the accumulation of lipids [24, 26, 27]. However, in this study, only a small fraction of the transcripts encoding the enzymes involved in lipid synthesis (fatty acid elongase, acetyl-CoA carboxylase, N-acetyltransferase, long-chain-fatty-acid-CoA ligase) were upregulated under nitrogen deprivation. Acetyl-CoA carboxylase (ACCase) is generally considered to catalyze the first reaction of the fatty acid biosynthetic pathway, the formation of malonyl-CoA from acetyl-CoA and CO2. N-Acetyltransferase is an enzyme that catalyzes the transfer of acetyl groups from acetyl-CoA to arylamines. These enzymes are involved in the fatty acid de novo synthesis pathway in chloroplasts. However, the transcript levels of many lipid synthesis genes were not upregulated. Based on a previous report [53], the lipase expression or activities may also be controlled at the post-transcriptional level, including translational regulation and post-translational modifications of the encoded proteins. Some of the lipases may have a similar regulatory pattern. Thus, no significant changes were observed when the cells were under stress caused by nitrogen deprivation. Other genes encoding enzymes of primary metabolism also showed changes in transcript abundance. The transcript abundances of glyceraldehyde-3-phosphate dehydrogenase and pyruvate dehydrogenase, which are involved in the synthesis of acetyl-CoA, a precursor of fatty acid biosynthesis, increased. Therefore, these enzymes could contribute to some extent to the changing fatty acid profiles of Nannochloropsis sp. following nitrogen deprivation. Some transcripts encoding transcription factors were also highly expressed, and these proteins may play a role in resistance to nitrogen starvation stress.

Among the downregulated genes, apart from a number of genes with unknown function, several genes associated with protein biosynthesis were downregulated following nitrogen deprivation. For example, transcripts for a subset of ribosomal proteins (L26, L10, L17, L26, and S16) were significantly decreased, which may imply that protein synthesis was affected by nitrogen deprivation.
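
The fold changes quoted in this section compare transcript abundance between two libraries that need not contain the same number of reads. The source does not give the exact normalization used, so the following is only a minimal sketch of one common approach: counts are scaled to reads per million and a pseudocount (an assumption, here 1.0) avoids division by zero. The function names, the pseudocount, and the 2-fold threshold default are illustrative choices, not the authors' pipeline.

```python
def fold_change(count_deprived, count_replete,
                total_deprived, total_replete, pseudocount=1.0):
    """Ratio of library-size-normalized counts (N-deprived / N-replete).

    The pseudocount is an assumption of this sketch; it keeps the ratio
    finite when a unigene has zero reads in one library.
    """
    rpm_deprived = (count_deprived + pseudocount) / total_deprived * 1e6
    rpm_replete = (count_replete + pseudocount) / total_replete * 1e6
    return rpm_deprived / rpm_replete


def classify(fc, threshold=2.0):
    """Label a fold change as 'up', 'down', or 'unchanged'."""
    if fc >= threshold:
        return "up"
    if fc <= 1.0 / threshold:
        return "down"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical read counts for one unigene in two equally sized libraries.
    fc = fold_change(59, 9, 100_000, 100_000)
    print(round(fc, 2), classify(fc))
```

With real data, a statistical test on the counts would normally accompany the ratio, since a large fold change on a handful of reads is weak evidence.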

Biological Pathway

Biological pathway assignments were performed according to KEGG mapping [42]. First, the 34,097 unique sequences were compared against the KEGG database using BLASTX with an E-value cutoff of <10^-5. Of these unique sequences, 5,364 had significant matches in the database. Among these, 1,578 unique sequences with enzyme commission numbers were assigned to metabolic pathways. The KEGG metabolic pathways that were significantly represented by Nannochloropsis unique sequences were energy metabolism, amino acid metabolism, translation, metabolism of cofactors and vitamins, transport and catabolism, nucleotide metabolism, and lipid metabolism. In the subclass of energy metabolism, the greatest number of unigenes mapped to oxidative phosphorylation and carbon fixation in photosynthetic organisms. In addition, in the subclass of metabolism of cofactors and vitamins, the largest number of unigenes mapped to porphyrin and chlorophyll metabolism (shown in Additional File 3). These pathways are related to photosynthesis. Because Nannochloropsis sp. NL 117 is of interest for lipid production, we focused on genes of lipid metabolism. Some transcripts, encoding acetyl/propionyl CoA carboxylase, acyl-CoA dehydrogenase, and ACP dehydratase, were involved in the de novo fatty acid synthesis pathway in chloroplasts. Generally, genes involved in fatty acid synthesis had low expression levels. The results also suggest that the buildup of precursors to acetyl-CoA may play a more significant role in TAG synthesis than the actual enzyme levels of acetyl-CoA carboxylases per se. A similar conclusion was drawn for Phaeodactylum tricornutum by Valenzuela et al. [57].
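
The E-value filtering step described above can be illustrated with a short script. This is a sketch under stated assumptions, not the authors' workflow: it assumes the BLASTX results are available in the standard 12-column tabular layout produced by BLAST+ with `-outfmt 6` (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score), and it keeps the highest-scoring hit per query. The function name and the sample identifiers are hypothetical.

```python
def best_hits(lines, max_evalue=1e-5):
    """Return {query: (subject, evalue, bitscore)} for hits below the cutoff.

    Assumes BLAST+ tabular output (-outfmt 6): E-value in column 11 and
    bit score in column 12 (1-based). Only the best-scoring hit per
    query is retained.
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue >= max_evalue:
            continue  # not significant at the chosen cutoff
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


if __name__ == "__main__":
    # Hypothetical hits; identifiers are placeholders, not real records.
    sample = [
        "uni_A\tsubj_1\t90.0\t100\t5\t0\t1\t300\t1\t100\t1e-30\t200",
        "uni_A\tsubj_2\t95.0\t100\t3\t0\t1\t300\t1\t100\t1e-50\t250",
        "uni_B\tsubj_3\t40.0\t50\t20\t2\t1\t150\t1\t50\t0.01\t30",
    ]
    for query, hit in best_hits(sample).items():
        print(query, hit)
```

Counting the keys of the returned dictionary gives the number of sequences with a significant match, which is the kind of figure reported above (5,364 of 34,097).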

Photosynthetic Response to Nitrogen Deprivation

Because genes related to photosynthesis showed significant expression changes, we focused our attention on photosynthesis and related genes. Various types of stress are known to exert a negative effect on photosynthesis. Typically, under nitrogen deprivation, the chlorophyll content of microalgae decreases [7, 53]. In this work, we also found that nitrogen deprivation appeared to have a negative effect on photosynthesis, since the chlorophyll content of Nannochloropsis sp. declined under stress conditions (Fig. 10). However, the molecular response of Nannochloropsis sp. to nitrogen deprivation showed an intriguing feature: a different regulation pattern was observed in genes putatively encoding LHC proteins (S1). This result differed from that reported for LHCs in Chlamydomonas reinhardtii [53]. Thus, we focused more of our attention on these genes. We selected eight of the LHC genes for verification. Using gene-specific primer sets for the LHC genes, the contrasting regulation of these photosynthetic genes under nitrogen starvation stress was confirmed by quantitative real-time PCR (Fig. 11). Four of the LHC genes were upregulated more than 4-fold within 24 h of nitrogen deprivation. The results of the qPCR experiment are consistent with the digital analysis. Mock and Kroon [25] revealed that nitrogen limitation produced a response similar to that of strong light conditions and that energy conversion by photosystem II was affected by nitrogen deprivation. The energy surplus under these conditions is stored in TAGs. The LHC proteins are members of a superfamily and can bind chlorophyll and carotenoid to form pigment–protein complexes. These surround the photochemical reaction centers of PSI (LHCI) and PSII (LHCII, CP29, CP26, and CP24) [57, 58]. The LHCs of land plants and green algae play an essential role in light capture and photoprotection [59–62]. Homology searching showed that the genes upregulated by stress had high similarity to a number of genes involved in photoprotection under high light conditions. Thus, we deduced that the upregulated LHC genes play a photoprotective role within the photosynthetic membrane under nitrogen deprivation. The other LHC genes may play a role in light harvesting. Generally, the changes in transcript abundance following nitrogen deprivation were related to nitrogen metabolism or nitrogen-containing compounds. In cyanobacteria, nitrogen deprivation led to the degradation of the highly abundant phycobili light-harvesting proteins so that they could be used as a nitrogen source for protein synthesis [63]. In C. reinhardtii, the decrease in photosynthesis following nitrogen deprivation is thought to prevent the accumulation of reactive oxygen species [52, 64]. Further study will be required to determine whether the induced expression of these LHC genes is important for the acclimation of Nannochloropsis sp. to nitrogen deprivation.
Fig. 10

The time course of change in chlorophyll content of Nannochloropsis sp. cultures grown under normal or stress conditions. Diamonds: nitrogen-replete medium; circles: nitrogen-deprived medium

Fig. 11

Expression pattern of LHCs in different cells. Comparison of the relative abundance of LHC transcripts in Nannochloropsis sp. under nitrogen starvation, as determined by real-time PCR and by EST counting. The sequence data of the LHCs were taken from the assembled unigenes, LHC1: Unique_000114, LHC2: Unique_006110, LHC3: Unique_2756, LHC4: Unique_010469, LHC5: Unique_010259, LHC6: Unique_010825, LHC7: Unique_00752, LHC8: Unique_009648

Identification of Simple Sequence Repeat Markers

Microsatellite markers derived from ESTs are described as functional markers and offer more utility than markers from anonymous genomic regions [65–67]. In this study, we performed a general screen of the Nannochloropsis unigene data set for the presence of SSRs. A total of 3,397 SSRs were identified (shown in Additional File 4: SSR markers and designed primers). We excluded mononucleotide SSRs from our analysis because of the homopolymer errors that are common in 454 sequencing data. The majority of the SSRs were trinucleotide (2,086) and dinucleotide (929) repeats, followed by tetranucleotide (295), pentanucleotide (38), and hexanucleotide (49) repeats. These SSR markers offer a valuable resource for further genetic investigations. To the best of our knowledge, this is the first attempt to develop large numbers of SSR markers using the EST database for Nannochloropsis. Since these markers were developed based on conserved expressed sequences across the Nannochloropsis genus, they may be valuable for functional analysis of candidate genes, because part of these EST sequences was derived from nitrogen-deprived cells, which can accumulate oils. Also, molecular markers have great potential to speed up the process of developing improved cultivated strains. Our work has established a biotechnological platform for future research.
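
The SSR screen described above can be sketched with a regular-expression search. The source does not name the tool or the minimum repeat numbers used, so the thresholds below (6 repeats for dinucleotides, 5 for trinucleotides, 4 for longer units) are assumptions chosen only for illustration, as are the function and variable names. Mononucleotide runs are excluded, as in the text, and units that are themselves repeats of a shorter motif (for example ATAT) are skipped so that each SSR is reported once under its shortest unit.

```python
import re

# Hypothetical minimum repeat counts per unit length (not from the source).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_compound(unit):
    """True if the unit is a repetition of a shorter motif (e.g. AA, ATAT)."""
    n = len(unit)
    return any(unit == unit[:k] * (n // k)
               for k in range(1, n) if n % k == 0)


def find_ssrs(seq):
    """Return sorted (start, unit, repeat_count) tuples for di- to hexanucleotide SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A captured unit followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if is_compound(unit):
                continue  # homopolymers and motifs of a shorter period
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    # Made-up test sequence: (AT)7 and (AAG)5 with short flanks.
    example = "GGC" + "AT" * 7 + "CCT" + "AAG" * 5 + "TTC"
    print(find_ssrs(example))
```

Tallying the hits by unit length over all unigenes yields a class breakdown like the one reported above, where the five classes (2,086 + 929 + 295 + 38 + 49) account for all 3,397 SSRs.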

Conclusion

In conclusion, we have described a global analysis of the Nannochloropsis transcriptome during nitrogen starvation using massively parallel pyrosequencing. By comparing the induced transcript levels, we expected to capture the metabolic changes leading to neutral lipid accumulation. Undoubtedly, the genes involved have not been completely identified. Also, the limited gene resources available for algae made this work more challenging. Functional characterization of selected genes is currently being performed. This is only a first step in understanding the molecular mechanisms of lipid formation and accumulation in algae. We expect that other groups will also mine this data set; all EST contigs obtained in this study are available in a file in the supplemental data (Additional File 5: the sequence data of all the unigenes).

Acknowledgments

The authors thank the Beijing Institutes of Life Science, Chinese Academy of Sciences (BIOLS), for kind assistance in bioinformatic analysis. This work was supported by the Shandong Science and Technology plan project (2011GHY11528), the Specialized Fund for the Basic Research Operating Expenses Program (20603022012004), the National Natural Science Foundation of China (41176153, 31000135, 40972162), the Natural Science Foundation of Shandong Province (2009ZRA02075), the Qingdao Municipal Science and Technology plan project (11-3-1-5-hy, 11-2-4-3-(5)-jch), and the National Marine Public Welfare Research Project (200805069).

Supplementary material

12155_2012_9269_MOESM1_ESM.xls (28 kb)
Additional File 1: The primers used in the qPCR. (XLS 28 kb)
12155_2012_9269_MOESM2_ESM.xls (7.6 mb)
Additional File 2: The complete information on the unigenes. This file contains every identified gene, its annotation based on the public databases, and its change in expression between the two conditions. (XLS 7,739 kb)
12155_2012_9269_MOESM3_ESM.xls (105 kb)
Additional File 3: Pathways based on KEGG. The file includes the pathways in which the differentially expressed genes are involved. (XLS 105 kb)
12155_2012_9269_MOESM4_ESM.xls (1.6 mb)
Additional File 4: SSR markers and designed primers. SSR markers were identified based on the EST sequences, and primers were designed based on the SSRs. (XLS 1,631 kb)
12155_2012_9269_MOESM5_ESM.seq (9.7 mb)
Additional File 5: All the EST sequence data. The file can be read with software such as UltraEdit, DNAMAN, or Notepad. (SEQ 9,945 kb)

Copyright information

© Springer Science+Business Media New York 2012