Background

The orange-spotted grouper (Epinephelus coioides), an important cultured marine fish with a high market value, is an ideal model for studying sex differentiation and reproduction [1, 2]. Rapid expansion of aquaculture has, however, led to an increased incidence of disease outbreaks in recent years [3, 4]. Emerging viral infectious diseases, including those caused by iridovirus and nodavirus, have seriously damaged the grouper aquaculture industry, with mortality rates due to iridovirus infections ranging from 30% (adult fish) to 100% (fry) [5–7]. To date, three iridoviruses isolated from diseased groupers have been characterized: Singapore grouper iridovirus (SGIV), orange-spotted grouper iridovirus (OSGIV) and Taiwan grouper iridovirus (TGIV) [5, 6, 8]. Nevertheless, the molecular mechanisms underlying iridovirus pathogenesis and virus-host interactions remain largely unknown because little genomic information is available for E. coioides.

Rapid progress in next-generation sequencing technologies has made large-scale production of ESTs efficient and economical. De novo transcriptome sequencing using 454 pyrosequencing has thus become an important method for studying non-model organisms [9–12]. Transcriptome sequencing facilitates functional genomic studies, including global gene expression analysis, novel gene discovery, assembly of full-length genes, and single nucleotide polymorphism (SNP) discovery [9, 13]. To our knowledge, the genome sequence of E. coioides is still unavailable, and this has hindered progress in immunological and developmental research. To overcome this obstacle, we applied 454 pyrosequencing to determine the transcriptome sequence of E. coioides spleen tissue and performed a comparative analysis of the transcriptome data from the control and SGIV-infected groups. The data revealed a large amount of novel gene information for marine fish and suggested that several intracellular immune signaling pathways are involved in virus infection. These results will shed new light on the defense mechanisms of marine fish against viral pathogens.

Results

Sequence analysis of ESTs from different cDNA libraries

Sequencing data from the two libraries were submitted to the NCBI database (accession number SRA040065.1). In the control (C) and SGIV-infected (V) spleen libraries, a total of 428,867 and 446,009 ESTs were sequenced, respectively. Following trimming of adaptor and low-quality sequences, 407,027 (C) and 421,141 (V) high-quality ESTs were obtained from the two libraries. After sequence assembly, 60,322 non-redundant ESTs were generated in the control library, including 36,076 singlets and 24,246 contigs, with an average length of 504 bp. In the infected library, 66,063 non-redundant ESTs were generated, including 40,527 singlets and 25,536 contigs, with an average length of 547 bp (Table 1).

Table 1 Summary of EST data in mock- and SGIV-infected grouper spleen cDNA libraries.

All the contigs and singlets were designated as unique sequences and used for further comparative sequence analysis between the two libraries. After a homology search against the non-redundant protein database at the National Center for Biotechnology Information (NCBI), a total of 9,616 (C) and 10,426 (V) unique sequences showed significant BLASTX hits to known protein sequences. The distribution of significant BLASTX hits over different organisms was then analyzed. Owing to the lack of E. coioides genomic information, the majority of sequences in the two libraries matched genes or gene fragments from Tetraodon nigroviridis (Figure 1).

Figure 1

Characteristics of the homology search of ESTs against the nr database. (A) E-value distribution of BLAST hits for each unique sequence with a cut-off E-value of 1.0E-5. (B) Similarity distribution of the top BLAST hits for each sequence. (C) Species distribution, shown as a percentage of the total homologous sequences that passed the E-value cut-off of 1.0E-5. The first hit of each sequence was used for analysis.

Functional annotation based on GO, COG and KEGG analysis

The putative functions of the unique sequences in the two libraries were analyzed according to the Gene Ontology (GO) and Clusters of Orthologous Groups of proteins (COG) classifications. Analysis of GO categories showed that the functional distribution of genes was similar in the two libraries. A total of 14,166 and 14,352 unique sequences mapped to biological processes, 15,130 and 14,923 sequences mapped to cellular components, and 7,137 and 7,252 sequences mapped to molecular functions in the control and SGIV-infected libraries, respectively. In both libraries, most of the biological process genes were involved in cellular processes, biological regulation and metabolic processes. Most of the cellular component genes encode proteins associated with cell parts and organelles, and most of the molecular function genes were associated with binding, catalytic activity, and transporter activity (Figure 2).

Figure 2

GO annotations of non-redundant sequences in mock and SGIV infected libraries. Most non-redundant sequences can be divided into three major categories, including molecular function (A), cellular component (B), and biological process (C).

Classification of the unigenes into COG categories is critical for functional and evolutionary studies [14]. Among the 25 COG categories, the cluster in the control library for "translation, ribosomal structure and biogenesis" represented the largest group (185 ESTs), followed by the "posttranslational modification, protein turnover, chaperones" and "general function prediction" clusters. Similarly, in the SGIV infected library, the cluster for "translation, ribosomal structure and biogenesis" represented the largest group (172 ESTs) followed by "general function prediction" and "posttranslational modification, protein turnover, chaperones" clusters (Figure 3).

Figure 3

Histogram presentation of clusters of orthologous groups (COG) classification in mock and SGIV infected libraries.

KEGG is a pathway-based categorization of orthologous genes that provides useful information for predicting functional profiles of genes [15]. In this study, the unique sequences of the two libraries were categorized using the KEGG database. The matched sequences were involved in metabolic processes, cellular processes, signal transduction and the cell cycle. Some of the KEGG pathways associated with immune and inflammatory responses are listed in Table 2. Conserved MAPK signaling molecules were found in both the control (C) and SGIV-infected (V) libraries, which contained 65 and 71 such ESTs, respectively (Additional file 1). In addition, a large number of ESTs in the two libraries were involved in the RIG-I signaling pathway (C, 21 hits; V, 20 hits), the TLR signaling pathway (C, 28 hits; V, 26 hits), the chemokine signaling pathway (C, 62 hits; V, 73 hits) and the p53 signaling pathway (C, 22 hits; V, 25 hits) (Additional files 2 and 3). Many ESTs homologous to mammalian signaling pathway genes, including MAP kinase phosphatase 1 (MKP-1), Nur77, stimulator of interferon genes (STING), and tripartite motif protein (finTRIM), were identified for the first time in marine fish.

Table 2 Number of ESTs involved in KEGG pathway (number of ESTs > 10)

Putative genes involved in up-regulation or down-regulation during SGIV infection

Among unique sequences that shared > 30% identity (E-value < 1e-5) with known genes in the NCBI database, 2,057 genes were expressed in both the control and the SGIV-infected libraries. Using Fisher's exact test based on the number of homologous ESTs, we found that 755 genes were significantly up-regulated, while 695 genes were significantly down-regulated in response to SGIV infection. A large number of genes were present in only one of the two libraries. Some of the up-regulated and down-regulated genes are listed in Tables 3 and 4, respectively. The altered genes included cytoskeletal genes, enzymes, and immune-related genes such as chemokines, interleukins and interferon-induced proteins. The different expression patterns of these genes during SGIV infection imply that they may play an important role in physiological processes associated with SGIV infection.

Table 3 Unique genes with increased expression in spleen after SGIV infection
Table 4 Unique genes with decreased expression in spleen after SGIV infection

Validation of the changes in gene expression by quantitative real-time PCR

To validate whether the up-regulated or down-regulated genes identified by statistical analysis were involved in SGIV infection, we measured the relative expression of selected genes using quantitative real-time PCR (qRT-PCR). As shown in Figure 4, the relative expression of IL-8, chemokine (C-C motif) ligand 18 (CCL18), g-type lysozyme (g-lysozyme) and cystatin B increased significantly after SGIV infection compared with that in the control fish. In contrast, the expression of interferon-inducible GTPase 1 (IIGP1), transcription factor II D (TFIID), gamma interferon (IFN-γ)-inducible lysosomal thiol reductase (GILT) and C-C chemokine receptor type 4 (CCR4) decreased after SGIV infection. These results suggest that SGIV infection modulates the expression of numerous host genes to complete its life cycle.

Figure 4

The differential expression of selected genes was validated by qRT-PCR. Relative expression of genes with increased abundance (A) or decreased abundance (B) was detected. The relative gene expression in grouper injected with PBS (control) was defined as 1, and that in SGIV infected grouper (48 h p.i.) was indicated by the fold increase or decrease compared to the control.

Discussion

An increasing number of reports reveal that transcriptome sequencing of cDNA has become an efficient strategy for generating large numbers of sequences that represent expressed genes [16]. Transcriptome analyses of a number of species, including Drosophila melanogaster, yeast, Caenorhabditis elegans and various mammals and plants, have been carried out for different purposes [17–21]. However, genome and transcriptome data for many "lower" vertebrate species, particularly marine fishes, have not been reported. To our knowledge, only a limited number of E. coioides genes have been cloned and characterized on the basis of bioinformatic analysis, including those involved in immune responses to pathogenic attack, growth and development [22–27]. Given that the spleen is one of the most important organs associated with immune responses in fish and is also the main target organ for SGIV infection, transcriptome sequencing of the E. coioides spleen can be expected to provide a significant number of ESTs related to marine fish immune responses and contribute to understanding iridovirus-host interactions [5].

After removal of overlapping sequences between the control and SGIV-infected libraries, we obtained 65,374 non-redundant consensus sequences from E. coioides. In addition to sequences related to cellular structure and metabolism, abundant sequences were found to be homologous to known immune-relevant genes in other species, based on BLAST, Conserved Domain Database (CDD), and SWISS-PROT annotation [28–30]. More than 80 sequences shared homology with signaling molecules of the mammalian mitogen-activated protein kinase (MAPK) pathways, such as critical molecules associated with extracellular signal-regulated kinase (ERK), p38 MAPK, Ras, RSK2, MKK4, MKK7, ASK1, MEK1/2 and Raf1. The mammalian MAPK signaling pathway is activated during virus infection and contributes to virus replication [31–33]. Although MAPK signaling molecules including ERK, c-Jun N-terminal kinase (JNK) and p38 MAPK were activated in SGIV-infected fish spleen (EAGS) cells, identifying the exact roles of these molecules during SGIV replication will benefit from the E. coioides EST information [34, 35]. In addition to homologous components of the MAPK cascade, different interferon-related genes were obtained, including the interferon-induced protein viperin, interferon-stimulated gene 15 (ISG15), interferon-induced protein 35 kD (IFP35), interferon-stimulated gene 56 (ISG56), and interferon regulatory factors (IRF-1, IRF-2, IRF-3, IRF-4, IRF-5, IRF-7, IRF-8 and IRF-9). Interferon-induced, or interferon-stimulated, genes are important for host resistance to virus infection, acting on virus entry, replication and release [36–38]. The E. coioides IRF-1, IRF-2 and IRF-7 genes have been cloned and characterized, and IRF-7 was confirmed as being vitally important for SGIV replication [39, 40]. Human ISG15 expression is strongly up-regulated during viral infections, such as those with human cytomegalovirus (HCMV) and herpes simplex virus (HSV), and ISG15 up-regulation is considered to be involved in different antiviral response strategies [41–44]. IFP35 and ISG56 are also involved in the cellular antiviral response [38, 45]. A detailed investigation of the functions of E. coioides interferon-related genes during SGIV infection will contribute greatly to understanding how SGIV exploits, or evades, the host interferon immune response.

We also obtained sequences that shared homology with SGIV-encoded immune evasion genes, including lipopolysaccharide-induced tumor necrosis factor-α factor (LITAF), tumor necrosis factor receptor (TNFR), ubiquitin and Bcl-2 [46–48]. Iridovirus-encoded LITAF and Bcl-2 could mediate the fate of host cells by regulating apoptosis [47, 48]. Many viral immune evasion genes are considered to be "stolen" mimics of host genes, and such viruses may interfere with the host response by modulating or disrupting the function of the corresponding host genes [49–51]. The discovery of these sequences will be helpful in studies on host-virus interactions. In addition, we found other molecules involved in immune responses, such as lectin, hepcidin, lysozyme and antimicrobial peptide. The functions of these genes during virus infection will be investigated in further studies.

Based on the results of exploratory statistical analysis, we identified genes that are up-regulated or down-regulated after SGIV infection. The qRT-PCR data confirmed that the expression of some of these genes is regulated by SGIV infection, including cytokine, cytokine receptor, transcription factor, apoptosis-associated, interferon-related, and cytoskeleton genes. Previous studies indicated that the expression of different groups of genes relating to cellular structure, apoptosis, gene transcription and immune regulation is altered in response to virus infections or other stimuli [37, 52–54]. Further research into the roles of these differentially expressed genes will contribute to an increased understanding of the critical events that take place during SGIV replication.

Conclusions

In summary, we studied the immune response of marine fish to virus infection using SGIV-infected E. coioides as a model. More than 400,000 high-quality ESTs were obtained from each of the two E. coioides spleen cDNA libraries by 454 sequencing. The resulting unique sequences contribute greatly to the investigation of changes in gene expression patterns and their molecular functions during pathogen infection, and also provide an abundant data source for the identification of novel genes in E. coioides. This gene information can be used to provide further insights into the functions of chemokines, proinflammatory factors, interferon-induced genes and other cytokines, and will thus stimulate further study of the immune response of E. coioides to pathogens. The experimental validation of the gene expression changes during SGIV infection provides new insights into iridovirus-host interactions.

Methods

E. coioides and virus challenge

To construct spleen cDNA libraries, groupers (E. coioides) of 15 cm total length were obtained from a local farm in Guangzhou, China. Sampling indicated that these fish tested negative for SGIV infection. All the fish were maintained in a laboratory recirculating seawater system at 25–30°C for 2 weeks. Healthy fish that displayed normal levels of activity were used in this study. The virus suspension used for the challenge was collected from SGIV-infected GS cells. The fish were challenged by injection with 0.2 ml of the SGIV suspension (1 × 10^5 TCID50/ml). As a control, an equal volume of PBS was injected in the same way. At 48 h post-infection, fish were sacrificed and tissue samples were taken from the spleens. These were stored in liquid nitrogen for later RNA extraction.

RNA extraction, cDNA library construction and 454 sequencing

Total RNA was extracted from the spleens of the control and SGIV-infected fish using an SV Total RNA Isolation kit (Promega). The cDNA library preparation and 454 pyrosequencing were performed as described by Salem et al. [11]. In brief, first- and second-strand cDNA were synthesized from 1 μg of total RNA using the SMART PCR cDNA Synthesis Kit (Clontech, USA) with a modified 3' primer, 5'-AAGCAGTGGTATCAACGCAGAGTGCAG(T20)VN-3', that contained a BsgI cleavage site. The double-stranded cDNA was then digested with BsgI for 16 h and cleaned with a QIAquick MinElute PCR purification column (Qiagen, CA). The purified cDNA was sheared by nebulization into fragments ranging from about 400 to 1000 base pairs. After the short fragments (< 400 bp) were removed with AMPure beads (Agencourt), samples were processed with the GS FLX Titanium General DNA Library Preparation Kit (Roche) following the manufacturer's instructions. Sequencing was carried out on a Roche 454 Genome Sequencer FLX instrument. All the data obtained were submitted to the NCBI database.

Data analysis

To analyze the data generated by the FLX sequencer, adapter sequences, low-complexity sequences and low-quality sequences were filtered out using the Seq-clean and LUCY software [55]. The screened high-quality sequences were assembled de novo using the CAP3 software with default parameters [56]. ESTs that did not form contigs were designated as singlets. Putative functions of all the unique sequences (contigs and singlets) were predicted using local BLASTall programs against sequences in the NCBI non-redundant (nr) protein database and the Swiss-Prot database (E-value < 1e-5). Each unique sequence was then assigned COG terms, GO terms, and KEGG pathways [14, 15].
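The annotation step above keeps, for each unique sequence, the first BLAST hit that passes the E-value cut-off. A minimal Python sketch of that filtering is given here for illustration only; it is not the pipeline used in this study. It assumes the hits are available in the standard 12-column tabular BLAST format, and the function name and the example identifiers are hypothetical.

```python
import csv
import io

# Column order of the standard 12-column tabular BLAST output.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def top_hits(handle, max_evalue=1e-5):
    """Return {query_id: first hit whose E-value is below max_evalue}."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        hit["pident"] = float(hit["pident"])
        hit["evalue"] = float(hit["evalue"])
        # Hits for one query are listed best-first, so only the first
        # passing hit per query is kept.
        if hit["evalue"] < max_evalue and hit["qseqid"] not in best:
            best[hit["qseqid"]] = hit
    return best


# Hypothetical records: identifiers and values are made up for illustration.
example = io.StringIO(
    "contig0001\tsubjA\t85.2\t120\t17\t1\t1\t360\t10\t129\t3e-40\t160\n"
    "contig0001\tsubjB\t60.0\t110\t40\t2\t5\t330\t20\t129\t1e-12\t70.1\n"
    "singlet0042\tsubjC\t35.0\t80\t50\t2\t1\t240\t1\t80\t0.002\t30.0\n"
)
hits = top_hits(example)
print(sorted(hits))
```

In this made-up input, only the first query has a hit below the cut-off, so the second query is left unannotated.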

To compare the gene expression profiles of the two libraries, EST occurrence was evaluated statistically. The abundance of a unique sequence was classed as increased or decreased if the number of hits in the SGIV-infected library was "significantly more" or "significantly less" than that in the control library. The statistical significance of differences in EST abundance was determined using Fisher's exact test [57, 58]. A P value of < 0.05 was considered statistically significant.
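The comparison of EST counts can be sketched as a two-sided Fisher's exact test on a 2 × 2 table. The Python example below is illustrative and makes two assumptions that are not stated in the text: the table is built from the number of ESTs matching a gene in each library against the remaining high-quality ESTs of that library, and the library totals are the high-quality EST counts reported in the Results. The function names and the example gene counts are hypothetical.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural logarithm of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(hits_c, total_c, hits_v, total_v):
    """Two-sided Fisher's exact test for the 2x2 table
    [[hits_c, total_c - hits_c], [hits_v, total_v - hits_v]]."""
    gene_total = hits_c + hits_v
    grand_total = total_c + total_v

    def log_prob(k):
        # Hypergeometric probability that k of the gene's ESTs fall in
        # library C when all margins of the table are fixed.
        return (log_comb(gene_total, k)
                + log_comb(grand_total - gene_total, total_c - k)
                - log_comb(grand_total, total_c))

    k_min = max(0, gene_total - total_v)
    k_max = min(gene_total, total_c)
    observed = log_prob(hits_c)
    # Sum the probabilities of all tables that are no more likely than
    # the observed one (small tolerance for floating-point error).
    p = sum(exp(lp) for lp in map(log_prob, range(k_min, k_max + 1))
            if lp <= observed + 1e-7)
    return min(1.0, p)


# Assumed library sizes: high-quality ESTs in the control (C) and
# SGIV-infected (V) libraries. The gene counts (5 vs 25) are made up.
TOTAL_C, TOTAL_V = 407_027, 421_141
p_value = fisher_exact_two_sided(5, TOTAL_C, 25, TOTAL_V)
print(f"P = {p_value:.2e}, significant at 0.05: {p_value < 0.05}")
```

Working in log space with `lgamma` avoids forming the very large binomial coefficients that arise with library sizes of several hundred thousand ESTs.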

Quantitative real-time PCR

Quantitative real-time PCR was carried out using a LightCycler® 480 Real-Time PCR System (Roche), with SYBR Green as the fluorescent dye, according to the manufacturer's protocol (TOYOBO). Genes from several categories were used for validation, including cytokines (IL-8, CCL18), cytokine receptors (CCR4), transcription factors (TFIID), apoptosis-associated genes (cystatin B), interferon-related genes (GILT, IIGP1) and others (lysozyme G). Primer sequences are listed in Table 5. Reaction conditions were as follows: 95°C for 1 min, followed by 40 cycles of 94°C for 15 s and 60°C for 1 min. All reactions were performed in biological triplicate, and samples were normalized using β-actin. Results were expressed as fold change relative to β-actin in each experiment, as the mean ± SD.
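A β-actin-normalized fold change with the control set to 1 can be illustrated with the widely used 2^-ΔΔCt calculation. The text does not state which calculation was applied, so the Python sketch below is an assumption; it also assumes roughly 100% amplification efficiency. The function name and all Ct values are hypothetical.

```python
from statistics import mean, stdev


def fold_change(ct_target_inf, ct_ref_inf, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the 2^-ddCt calculation (control = 1).

    Assumes about 100% amplification efficiency for both the target
    gene and the reference gene (here beta-actin).
    """
    d_ct_inf = ct_target_inf - ct_ref_inf      # normalize infected sample
    d_ct_ctrl = ct_target_ctrl - ct_ref_ctrl   # normalize control sample
    return 2.0 ** -(d_ct_inf - d_ct_ctrl)


# Made-up Ct values for three biological replicates of infected fish,
# given as (target gene, beta-actin), and mean Ct values for the control.
infected = [(22.0, 18.0), (21.5, 17.8), (22.3, 18.1)]
control_target, control_ref = 25.0, 18.0

folds = [fold_change(t, r, control_target, control_ref) for t, r in infected]
print(f"fold change: {mean(folds):.2f} +/- {stdev(folds):.2f} (mean +/- SD)")
```

A value above 1 indicates higher expression in infected fish than in the control, and a value below 1 indicates lower expression.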

Table 5 Primers used in this study.