Abstract
We analysed the genomes and codon usage patterns of seven small (DNA and RNA) shrimp viruses. Effective number of codons (ENC) values indicated moderate (35 < ENC < 50) codon usage bias in shrimp viruses. Correlation analysis between GC compositions at non-synonymous codon and synonymous codon positions (GC1, 2 and GC3) as well as GC3 versus ENC curves indicated varying influences of mutational pressure on codon usage. The presence of deoptimized codons and host-antagonistic codon usage trends in shrimp viruses suggested the adoption of a slow replication strategy by these viruses to avoid host defences. Low CpG frequencies indicated that shrimp viruses have evolved with underrepresentation of CpGs to avoid the host’s immune response.
Frequent disease outbreaks in shrimp culture have resulted in great losses to the industry. A Global Aquaculture Alliance (GAA) study in 2008 estimated that total losses due to diseases over the past 15 years might be on the order of US$ 15 billion. Among the various shrimp pathogens, viruses are responsible for the most disease outbreaks. Infections due to white spot syndrome virus (WSSV), yellow head virus (YHV), infectious myonecrosis virus (IMNV), and taura syndrome virus (TSV) may result in significant production losses and mortalities [11, 17].
Due to the redundancy of the genetic code, the majority of amino acids (except Met and Trp) can be encoded by more than one synonymous codon. During evolution, each species is subjected to specific genomic pressures that can influence its codon usage pattern, resulting in preferred selection of some codons over others within the same synonymous group [12]. Natural selection and mutational pressure have been found to be the two most important factors affecting codon usage in the majority of viruses [2, 27]. The effects of natural selection can be evident in the form of gene expression, translational selection (speed and accuracy) and protein folding [13, 32]. Under mutational pressure, codon usage of a particular region/gene is shaped by its nucleotide composition. For example, a high GC content in a particular genomic region will lead to a GC-biased codon usage pattern. Viruses are dependent on the host cell for their replication and transmission, and thus, the interrelationship of codon usage patterns between the host and virus may have a significant impact on virus survival and evolution. Knowledge of these patterns helps in achieving a better understanding of viral gene expression and regulation, which in turn can be applied to efficient in vitro expression of viral proteins and vaccine development [5, 25].
In this study, we analysed the genomes and codon usage patterns of five RNA and two DNA shrimp viruses. Several codon usage metrics in IMNV, gill-associated virus (GAV), TSV, YHV, Penaeus vannamei nodavirus (PvNV), Penaeus monodon hepandensovirus (also known as hepatopancreatic parvovirus [HPV]) and Penaeus monodon penstyldensovirus 1 (also known as infectious hypodermal and hematopoietic necrosis virus [IHHNV]) were evaluated to investigate the extent of codon bias and influencing factors. Coding sequences (CDSs) for all available genomes of five RNA viruses and two DNA viruses (Table 1) were obtained from NCBI GenBank (http://www.ncbi.nlm.nih.gov).
Overall nucleotide compositions (AT and GC) as well as nucleotide compositions at the third synonymous codon position (AT3 and GC3) were calculated to assess the compositional bias of the genomes. Except for GAV and PvNV, all shrimp viruses had significantly (Wilcoxon signed-rank test, p < 0.05) AT/AU-biased genomes. All of the following nucleotide values are reported as percentages. IMNV and YHV, with mean ± SD AU values of 62.3 ± 2.8 and 54.2 ± 2.0, showed the highest and lowest bias, respectively (Supplementary Table 1). This AT/AU bias in the majority of shrimp viruses suggests that A/U-ended codons might be preferred over G/C-ended codons. The GC composition at the third synonymous codon position (GC3) might have a significant impact on the overall codon usage pattern, and the calculated GC3 values in our study ranged from 26.4 ± 2.4 (in IMNV) to 50.5 ± 8.4 (in GAV).
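The composition measures above can be illustrated with a minimal Python sketch. It is not the CodonW implementation used in the study; the function names and the example sequence are hypothetical, and the sketch assumes that the third-position measure is taken over synonymous codons only (Met, Trp and stop codons excluded).

```python
# Codons with no synonymous alternative (Met, Trp) and the three stop codons
# are excluded from the third-position count (assumption of this sketch).
EXCLUDED = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def split_codons(cds):
    """Return in-frame codons, treating RNA (U) and DNA (T) alike."""
    seq = cds.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def gc_content(seq):
    """Overall GC content as a percentage."""
    seq = seq.upper().replace("U", "T")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def gc3(cds):
    """GC content (%) at the third position of synonymous codons."""
    synonymous = [c for c in split_codons(cds) if c not in EXCLUDED]
    return 100.0 * sum(c[2] in "GC" for c in synonymous) / len(synonymous)


# Hypothetical toy CDS: Met-Ala-Ala-Lys-Phe-Trp-stop
example_cds = "ATGGCTGCAAAATTTTGGTAA"
print(gc_content(example_cds), gc3(example_cds))
```

AT (or AU) and AT3 follow as 100 minus the corresponding GC value when the sequence contains only the four standard bases.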
Correlations between nucleotide compositions at non-synonymous and synonymous codon positions (GC1, 2 and GC3) were used to evaluate the comparative influences of mutational pressure and natural selection on codon composition. In the case of DNA viruses, significant but negative correlations between GC1, 2 and GC3 values were found for HPV (r = -0.636, p < 0.05) and IHHNV (r = -0.840, p < 0.05). Among RNA viruses, IMNV showed significant correlation (r = 0.868, p < 0.05) between GC1, 2 and GC3, whereas non-significant correlation was observed for YHV. Due to the limited amount of data available, no correlation analysis was performed for GAV, TSV and PvNV. These observations suggest that in the case of HPV, IHHNV and YHV, base composition does not follow the same pattern at all codon positions, implying that natural selection has occurred, while in the case of IMNV, mutational pressure appears to play an important role.
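As an illustration of this comparison, the sketch below computes the mean GC content of the first two codon positions (GC1, 2) and GC3 for a CDS, and a correlation coefficient across a set of genes. The source does not state which correlation coefficient was used; Pearson's r is assumed here, and all function names are hypothetical.

```python
from math import sqrt


def gc12_and_gc3(cds):
    """Return (mean GC% of codon positions 1 and 2, GC% of position 3)."""
    seq = cds.upper().replace("U", "T")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    n = len(codons)
    gc1 = sum(c[0] in "GC" for c in codons) / n
    gc2 = sum(c[1] in "GC" for c in codons) / n
    gc3 = sum(c[2] in "GC" for c in codons) / n
    return 100.0 * (gc1 + gc2) / 2, 100.0 * gc3


def pearson_r(xs, ys):
    """Pearson correlation coefficient (assumed; the source does not specify)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)
```

With one (GC1, 2; GC3) pair per gene, a strong positive r indicates that all codon positions share the same compositional trend, which is the pattern interpreted above as mutational pressure.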
Nucleotide compositions (overall as well as at the third position of synonymous codons), effective numbers of codons (ENC) and relative synonymous codon usage (RSCU) metrics were calculated using CodonW 1.4.4 software, developed by John Peden (http://sourceforge.net/projects/condonw) [20]. ENC is used to quantify the codon usage bias in absolute terms. The lowest value of 20 signifies that each amino acid is encoded by one codon, whereas the highest value of 61 indicates completely random codon usage. ENC values revealed the presence of moderate (35 < ENC < 50) codon usage bias in IMNV, GAV, YHV, HPV and IHHNV (Supplementary Table 1). On the other hand, TSV and PvNV, with ENC values (mean ± SD) of 53.7 ± 0.7 and 56 ± 7.4, respectively, showed low bias.
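To make the ENC scale concrete, the following is a minimal sketch of Wright's effective number of codons [32] for a single CDS under the standard genetic code. It is not the CodonW code used in the study: skipping families with fewer than two codons or with non-positive homozygosity, and raising an error when a degeneracy class is missing, are simplifying assumptions of this sketch.

```python
from collections import Counter, defaultdict

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Sense codons grouped into synonymous families, keyed by amino acid.
FAMILIES = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        FAMILIES[aa].append(codon)


def enc(cds):
    """Effective number of codons: 20 (extreme bias) to 61 (no bias)."""
    seq = cds.upper().replace("U", "T")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))

    homozygosity = defaultdict(list)  # degeneracy class -> list of F values
    for codons in FAMILIES.values():
        k = len(codons)
        n = sum(counts[c] for c in codons)
        if k == 1 or n < 2:
            continue  # Met, Trp, or too few observations
        f = (n * sum((counts[c] / n) ** 2 for c in codons) - 1) / (n - 1)
        if f > 0:  # simplification: ignore families with F <= 0
            homozygosity[k].append(f)

    mean_f = {k: sum(v) / len(v) for k, v in homozygosity.items()}
    if not all(k in mean_f for k in (2, 4, 6)):
        raise ValueError("too few codons to estimate ENC")
    # Ile is the only three-fold family; if unusable, average F2 and F4.
    f3 = mean_f.get(3, (mean_f[2] + mean_f[4]) / 2)
    value = 2 + 9 / mean_f[2] + 1 / f3 + 5 / mean_f[4] + 3 / mean_f[6]
    return min(value, 61.0)
```

A gene that always uses a single codon per amino acid gives ENC = 20, and a gene that uses all synonymous codons equally is capped at 61, matching the limits described above.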
Among the RNA viruses, one PvNV gene and all IMNV and TSV genes were found to lie roughly on the pENC curve towards the low GC3 region (Fig. 1). Moreover, significant correlation (r = 0.957, p < 0.05) between GC3 and ENC was also observed for IMNV. No correlation analysis was performed for TSV and PvNV. These data imply that mutational pressure is the major influencing factor in IMNV and TSV codon usage bias and in some of the genes of PvNV [32]. On the other hand, the ENC values for all genes of GAV, YHV and HPV as well as for the remaining genes of PvNV were found to lie well below the pENC curve. For YHV, a significant negative correlation (r = -0.895, p < 0.05) was observed between GC3 and ENC, whereas no correlation was found for HPV. For GAV, YHV and HPV, natural selection seems to be dominant. In the case of IHHNV, the ENC values for all of the genes were found to lie well below the pENC curve, but with positive correlation (r = 0.886, p < 0.05) between GC3 and ENC. Interestingly, mutational pressure and natural selection might be acting equally on the codon usage bias of PvNV and IHHNV.
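The text does not define the pENC curve explicitly. Assuming it is the expected ENC under GC3-driven mutational bias alone, as given by Wright [32], it can be sketched as below; the helper `enc_gap` is a hypothetical convenience and not a statistic reported in the study.

```python
def expected_enc(gc3_fraction):
    """Wright's expected ENC when only GC3 composition constrains codon usage.

    gc3_fraction is GC3 expressed as a fraction between 0 and 1.
    """
    s = gc3_fraction
    return 2.0 + s + 29.0 / (s ** 2 + (1.0 - s) ** 2)


def enc_gap(observed_enc, gc3_fraction):
    """Distance of a gene below the expected curve (hypothetical helper)."""
    return expected_enc(gc3_fraction) - observed_enc
```

Genes whose observed ENC lies close to `expected_enc` are consistent with mutational pressure, whereas a large positive gap indicates that additional factors such as selection are acting on codon usage.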
For efficient utilization of host cell mechanisms, viruses may show significant adaptations during evolution. Thus, RSCU analysis was performed to compare the patterns of codon usage between shrimp viruses and their host (P. monodon) (Supplementary Table 2). For the shrimp viruses (except GAV and YHV), the numbers of A/U-ended preferred and overrepresented codons (RSCU > 1.6) were comparatively higher than those of the corresponding G/C-ended codons. In the case of underrepresented (RSCU < 0.6) codons in all viruses, the majority were found to be G/C-ended (Supplementary Table 3). As the majority of shrimp viruses had a significantly AT/AU-biased genome, the preferences towards A/U-ended codons and underrepresentations of G/C-ended codons were as per our expectations. In contrast, out of 18 preferred codons for P. monodon, 17 were found to be G/C-ended.
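The RSCU values and the thresholds quoted above can be illustrated with a short sketch. RSCU is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The code assumes the standard genetic code; the function names are hypothetical, and this is not the CodonW implementation used in the study.

```python
from collections import Counter, defaultdict

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
FAMILIES = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        FAMILIES[aa].append(codon)


def rscu(cds):
    """RSCU per codon for every amino acid that occurs in the CDS."""
    seq = cds.upper().replace("U", "T")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    values = {}
    for codons in FAMILIES.values():
        total = sum(counts[c] for c in codons)
        if len(codons) < 2 or total == 0:
            continue  # Met/Trp, or amino acid absent from this CDS
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


def classify(values, high=1.6, low=0.6):
    """Split codons into over- and underrepresented sets (thresholds from text)."""
    over = sorted(c for c, v in values.items() if v > high)
    under = sorted(c for c, v in values.items() if v < low)
    return over, under
```

For example, a CDS fragment with three GCT and one GCC codon gives RSCU values of 3.0 for GCT, 1.0 for GCC and 0.0 for GCA and GCG, so GCT is overrepresented and GCA and GCG are underrepresented.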
Comparison of RSCU values also indicated that preferred codons common between individual viruses and the host were as low as 1 to 8 (Supplementary Table 2). IMNV had only one preferred codon (CCA [Pro]) in common with its host, while GHV had eight common preferred codons (UUC [Phe], GAC [Asp], AUC [Ile], UAC [Tyr], CAC [His], CCA [Pro], AAC [Asn], and AAG [Lys]). Overall, somewhat antagonistic trends were observed between codon usage patterns of the viruses and the host.
The codon adaptation index (CAI) is used as a measure of relative adaptedness of the codon usage pattern of a gene/set of genes towards a reference set of genes, which in turn tells us the extent to which natural translational selection has been effective in influencing the codon usage bias [26]. In this study, CAI values for viral genes were calculated against a reference set of host genes [23]. The expected CAI (eCAI) values for each gene were also calculated as described by Puigbo et al. [22], to determine the statistical significance of the CAI values. The CAI values of all of the genes of GAV and YHV, two genes each of PvNV and HPV, and one gene each of TSV and IHHNV were found to be higher than their respective eCAI values (Supplementary Table 4). These observations suggest that certain genes of these viruses show significant adaptation to their host, and this could be due to translational selection. For IMNV, all genes had CAI values that were lower than their respective eCAI values, signifying the absence of codon adaptation and translational pressure.
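A minimal sketch of the CAI calculation in the sense of Sharp and Li [26] is given below. It is not the CAIcal server used in the study and does not compute eCAI. Replacing zero reference counts with 0.5 is an assumption of this sketch (conventions differ between tools), and the function names and toy reference set are hypothetical.

```python
import math
from collections import Counter, defaultdict

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
FAMILIES = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        FAMILIES[aa].append(codon)


def split_codons(cds):
    seq = cds.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def relative_adaptiveness(reference_cdss):
    """w = codon count / count of the most-used synonymous codon (host set)."""
    counts = Counter()
    for cds in reference_cdss:
        counts.update(split_codons(cds))
    w = {}
    for codons in FAMILIES.values():
        if len(codons) < 2:
            continue  # Met and Trp carry no information
        # Assumption: unseen codons get a pseudo-count of 0.5.
        fam = {c: counts[c] or 0.5 for c in codons}
        top = max(fam.values())
        for c in codons:
            w[c] = fam[c] / top
    return w


def cai(cds, w):
    """Geometric mean of w over the informative codons of a gene."""
    logs = [math.log(w[c]) for c in split_codons(cds) if c in w]
    if not logs:
        raise ValueError("no informative codons in CDS")
    return math.exp(sum(logs) / len(logs))
```

A gene built only from the host's most-used codons scores 1.0, and the score falls as the gene uses codons that are rare in the reference set.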
Multivariate correspondence analysis (COA) was used to identify the major trends of codon usage patterns in viral CDSs and to represent these trends along continuous axes for easy visual interpretation. In this study, COA on RSCU was performed separately for each virus. For all of the viruses, axis 1 and axis 2 were together able to explain >50% of variations. Thus, a scatter plot was prepared by plotting the value of each gene along axis 1 and axis 2 (Fig. 2). Although virus genes were widely distributed along both axes, specific clustering based on the type of gene rather than on the virus type was observed. For example, all capsid protein genes from IHHNV (IHHNV_CP) formed a separate group distinct from the non-structural protein gene 1 (IHHNV_NS1) and non-structural protein gene 2 (IHHNV_NS2) groups. These observations suggest that different genes within the same virus were subjected to varying selection pressures during evolution.
In COA, correlation analysis between axis 1 and other codon usage parameters can indicate the influence of various factors on codon bias. Significant correlations (r > 0.82, p < 0.05) between axis 1 – GC3 and axis 1 – ENC were observed for both IMNV and IHHNV. In the case of HPV, a significant correlation between axis 1 and GC3 (r = 0.65, p < 0.05) was observed, whereas axis 1 showed no correlation with ENC. YHV showed non-significant correlations between axis 1 – GC3 and axis 1 – ENC. No correlation analysis was performed for GAV, TSV and PvNV. These results confirm our previous observations that mutational pressure plays an important role in codon usage bias of IMNV and IHHNV but it has only minor influence in the case of YHV and HPV.
It has been reported that dinucleotide frequencies can have a significant effect on codon usage bias [6]. In this study, we calculated the relative dinucleotide abundances of all 16 dinucleotides as a ratio of their observed and expected frequencies using the compseq programme [24, 28]. Karlin et al. [14] reported that the CpG dinucleotide was underrepresented in small DNA viruses, but large DNA viruses showed no bias against CpG. Thus, we also calculated the relative dinucleotide abundances of all 16 pairs for two large DNA shrimp viruses, namely white spot syndrome virus (WSSV) and Penaeus monodon nudivirus (PmNV), to investigate possible differences between large DNA viruses and the small DNA and RNA shrimp viruses (Supplementary Table 5). Among the various dinucleotides, CpG was found to be underrepresented in all viruses, with a relative abundance consistently lower than the normal value of one. Interestingly, although most of the shrimp virus genomes had significant AT/AU bias, patterns of underrepresentation were also observed for UpA in all viruses. On the other hand, the dinucleotide UpG was consistently overrepresented in all viral sequences. The CpA dinucleotide was also consistently overrepresented in all shrimp viruses except IHHNV and WSSV. We also determined the RSCU values of CpG-containing codons to investigate their effect on codon usage bias. Out of eight CpG-containing codons (CCG, GCG, UCG, ACG, CGC, CGG, CGU, and CGA), only CGU was the preferred codon in its synonymous group, and the majority of these codons were found to be underrepresented in their groups (Supplementary Table 2). CGU was also found to be a preferred codon for YHV, GAV and PvNV.
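The observed-to-expected ratio described above can be sketched as follows. This is not the EMBOSS compseq programme used in the study; it is an illustrative calculation in which the expected frequency of a dinucleotide XpY is taken as the product of the frequencies of X and Y in the same sequence, and the function name is hypothetical.

```python
from collections import Counter


def dinucleotide_odds(seq):
    """Observed/expected ratio for each dinucleotide with non-zero expectation."""
    seq = seq.upper().replace("U", "T")
    n = len(seq)
    base_freq = {b: seq.count(b) / n for b in "ACGT"}
    pairs = Counter(seq[i:i + 2] for i in range(n - 1))
    total_pairs = n - 1

    odds = {}
    for x in "ACGT":
        for y in "ACGT":
            expected = base_freq[x] * base_freq[y]
            if expected > 0:
                observed = pairs[x + y] / total_pairs
                odds[x + y] = observed / expected
    return odds
```

A ratio below one, as reported for CpG (key `"CG"`) and UpA (key `"TA"`), means the dinucleotide occurs less often than its base composition alone would predict; a ratio above one indicates overrepresentation.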
Large DNA viruses such as WSSV and PmNV, with genome sizes of approximately 300 kb and 125 kb, respectively, encode hundreds of proteins. These proteins play an important role in viral pathogenicity, survival and replication [29, 33, 35]. Except for IMNV (genome size, ~26 kb), small shrimp viruses have genomes of 10 kb or less, and these small viruses encode two to five proteins. In comparison to large viruses, small viruses are more dependent on host machinery for their survival and replication. This may lead to different patterns of base composition and codon usage in small and large viruses [25]. The majority of these viruses use deoptimized codons (ones that occur at low frequency in the host genome), and only a few preferred codons (1 to 8) are common between individual viruses and their hosts. On a whole-genome basis, somewhat antagonistic trends were observed between codon usage patterns of these shrimp viruses and their host. Antagonistic, coincident, or a mixture of both types of codon usage between the virus and its host has also been reported for hepatitis A virus, poliovirus, and chikungunya virus [5, 7, 19]. Previously, it was reported that the whole-genome codon usage pattern of PmNV, an important shrimp pathogen, was antagonistic to its host but that forces of natural selection were able to overcome this antagonism in some genes [28]. Coincidence between viral and host codon usage may lead to improved translation efficiency, whereas antagonism may result in slow viral mRNA translation and viral replication. It has been reported that certain viruses such as hepatitis A virus are unable to shut down the synthesis of host proteins. Thus, in order to synthesize its own proteins, hepatitis A virus must compete with the host for the cellular translational machinery and therefore uses deoptimized codons to avoid competition with the host for cellular tRNAs. 
This strategy results in slow synthesis of viral proteins, including those involved in RNA replication. This low rate of translation and RNA replication might also help hepatitis A virus to grow slowly and thereby avoid host defences [21]. The presence of deoptimized codons and a host-antagonistic codon usage pattern in shrimp viruses suggests that these viruses may also use similar strategies to replicate and to avoid host defences. In our analysis, the CAI metric also revealed that, in spite of overall antagonistic trends in shrimp viruses, some genes showed significant adaptation towards the host’s codon usage pattern. Replicase, capsid and non-structural protein genes from these viruses had CAI values that were higher than their respective eCAI values, suggesting that forces of natural selection were able to have a significant impact on codon usage of these genes. Moreover, gene plots obtained by COA also showed specific clustering based on the type of gene rather than the virus type. It was observed that similar genes from different geographical isolates of the same virus formed a unique cluster that was distinctly separated from other genes of the same virus. These observations suggest that, during evolution, shrimp viruses have been subjected to gene-specific selection pressures, resulting in unique codon usage patterns.
The dinucleotide frequency can have a strong influence on the codon usage bias of DNA and RNA viruses. Thus, for each shrimp virus, the relative abundance of all 16 dinucleotides was calculated as the ratio of their observed and expected frequencies. CpG was found to be consistently underrepresented in all shrimp viruses. In the case of vertebrate viruses, unmethylated CpGs in viral genomes are recognized as a pathogen signature by the host’s pattern-recognition receptors (PRRs), specifically Toll-like receptor 9 (TLR9) [15]. Binding of these CpG motifs to TLR9 triggers an innate immune response in the host [10]. Recently, the existence of a Toll pathway in shrimp has also been suggested [16, 31]. Several Toll-like receptors (TLRs) from shrimp have been cloned and characterized [1, 30, 34]. Studies have also revealed that the Toll pathway in shrimp responds to bacterial [8] and viral [9] infections. These studies have also suggested that, like vertebrate TLRs, shrimp TLRs play an important role in innate immunity through their ability to recognize microbe-associated molecular patterns [31]. Thus, it is possible that shrimp viruses have evolved with underrepresentation of CpGs to avoid the host immune response. It has been reported previously that CpG bias is limited to small DNA viruses, whereas large DNA viruses show the expected CpG frequencies without any bias [14]. Shackelton et al. [25] explained the lack of CpG bias in large viruses by suggesting that by virtue of encoding large numbers of proteins, these viruses have a higher capacity to interfere with host PRRs. However, we observed the consistent underrepresentation of CpG in both small and large (WSSV, PmNV) shrimp viruses. This observation suggests that both small and large shrimp viruses are quite dependent on the host for replication/survival and might not have much ability to interfere with host PRRs. However, this hypothesis needs to be tested in further studies.
In spite of the significant AT/AU bias in the majority of shrimp viruses, UpA dinucleotides were also found to be underrepresented in all of them. The underrepresentation of UpA has also been observed in other genomes, including those of vertebrates, invertebrates, plants, and prokaryotes [4]. The susceptibility of UpA uracils to the host’s RNase has been suggested as one of the reasons for the underrepresentation of this particular dinucleotide. Moreover, UpA is also present in two of the three stop codons [3]. Thus, underrepresentation of UpA in viral genomes might be helpful in mitigating the risk of nonsense mutations resulting in incomplete proteins. Similar to other studies [5, 18], overrepresentation of CpA and UpG was also observed in shrimp viruses, but the significance of this observation is unknown to us. The data suggest that shrimp virus genomes have been subjected to selective pressure during evolution, leading to an alteration of their dinucleotide frequencies and corresponding codon usage patterns. Finally, this study suggests that the codon usage biases in shrimp viruses are due to the interrelationship between the genome composition, selective constraints in the form of translational efficiency, and the need to escape the host immune response.
References
Arts JA, Cornelissen FH, Cijsouw T, Hermsen T, Savelkoul HF, Stet RJ (2007) Molecular cloning and expression of a Toll receptor in the giant tiger shrimp, Penaeus monodon. Fish Shellfish Immunol 23:504–513
Belalov IS, Lukashev AN (2013) Causes and implications of codon usage bias in RNA viruses. PLoS One 8:e56642
Beutler E, Gelbart T, Han JH, Koziol JA, Beutler B (1989) Evolution of the genome and the genetic code: selection at the dinucleotide level by methylation and polyribonucleotide cleavage. Proc Natl Acad Sci USA 86:192–196
Burge C, Campbell AM, Karlin S (1992) Over- and under-representation of short oligonucleotides in DNA sequences. Proc Natl Acad Sci USA 89:1358–1362
Butt AM, Nasrullah I, Tong Y (2014) Genome-wide analysis of codon usage and influencing factors in chikungunya viruses. PLoS One 9:e90905
Cheng X, Virk N, Chen W, Ji S, Ji S, Sun Y, Wu X (2013) CpG usage in RNA viruses: data and hypotheses. PLoS One 8:e74109
D'Andrea L, Pinto RM, Bosch A, Musto H, Cristina J (2011) A detailed comparative analysis on the overall codon usage patterns in hepatitis A virus. Virus Res 157:19–24
Dechamma MM, Rejeish M, Maiti B, Mani MK, Karunasagar I (2015) Expression of Toll-like receptors (TLR), in lymphoid organ of black tiger shrimp (Penaeus monodon) in response to Vibrio harveyi infection. Aquac Rep 1:1–4
Deepika A, Sreedharan K, Paria A, Makesh M, Rajendran KV (2014) Toll-pathway in tiger shrimp (Penaeus monodon) responds to white spot syndrome virus infection: evidence through molecular characterisation and expression profiles of MyD88, TRAF6 and TLR genes. Fish Shellfish Immunol 41:441–454
Dorn A, Kippenberger S (2008) Clinical application of CpG-, non-CpG-, and antisense oligodeoxynucleotides as immunomodulators. Curr Opin Mol Ther 10:10–20
Flegel TW, Lightner DV, Lo CF, Owens L (2008) Shrimp disease control: past, present and future. In: Bondad-Reantaso MG, Mohan CV, Crumlish M, Subasinghe RP (eds) Diseases in Asian aquaculture. Fish Health Section, Asian Fisheries Society, Manila, pp 355–378
Grantham R, Gautier C, Gouy M, Mercier R, Pave A (1980) Codon catalog usage and the genome hypothesis. Nucleic Acids Res 8:r49–r62
Hershberg R, Petrov DA (2008) Selection on codon bias. Annu Rev Genet 42:287–299
Karlin S, Doerfler W, Cardon LR (1994) Why is CpG suppressed in the genomes of virtually all small eukaryotic viruses but not in those of large eukaryotic viruses? J Virol 68:2889–2897
Krieg AM (2003) CpG DNA: trigger of sepsis, mediator of protection, or both? Scand J Infect Dis 35:653–659
Li F, Xiang J (2013) Signaling pathways regulating innate immune responses in shrimp. Fish Shellfish Immunol 34:973–980
Lightner DV (2011) Virus diseases of farmed shrimp in the Western Hemisphere (the Americas): a review. J Invertebr Pathol 106:110–130
Moratorio G, Iriarte A, Moreno P, Musto H, Cristina J (2013) A detailed comparative analysis on the overall codon usage patterns in West Nile virus. Infect Genet Evol 14:396–400
Mueller S, Papamichail D, Coleman JR, Skiena S, Wimmer E (2006) Reduction of the rate of poliovirus protein synthesis through large-scale codon deoptimization causes attenuation of viral virulence by lowering specific infectivity. J Virol 80:9687–9696
Peden J (1999) Analysis of codon usage. Department of Genetics, University of Nottingham, Nottingham
Pinto RM, Aragones L, Costafreda MI, Ribes E, Bosch A (2007) Codon usage and replicative strategies of hepatitis A virus. Virus Res 127:158–163
Puigbo P, Bravo IG, Garcia-Vallve S (2008) E-CAI: a novel server to estimate an expected value of codon adaptation index (eCAI). BMC Bioinform 9:65
Puigbo P, Bravo IG, Garcia-Vallve S (2008) CAIcal: a combined set of tools to assess codon usage adaptation. Biol Direct 3:38
Rice P, Longden I, Bleasby A (2000) EMBOSS: the European molecular biology open software suite. Trends Genet 16:276–277
Shackelton LA, Parrish CR, Holmes EC (2006) Evolutionary basis of codon usage and nucleotide composition bias in vertebrate DNA viruses. J Mol Evol 62:551–563
Sharp PM, Li WH (1987) The codon adaptation index—a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res 15:1281–1295
Singh NK, Tyagi A, Kaur R, Verma R, Gupta PK (2016) Characterization of codon usage pattern and influencing factors in Japanese encephalitis virus. Virus Res 221:58–65
Tyagi A, Singh NK, Gurtler V, Karunasagar I (2016) Bioinformatics analysis of codon usage patterns and influencing factors in Penaeus monodon nudivirus. Arch Virol 161:459–464
van Hulten MC, Witteveldt J, Peters S, Kloosterboer N, Tarchini R, Fiers M, Sandbrink H, Lankhorst RK, Vlak JM (2001) The white spot syndrome virus DNA genome sequence. Virology 286:7–22
Wang PH, Liang JP, Gu ZH, Wan DH, Weng SP, Yu XQ, He JG (2012) Molecular cloning, characterization and expression analysis of two novel Tolls (LvToll2 and LvToll3) and three putative Spatzle-like Toll ligands (LvSpz1-3) from Litopenaeus vannamei. Dev Comp Immunol 36:359–371
Wang XW, Wang JX (2013) Pattern recognition receptors acting in innate immune system of shrimp against pathogen infections. Fish Shellfish Immunol 34:981–989
Wright F (1990) The ‘effective number of codons’ used in a gene. Gene 87:23–29
Yang F, He J, Lin X, Li Q, Pan D, Zhang X, Xu X (2001) Complete genome sequence of the shrimp white spot bacilliform virus. J Virol 75:11811–11820
Yang LS, Yin ZX, Liao JX, Huang XD, Guo CJ, Weng SP, Chan SM, Yu XQ, He JG (2007) A Toll receptor in shrimp. Mol Immunol 44:1999–2008
Yang YT, Lee DY, Wang Y, Hu JM, Li WH, Leu JH, Chang GD, Ke HM, Kang ST, Lin SS, Kou GH, Lo CF (2014) The genome and occlusion bodies of marine Penaeus monodon nudivirus (PmNV, also known as MBV and PemoNPV) suggest that it should be assigned to a new nudivirus genus that is distinct from the terrestrial nudiviruses. BMC Genom 15:628
Acknowledgements
We are grateful to the Dean, College of Fisheries, Guru Angad Dev Veterinary & Animal Sciences University, Ludhiana, India, for providing the facilities and support required for this study. The Dell T630 data analysis server used in this study was purchased from Science & Engineering Research Board, Department of Science & Technology (DST-SERB) Young Scientist Start Up Research Grant (YSS/2014/000269).
Ethics declarations
Funding
The authors declare that no funding support was received for this study.
Conflict of interest
All authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Cite this article
Tyagi, A., Kumar, B.T.N. & Singh, N.K. Genome dynamics and evolution of codon usage patterns in shrimp viruses. Arch Virol 162, 3137–3142 (2017). https://doi.org/10.1007/s00705-017-3445-7
DOI: https://doi.org/10.1007/s00705-017-3445-7