Background

Varicella-zoster virus (VZV) is an alpha-herpesvirus and the cause of chickenpox (varicella) and shingles (zoster). Chickenpox results from primary infection, is characterized by fever and generalized rash, and is most prevalent in children. VZV can establish a latent infection in nerve cells of dorsal root ganglia, and its reactivation from latency causes shingles in older adults and in immunocompromised people.

Isolation and propagation of VZV in cell culture was first reported in 1953 [1], and the first complete nucleotide sequence was determined for the Dumas strain [2]. As of August 2010, complete nucleotide sequences were available in the NCBI GenBank database for 23 VZV strains, including three vaccine strains derived from the Oka strain. Comparison of the full nucleotide sequences of clinical and vaccine strains has enabled researchers to suggest putative regions that might be responsible for attenuation in vaccine strains [3-6].

In Korea, the pharmaceutical company GCC has been manufacturing an attenuated VZV vaccine for chickenpox since 1994. The live-attenuated vaccine strain, SuduVax®, was obtained through serial passage of wild-type virus in cell culture. The original wild-type virus was isolated in primary human embryonic lung (HEL) cell culture from a 33-month-old boy with chickenpox in 1989 in Seoul, Korea [7]. The virus was attenuated by 10 passages in HEL cells, 12 passages in guinea pig embryonic lung cells, and a further five passages in HEL cells to prepare an attenuated strain, designated MAV06, for vaccine production [8]. The attenuated viruses were stored in liquid nitrogen as master virus banks. Working virus banks are routinely produced by five passages of master virus bank stocks in HEL cells. The final vaccine (SuduVax) is manufactured after passaging the working virus bank five times in HEL cells.

SuduVax has been marketed in Korea since 1994 and internationally since 1998. Although the efficacy and safety of SuduVax have been proved in the marketplace, molecular studies explaining the mechanism of attenuation or the efficacy of the vaccine have not been available. In this study, the complete nucleotide sequence of SuduVax was determined and compared with those of 23 VZV strains whose full genomic sequences are registered in the NCBI GenBank database.

Results

Overall genome structure of the Korean vaccine strain SuduVax

The genome of the VZV strain SuduVax was determined to be 124,759 bp. The architecture of the SuduVax genome is typical of VZV in that the genome can be divided into TRL, UL, IRL, IRS, US and TRS (88, 104,799, 88, 7,276, 5,232 and 7,276 bp, respectively). The G + C content of the SuduVax genome is approximately 46.1%. The genome length, the length of each region and the G + C content are very similar among the 24 VZV strains analyzed in this study (Table 1). The SuduVax genome contains 74 ORFs. Of these, 64 are UL genes and four are US genes. Three genes in IRS (ORFs 62-64) are inversely repeated in TRS (ORFs 69-71). Of the 74 ORFs, 39 are in the forward direction and 35 are in the reverse direction. The directions of the ORFs are 100% conserved among the analyzed VZV strains. The ORF map of strain SuduVax is presented in Figure 1.
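The region lengths and ORF counts reported above can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the numbers stated in this section; the variable names are illustrative and not part of the original analysis.

```python
# Region lengths (bp) as reported for the SuduVax genome.
regions = {"TRL": 88, "UL": 104_799, "IRL": 88, "IRS": 7_276, "US": 5_232, "TRS": 7_276}
genome_length = sum(regions.values())

# ORF counts by location: UL, US, IRS, and the inverted copies in TRS.
orf_counts = {"UL": 64, "US": 4, "IRS": 3, "TRS": 3}
total_orfs = sum(orf_counts.values())

# ORF counts by orientation.
orientation = {"forward": 39, "reverse": 35}

print(genome_length)              # 124759
print(total_orfs)                 # 74
print(sum(orientation.values()))  # 74
```

The six regions sum to the reported genome length, and both ways of partitioning the ORFs give 74.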

Table 1 Information of the VZV strains analyzed in this study
Figure 1. ORF map of the VZV strain SuduVax. The direction of the arrows indicates the direction of transcription.

Phylogenetic analysis

Phylogenetic trees were constructed using the full nucleotide sequences of SuduVax and the 23 VZV strains whose full genomic DNA sequences are known. In an unrooted tree generated by the maximum-likelihood method, SuduVax and four Oka strains (pOka, vOka, VarilRix, VariVax) formed a clade, and strains M2DR and 8 formed an adjacent clade (Figure 2a). These two clades were joined with a clade whose only member was strain CA123. Strains 11, 22, 03-500 and HJ0 formed another clade, and the rest of the clinical strains formed the last clade. An almost identical topology was observed in a tree generated by the neighbour-joining method (data not shown) and in a tree generated by the Bayesian method [9]. SuduVax together with the Oka strains formed a distinctive clade, corresponding to clade 2 proposed by the VZV Nomenclature Meeting 2008 [10]. When trees were constructed with concatenated coding nucleotide sequences (ORFs) or amino acid sequences, similar tree topologies were obtained (data not shown). Next, we built phylogenetic trees using non-coding sequences. Again, SuduVax grouped with the four Oka strains, forming clade 2 (Figure 2b). One notable difference between the trees built from full or coding sequences and the tree built from non-coding sequences was the location of pOka, the parental Oka strain from which the vaccine strain vOka was derived. While pOka was located between the four vaccine strains and the 19 clinical strains in the trees built from full or coding sequences, pOka was buried among the vaccine strains in the tree built from non-coding sequences (compare Figures 2a, b). In other words, the four vaccine strains (vOka, VarilRix, VariVax and SuduVax) formed a subclade within clade 2 in the trees built from full or coding sequences (bootstrap value = 1,000 in neighbour-joining trees), but not in the tree built from non-coding sequences.

Figure 2. Phylogenetic analysis of 24 VZV strains. Nucleotide or amino acid sequences were multiple-aligned using the ClustalW program (ver 2.0.1) and the resulting *.phy files were used to construct phylogenetic trees using the maximum-likelihood (ML) or neighbor-joining (NJ) methods in the Phylip package (version 3.69). (a) ML tree based on full nucleotide sequences. (b) ML tree based on non-coding sequences. (c) NJ tree based on the nucleotide sequences of ORF62, showing clear separation of vaccine strains from pOka within clade 2. (d) NJ tree based on the nucleotide sequences of ORF1. Vaccine strains are separated from clinical strains, but formation of clade 2 is not evident.

To find which ORFs are important in distinguishing vaccine strains from clinical strains, further phylogenetic analyses using individual ORFs were performed. Of the 74 phylogenetic trees, 12 ORF trees exhibited clear branches leading to the formation of clusters consisting of vaccine strains. These 12 ORFs were ORFs 0, 1, 6, 18, 31, 35, 39, 59, 62, 64, 69 and 71 (Figure 2c). The bootstrap values for the vaccine clusters were greater than 640. In the majority of ORF trees, vaccine clusters formed subclades within clade 2. However, in phylogenetic trees based on ORFs 1, 18, 39 and 59, branches leading to clade 2 were either absent or very short with low bootstrap values (Figure 2d). Thus, the vaccine strains did not always form a subclade within clade 2.

Evolutionary relationships between the Korean vaccine strain SuduVax and other VZV strains were investigated by calculating genetic distances among the 24 VZV strains. As a whole, VZV genome sequences were highly conserved among the strains. At the level of full nucleotide sequences, SuduVax was most similar to VarilRix, followed by vOka, VariVax and pOka (Table 2). Similar results were obtained when the genetic distances were calculated using concatenated non-coding nucleotide sequences or amino acid sequences. The average distance between SuduVax and the three vaccine strains at the full nucleotide level was (0.20 ± 0.05) × 10⁻³, which was < 10% of the average distance between SuduVax and the 20 clinical strains ((2.08 ± 0.39) × 10⁻³, Table 2). Among the clinical strains other than pOka, strain 8 was the most similar to SuduVax.

Table 2 Genetic distances between SuduVax and other VZV strains

Mutations found in SuduVax ORFs

SuduVax ORF0 exists as a longer form due to a read-through mutation. The stop codon TGA (nucleotide positions 388-390) was mutated to CGA, coding for Arg. A putative stop codon TGA was found downstream, overlapping with ORF1 (Figure 3). This extended ORF0 encodes a new protein of 221 amino acid residues. The same read-through mutation was found in the other vaccine strains, vOka, VarilRix and VariVax. All clinical strains, including pOka, contained a 390-bp ORF0 coding for 129 amino acids.
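The effect of such a read-through mutation can be illustrated with a short translation routine. The sketch below is only an illustration: the two toy sequences are hypothetical and are not the real ORF0, and the function name is our own. It shows how a T-to-C change in the first base of a TGA stop codon (giving CGA, Arg) extends translation to the next in-frame stop. By the same logic, a 390-bp ORF corresponds to 129 codons plus one stop codon.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate_to_stop(dna: str) -> str:
    """Translate from the first codon until the first in-frame stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Toy sequences (hypothetical, NOT the real ORF0). They differ only by
# T -> C in the first base of the original stop codon (TGA -> CGA).
wild_type = "ATGGCTTCT" + "TGA" + "GGTAAACTG" + "TGA"
read_through = "ATGGCTTCT" + "CGA" + "GGTAAACTG" + "TGA"

print(translate_to_stop(wild_type))     # MAS
print(translate_to_stop(read_through))  # MASRGKL

# A 390-bp ORF holds 130 codons, the last of which is the stop codon.
print(390 // 3 - 1)                     # 129
```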

Figure 3. Read-through mutation in ORF0 of SuduVax and Oka vaccine strains. ORF0 sequences of 24 VZV strains were extracted and aligned using the ClustalW program. The T388C substitution and the putative new downstream stop codon TGA are shaded.

Compared to the reference strain Dumas, ORF17 and ORF56 of strain SuduVax were each 3 bp shorter due to deletion of TCA at positions 367 to 369 and TCT at positions 658 to 660, respectively. Both deletions resulted in the loss of a serine (S) residue. On the other hand, an insertion of three nucleotides (ATG) at position 27 was found in ORF60 of strain SuduVax. Interestingly, these two deletions and one insertion were also found in all Oka strains, including pOka. SuduVax and the Oka strains were also found to have an ORF29 that is 15 bp (AACATTTCAGGGTCA) shorter than that of most clinical isolates, which contain two tandem reiterations of this 15-bp sequence. Among the clinical strains, M2DR, CA123 and 8 contained only one copy of the 15-bp element in ORF29. Strains M2DR and 8 shared the same ORF60 length as the Oka and SuduVax strains. Table 3 summarizes the insertion and deletion mutations found in SuduVax.
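Counting tandem copies of the 15-bp ORF29 element is straightforward to automate. The following is a minimal sketch, assuming plain uppercase DNA strings; the flanking sequences are hypothetical placeholders used only to make the example runnable, and the function name is our own.

```python
import re

ELEMENT = "AACATTTCAGGGTCA"  # 15-bp element reported in ORF29


def max_tandem_copies(seq: str, unit: str = ELEMENT) -> int:
    """Return the largest number of back-to-back copies of `unit` in `seq`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


# Hypothetical flanking sequence, NOT taken from any VZV genome.
flank5, flank3 = "GGCCTG", "TTGACC"
one_copy = flank5 + ELEMENT + flank3        # vaccine-like pattern: one copy
two_copies = flank5 + ELEMENT * 2 + flank3  # pattern of most clinical isolates

print(max_tandem_copies(one_copy))      # 1
print(max_tandem_copies(two_copies))    # 2
print(len(two_copies) - len(one_copy))  # 15
```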

Table 3 Deletions and insertion found in SuduVax

Discussion

VZV strain SuduVax has been used by a Korean pharmaceutical company to produce a live attenuated vaccine for chickenpox since 1994. Although its efficacy and safety have been proven in the marketplace, the molecular characteristics of the vaccine strain have not been available. In this study, the nucleotide sequence of the Korean varicella vaccine strain SuduVax was determined and analyzed.

In the original paper on the first complete sequencing of VZV strain Dumas [2], 71 ORFs were proposed. However, the information obtained from the NCBI GenBank database for Dumas (NC_001348) identifies 73 ORFs if the three ORFs located in TRS are counted as separate ORFs. Sequencing of two Oka-derived vaccine strains, VarilRix (DQ008354) and VariVax (DQ008355), identified 72 ORFs [5]. A Blast search using these three strains as queries produced 74 possible ORFs for VZV. We were able to locate ORF45 in Dumas (positions 81,523-82,593) and ORF33.5 in VarilRix (positions 60,257-61,165) and VariVax (positions 60,254-61,162). An extended form of ORF0 due to a read-through mutation was identified in SuduVax as well as in the Oka vaccine strains (see below). Using the reference strains Dumas and VarilRix as queries, we were able to identify and locate 74 ORFs in the genome of strain SuduVax as well as in the other 23 VZV strains analyzed in this study.

Phylogenetic analysis using the full nucleotide sequences of 24 VZV strains identified five distinct clades, consistent with previous findings [9, 10]. Phylogenetic trees constructed with concatenated amino acid sequences and coding nucleotide sequences also revealed five clades with the same members. The tree built using non-coding nucleotide sequences appeared similar to the other trees, except that strains 8 and M2DR did not form a clear clade as in the other trees. SuduVax co-clustered with the Oka strains in clade 2, which consisted exclusively of isolates from Japan and Korea. SuduVax shares the minimum complement of single nucleotide polymorphisms at 27 positions [10] with the other members of clade 2. Various genotyping methods using limited genetic information of VZV strains have been shown to represent genotyping using full genome information [11-15]. Every genotyping method unequivocally placed SuduVax in the same genogroup as the Oka strains, as in the phylogenetic trees based on full or near-full genetic information (data not shown).

It is not presently certain, because of the lack of full genome sequences from other Asian isolates, whether this clade 2 could be extended to include isolates from other Asian countries or whether it is confined to isolates from Japan and Korea only. However, available data based on partial nucleotide sequences or restriction fragment length polymorphism suggest that all Korean isolates and Chinese isolates form a clade with Japanese isolates [16, 17]. Thus, it is possible that the clade 2 could be extended to include China, which is geographically close to Japan and Korea.

Coding sequences occupy approximately 91% of the VZV genome and therefore carry most of the sequence information of the whole genome. Thus, the phylogenetic trees based on coding sequences were expected to be very similar to the trees based on the full nucleotide sequences, and we found that the coding sequence trees and amino acid trees were indeed similar to the full nucleotide trees. Noncoding sequences are interspersed between the coding sequences (ORFs) and account for approximately 9% of the VZV genome. The phylogenetic trees based on VZV noncoding sequences were broadly similar to those based on full or coding nucleotide sequences or amino acid sequences, with one notable difference: the location of pOka within clade 2. In the full or coding sequence trees, pOka was separated from the four vaccine strains, so that two independent subclades formed within clade 2. In contrast, pOka did not form a subclade separate from the vaccine strains in the noncoding sequence trees. Because pOka is a clinical strain, coding sequences or amino acid sequences of the VZV genome may provide information distinguishing vaccine strains from clinical strains, while noncoding sequences do not.

Phylogenetic analyses using the nucleotide sequences of individual ORFs suggested that 12 ORFs may be important in distinguishing vaccine strains from clinical strains. Yamanishi identified 23 ORFs that differ between pOka and the Oka vaccine [6], including the 12 ORFs identified in this study. Moreover, our preliminary studies of single nucleotide polymorphisms among the full genomic DNA sequences of the 24 VZV strains revealed 12 ORFs that may be characteristic of vaccine strains, and these coincide with the above-mentioned 12 ORFs [manuscript in preparation].

ORF0, also known as ORFS/L, is thought to be essential for VZV growth and encodes a membrane protein with 129 amino acid residues, which is possibly involved in vesicular trafficking and altering cell adhesion molecules in infected cells [18, 19]. ORF0 in SuduVax was determined to possess an extended C-terminal sequence due to a read-through mutation of its original stop codon TGA to CGA coding for Arg. The nearest downstream stop codon TGA was found to overlap with ORF1 and the extended ORF0 is expected to code for a new protein with 221 amino acid residues. Interestingly, this read-through mutation was also found in the three Oka-derived vaccine strains, while the stop codons were found to be unaltered in all of the clinical strains including the parent Oka strain. In cells infected with vOka, the extended form of ORF0 protein with 221 amino acid residues and its spliced form with 155 amino acid residues are expressed [20]. Since other vaccine strains, including SuduVax, share 100% identical nucleotide sequences within and downstream of ORF0 up to the new stop codon, both forms of the extended ORF0 proteins are expected to be expressed in permissive cells infected with SuduVax. Thus, read-through mutation in ORF0 might be an important feature distinguishing vaccine strains from clinical strains.

Besides the read-through mutation in ORF0, SuduVax shares the same mutational events in ORFs 17, 29, 56 and 60 with the Oka strains. ORF17 codes for an mRNA-specific RNase [21] and ORF29 encodes a protein that binds single-stranded DNA via its zinc-finger domain [22]. The function of ORF56 has not been well characterized, but its gene product is reported to co-localize with regulatory protein ICP22 and nuclear protein UL3 in small, dense nuclear bodies (NCBI, http://www.ncbi.nlm.nih.gov/pubmed?Db=gene&Cmd=retrieve&dopt=full_report&list_uids=1487683). The gene product of ORF60 is glycoprotein L, which acts as a chaperone for glycoprotein H [23]. Three-bp deletions were found in ORFs 17 and 56, and a 3-bp insertion was found in ORF60. Most of the clinical strains contain two tandem copies of a 15-bp element (AACATTTCAGGGTCA) in ORF29, while SuduVax and the Oka strains contain only one copy. Of these four deletion and insertion events, two (ORFs 29 and 60) are shared with the clinical strains 8 and M2DR, and one (ORF29) is also found in strain CA123. Since these deletion and insertion events are also found in some of the clinical strains, including pOka, they may not by themselves be important in attenuation, although it is still possible that, in combination with other events such as the read-through mutation in ORF0, they play some role in the attenuation of vaccine strains.

Conclusion

We obtained and analyzed the full nucleotide sequence of the Korean vaccine strain SuduVax. SuduVax was shown to be genetically most similar to the Oka-derived vaccine strains. We are now comparing the SuduVax nucleotide and amino acid sequences with those of other vaccine and clinical strains. Further comparative genomic and bioinformatics analyses will help to elucidate the molecular basis of the attenuation of the VZV vaccine strains.

Materials and methods

Virus and DNA sequencing

DNA of the VZV strain SuduVax was extracted from commercial vials of SuduVax™ with the QIAamp DNA Mini Kit (QIAGEN) at a concentration of 5.5 μg/100 μL. The DNA sequence was determined by high-throughput sequencing using a Genome Sequencer FLX Titanium System (Roche Diagnostics), serviced by Macrogen. Sequence fragments (n = 23,722) with an average length of approximately 400 bp were obtained, and these were assembled and viewed using the Consed program (http://bozeman.mbt.washington.edu/consed/consed.html). The average quality of the sequence fragments was more than 99.99%. A total of 99.38% of the 124,759 sequences aligned with the derived consensus sequence and the average coverage was 83 reads per nucleotide. These were aligned against the reference strain Dumas (NC_001348) and the vaccine strain VarilRix (DQ008354). The gaps between the contigs were filled by polymerase chain reaction sequencing using primers whose sequences were obtained from the adjacent contigs. The completed sequence was deposited in NCBI GenBank (accession number JF306641).

Allocation of ORFs

ORFs of strain SuduVax were located in the full genome sequence by Blast search against the two reference strains Dumas (NC_001348) and VarilRix (DQ008354). Coding sequences (CDSs) of the reference strains were extracted using the FeatureExtract 1.2 Server program (http://www.cbs.dtu.dk/services/FeatureExtract/) and used as queries. The resulting data included the first and last nucleotide positions of each ORF in the SuduVax genome and the direction of the ORFs. The ORF information was verified by ORF finding programs such as CLC Sequence Viewer (version 6.4, http://www.clcbio.com/index.php) and ORF Finder provided by NCBI. When the results of the Blast search did not coincide with those of the ORF finding programs, the nucleotide sequences of the corresponding ORFs were examined with BioEdit Sequence Alignment Editor (Department of Microbiology, North Carolina State University, version 7.0.5.3, http://www.mbio.ncsu.edu/BioEdit/bioedit.html) and manually edited to locate the positions of the start and stop codons. Finally, all the allocated ORFs were confirmed by identification of the translated amino acid sequences.
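The coordinate-and-direction records described above can be turned back into ORF sequences and sanity-checked in a few lines. The following is a minimal sketch under stated assumptions: coordinates are 1-based and inclusive, reverse-direction ORFs are read as the reverse complement, and an ORF is expected to start with ATG and end with a stop codon. The toy genome and the function names are hypothetical and are not part of the tools used in this study.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def extract_orf(genome: str, first: int, last: int, forward: bool) -> str:
    """Return the ORF for 1-based inclusive coordinates on the given strand."""
    segment = genome[first - 1:last]
    if forward:
        return segment
    # Reverse-direction ORF: complement each base, then reverse the string.
    return segment.translate(COMPLEMENT)[::-1]


def looks_like_orf(orf: str) -> bool:
    """Check frame length, ATG start, terminal stop, and no internal stop."""
    if len(orf) % 3 != 0 or not orf.startswith("ATG") or orf[-3:] not in STOP_CODONS:
        return False
    internal = (orf[i:i + 3] for i in range(3, len(orf) - 3, 3))
    return all(codon not in STOP_CODONS for codon in internal)


# Toy genome (hypothetical): one forward ORF at 3-11, one reverse ORF at 14-25.
genome = "CC" + "ATGGCTTGA" + "GG" + "TTAGGGTTTCAT" + "AA"

forward_orf = extract_orf(genome, 3, 11, forward=True)
reverse_orf = extract_orf(genome, 14, 25, forward=False)

print(forward_orf, looks_like_orf(forward_orf))  # ATGGCTTGA True
print(reverse_orf, looks_like_orf(reverse_orf))  # ATGAAACCCTAA True
```

A check like this only flags candidates for manual inspection; it does not replace the Blast-based allocation or the confirmation by translated amino acid sequence.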

Phylogenetic analysis

Full-genome nucleotide sequences of the VZV strains other than SuduVax were obtained directly from the GenBank database (Table 1). For each VZV strain, all ORF sequences were cut and pasted to generate a concatenated coding sequence. Similarly, all inter-ORF sequences were cut and pasted to build a concatenated noncoding sequence. Amino acid sequences were obtained by translation of the corresponding ORFs and pasted to generate a concatenated sequence harbouring all 74 ORFs. The full nucleotide, concatenated nucleotide and concatenated amino acid sequences of the 24 VZV strains were multiple-aligned using the ClustalW program (ver 2.0.1), followed by manual editing. The resulting out-files were used to calculate genetic distances using the Dnadist (for nucleotide) or Protdist (for amino acid) program included in the Phylip package (version 3.69, http://evolution.genetics.washington.edu/phylip.html). Distance matrices were obtained using the Kimura 2-parameter model for nucleotide sequences or the Jones-Taylor-Thornton model for amino acid sequences. Cluster analysis was performed by the neighbour-joining (NJ) and maximum-likelihood (ML) methods and the resulting tree files were viewed with the Treeview program (version 1.6.6). The significance of the phylogenetic trees was verified by bootstrap analysis. Phylogenetic trees were constructed from 1,000 replicates generated by the Seqboot program and the consensus tree was identified by the Consense program.
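For readers unfamiliar with the Kimura 2-parameter distance, the sketch below implements the textbook closed-form estimate, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences between two aligned sequences. This is an illustration only: Phylip's Dnadist uses its own implementation and settings, so its output is not guaranteed to match this formula numerically, and the function name and toy sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance for two aligned sequences (gaps skipped)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    if transitions == 0 and transversions == 0:
        return 0.0
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy alignment (hypothetical): 20 sites, one transition and one transversion.
seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "GAGTACGTACGTACGTACGT"

print(k2p_distance(seq_a, seq_a))  # 0.0
print(k2p_distance(seq_a, seq_b) > 2 / 20)  # True: corrected distance exceeds the raw p-distance
```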