Background

Hepatitis C virus (HCV) is estimated to infect 170 million people worldwide and creates a huge disease burden from chronic, progressive liver disease [1]. HCV has become a major cause of liver cancer and one of the commonest indications for liver transplantation [2, 3].

HCV has been classified in the family Flaviviridae, although it differs from other members of the family in many details of its genome organization [1]. Like most RNA viruses, HCV circulates in vivo as a complex population of different but closely related viral variants, commonly referred to as a quasispecies [4–7].

HCV is an enveloped virus with an RNA genome of approximately 9400 nucleotides in length [8, 9]. Comparison of nucleotide sequences of variants recovered from different individuals and geographical regions has revealed the existence of at least six major genetic groups [1, 10–12]. Each of the six major genetic groups of HCV contains a series of more closely related sub-types [1].

Interferon monotherapy provided the first hope for patients with chronic hepatitis C that the virus could be permanently eradicated. An important development in treating this disease was the recognition that the effects of interferon could be greatly enhanced by combining it with ribavirin, a nucleoside analogue. This combination regimen essentially doubled the sustained virological response rates seen with interferon alone. Recently, pegylated forms of interferon have been developed, and when used in combination with ribavirin they demonstrate even better efficacy. For that reason, peginterferon alfa-2a and peginterferon alfa-2b are the latest innovations for the treatment of chronic hepatitis C [3, 13, 14].

Recombination plays a significant role in the evolution of RNA viruses by creating genetic variation. For example, the frequent recovery of poliovirus that results from recombination has the potential to produce "escape mutants" in nature as well as in experiments [15]. Recombination has also been detected in other RNA viruses for which multivalent vaccines are in use or in trials [16, 17].

Recently, a natural intergenotypic recombinant (2k/1b) of HCV has been identified in Saint Petersburg (Russia) [18, 19]. Phylogenetic analyses of HCV strains circulating in Peru demonstrated the existence of natural intra-genotypic HCV recombinant strains (1a/1b) circulating in the Peruvian population [20]. In these cases, the recombination events have taken place in the non-structural region of the HCV genome. Recombination break-points in the structural capsid genomic region of HCV have also been identified recently [21].

Given the implications that recombination has for RNA virus evolution [16], it is clearly important to determine the extent to which recombination plays a role in the evolution of HCV quasispecies populations in vivo, when patients are undergoing anti-viral therapy.

Results

To gain insight into possible recombination events, a phylogenetic profile analysis was carried out using HCV NS5A sequences from HCV quasispecies populations obtained by Puig-Basagoiti et al. [22] from patients undergoing anti-viral therapy. Sequences were obtained from the LANL database [23] (for patients, strain accession numbers, quasispecies obtained at different time points during therapy and therapy outcome, see Table 1). Phylogenetic profile analysis was done using the SimPlot program [24]. Interestingly, when the analysis was carried out for strain AY378694 (obtained on week 4 in patient No. 7, see Table 1), a recombination point was detected at position 286 of the NS5A sequence alignment and two putative parental-like strains (AY378615 and AY378641, obtained on weeks 0 and 2 from the same patient, respectively) were identified (see Fig. 1 and Table 1).

Table 1 Quasispecies population in HCV patients undergoing anti-viral therapy
Figure 1

Phylogenetic profiles of HCV sequences. Results from SimPlot analysis are shown. The query sequence (AY378694) is indicated on the upper part of the figure. Sequences compared with the query sequence are indicated on the right side of the figure. For each comparison, SimPlot generates a similarity plot using the Kimura two-parameter distance model in a sliding window of 200 nucleotides, moving 20 nucleotides between plots. The y-axis gives the percentage of identity found. Comparison of HCV strain AY378694 with strains AY378615, AY378641 and AY378635 is shown. The red vertical line shows the recombination point at position 286. Red numbers on the bottom part of the figure denote the number of informative sites that support clustering of the query sequence with the respective strains indicated in red on the bottom left side of the figure.

In order to confirm these results, the same sequences were used for a bootscanning study [25]. The basic principle of bootscanning is that mosaicism is suggested when one observes high levels of phylogenetic relatedness between a query sequence and more than one reference sequence [25]. When the putative recombinant strain identified in the previous analysis (AY378694) is used as a query, this pattern is observed for this strain and the two putative parental-like strains previously detected (see Fig. 2). The same recombination break-point position is observed in the bootscanning analysis (see Figs. 1 and 2), confirming a recombination break-point at position 286 of the NS5A of strain AY378694.

Figure 2

Bootscanning of HCV sequences. The query sequence (AY378694) is shown on the upper part of the figure. Sequences compared with the query sequence are indicated on the right side of the figure. For each comparison, SimPlot generates a graph of the percentage of permuted trees obtained using a sliding window of 200 nucleotides, moving 20 nucleotides at a time. The y-axis gives the percentage of permuted trees. This approach makes it possible to observe levels of phylogenetic relatedness between a query sequence and a reference sequence in different genomic regions. The rest is the same as in Fig. 1A.

To assess whether the recombination model we obtained gave a significantly better fit to the data than the null hypothesis of no recombination, we used LARD [26]. The results of these studies are shown in Fig. 3.

Figure 3

Distribution of the likelihood ratios expected by chance. The distribution of likelihood ratios for the null hypothesis (i.e. no recombination) is shown. The y-axis shows the number of simulations. Likelihood ratios are shown at the bottom of the figure. The arrow shows the likelihood ratio obtained for the real dataset for the putative recombinant strain.

As can be seen in the figure, the likelihood ratio obtained for the real dataset exceeded those obtained from simulations of sequence evolution under the null hypothesis (i.e., no recombination), giving strong statistical support for the alternative hypothesis of recombination (P < 0.004, Fig. 3).

Discussion

The extent to which recombination plays a role in the evolution of HCV quasispecies when patients are undergoing anti-viral therapy is currently unknown. In order to gain insight into this issue, we performed phylogenetic studies on the NS5A gene of HCV quasispecies populations from patients undergoing antiviral therapy. We selected NS5A because previous work by Enomoto et al. [27] suggested that the genetic heterogeneity of a specific domain of HCV NS5A, termed the IFN sensitivity-determining region (ISDR), was closely related to response in Japanese patients. Although this issue continues to be controversial [22], analysis of the published information supports the hypothesis that a relationship exists between NS5A and response to therapy [28, 29].

The results of these studies reveal that recombination cannot be ruled out as an evolutionary mechanism for generating diversity in HCV in vivo, in patients undergoing anti-viral therapy (see Figs. 1 and 2). Recombination does not seem to play an extensive role in the evolution of HCV quasispecies populations, at least as judged from the study of NS5A genes, since only one recombinant isolate was observed among all HCV quasispecies populations studied. On the other hand, the true frequency of recombination may be underestimated, since a statistically significant number of differences between recombinant and putative parental-like strains is needed for a recombination event to be detected by current methods. Recombination may serve two opposite purposes: exploration of new combinations of genomic regions from different origins, or rescue of viable genomes from debilitated parental genomes [30]. Interestingly, the recombination break-point identified in recombinant strain AY378694 is situated in the PKR-binding region of NS5A ISDR [27, 31] (see Fig. 4). Recent evidence suggests that HCV NS5A protein can repress PKR function in vivo, possibly allowing HCV to escape the antiviral effects of interferon [1, 3, 32, 33]. An analysis of NS5A translated sequences of the recombinant and putative parental-like viruses suggests the possibility that recombinant isolate AY378694 may have acquired amino acids known to be present in HCV strains resistant to interferon treatment (see Fig. 4), although more work is needed to test this hypothesis, since the results found for the ISDR remain a controversial issue [22]. For that reason, although recombination may not appear to be extensive in NS5A genes of HCV quasispecies populations of patients undergoing antiviral therapy, it should be taken into account as a mechanism of genetic variation for HCV.

Figure 4

Alignment of amino acid sequences of the PKR-binding region and ISDR of recombinant and putative parental-like strains. Strains are indicated by accession number on the left side of the figure. Identity to recombinant strain AY378694 in putative parental-like strains AY378615 and AY378641 is shown by a dash. PKR-binding region and ISDR sequences are shown in bold. The recombination break-point is indicated by an arrow. Amino acids known to be present in interferon-resistant viruses are indicated in red in recombinant strain AY378694 (see refs. [31, 35]).

Conclusion

Only one recombinant strain was detected in all patient quasispecies populations studied. The recombination break-point identified in strain AY378694 is situated in the PKR-binding region of NS5A. The results of these studies reveal that recombination events can be observed in patients undergoing anti-viral therapy. Recombination cannot be ruled out as a mechanism of genetic variation for HCV.

Methods

Patients and strains

Patients and strains referred to in these studies belong to a recent study by Puig-Basagoiti et al. [22]. All patients had genotype 1b infection. Treatment consisted of the administration of IFN-α-2b and ribavirin (see ref. [22]).

Sequences

NS5A sequences from the study by Puig-Basagoiti et al. [22] were obtained from the HCV LANL database [23] (for patient identification, quasispecies accession numbers obtained at each week of therapy and therapy outcome, see Table 1). Sequences were aligned using the CLUSTAL W program [34].

Recombination analysis

Putative recombinant sequences were identified with the SimPlot program [24]. This program is based on a sliding window method and constitutes a way of graphically displaying the coherence of the sequence relationship over the entire length of a set of aligned homologous sequences. The window width and the step size were set to 200 bp and 20 bp, respectively.
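The sliding window calculation described above can be illustrated with a minimal Python sketch. It is not the SimPlot implementation: the function names `k2p_distance` and `sliding_similarity` are hypothetical, the Kimura two-parameter model is taken from the legend of Fig. 1, and reporting similarity as 100 × (1 − distance) at the window midpoint is an assumption made here for illustration.

```python
import math

def k2p_distance(a, b):
    """Kimura two-parameter distance between two aligned nucleotide strings."""
    purines, pyrimidines = {"A", "G"}, {"C", "T"}
    valid = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguous bases
        valid += 1
        if x == y:
            continue
        if {x, y} <= purines or {x, y} <= pyrimidines:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if valid == 0:
        return float("nan")
    p = transitions / valid
    q = transversions / valid
    t1 = 1.0 - 2.0 * p - q
    t2 = 1.0 - 2.0 * q
    if t1 <= 0 or t2 <= 0:
        return float("inf")       # distance undefined for saturated windows
    return -0.5 * math.log(t1) - 0.25 * math.log(t2)

def sliding_similarity(query, reference, window=200, step=20):
    """Return (window midpoint, percent similarity) pairs along the alignment.

    Similarity is reported as 100 * (1 - K2P distance); this scaling is an
    assumption of the sketch, not a documented SimPlot detail.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must come from the same alignment")
    points = []
    for start in range(0, len(query) - window + 1, step):
        d = k2p_distance(query[start:start + window],
                         reference[start:start + window])
        points.append((start + window // 2, 100.0 * (1.0 - d)))
    return points
```

Running `sliding_similarity` once per reference sequence and plotting the resulting curves gives a profile of the kind shown in Fig. 1; a crossover between the curves of two references is what suggests a break-point.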

Bootscanning analysis

The results obtained in the recombination analysis were confirmed using a bootscanning analysis [25]. The window width and the step size were set to 200 bp and 20 bp, respectively.
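The idea behind bootscanning can be sketched in a simplified form. The published method builds a phylogenetic tree for each bootstrap replicate of each window; the sketch below swaps that step for a plain nearest-reference rule (the reference with the fewest mismatches to the query wins the replicate), so it only approximates the tree-based support values. The function name `bootscan`, the default of 100 replicates and the tie handling are assumptions made for illustration.

```python
import random

def bootscan(query, references, window=200, step=20, replicates=100, seed=0):
    """Simplified bootscan over an alignment.

    query      -- aligned query sequence (string)
    references -- dict mapping a reference name to its aligned sequence
    Returns a list of (window midpoint, {name: percent support}) tuples.
    """
    rng = random.Random(seed)  # fixed seed so the profile is reproducible
    profile = []
    for start in range(0, len(query) - window + 1, step):
        wins = {name: 0 for name in references}
        for _ in range(replicates):
            # Bootstrap: resample alignment columns of this window with replacement
            cols = [rng.randrange(start, start + window) for _ in range(window)]
            mismatches = {
                name: sum(1 for i in cols if query[i] != ref[i])
                for name, ref in references.items()
            }
            best = min(mismatches.values())
            winners = [n for n, m in mismatches.items() if m == best]
            if len(winners) == 1:  # ties count as support for no reference
                wins[winners[0]] += 1
        support = {n: 100.0 * c / replicates for n, c in wins.items()}
        profile.append((start + window // 2, support))
    return profile
```

In a mosaic query, support is expected to switch from one reference to the other around the break-point, which is the pattern reported in Fig. 2.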

LARD analysis

To assess whether the recombination model we obtained gave a significantly better fit to the data than the null hypothesis of no recombination, we used LARD [26]. Briefly, for every possible breakpoint, the sequence alignment was divided into two independent regions for which the branch lengths of a tree of the putative recombinant and its two parent sequences were optimised. The two likelihoods obtained from the separate regions were then combined to give a likelihood score for that breakpoint position. The breakpoint position that yielded the highest likelihood was then compared, by using a likelihood ratio test, to the likelihood obtained from the same data under a model that permitted no recombination. The likelihood ratio obtained by using the real data was evaluated for significance against a null distribution of likelihood ratios produced by using Monte Carlo simulation of sequences generated without recombination. Sequences were simulated 1,000 times by using the maximum likelihood model parameters and sequence lengths from the real data using Seq-Gen [35].
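The logic of the breakpoint scan and the Monte Carlo test described above can be outlined as follows. This is a sketch, not LARD itself: the branch-length optimisation is hidden behind a hypothetical callback `region_loglik(start, end)` that is assumed to return the maximised log-likelihood for alignment columns `[start, end)`, and both the factor of 2 in the test statistic and the "+1" correction in the P value are common conventions assumed here rather than details taken from the source.

```python
def best_breakpoint(n_sites, region_loglik, min_flank=1):
    """Scan every candidate breakpoint and return (position, log-likelihood).

    region_loglik is a hypothetical callback supplied by the caller; the
    log-likelihoods of the two independent regions are summed.
    """
    best = None
    for bp in range(min_flank, n_sites - min_flank + 1):
        score = region_loglik(0, bp) + region_loglik(bp, n_sites)
        if best is None or score > best[1]:
            best = (bp, score)
    return best

def likelihood_ratio_statistic(loglik_recombination, loglik_null):
    """Conventional statistic 2 * (lnL_alternative - lnL_null)."""
    return 2.0 * (loglik_recombination - loglik_null)

def monte_carlo_p_value(observed, null_values):
    """Share of simulated statistics at least as large as the observed one.

    The +1 in numerator and denominator is an assumed conservative
    correction that keeps the estimate above zero.
    """
    exceed = sum(1 for v in null_values if v >= observed)
    return (exceed + 1) / (len(null_values) + 1)
```

In this outline, `null_values` would hold one statistic per alignment simulated without recombination, each obtained by running the same breakpoint scan on the simulated data.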