Different Epidemic Potentials of the HIV-1B and C Subtypes
- Cite this article as:
- Salemi, M., de Oliveira, T., Soares, M.A. et al. J Mol Evol (2005) 60: 598. doi:10.1007/s00239-004-0206-5
HIV, the cause of AIDS in humans, is characterized by great genetic heterogeneity. In particular, HIV-1 group M subtypes are responsible for most of the infections worldwide. We investigate the demographic history of HIV-1B and HIV-1C subtypes in South Africa and Brazil using both a parametric and a nonparametric approach based on coalescent theory. Our results show that although both subtypes are spreading exponentially in Brazil, the HIV-1C growth rate is about twice that of Brazilian HIV-1B or South African HIV-1C, providing evidence, for the first time, of a different epidemic potential between two HIV-1 subtypes. The present study not only may have important consequences for devising future vaccination and therapeutic strategies, but also offers additional evidence that skyline plots are indeed a simple and powerful tool for monitoring and predicting the behavior of viral epidemics.
Keywords: Epidemic potential, HIV-1B, HIV-1C, Brazil, South Africa, AIDS
HIV, the etiologic agent of AIDS, is classified into two distinct but clearly related types of viruses, HIV-1 and HIV-2, characterized by an extraordinary genetic variability (Hahn et al. 1984; Clavel et al. 1986). Three major groups can be distinguished within HIV-1: group M (for main), O (for outlier), and N (for neither, non-M–non-O, or new) (Robertson et al. 2000; Simon et al. 1998). Moreover, 9 phylogenetically distinct subtypes, 2 sub-subtypes, and at least 14 intersubtype HIV-1 recombinants, known as circulating recombinant forms (CRFs), have been repeatedly identified within group M so far (Robertson et al. 2000; McCutchan 2000; Salminen et al. 1995). Indeed, recombination, coupled with the elevated error rate of the reverse transcriptase and the rapid turnover of HIV-1 in infected individuals, is at the origin of the high genetic variability of the virus (Peeters and Sharp 2000).
Most subtypes, as well as CRFs, are present in Africa, reflecting the African origin of the epidemic (Gao et al. 1999). HIV-1B is the subtype responsible for most of the infections in Europe, the United States, and Australia, whereas HIV-1C is the most prevalent worldwide, accounting for more than 56% of all infections (Esparza and Bhamarapravati 2000). It has been shown that the HIV-1 group M epidemic in humans originated from a zoonotic transmission (Gao et al. 1999) and that the cenancestor (most recent common ancestor) of group M probably dates back to the 1930s (Korber et al. 2000; Salemi et al. 2001). The existence of genetic subtypes could be the result of certain viral strains being involved in extensive transmission chains in a given geographic area, the so-called “founder effect.” In this view, in spite of their genetic differences, HIV-1 subtypes would be biologically equivalent. Alternatively, it has been hypothesized that characteristics such as a higher or lower transmissibility and/or fitness could explain the success or failure of different subtypes in different regions (Bjorndal et al. 1999). No definitive answer has been reached so far. In addition, the ability of a virus to spread in a population could be related to specific transmission routes such as sexual contact and injecting drug use (Salemi et al. 1999; Pybus et al. 2001).
Materials and Methods
Table 1. Likelihood ratio test comparing the likelihood of different demographic models for the HIV data sets
−2(Lk1 − Lk2): 2.1558 (p = 0.142); 0.3 (p = 0.584); 0.4324 (p = 0.511); 2.3524 (p = 0.125); 2.218 (p = 0.136); 2.218 (p = 0.136)
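The p-values listed above are consistent with referring the statistic −2(Lk1 − Lk2) to a χ² distribution with one degree of freedom, as would apply when two nested demographic models differ by a single parameter. The short check below is a sketch under that assumption; the one-degree-of-freedom choice and the function name `lrt_p_value_1df` are introduced here for illustration and are not stated in the text.

```python
import math

def lrt_p_value_1df(statistic):
    """Upper-tail chi-square probability with 1 degree of freedom.

    For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(statistic / 2.0))

# Statistics -2(Lk1 - Lk2) in the order they are listed in the table
statistics = [2.1558, 0.3, 0.4324, 2.3524, 2.218, 2.218]
for s in statistics:
    print(f"{s:.4f} -> p = {lrt_p_value_1df(s):.3f}")
```

Under this assumption the computed probabilities match the listed p-values after rounding to three decimals. None of them falls below 0.05, so no comparison in the table rejects the simpler model at that level.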
Phylogeny and Evolutionary Rate Estimates
Maximum likelihood (ML) phylogenies were estimated for each data set. The best-fitting nucleotide substitution model was tested with a hierarchical likelihood ratio test following the strategy described by Swofford and Sullivan (2003), using a neighbor-joining tree with HKY85 estimated distances. ML phylogenies were then reestimated with the selected model, using a neighbor-joining tree as starting tree, and the TBR algorithm for branch swapping. Calculations were performed with PAUP* 4.0b10 (D.L. Swofford, Sinauer Associates, Sunderland, MA).
Because the sequences in our data set have been collected over several years, the evolutionary rate μ can be estimated via ML directly from the phylogenetic tree assuming a molecular clock with noncontemporaneous tips (Rambaut 2000). The molecular clock hypothesis can be tested with the likelihood ratio test with n − 3 degrees of freedom, where n is the number of taxa (Rambaut 2000).
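The clock test described above can be sketched as follows. The log-likelihood values in the usage lines are hypothetical placeholders, not results from this study, and the function names `chi2_sf` and `clock_lrt` are illustrative. The χ² tail probability uses the standard closed-form series for integer degrees of freedom, so only the Python standard library is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable
    with a positive integer number of degrees of freedom."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j=0}^{df/2-1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2))
    #   + 2*phi(sqrt(x)) * sum_{r=1}^{(df-1)/2} x^(r-1/2) / (1*3*...*(2r-1))
    total = 0.0
    term = math.sqrt(x)  # r = 1 term
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    phi = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return math.erfc(math.sqrt(half)) + 2.0 * phi * total

def clock_lrt(lnl_clock, lnl_no_clock, n_taxa):
    """Likelihood ratio test of the dated-tip clock with n - 3 df."""
    statistic = 2.0 * (lnl_no_clock - lnl_clock)
    df = n_taxa - 3
    return statistic, df, chi2_sf(statistic, df)

# Hypothetical log-likelihoods for a 40-taxon alignment (illustrative only)
stat, df, p = clock_lrt(lnl_clock=-5230.4, lnl_no_clock=-5190.1, n_taxa=40)
print(f"2*delta lnL = {stat:.1f}, df = {df}, p = {p:.3g}")
```

A small p-value here means the unconstrained model fits significantly better, that is, the clock is rejected.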
A genealogy reconstructed from randomly sampled HIV sequences contains information about population-level processes such as change in population size and growth rate (Pybus et al. 2000). Given a viral phylogeny P and a vector φ representing the parameters of the model N(t), it is possible to calculate the log of the conditional probability ln[φ|P] (Pybus et al. 2000). ML estimates of φ can be found by numerical optimization of ln[φ|P], and 95% CIs for the estimates can be obtained with the likelihood ratio statistic (Pybus et al. 2000). The estimated parameters are, in fact, N(0)μ and r/μ, where μ is the evolutionary rate in nucleotide substitutions per site per year (the parameter c in the logistic model is unaffected by linear scaling of time). Notice that time runs backward into the past, so that N(0) is the effective number of infections at the present and N(t) represents the effective number of infections at time t.
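Because the genealogy is measured in substitutions per site, the fitted quantities are N(0)μ and r/μ, and an independent estimate of μ converts them to natural units. A minimal sketch for the exponential model, N(t) = N(0)e^(−rt) with t measured backward from the present, is given below. The scaled estimates in the usage lines are hypothetical and the function names are illustrative; only the value of μ is taken from the rates reported in this article.

```python
import math

def to_natural_units(n0_mu, r_over_mu, mu):
    """Convert scaled estimates N(0)*mu and r/mu into N(0) and r (per year)."""
    return n0_mu / mu, r_over_mu * mu

def exponential_size(n0, r, t):
    """Effective number of infections t years before the present."""
    return n0 * math.exp(-r * t)

# mu: substitutions/site/year, the rate reported for HIV-B protease, Brazil
mu = 1.40e-3
# Hypothetical scaled estimates, for illustration only
n0_mu, r_over_mu = 2.0, 250.0

n0, r = to_natural_units(n0_mu, r_over_mu, mu)
print(f"N(0) = {n0:.0f}, r = {r:.2f} per year")
print(f"N(10 years ago) = {exponential_size(n0, r, 10.0):.1f}")
```

The conversion leaves the shape of the curve unchanged: evaluating the scaled model at τ = μt substitutions per site and dividing by μ gives the same N(t) as the natural-unit model at t years.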
We also obtained nonparametric estimates of demographic history through the skyline plots (Pybus et al. 2000). However, the phylogenetic trees obtained for the HIV data sets show several zero or near-zero internal branch lengths, which make the skyline plots very noisy and more difficult to interpret. Therefore, we estimated the generalized skyline plots (Strimmer and Pybus 2001) for clock-like phylogenetic trees with dated tips. In such plots adjacent intervals smaller than a threshold of size ε in a tree are grouped together before obtaining the nonparametric estimates of the population size at any given time. For each data set, the optimal ε value used was the one maximizing the AICC (corrected Akaike information criterion) of the plot. All calculations were performed with GENIE version 3.0 (Pybus and Rambaut 2002).
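The grouping idea behind the generalized skyline plot can be illustrated with a simplified sketch. It assumes contemporaneously sampled tips, whereas the analysis here used dated tips, and the pooling rule (accumulate adjacent intervals until their total length reaches ε, then fold any short remainder into the last group) is one simple reading of the grouping step rather than the GENIE implementation. The function names and interval lengths are hypothetical.

```python
def classic_skyline(widths):
    """Classic skyline estimates for inter-coalescent intervals.

    widths[i] is the length of the interval during which k = n - i
    lineages exist (n = len(widths) + 1 tips, ordered from tips to root).
    The estimate for each interval is w * k * (k - 1) / 2.
    """
    n = len(widths) + 1
    return [w * (n - i) * (n - i - 1) / 2.0 for i, w in enumerate(widths)]

def generalized_skyline(widths, epsilon):
    """Pool adjacent intervals until each group spans at least epsilon.

    Returns a list of (group_length, estimate) pairs, where the estimate
    is the mean of the classic estimates of the pooled intervals.
    """
    classic = classic_skyline(widths)
    groups, current, total = [], [], 0.0
    for i, w in enumerate(widths):
        current.append(i)
        total += w
        if total >= epsilon:
            groups.append(current)
            current, total = [], 0.0
    if current:  # leftover intervals shorter than epsilon near the root
        if groups:
            groups[-1].extend(current)
        else:
            groups.append(current)
    return [
        (sum(widths[i] for i in g), sum(classic[i] for i in g) / len(g))
        for g in groups
    ]

# Hypothetical interval lengths for a 5-tip tree, including a zero-length branch
widths = [0.1, 0.0, 0.2, 0.4]
print(classic_skyline(widths))
print(generalized_skyline(widths, epsilon=0.05))
```

With ε = 0 every interval forms its own group and the classic plot is recovered. Larger ε values absorb the zero-length interval into its neighbour, which removes the spurious drop to zero in the estimated population size.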
Evolutionary Rate Estimates
The estimated HIV-1 evolutionary rates for the different data sets were 1.40 ± 0.28 × 10⁻³ (HIV-B protease; Brazil), 1.54 ± 0.38 × 10⁻³ (HIV-B RT; Brazil), 3.38 ± 0.95 × 10⁻³ (HIV-C protease; Brazil), 3.55 ± 0.64 × 10⁻³ (HIV-C RT; Brazil), 1.71 ± 0.27 × 10⁻³ (HIV-C protease; South Africa), and 1.47 ± 0.21 × 10⁻³ (HIV-C RT; South Africa). The molecular clock hypothesis was rejected by the likelihood ratio test for each data set. However, simulation studies have shown that if there is only a small amount of rate variation among lineages, then the 95% confidence limits of the rate estimate still contain the true mean rate about 95% of the time, even if the clock is rejected (Jenkins et al. 2002). In other words, the clock is very easily rejected, but the rate inferred enforcing the clock is still a good estimate of the mean rate and can still be used as a useful time scale.
Parametric Estimates of HIV-1B and 1C Demographic History in South Africa and Brazil
Table 2. Estimates of the effective number of infections at present [N(0)], growth rate (r), and basic reproductive number (R0) with an average duration of infectiousness D equal to 5 or 10 years for the different HIV data sets (95% CIs in parentheses)
Column headings: R0 (D = 5 years), R0 (D = 10 years)
The robustness of r estimates to change in N(0) was tested by reestimating r while constraining N(0) to vary over a range of values including its lower and upper 95% confidence limit given in Table 2 for each data set. In every case, the new estimates of r fell within the CIs of r reported in Table 2 (data not shown).
The r estimates also allow the estimation of the epidemiological quantity R0, the basic reproductive number (infectivity) of a pathogen, with the equation R0 = rD+1, where D is the average duration of infectiousness (Pybus et al. 2001). In Table 2 we use a putative but plausible range for D. For D = 10 years, on average eight secondary infections are generated by each primary HIV-1C infection in Brazil, versus about four secondary infections generated by HIV-1B in South Africa and Brazil or by the South African HIV-1C.
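The relation R0 = rD + 1 is simple enough to work through directly. In the sketch below the growth rates 0.7 and 0.3 per year are illustrative values chosen so that R0 at D = 10 years comes out at eight and four, matching the rounded figures quoted above; they are not the estimates of Table 2, and the function names are hypothetical.

```python
def basic_reproductive_number(r, duration):
    """R0 = r * D + 1, with r per year and D in years."""
    return r * duration + 1.0

def growth_rate_from_r0(r0, duration):
    """Invert R0 = r * D + 1 to recover the growth rate r."""
    return (r0 - 1.0) / duration

# Illustrative growth rates (per year), not the published estimates
examples = [("faster-growing epidemic", 0.7), ("slower-growing epidemic", 0.3)]
for label, r in examples:
    for d in (5.0, 10.0):
        r0 = basic_reproductive_number(r, d)
        print(f"{label}: r = {r}, D = {d:.0f} years, R0 = {r0:.1f}")
```

Because R0 − 1 is proportional to both r and D, doubling the growth rate at fixed D roughly doubles R0 when rD is large compared with 1, which is why a twofold difference in r translates into about eight versus four secondary infections at D = 10 years.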
Generalized Skyline Plots of HIV-1B and 1C in South Africa and Brazil
South Africa is one of the epicenters of the epidemic in the world, with about 22% of the adult population infected with HIV-1 (Department of Health, South Africa 2001), mostly with subtype C. Brazil, on the other hand, is considered to be one of the best examples where therapeutic implementation has slowed down the course of the epidemic. Today less than 1% of the adult population, about 600,000 people, is infected with HIV, mostly with subtype B and, more recently, with subtype C in the southern region of the country (Dumans et al. 2002). HIV-1 growth rates are similar, except for HIV-1C in Brazil, which is spreading about two times faster than either HIV-1B or the South African HIV-1C. In particular, R0 estimates in South Africa are in good agreement with current epidemiological data showing that each infected person has transmitted HIV to at least three new persons within 5 years (Department of Health, South Africa 2001). No evidence has been reported so far indicating that the average duration of infectiousness may vary among HIV-1 subtypes. Therefore, the twofold increase in infectivity of the Brazilian HIV-1C compared to the Brazilian HIV-1B and the South African HIV-1C may reflect a difference in the efficiency of different transmission routes in different geographic areas.
The above results depend on the assumptions of the coalescent model used: evolutionary rate constancy and the absence of positive selection, recombination, and migration. We analyzed the protease and RT of naïve patients to reduce the effect of positive selection, and excluded recombinant strains, but it is difficult to assess the importance of migration among subpopulations of infected HIV-1 patients in the countries studied. Also, the uncertainty in the evolutionary rate estimates may confound the interpretation of the analysis. However, the consistency of the results with current epidemiological data strengthens our confidence. Moreover, since our estimates of demographic history are consistent among genes, it appears that the level of rate heterogeneity among HIV sequences is not large enough to systematically bias demographic inferences.
Overall we have shown that HIV-1C in Brazil is spreading at an increased rate. Following this trend the subtype may eventually become prevalent in the entire country, as has happened in the southern Brazilian states. Phylogenetic inference is too indirect to establish firmly whether the Brazilian HIV-1C is a new, more infectious strain or whether the virus is spreading faster because of a more favorable transmission route as suggested above. Yet the recently introduced HIV-1C is outcompeting HIV-1B in a country where the latter subtype was virtually the only one present until a few years ago, and a rapid escalation of HIV-1C infections has been occurring throughout sub-Saharan Africa, in India, and in China (Esparza and Bhamarapravati 2000; UNAIDS/WHO 2000). A similar scenario may be possible for other Western countries and deserves to be taken into account for future planning of vaccination and therapeutic campaigns around the world. These results also underline the need for refocusing prevention strategies in Brazil to stop the spreading of this viral variant. In this light, the use of viral gene sequences coupled with the results of coalescent theory appears to be a promising and important tool for monitoring and predicting the epidemic behavior of HIV subtypes and of other pathogens as well (Pybus et al. 2001; Robbins et al. 2003; Tanaka et al. 2002).
This work was supported by the Flemish Fonds voor Wetenschappelijk Onderzoek (FWO Grants G.0288.01 and KAN2002 1.5.193.02, Postdoctoral Onderzoeker Contract 530).