Introduction

Mutation and recombination are the two main evolutionary forces that generate genetic variation in HIV-1. Like other positive-sense RNA viruses that infect humans, human immunodeficiency virus type 1 (HIV-1) has a high mutation rate, which in its case is due to the error-prone nature of the viral reverse transcriptase (3 × 10⁻⁵ mutations per nucleotide per replication cycle) [1, 2]. This high mutation rate, coupled with a high replication rate (10.3 × 10⁹ particles per day) [3], allows for the generation and fixation of a variety of advantageous mutations in a virus population. These changes are selected in response to host immune pressure and enable the virus to resist the host defense. Recombination is another source of genetic variation that contributes significantly to the genetic diversification of HIV. It could potentially produce more virulent viruses, drug-resistant viruses, or viruses with altered cell tropism, which may reduce the effectiveness of antiretroviral therapy and present major challenges for vaccine design [4]. It has been reported that recombinant viruses, including the unique recombinant forms (URF) and circulating recombinant forms (CRF), may account for at least 20% of all HIV infections [5]. The existence of recombinant viruses is evidence of either simultaneous infection with multiple viruses during a single transmission event (co-infection) or sequential infection with different viruses during multiple transmission events (superinfection). Co-infection has been well documented in individuals infected with both HIV-1 and HIV-2 [6, 7], with viruses from different HIV-1 groups [8], with different subtypes or recombinant variants [9–16], and with divergent variants of the same subtype from different sources [17–23]. Co-infection has significant implications for antiretroviral resistance and vaccine development. Furthermore, it could lead to immunologic escape and subsequent disease progression [21, 24]. Thus, determining the frequency of dual infections is of great interest for the clinical management of HIV infection.

Unpublished data from our laboratory provided evidence of dual infection with HIV-1 subtype B and subclade F1 in a homosexual patient. We therefore retrospectively evaluated the frequency of HIV-1 subclade F1 and subtype B dual infections in recently HIV-1-infected Brazilian men who have sex with men (MSM).

Materials and methods

Patients

The subjects in this study were part of a previously described prospective cohort of recently HIV-1-infected persons from São Paulo, Brazil [25]. Recent infection was defined as infection for less than 170 days (95% confidence interval: 145–200 days), as determined by the serologic testing algorithm for recent human immunodeficiency virus (HIV) seroconversion (STARHS) strategy [26]. Forty-one MSM participants were selected for this study based on two criteria: infection with a subtype B virus, as determined by near full-length genome or partial pol (including the complete integrase region) analysis [27, 28], and availability of a blood sample from the initial time point. The details of this cohort and the methods for identifying recent infection have been described elsewhere [25, 27, 29]. For this study, evidence for dual infection was defined as the presence of subclade F1 proviral DNA in the same blood sample and in the same genomic region in which subtype B infection had previously been identified. Detection of both viruses in a single clinical sample strongly suggests that the two variants were acquired through either co-infection or superinfection. However, because we could not determine whether co-infection or superinfection had occurred, this study reports only the frequency of dual infection. Patient data, including age, number of CD4-positive T cells, and viral load (VL), were obtained from the patient medical records (Table 1). Information on sexual behavior, including the number of unprotected sexual acts, the number of sexual partners, sex acts per partner, the HIV status of partners, and the VL of HIV-1-positive partners at the time of sexual intercourse, was not available. All study participants signed an informed consent form, and the project was approved by the ethics committee of the Federal University of São Paulo.

Table 1 Characteristics of the 41 study MSM subjects

Proviral DNA extraction and amplification

All samples used in this study were the same as those described previously [27]. PMNs were isolated from patient blood samples collected at the time of enrollment, and genomic DNA was extracted using a QIAamp DNA Blood Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer’s instructions. The proviral DNA was subjected to nested PCR to amplify a 248 bp fragment of the integrase gene (pol-IN) using the primers listed in Table 2. The assay was developed to detect subclade F1 viral isolates with high sensitivity and specificity. Based on an alignment of complete genome sequences of different HIV-1 subtypes, the pol-IN region was targeted because it is well conserved within subclade F1 strains yet differs in sequence from other HIV-1 subtypes. In the first-round PCR, a 266 bp integrase fragment was amplified using an Eppendorf Mastercycler. The PCR conditions consisted of an initial denaturation step at 95°C for 2 minutes, followed by 35 cycles of 95°C for 30 seconds, 58°C for 30 seconds, and 68°C for 1 minute, and a final extension at 68°C for 5 minutes. For the nested PCR, 5 μl of the first-round PCR product was used as template, and the PCR mix included subclade F1-specific inner primers (Table 2). The amplified product was electrophoresed through a 1.5% (wt/vol) agarose gel containing 0.5 × Tris-borate-EDTA, followed by ethidium bromide staining. To allow a more extensive phylogenetic analysis, another set of forward primers specific for the subclade F1 pol region was designed. These primers were used in combination with the reverse pol-IN primers to amplify a 1247 bp L-pol fragment. Amplification and detection of the L-pol fragment were carried out using the PCR conditions described above, except that the extension time was increased to 2 minutes.
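
As an illustrative aside that is not part of the original protocol, the programmed hold time of the cycling conditions described above can be tallied with a minimal Python sketch. The function name program_hold_seconds is hypothetical, and the calculation assumes that ramping between temperatures is ignored, so actual run times on a thermocycler would be longer.

```python
# Cycling parameters transcribed from the text; all times are in seconds.
INITIAL_DENATURATION_S = 2 * 60   # 95°C for 2 minutes
CYCLES = 35
DENATURE_S = 30                   # 95°C for 30 seconds
ANNEAL_S = 30                     # 58°C for 30 seconds
FINAL_EXTENSION_S = 5 * 60        # 68°C for 5 minutes


def program_hold_seconds(extension_s: int) -> int:
    """Sum of programmed hold times for one PCR round.

    Assumption: ramp times between temperature steps are ignored.
    """
    per_cycle = DENATURE_S + ANNEAL_S + extension_s
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


# pol-IN round: 1 minute extension at 68°C
pol_in_minutes = program_hold_seconds(extension_s=60) / 60    # 77.0
# L-pol round: extension lengthened to 2 minutes
l_pol_minutes = program_hold_seconds(extension_s=120) / 60    # 112.0

print(f"pol-IN: {pol_in_minutes:.0f} min per round, L-pol: {l_pol_minutes:.0f} min per round")
```

Because the assay is nested, each sample passes through two such rounds, so lengthening the extension step for the L-pol fragment adds 35 minutes of hold time per round under these assumptions.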

Table 2 Details of PCR primer combinations

Some isolates were characterized as subclade F1 by PCR amplification and DNA sequencing of the pol-IN fragment but could not be amplified with the L-pol-specific primers, which suggested that these isolates might be recombinant viruses. To address this issue, a further set of forward primers was used to amplify a 727 bp product (denoted M-pol). These forward primers were able to amplify a broad range of HIV-1 variants, including subtype B and subclade F1. They were used in combination with the reverse pol-IN primers in a nested PCR assay under the same conditions described for the pol-IN PCR assay, except for an annealing temperature of 55°C and a 2-minute extension time. The L-pol and M-pol PCR assays had sensitivities of 25 and 15 copies per reaction, respectively. All assays were performed in duplicate for each fragment using the primer combinations shown in Table 2. Positive controls and negative controls (healthy donor PMNs) were included in each assay. Strict laboratory precautions were taken to avoid cross-contamination. Specimens that showed clear amplification in both duplicate reactions were considered positive.
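
The scoring rule stated above (positive only when both duplicate reactions amplify clearly) can be written down as a minimal Python sketch. The function name call_sample is hypothetical. The handling of controls is an assumption made for illustration, since the text states only that controls were included, not how a failed control was treated.

```python
from typing import Optional


def call_sample(rep1: bool, rep2: bool, pos_ctrl: bool, neg_ctrl: bool) -> Optional[bool]:
    """Score one specimen from its duplicate nested PCR reactions.

    rep1, rep2: True if the replicate shows a clear band of the expected size.
    pos_ctrl:   True if the positive control amplified.
    neg_ctrl:   True if the negative control amplified (suggesting contamination).

    Assumption (not stated in the text): a run with a failed positive control
    or an amplified negative control is treated as uninterpretable (None).
    """
    if not pos_ctrl or neg_ctrl:
        return None
    # Rule from the text: clear amplification is required in BOTH duplicates.
    return rep1 and rep2


print(call_sample(True, True, pos_ctrl=True, neg_ctrl=False))    # True
print(call_sample(True, False, pos_ctrl=True, neg_ctrl=False))   # False
print(call_sample(True, True, pos_ctrl=True, neg_ctrl=True))     # None
```

Requiring concordant duplicates trades some sensitivity for specificity, which is relevant when the target is expected to be present at very low copy numbers.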

Sequencing and phylogenetic analysis

The amplified DNA fragments were purified using a QIAquick PCR Purification Kit (Qiagen, Hilden, Germany) and directly sequenced using the second-round primers and the PRISM Big Dye Terminator Cycle Sequencing Ready Reaction Kit (Applied Biosystems/Perkin-Elmer, Foster City, CA) on an automated sequencer (ABI 3130, Applied Biosystems). After the primer regions were excluded, each amplicon was assembled into a contiguous sequence and edited with Sequencher version 4.7 (Gene Codes Corp., Ann Arbor, MI). The sequences were aligned with reference sequences for subtypes A–D, F–H, J and K (http://hiv-web.lanl.gov) using the CLUSTAL X program [30], followed by manual editing in the BioEdit Sequence Alignment Editor version 5.0.7 [31]. Gaps and ambiguous positions were removed from the final alignment. Phylogenetic relationships were determined by two methods: the neighbor-joining (NJ) algorithm implemented in MEGA version 5.0 [32] and maximum likelihood (ML) using PHYML v.2.4.4 [33]. For the NJ method, trees were constructed under the maximum composite likelihood substitution model, and bootstrap resampling was carried out 1000 times. For the ML method, phylogenetic trees were constructed using the GTR + I + G substitution model and a BIONJ starting tree. Heuristic tree searches under the ML optimality criterion were performed using the nearest-neighbor interchange (NNI) branch-swapping algorithm. The approximate likelihood ratio test (aLRT) based on a Shimodaira-Hasegawa-like procedure was used to calculate branch support. Trees were displayed using the MEGA v.5 package.
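
The removal of gaps and ambiguous positions mentioned above was done during manual editing in the original work. Purely as an illustration of the operation, the following Python sketch drops every alignment column in which any sequence carries a gap or an IUPAC ambiguity code. The function name strip_columns and the toy sequences are hypothetical and are not data from this study.

```python
UNAMBIGUOUS = set("ACGT")


def strip_columns(alignment: dict[str, str]) -> dict[str, str]:
    """Keep only columns where every sequence has an unambiguous base (A, C, G or T)."""
    seqs = {name: seq.upper() for name, seq in alignment.items()}
    lengths = {len(seq) for seq in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # A column is kept only if no sequence has a gap or ambiguity code there.
    keep = [i for i in range(n_cols)
            if all(seq[i] in UNAMBIGUOUS for seq in seqs.values())]
    return {name: "".join(seq[i] for i in keep) for name, seq in seqs.items()}


# Toy alignment (hypothetical): column 4 has a gap, column 6 has the ambiguity code R.
toy = {"ref_B": "ACG-TTA", "ref_F1": "ACGATRA", "sample": "ATGATTA"}
cleaned = strip_columns(toy)
print(cleaned)  # {'ref_B': 'ACGTA', 'ref_F1': 'ACGTA', 'sample': 'ATGTA'}
```

Complete deletion of such columns keeps all sequences the same length, at the cost of discarding sites that may be informative for some taxa.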

Nucleotide sequence accession numbers

The sequences described here have been deposited (accession numbers pending).

Results

Blood samples were obtained from 41 MSM study participants who had previously been characterized as infected with HIV-1 subtype B virus (Figure 1) [27]. The median VL in this cohort was 4.3 × 10⁴ copies/mL (range, <400 to 39.3 × 10⁴ copies/mL). The median baseline CD4-positive T cell count was 564 cells/mm³ (range, 198–2449 cells/mm³). The age of the subjects ranged from 18 to 56 years, and the median age was 30.6 years. All patients were treatment-naive at the time of sample collection. The main characteristics of the study population are shown in Table 1.

Figure 1

Phylogenetic tree constructed using a maximum-likelihood method from the partial pol region (1279 bp; nt 3822–5101 of HXB2) of 41 samples from MSM who had previously been determined to be infected with HIV-1 subtype B (indicated by black circles) and 37 HIV-1 reference sequences from the Los Alamos HIV-1 database representing 11 genetic subtypes. Samples identified in this study as harboring subclade F1 DNA are indicated with a star symbol. For purposes of clarity, the tree was midpoint rooted. Approximate likelihood ratio test (aLRT) values of ≥ 90% are indicated at nodes. The scale bar represents 0.05 nucleotide substitutions per site.

Before processing the patient samples, we determined the sensitivity of the subclade F1-specific primers for amplification of the pol-IN fragment. This was done using multiple reaction tubes, each containing 10⁵ copies of HIV-1 subtype B and a known quantity of subclade F1 partial proviral pol DNA ranging from 1 × 10⁰ to 1 × 10⁶ copies per reaction. The sensitivity of PCR amplification of the subclade F1 partial proviral pol DNA was one copy of target per reaction in a background of 10⁵ copies of HIV-1 subtype B. The specificity of the nested PCR for subclade F1 was confirmed by sequencing the amplified PCR products. Furthermore, the assay was tested on 15 subclade F1 and 25 subtype B patient samples that had previously been characterized by partial and near full-length proviral genome analysis [27, 28, 34, 35]. Clear bands of the expected size of 248 bp were seen in all reactions with subclade F1 samples (Figure 2A) but in none of the reactions with subtype B strains (Figure 2B).
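
To make the scale of this spiking experiment concrete, the following Python sketch computes the fraction of subclade F1 template in each reaction. It assumes a ten-fold dilution series between the stated endpoints, which the text does not specify. The names f1_inputs and f1_fraction are hypothetical.

```python
BACKGROUND_B_COPIES = 10**5  # subtype B copies per reaction, from the text

# Assumption: ten-fold steps between the stated endpoints (1 and 10^6 copies).
f1_inputs = [10**k for k in range(0, 7)]


def f1_fraction(f1_copies: int, background: int = BACKGROUND_B_COPIES) -> float:
    """Fraction of total template copies that are subclade F1."""
    return f1_copies / (f1_copies + background)


fractions = {copies: f1_fraction(copies) for copies in f1_inputs}

for copies, frac in fractions.items():
    print(f"{copies:>9,d} F1 copies -> {frac:.6%} of template")
```

At the reported detection limit of one F1 copy in 10⁵ subtype B copies, the F1 target makes up about 0.001% of the template in the reaction.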

Figure 2

Representative agarose gel of the nested polymerase chain reaction (PCR) products. Electrophoresis and subsequent ethidium bromide staining of amplicons from the nested PCR amplification with pol-IN subclade F1-specific primers of DNA samples known to harbor HIV-1 subclade F1 (A) and subtype B (B). Amplicons of nested PCR (1247 bp) performed with L-pol subclade F1-specific primers using samples observed to be infected with HIV-1 subclade F1 (solid line), BF1 recombinants with breakpoints located between the PCR primers (dotted line), and subtype B (dashed line) (C). Amplicons of nested PCR (727 bp) performed with M-pol primers (specific for F1 and BF1 variants) using samples observed to be infected with HIV-1 subclade F1 (solid line), BF1 recombinants with breakpoints located between the PCR primers (dotted line), and subtype B (dashed line) (D). M, molecular weight DNA marker (100 bp DNA ladder, Invitrogen).

Having verified the sensitivity and specificity of the pol-IN primers, we next established the conditions and reliability of the L-pol and M-pol primers for amplifying their specific PCR fragments (see Table 2). These primers were tested against a range of previously published HIV-1 subtype B, F1 and BF1 proviral variants [27, 28, 34–37]. The L-pol primers amplified a clear product only from subclade F1 isolates (Figure 2C). Similarly, the M-pol primers amplified a fragment of the correct size from all BF1 and subclade F1 variants, whereas subtype B variants yielded either no product or weak amplification of multiple products (Figure 2D). All products were sequenced to determine the specific subtype.

Using the pol-IN nested PCR on our clinical samples, we found that 5 of the 41 MSM patient samples (12.2%), previously described as infected with subtype B, also harbored subclade F1 virus. The sequences of the pol-IN fragment from the five subjects were then compared with representative sequences from all subtypes available in the HIV database (year 2005). Consistent with the results of our pol-IN PCR assay, the ML tree shown in Figure 3 indicated that all five sequences clustered with the subclade F1 reference strain (90% aLRT). The mean intersubject sequence diversity among these five isolates was 1.1% (range, 0.2–1.7%). As shown in Figure 1, the HIV-1 subtype B strains from these 5 MSM showed no evolutionary linkage to one another.
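
The observed frequency rests on a small sample, so its statistical uncertainty is worth keeping in mind. The following Python sketch reproduces the 12.2% figure and adds a 95% Wilson score interval. The interval is an illustrative calculation that was not reported in the study, and the function name wilson_interval is hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


dual, total = 5, 41            # counts from the text
prevalence = dual / total      # 0.1219..., i.e. 12.2%
low, high = wilson_interval(dual, total)

print(f"observed frequency: {prevalence:.1%}")
print(f"95% Wilson interval: {low:.1%} to {high:.1%}")
```

With these counts the interval spans roughly 5% to 26%, which illustrates how loosely a cohort of 41 subjects constrains the underlying frequency.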

Figure 3

Phylogenetic tree constructed using a maximum-likelihood method from pol-IN (219 bp; nt 4269–4488 of HXB2) fragments from five of the samples isolated from MSM observed to be infected with both subtype B (indicated by black circles) and subclade F1 DNA (indicated by black squares) along with HIV-1 reference sequences from the Los Alamos HIV-1 database representing 11 genetic subtypes. For purposes of clarity, the tree was midpoint rooted. Approximate likelihood ratio test (aLRT) values of ≥ 70% are indicated at nodes. The scale bar represents 0.05 nucleotide substitutions per site.

Despite various attempts, amplification of the L-pol fragment in the subclade F1-positive samples was unsuccessful. This result raised the possibility that the 5 MSM were co-infected with a BF1 recombinant virus. To address this question, proviral DNA from the samples that were positive for the pol-IN fragment was subjected to amplification and sequencing of M-pol. We used a combination of primers that would specifically amplify an M-pol fragment containing subclade F1 sequence at the 3' end if these patients were infected with a recombinant virus. This reaction produced either no amplification or multiple weak fragments with insufficient yield for sequencing. Overall, these results indicated that the 5 MSM patients are likely infected with both subtype B and F1 HIV-1. The failure to amplify the L-pol and M-pol fragments is likely a result of the low frequency of subclade F1 proviral DNA.

The VLs of singly and dually infected MSM were compared. In subjects with dual infection, the median VL was 5.3 × 10⁴ copies/mL and ranged from 1.5 × 10⁴ to 12.5 × 10⁴ copies/mL. In MSM infected only with subtype B, the median VL was 3.8 × 10⁴ copies/mL and ranged from undetectable (<400 copies/mL) to 39.3 × 10⁴ copies/mL. As observed in other studies, these results indicated that the VLs of the two groups did not differ statistically [38].

Discussion

This study describes the prevalence of HIV-1 subtype B and F1 dual infections in recently infected Brazilian MSM. Among the 41 subjects studied, 12.2% were positive for both subtype B (from the previous study) and subclade F1 proviral DNA (current study). These results are not surprising because both viral subtypes and their recombinants circulate widely in Brazil, which makes the country an excellent setting for such studies. It is probable that our results underestimate the true rate of dual infection in this group. The most likely explanation is that some isolates went undetected by our specific PCR screening method because of mismatches at the primer binding sites, low proviral load, or maintenance of the subclade F1 isolates in a reservoir other than the CD4-positive compartment sampled in the peripheral blood. Additionally, our method detects dual infection with subtypes B and F1 only when the pol-IN region is subclade F1. We could have missed some instances of co-infection if recombination had occurred and the pol-IN fragment was not subclade F1. Therefore, the observed frequency of dual infection in this group might have been higher had we sequenced a larger region or other regions of the viral genome. Our attempts to amplify larger fragments to determine whether recombination had occurred were unsuccessful, probably because these subjects were co-infected and maintained low proviral loads of subclade F1 compared with the subtype B viral population. However, because most of the HIV-1 subclade F1 strains circulating in Brazil have recombinant genomes, particularly recombinants with subtype B [27, 34, 39], we cannot rule out the possibility of F1 recombination in the present study. Our findings also may not be generalizable. This retrospective study focused on a select, relatively small group of recently HIV-infected MSM who were known to be infected with subtype B virus. Consequently, the dual infection rates reported cannot necessarily be extrapolated to other populations of HIV-infected MSM. In spite of these caveats, we were able to identify five cases of dual infection by studying only one region of the HIV genome, similar to the Kenyan study [38]. In that study, the authors observed seven cases of HIV-1 superinfection among 36 high-risk women, and five of these cases were detected by differences in only one gene. Unexpectedly, a lack of detectable HIV-1 dual infections has recently been reported in a retrospective study of 83 samples from chronically infected patients on antiretroviral treatment throughout the KwaZulu-Natal region, which has a high HIV prevalence [40]. The lack of dual infections in that study was explained by the ability of the immune system to evolve over time to eliminate or prevent a second viral infection during chronic infection [40, 41].

Despite a large body of literature, the true prevalence and the timing of immune selection in HIV co/superinfection cases have not yet been substantiated by robust clinical studies. The limited data that do exist have yielded inconclusive or contradictory findings, which contribute to the controversies surrounding the implications of HIV co/superinfection for efficient vaccine design [38, 42, 43]. Our estimates of subtype B and F1 dual infection rates are not directly comparable to those of other published studies because each study group has used different research designs, methodologic approaches, and target populations in the search for different HIV-1 subtypes [8, 16, 44–52]. Taken together, these data lend further support to the conclusion that dual infections are an integral part of the HIV/AIDS epidemic, particularly in countries where multiple subtypes are circulating in the population [15].

In summary, our data add to the knowledge of the prevalence of HIV-1 dual infections caused by HIV-1 subtype B and F1 viruses in MSM subjects and come from a country where such a phenomenon is rarely documented. Furthermore, these data agree with the consensus that the presence of two or more HIV-1 subtypes within an infected individual is relatively frequent [53, 54]. Thus, testing for co-infection and superinfection and implementing effective preventive measures in the MSM population remain relevant issues.