Introduction

Since 2010, the World Health Organization’s (WHO) public health approach to the use of antiretroviral therapy (ART) for treating and preventing HIV infection has recommended that national programs develop policies for first-, second-, and third-line ART combinations [1]. Part of the rationale is to maximize the efficacy and safety of HIV treatment by reducing the range of ART combinations in use. However, clinical trials usually prioritize drug combinations that optimize treatment strategies for individual patients. They do not necessarily evaluate regimens that can be generalized and thereby used to define second- and third-line regimens, despite efforts to find such alternatives [2, 3].

While antiretroviral (ARV) drugs have significantly reduced the number of people who present with resistance-related virological failure, a substantial number of individuals still require combinations tailored to the viral resistance profile acquired during previous treatments, which makes it challenging to evaluate and recommend standardized second- and third-line regimens. The Joint United Nations Programme on HIV/AIDS (UNAIDS) has predicted 28.5 million people living with HIV (PLHIV) on ART worldwide in 2025, corresponding to 24.3 million on first-line therapy, 3.5 million on second-line therapy, and 0.6 million on third-line therapy [4]. And yet, 23 years after the introduction of highly active antiretroviral therapy (HAART), third-line ART still entails considerable uncertainty due to the limited number of options available for these cases [5], especially given that few studies address the best ART combinations and the impact of treatment on disease progression and AIDS-related mortality.

Hence, from a public health perspective, some controversy remains as to the best ART combinations for treatment-experienced HIV-1-infected patients [6,7,8]. Current guidelines simply recommend that a failed regimen be changed to a new combined ART (cART) regimen according to the results of past and current resistance tests. The new cART should include a minimum of 2 and “preferably” [6, 7] or “ideally” 3 fully active drugs [8], chosen by genotype–phenotype assessment and used in combination with optimized background therapy (OBT). It is worth emphasizing that this recommendation has remained unchanged in British and North American antiretroviral treatment guidelines for almost a decade [9,10,11]. Although based on the best available evidence, these recommendations remain general and do not define combinations that can be characterized as second- or third-line regimens [9,10,11]. An additional challenge is limited access to HIV drug resistance testing. WHO does not currently endorse HIV drug resistance testing for individual patient management, which reinforces the need to expand optional second- and third-line regimens as a way of enabling the public health approach to HIV treatment, for example regimens based on dolutegravir, which is known to have a higher barrier to the emergence of viral resistance [12]. Salvage regimens are recommended that contain drugs such as darunavir/ritonavir (DRV/r), etravirine (ETV), dolutegravir (DTG), and raltegravir (RAL), with or without previously used ARVs [13]. Nevertheless, the WHO guidelines characterized those recommendations as “conditional” (i.e., the desirable effects of adherence to a recommendation probably outweigh the undesirable effects, but confidence in this is low) and as based on studies that provided low-quality evidence. The WHO also pointed out that most of the studies underpinning the guidelines had been conducted in a limited range of settings, namely middle-to-high- and high-income settings, and therefore the transferability of this knowledge to lower-resourced settings is unclear [13].

Despite the lack of a clearly delineated statement of how third-line therapy should be implemented, WHO’s Guidelines recommend that national programs should develop policies for third-line therapies and that the corresponding approaches should optimize regimens using genotype profiles and the addition of new drugs with minimal risk of cross-resistance to previously used regimens [13].

In our search through the current guidelines for ART and through the latest publications of the Conference on Antiretroviral Drug Optimization, we found only one systematic review that assessed the efficacy of ART in treatment-experienced HIV-infected adults [14]. The study included all randomized clinical trials (RCT) published from 2003 to 2010 that assessed the efficacy of adding a new ART (vs placebo) to OBT for treatment-experienced HIV-infected subjects [14]. The new ARV + OBT approach vs placebo was first proposed by the TORO clinical trial [15]. Since then, this new ARV + OBT approach has been the most used rationale for evaluating new drugs for individuals with triple-class virological resistance [15,16,17,18,19]. However, the combinations studied have often not allowed direct comparisons between ARVs and have thus resulted in little evidence not only as to which the best combinations of two or three drugs are, but also as to which drug with a novel mechanism is to be chosen. In addition, this systematic review did not assess DTG trials, nor did it address the methodological quality and any research gaps of the RCTs included.

The objective of our systematic review and meta-analysis was twofold: (a) to assess the efficacy of third-line therapy for adults with HIV/AIDS based on RCTs that adopted the “new ARV + OBT” approach and (b) to assess the scientific evidence related to treatment strategies for multi-experienced patients under the WHO proposal of third-line therapeutic approaches.

Methods

The PICOS criteria for inclusion are reported in Additional file 2: Table S1.

Data sources and searches

Our systematic review comprised a search of the following electronic databases spanning January 1, 1966, to December 31, 2015: MEDLINE (accessed by PubMed), EMBASE, LILACS, ISI Web of Science, SCOPUS, and Cochrane Central Register of Controlled Trials. In addition, we searched the references of studies published in the following international scientific meetings: International AIDS Conference (2001 to 2015); Conference on Retroviruses and Opportunistic Infections (CROI) (1997 to 2015); Interscience Conference on Antimicrobial Agents and Chemotherapy (ICAAC) (2003 to 2015); and International Congress on Drug Therapy in HIV Infection (2004 to 2015).

The search extended only to December 31, 2015, because this study aimed to assess third-line therapy only in RCTs that used the OBT approach, and the last known RCT in the scientific literature to use this strategy was published on March 13, 2013. The search strategy is shown in Additional file 4. No language restrictions were established for published studies. The present systematic review is reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [20].

Inclusion criteria

We included all RCTs that were published or presented in their complete versions. Eligible studies were those that enrolled third-line therapy patients, aged 16 or older, who received OBT plus new ARVs, or OBT plus placebo/comparison ARVs. Given that there is no standardized definition of third-line therapy in the scientific literature or in international guidelines, we used the best available definition to characterize treatment-experienced HIV-1-infected patients: patients with documented genotypic and/or phenotypic resistance to at least one ARV in each of the following three classes: nucleoside reverse transcriptase inhibitors (NRTI), non-nucleoside reverse transcriptase inhibitors (NNRTI), and protease inhibitors (PI). New drugs included enfuvirtide (ENF), tipranavir (TPV), DRV, RAL, ETV, maraviroc (MVC), vicriviroc (VIC), amdoxovir (DAPD), and DTG.

Exclusion criteria

The following exclusion criteria were established: studies that (a) were not randomized and/or did not have a control group for comparison, (b) did not adopt an OBT strategy for comparison, (c) did not provide efficacy and safety data, (d) assessed switch therapy and/or simplifying treatments, (e) included naïve patients, (f) included pregnant and breastfeeding women, and (g) included subjects under 16 years of age.

Study selection

Two investigators (LPM and RSK) independently screened the titles and abstracts and reviewed the full text of eligible studies. Reviewers were not blinded to the authors’ identities or to the institutions that published the manuscripts. They evaluated the full-text articles, determined study eligibility, and conducted data extraction independently, resolving disagreements by consensus whenever necessary.

Data extraction

Data from included studies were extracted using a structured data collection tool developed by the researchers, which was based on the recommendations of the Consolidated Standards of Reporting Trials (CONSORT) [21], the Cochrane Collaboration’s tool for assessing risk of bias in randomized trials [22], and the literature addressing the most important issues regarding the critical appraisal of randomized clinical trials [23,24,25]. The synthesized information was assessed using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) [26].

Data analysis

The data extracted from the RCTs covered the definition of primary and secondary outcomes, as well as efficacy, safety, subgroup analyses, and the results of the studies. The primary outcome assessed in the meta-analysis was the proportion of patients who reached undetectable HIV RNA levels (defined as < 50 copies/mL) at 48 weeks of follow-up. This outcome was based on clinical and statistical criteria and was chosen because it represents long-term cART effectiveness and because most of the trials presented results for it. Using a single definition was also intended to reduce heterogeneity among the studies. Trials that did not present data for this outcome at week 48 were included in the analysis at another follow-up time (i.e., 16, 24, or 96 weeks).

The primary outcome was also analyzed according to the number of fully active ARVs in the OBT at 48 weeks of follow-up (i.e., OBT with zero, 1, or 2+ active drugs) and according to the risk of bias. The stratification by risk of bias was based on the Cochrane Collaboration’s tool for assessing risk of bias [22] and resulted in three categories: (a) high risk: 2 or more criteria at high risk, or 1 criterion at high risk and 2 or more criteria at unclear risk; (b) moderate risk: 1 criterion at high risk and only 1 criterion at unclear risk, or no criteria at high risk and 1 or more criteria at unclear risk; and (c) low risk: all criteria at low risk.
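The three-category rule above can be written as a small decision function. The following is a minimal sketch, not the authors' extraction tool: the function name and the string labels are hypothetical, and the handling of a study with exactly 1 high-risk criterion and no unclear criteria is an assumption, since the stated rule does not cover that case.

```python
from collections import Counter

VALID = {"low", "unclear", "high"}

def classify_risk_of_bias(judgements):
    """Map per-criterion judgements to an overall category.

    judgements: iterable of "low", "unclear" or "high", one per criterion.
    Returns "high", "moderate" or "low" following the rule in the text.
    """
    judgements = list(judgements)
    unknown = set(judgements) - VALID
    if unknown:
        raise ValueError(f"unexpected judgement(s): {sorted(unknown)}")

    counts = Counter(judgements)
    high, unclear = counts["high"], counts["unclear"]

    # (a) high risk: >= 2 high, or 1 high together with >= 2 unclear
    if high >= 2 or (high == 1 and unclear >= 2):
        return "high"
    # (c) low risk: every criterion judged low
    if high == 0 and unclear == 0:
        return "low"
    # (b) moderate risk: 1 high + exactly 1 unclear, or 0 high + >= 1 unclear.
    # ASSUMPTION: 1 high + 0 unclear is not defined in the text and is
    # treated as moderate here.
    return "moderate"
```

For instance, a study with one high-risk criterion and one unclear criterion would be classified as moderate, whereas a second unclear criterion would move it to high.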

Secondary outcomes, including any increases in CD4+ cell count and any other outcome related to a decrease in viral load, were analyzed using descriptive statistics only.

Assessment of risk of bias

The risk of bias assessment of the retrieved studies covered random sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessment, incomplete outcome data, selective reporting, and other potential sources of bias, using Cochrane’s Risk of Bias assessment tool [22]. The quality of the evidence available was assessed according to the GRADE criteria [26].

Statistical analysis

The summary measure was the risk difference between intervention and control groups for the outcome “proportion of patients reaching undetectable HIV RNA levels (defined as < 50 copies/mL) at 48 weeks of follow-up”. Study estimates were aggregated using the random-effects model with the DerSimonian-Laird estimator and the Mantel-Haenszel method. Heterogeneity among studies was assessed through the I² statistic and Cochran’s Q test. I² values greater than 50% were considered likely to indicate substantive heterogeneity. To investigate sources of heterogeneity among studies, subgroup meta-analyses were planned according to the number of fully active drugs in the OBT (zero, 1, or 2 fully active drugs) and the study’s risk of bias (high, moderate, or low risk of bias). Risk of publication bias was assessed by a funnel plot. Analyses were performed using the Review Manager software (RevMan version 5.1), and the GRADEpro GDT software was used to synthesize the information assessed in accordance with the GRADE criteria [26].
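The pooling described above can be illustrated with a short sketch of a DerSimonian-Laird random-effects model for risk differences, together with Cochran’s Q and I². This is a simplified illustration under stated assumptions: it uses inverse-variance weights throughout, so it is not the Mantel-Haenszel variant implemented in RevMan, and the event counts in the example are hypothetical, not data from the included trials.

```python
import math

def risk_difference(events_t, n_t, events_c, n_c):
    """Risk difference (treatment minus control) and its variance."""
    p_t, p_c = events_t / n_t, events_c / n_c
    rd = p_t - p_c
    # Variance is zero if an arm has 0% or 100% events; such studies
    # would need a continuity correction before pooling.
    var = p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c
    return rd, var

def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate with Cochran's Q and I2 (%)."""
    w = [1.0 / v for v in variances]              # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0   # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]    # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "pooled": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }

# Hypothetical counts: (events_t, n_t, events_c, n_c) for three trials.
hypothetical_trials = [(60, 200, 30, 200), (90, 300, 75, 300), (45, 150, 15, 150)]
pairs = [risk_difference(*t) for t in hypothetical_trials]
result = dersimonian_laird([rd for rd, _ in pairs], [var for _, var in pairs])
print(result)
```

An I² value above 50% from such a computation would fall under the threshold the text treats as substantive heterogeneity.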

Results

Study selection

Eighteen randomized controlled trials (totaling 7963 patients) comprising 47 ART comparison groups were retrieved [15,16,17,18,19, 27,28,29,30,31,32,33,34,35,36,37,38,39] (Fig. 1 and Table 1). We also identified another 10 publications reporting extension results (i.e., results from the previously selected RCTs at later follow-up times, namely 48 and 96 weeks) and 15 studies reporting subgroup analyses (Additional file 3: Table S2). Although some studies reporting extension results were included in the meta-analysis, only the 18 original studies were considered in the final analysis.

Fig. 1 Flow chart of systematic review and study selection. *We identified 57 abstracts, including 25 whose articles were not published, 28 already identified in our peer-reviewed literature, and 4 containing exclusion criteria. Only 2 articles were selected. #Some studies reporting extension results were considered in the meta-analysis instead of their original studies

Table 1 Randomized controlled trial characteristics

Study characteristics

The 18 RCTs assessed the efficacy and safety of nine new ARVs: ENF [15, 39], TPV [16, 28], DAPD [36], DRV [17, 29, 37], ETV [30,31,32], RAL [18, 38], MVC [19, 33], VIC [27, 34], and DTG [35] (Table 1). Eleven of those studies (61%) were characterized as phase III trials and seven as phase IIb studies. No post-marketing study was retrieved. Length of follow-up varied among the studies: one study lasted 16 weeks [18], twelve studies lasted 24 weeks [15,16,17, 28,29,30,31,32,33, 36, 38, 39], and five studies assessed patients after 48 weeks of follow-up [19, 27, 34, 35, 37]. All but two studies [35, 37] were superiority trials. The definition of the primary outcome varied substantially among the studies.

Treatment efficacy

It is important to note that some trials reporting extension results were also considered in our meta-analysis, provided they reported findings at 48 weeks [40,41,42,43]. The number of studies analyzed therefore fell from 18 to 14, because some extension publications aggregated the results of two original studies into a single report [40,41,42,43]. Efficacy results are shown based on the proportion of patients who achieved HIV RNA viral load results below 50 copies/ml at 48 weeks of follow-up (Fig. 2). Of the 14 trials [18, 19, 27, 32,33,34,35,36,37,38, 40,41,42,43], nine demonstrated the superiority of the new ARVs plus OBT over the OBT control groups, thus demonstrating the efficacy of said ARVs [18, 19, 34, 37, 38, 40,41,42,43]. The other 5 trials did not demonstrate the efficacy of the new ARVs + OBT in comparison to the control group [27, 32, 33, 35, 36]. The pooled measure showed that the proportion of individuals achieving viral undetectability was 18 percentage points higher among those who received the new investigational drug as part of an ARV combination than in the control groups: risk difference 0.18 (95% CI 0.13–0.23; I² = 84%, P < 0.00001).

Fig. 2 Meta-analysis comparing treatment and control groups. *Studies presenting data at week 24. Outcome: proportion of patients with < 200 HIV-1 RNA copies/mL. Outcome: proportion of patients with < 50 HIV-1 RNA copies/mL at week 48. Obs. BENCHMRK studies present data in a 96-week follow-up

When analyzing the outcome “proportion of patients that achieved HIV RNA viral load results below 50 copies/ml at 48 weeks of follow-up” stratified by the number of fully active ARVs (OBT with zero, 1, or 2+ active drugs), only 8 studies were considered [18, 19, 35, 37, 38, 40, 42, 43], because these were the only ones that reported results according to such stratification (Fig. 3). Some studies only presented these results in the extension follow-up [44] or in the subgroup analysis [45]. The SAILING study [35] presented results stratified by active drugs in OBT in 2 categories only (< 2 and ≥ 2 active drugs in OBT). Therefore, data for the category < 2 active drugs in OBT were presented in the subgroup “1 active drug in OBT,” and data were not estimable in the subgroup “0 active drugs in OBT.” The risk difference in the strata with zero, one, and two or more fully active drugs was 0.29 (95% CI 0.12–0.46), 0.28 (95% CI 0.17–0.38), and 0.17 (95% CI 0.10–0.24), respectively. That is, the risk difference for reaching an RNA viral load below 50 copies/ml was smaller in the stratum with two or more fully active drugs than in the strata with zero or one, yet the differences among strata were not statistically significant. The pooled risk difference considering the three strata (i.e., OBT with zero, 1, and 2 or more active drugs) was 0.24 (95% CI 0.18–0.30).
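A rough check of the between-strata comparison can be made from the three reported risk differences and their confidence intervals alone. The sketch below is an approximation under stated assumptions: standard errors are back-calculated from the rounded 95% CIs, the strata are combined with inverse-variance weights, and the resulting Q statistic is referred to a chi-square distribution with 2 degrees of freedom. It is not the test run in RevMan and is not expected to reproduce its output exactly.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def se_from_ci(lower, upper):
    """Approximate standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * Z95)

def subgroup_difference_test(estimates):
    """Q test for differences among exactly three subgroup estimates.

    estimates: list of (risk_difference, ci_lower, ci_upper).
    Returns (Q, p_value); with three subgroups Q has 2 degrees of freedom,
    for which the chi-square survival function is exp(-Q / 2).
    """
    if len(estimates) != 3:
        raise ValueError("this sketch handles exactly three subgroups")
    weights = [1.0 / se_from_ci(lo, hi) ** 2 for _, lo, hi in estimates]
    mean = sum(w * rd for w, (rd, _, _) in zip(weights, estimates)) / sum(weights)
    q = sum(w * (rd - mean) ** 2 for w, (rd, _, _) in zip(weights, estimates))
    return q, math.exp(-q / 2)

# Strata as reported in the text: zero, one, and two or more active drugs in OBT.
strata = [(0.29, 0.12, 0.46), (0.28, 0.17, 0.38), (0.17, 0.10, 0.24)]
q_between, p_value = subgroup_difference_test(strata)
print(f"Q = {q_between:.2f}, p = {p_value:.3f}")
```

On the rounded figures, the approximate p-value stays above 0.05, which agrees with the statement that the differences among strata were not statistically significant.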

Fig. 3 Meta-analysis comparing treatment and control groups stratified by the number of active drugs in OBT. RD, risk difference; CI, confidence interval. *Studies presenting data at week 24. ¹Outcome: proportion of patients with < 200 HIV-1 RNA copies/mL. Outcome: proportion of patients with < 50 HIV-1 RNA copies/mL at week 48. Obs.1: BENCHMRK studies present data in a 96-week follow-up. Obs.2: SAILING study presented results stratified by active drugs in OBT only in 2 categories (< 2 and ≥ 2 active drugs in OBT). Data of category < 2 active drugs in OBT were presented in subgroup “1 active drug in OBT”

Over the period analyzed, we observed a linear upward trend in viral suppression rates in both intervention and control groups, starting from the TORO and RESIST trials [40, 41], with respectively 18% vs 8% and 23% vs 10%, to the SAILING trial [35], with rates of 71% vs 64% (Fig. 4). However, for this same outcome, the difference between the intervention and control groups (the space between the dotted lines) became smaller over the period.
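The narrowing gap can be made concrete using only the suppression rates quoted in this paragraph. The sketch below computes the absolute difference (risk difference) and the ratio (relative risk) for the three trials named; the dictionary and function names are hypothetical, and the rates are the rounded percentages from the text rather than the underlying counts.

```python
# Viral suppression rates quoted in the text: (new ARV + OBT, control).
trials = {
    "TORO": (0.18, 0.08),
    "RESIST": (0.23, 0.10),
    "SAILING": (0.71, 0.64),
}

def absolute_and_relative(p_new, p_control):
    """Return (risk difference, relative risk) for two proportions."""
    return p_new - p_control, p_new / p_control

summary = {name: absolute_and_relative(*rates) for name, rates in trials.items()}

for name, (rd, rr) in summary.items():
    print(f"{name}: RD = {rd:.2f}, RR = {rr:.2f}")
```

On these figures the absolute difference is 10, 13, and 7 percentage points for TORO, RESIST, and SAILING respectively, while the relative risk falls much more steeply because control-arm suppression rose from 8% to 64%.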

Fig. 4 Time evaluation of the proportion of patients with < 50 HIV-1 RNA copies/mL at week 48

Two studies [16, 28] adopted genotypic sensitivity scores, six used phenotypic scores [15, 30, 31, 37,38,39], and five studies used both methods, which resulted in an overall sensitivity score [18, 19, 27, 33, 35]. The remaining studies did not provide data on the method chosen for the detection of viral resistance [17, 29, 32, 34, 36].

All RCTs assessed CD4 cell count as a secondary outcome. The minimum and maximum increases in CD4 cell count when intervention groups were compared with control groups were 19 and 108 cells/mm³, respectively, at week 24 [15,16,17, 28,29,30,31,32,33, 36, 38, 39], and 7 and 67 cells/mm³, respectively, at week 48 [19, 27, 34, 35, 37]. The increase in CD4 cell count at week 16 was 64 cells/mm³ [18] (Table 1). The average increase in CD4 cell count was not calculated due to the large heterogeneity in the follow-up time of the studies that presented such data.

Furthermore, only seven studies analyzed disease progression outcomes [16, 18, 28, 30, 31, 35, 37]. Fourteen trials presented results related to mortality [16,17,18,19, 27,28,29,30,31,32,33,34,35, 37].

Risk of bias

The studies were analyzed in relation to the outcome “proportion of patients with < 50 HIV-1 RNA copies/mL at week 48” stratified by study risk of bias according to the Cochrane Collaboration’s tool [22]. The results presented three categories: (a) high risk [15,16,17, 27,28,29, 32, 33, 39]; (b) moderate risk [19, 30, 31, 34,35,36,37]; and (c) low risk [18, 38] (Fig. 5). Pooled risk difference between intervention and control groups varied significantly (p-value < 0.001 in the overall test) among subgroups according to the risk of bias, with 0.12 (95% CI 0.07–0.18), 0.20 (95% CI 0.11–0.29), and 0.33 (95% CI 0.21–0.45) for high, moderate, and low risk of bias respectively. Pairwise comparisons using the Wald test showed a significant difference between high and low risk of bias subgroups (p-value = 0.0045). Although not significant for the high vs. moderate risk (p-value = 0.1602) and for the moderate vs. low risk (p-value = 0.0630), a tendency was observed among the subgroups, showing that the lower the risk of bias in the studies, the greater the risk difference between intervention and control group.
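The pairwise comparisons can be approximated from the reported subgroup estimates. The sketch below is a hedged illustration, not the authors' computation: it back-calculates standard errors from the rounded 95% CIs (one of which is slightly asymmetric after rounding) and applies a two-sided Wald z-test to the difference between two independent estimates. Because of these approximations, it should not be expected to reproduce the p-values reported above.

```python
import math
from itertools import combinations

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def se_from_ci(lower, upper):
    """Approximate standard error from a 95% confidence interval."""
    return (upper - lower) / (2 * Z95)

def wald_p_value(est_a, est_b):
    """Two-sided Wald test for the difference between two independent estimates.

    Each estimate is a tuple (risk_difference, ci_lower, ci_upper).
    """
    (rd_a, lo_a, hi_a), (rd_b, lo_b, hi_b) = est_a, est_b
    se_diff = math.sqrt(se_from_ci(lo_a, hi_a) ** 2 + se_from_ci(lo_b, hi_b) ** 2)
    z = (rd_a - rd_b) / se_diff
    # P(|Z| > |z|) for a standard normal variable
    return math.erfc(abs(z) / math.sqrt(2))

# Pooled risk differences by risk-of-bias subgroup, as reported in the text.
subgroups = {
    "high": (0.12, 0.07, 0.18),
    "moderate": (0.20, 0.11, 0.29),
    "low": (0.33, 0.21, 0.45),
}

p_values = {
    (a, b): wald_p_value(subgroups[a], subgroups[b])
    for a, b in combinations(subgroups, 2)
}
for pair, p in p_values.items():
    print(pair, round(p, 4))
```

Even with these approximations, the qualitative pattern matches the text: only the high versus low comparison falls below 0.05.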

Fig. 5 Meta-analysis comparing treatment and control groups stratified by study risk of bias. RD, risk difference; CI, confidence interval. *Studies presenting data at week 24. ¹Outcome: proportion of patients with < 200 HIV-1 RNA copies/mL. Outcome: proportion of patients with < 50 HIV-1 RNA copies/mL at week 48. Risk of bias categories: high risk of bias (high risk of bias in 2 or more criteria, according to Cochrane Collaboration’s tool for assessing risk of bias, or high risk of bias in 1 criterion and unclear risk of bias in 2 or more criteria), moderate risk of bias (high risk of bias in only 1 criterion and unclear risk of bias in only 1 criterion, or no criteria with high risk of bias and unclear risk of bias in 1 or more criteria), and low risk of bias (all criteria with low risk of bias). Obs. BENCHMRK studies present data in a 96-week follow-up

Ten of the 18 studies (55.6%) did not provide enough information for us to assess the method used to generate the randomization sequence [16, 17, 19, 28, 29, 32,33,34, 36, 39] (Fig. 6A, B). Fifteen studies (83.3%) did not clarify the procedures adopted for allocation concealment [15,16,17, 19, 27,28,29,30,31,32,33,34,35,36, 39]. High risk of performance bias due to lack of participant and researcher blinding was found in eight (44.4%) studies [15,16,17, 28, 29, 32, 37, 39].

Fig. 6 Risk of bias graph and summary of the 18 RCTs analyzed. A Risk of bias graph. Review authors’ judgements about each risk of bias item presented as percentages across all included studies. B Risk of bias summary. Review authors’ judgements about each risk of bias item for each included study

Assessment of publication bias

Despite the relatively small number of retrieved RCTs, the funnel plot (Additional file 1: Figure) does not suggest publication bias.

Quality of evidence

We assessed the recommendation levels and quality of evidence findings according to the GRADE criteria [26] (Table 2). We observed that the subjects who received new investigational drugs were more likely (47.8%) to achieve the outcome (i.e., less than 50 copies/ml) than control subjects (RR 1.5; 95% CI 1.4–1.6); the quality of this evidence was rated as moderate.

Table 2 GRADE summary of findings

New investigational ARV groups were associated with an average increase of 40.2 cells/mm³ in CD4 count in comparison to the control group. The quality of this evidence was rated as very low; consequently, the corresponding effect estimate is very imprecise.

Discussion

The efficacy results demonstrated that the groups that received the new ARV + OBT were more likely to achieve viral suppression than the control groups. Nine trials established such superiority [18, 19, 34, 37, 38, 40,41,42,43] and the pooled measure confirms this finding (risk difference 0.18, 95% CI 0.13–0.23) (Fig. 2). That is, the proportion of individuals achieving viral undetectability was 18 percentage points higher among those who received the ARV combination with the new drugs than in the control groups. In addition, the studies showed that new drugs provide CD4 cell count recovery, even though the magnitude of this increase was notably modest (mean of 40 cells/μL).

All categories demonstrated statistical significance in favor of the experimental group when considering the achievement of an HIV RNA viral load below 50 copies/ml at 48 weeks of follow-up according to the number of active drugs in the OBT regimen (i.e., zero, one, and two or more fully active drugs). However, although the risk differences varied among strata, these differences did not reach statistical significance in our meta-analysis. Even so, the tendency observed in Fig. 3 suggests that adding a novel drug to the OBT might have slightly less effect in achieving complete viremia suppression when there are two or more active drugs in the OBT. To the best of our knowledge, this is the first meta-analysis that has made such a comparison. Moreover, our results support the evidence that the greater the number of active drugs in the therapeutic regimen, the higher the chance of viral suppression, no matter which drugs are used in the OBT. Pichenot et al.’s meta-analysis, which assessed the efficacy of ART in treatment-experienced HIV-infected adults, demonstrated that the most important predictive factor for achieving undetectable HIV RNA was the number of fully active drugs included in the regimen [14], which is in agreement with our findings.

Among the 18 RCTs included in our study, only two [18, 38] showed low risk of bias according to the 6 methodological evaluation criteria used [22], a finding which therefore demonstrates that most of the studies were prone to bias (Fig. 6B). In addition, we observed that the lower the risk of bias in the studies, the greater the risk difference between the comparison groups (Fig. 5), thus emphasizing the importance of assessing the risk of bias in studies, as well as its influence on the efficacy results.

A study that conducted a bibliometric analysis of 103 HIV reviews found that further HIV trials are necessary and that it is essential for future trials to incorporate strategies that reduce the risk of bias, since design and methodological flaws have limited the usability of the findings [46]. According to the GRADE criteria, further research is needed to achieve more stable evidence related to efficacy outcomes (Table 2). Besides, even the recommendations of specific drugs [6,7,8] were considered a 1C grading of evidence, characterized as potentially biased, thus stemming from trials with serious flaws and with uncertain effect estimates.

One of the objectives of this systematic review was to analyze the scope of existing evidence on third-line therapy in the scientific literature, so as to produce data to support guidelines recommending the best antiretroviral schemes for patients to take through a staggered approach. However, third-line ART still lacks an operational definition. Some recent publications that used the term “third-line therapy” were based on drugs with a high genetic barrier to resistance [47,48,49]. One is an observational study conducted in Southern Africa [48], and the others are retrospective studies carried out in Johannesburg [47] and in Latin America [49]. All three defined third-line regimens as those that use newer-generation drugs such as the NNRTI etravirine and the protease inhibitor darunavir, as well as the integrase inhibitor raltegravir. Consequently, because there is no single clear definition, the third-line drug schemes chosen for these studies were highly heterogeneous.

Nevertheless, the comparisons made by the trials that we included do not allow for recommendations as to which drugs multi-experienced patients should use, given that the 18 studies were essentially designed for regulatory purposes. Consequently, the evidence summarized herein is not yet enough to determine which antiretroviral combinations are more effective in therapeutic regimens that are used sequentially, even more so considering the individualized needs of patients with documented multi-drug resistance. This does not mean that future studies will be unable to demonstrate this, but rather that the RCTs analyzed did not assess which antiretroviral combinations are the most effective. Furthermore, a review by Vitoria and coworkers concluded that sequencing first-line, second-line, and third-line regimens will allow better planning by providing a rationale for the choice and the number of regimens that programs need to obtain [50].

Therefore, it is our understanding that further studies should be carried out aiming specifically to define the best ART combinations to be used for HIV-experienced patients who require third-line therapy. Such findings would provide essential information to improve procurement and logistics, all the while providing much needed evidence-based consistency of treatment that would reduce the uncertainty that is experienced at this stage of treatment.

This systematic review has some limitations. It presents only efficacy data, because the included trials did not use uniform definitions of adverse events and because most were phase II and phase III trials, which do not include a more detailed assessment of the safety of the evaluated drugs. Furthermore, the studies evaluating new drugs in experienced patients are essentially based on surrogate outcomes. We did not assess outcomes related to viral resistance. Moreover, these trials were conducted mostly in high- and middle-to-high-income countries. Most subgroup analyses evaluating multidrug-resistance profiles in the existing studies were post hoc and included a small number of patients. The substantial heterogeneity found is probably attributable to differences in how the primary outcomes were assessed within the studies, as well as to the aforementioned differences in length of follow-up. A high degree of heterogeneity between studies remained even though we used strategies to reduce it, such as defining a widely used primary outcome and a fixed follow-up time, and analyzing effectiveness according to the number of fully active drugs in the OBT. This was due to the large number of cARTs analyzed, a substantial variation in the definition of the resistance criteria for ART, differences in the assessment of such resistance among the different existing phenotypic and genotypic tests [23], the risk of bias category to which each trial belongs, and the distinct approaches to handling effectiveness data and to study design.

When compared to Pichenot et al.’s meta-analysis [14], the present study innovated in assessing the virological success rate according to the number of fully active drugs in OBT and added more relevant studies, such as A5118 [36], TITAN [37], Grinsztejn et al. [38], and SAILING [35]. Also, unlike the 2012 publication [14], our study assessed the risk of bias using Cochrane’s Risk of Bias assessment tool and analyzed the recommendation levels together with the quality of evidence according to the GRADE criteria [26].

More than 15 years have elapsed since the first RCT adopting an OBT approach for experienced HIV-infected subjects was published [15, 39]. However, despite the benefits of new ARVs shown in the studies of the “OBT-era,” the future of OBT-based RCTs evaluating new drugs is controversial because of the growing difficulty of establishing superiority in a context of therapeutic combinations that have become progressively more powerful [51]. Figure 4 shows that, over the investigated period, viral suppression rates tended to be higher in recent studies than in older ones. This rising trend emerged in both comparison groups, experimental and control, and the differences between them have been decreasing over time. Such a result is in accordance with the evidence referred to above, indicating a scenario of limited use of OBT strategies to demonstrate the efficacy of new ARVs. There are, however, two recent studies that have used the OBT approach: the first, a clinical trial conducted across 23 countries that evaluated the efficacy of fostemsavir in adults with multidrug-resistant HIV-1 infection [52], and the second, a retrospective analysis using secondary data to assess the efficacy of dolutegravir in antiretroviral-experienced patients over a 5-year follow-up period [53].

Novel trial designs for new antiretroviral drugs intended for use with treatment-experienced HIV-infected patients on a failing regimen have been suggested in the past [54], yet their benefits remain to be better assessed, especially regarding efficacy and safety data. Despite unequivocal advances related to therapeutic options for experienced patients, the findings shown here suggest that such evidence has not been fully assessed over the years. Though studies with treatment-experienced HIV-infected patients are necessary and must be developed, they should start from an operational definition of what third-line therapy actually is. Moreover, these trials must be performed in low- and middle-income countries and must evaluate outcomes of disease progression and mortality. After regulatory goals have been accomplished, explanatory RCTs could be replaced by pragmatic RCTs [55] that are capable of effectively assessing which antiretroviral therapy combinations for experienced patients allow clinically relevant results to be achieved.

Conclusion

Our findings demonstrated that the groups of multi-experienced patients that received the new ARV + OBT presented a better chance of achieving viral suppression in comparison to control groups, even when the analysis was stratified according to the number of active drugs in the OBT regimen. Nevertheless, we found some risk difference among strata and a tendency supporting the evidence that the greater the number of active drugs in the therapeutic regimen, the higher the chance of viral suppression, no matter what drugs are used in the OBT. Furthermore, among the eighteen RCTs analyzed, only two showed low risk of bias, a finding which demonstrates how prone to bias the studies might be. As to the scope of evidence on third-line ART, we found that third-line schemes are highly heterogeneous.

Finally, once again, it is important to point out that the RCTs included in this study were essentially designed for regulatory purposes, thus resulting in insufficient evidence to define which combinations are the most effective, especially in a public health approach through clinical recommendations for third-line regimens. New studies with a clear operational definition of third-line therapy and using a sequential cART approach for treatment-experienced patients should be developed in order to enable the creation of better guidelines/schemes including the evaluation of their efficacy and safety.