Key Points

Overall, the two first-line antiretroviral regimens recommended by the World Health Organization (i.e. dolutegravir 50 mg [DTG]-based and low-dose [400 mg] efavirenz [EFV400]-based regimens) provided similar patient-reported outcomes (including perceived symptoms, depression, anxiety, and mental and physical quality of life) over the first 96 weeks of follow-up in HIV-1-positive patients enrolled in the randomised NAMSAL ANRS 12313 trial in Cameroon.

Despite the higher risk of obesity associated with DTG, which in turn may impair quality of life, the base-case analysis over 96 weeks showed that the DTG-based regimen was the preferred first-line regimen, with a net health benefit of 0.056 (− 0.037 to 0.153), corresponding to a cost-effectiveness probability of 88% at the threshold of US$500 per quality-adjusted life-year gained.

When outcomes were extrapolated over 5 and 10 years, the cost-effectiveness of the DTG-based first-line regimen further improved, with a ≥ 95% probability of being cost-effective for a large range of cost-effectiveness thresholds (from US$0 to US$5147 over 10 years). This was due to both the lower virological failure risk (using the World Health Organization definition) and the lower acquired drug resistance observed in the DTG arm than in the EFV400 arm, which in turn resulted in higher long-term health benefits and lower costs.

1 Introduction

Until mid-2018, the preferred first-line regimen for treating human immunodeficiency virus type 1 (HIV-1) infection in low- and middle-income countries (LMIC) was a 600 mg efavirenz (EFV)-based regimen (EFV600) [1]. However, concerns about adverse neurosensory effects of EFV600 [2], as well as the low genetic barrier to resistance of this drug, which may result in the accumulation of drug-resistance mutations in the absence of regular viral-load monitoring [3], led to two other first-line regimens being subsequently considered. The first is based on dolutegravir (DTG) (50 mg), a new integrase inhibitor, which is superior to EFV600 in terms of viral suppression and immunological recovery [3]. Additional advantages of DTG include its high genetic barrier to resistance and its availability in a low-cost fixed-dose combination (FDC). However, concerns have been raised about its safety during pre-conception and pregnancy [4] and about its tolerability, because of the associated risk of insomnia and obesity [3, 5, 6]. The second alternative is a low-dose (400 mg) EFV-based regimen (EFV400), which is non-inferior to EFV600 in terms of efficacy and has a better safety profile [7]. Moreover, it is effective and safe in pregnant women and in patients with tuberculosis [7]. The updated World Health Organization (WHO) 2019 guidelines recommend using a DTG (50 mg)-based regimen as the preferred first-line antiretroviral therapy (ART) and an EFV400-based regimen as an alternative [8].

Conducted in Cameroon, the New Antiretroviral and Monitoring Strategies in HIV-infected Adults in Low-Income Countries (NAMSAL) ANRS 12313 trial is the first trial designed to assess the performance of these two regimens in a head-to-head comparison in HIV-infected individuals initiating ART [9]. Primary results at 48 weeks demonstrated the non-inferiority of the DTG-based regimen [9]. Analyses at 96 weeks confirmed this result and found a lower proportion of virological failure (defined by the WHO as an HIV-RNA viral load > 1000 copies/mL after adherence reinforcement) in the DTG arm (eight of 310 vs 19 of 303 in the EFV400 arm), with no observed resistance mutation to DTG (versus 17 among the 19 EFV400 failure cases) [10]. However, weight gain and incidence of obesity were significantly higher in the DTG arm.

Beyond the clinical challenges, the choice of which first-line regimen to use raises important economic questions in sub-Saharan Africa (SSA), where the prevalence of primary drug resistance continues to increase while access to viral-load monitoring and resistance testing is still limited, mainly because of their high cost [11]. In addition, taking the patient’s point of view into account when assessing treatment alternatives is of particular importance [12], especially considering the concerns about DTG tolerability in terms of weight gain and obesity.

Despite these challenges, knowledge about the patient and economic values of these two regimens in SSA is limited. Although the cost-effectiveness of a DTG-based first-line regimen in this region has recently been assessed using modelling [13,14,15], no study has compared DTG with EFV400 while taking the patient’s point of view into account.

The present study aimed to provide a comprehensive assessment comparing patient-reported outcomes (PROs), costs, and the cost-utility of the WHO’s preferred DTG-based and alternative EFV400-based first-line regimens in the randomised trial NAMSAL ANRS 12313 in Yaoundé, Cameroon.

2 Methods

2.1 Trial Design, Data Collection, and Study Population

NAMSAL was conducted among 613 HIV-1 positive, treatment-naive adults (i.e. aged > 18 years) enrolled between July 2016 and August 2017 in three HIV day-care centres in Yaoundé. It was initially intended to last 48 weeks, but was extended an additional 96 weeks. The present economic evaluation was conducted using the first 96 weeks of the trial to provide evidence on the economic value of DTG- and EFV400-based first-line regimens concomitantly with the recently published clinical results at 96 weeks [10]. Full details of the trial design, procedures, analysis, and results are described elsewhere [9, 10]. Briefly, participants were randomly assigned (1:1) to receive either DTG 50 mg (DTG arm) or low-dose (i.e. 400 mg) EFV (EFV400 arm) once daily, both combined with tenofovir disoproxil fumarate (TDF) and lamivudine (3TC).

Clinical visits were scheduled at baseline and then quarterly. Laboratory monitoring, including CD4 count and plasma viral load measurements, continued biannually until week 96 (W96). Genotypic drug-resistance testing was performed retrospectively at baseline and at the time of virological failure. Patients with virological failure were switched to a boosted protease inhibitor (bPI)-based second-line regimen, primarily based on lopinavir and darunavir.

Data on PROs were collected through face-to-face questionnaires administered to participants after the clinical visit at baseline and then quarterly. The following instruments were used: (1) the Justice index [16] complemented with 12 items on neuro-sensorial symptoms known to be associated with EFV (see the Online Resource, p. 4) [17]; (2) the short-form version of the Depression, Anxiety and Stress Scale (DASS-21) [18]; and (3) the 12-item Medical Outcomes Study Short-Form General Health Survey (MOS SF-12, 2nd version) [19]. The 21 items of the DASS-21—each response having a 4-point ordinal severity/frequency scale—enabled the construction of depression, anxiety, and stress scores (ranging from 0 to 42, with higher values corresponding to poorer outcomes). These scores were used to classify patients according to their severity of depression, anxiety, and stress, based on validated cut-offs [18]. The SF-12 enabled the computation of eight sub-scales and two aggregate scores—the physical component summary (PCS) and mental component summary (MCS)—both ranging from 0 to 100 (higher values corresponding to better health-related quality of life).

The base-case analysis was conducted using a modified per-protocol analysis (n = 575, with 275 and 300 participants in the DTG and EFV400 arms, respectively). Thirty-eight participants were excluded due to major protocol deviations, mainly related to switching from DTG to EFV600 because of pregnancy (see the flowchart in the Online Resource, p. 5). Overall, the median (interquartile range [IQR]) age of the study population was 37 (29–45) years, and 64% were women. At baseline, the median (IQR) CD4 count (cells/mm³) and viral load (log10 copies/mL) were 282 (151–444) and 5.3 (4.8–5.8), respectively, and 66% of patients had a viral load ≥ 100,000 copies/mL.

2.2 Outcomes

The main outcomes considered in the analysis were (1) PROs and quality-adjusted life-years (QALYs), (2) costs reflecting prices over the 2016–2020 period (expressed in US$, 2016 values) and assessed from the perspective of the Cameroonian health system, and (3) incremental costs and QALYs. Outcomes were first measured over the first 96 weeks of follow-up based on NAMSAL data, and then extrapolated over 5 and 10 years using modelling. QALYs and costs were discounted at an annual rate of 3% [20].

The distribution of PROs per arm was described at the different follow-up time points using mean (standard deviation [SD]) or proportions (standard error [SE]). Mean or proportion differences between the two arms were assessed at each time point using independent group t tests and two-sample proportions tests, respectively. Median and proportion changes over time were assessed using the Wilcoxon rank sum test and the McNemar's chi-squared test, respectively. Analyses were also stratified by gender and obesity status at each time point.

For each arm, we then estimated the total number of QALYs gained and the mean number (95% confidence interval [CI]) of QALYs gained per patient over 96 weeks. The total number of QALYs gained per arm was measured as the sum of QALYs gained by each patient, computed as the time spent in a given health state weighted by its utility score. We used the MOS SF-12 scale (which has been validated in Cameroon [21]) to describe patients’ health states and a mapping algorithm—developed from a sample of the general population in the United Kingdom (UK) using the standard gamble (SG) valuation technique—to obtain the corresponding preference-based health-related SF-6D utility scores [22].
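As an illustration of the time-weighted calculation described above, the following minimal Python sketch computes discounted QALYs for a single patient from utility scores recorded at successive visits. The linear interpolation between visits, the mid-interval discounting, the function name `qalys_from_visits`, and the utility values are all assumptions made for illustration; they are not taken from the trial analysis.

```python
def qalys_from_visits(weeks, utilities, annual_discount=0.03):
    """Sum utility-weighted time between consecutive visits (in years).

    Assumptions (illustrative only): utility changes linearly between
    visits, and each interval is discounted at its midpoint.
    """
    if len(weeks) != len(utilities):
        raise ValueError("weeks and utilities must have the same length")
    total = 0.0
    for i in range(len(weeks) - 1):
        years = (weeks[i + 1] - weeks[i]) / 52.0           # interval length
        mean_utility = 0.5 * (utilities[i] + utilities[i + 1])
        midpoint = 0.5 * (weeks[i] + weeks[i + 1]) / 52.0  # years since baseline
        discount = (1.0 + annual_discount) ** (-midpoint)
        total += mean_utility * years * discount
    return total


# Hypothetical SF-6D utility scores at quarterly visits (not trial data).
weeks = [0, 12, 24, 36, 48, 60, 72, 84, 96]
utilities = [0.70, 0.74, 0.76, 0.78, 0.79, 0.79, 0.80, 0.80, 0.80]
total = qalys_from_visits(weeks, utilities)
print(f"Discounted QALYs over 96 weeks: {total:.3f}")
```

Summing this quantity over all patients in an arm would give the total number of QALYs for that arm under the stated assumptions.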

We computed costs incurred from enrolment to W96 or, if follow-up time was shorter, to the last visit [23]. The following cost items were included: antiretroviral drugs (ARV), outpatient consultations (including non-routine visits), laboratory tests, concomitant drugs, and hospitalisations. For each arm, total costs were computed as the sum of all individual healthcare resources used by patients during follow-up (as reported in standardised case report forms) multiplied by their respective unit cost.

Unit costs were obtained from data collected prospectively at the study sites during the trial period, complemented by the most recent relevant national sources for concomitant drugs and laboratory tests, and by international sources for ARV (Online Resource, p. 6). In the base-case analysis, considering the most recent ARV prices (3rd quarter 2020), the monthly prices of DTG- and EFV400-based FDC regimens were $5.2 and $5.4 (year 2016 values), respectively [24]. All nominal unit costs were first converted to US dollars using the year-specific exchange rates [25] and then deflated to real values (base year 2016) using the annual US gross domestic product (GDP) deflator [26].
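The two-step conversion of nominal unit costs (currency conversion, then deflation to the 2016 base year) can be sketched as follows. The function name `to_real_usd`, the exchange rates, and the deflator index values are hypothetical placeholders chosen for readability; they are not the figures used in the study.

```python
def to_real_usd(amount_local, year, exchange_rates, deflator, base_year=2016):
    """Convert a nominal local-currency cost to real US$ of the base year.

    exchange_rates: local currency units per US$, by year (placeholder values).
    deflator: US GDP deflator index, by year (placeholder values).
    """
    usd_nominal = amount_local / exchange_rates[year]      # step 1: convert
    return usd_nominal * deflator[base_year] / deflator[year]  # step 2: deflate


# Placeholder inputs for illustration only (not actual rates or indices).
exchange_rates = {2016: 600.0, 2018: 550.0}
deflator = {2016: 100.0, 2018: 104.0}

cost_2016 = to_real_usd(6000.0, 2016, exchange_rates, deflator)
cost_2018 = to_real_usd(5500.0, 2018, exchange_rates, deflator)
print(cost_2016, round(cost_2018, 2))
```

A cost recorded in the base year is only converted, whereas a cost recorded in a later year is also scaled down by the ratio of the two deflator values.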

The treatment of monotone and non-monotone missing data for both costs and QALYs is described in the Online Resource (p. 7).

2.3 Economic Analysis

The methods employed in the economic analysis were in line with international guidelines [27, 28]. Using generalised linear models (GLM) to adjust for any imbalances in baseline characteristics, we estimated for each arm (1) the adjusted-mean costs and QALYs per patient and (2) the adjusted-incremental costs and QALYs (Online Resource, p. 8). We then calculated the net health benefit (NHB) of the DTG-based regimen relative to the EFV400-based regimen [29], assuming a cost-effectiveness threshold of $500/QALY based on opportunity cost [30]: NHB = incremental QALYs − incremental costs/$500.
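As a worked check of the NHB formula, the short Python sketch below applies it to the base-case point estimates reported in Sect. 3.2 (incremental QALYs of 0.000 and incremental costs of − $27.8) at the $500/QALY threshold. The function name `net_health_benefit` is an illustrative choice, and the calculation uses point estimates only, without any uncertainty.

```python
def net_health_benefit(delta_qalys, delta_costs, threshold):
    """NHB = incremental QALYs - incremental costs / threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return delta_qalys - delta_costs / threshold


# Base-case point estimates reported in the Results (DTG vs EFV400).
nhb = net_health_benefit(delta_qalys=0.000, delta_costs=-27.8, threshold=500.0)
print(round(nhb, 3))  # 0.056
```

Because the incremental QALYs are zero, the whole NHB comes from the cost saving expressed in QALY units (27.8/500).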

Uncertainty was addressed using a probabilistic sensitivity analysis (PSA) based on Monte Carlo simulations. We first performed a Cholesky decomposition of the variance-covariance matrices obtained from both cost and QALY regressions. This allowed us to parametrise a multivariate normal distribution that was then used to randomly simulate 5000 draws of incremental mean costs and QALYs [29]. Based on these simulations, we computed 95% CI for adjusted-incremental costs, QALYs, and the NHB. We then calculated the probability that the DTG-based regimen was cost-effective for cost-effectiveness thresholds varying from $0 to $10,000/QALY, including $500/QALY, as the proportion of simulated draws with a positive NHB. The cost-effectiveness acceptability curve (CEAC) was finally obtained by plotting these probabilities on the y-axis versus the cost-effectiveness thresholds (from $0 to $10,000/QALY) on the x-axis [31].
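The PSA steps described above (Cholesky factor, correlated normal draws, share of draws favouring DTG at each threshold) can be sketched in plain Python as follows. This is a simplified illustration, not the study code: the standard errors are back-calculated from the reported 95% CI under a normal approximation, the correlation between incremental costs and QALYs is assumed to be zero, and the function names are hypothetical. The net monetary benefit is used in place of the NHB because it has the same sign for any positive threshold and remains defined at a threshold of $0.

```python
import math
import random


def cholesky_2x2(cov):
    """Lower-triangular Cholesky factor of a 2x2 covariance matrix."""
    (a, b), (_, c) = cov
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(c - l21 ** 2)
    return [[l11, 0.0], [l21, l22]]


def simulate_increments(mean, cov, n_draws=5000, seed=1):
    """Draw (incremental cost, incremental QALY) pairs from a bivariate normal."""
    rng = random.Random(seed)
    chol = cholesky_2x2(cov)
    draws = []
    for _ in range(n_draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        d_cost = mean[0] + chol[0][0] * z1
        d_qaly = mean[1] + chol[1][0] * z1 + chol[1][1] * z2
        draws.append((d_cost, d_qaly))
    return draws


def prob_cost_effective(draws, threshold):
    """Share of draws with positive net monetary benefit at the threshold."""
    wins = sum(1 for d_cost, d_qaly in draws if threshold * d_qaly - d_cost > 0)
    return wins / len(draws)


# Point estimates from the base case; SEs approximated from the reported CIs.
mean = (-27.8, 0.000)
se_cost = (16.6 - (-72.1)) / (2 * 1.96)    # assumption: normal approximation
se_qaly = (0.035 - (-0.034)) / (2 * 1.96)  # assumption: normal approximation
rho = 0.0                                  # assumption: no correlation
cov = [[se_cost ** 2, rho * se_cost * se_qaly],
       [rho * se_cost * se_qaly, se_qaly ** 2]]

draws = simulate_increments(mean, cov)
thresholds = list(range(0, 10001, 500))
ceac = [prob_cost_effective(draws, t) for t in thresholds]
print(f"P(cost-effective at $500/QALY) = {prob_cost_effective(draws, 500):.2f}")
```

Under these assumptions the simulated probability at $500/QALY should fall in the high-80% range, broadly in line with the base-case result; the exact value depends on the random seed and on the assumed correlation. Plotting `ceac` against `thresholds` gives the acceptability curve.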

Deterministic sensitivity analysis was also performed to address other types of uncertainties and specific scenarios as follows: (1) we varied the discount rate from 0 to 6%; (2) we performed an intention-to-treat analysis including all participants who received at least one dose of their assigned treatment; and (3) we considered only the subgroup of patients with high baseline viral load (> 100,000 copies/mL), as they had a greater risk of virological failure.

Additionally, we performed a cost-effectiveness price threshold (CEPT) analysis to indicate which regimen would be preferred for any price combination of DTG- and EFV400-based first-line regimens.

2.4 Extrapolation of Health Benefits and Costs over 5 and 10 Years

We explored the implications of considering two longer time horizons (i.e. 5 and 10 years) by developing a Markov cohort model to extrapolate disease progression and outcomes beyond W96. These time horizons were chosen to take into account the medium- to long-term effects of both regimens on disease progression and on outcomes. We assumed that, given the context of Cameroon (i.e. a lower-middle-income country in SSA), current monitoring and treatment would not change greatly over these timeframes. The model structure depicted in Fig. 1 was adapted from a previous model used in high-income countries and applied in LMIC [32, 33]. It had one absorbing health state (death, whether HIV-related or not) and ten transitional health states defined according to CD4 count ranges combined with virological status (success vs failure, as defined by WHO) and treatment type (first- vs second-line). At model entry, patients were classified into one of the model health states according to their clinical outcome at W96 (Online Resource, p. 9). Depending on treatment success, patients either remained in the same health state or moved to another one, including virological failure and death. Patients on a first-line regimen who had virological failure switched to a bPI-based second-line regimen and then (1) achieved viral suppression, (2) failed to achieve viral suppression, or (3) died. We assumed that patients with second-line treatment failure did not have access to third-line treatment and therefore remained in failure and eventually died.

Fig. 1

State transition diagram for the Markov cohort simulation model. The oval boxes represent the different HSs in the model. Arrows denote the transitions between HSs according to immunological progression and treatment success. ART antiretroviral treatment, HS health state

Transition probabilities between first-line health states and from first-line to second-line health states were estimated from the intensity transition matrix, computed using transitions observed in the study population during the period from 24 to 96 weeks (Online Resource, p. 10–11). We excluded the first 24 weeks of follow-up to remove the effects of the substantial immunological recovery and mortality observed in the first months after treatment initiation [9]. This enabled us to estimate transition probabilities based only on transitions observed once patients had been stabilised on ART.
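For a time-homogeneous continuous-time Markov model, the transition probability matrix over an interval of length t is the matrix exponential of t times the intensity matrix. The sketch below illustrates this relationship for a 3-month cycle using a small hypothetical three-state intensity matrix (virological success, virological failure, death), not the eleven-state matrix estimated in the study. The intensity values and the helper names `mat_mul` and `mat_exp` are assumptions for illustration, and the exponential is computed with a simple scaling-and-squaring Taylor series rather than with the msm package.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(q, t, terms=20, squarings=10):
    """Approximate exp(q * t) by a Taylor series with scaling and squaring."""
    n = len(q)
    scale = t / (2 ** squarings)
    a = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, a)]  # a^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)  # undo the scaling
    return result


# Hypothetical yearly intensities (rows sum to zero; death is absorbing).
q = [[-0.05, 0.04, 0.01],
     [0.00, -0.10, 0.10],
     [0.00, 0.00, 0.00]]

p_cycle = mat_exp(q, t=0.25)  # transition probabilities per 3-month cycle
for row in p_cycle:
    print([round(x, 4) for x in row])
```

Each row of `p_cycle` sums to one, and the row for the absorbing state keeps all its probability mass on death.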

Furthermore, given the relatively small number of second-line treatment observations in NAMSAL, especially in the DTG arm, we estimated second-line transition probabilities using data collected over 2010–2015 in the second-line ANRS 12169 2-LADY trial [34]. This randomised trial was mainly conducted in Yaoundé, Cameroon, to assess the effectiveness and cost-effectiveness of three alternative bPI-based second-line regimens, including the bPI-based regimens used in NAMSAL (i.e. lopinavir/ritonavir combined with abacavir + didanosine or tenofovir/emtricitabine and darunavir/ritonavir combined with tenofovir/emtricitabine) in patients with first-line ART failure [33, 34]. As the three regimens provided very similar health outcomes, the three arms were pooled and second-line transition probabilities were estimated from the intensity transition matrices observed in the sub-population of participants receiving an EFV-based first-line regimen (n = 181).

Estimated transition probabilities were used to predict health state after W96 and until 5 and 10 years (i.e. 13 and 35 3-month cycles, respectively). Life tables were used to update probability matrices to account for the probability of death conditional on age. The model also accounted for any excess mortality risk related to overweightness and obesity, estimated at 1.11 and 1.25, respectively [15].
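The cohort extrapolation can be illustrated with the following minimal Python sketch, which advances a state distribution through 3-month cycles and accumulates discounted QALYs and costs per cycle. It is a simplified sketch under stated assumptions: three hypothetical states instead of the eleven model states, a constant transition matrix (no life-table update and no excess mortality for overweightness or obesity), and placeholder per-cycle utilities and costs. The function name `run_cohort` and all numerical inputs are illustrative, not trial estimates.

```python
def run_cohort(start, p, qaly_per_cycle, cost_per_cycle, n_cycles,
               annual_discount=0.03, cycle_years=0.25, start_years=96 / 52.0):
    """Run a Markov cohort trace and return (final distribution, QALYs, costs).

    Discounting is applied from enrolment, so the first simulated cycle
    starts at `start_years` (here week 96).
    """
    n = len(start)
    dist = list(start)
    total_qalys = 0.0
    total_costs = 0.0
    for cycle in range(1, n_cycles + 1):
        # Move the cohort one cycle forward: new_j = sum_i dist_i * p[i][j].
        dist = [sum(dist[i] * p[i][j] for i in range(n)) for j in range(n)]
        elapsed = start_years + cycle * cycle_years
        discount = (1.0 + annual_discount) ** (-elapsed)
        total_qalys += discount * sum(d * u for d, u in zip(dist, qaly_per_cycle))
        total_costs += discount * sum(d * c for d, c in zip(dist, cost_per_cycle))
    return dist, total_qalys, total_costs


# Hypothetical inputs: success, failure, death (placeholders, not trial data).
start = [0.95, 0.05, 0.00]
p = [[0.98, 0.015, 0.005],
     [0.10, 0.87, 0.03],
     [0.00, 0.00, 1.00]]
qaly_per_cycle = [0.20, 0.18, 0.00]   # QALYs accrued per 3-month cycle
cost_per_cycle = [40.0, 120.0, 0.0]   # US$ per 3-month cycle

dist, qalys, costs = run_cohort(start, p, qaly_per_cycle, cost_per_cycle,
                                n_cycles=13)
print([round(x, 3) for x in dist], round(qalys, 3), round(costs, 1))
```

Running the function for both arms with identical inputs gives zero incremental QALYs and costs, which mirrors the kind of internal validity check applied to the model.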

For each arm and each health state, the mean cost and the mean number of QALYs per 3-month cycle were estimated using the trial data from 24 to 96 weeks. The total cost and number of QALYs per arm over 5 and 10 years were computed by adding the cumulated costs and number of QALYs obtained over the total 13 and 35 simulated cycles to the cost and number of QALYs accrued over the first 96 weeks of follow-up. We estimated the 95% CI for incremental costs, incremental QALYs, NHB, as well as CEAC using a non-parametric bootstrapping technique.

We tested the internal validity of our model by simulating a cohort receiving the EFV400-based (DTG-based) regimen (1) with the same transition probabilities as its counterpart (life-years saved were equal to 0), (2) with the same transition probabilities and QALYs per cycle and health state as its counterpart (incremental QALYs were equal to 0), and (3) with the same transition probabilities and costs per cycle and health state as its counterpart (incremental costs were equal to 0).

Analyses were performed using Stata version 14.2 and R version 3.5.0 (msm package).

3 Results

3.1 Patient-Reported Outcomes

Most PROs significantly improved over time in both arms, mainly during the first 48 weeks, with no significant changes observed beyond W48 or significant differences between arms at W96 (complete results are available in the Online Resource, p. 13–15).

Between baseline and W96, PCS and MCS scores increased significantly by 11.9% (p < 0.0001) and 8.0% (p = 0.0001), respectively, in the DTG arm, and by 9.0% (p < 0.0001) and 7.5% (p < 0.0001), respectively, in the EFV400 arm (Fig. 2). This was reflected in the SF-6D utility scores, which increased by 11.6% (p < 0.0001) in the DTG arm and by 9.2% (p < 0.0001) in the EFV400 arm during the trial’s first 96 weeks.

Fig. 2

Baseline and evolution of the mean Physical and Mental Component Summary (NAMSAL ANRS 12313 trial, n = 575). Baseline and evolution between baseline and W96 of the mean PCS (a) and MCS (b) in the DTG and EFV400 arms. Both scores range from 0 to 100, with higher values corresponding to better health-related quality of life. No significant differences were found between arms at any time point for both the PCS and MCS. The PCS increased by 11.9% (p < 0.0001) and 9.0% (p < 0.0001) in the DTG and EFV400 arm, respectively. The MCS increased by 8.0% (p = 0.0001) and 7.5% (p < 0.0001) in the DTG and EFV400 arms, respectively. DTG dolutegravir, EFV400 efavirenz 400 mg, HIV human immunodeficiency virus, MCS mental component summary, NAMSAL New Antiretroviral and Monitoring Strategies in HIV-infected Adults in Low-Income Countries, PCS physical component summary, W week

Over the same period, the proportions of patients with at least mild depression decreased from 22.3 to 6.2% (p < 0.0001) in the DTG arm versus 23.1 to 6.5% (p < 0.0001) in the EFV400 arm (Fig. 3). For those with at least mild anxiety, proportions decreased from 23.1 to 7.4% (p < 0.0001) versus 22.1 to 10.4% (p = 0.0002). However, the proportion of patients with at least mild stress did not decrease significantly between baseline and W96 in either arm (from 7.0 to 4.5%, p = 0.24, in the DTG arm vs 7.7 to 3.8%, p = 0.054, in the EFV400 arm).

Fig. 3

Baseline and evolution of the proportions of patients with at least mild depression, anxiety, and stress (NAMSAL ANRS 12313 trial, n = 575). Baseline and evolution between baseline and W96 of the proportions of patients with at least mild depression (a), mild anxiety (b), and mild stress (c), assessed using the DASS-21 scale, in the DTG and EFV400 arms. No significant differences were found between arms at any time point for all three outcomes. The decrease of the proportion between baseline and W96 of patients with at least mild depression and mild anxiety was significant in both arms. However, the decrease in the proportion of patients with at least mild stress between baseline and W96 was not significant in the DTG arm (p = 0.24) and in the EFV400 arm (p = 0.054). DASS-21 Depression, Anxiety and Stress Scale—21 items, DTG dolutegravir, EFV400 efavirenz 400 mg, HIV human immunodeficiency virus, NAMSAL New Antiretroviral and Monitoring Strategies in HIV-infected Adults in Low-Income Countries, W week

The mean (SD) number of symptoms of at least mild severity per patient decreased from 7.1 ± 5.4 at baseline to 4.7 ± 4.8 at W96 (p < 0.0001) in the DTG arm versus 7.1 ± 5.2 to 4.7 ± 4.5 (p < 0.0001) in the EFV400 arm. Additionally, we did not observe a higher number of neurological symptoms in the EFV400 arm or a higher proportion of patients reporting insomnia in the DTG arm (Online Resource, p. 4 and p. 13–15).

Results from the gender- and obesity-status stratified analyses are presented in the Online Resource (p. 16–25). In the former, results were qualitatively similar, except for a non-significant increase over 96 weeks in the MCS score in men in the DTG arm, which might be explained by the significantly higher baseline MCS score in this subgroup than in the EFV400 arm. Results from the obesity-stratified analysis were also, in general, qualitatively similar to those for the whole sample. One notable exception observed at W96 was the slightly higher number of very severe perceived HIV symptoms in patients in the DTG arm with obesity than in those without obesity (0.2 ± 0.9 vs 0.0 ± 0.2, p = 0.010). Furthermore, the prevalence of obesity was not significantly different between arms at baseline, but was substantially higher in the DTG arm at W96 (see the Online Resource, p. 13–15, for a full description).

3.2 Base-Case Cost-Effectiveness

Unadjusted mean costs (US$, year 2016 values) and QALYs per patient accrued over 96 weeks are described in the Online Resource (p. 26) and adjusted-incremental mean (95% CI) costs and QALYs per patient in Table 1. In the base-case analysis over 96 weeks, QALYs per patient were similar for both the DTG-based and EFV400-based regimens (adjusted-incremental mean [95% CI] QALYs 0.000 [− 0.034 to 0.035]). Costs for the DTG-based regimen were slightly lower than, but not significantly different from, those for the EFV400-based regimen (adjusted-incremental mean [95% CI] costs − $27.8 [− 72.1 to 16.6]). The NHB (95% CI) was equal to 0.056 (− 0.037 to 0.153), indicating that the DTG-based regimen was the more cost-effective regimen at the $500 threshold, although not at the 95% confidence level, as the CI included zero. Based on the 5000 simulated draws, the probability of the DTG-based regimen being cost-effective at the $500 threshold was 88%. Furthermore, the cost-effectiveness plane (Fig. 4a) shows that 46.0% of the 5000 incremental cost-effect pairs obtained in the PSA were located in the southeast quadrant (DTG less costly and more effective, quadrant II [QII]) and were therefore cost-saving. A large proportion (43.2%) of incremental cost-effect pairs were also located in the southwest quadrant (DTG less costly and less effective, QIII), where the cost-effectiveness of DTG increased for lower cost-effectiveness thresholds. Accordingly, the probability of DTG being cost-effective increased and tended toward 90% for cost-effectiveness thresholds lower than $500, as illustrated by the CEAC (Fig. 4b).

Table 1 Multivariate-adjusted QALYs, costs (US$, year 2016 values) and cost-effectiveness (NAMSAL ANRS 12313 trial, n = 575)

Fig. 4

Cost-effectiveness plane and acceptability curve of DTG- vs low-dose EFV-based regimen (NAMSAL ANRS 12313 trial). a Cost-effectiveness plane depicting the 5000 simulated pairs of incremental costs and QALYs of DTG vs EFV400, with the hollow diamond representing the base-case estimate (ΔCosts = − US$27.8; ΔQALYs = 0.000). b Cost-effectiveness acceptability curve showing the probability of DTG being cost-effective compared with EFV400 at various thresholds ranging from 0 to US$10,000/QALY. In both figures, the short-dashed line, the long-dashed line, and the dashed-dotted line indicate the cost-effectiveness thresholds of US$500/QALY and one (US$1392/QALY) and three (US$4175/QALY) times the 2016 Cameroonian GDP per capita, respectively. DTG dolutegravir, EFV efavirenz, EFV400 efavirenz 400 mg, GDP gross domestic product, HIV human immunodeficiency virus, NAMSAL New Antiretroviral and Monitoring Strategies in HIV-infected Adults in Low-Income Countries, Q quadrant, QALY quality-adjusted life-year, Δ difference

Increasing (or decreasing) the discount rate to 6% (0%) had little impact on the cost-effectiveness results, yielding an NHB of 0.053 (0.059), which corresponded to an 87% (89%) probability of DTG being cost-effective. The results were essentially unchanged in the intention-to-treat analysis, with an NHB of 0.055 and an 89% probability of DTG being cost-effective. When considering the subgroup of patients with a viral load > 100,000 copies/mL at baseline, the NHB improved to 0.092, mainly due to larger cost differences, resulting in a 92% probability of DTG being cost-effective.

The CEPT analysis indicates which strategy would be preferred at the $500 threshold for any price combination of DTG- and EFV400-based regimens (Fig. 5). The solid square represents the current (3rd quarter 2020) price combination of generic FDC, corresponding to an 88% probability of DTG being cost-effective. Results showed that the DTG FDC would be preferred with a probability of 95% if (1) its price decreased by 10% (from $5.2 to $4.7/month) or (2) the price of the EFV400 FDC increased by 11% (from $5.4 to $6.0/month), all other things being equal. Conversely, the EFV400 FDC would be preferred with a probability of 95% if (1) its price decreased by 57% (from $5.4 to $2.3/month) or (2) the price of the DTG FDC increased by 63% (from $5.2 to $8.5/month) all other things being equal. Results for the CEPT analysis at the threshold of one times the Cameroonian per capita GDP (i.e. $1392/QALY) are presented in the Online Resource (p. 27).

Fig. 5

Cost-effectiveness price thresholds for DTG- and low-dose EFV-based regimens (NAMSAL ANRS 12313 trial). CE price thresholds for DTG- and EFV400-based regimens indicating which regimen would be preferred at the threshold of US$500 per QALY gained, and for any price combination of DTG and EFV400 FDC. The price combinations on the hollow circle line would make DTG CE with a probability of 95%. Any price combination on the solid triangle line would make EFV400 CE with a probability of 95%. The solid circle line depicts the price combinations for which both strategies have the same probability of being CE: Prob(DTG:CE) = Prob(EFV400:CE) = 50%. CE cost-effective, DTG dolutegravir, EFV efavirenz, EFV400 efavirenz 400 mg, FDC fixed-dose combinations, HIV human immunodeficiency virus, NAMSAL New Antiretroviral and Monitoring Strategies in HIV-infected Adults in Low-Income Countries, Prob probability, QALY quality-adjusted life-year

3.3 Extrapolation of Health Benefits and Costs Beyond Week 96

Estimated transition probabilities and mean cost and number of QALYs per 3-month cycle used to extrapolate disease progression and outcomes beyond W96 are presented in the Online Resource (p. 10–12).

Health benefits were similar over 5 years (incremental QALYs [95% CI]: − 0.005 [− 0.061 to 0.052]), but increased progressively more in the DTG arm over 10 years (0.161 [− 0.097 to 0.404]), while cost differences were larger with significantly lower costs in the DTG arm over 5 and 10 years (Table 1).

Accordingly, the NHB (95% CI) increased to 0.194 (0.123–0.261) over 5 years and to 0.484 (0.341–0.622) over 10 years, both corresponding to a 100% probability of DTG being cost-effective at the $500 threshold (see the Online Resource, p. 28–29, for the CEAC at 5 and 10 years).

4 Discussion

Using individual data from the NAMSAL trial, this study provides information about PROs, costs, and the cost-effectiveness of the two first-line ART regimens recommended in the 2019 WHO guidelines: DTG 50 mg-based (preferred) and EFV 400 mg-based (alternative) regimens.

In the base-case analysis conducted over 96 weeks alongside the trial, our findings confirmed that a DTG-based regimen is the preferred first-line treatment, with an NHB (95% CI) of 0.056 (− 0.037 to 0.153) at the threshold of $500 per QALY, corresponding to a cost-effectiveness probability of 88%. In scenario analyses, the probability of the DTG-based regimen being cost-effective varied from 87 to 92%. When extrapolating outcomes beyond the observation period to take into account the respective medium- to long-term benefits and risks of each regimen, the DTG-based regimen had an NHB of 0.194 (0.123–0.261) and 0.484 (0.341–0.622) over 5 and 10 years, respectively, and a ≥ 95% probability of being cost-effective for a large range of cost-effectiveness thresholds (from $0 to $5147 over 10 years).

In the short term (i.e. over the first 96 weeks), the health benefits assessed in terms of QALYs were very similar in both arms, which is consistent with the results found for PROs and clinical outcomes. Overall, we observed an improvement of PROs between baseline and W96 in both arms, mainly during the first 48 weeks. The results remained qualitatively similar when stratifying the analysis by gender and obesity status. Furthermore, comparisons of clinical outcomes at W96 showed small differences in terms of efficacy and tolerance, except regarding body weight, which increased more in the DTG arm [10]. However, both the proportions of virological failure—as defined by the WHO—and acquired drug resistance were significantly lower in the DTG arm (see the Online Resource, p. 30). These better outcomes explained the greater long-term health benefits observed over 10 years in the DTG arm than in the EFV400 arm, even when taking into account the excess mortality risk associated with overweightness and obesity.

With regard to costs, the DTG-based regimen tended to be slightly less costly than its EFV400-based counterpart in the base-case analysis, mainly because of the lower rate of virological failure, translating into less frequent switching to more costly second-line treatments in the DTG arm. This explained the larger cost differences in favour of the DTG regimen observed over the medium- to long-term despite the very similar monthly prices of generic FDC ARV (specifically $5.2 and $5.4, year 2016 values) for the DTG- and EFV400-based regimens, respectively. However, given that ARV price evolution has important implications with regard to cost-effectiveness results, we performed a CEPT analysis to identify which of the two regimens would have the best economic value for a large range of ARV price combinations.

Our results are in line with those of Phillips and colleagues [13,14,15] who showed that a DTG-based regimen would be more cost-effective than an EFV-based regimen in the majority of cases. Our study brings added value as it (1) compared the economic value of a DTG-based regimen with a low-dose EFV-based regimen, both recommended first-line ART regimens in the 2019 WHO guidelines; (2) relied on individual data for clinical outcomes and medical resource use collected in a randomised trial in a setting close to real-world healthcare delivery in SSA; and (3) investigated patient value by assessing PROs and QALYs, which are the preferred health outcome measures in economic evaluations [35].

The study does, however, have several limitations. First, although the EuroQoL-Five Dimensions (EQ-5D) scale is the most widely used scale to derive QALYs in economic evaluations [20], we chose the SF-12 instrument as, unlike the EQ-5D, it has already been validated in Cameroon in people living with HIV [21]. However, the utility scores associated with the health states derived from the SF-12 scale were estimated using the SG technique rather than the time trade-off (TTO) technique, which may have led to upwardly biased estimates, as suggested by studies in high-income countries [36]. Although evidence for LMIC is limited, a Ugandan study comparing SG and TTO utility assessments of HIV/AIDS-related health states did not find any upward bias in SG utilities [37]. While we acknowledge the potential for upward bias in SG utilities, any such bias is not expected to differ between arms and therefore should not influence the cost-effectiveness results. Furthermore, given the unavailability of utility scores for a sample of the Cameroonian population, we also acknowledge the potential for bias induced by using population norms from another context, specifically the UK [22].

Second, we extrapolated long-term clinical outcomes and costs using a Markov cohort model whose parameters were mainly estimated from the data obtained in the NAMSAL trial from 24 to 96 weeks. One limitation of this approach is its inability to simulate patient trajectories using specific individual risk factors that may affect disease progression over time. However, the first-line transition probabilities obtained after exclusion of the first 24 weeks of follow-up may be representative of the experience of patients over the medium to long term, as immunological progression and virological response are relatively stable once virological success has been achieved, provided that patients maintain good adherence to treatment [38, 39]. Assuming good adherence is optimistic, but it is also conservative with respect to our comparison, as it is likely to overestimate the effectiveness of EFV400 more than that of DTG, which is more robust to the development of resistance.
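For readers less familiar with this type of extrapolation, the general mechanics of a Markov cohort model can be sketched in a few lines of Python. The sketch below is not the model used in this study: the three health states, the cycle structure, and all transition probabilities are hypothetical and chosen only to show how a cohort distribution is propagated from one cycle to the next using fixed, arm-level transition probabilities (which is also why individual risk factors cannot be represented).

```python
def run_cohort(initial, transition, n_cycles):
    """Propagate a cohort distribution through n_cycles of a Markov model.

    initial:    {state: proportion of the cohort at cycle 0}
    transition: {source_state: {destination_state: probability}}
    Returns a list of distributions, one per cycle (cycle 0 included).
    """
    states = list(initial)
    for src in states:
        total = sum(transition[src].values())
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"transition probabilities from {src!r} sum to {total}")

    trace = [dict(initial)]
    for _ in range(n_cycles):
        current = trace[-1]
        nxt = {state: 0.0 for state in states}
        for src in states:
            for dst, prob in transition[src].items():
                # Share of the cohort moving from src to dst in this cycle
                nxt[dst] += current[src] * prob
        trace.append(nxt)
    return trace


# Hypothetical states and per-cycle probabilities (illustration only)
hypothetical_transition = {
    "first_line": {"first_line": 0.97, "second_line": 0.02, "dead": 0.01},
    "second_line": {"second_line": 0.97, "dead": 0.03},
    "dead": {"dead": 1.0},
}
hypothetical_initial = {"first_line": 1.0, "second_line": 0.0, "dead": 0.0}

trace = run_cohort(hypothetical_initial, hypothetical_transition, n_cycles=10)
```

Costs and QALYs would then be obtained by weighting each cycle's state distribution by state-specific costs and utilities, with discounting where appropriate.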

Third, the use of external data from the 2-LADY trial [33] to estimate second-line transition probabilities may constitute a limitation, as the characteristics of patients switching to second-line treatment may have differed between the two trials (Online Resource, p. 9). However, the 2-LADY trial was conducted in two of the three hospitals participating in NAMSAL, and the second-line treatments used in both trials were similar. The transition probabilities obtained using 2-LADY data therefore likely provided conservative, yet relatively accurate, estimates of second-line outcomes (the risk of treatment failure being expected to be lower in the DTG arm because of lower acquired resistance).

Fourth, our model did not account for the indirect benefit of the DTG-based regimen in terms of reduced mother-to-child HIV transmission. Overall, our modelling approach may therefore have underestimated the DTG regimen’s health benefits and, consequently, its cost-effectiveness.

Finally, our study was conducted using data from a single trial conducted in Cameroon, which may limit the generalisability of the results. Nevertheless, NAMSAL was conducted in a setting close to real-world healthcare delivery in LMIC with limited eligibility restrictions, allowing the inclusion of patients with a high viral load or a low CD4 cell count at baseline. It is therefore likely that the trial population reflected patients initiating first-line treatment in real-world SSA contexts.

5 Conclusions

This study brings new evidence about PROs in treatment-naive HIV patients receiving either DTG- or low-dose EFV-based regimens. Despite a greater increase in the incidence of obesity in the DTG arm than in the EFV400 arm over the trial’s first 96 weeks, PROs were overall similar in both study arms. Our results show that at current ARV prices (3rd quarter 2020), using a DTG-based regimen as the preferred first-line regimen may constitute the most efficient use of available resources in SSA. This finding supports the latest recommendations of the 2019 WHO guidelines.