Background

Artemisinin-based combination therapy (ACT) is currently the treatment of choice for uncomplicated Plasmodium falciparum malaria in Africa and elsewhere in the world. Since 2006, four ACT medicines have been recommended by the World Health Organization (WHO): artemether–lumefantrine (AL), artesunate–amodiaquine (ASAQ), artesunate–mefloquine (ASMQ) and artesunate–sulfadoxine–pyrimethamine (ASSP) [1, 2]. More recently, dihydroartemisinin–piperaquine (DHAP) has been added to the list of WHO-recommended ACT medicines [3]. To date, more than 80 countries worldwide have adopted ACT as the first-line therapy. Each of these five ACT medicines has its own advantages and disadvantages with respect to safety, tolerability, dosing, post-treatment prophylactic effect, resistance to the partner drug, and price. In addition to these artemisinin-based combinations, a non-ACT combination, amodiaquine–sulfadoxine–pyrimethamine (AQSP), had been employed for treatment in African countries, especially during the transition period from chloroquine or sulfadoxine–pyrimethamine monotherapy to ACT in the early 2000s.

Although artemisinin-resistant P. falciparum has emerged and is spreading in Southeast Asia [4, 5], as of 2017, Africa still seems to be spared. In the clinical protocol standardized by the WHO, the following classification with four categories is used to assess the parasitological and clinical outcome [6, 7]: adequate clinical and parasitological response (ACPR), late parasitological failure, late clinical failure, and early treatment failure.

Based on this classification, several individual studies and traditional meta-analyses have been carried out to assess the efficacy of different ACT medicines and alternative drugs in Africa [8,9,10,11,12,13]. A large majority of these studies have been conducted in children less than 5 years old, who bear the brunt of symptomatic malaria in areas of intense transmission. Traditional meta-analysis assumes that all randomized trials compare the same two treatment arms and is based on head-to-head comparison. However, more complex situations arise when pooling studies that have more than two treatment arms or that do not share identical treatment arms. In such situations, network meta-analysis (NMA) offers a promising methodological alternative.

NMA, also known as mixed treatment comparison (MTC), is a statistical method to combine data from randomized comparisons A versus B, A versus C, B versus D, etc., to generate an internally consistent set of estimates while respecting the randomization in the evidence [14]. An increasing number of systematic reviews use NMA to compare three or more treatments to each other even if they have never been compared directly in a clinical trial [15,16,17,18]. Contrary to the traditional meta-analysis that estimates a common effect of the same interventions A and B among studies, MTC provides estimates of the effect of each intervention relative to every other. NMA was designed for randomized controlled trials only; it can increase the precision of point estimates and allows inferences on the comparability of interventions that have never been compared in a clinical trial. The probability that each treatment is the most effective can also be calculated [19, 20] using fixed or random effect models. Recently, a new technique was developed to rank the competing interventions according to their efficacy and safety [19,20,21]. The NMA method can be applied to both binary and continuous data (aggregate data and individual patient data) and could be extended to longitudinal data with various types of outcomes. Although individual patient meta-analysis is the current gold standard for evidence synthesis, analysis using individual data should consider the hierarchy of different interventions.

To date, only one study has performed MTC of the WHO-recommended ACT using the WHO categorical outcome [22]. However, this work was limited to data from only one country (Cameroon) and did not assess treatment ranking. Although Donegan et al. [23] have reviewed methods to assess different assumptions in NMA and applied the method to one existing trial of four ACT medicines, there is a need to extend the method to other anti-malarials. On the other hand, studies published by the Worldwide antimalarial resistance network (WWARN) group were based on both single-arm trials and trials with at least two treatment arms [12]. The authors compared drugs in terms of efficacy and prophylactic effect on individual patient data. However, the study did not present the network component of trials and also did not address the following questions: which drug combination is the most effective based on existing clinical data and what is the order of their relative efficacies in Africa? At present, little is known about the comparative efficacy of anti-malarials as assessed by NMA methods.

In the present work, two models with different assumptions proposed in the articles of NMA with binary outcome were assessed. NMA methods were applied on aggregate data from randomized trials to provide a full comparative efficacy of treatments.

Methods

Search strategy and outcome

Published literature was searched in PubMed. The search included all potentially relevant published articles, characterized by random allocation to treatment and comparison of different artemisinin-based combinations in Africa, starting from the year of the introduction of ACT (2002–2003), up to June 2016. An electronic search of Medline was conducted using the following key words: (“malaria” AND ACT [All Fields]) AND (Randomized Controlled Trial [ptyp]) AND (“2002/01/01” [PDAT]: “2016/06/30” [PDAT]) AND “humans”. In addition, reference lists of reviews were also screened to include potential articles. Studies were eligible if they involved at least 2 ACT medicines and reported clinical efficacy as ACPR corrected by polymerase chain reaction (PCR). Cure, i.e. the term “adequate clinical and parasitological response”, was defined as undetectable parasitaemia with or without fever, without previously meeting the criteria of treatment failure on the last day of follow-up (usually day 28).

Selected articles and data

A total of 91 articles were identified and screened. Details are presented in Fig. 1. Several articles were excluded for the following reasons: studies that were either on tolerability or recurrence of parasitaemia or did not report data on ACPR [24, 25], or based on the mechanism of resistance to ACT [26], or comprised only one ACT-arm as in Ref. [27], which compared AL to azithromycin–chloroquine (AZCQ). Data comprising monotherapies and ACT were excluded. In addition, reference lists in 16 reviews were screened for recent studies [12, 13, 22, 28, 29]. At the end of the screening process, 76 articles were included in the original analysis; K = 13 treatments were involved. The quinine arm was removed from a 3-arm study [30], and two studies published in French were included [31, 32]. The list of the interventions was artesunate–amodiaquine (ASAQ), artesunate–sulfadoxine–pyrimethamine (ASSP), artesunate–sulfamethoxypyrazine–pyrimethamine (ASSMP), artemether–lumefantrine (AL), dihydroartemisinin–piperaquine (DHAP), dihydroartemisinin–piperaquine–trimethoprim (DHAPT), artesunate–chlorproguanil–dapsone (ASCD), ASMQ, artesunate–atovaquone–proguanil (ASATPG), artemisinin–naphthoquine (ASNAPH), artesunate–amodiaquine–chlorpheniramine (ASAQCPH), artesunate–pyronaridine (ASPY) and the non-ACT combination amodiaquine–sulfadoxine–pyrimethamine (AQSP). The combination AQSP was included in the present analysis because this combination was employed during the transition period before the adoption of ACT in many African countries and its efficacy had been compared to that of ACT in randomized studies.

Fig. 1 Flow chart diagram for the selected articles

Patient populations included children and adults: 82% were children less than 15 years old among whom more than 80% were children under 5 years. Some published articles involved both children and adults or adults only. Trial characteristics are described in the Additional file 1. Most of the studies presented the outcome on day 28 only, while few followed the patient until day 42 or even day 63. Hence, only a few studies had longitudinal data on days 28, 42 and 63. Accordingly, the obtained information was pooled on the primary end-point day 28 using the observed number of ACPR. The percentage of ACPR was estimated using the number of enrolled population or the number followed as far as the intention-to-treat (ITT) analysis or a per protocol (PP) approach is concerned, i.e. when the ITT results were not reported, the percentage of ACPR obtained in a PP approach was used to estimate the number of patients with positive outcome.

Configuration of the anti-malarial network and drug numbering

A network of treatments was built after extracting information from the articles. Figure 2 shows the set of nodes represented by the name of ACT or AQSP, linked by lines. The thickness of the line is proportional to the number of randomized clinical trials that have been included. The lines joining AL-DHAP and AL-ASAQ are the thickest. The network has a complex structure and allows indirect comparison, e.g. ASMQ and ASSMP with ASAQ as the reference. Closed loops like AQSP, ASAQ and ASMQ provide both direct and indirect evidence. The network also shows that AL was the most studied ACT and could therefore be used as the comparator treatment. ASAQ, however, was the most commonly used ACT in clinical practice in Africa due to its cost, availability and tolerability, and several randomized clinical trials have shown that ASAQ is comparable to AL. AL was used as the common comparator for the entire network because it has been evaluated against the highest number of treatments. Therefore, the network generated K – 1 = 12 contrasts representing the overall relative treatment difference to be estimated from the K (K − 1)/2 = 78 direct and indirect overall comparisons.

Fig. 2 Malaria evidence network. Network of malaria treatment constructed from 76 studies with 13 therapies. The thickness of the line is proportional to the number of randomised clinical trials that have been included. The lines AL–DHAP and AL–ASAQ are the thickest. AL is the most tested ACT. No study directly compared AL and ASATPG but the estimate was obtained from indirect comparison using AL–ASAQ and ASAQ–ASATPG trials. This Figure was generated using the netmeta package (version 0.8-0) available in the statistical software R downloadable at http://cran.r-project.org

The numbers of multi-arms studies were as follows: 2-arm (64 studies), 3-arm (9 studies), and 4-arm (3 studies). The numbers of studies per-treatment arm were as follows: 12 AQSP, 63 AL, 42 ASAQ, 1 ASAQCPH, 1 ASATPG, 4 ASCD, 6 ASMQ, 2 ASNAPH, 2 ASPY, 4 ASSMP, 11 ASSP, 22 DHAP, and 1 DHAPT. A detailed summary regarding the number of studies per-treatment arm, randomized sample sizes, number evaluated for the outcome, and proportion of ACPR is given in Table 1 for each treatment. For example, AQSP was found in 12 articles but only 7 compared its efficacy to that of AL. All cure rates ranged from 64.6 to 98%.
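The arm counts above determine how many within-trial pairwise comparisons the network contains, since a trial with a arms contributes a(a − 1)/2 comparisons. The short Python sketch below is only an arithmetic check on the figures reported in the text; the variable names are illustrative.

```python
from math import comb

# Number of trials by number of arms, as reported in the text.
trials_by_arms = {2: 64, 3: 9, 4: 3}

n_trials = sum(trials_by_arms.values())
# A trial with `arms` arms contributes arms*(arms-1)/2 pairwise comparisons.
n_edges = sum(count * comb(arms, 2) for arms, count in trials_by_arms.items())

K = 13                    # number of treatments in the network
n_basic = K - 1           # contrasts against the common comparator
n_contrasts = comb(K, 2)  # all possible pairwise contrasts

print(n_trials, n_edges, n_basic, n_contrasts)  # prints: 76 109 12 78
```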

Table 1 Description of the data set in terms of studies per treatment and the sample sizes, on day 28

Based on the information obtained from the data set, the drugs were numbered as follows: AL is treatment number 1, representing the common comparator drug, and the remaining drugs follow in alphabetical order, i.e. 2 = AQSP, 3 = ASAQ, 4 = ASAQCPH, 5 = ASATPG, 6 = ASCD, 7 = ASMQ, 8 = ASNAPH, 9 = ASPY, 10 = ASSMP, 11 = ASSP, 12 = DHAP, and 13 = DHAPT. For each study, data were summarized in terms of the number of treatment arms, list of treatments, number of PCR-corrected day 28 ACPR in each treatment arm, and sample size in each arm.

Statistical methods

In order to simultaneously compare all treatments in a coherent manner, NMA was used to obtain a hierarchy of the competing interventions [15]. This method required assumptions, notations and presentation of the modelling approaches.

Assumptions

The application of NMA to the selected data assumed that the randomization process was preserved within each trial comparing the estimates of relative effect among treatments. Different assumptions were considered in NMA: (i) the hypothesis of homogeneity, i.e. no variation in treatment effect between trials within a pairwise contrast; (ii) the consistency assumption, establishing that within a closed loop, the estimate for a particular pairwise contrast from head-to-head evidence is the same as the estimate from the indirect evidence; and (iii) the inconsistency assumption, which means that estimates from direct evidence differ from those from indirect evidence.

In what follows, the modelling approaches presented in a technical support document by Dias et al. [33] were used. They are based on a random effect model with homogeneous variance, which allows assessing the validity of the consistency assumption and adjustment for multiple-arm trials. Notations described by Greco and Hong [34, 35] were followed to perform NMA for binary outcome using a Bayesian approach on aggregate data. The principal summary measure was the odds ratio (OR).

Notations

Let N = 76 and K = 13 denote the number of randomized clinical trials and the total number of treatments, respectively. Each study i (i = 1, …, N) included \(na_{i}\) treatment arms, and \(r_{i,k}\) denotes the number of PCR-corrected ACPR in arm k (k = 1, …, \(na_{i}\) ≤ K) of study i. Then \(r_{i,k}\) follows a binomial distribution, \(r_{i,k} \sim Binom(p_{i,k}, n_{i,k})\), where \(n_{i,k}\) is the sample size of arm k in study i, and \(p_{i,k}\) is the probability of an event (success, here ACPR) in arm k of study i. For these K treatments and N trials, the number of existing comparisons (edges) between the treatments is equal to 109. This number was obtained by cumulating the number of pairwise comparisons among trials, i.e. the \(na_{i}(na_{i} - 1)/2\) comparisons. For instance, a 2-arm trial contributes one pairwise comparison, a 3-arm trial three comparisons, and a 4-arm trial six comparisons. Given the K = 13 treatments, K (K − 1)/2, i.e. 78 treatment effects, are expected. The K – 1 = 12 baseline parameters (also called basic parameters) have to be estimated. Let \(d_{12} , d_{13} , d_{14} , d_{35} , d_{16} , d_{17} , d_{18} , d_{19} , d_{1,10} , d_{1,11} , d_{1,12} , d_{1,13}\) denote the baseline parameters, representing the overall treatment effects, with \(d_{35}\) referring to the direct effect between ASAQ and ASATPG. The probability of event occurrence \(p_{ik}\) is modelled on the logit scale as:

$$\theta _{{ik}} = logit\left( {p_{{ik}} } \right) = \left\{ {\begin{array}{*{20}l} {\mu _{i} ;i = 1,2, \ldots ,N;k = b = 1,2, \ldots ,K} \\ {\mu _{i} + \delta _{{i,bik}} ;i = 1,2, \ldots ,N;k > b;k = 2,3, \ldots ,K} \\ \end{array} } \right.$$
(1)

where \(\mu_{i}\) is the “study” random effect that accounts for differences among trials. This is also the trial-specific baseline and represents the log-odds of the event in the overall control treatment, while \(\delta_{i,bik}\) is the trial-specific log-odds ratio (LnOR) of event occurrence for treatment k compared with the “placebo” (baseline) treatment \(b_{i}\). It is treated here as a random variable that follows a normal distribution \(N(d_{bk}, \sigma_{i,bk}^{2})\), assuming that \(b_{i} = b\) for all studies. Therefore, treatment indexed k = b = 1 corresponds to AL, and \(d_{1,1} = d_{k,k} = 0\). In the context of NMA, it is assumed that the study-specific treatment effects are exchangeable such that \(\sigma_{i,bk}^{2} = \tau^{2}\), ∀ b, k; i.e. \(\tau^{2}\), the between-study variance, is assumed to be the same for all pairwise contrasts in all subsequent methods. \(\delta_{i,bik} \sim N\left( {d_{bk} ,\tau^{2} } \right)\) is seen as the common form of a random effect meta-analysis, especially in the frequentist methods. Here, the parameter \(\tau^{2}\) explains the variability among trials that could be caused by different drug formulations, the methods used, study designs, and the level of transmission that varies from one trial to another. In addition, the node \(\sigma_{i,bk}^{2}\) expresses the same variability within a study among a pairwise contrast.
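To make the logit-scale model concrete, the following Python sketch evaluates the arm-level probabilities and expected numbers of ACPR for a single hypothetical 3-arm trial. All numerical values (the baseline `mu_i`, the trial-specific effects in `delta_i`, and the arm sizes in `n_i`) are invented for illustration and are not taken from the data set.

```python
import math

def expit(x: float) -> float:
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical values for one 3-arm trial (assumptions, not study data).
mu_i = 2.0                             # log-odds of ACPR in the baseline arm (AL)
delta_i = {1: 0.0, 3: -0.1, 12: 0.6}   # trial-specific log odds ratios versus arm b = 1
n_i = {1: 150, 3: 148, 12: 152}        # arm sample sizes

for k, delta in delta_i.items():
    p_ik = expit(mu_i + delta)         # logit(p_ik) = mu_i + delta_i,bk
    expected_r = n_i[k] * p_ik         # mean of the binomial count r_ik
    print(f"treatment {k}: p = {p_ik:.3f}, expected ACPR = {expected_r:.1f}")
```

The baseline arm uses a zero treatment effect, so its probability depends on the study-level term only.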

To account for heterogeneity in the patient populations, a dummy variable called \(S_{i}\) was defined as 1 if the population was children less than 15 years and 0 in other patient populations. A random coefficient \(\beta_{i}\) for \(S_{i}\) was assumed. Hence, \(\beta_{i}\) followed a normal distribution with mean \(\beta_{0}\) and variance \(\varepsilon^{2}\).

Modelling approach 1: consistency

The model is given by:

$$logit\left( {p_{i,k} } \right) = \mu_{i} + \delta_{i,bk} + \beta_{i} S_{i}$$
(2)

where the node \(\delta_{i,bk}\) is a random effect with mean \(d_{bk}\), representing the treatment effect of k relative to b for all studies (k ≠ b), and a variance \(\sigma_{i,bk}^{2}\). This model provided the estimates of the K − 1 = 12 “basic parameters” representing all treatments relative to treatment b, by considering b = 1, 3. The parameter \(\sigma_{i,bk}^{2}\) can be expressed in terms of the study variation \(\tau\) and the number of “basic parameters”, i.e. \(\sigma _{{i,bk}}^{2} = \tau ^{2} \times {{2\left( {K - 1} \right)} /K}\). For purpose of illustration, to derive the treatment difference between 2 = AQSP and 4 = ASAQCPH given AL as the control group, the consistency equation is given by \(d_{24} = d_{14} - d_{12}\).
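The consistency equation can be sketched in a few lines of Python. For simplicity, this sketch assumes that all twelve basic parameters are expressed relative to AL (treatment 1); the values in `d_basic` are placeholders, not estimates from the analysis.

```python
from itertools import combinations

K = 13
# Hypothetical basic parameters d_1k (log odds ratios versus AL = treatment 1).
# Placeholder values for illustration only; d_basic[1] is 0 by definition.
d_basic = {k: 0.05 * (k - 1) for k in range(1, K + 1)}

def contrast(j: int, k: int) -> float:
    """Consistency equation: d_jk = d_1k - d_1j."""
    return d_basic[k] - d_basic[j]

# Every pairwise contrast follows from the 12 basic parameters.
all_contrasts = {(j, k): contrast(j, k)
                 for j, k in combinations(range(1, K + 1), 2)}

print(len(all_contrasts))      # 78 pairwise contrasts
print(all_contrasts[(2, 4)])   # d_24 = d_14 - d_12 (AQSP versus ASAQCPH)
```

Under consistency, only the basic parameters are estimated; the remaining 66 contrasts are functions of them.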

Modelling approach 2: inconsistency

While heterogeneity is characterized as between-trial variation within a treatment contrast, inconsistency is the variation between contrasts. Contrary to the consistency model, there is no common comparator across studies. Several methods have been studied to test the inconsistency assumption and are considered as an extension of Bucher’s method [36]. Instead of the repeated application of that method, Dias et al. [33] proposed to compare the consistency model with an inconsistency model. Hence, for a study comparing treatment k to treatment \(k^{\prime}\), the model was defined as follows:

$$logit\left( {p_{i,k} } \right) = \mu_{i} + \delta_{{i,kk^{\prime}}} + \beta_{i} S_{i}$$
(3)

In this model, within study i, each node \(\delta_{{i,kk^{\prime}}}\) is a random variable with a mean treatment effect \(d_{{kk^{\prime}}}\) and a variance of \(\sigma_{i,bk}^{2}\). Each \(d_{{kk^{\prime}}}\) was treated as a separate parameter to be estimated. Accordingly, the model gives estimates of the 12 overall treatment effects and also provides estimates of the other 66 treatment effects. To test whether consistency is reasonable, model 3 was compared to model 2 by checking the difference in their model fit.

Adjustment for multiple arms studies

Since there are 2-, 3- and 4-arm trials, the difference in treatment effects may arise from different sources and can be inconsistent [33]. Although only a small proportion of studies were 3-arm (n = 9) or 4-arm (n = 3) trials, an adjustment for multiple-arm trials was made with model 2. Indirect comparisons accounted for the correlation between any two treatment contrasts in a multi-arm trial. This correlation is equal to 0.5 given the hypothesis that the study variation is the same among all pairwise contrasts.
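The value of 0.5 follows directly from the homogeneous-variance assumption. As a brief sketch, consider two contrasts \(\delta_{i,bk}\) and \(\delta_{i,bk^{\prime}}\) in trial i that share the baseline arm b, where \(k^{\prime}\) denotes a second non-baseline arm. Assume that \(\delta_{i,kk^{\prime}} = \delta_{i,bk^{\prime}} - \delta_{i,bk}\) holds within the trial and that every contrast has variance \(\tau^{2}\). Then:

$$\tau^{2} = Var\left( \delta_{i,kk^{\prime}} \right) = Var\left( \delta_{i,bk^{\prime}} \right) + Var\left( \delta_{i,bk} \right) - 2\,Cov\left( \delta_{i,bk} ,\delta_{i,bk^{\prime}} \right) = 2\tau^{2} - 2\,Cov\left( \delta_{i,bk} ,\delta_{i,bk^{\prime}} \right)$$

$$Cov\left( \delta_{i,bk} ,\delta_{i,bk^{\prime}} \right) = \frac{\tau^{2}}{2},\quad Corr\left( \delta_{i,bk} ,\delta_{i,bk^{\prime}} \right) = \frac{\tau^{2}/2}{\tau^{2}} = 0.5$$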

Statistical data analysis

First, pairwise random effect meta-analyses were conducted. The method was implemented with the meta-analysis package of the software R (rmeta) downloadable at https://cran.r-project.org/. Treatments involved were the five WHO-recommended ACT medicines: ASAQ, ASSP, AL, DHAP and ASMQ. Direct estimates were extracted from head-to-head trials using the odds ratios and 95% confidence interval (CI). The heterogeneity of variance for each pairwise comparison was estimated using the \(I^{2}\) statistic.
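As an illustration of this pairwise step, the Python sketch below pools log odds ratios with an inverse-variance random effect model and reports \(\tau^{2}\) and \(I^{2}\). It assumes the DerSimonian–Laird moment estimator, which the text does not confirm as the estimator actually used; the function name `dersimonian_laird` and the 2×2 tables are hypothetical.

```python
import math

def dersimonian_laird(tables):
    """Pool odds ratios from 2x2 tables given as
    (events_trt, non_events_trt, events_ref, non_events_ref)."""
    if len(tables) < 2:
        raise ValueError("at least two studies are required")
    y, v = [], []
    for a, b, c, d in tables:
        if 0 in (a, b, c, d):                       # continuity correction
            a, b, c, d = (x + 0.5 for x in (a, b, c, d))
        y.append(math.log((a * d) / (b * c)))       # log odds ratio
        v.append(1 / a + 1 / b + 1 / c + 1 / d)     # its approximate variance
    w = [1 / vi for vi in v]                        # fixed-effect weights
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(y) - 1
    scale = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / scale)               # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (vi + tau2) for vi in v]            # random-effect weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    ci = (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se))
    return math.exp(pooled), ci, tau2, i2

# Hypothetical head-to-head trials (not data from the review).
example = [(140, 10, 135, 15), (95, 5, 90, 10), (180, 20, 185, 15)]
or_pooled, ci95, tau2, i2 = dersimonian_laird(example)
print(f"OR = {or_pooled:.2f}, 95% CI [{ci95[0]:.2f}; {ci95[1]:.2f}], "
      f"tau2 = {tau2:.3f}, I2 = {i2:.1f}%")
```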

Secondly, data were analysed by a Bayesian method. Prior information was defined for all unknown parameters \(d_{bk}\) and \(\tau^{2}\). The following prior distributions were considered: \(\mu_{i} \sim N\left( {0, 0.0001} \right);\, d_{bk} \sim N(0, 0.0001)\). The inverse of the study variance was given a uniform distribution, i.e. \(s^{2} = \frac{1}{{\tau^{2} }}\sim Uniform(0,2)\); \(\beta_{0} \sim N(0, 0.0001)\), and \(1/\varepsilon^{2}\) also followed a uniform distribution, Uniform(0,2). One Markov chain was run. After a burn-in of 100,000 iterations, posterior outputs were obtained from the last 50,000 iterations. Model fit statistics were obtained for each model, i.e. the deviance information criterion (DIC), the deviance (\(\hat{D}\)), which provides an idea of the model likelihood, and \(p_{D}\), the effective number of parameters in the estimation process. The model with the smallest DIC is considered the best. Regarding treatment ranking, the probabilities that each treatment is the best, second best, and third best on Day 28, among 13 treatments, were calculated according to van Valkenhoef and Kuiper [21]. Treatment ranks were based on posterior probabilities. In each MCMC iteration, every treatment was ranked according to its estimated effect size. The proportion of MCMC iterations in which treatment k ranks first yielded the probability that this treatment is the best among all treatments. Analyses were carried out in WinBUGS. The WinBUGS code used for NMA is available in Additional file 2.
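The ranking step can be sketched as follows in Python. The posterior draws here are simulated from invented normal distributions purely to show the mechanics; in the actual analysis they would be the MCMC samples of the log odds ratios versus AL, and the names `draws`, `rank_count` and `rank_prob` are illustrative.

```python
import random

random.seed(0)
K, n_draws = 13, 2000

# Hypothetical posterior draws of log odds ratios versus treatment 1 (AL).
# In a real analysis these would be read from the MCMC output.
means = [0.05 * k for k in range(K)]
draws = [[0.0] + [random.gauss(means[k], 0.3) for k in range(1, K)]
         for _ in range(n_draws)]

# rank_count[k][r] = number of draws in which treatment k+1 takes rank r+1,
# where rank 1 is the largest log odds ratio (most efficacious).
rank_count = [[0] * K for _ in range(K)]
for d in draws:
    order = sorted(range(K), key=lambda k: d[k], reverse=True)
    for r, k in enumerate(order):
        rank_count[k][r] += 1

# Proportion of draws at each rank gives the rank probabilities.
rank_prob = [[c / n_draws for c in row] for row in rank_count]
best = max(range(K), key=lambda k: rank_prob[k][0])

print("P(best) per treatment:", [round(row[0], 3) for row in rank_prob])
print("most likely best: treatment", best + 1)
```

Each row of `rank_prob` sums to 1 over ranks, and each rank column sums to 1 over treatments, which is a convenient sanity check on any ranking output.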

Results

Head-to-head meta-analyses

Thirty-one articles (15,695 participants) compared the efficacies of ASAQ (taken as reference) and AL. The summary measure for these studies showed that the efficacy of ASAQ was comparable to that of AL (OR = 0.96; 95% CI [0.79; 1.17]; p value = 0.682; \(\tau^{2}\) = 0.138; \(I^{2}\) = 54% [32.2%; 69.9%]). The efficacy of ASAQ (as reference) was compared to that of DHAP in six studies (4042 participants); ASAQ appeared to be less efficacious than DHAP but the result was not statistically significant (OR = 0.81; 95% CI [0.54; 1.22]; p value = 0.31; \(I^{2}\) = 59.5% [0.5%; 83.5%]). Seven studies (2245 patients) compared ASAQ (reference) to ASSP. Results showed that both treatments were comparable (OR = 0.77; 95% CI [0.52; 1.14]; p value = 0.191; \(\tau^{2}\) = 0.055; \(I^{2}\) = 26.7% [0%; 68.2%]). Eighteen articles (100,000 participants) compared AL (taken as reference) to DHAP. The results showed that AL was statistically less efficacious than DHAP (OR = 0.52; 95% CI [0.36; 0.77]; \(\tau^{2}\) = 0.518; p value = 0.0009; \(I^{2}\) = 86.8% [80.6%; 91%]), but AL was as effective as ASMQ in five studies (2869 patients; OR = 0.90; 95% CI [0.56; 1.47]; \(\tau^{2}\) = 0; \(I^{2}\) = 0% [0%; 73.4%]). Three studies (1104 patients) compared AL and ASSP, and the differences were not statistically significant (OR = 1.14; 95% CI [0.40; 3.22]; \(I^{2}\) = 36.4% [0%; 78.7%]). Only one study compared ASAQ to ASMQ. No study compared ASMQ to DHAP or ASSP. In addition, no study compared ASSP to DHAP.

Comparing AQSP and ACT

Four trials (1584 participants) compared ASAQ (reference) to AQSP and showed that ASAQ was more efficacious than AQSP, but the results were not significant (OR = 1.51; 95% CI [0.80; 2.87]; p value = 0.20; \(\tau^{2}\) = 0.23; \(I^{2}\) = 63.4% [0%; 87.7%]). AQSP (reference) was found less efficacious than AL in seven trials (2316 participants) but the result was not statistically significant (OR = 0.76; 95% CI [0.25; 2.30]; p value = 0.62; \(\tau^{2}\) = 1.89; \(I^{2}\) = 93.7% [89.5%; 96.3%]) [37,38,39,40,41,42,43]. The same conclusion was found in three trials comparing AQSP (reference) to DHAP (2518 patients; OR = 0.54; 95% CI [0.16; 2.03]; \(\tau^{2}\) = 1.30; \(I^{2}\) = 96.6% [93%; 98.4%]) [42, 44, 45]. Forest plots of some of these results are illustrated in Additional file 3.

Relative treatment effects from NMA

Table 2 displays the overall treatment effects (given AL as the common comparator) estimated from the two models using the Bayesian approach and their 95% credible interval (CrI). Both models yielded similar results. The difference in DIC between models 3 and 2 was 1025.04 − 1024.07 = 0.97, suggesting little evidence for inconsistency. Hence, model 2 gave the best fit to the data set. A significant difference was found between DHAP and AL. For example, with model 2, DHAP was 1.92-fold more efficacious than AL (OR = 1.92; 95% CrI = 1.30–2.82; 63 vs 22 studies per-treatment arm; 19,163 participants). New drug combinations like ASSMP, ASATPG, ASPY, and ASAQCPH appeared more efficacious than AL, but the results were not statistically significant as indicated by the credible intervals. For model 3, the other 78 − 12 = 66 comparisons can be found in Additional file 3.

Table 2 Comparison of Bayesian models; posterior distributions of odds ratios and 95% credible intervals (CrI)

Indirect comparisons

Indirect estimates were calculated from each model assuming the consistency equation (Table 3). The results showed the superiority of DHAP compared to ASCD (OR = 3.25, 95% CI 1.46–7.25), ASAQ (OR = 1.70; 95% CI 1.10–2.64), and AQSP (OR = 2.20; 95% CI 1.21–3.96). Indirect estimate of DHAP vs ASMQ yielded OR = 1.59 with 95% CI 0.62–4.02 showing no significant difference between the two drugs. The efficacy of DHAP was also not statistically different from that of ASPY (OR = 1.36; 95% CI 0.38–4.87). The combination ASATPG was 3.40-fold more efficacious than AL, albeit not statistically significant (95% CI 0.60–18.56).

Table 3 Indirect comparisons calculated from the consistency equation

Rank probabilities

Model 2 was selected for calculating the rank probabilities. Table 4 displays the rank for each treatment and the corresponding rank probability. Data are shown for rank 1, rank 2, rank 3 and for the worst treatment (rank 13). ASATPG combination had the highest probability (0.53) at rank 1, followed by ASNAPH (0.11) and ASAQCPH (0.11). At rank 2, DHAP was the drug with the highest probability (0.25). At rank 3, the best was also DHAP (0.31). For rank 13, the worst treatment was ASCD (0.48).

Table 4 Posterior distributions of the ranking probability for each treatment

Results among WHO-recommended ACT and AQSP

To identify the best treatment among WHO-recommended ACT medicines and AQSP, a second analysis was carried out using model 2, with AL as the overall control group. Treatments were numbered as follows: 1 = AL, 2 = AQSP, 3 = ASAQ, 4 = ASMQ, 5 = ASSP, 6 = DHAP, 7 = ASAQCPH, 8 = ASATPG, 9 = ASCD, 10 = ASNAPH, 11 = ASPY, 12 = ASSMP, and 13 = DHAPT, so that the WHO-recommended ACT medicines and AQSP were the first 6 treatments. Ranking was performed only up to rank 6. Table 5 displays the results. DHAP was more efficacious than AL (OR = 2.09; 95% CI [1.54–2.83]). At ranks 1, 2, and 3, DHAP emerged as the top-ranked treatment.

Table 5 Results for AQSP and the WHO-recommended ACT medicines

Discussion

NMA of randomized clinical trials is becoming a promising tool to analyse grouped data with multiple interventions. The use of this methodological approach was assessed for the evaluation of malaria treatment efficacy and the selection of the most efficacious drug. Several clinical trials were identified and assessed using the proportions of PCR-adjusted ACPR. The rationale for choosing 2002–2003 as the starting point was that this period is considered as the early years of the introduction of ACT in some African countries and the time when the WHO protocol requiring a 28-day follow-up to guide clinical trials was updated [6]. When using this approach, an examination of the type of primary outcome is important because it determines the choice of the modelling approach. Treatment outcome can be categorical or continuous, or it may be expressed as a survival outcome. To apply NMA, the included studies should be randomized trials that compare at least two treatments. A single-arm trial cannot be included in the analysis. Robust results can be expected with NMA, depending on the choice of the common end-point of drug efficacy and assessment of heterogeneity. Furthermore, it has been argued that the chief advantage of random-effect meta-analysis is that heterogeneity is taken into account in the modelling. Hypothesis tests provide information about consistency and inconsistency of the network.

The results become more robust when a Bayesian estimation method is used. One of the strengths of this method is that, at a point in time when the efficacy of a drug is not established, the contribution of other trials could suggest its relative efficacy. In addition, pooling trials increases the sample size and allows the selection of the best treatment. This is an advantage of the NMA approach: it may facilitate the selection of the best treatment through rank probabilities and is, therefore, one feature of the novel method to guide decision making. Two models were examined in the present study, and both provided similar results.

The results did not show any difference in the efficacies of ASAQ and AL. AL was the first co-formulated fixed-combination ACT to become available, followed by ASAQ. The six-dose regimen of AL is highly effective and represents a challenging comparator for any new drug combination although the non-inferiority of ASAQ versus AL had been shown in an earlier multi-centric study [46]. Both ACT are still highly effective and well tolerated with no serious adverse event, supporting their continuous use for the treatment of uncomplicated P. falciparum malaria in Africa [47, 48]. However, in a few studies these two drugs were found to have cure rates lower than the critical threshold of 90% required by the WHO [49, 50]. After PCR correction, the cure rates were 77.8 and 84.1% for AL and ASAQ, respectively [49]. Since cases of ACT resistance have been reported in South-East Asia, the need to understand the mechanism of artemisinin resistance in P. falciparum has become a global research goal. Since their adoption in African countries, few data on the selective impact of ACT in the circulating parasite population are available. A recent study assessed the selective impact of the treatment with ASAQ and AL on Pfcrt and Pfmdr1 alleles and found no association between the presence of Pfcrt and Pfmdr1 alleles before treatment and at the time of treatment failure [26]. Hence, countries that rely on ASAQ and AL for the first-line treatment should continue to monitor their clinical efficacy and molecular markers associated with resistance to these drugs, including kelch 13 [51].

The results highlighted that DHAP is superior to AL, contrary to some published reports in which the overall efficacy of DHAP was found to be similar to that of AL in multi-centric studies in Africa [52, 53] and in individual randomized clinical trials [30, 43, 47, 54,55,56,57]. However, the result of the present study is in agreement with a recent systematic review in which it was found that DHAP reduces overall treatment failure compared to AL [10]. In addition, indirect comparison also showed that DHAP was more efficacious than ASAQ and ASCD. These results should be interpreted with caution, as piperaquine has a longer elimination half-life than most other partner drugs of artemisinin derivatives.

Studies in Africa also demonstrated the efficacy and tolerability of ASMQ [58, 59]. However, ASMQ has not been adopted by African malaria control units due to the high efficacy of AL and ASAQ and a relatively high incidence of side effects, in particular vomiting, associated with mefloquine. On the other hand, studies that compared the efficacies of ASMQ and AL had been conducted, but there is no conclusive evidence and argument to propose ASMQ as a replacement of AL in Africa [58,59,60,61,62]. The present analysis detected no significant difference in the efficacy of these two ACT medicines.

ASSP has been as efficacious and well tolerated as ASAQ and has been used in some African countries as the first-line treatment [63]. Recent literature and the present analysis support the comparable efficacy of ASSP and AL. However, the use of SP combined with artesunate is a source of concern in many African countries where molecular studies have shown increasing prevalence of multiple mutations in dihydrofolate reductase (dhfr) and dihydropteroate synthase (dhps) genes and where SP is employed for intermittent preventive treatment in pregnancy and infancy and AQSP combination is employed for seasonal malaria chemoprevention [3].

In addition to the five ACT medicines currently recommended by the WHO [3], several novel combinations have been evaluated on the African continent in recent years. The triple combination dihydroartemisinin–piperaquine–trimethoprim (DHAPT) is administered over 2 days rather than the standard WHO-recommended 3-day ACT regimen, and its efficacy was found to be similar to that of AL in a three-centre study [64]. ASAQCPH resulted in better haematological recovery and higher cure rates than ASAQ and AL in Nigeria [65]. The combination ASATPG was highly effective in young Cameroonian children [66]. ASNAPH is available in Africa and is recommended by the manufacturer for use as a single-dose regimen [31, 67]. ASSMP was assessed in several African countries, and its efficacy was found to be similar to that of ASAQ and AL [43, 68, 69].

ASPY is a newly introduced form of ACT which may be deployed together with primaquine to kill mature P. falciparum gametocytes in an effort to reduce transmission in areas where malaria elimination programmes are being implemented [70, 71]. A meta-analysis highlighted its efficacy and safety and concluded that it could be an option for first-line treatment [11]. In the present analysis, ASPY was found to be more efficacious than AL and less efficacious than DHAP, but these differences were not statistically significant.

So far, only a few studies have presented results based on the suggested methodological changes. However, Donegan et al. [23] explored and reviewed methods for assessing the key assumptions of NMA. These authors applied the methods to data from only one multi-centric study [72] comparing DHAP, ASAQ, AL and ASCD. The authors did not draw conclusions regarding the results and did not rank the treatments according to their efficacy.

Among non-ACT drug combinations, AQSP had been evaluated and compared to ACT in some African countries during the pre-ACT period [73, 74]. In some studies, AQSP was as efficacious as ASSP. AQSP has also been found to be as efficacious and well tolerated as DHAP [45]. The present study found no significant difference between AQSP and AL or between DHAP and AQSP. At present, AQSP cannot be recommended for the first-line treatment of uncomplicated malaria. However, the results presented in this study suggest the continued usefulness of AQSP in Africa, in particular for seasonal malaria chemoprevention.

Based on this novel methodological approach, among the 13 combination therapies included in the present analysis, ASATPG emerged at the top rank as the best treatment, followed by ASAQCPH and ASNAPH, in spite of the limited number of trials. These findings suggest that NMA rankings tend to be overly sensitive when treatments have been tested only once or twice, which allowed these treatments to reach the top ranks. To address this methodological limitation, further randomized clinical trials with AL, ASAQ, ASNAPH and ASATPG would be necessary to confirm these initial findings.
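
One possible mechanism behind this sensitivity can be illustrated with a small simulation. The sketch below is not the model fitted in this study: it simply draws treatment effects from normal distributions and counts how often each treatment comes out on top. The treatment labels, means and standard errors are hypothetical, and `prob_best` is a name introduced here for illustration. All three treatments share the same mean effect, and only the precision differs.

```python
import random


def prob_best(estimates, n_draws=20000, seed=1):
    """Estimate the probability that each treatment ranks first.

    `estimates` maps a treatment name to (mean effect, standard error),
    where a larger effect means a better treatment. Effects are drawn
    independently from normal distributions (a simplifying assumption).
    """
    rng = random.Random(seed)
    wins = {name: 0 for name in estimates}
    for _ in range(n_draws):
        draws = {name: rng.gauss(mean, se)
                 for name, (mean, se) in estimates.items()}
        wins[max(draws, key=draws.get)] += 1
    return {name: count / n_draws for name, count in wins.items()}


# Hypothetical inputs: identical mean effects, different precision.
estimates = {
    "well_studied_1": (0.0, 0.1),  # many trials, narrow uncertainty
    "well_studied_2": (0.0, 0.1),
    "tested_once": (0.0, 1.0),     # one small trial, wide uncertainty
}

p_best = prob_best(estimates)
for name, p in p_best.items():
    print(f"{name}: P(rank 1) = {p:.2f}")
```

Under these assumptions the imprecisely estimated treatment is ranked first far more often than the well-studied ones, even though its mean effect is no better. It would also be ranked last more often, which is why a top rank based on one or two trials needs confirmation.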

The second analysis, restricted to the most widely tested combinations in Africa, showed that DHAP was more effective than AL and was the best treatment, followed by ASMQ, with AL at the last rank. AL has been found in most studies to be the most efficacious treatment. In the present analysis, the finding that AL is not the most efficacious ACT could be due to the fact that it was fixed as the overall control drug in the network. However, this result does not call its efficacy into question because in the present analysis it was ranked among the best treatments. As an additional option, ASMQ, which has not been extensively employed in Africa, could be used as an alternative first-line anti-malarial therapy.

The present study provides the first overview of NMA methods for anti-malarial drug efficacy. It follows the work initiated by Donegan et al. and offers further research opportunities [23]. However, the study has some limitations. First, the search criteria included only studies conducted in Africa. Second, although the method can be extended to individual patient data, the present study used aggregate patient data, which may have reduced the power of the study. However, the results were similar to those obtained with individual outcomes. Third, an adequate number of trials was not available for robust inference on the recently tested new combination therapies. Despite this shortcoming, the findings described in this work reflect the reality in the field. Moreover, regarding the change in efficacy, insufficient data were available for days 35, 42 and 63 to perform separate analyses. Fourth, drug efficacy was limited to the outcome on day 28, as recommended by the WHO. According to WWARN [12], there is variability in efficacy related to drug formulation. As this variability was not known in the past, it was assumed in the present study that drug efficacy does not vary with formulation. The analyses included studies on both children and adults, and the majority of the included patients were children less than 5 years of age. This was done partly to include novel drugs and estimate their contributions to anti-malarial efficacy. Accordingly, this approach may have increased heterogeneity and may partly explain the imprecise results. In addition, confounding factors such as age and dose may reduce the demonstrated effect. Nevertheless, the random-effects model accounted for this heterogeneity by assuming a trial-level random effect and a random coefficient for the type of study population. Both of these factors were introduced in the model to reduce the degree of heterogeneity to about 50%.
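
For readers unfamiliar with how a degree of heterogeneity is expressed as a percentage, the sketch below computes the I² statistic from Cochran's Q for a set of trial-level effects. It assumes that the heterogeneity figure quoted above is of this kind, which the text does not state explicitly. The effect sizes and standard errors are hypothetical, and `i_squared` is a name introduced here for illustration, not part of the authors' model.

```python
def i_squared(effects, std_errors):
    """I-squared (%) from Cochran's Q using inverse-variance weights.

    I-squared = max(0, (Q - df) / Q) * 100, where df = number of trials - 1.
    It describes the share of total variation across trials attributed
    to between-trial heterogeneity rather than chance.
    """
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0  # identical effects: no observed heterogeneity
    return max(0.0, (q - df) / q) * 100


# Hypothetical trial-level log odds ratios and standard errors.
effects = [0.1, 0.5, -0.2, 0.8]
std_errors = [0.2, 0.2, 0.2, 0.2]

heterogeneity = i_squared(effects, std_errors)
print(f"I-squared = {heterogeneity:.1f}%")
```

Values around 50% are conventionally read as moderate heterogeneity, so covariates such as the type of study population are typically added to a model to explain part of the between-trial variation.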

Conclusions

The NMA technique may have a role to play in the evaluation of different public health policies and interventions for malaria control. DHAP was found to be the best treatment overall, but this observation should be treated with caution in decision-making: because of inadequate available data, the present analysis was based on the outcome on day 28 and did not take into consideration the outcomes on days 42 and 63 for drugs with a long elimination half-life. More comparative studies of novel drug combinations are needed to assess their efficacy relative to the currently recommended artemisinin-based combinations in Africa.