Key Points

As Parkinson’s disease (PD) progresses and optimized treatment with oral therapies fails to control symptoms, device-aided therapies such as levodopa/carbidopa intestinal gel (LCIG), deep brain stimulation (DBS), and continuous subcutaneous apomorphine infusion (CSAI) may improve symptoms such as off-time.

Comparative randomized controlled trials evaluating device-aided therapies for advanced PD are not available.

This network meta-analysis was conducted to compare the three device-aided therapies and best medical therapy (BMT), with a focus on changes in off-time and quality of life (QoL).

In this analysis, LCIG and DBS were associated with superior improvement in off-time and PD-related QoL compared with CSAI and BMT at 6 months after treatment initiation.

There was no significant difference in the effects of LCIG and DBS, but DBS was ranked first for reduction in off-time, and LCIG was ranked first for improvement in QoL.

These results could help inform treatment choices for people whose PD symptoms are not well controlled with optimized oral therapy.

1 Introduction

As Parkinson’s disease (PD) progresses, motor complications can arise and worsen over time; for example, an increase in off-time frequently occurs. This, along with other symptoms, has a negative impact on patients’ quality of life (QoL) and activities of daily living (ADL) [1, 2]. The burden of advancing disease also affects caregivers of people with PD [3, 4]. In practice, when motor fluctuations cannot be improved with optimized oral therapy (or best medical therapy [BMT]), patients may be classified as having advanced PD [5,6,7]. For these patients, device-aided therapy may be considered as an alternative treatment option to BMT.

Device-aided therapies for PD include levodopa/carbidopa intestinal gel (LCIG), deep brain stimulation (DBS), and continuous subcutaneous apomorphine infusion (CSAI). Levodopa/carbidopa intestinal gel is a gel formulation of levodopa/carbidopa that is administered continuously via a percutaneous endoscopic gastrostomy (PEG) into the small intestine using a portable pump [8]. Levodopa/carbidopa intestinal gel circumvents the impact of erratic gastric emptying on oral levodopa/carbidopa and provides more consistent plasma levels of levodopa [9], which is a metabolic precursor of dopamine that is depleted in PD. Deep brain stimulation involves the surgical insertion of electrodes to deliver controlled electrical impulses to the subthalamic nucleus, the internal globus pallidus, or the ventral intermediate nucleus. The electrodes deliver high frequency stimulation to the targeted area, which overcomes the abnormal activity associated with some PD symptoms. Continuous subcutaneous apomorphine infusion is the continuous subcutaneous delivery of apomorphine solution via a pump [10]. This approach provides a continuous plasma delivery of apomorphine, which mimics the effect of dopamine in the striatum. These device-aided therapies have all been shown to reduce off-time, and in some studies they have also improved QoL in patients with advanced PD [11,12,13]. These improvements have also been observed in longer term follow-up and observational studies [14,15,16]. Comparative effectiveness of some device-aided therapies in patients with PD has been reported [17, 18]. A recent systematic review compared data from studies of LCIG, DBS, and CSAI and presented the results in a format aimed at patients [19], and expert guidance on such therapies has been published [20]. However, there have been no comprehensive clinical trials comparing the three device-aided therapies for advanced PD. 
Individual patient and disease factors may influence the selection of patients for each device-aided therapy [6], and with limited published data, information on the comparative effectiveness of device-aided therapies for reducing off-time and improving QoL is still needed to aid the decision-making process for patients, caregivers, and providers.

When single-arm studies constitute the majority of the evidence, traditional network meta-analysis is a sub-optimal approach due to the lack of common comparators. An alternative approach is to use an unanchored matching-adjusted indirect comparison (MAIC) to connect single-arm studies to the evidence network. Matching-adjusted indirect comparison is an indirect comparison method frequently used by health technology assessment agencies [21, 22]. It compares one therapy for which individual patient-level data (IPD) are available with another therapy for which only published aggregate data (AD) are available. It reweights the IPD to match the AD population in baseline characteristics, estimates the outcome in the weighted IPD, and compares it to the outcome in the AD study as if the outcomes were assessed within the same study.

The primary objective of this analysis was to assess the effectiveness of LCIG, DBS, CSAI, and BMT in reducing off-time and improving QoL in patients with advanced PD, using MAIC and a Bayesian network meta-analysis (NMA) of clinical trial and observational study data.

2 Methods

2.1 Data Source and Search Strategy

A systematic literature review was conducted in Medline, Embase, and the Cochrane Library within the date range of January 2003 (the earliest date when all three device-aided therapies were widely available) to September 2019 to identify clinical studies of CSAI, DBS, and LCIG for the management of patients with advanced PD (the methodology of this literature review was not registered). Individual patient-level data were extracted from four AbbVie-sponsored studies [11, 23,24,25]. Predefined search terms were based on the Patient Intervention Comparator Outcome Study (PICOS) design criteria [26] and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA [27]). Search terms included ‘Parkinson disease’ in combination with variations of the device-aided therapy names including ‘deep brain stimulation’, ‘CSAI’, ‘LCIG’, and ‘infusion’ (see Online Resource Table 1 for a detailed list of the search terms).

2.2 Selection of Studies, Screening, and Data Extraction

Studies were included in the analysis if they were published in English; were randomized clinical trials (RCTs), retrospective or prospective observational studies, or other interventional studies; and had a sample size of ≥ 20 people with advanced PD treated with LCIG, DBS, CSAI, or BMT. Studies were included if off-time was assessed by diary or Unified Parkinson's Disease Rating Scale (UPDRS) part IV item 39 and/or QoL was assessed by the Parkinson’s Disease Questionnaire (PDQ-39/PDQ-8). The UPDRS item 39 records the proportion of the waking day spent in off-time on a scale from 0 (none) to 4 (76–100%). Parkinson’s Disease Questionnaire scores range from 0 to 100, with higher scores indicating worse health. Data were included if the outcomes were reported at baseline and 6 months or as change at 6 months (± 3 months where the 6-month follow-up was not available). Data had to be reported as mean values with standard errors and the number of patients assessed; if standard errors were not reported, they had to be calculable from the published standard deviations (SDs) or confidence intervals (CIs). Studies were excluded if they included patients with early-stage PD, Parkin mutations, or PD-related dementia, or if they were comparisons of subgroups (e.g., female vs male; overweight vs normal weight; mutation-positive vs mutation-negative).
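The requirement that standard errors be calculable from published SDs or CIs can be illustrated with a minimal Python sketch. It assumes a normal approximation for the 95% CI; the function names and the numbers in the example are hypothetical and are not taken from any included study.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def se_from_sd(sd, n):
    """Standard error of a mean from the sample SD and the number of patients."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sd / math.sqrt(n)


def se_from_ci(lower, upper, z=Z_95):
    """Standard error from a symmetric, normal-approximation confidence interval."""
    if upper < lower:
        raise ValueError("upper bound must not be below lower bound")
    return (upper - lower) / (2.0 * z)


# Hypothetical arm: SD of 3.0 h/day in 36 patients
se_a = se_from_sd(3.0, 36)

# Hypothetical arm: mean 2.0 h/day with 95% CI 1.02 to 2.98
se_b = se_from_ci(1.02, 2.98)

print(round(se_a, 3), round(se_b, 3))
```

For small studies, a t-distribution quantile in place of `Z_95` may be more appropriate; the normal quantile is used here only to keep the sketch short.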

Following the removal of duplicate studies, each article was screened by two independent reviewers (ET, MLE, SK, FD, SP, BW, or HW; see Acknowledgments for details of reviewers) based on its title and abstract. Full-text publications of studies that passed the first round of screening were then reviewed by two independent reviewers. Data extracted from each selected publication for the NMA were amount of off-time/day and QoL scores at baseline and 6 months. In addition, information on study design, patient characteristics, treatments, and last reported follow-up were extracted. Information was independently entered into a collection form by two reviewers. For the IPD extraction, internal access to the original trial data was permitted. Study investigators (LW and PLK) used the trial protocols and data dictionary to extract individual patient characteristics (age, gender, PD duration, daily levodopa dose, and daily levodopa equivalent dose), outcomes of interest (off-time and PDQ scores), as well as trial-level information from the original datasets. To ensure accuracy of the data collected, each reviewer audited the other reviewer’s collection form. Disputes and any inconsistencies identified were resolved through discussion between the first two reviewers or adjudication by a third reviewer.

2.3 Network Meta-analytical Approach

Single-arm studies were incorporated by unanchored MAIC [21]. To simulate RCTs, studies with IPD were matched to AD from published literature with similar distributions of baseline patient characteristics (e.g., baseline age, sex, years since PD diagnosis, levodopa daily dose, off-time, and PD-related QoL). Matching-adjusted indirect comparison was performed to match each single-arm trial or single-treatment observational study with AD to one of three studies with IPD [11, 23, 25]. The fourth study from which IPD were available was not used because it included only 37 patients and did not report daily levodopa dose at baseline or off-time at 6 months [24]. Matching-adjusted indirect comparison reweighted the IPD study population such that it had baseline characteristics (mean and percentage composition) similar to the AD study population. The weight assigned to any individual in an IPD study was equal to the odds of being enrolled in an AD trial [21, 22]. Because several AD studies were matched to the same IPD study, correlation between relative effects was accounted for by multivariate normal distributions with a variance-covariance matrix [28]. For off-time, two RCTs were simulated by MAIC, and for QoL, three RCTs were simulated by MAIC (networks shown in Online Resource Figure 1). A sensitivity analysis by type of study (e.g., excluding single-arm studies) is not reported as the findings were not considered meaningful.
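The reweighting step can be sketched in Python as follows. This is an illustration of the commonly used method-of-moments formulation of MAIC weights, not the code used in this analysis (the software is not stated in the text). The function name `maic_weights`, the simulated IPD, and the target means are all hypothetical.

```python
import numpy as np


def maic_weights(ipd_covariates, ad_means, max_iter=100, tol=1e-10):
    """Method-of-moments MAIC weights: w_i = exp(x_i' alpha), with alpha chosen
    so that the weighted IPD covariate means equal the aggregate-data means."""
    x = np.asarray(ipd_covariates, dtype=float)
    xc = x - np.asarray(ad_means, dtype=float)   # centre on the AD means
    scale = xc.std(axis=0)
    scale[scale == 0] = 1.0
    z = xc / scale                               # rescale for numerical stability
    alpha = np.zeros(z.shape[1])

    def objective(a):
        # Convex objective whose gradient is the weighted sum of centred covariates
        return np.exp(z @ a).sum()

    for _ in range(max_iter):
        w = np.exp(z @ alpha)
        grad = z.T @ w
        if np.max(np.abs(grad)) < tol * w.sum():
            break
        hess = (z * w[:, None]).T @ z
        step = np.linalg.solve(hess, grad)       # Newton direction
        t = 1.0
        while objective(alpha - t * step) > objective(alpha) and t > 1e-8:
            t *= 0.5                             # backtrack if the step overshoots
        alpha = alpha - t * step

    w = np.exp(z @ alpha)
    return w / w.sum() * len(w)                  # rescaled to sum to n


# Hypothetical IPD (simulated, not study data): age, male indicator, off-time
rng = np.random.default_rng(0)
n = 200
ipd = np.column_stack([
    rng.normal(64.0, 8.0, n),    # age, years
    rng.binomial(1, 0.65, n),    # 1 = male
    rng.normal(6.5, 2.0, n),     # baseline off-time, h/day
])
ad_means = np.array([62.0, 0.60, 7.0])  # hypothetical published AD means

weights = maic_weights(ipd, ad_means)
weighted_means = np.average(ipd, axis=0, weights=weights)
ess = weights.sum() ** 2 / (weights ** 2).sum()  # effective sample size
print(weighted_means.round(3), round(float(ess), 1))
```

The effective sample size gives a rough indication of how much information is lost through reweighting; a value far below the original sample size suggests poor overlap between the IPD and AD populations.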

Fig. 1 PRISMA flow chart showing identification and selection of studies

Network meta-analysis was performed on RCTs, simulated RCTs, and comparative observational studies. A Bayesian hierarchical model with Markov chain Monte Carlo (MCMC) simulation was used to estimate the relative effect of different treatments. Treatment parameters were estimated using a normal likelihood with an identity link. Three parallel Markov chains of 50,000 iterations each were run, with the first 5000 iterations of each chain discarded as burn-in. The convergence of MCMC chains was checked by trace plots and Gelman–Rubin diagnostic statistics [28]. Fixed and random effects models were assessed and compared using overall residual deviance and the deviance information criterion (DIC) [28]. To check for consistency of the results, an inconsistency model was fitted to the data and compared with a standard consistency model [29]. The inconsistency model assumes unrelated mean relative effects with no consistency, and is approximately equivalent to performing separate pairwise meta‐analyses, while the random effects models allow shared variance parameters to be estimated [29]. These direct estimates can be used to assess inconsistency between studies. League tables of pairwise comparisons for all treatments are reported. Relative treatment effects of all treatment comparisons are reported as median differences and 95% credible intervals (CrI). Treatments are ranked as the percentage of Bayesian iterations in which each was ranked 1st, 2nd, 3rd, or 4th. A fixed-effect model was chosen for the analysis because most pairwise treatment comparisons involved only one underlying study rather than multiple studies. In addition, results using the fixed and random effects models were found to be similar, with the only difference being that estimates had wider credible intervals using the random effects model.
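As an illustration of the burn-in and convergence checks described above, the following Python sketch computes the basic Gelman–Rubin potential scale reduction factor for three chains. The chains are simulated random draws standing in for MCMC output (an assumption, since the actual sampler output is not available), and the function name `gelman_rubin` is hypothetical.

```python
import math
import random


def _mean(xs):
    return math.fsum(xs) / len(xs)


def _sample_variance(xs):
    m = _mean(xs)
    return math.fsum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    m, n = len(chains), len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    within = _mean([_sample_variance(c) for c in chains])          # W
    between_over_n = _sample_variance([_mean(c) for c in chains])  # B / n
    pooled = (n - 1) / n * within + between_over_n
    return math.sqrt(pooled / within)


N_CHAINS, N_ITER, BURN_IN = 3, 50_000, 5_000

# Simulated stand-ins for three MCMC chains of one treatment-effect parameter
random.seed(1)
raw = [[random.gauss(2.3, 0.4) for _ in range(N_ITER)] for _ in range(N_CHAINS)]

kept = [chain[BURN_IN:] for chain in raw]       # discard burn-in from each chain
retained = sum(len(chain) for chain in kept)    # 3 x 45,000 retained iterations

r_hat_mixed = gelman_rubin(kept)

# Chains stuck at different levels give a value clearly above 1
shifted = [[x + s for x in chain] for chain, s in zip(kept, (0.0, 1.0, 2.0))]
r_hat_stuck = gelman_rubin(shifted)

print(retained, round(r_hat_mixed, 4), round(r_hat_stuck, 2))
```

Values close to 1 are usually taken to indicate that the chains have mixed; the retained total of 3 × 45,000 corresponds to the 135,000 iterations on which the rankings are based.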

Study bias was analyzed using the Cochrane Risk of Bias (RoB) tool to assess RCTs, ROBINS-I tool for non-randomized comparative studies, and the National Institutes of Health (NIH) Quality Assessment Tool for before-after (pre-post) studies with no control group or single-arm studies [30,31,32].

3 Results

The literature review identified 64 publications; of these, 22 with data from 2063 patients fulfilled the inclusion criteria and were included in the analysis (Fig. 1; Table 1) [11,12,13, 23,24,25, 33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48]. In these studies, 908 patients were assigned to treatment with LCIG, 705 to DBS, and 322 to CSAI. Of the 22 selected studies, 16 were single-arm studies (seven of LCIG, two of CSAI and seven of DBS), two compared DBS with BMT, and four studies separately compared LCIG with BMT, LCIG with CSAI, CSAI with BMT, and DBS versus LCIG versus CSAI. In two of the RCTs [11, 12], follow-up was 3 months and so data were carried forward to create the 6-month datapoint.

Table 1 Summary of the 22 studies included in the network meta-analysis

3.1 Patient Characteristics

Men outnumbered women in 18 of the 22 studies (Table 2). The mean age at baseline was between 55.5 and 70.9 years, the mean duration of PD was between 9.1 and 15.3 years, and daily levodopa dose at baseline ranged from 905.5 to 1080.3 mg/day (Table 2).

Table 2 Selected baseline characteristics in individual studies

3.2 Off-time

Of the selected studies, nine reported off-time at baseline (ranging from 5.4 to 8.7 h/day) and at 6 months, and one reported change from baseline at 6 months (Table 3). Nine studies used patient diaries to assess daily off-time, and one study used UPDRS part IV item 39. Deep brain stimulation (2.35 h/day; 95% CI 1.66, 3.04) and LCIG (2.25 h/day; 95% CI 1.32, 3.19), but not CSAI (0.90 h/day; 95% CI − 0.08, 1.88), produced significantly greater reductions in off-time compared with BMT (Fig. 2). Pairwise comparisons indicated that DBS (1.45 h/day; 95% CI 0.29, 2.61) and LCIG (1.35 h/day; 95% CI 0.11, 2.60) resulted in greater improvements in off-time at 6 months compared with CSAI. The reduction in off-time was not statistically different between DBS and LCIG (Fig. 2). Based on 135,000 Bayesian iterations, DBS ranked highest (ranked 1 for 58% of iterations), and LCIG ranked second highest (ranked 1 for 42% of iterations) for off-time reduction (Fig. 3).
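Under a consistency model, the effect of one treatment relative to another equals the difference of their effects relative to a common reference. The short Python sketch below applies this relation to the point estimates reported above; the function name is hypothetical. Posterior medians need not subtract exactly in general, so the agreement shown here is a check on the reported figures rather than a reproduction of the analysis.

```python
# Reported reductions in off-time versus BMT (h/day), taken from the text above
VS_BMT = {"DBS": 2.35, "LCIG": 2.25, "CSAI": 0.90}


def implied_difference(a, b, vs_reference=VS_BMT):
    """Effect of a relative to b implied by consistency: d_ab = d_a,ref - d_b,ref."""
    return vs_reference[a] - vs_reference[b]


dbs_vs_csai = implied_difference("DBS", "CSAI")
lcig_vs_csai = implied_difference("LCIG", "CSAI")
dbs_vs_lcig = implied_difference("DBS", "LCIG")

print(f"DBS vs CSAI:  {dbs_vs_csai:.2f} h/day")
print(f"LCIG vs CSAI: {lcig_vs_csai:.2f} h/day")
print(f"DBS vs LCIG:  {dbs_vs_lcig:.2f} h/day")
```

The first two implied differences match the reported pairwise estimates versus CSAI (1.45 and 1.35 h/day).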

Table 3 Summary of off-time changes from the ten studies that reported off-time at 6 months
Fig. 2 Mean (95% CI) off-time reduction (h/day) with the three device-aided therapies versus best medical treatment. The analysis adjusted for baseline age, sex, disease duration, levodopa daily dose, and off-time. BMT best medical therapy, CSAI continuous subcutaneous apomorphine infusion, DBS deep brain stimulation, LCIG levodopa/carbidopa intestinal gel

Fig. 3 Ranking of device-aided therapy and best medical therapy based on improvements in off-time at 6 months

Although the inconsistency model had a lower posterior mean of the residual deviance for off-time and hence a better fit to the data, the difference in deviance information criterion (DIC) between the two models was < 5 (Online Resource Table 2A). A difference in DIC of > 5 is considered important when choosing one model over the other [49]. In addition, the 95% CrI from the two models overlapped for all comparisons. Thus, no inconsistency was observed for this parameter.

3.3 Quality of Life

Baseline PDQ scores were reported in 19 studies with a range from 28.8 to 67.0 (Table 4). All device-aided therapies demonstrated greater improvements in PD-specific QoL than BMT at 6 months, but with smaller estimated improvements for CSAI (3.61; 95% CI 0.55, 6.68) than for LCIG (7.83; 95% CI 5.15, 10.51) and DBS (7.24; 95% CI 5.37, 9.10) (Fig. 4). Pairwise comparisons indicated that both LCIG (4.22, 95% CI 1.77, 6.66) and DBS (3.63, 95% CI 0.74, 6.50) resulted in significantly greater improvements in QoL at 6 months compared with CSAI. Improvement in QoL was not statistically different between LCIG and DBS (Fig. 4). Based on 135,000 Bayesian iterations, LCIG ranked highest (ranked 1 for 70% of iterations), and DBS ranked second highest (ranked 1 for 30% of iterations) for improvement in QoL (Fig. 5).
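The ranking step can be sketched in Python as follows: for each iteration the treatments are ordered by effect, and the share of iterations in which each treatment occupies each rank is recorded. Because the actual posterior draws are not available, the sketch substitutes independent normal approximations built from the estimates and intervals reported above. This is a strong simplifying assumption that ignores correlation between treatment effects, so the output will not reproduce the reported 70% and 30%. The function names are hypothetical.

```python
import random


def rank_probabilities(draws):
    """draws maps treatment name -> list of draws (larger = greater improvement).
    Returns treatment name -> list of probabilities of being ranked 1st..kth."""
    names = list(draws)
    n = len(draws[names[0]])
    counts = {name: [0] * len(names) for name in names}
    for i in range(n):
        ordered = sorted(names, key=lambda name: draws[name][i], reverse=True)
        for rank, name in enumerate(ordered):
            counts[name][rank] += 1
    return {name: [c / n for c in counts[name]] for name in names}


def normal_draws(rng, estimate, lower, upper, n):
    """Draws from a normal approximation with SE backed out of a 95% interval."""
    se = (upper - lower) / (2 * 1.959964)
    return [rng.gauss(estimate, se) for _ in range(n)]


rng = random.Random(2022)
N_DRAWS = 20_000  # fewer than the 135,000 iterations in the analysis, for speed

# PDQ improvement versus BMT, as reported above; BMT is the reference (0)
draws = {
    "LCIG": normal_draws(rng, 7.83, 5.15, 10.51, N_DRAWS),
    "DBS": normal_draws(rng, 7.24, 5.37, 9.10, N_DRAWS),
    "CSAI": normal_draws(rng, 3.61, 0.55, 6.68, N_DRAWS),
    "BMT": [0.0] * N_DRAWS,
}

probs = rank_probabilities(draws)
for name, p in probs.items():
    print(name, [round(x, 3) for x in p])
```

Each column of the resulting table (one rank across all treatments) sums to 1, as does each row (one treatment across all ranks).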

Table 4 Summary of Parkinson’s disease-related quality of life changes from the 19 studies that reported PDQ scores at 6 months
Fig. 4 Mean (95% CI) Parkinson’s disease-specific quality of life improvement (according to PDQ-39/PDQ-8 score) with the three device-aided therapies versus best medical treatment. The analysis adjusted for baseline age, sex, disease duration, levodopa daily dose, and off-time. BMT best medical therapy, CSAI continuous subcutaneous apomorphine infusion, DBS deep brain stimulation, LCIG levodopa/carbidopa intestinal gel, PDQ Parkinson’s Disease Questionnaire

Fig. 5 Ranking of device-aided therapy and best medical therapy based on improvement in Parkinson’s disease-related quality of life at 6 months

Although the inconsistency model had a lower posterior mean of the residual deviance for PDQ scores and hence a better fit to the data, the difference in DIC between the two models was < 1 (Online Resource Table 2B), and no inconsistency was observed for this parameter.

3.4 Risk of Bias Assessment

Of the four RCTs assessed using the Cochrane Risk of Bias Tool [30], three studies demonstrated a high risk of bias in at least one domain, and one study demonstrated some concerns of bias due to unclear risk of bias in one domain (see Online Resource Figs. 2 and 5). The two non-randomized cohort studies, assessed using the ROBINS-I Tool, demonstrated a low-to-moderate risk of bias (see Online Resource Figs. 3 and 5). One study had a moderate risk of bias due to confounding. The 16 single-arm studies assessed using the NIH Quality Assessment Tool for Before-After (Pre-Post) Studies with No Control Groups, each demonstrated a fair risk of bias due to lack of blinding, low sample size, loss to follow-up, and/or unclear eligibility or selection criteria (see Online Resource Figs. 4 and 5).

4 Discussion

To our knowledge, this is the first study to evaluate the relative effectiveness of all three currently available device-aided therapies using a Bayesian NMA. Results showed that LCIG and DBS were associated with superior improvement in off-time and PD-related QoL compared with CSAI and BMT at 6 months after treatment initiation. While this analysis suggested that reduction in off-time with CSAI was not significant compared with BMT, the pivotal trials of the three device-aided therapies demonstrated that off-time significantly improved from baseline [11,12,13]. A reduction of ≥ 1 h/day of off-time is considered clinically meaningful [50], and this was observed for DBS and LCIG, but not for CSAI. While the reduction in off-time was similar for LCIG and DBS, other aspects of these treatments that impact QoL are also considered by patients, caregivers, and providers when selecting individualized treatment options. A clinically meaningful improvement in PDQ-8 score is − 5.94 and in PDQ-39 is − 4.72 [51]; as with off-time, such improvements were demonstrated with both LCIG and DBS, but not with CSAI. Patient preference is essential to consider, as some patients may be unwilling to undergo invasive brain surgery for DBS or PEG surgery for LCIG, and would prefer the less invasive procedure of CSAI.

The selection of a device-aided therapy for a patient with advanced PD is a complex process that involves multiple specialties as well as the patient and their caregivers; a shared decision approach is, therefore, essential [52, 53]. While treatment decisions need to be individualized, the choice of device-aided therapy can be guided by some general principles based upon the patient’s age, cognitive function, dyskinesia, and frailty [6]. For example, in patients aged > 70 years, DBS may be suitable in a smaller proportion of patients than infusion therapies. Although older patients can undergo brain surgery if cognitive function is preserved and magnetic resonance imaging (MRI) excludes significant atrophy and vascular lesions, LCIG or CSAI may be better and safer options [6, 54]. Additionally, there have been suggestions that discontinuation rates with CSAI are relatively high in the long term due to troublesome adverse events related to the medication (i.e., nausea) or to the route of administration (i.e., subcutaneous infusion-site reactions) [15, 55,56,57]. A clinical situation in which DBS may be preferred over the infusion therapies is when the patient has severe dyskinesia [58]. Despite the guidance offered by these general principles, comparative data are still needed to assess the benefits of each device-aided therapy. In the absence of head-to-head trials, this NMA provides important information to add to the comprehensive evaluation of patients’ clinical status and preference that is warranted to ensure optimal symptom control and QoL. Despite the apparent benefits of LCIG and DBS versus CSAI and BMT shown in this analysis, identification of patients who may benefit more from one device-aided therapy over another depends on individual patient choice as well as hospital resource availability and healthcare professional experience.

Given the magnitude of benefit achieved with all device-aided therapies, the main take-home message is the need to improve the timely referral and identification of patients whose symptoms are poorly controlled by optimized oral treatment. Patients with advanced PD have a higher disease burden in terms of symptoms and negative effects on ADL and QoL [1], and this may also have a negative impact on the caregivers’ burden [59, 60]. Validated selection criteria and easy-to-use tools for identification of advanced PD (e.g., 5-2-1 criteria [61] and MANAGE-PD [62]) may help to improve the timely introduction of device-aided therapies. Research to identify suitable validated clinical indicators is ongoing [1, 5, 61, 63,64,65].

Since the literature search was conducted in 2019, a number of studies have been published. No direct comparisons of device-aided therapies with off-time or QoL as the main endpoints were identified in the last 3 years (a comparison of LCIG and DBS on QoL outcomes is ongoing [66]). Likewise, recently published RCTs of individual device-aided therapies have focused on different aspects of treatment such as dyskinesia [67], night-time treatment and sleep disturbances [68], and management of axial features [69]. A host of single-arm and/or observational studies on these three device-aided therapies have been published and some report off-time or QoL as the main outcomes [70,71,72,73,74,75,76]; however, the results of these studies would have no clear impact on the current analysis. In the future, a follow-up analysis may help refine the results of the current analysis.

The strengths of the approach taken in our analysis include the simultaneous comparison of the three most widely used device-aided therapies, the use of all evidence by the inclusion of single-arm studies, and the development of an analytical framework that could potentially be used to include future treatments. This analysis also has some limitations. Individual patient-level data were extracted from studies of LCIG only, but with the inclusion of several AD studies of LCIG, we do not believe this results in any bias in either direction. In the absence of sufficient randomized evidence to allow for direct comparisons, MAICs were used to simulate RCTs and adjust for between-trial differences in order to reduce potential confounding [21]; this approach has been accepted as an appropriate methodology by decision-making bodies such as the National Institute for Health and Care Excellence (NICE) [22]. The inclusion of single-arm studies and non-randomized studies was deemed necessary as the number of RCTs was limited (three studies for off-time analysis and four studies for QoL analysis), particularly for LCIG and CSAI (one RCT each). Two RCTs were for DBS [13, 37], which may contribute to biases in favor of DBS, especially as the RCTs for DBS had larger patient samples. Furthermore, the DBS RCT outcomes were available at 6 months, while RCT outcomes for LCIG and CSAI were available at 3 months. To maximize the available data and permit similar timepoint comparisons, the Month 3 data were carried forward to Month 6 for the LCIG and CSAI outcomes; however, greater efficacy benefits could have been experienced by patients on LCIG and CSAI if 6-month data were available. Fewer patients were assessed with CSAI (N = 322) than with LCIG (N = 908) and DBS (N = 705), which is reflected in the wider CIs for CSAI in off-time reduction and QoL improvement. However, the point estimates for CSAI fall outside the CIs of both DBS and LCIG, suggesting a statistically inferior efficacy of CSAI to DBS and LCIG.

The studies included in the NMA were heterogeneous, and therefore, there is the potential to introduce confounding factors. The definition of BMT differed between studies, but considering that adjustment and optimization of medications were permitted across studies, BMT is likely to be similar. Furthermore, only a small proportion of studies (three studies in the off-time analysis and four studies in the QoL analysis) included a BMT arm, so this heterogeneity is unlikely to have had a major influence on the findings. While baseline characteristics differed between studies, matching of studies by age, sex, years since PD diagnosis, daily levodopa dose, off-time, and PD-related QoL, and a focus on changes in off-time and QoL rather than their absolute values at 6 months, should have limited confounding effects. Ideally, PD severity would have been a useful additional factor for matching of studies, but this was not commonly reported across studies, and daily levodopa dose and baseline off-time may serve as proxies. Specific aspects of QoL may be important in patient/physician preference for a given device-aided therapy for each individual, but sub-domain scores for PDQ-8/PDQ-39 were not consistently available across the studies included in this analysis and so a more detailed analysis of QoL was not possible. The described limitations of the included studies and the varying levels of risk of bias across the RCTs limited the meaningfulness of conducting a systematic array of sensitivity analyses (such as conducting the analysis with single-arm studies excluded). Therefore, a robust network meta-analysis was performed using all available evidence in the published literature, with the MAIC approach applied to the available IPD studies to minimize confounding factors.

5 Conclusions

In conclusion, given the absence of head-to-head direct comparisons of the three currently available device-aided therapies for advanced PD, this Bayesian network meta-analysis provides comparative data on the clinical and humanistic value of these treatments. Levodopa/carbidopa intestinal gel and DBS demonstrated superior reductions in off-time and improvement in PD-related QoL compared with CSAI and BMT at 6 months after treatment initiation. Understanding the comparative benefits of each treatment provides additional information that can help the patient, caregiver, and provider in the selection of the most appropriate therapy to ensure optimal symptom control and improved QoL. Patients’ treatment preferences must be part of the shared decision approach, and this aspect has also been highlighted by the recent European Guidelines on invasive therapies for PD [20]. Future efforts should focus on the earlier detection of patients who are candidates for device-aided therapy, increasing appropriate referral of these patients, and broadening the availability of these therapies globally for patients with advanced PD, including the potential to increase access to costly treatments for patients in the developing world. Patient preference studies may also inform treatment and reimbursement decision-making.