Background

Vascular anomalies (VAs) represent a large panel of diseases and are classified according to their clinical, biological, radiological, pathological and molecular characteristics by the International Society for the Study of Vascular Anomalies (ISSVA) as vascular tumors (VTs), which might be benign, borderline or malignant, or vascular malformations (VMs) [1]. VTs are characterized by endothelial-cell proliferation or hyperplasia, whereas VMs result from a defect in embryonic vasculogenesis that might be linked to somatic or germline gene mutations. VMs are classified according to the vessels involved (i.e., capillary VMs [port-wine stains], venous VMs, lymphatic VMs or arteriovenous VMs). They are considered “simple” when one type of vessel is involved and combined or syndromic if associated with other malformations [2,3,4]. This classification does not take into account the prevalence of the conditions, which might be common (port-wine stains, infantile hemangiomas) or rare. In Europe, a disease affecting fewer than 1 in 2,000 people is considered a rare disease [5], whereas in the United States, a rare disease is defined as a condition that affects fewer than 200,000 people [6]. Rare VAs are often chronic conditions, and the natural history of most of them is progressive worsening without therapeutic intervention.

Treatment of rare VAs includes different therapeutic modalities (i.e., physiotherapy, interventional radiology [sclerotherapy/embolization], surgery, interventions with devices such as lasers, and drugs [anticoagulants, mammalian target of rapamycin inhibitors], including targeted therapies such as a selective phosphoinositide 3-kinase inhibitor for CLOVES syndrome) [7,8,9,10]. Often, treatments are started in childhood. The management of rare VAs is based on a personalized approach that considers the patient’s goals for treatment and usually requires multidisciplinary consultations. There are no validated guidelines for treatment of rare VAs [11]. Indeed, recommendations are difficult to develop given the broad spectrum of conditions and the insufficient number of prospective clinical studies to prove the efficacy of treatment [7, 9, 12].

Randomized controlled trials (RCTs) provide gold-standard evidence to guide clinical practice. However, trials for rare diseases have recruitment issues and require tailored designs. For rare VAs, designing clinical trials is difficult because several obstacles accumulate: the rarity of the diseases, the heterogeneity of the conditions and the fact that the population of interest is often children, which, beyond recruitment issues, raises specific ethical issues (consent of both parents, acceptance by the child, limited number of samples and invasive exams) [13, 14]. To overcome these methodological and ethical problems, alternative designs have emerged but are still little used [15].

The aim of this study was to investigate the study designs used in therapeutic clinical studies of rare VAs by a systematic literature search.

Methods

Registration and protocol

This study was designed as a systematic methodological literature review. It was registered in the International Prospective Register of Systematic Reviews (PROSPERO: CRD42021232449).

Eligibility criteria

Reports of randomized and non-randomized studies were included if they met the following criteria: were prospective studies of rare superficial VA therapies, dealt with humans (adults and children) and were published in English from 2000 onward. The different groups of VAs included were simple lymphatic malformations, simple venous malformations, slow-flow combined malformations including syndromic forms, arteriovenous malformations and rare vascular tumors. We included ongoing studies (i.e., studies, comparative or not, whose results were not published at the time of the electronic search on January 25, 2021). We excluded case reports/case series reporting fewer than 10 patients, reviews, retrospective studies, animal studies, studies of systemic or common VAs (prevalence greater than 1 in 2,000 people) and non-therapeutic studies. There was no minimum number of patients for clinical trials. To differentiate the prospective cohorts from case series, we followed the definition proposed by Dekkers et al. [16]: “in a cohort study, patients are sampled on the basis of exposure and are followed over time, and the occurrence of outcomes is assessed.” In a case series, patients can be sampled according to both the specific outcome and specific exposure or only a specific outcome. To distinguish cohort studies from case series, we mainly considered how participants were selected and sampled (Is it linked to the outcome? to a specific exposure?) and whether there was a follow-up period during which the outcome was assessed (cohort study criteria) [17]. Two authors reviewed the full text of each study, with blinding, to label it. Discrepancies were resolved by discussion between these two authors. Many studies that could have been initially defined as case series were reclassified as cohort studies.

Search strategy

The electronic search conducted on January 25, 2021, involved the databases PubMed, Embase (on Embase.com), and Cochrane Central Register of Controlled Trials (CENTRAL) and the registers ClinicalTrials.gov and European Union Clinical Trials Register (EUCTR). The search terms used can be found in the supplementary materials and methods file.

Selection process

According to the pre-defined criteria, two authors (AAT and SL) independently selected reports based on titles and abstracts. Any discrepancies were resolved by the senior authors (AM and BG). The same two authors then examined the full texts of the selected reports. They excluded duplicate publications, general reviews, systematic reviews and reports with insufficient information (e.g., full text not accessible). Duplicate publications were detected by using first Zotero software and then Airtable. Duplicates between registers and publications were manually detected and resolved directly in Airtable.

Data collection process

For each selected study, two reviewers independently followed a standard template for data extraction. Any disagreements were resolved by discussion and if necessary by a senior author. Data were extracted to an Airtable spreadsheet (https://airtable.com/) and included publication metrics (name of the first author, journal, publication year, source and article type), study recruitment characteristics (continent and number of centres), study design (study type, randomization, trial design, design justification by the authors, blinding and important changes after trial beginning), study groups (number of groups, experimental treatment and control), primary outcome, characteristics of patients (type of VAs using the ISSVA classification system, age category), planned sample size, number of patients included and funding sources. Intervention types in the experimental group were extracted and classified as follows: parenteral drugs, enteral drugs, topical drugs, interventional radiology (i.e., image-guided minimally invasive treatment such as embolization or percutaneous sclerosis) and physiotherapy. Figure S1 (supplementary material) represents the designs used in comparative studies included in the final analysis.

Data analysis

We did not assess risk of bias in the included studies. The review provided a descriptive analysis of relevant features of eligible research studies, focusing on the designs used. The characteristics of each included clinical trial were summarized in tables using descriptive statistics. Categorical variables are presented as counts and proportions (n, %). Quantitative variables are presented as median and interquartile range (median [Q1-Q3]). Categorical variables were compared with the chi-square test and quantitative variables with the non-parametric Mann-Whitney U test. R software (version 4.2.2) was used for statistical analyses.
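As an illustration of the descriptive and non-parametric statistics named above, the following Python sketch computes a median with Q1-Q3 and a two-sided Mann-Whitney U test. It is not the authors' R code: the function names and the two small samples are hypothetical, and the test uses a plain normal approximation (no continuity or tie correction), which is crude for samples this small.

```python
from math import sqrt
from statistics import NormalDist, quantiles


def median_iqr(values):
    """Return (Q1, median, Q3), interpolating linearly between data points."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return q1, med, q3


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Assumptions of this sketch: no continuity correction and no tie
    correction of the variance.
    """
    # U for sample x: each (x, y) pair counts 1 if x > y and 0.5 if tied.
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    n1, n2 = len(x), len(y)
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mean_u) / sd_u
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_x, p_value


# Hypothetical sample sizes, NOT data extracted in the review.
comparative = [28, 39, 45, 60, 90]
non_comparative = [14, 18, 20, 25, 43]

print(median_iqr(comparative))
print(mann_whitney_u(comparative, non_comparative))
```

For real analyses with small groups, an exact version of the test would usually be preferable to this approximation.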

Results

In the initial database search, we identified 2,046 reports and finally included 97 studies (62 reports and 35 ongoing studies): 25 RCTs, 7 non-randomized comparative studies, 64 prospective cohort studies and 1 case series. Figure 1 presents the flow of articles in the review.

Fig. 1 Flow of articles in the review

Characteristics of the included studies

Table 1 shows the epidemiology and reporting characteristics of the 97 included studies. Twenty studies that could have been initially defined as case series were reclassified as cohort studies [18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37].

Table 1 Description of studies included in the final analysis

Patient characteristics

Most of the included studies involved heterogeneous patients. First, about two thirds of studies included both children and adults [7, 19,20,21, 23,24,25,26,27, 29,30,31,32,33,34, 36, 38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84]. Second, nearly one third were of several diseases [7, 22, 27, 29, 30, 33, 38, 40, 45, 50, 52, 54, 55, 57, 58, 60, 64, 65, 67, 68, 74, 77, 79, 84,85,86]. Most studies were of lymphatic malformations (LMs) and/or venous malformations [7, 18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34, 36,37,38,39,40,41,42,43,44,45,46, 48, 50,51,52,53,54,55,56,57,58,59,60,61,62, 64,65,66,67,68, 71,72,73,74,75,76,77,78,79,80, 82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104], followed by rare VTs [7, 27, 29, 35, 40, 52, 55, 60, 63, 64, 67, 68, 105,106,107,108,109,110], arteriovenous malformations [29, 38, 45, 50, 52, 55, 58, 67,68,69,70, 84, 111, 112], and slow-flow combined malformations or syndromic malformations [7, 22, 38, 45, 47, 49, 50, 52, 54, 57, 58, 64, 65, 68, 74, 77, 79, 81, 85, 86].

Experimental treatments

Overall, whatever the study type (comparative or non-comparative, randomized or non-randomized), the most evaluated experimental treatment was interventional radiology [18, 19, 21,22,23,24,25,26,27, 29, 30, 32,33,34, 36,37,38,39,40,41,42, 44,45,46, 50,51,52,53,54,55,56,57, 59,60,61,62, 66, 67, 72, 73, 75, 76, 78,79,80, 84, 87,88,89,90,91,92,93,94, 97, 98, 102, 104, 111,112,113] followed by enteral drugs [7, 20, 31, 47, 49, 63,64,65, 68,69,70, 74, 77, 81, 82, 85, 96, 99,100,101, 107,108,109,110]. A total of 14 studies offered a combination of treatments in the experimental arm. For 10, it was a combination of two interventional radiology treatments [19, 32, 36, 39, 42, 44, 56, 59, 83, 91]. Three studies investigated the combination of two enteral drugs [47, 63, 100] and one study combined sclerotherapy and surgery [66]. Thus, many included studies used interventional radiology to treat LMs [19, 21,22,23,24,25,26,27, 29, 30, 33, 37,38,39,40, 42, 44, 45, 50,51,52, 54, 55, 57, 60, 75, 84, 97, 98, 102, 104, 113] (32 studies) and venous malformations [18, 22, 27, 29, 30, 32,33,34, 36, 38, 41, 45, 46, 50, 52,53,54,55,56,57, 59, 61, 62, 66, 67, 72, 73, 76, 78,79,80, 84, 87,88,89,90,91,92,93,94] (40 studies).

Planned and achieved sample size

Most studies (n = 57, 58.8%) did not report a planned sample size [18, 19, 21,22,23,24,25,26,27,28,29,30, 32,33,34,35,36,37, 39,40,41, 44,45,46, 51,52,53,54,55,56,57, 60,61,62, 66, 67, 72, 73, 75, 76, 78,79,80, 83, 84, 89, 91,92,93,94, 96,97,98, 100, 102, 104, 113]. A higher proportion of comparative studies [43, 48, 49, 59, 63, 65, 71, 74, 81, 85, 86, 88, 90, 99, 103, 105, 107, 109, 110] than non-comparative studies [7, 20, 31, 38, 42, 47, 50, 58, 64, 68,69,70, 77, 82, 87, 101, 106, 108, 111, 112] had a planned sample size (n = 19, 59.4% vs. n = 21, 32.3%, p = 0.011). Among the studies that reported one, the median (Q1-Q3) planned sample size was 38 (25–61) patients per study. In the 62 reports, the median (Q1-Q3) number of included patients was 28 (15–44). Comparative studies were significantly larger than non-comparative studies (median [Q1-Q3] sample size 39 [28–90] vs. 20 [14–43], p = 0.008). Table 1 presents the intervention and sample size description of studies included in the final analysis.
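The comparison of proportions reported above can be checked with a short Python sketch. It applies a Pearson chi-square test to the 2x2 table built from the counts in the text (19 of 32 comparative and 21 of 65 non-comparative studies with a planned sample size). The function name is hypothetical, and the absence of a continuity correction is an assumption, since the text does not state which variant was used.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the chi-square upper tail is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Rows: comparative, non-comparative; columns: planned sample size yes, no.
stat, p = chi_square_2x2(19, 32 - 19, 21, 65 - 21)
print(f"{19 / 32:.1%} vs. {21 / 65:.1%}, chi2 = {stat:.2f}, p = {p:.3f}")
```

Under that assumption the sketch gives a p value of about 0.011, in line with the reported one.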

Methodology of the studies

Comparative study design (32 studies)

Many of the comparative studies (n = 21, 65.6%) used a classical design with parallel groups [41, 44, 57, 59, 60, 63, 67, 72, 80, 83, 86, 88,89,90,91, 97, 99, 105, 107, 109, 110]. Table 2 presents the methodological data for comparative studies included in the final analysis. The 11 other studies used different designs: cross-over [48, 65], randomized placebo phase [43, 85], delayed-start [49, 103], within-person [71], challenge–dechallenge–rechallenge [74], and observational run-in period [81] or used a historical control group [66, 73]. Table 3 reports the characteristics of these studies. Two studies had two stages: a pilot phase with a small sample size followed by a larger comparative trial [43, 88].

Table 2 Methodological data for comparative studies included in the final analysis
Table 3 Characteristics of 11 comparative studies with a non-parallel design

Among the 32 comparative studies, 14 (43.8%) had some form of blinding [43, 48, 49, 59, 67, 71, 80, 85, 86, 88, 91, 99, 103, 107]. These 14 studies were RCTs, representing 56% of RCTs included in the review. Blinded actors were outcome assessors (4 studies, 12.5%) [43, 85, 103, 107] or both participants and care providers (4 studies, 12.5%) [48, 71, 80, 86]. Three studies (9.4%) [49, 59, 99] exhibited triple blinding (participants, care provider and outcome assessor).

Design justification by the authors

For 6 of the 32 (18.8%) comparative studies, the authors justified their design in their report [43, 71, 73, 81, 85, 103]. Of these 6 studies, 4 were randomized [43, 71, 85, 103] and 2 were not [73, 81]. The first randomized study used a delayed-start design with an observational period of 6 months based on data suggesting that after this period, spontaneous regression was unlikely [103]. For the following 3 randomized studies [43, 71, 85], the authors explained that the population with the pathologies of interest was too rare to be able to set up a classic parallel-group trial. One used a within-person design because it allowed for reducing the number of patients to be included as well as inter-observation variability and all patients to receive the experimental treatment [71]. The other 2 randomized studies had a randomized placebo-phase design, which allowed for reducing the number of participants to include and increased acceptability (because every participant received the experimental intervention) and therefore ensured feasibility [43, 85]. For the non-randomized studies, the first used historical controls because randomization was judged too difficult to implement with the varying nature of VMs [73]. The second used an observational run-in period as a control and not a placebo because the experimental treatment was increasingly available and this would have compromised recruitment [81].

Important changes in the protocol after trial beginning

After the beginning of the clinical trial, 10 (10.3%) studies made important changes to their protocol [7, 47, 63,64,65, 81, 87, 99, 110, 114]. Changes concerned the addition, removal or change of the primary or secondary outcomes (6 studies) [7, 47, 64, 65, 81, 87], a decrease in estimated enrollment (3 studies) [63, 99, 114], a modification of the intervention (1 study) [110] and a modification of the inclusion criteria (1 study) [47].

Discussion

This methodological systematic literature search identified 62 reports and 35 ongoing studies representing 32 comparative and 65 non-comparative therapeutic studies of rare superficial VAs. Studies were frequently non-comparative cohorts, assessing interventional radiology in venous malformations or LMs. The proportion of experimental (i.e., randomized) studies was low. Our review showed that some authors used randomized designs distinct from the classical two-parallel group design, such as cross-over, within-person, randomized placebo-phase and delayed-start. We also noted in our review some designs of non-randomized studies that may be interesting for studying rare diseases such as the use of a historical control group or the challenge-dechallenge-rechallenge design [115].

This latter result is consistent with previously published results: in a systematic review of sirolimus treatment for LMs, among 20 studies, only one was an RCT, versus 19 retrospective case series or case reports [116]. Another review evaluating the efficacy of sirolimus for treating vascular abnormalities identified mostly single case reports (47 studies) and case series (22 studies) and very few prospective observational studies (n = 2) or RCTs (n = 2) [9]. Vascular anomalies are heterogeneous, which challenges the randomization (Is stratification desirable? possible?), outcome selection (Is there a relevant common outcome?) and results interpretation [9, 73, 117].

Thus, using a classical two parallel-group RCT presents a conundrum, and therefore, alternative designs involving intra-patient comparison are of high interest for clinical trials of rare pathologies because they can avoid some of the difficulties mentioned below [118,119,120,121,122]. The rarity of these pathologies is also a difficulty [116, 123]. Thus, we observed a lower number of included patients as compared with the planned sample size, a result already acknowledged for rare diseases [124, 125]. The most glaring example was one study that compared vincristine and sirolimus for treating high-risk VTs [63]. The authors had planned to enroll 50 patients, but the study had to stop because, owing to the rarity of the pathology, only 4 patients had been recruited. Our observations seem consistent with the narrative review of Neto et al., which aimed to summarize Cochrane systematic review evidence on treatments for congenital vascular anomalies and hemangiomas. They encountered several difficulties, including the limited number of existing systematic reviews, the limited number of participants in the studies of these pathologies and the heterogeneity of the participants [126]. To carry out therapeutic clinical trials on rare superficial VAs, the results of our review suggest selecting a design that limits the required sample size as much as possible. To keep the required sample size as low as possible, investigators can promote intra-patient comparisons, thus increasing power. This option also allows for better accuracy because of the absence of inter-patient variability [127, 128]. In our review, we found three such designs: the cross-over design [129,130,131], the randomized placebo-phase design [132, 133] and the within-person design [134].
Moreover, the main advantage of this last design is that it eliminates confounding factors between the arms of the trial because the treatments to be compared are given at the same time and not successively as in the cross-over design or the randomized placebo-phase design. It is particularly suitable for dermatology practice (e.g., VAs). However, these intra-patient designs are more sensitive to dropouts and missing data because each participant is their own control; therefore, a dropout potentially affects both groups.
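To illustrate why intra-patient comparisons can lower the required sample size, the Python sketch below compares textbook normal-approximation formulas for a continuous outcome in a two parallel-group design and in a paired (within-person or cross-over type) design. It is only a sketch: the function names, effect size, standard deviation and within-patient correlation are hypothetical, and period or carry-over effects are ignored.

```python
from math import ceil
from statistics import NormalDist


def _z_sum(alpha, power):
    """Sum of the normal quantiles for a two-sided test at the given power."""
    norm = NormalDist()
    return norm.inv_cdf(1 - alpha / 2) + norm.inv_cdf(power)


def n_parallel_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Patients per arm to detect a mean difference delta (common SD sigma)."""
    return ceil(2 * (_z_sum(alpha, power) * sigma / delta) ** 2)


def n_paired(delta, sigma, rho, alpha=0.05, power=0.80):
    """Patients needed when each patient is their own control.

    rho is the assumed correlation between the two measurements of a patient;
    the variance of the within-patient difference is 2 * sigma**2 * (1 - rho).
    """
    var_diff = 2 * sigma ** 2 * (1 - rho)
    return ceil(_z_sum(alpha, power) ** 2 * var_diff / delta ** 2)


# Hypothetical planning values, not taken from any included study.
total_parallel = 2 * n_parallel_per_group(delta=0.5, sigma=1.0)
total_paired = n_paired(delta=0.5, sigma=1.0, rho=0.5)
print(total_parallel, total_paired)
```

With these hypothetical values the paired design needs roughly a quarter of the patients of the parallel design, and the gain grows as the within-patient correlation increases.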

Another difficulty is using a placebo. First, a sham intervention for surgery or interventional radiology is difficult [135], in particular for ethical reasons, because this sham intervention can expose participants to excessive risk. In general, invasive procedures such as surgeries should normally be tested against standard medical treatment or no treatment [136]. Then, it is important to limit the time spent on placebo, which facilitates recruitment. For VAs, given that there are standard treatments, patients and physicians would be reluctant to have a real placebo-only control group. Therefore, there are designs that can limit this period under placebo, such as the randomized placebo-phase design. Nevertheless, to ensure the validity of the results, an effective placebo-phase duration must be established: short enough to avoid changes over time and long enough for valid measurements [15]. However, there is the risk of participants dropping out if the placebo-controlled phase is too long. Second, for rare diseases with a potentially shortened lifespan, parents may be reluctant to have their children receive a placebo [81, 137]. To overcome this, investigators can maximize on-treatment participants by giving each patient the experimental treatment at some point (this is the case in the cross-over design) or by ensuring that all patients end the study being exposed to the experimental treatment, which offers the opportunity to pursue this treatment (if possible) outside of the study context (this is the case in the delayed-start design [138] and the randomized placebo-phase design). Either way, maximizing on-treatment participants facilitates recruitment and increases acceptance and accrual.
These types of design can be of interest if conducting a two-parallel group RCT would be difficult or unacceptable, for instance when assessing an experimental treatment for an incurable or fatal disease or when the disease affects the more fragile pediatric population (which is the case with VAs).

For investigators, using an observational run-in period allows for collecting useful clinical data, especially in the case of rare diseases such as VAs, screening out ineligible or non-compliant participants, and establishing baseline observations. However, it also has disadvantages, such as affecting the external validity of the study by excluding patients and affecting the internal validity by exaggerating the intention-to-treat effect (e.g., if a run-in period is used to exclude participants who do not tolerate the treatment, then potentially there will be only a large number of good responders in the following phase, which can give a too optimistic view of the treatment effect) [139].

It is also possible to integrate an “internal” pilot study (pilot phase) into a clinical trial that allows for integrating the pilot participants into the definitive study and therefore not “exhausting” the stock of patients eligible for the study. This is interesting in the case of rare diseases such as VAs. These pilot phases built into the trial do not require additional time or funds [15].

Our review highlighted the use of a historical control. Despite certain advantages such as avoiding the problem of recruiting patients, especially for studying rare diseases such as VAs, this design has several disadvantages: confounding effects [140] (baseline patient characteristic differences between treatment arms can prevent highlighting the treatment effect), selection bias [141, 142] (whereby patients receiving the experimental treatment are selected from a pool of “good responders”, thus resulting in an overestimation of the treatment effect), and performance and detection bias [143, 144] (because there is no blinding). To overcome these, it would be interesting to use advanced methods to manage confounding in observational studies, such as stratifying patients on the estimated propensity scores during the analysis of observational data [142]. The Bayesian statistical approach could also be of interest, combined with any design, although we did not find it in our review. This approach uses data from previous studies to form a prior probability distribution for the treatment effect, which is combined with current trial data to give a posterior distribution, from which conclusions can be drawn [13].
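The prior-to-posterior logic of the Bayesian approach can be sketched with the simplest conjugate case, a response rate modeled with a Beta prior and binomial data. This Python example is purely illustrative: the function names and all counts are hypothetical, and an actual trial would need to decide how much weight to give the historical data.

```python
def update_beta(prior_alpha, prior_beta, responders, non_responders):
    """Conjugate update: a Beta prior plus binomial data gives a Beta posterior."""
    return prior_alpha + responders, prior_beta + non_responders


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical earlier study: 6 responders out of 10 patients,
# starting from a uniform Beta(1, 1) prior.
prior = update_beta(1, 1, responders=6, non_responders=4)

# Hypothetical current trial: 9 responders out of 12 patients.
posterior = update_beta(*prior, responders=9, non_responders=3)

print(prior, posterior, round(beta_mean(*posterior), 3))
```

Here the earlier study contributes as much weight per patient as the current trial; down-weighting the historical data is a common alternative when the two populations may differ.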

This systematic review has several limitations. First, it certainly does not present all the studies published on the subject given the use of several filters: we included only studies published in the English language and since the year 2000. Second, most authors did not justify their choice of design in their publication, which limited our collection of information on this subject. Third, registry entries for ongoing studies contain more missing data than do published articles. Finally, our method did not examine other methodological issues such as sequence generation and allocation concealment, which are issues of importance in randomized trials but probably of lower importance than blinding [145, 146].

Conclusions

Comparative studies are mandatory for assessing treatments or interventions, but RCTs are rare in these diseases. Classical two parallel-group designs are of limited use in rare pediatric diseases, notably with large between-patient variability. New designs, more adapted to this specific medical context, are emerging and can overcome the limitations of testing treatments in parallel groups. Their use is necessary for conducting trials with a high level of evidence.