Key Points

Pulsed dye laser is recommended for treatment-naive capillary malformations of the head and neck region, but demonstrates greater hyperpigmentation rates compared with other therapies.

The effectiveness of surgery and medical camouflage for capillary malformations remains unclear.

Because of substantial heterogeneity in outcomes, we advocate for the development of uniform outcomes and validated outcome measurement instruments in prospective studies evaluating treatments for capillary malformations.

1 Introduction

Capillary malformations (CMs, also known as port-wine stains) are characterized by hyperdilated capillaries and post-capillary venules of the dermis [1]. These congenital lesions occur in 0.04–2.1% of newborns and have been associated with somatic mosaic and germline mutations in the GNAQ, GNA11, PIK3CA, RASA1, and EPHB4 genes [2,3,4,5,6,7]. They range from flat pink macules to more hypertrophic red-to-purple lesions [8].

In addition to causing bleeding from vascular blebs, CMs may greatly affect patients’ quality of life through their disfiguring appearance, particularly because they often occur in the head and neck [9,10,11,12,13,14,15]. An important reason for treating CMs is therefore to inhibit progression to hypertrophy and to lessen functional complications.

Yet, despite technological evolution, the effectiveness of modern laser and light therapies, including the pulsed dye laser (PDL) as the current treatment of choice, has not improved over recent decades [16]. As CM lesions may recur post-therapy, patients are left with a desire for more effective treatment options. Other recognized therapies include cosmetic camouflage, surgical excision, and medical tattooing [17,18,19]. However, the latter treatment options are not routinely proposed to patients in the initial treatment decision-making process.

To adequately participate in this decision-making process, patients with CMs (and/or their parents) need to be well informed about the expected outcomes of all possible treatment options. Essential information seems to be lacking as, to date, there has not been a systematic comparison of the effectiveness of abovementioned therapies. More importantly, there is currently no international consensus on treatment guidelines for CMs.

The aim of this systematic review is to systematically compare the evidence on available therapeutic strategies for treatment-naive (untreated) CMs of the head and neck region to strengthen the treatment decision-making process.

2 Materials and Methods

This systematic review was conducted according to the guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement and the Meta-analyses of Observational Studies in Epidemiology (MOOSE) guidelines [20, 21]. The research protocol was registered in the PROSPERO database (CRD42020199445).

2.1 Literature Search

PubMed (MEDLINE), EMBASE, and Cochrane Central Register of Controlled Trials (CENTRAL) databases were searched for published studies reporting on therapeutic effectiveness in patients with CMs from inception to 16 December, 2020. The search strategy was built with the support of a medical librarian and included the following MeSH (Medical Subject Heading) terms: ‘Port-Wine Stain,’ ‘Lasers,’ ‘Laser therapy,’ ‘Surgical procedures, operative,’ ‘Photochemotherapy,’ and ‘Tattooing’. Relevant keywords and synonyms describing the condition and the different types of therapies, including camouflage, were added (see Table 1 of the Electronic Supplementary Material [ESM] for the search strategy). Reference lists of all included articles were screened for additional relevant articles. We did not contact authors for additional data.

2.2 Study Selection

The retrieved articles were entered and deduplicated in EndNote X9. Study selection was performed independently by two researchers (GC and GL). A third reviewer (CH) was consulted to resolve any disagreement.

2.2.1 Inclusion Criteria

Original studies of any design reporting on the therapeutic effectiveness of laser or light therapy, photodynamic therapy (PDT), camouflage, surgery, or medical tattooing of previously untreated CMs in the head or neck region were included. For laser or light therapy and PDT, we included only studies published since the year 2000, because around that time alternative light sources became available with the introduction of the neodymium-doped yttrium aluminum garnet/potassium titanyl phosphate (Nd:YAG/KTP) laser and intense pulsed light (IPL) [16]. Studies with no documentation of previous therapies were included. Furthermore, no publication language restriction was applied. Publications in languages other than English were translated with Google Translate.

2.2.2 Exclusion Criteria

Case reports with fewer than five patients were excluded from this review to strengthen the generalizability of results. Studies were excluded if the article did not report on our pre-specified outcome measures or if combinations of therapies were applied. All inclusion and exclusion criteria are shown in Table 1.

Table 1 Study selection criteria

2.3 Pre-Specified Outcome Measures

The following outcome measures of interest were predefined: effectiveness as the primary outcome, described as any quantitative improvement of the CM, e.g., Physician Global Assessment or Patient Global Assessment, clearance or improvement reported as percentage ranges, change in L*a*b* color coordinate system values [22], or changes in the erythema index. Pre-specified secondary outcomes included patient satisfaction and complications related to the procedure: hyperpigmentation, hypopigmentation, scarring, and pain. Other reported complications are also described, but not considered for meta-analyses.

2.4 Data Extraction

The two reviewers (GC and GL) independently extracted the following study characteristics and patient data from each included study, using a predefined digital data extraction form: authors, publication year, study design, number of patients, mean age, Fitzpatrick skin type, characteristics of the CM (color, size, hypertrophy), pretreatment, type of treatment and corresponding characteristics, number of treatment sessions, treatment interval, data on predefined outcome measures as previously described, and follow-up duration.

2.5 Risk of Bias Assessment

Two reviewers independently assessed the risk of bias (GC and GL). For non-randomized studies, the Dictionary of the Effective Public Health Practice Project tool was used [23]. Consistent with the Dictionary Component Rating Scale, the tool domains were rated as ‘strong,’ ‘moderate,’ and ‘weak’. Randomized clinical trials were evaluated by means of the critical appraisal checklist issued by the Dutch Cochrane Collaboration [24]. The Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach was used to rate the quality of the evidence for the pre-specified outcomes that could be pooled [25].

2.6 Data Analysis

Extracted data were presented whenever possible as means with standard deviations, medians with interquartile ranges, or percentages with 95% confidence intervals. Meta-analyses of proportions or means using a random-effects model were performed if studies showed similar patient and treatment characteristics. If statistical heterogeneity was high (I² > 75%) or fewer than five studies were available for a specific outcome, no meta-analyses would be conducted, but study results would be explored and summarized as forest plots without totals.
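The pooling rule described above can be illustrated with a minimal sketch. The text specifies only a random-effects model, the I² > 75% threshold, and the minimum of five studies; the logit transformation, the 0.5 continuity correction, the DerSimonian-Laird estimator of τ², and the function name `pool_proportions` are assumptions made for illustration and are not taken from the review's actual analysis code.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def pool_proportions(events, totals, i2_limit=75.0, min_studies=5):
    """Random-effects pooling of proportions (assumed: logit scale, DerSimonian-Laird).

    Returns a dict with k, I2 (in %), tau2 and 'pooled'. 'pooled' is a tuple
    (estimate, lower, upper) with a 95% CI, or None when fewer than
    `min_studies` studies are available or I2 exceeds `i2_limit`.
    """
    k = len(events)
    y, v = [], []
    for e, n in zip(events, totals):
        # Assumption: 0.5 continuity correction applied to every study
        e_c, n_c = e + 0.5, n + 1.0
        y.append(logit(e_c / n_c))
        v.append(1.0 / e_c + 1.0 / (n_c - e_c))

    # Fixed-effect weights, Cochran's Q, and the DerSimonian-Laird tau^2
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = k - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    result = {"k": k, "I2": i2, "tau2": tau2, "pooled": None}
    if k < min_studies or i2 > i2_limit:
        return result  # summarize without a pooled total

    # Random-effects weights include the between-study variance
    w_re = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    result["pooled"] = (
        inv_logit(mu),
        inv_logit(mu - 1.96 * se),
        inv_logit(mu + 1.96 * se),
    )
    return result
```

Calling `pool_proportions(events, totals)` with per-study counts of patients reaching the outcome and per-study sample sizes returns a pooled estimate only when both conditions for pooling are met; otherwise the heterogeneity statistics are still available for a forest plot without totals.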

To enable meta-analyses, the different outcome measures used for improvement of the CM or investigator/physician/patient global assessment scores were converted into a dichotomous scale: ‘poor/moderate/good’ (< 75% improvement) and ‘excellent’ (≥ 75% improvement). This could only be done if studies described their results as quartiles of percentage lightening (i.e., 0–25%, 25–50%, 50–75%, and 75–100%).

Studies that used other percentage ranges that could be converted to the aforementioned categorical scales were also included in the meta-analyses. Differences are presented as risk differences with their 95% confidence intervals. Results from studies that reported effects qualitatively were assigned to the ‘poor/moderate/good’ (< 75% improvement) group when the following terms were used: poor, fair, moderate, partial, slight improvement, a bit better, quite a bit better, satisfactory, good, and a lot better. Likewise, results described with terms like very good, very satisfying, excellent, or clear were categorized in the ‘excellent’ (≥ 75%) group.
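The conversion rules above can be sketched as a small lookup, shown here only as an illustration. The term lists are copied from the text; the function names, the handling of capitalization, and the choice to return `None` for ranges that straddle the 75% cutoff (and so cannot be converted) are assumptions, not the authors' documented procedure.

```python
# Terms taken from the text; matching is case-insensitive (assumption)
EXCELLENT_TERMS = {"very good", "very satisfying", "excellent", "clear"}
LOWER_TERMS = {
    "poor", "fair", "moderate", "partial", "slight improvement",
    "a bit better", "quite a bit better", "satisfactory", "good",
    "a lot better",
}


def dichotomize_range(lower_pct, upper_pct, cutoff=75.0):
    """Map a reported improvement range (in %) to the dichotomous scale.

    Returns 'excellent' when the whole range lies at or above the cutoff,
    'poor/moderate/good' when it lies at or below it, and None when the
    range straddles the cutoff and cannot be converted.
    """
    if lower_pct > upper_pct:
        raise ValueError("lower bound exceeds upper bound")
    if lower_pct >= cutoff:
        return "excellent"
    if upper_pct <= cutoff:
        return "poor/moderate/good"
    return None


def dichotomize_term(term):
    """Map a qualitative descriptor to the dichotomous scale, or None if unknown."""
    key = term.strip().lower()
    if key in EXCELLENT_TERMS:
        return "excellent"
    if key in LOWER_TERMS:
        return "poor/moderate/good"
    return None
```

For example, the quartile 75–100% maps to ‘excellent’ and 50–75% maps to ‘poor/moderate/good’, whereas a hypothetical range such as 60–90% is left unconverted.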

All other pre-specified outcomes were dichotomized as ‘yes’ or ‘no’. Data from studies that could not be pooled were displayed separately and explored in a descriptive manner. Meta-analyses were performed using RStudio Version 1.3.1056 (RStudio PBC, Boston, MA, USA).
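For dichotomized outcomes, a risk difference with its 95% confidence interval can be computed from the event counts in two groups. The sketch below uses the simple Wald interval, which is an assumption; the text does not state which interval method was used, and the function name `risk_difference` is hypothetical.

```python
import math


def risk_difference(events_a, total_a, events_b, total_b, z=1.96):
    """Risk difference (group A minus group B) with a Wald confidence interval.

    Returns (rd, lower, upper). The Wald interval is an illustrative choice
    and can be inaccurate for small samples or proportions near 0 or 1.
    """
    if total_a <= 0 or total_b <= 0:
        raise ValueError("group sizes must be positive")
    p_a = events_a / total_a
    p_b = events_b / total_b
    rd = p_a - p_b
    se = math.sqrt(p_a * (1.0 - p_a) / total_a + p_b * (1.0 - p_b) / total_b)
    return rd, rd - z * se, rd + z * se
```

With hypothetical counts of 30 events in 100 patients versus 10 events in 100 patients, the function returns a risk difference of 0.20 with an interval of roughly 0.09 to 0.31.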

3 Results

3.1 Included Studies

The search generated a total of 4200 hits. After eliminating duplicates, 2894 studies were screened based on titles and abstracts. Next, 318 articles were screened in full text and 51 of these studies met the inclusion criteria [26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76]. Details about study selection are described in Fig. 1 of the ESM.

Fig. 1

a–f Effectiveness as the proportion of patients who reached ≥75% clearance. a Pooled estimated effectiveness for the pulsed dye laser (PDL) of studies with three to eight treatment sessions, Fitzpatrick skin types I–IV, and age >1 year. b–f Unpooled (range of) effectiveness of neodymium-doped yttrium aluminum garnet (Nd:YAG) laser, intense pulsed light (IPL), photodynamic therapy, other laser and light modalities, and medical tattooing. τ² dispersion, CI confidence interval, I² heterogeneity, p probability value, ¹532 nanometer, ²1064 nanometer

3.2 Study Characteristics

Of the 51 included studies, 50 studies adequately described the number of included patients with CMs of the head and neck region and comprised a total of 3068 patients. From one study, only the number of treated CM lesions of the head and neck region could be derived [27]. Out of the 51 studies, 48 were prospective or retrospective cohort studies and three were randomized clinical trials. The study characteristics per treatment modality are shown in Table 2 of the ESM. The number of participants per study ranged from 5 to 306 patients, with mean ages from 3 months to 39.5 years. Most patients had Fitzpatrick skin types I–IV and CM lesion colors varied from pale pink to dark red and purple. The following treatments were encountered: PDL, Nd:YAG laser, KTP laser, KTP+ laser, CO2 laser, 577-nm (yellow) laser, dual sequential wavelength laser, IPL, PDT, and medical tattooing. Photodynamic therapy was studied alone or compared to other laser modalities. No studies on surgery or cosmetic camouflage were included, as they did not meet the eligibility criteria (i.e., largely pretreated patients, effectiveness not quantitatively described, mix of vascular malformations) or lacked adequate outcome reporting. Treatment characteristics varied greatly between and within studies. For example, for PDT, two different types of photosensitizers were used: hematoporphyrin monomethyl ether or PsD-007. Furthermore, the total number of treatments (range 1–15), treatment interval (a few weeks to several months), and follow-up (range 4 weeks–5 years) varied substantially.

3.3 Risk of Bias Assessment

The overall methodological quality of the included non-randomized studies (48 out of 51 studies) was poor, mainly owing to the study design, no correction for confounding variables (e.g., age, sex, skin type, CM characteristics), and use of non-validated and non-reliable outcome measurement instruments for CMs. The three randomized clinical trials contained blinding of outcome assessments, but displayed potential risks of selection bias and two randomized clinical trials lacked information on blinding of patients and physicians. The risk of bias assessment of the included studies is summarized in Fig. 2 of the ESM.

3.4 Treatment Outcomes

Table 2 of the ESM shows the outcome measures the studies intended to report. The results of the predefined outcomes of the individual studies are summarized per treatment modality in Figs. 1, 2 and 3 and in Table 3 of the ESM. Only a limited number of studies reported on our predefined outcomes and because of the large heterogeneity in outcomes, pooling of most outcomes was impossible.

3.4.1 PDL

Most studies addressed PDL, with 1697 patients involved in 28 studies, and 674 CM lesions in one study. The pooled estimate of the proportion of patients reaching a ≥75% clearance was 43% (95% confidence interval 24–64%; I² = 55%) after three to eight treatment sessions (Fig. 1a). Overall, more ‘excellent’ clearance scores (≥ 75% clearance) were achieved in infants, with a range of 63–90% in three studies (89 patients in total) after a mean of three to nine treatment sessions [31, 34, 39]. A significant correlation was found between objective and subjective treatment outcomes in the study by Fallahi et al., in which the mean Investigator Global Assessment score was 64.33 ± 23.67% and the mean Patient Global Assessment score was 65 ± 20.44% (correlation coefficient: 0.85, p < 0.001) after four treatment sessions [38]. Treatment outcomes of the remaining studies examining PDL were reported as mean clearance or color improvement scores, L*a*b* color coordinate system values, or Visual Analogue Scale (VAS) scores. Hyperpigmentation was the most frequent adverse event, with incidences between 0 and 41% (Fig. 2a). One study, involving ten patients with Fitzpatrick skin type V, reported a 20% incidence of scarring after PDL therapy with a follow-up duration of 49 months (Fig. 3a) [63]. In general, patients were satisfied with treatment outcomes [43, 55].

Fig. 2

a–e Predefined adverse events of hyperpigmentation and hypopigmentation per treatment modality. a Pulsed dye laser (PDL); b neodymium-doped yttrium aluminum garnet (Nd:YAG) laser; c intense pulsed light (IPL); d photodynamic therapy (PDT); and e other laser and light modalities. No adverse events were reported for medical tattooing. CI confidence interval, CO2 carbon dioxide, DSWL dual sequential wavelength, HMME hematoporphyrin monomethyl ether, KTP potassium titanyl phosphate, PsD-007 second-generation photosensitizer, ¹532 nanometer, ²1064 nanometer

Fig. 3

a–e Predefined adverse events of scarring and pain per treatment modality. a Pulsed dye laser (PDL); b neodymium-doped yttrium aluminum garnet (Nd:YAG) laser; c intense pulsed light (IPL); d photodynamic therapy (PDT); and e other laser and light modalities. No adverse events were reported for medical tattooing. CI confidence interval, CO2 carbon dioxide, DSWL dual sequential wavelength, HMME hematoporphyrin monomethyl ether, KTP potassium titanyl phosphate, PsD-007 second-generation photosensitizer, ¹532 nanometer, ²1064 nanometer

3.4.2 Nd:YAG Laser

Eight studies evaluated Nd:YAG laser treatment in 304 patients. With a 532-nm Nd:YAG laser, 9–75% of the patients reached a ≥75% improvement after 3–12 treatment sessions. The 1064-nm Nd:YAG laser was less effective, with 9–30% of the patients reaching a ≥75% improvement (Fig. 1b), and patients were less satisfied compared with the 532-nm laser (satisfaction scores 7.6 [standard deviation = 2.3] vs 3.3 [standard deviation = 0.8], p < 0.001, respectively) [29]. Overall, low adverse event rates were reported (Figs. 2b and 3b). One study reported a hyperpigmentation rate of 65%, but this was only transient [37]. Scarring occurred in 0–9% of the patients.

3.4.3 IPL

IPL was examined in only three studies, comprising 71 patients. Between 14% and 30% of adult patients reached ‘excellent’ clearance after a mean of five treatment sessions (Fig. 1c). Adverse events were reported in the study by Özdemir et al., who observed permanent hypopigmentation (n = 1) and hypertrophic scarring (n = 2) in adult patients with a follow-up between 6 and 13 months (Figs. 2c and 3c) [54]. Wang et al. reported pain in 100% of the patients [66] (Fig. 3c).

3.4.4 PDT

Seven studies, comprising 865 patients, examined PDT. In two prospective cohort studies on PDT with a light-emitting diode source and hematoporphyrin monomethyl ether, between 6 and 31% of the patients achieved almost complete clearance (> 90%) after one to two treatment sessions [51, 68]. Overall, between 6 and 88% of the patients reached a ≥ 75% improvement after one to four treatment sessions with different photosensitizers and light sources (Fig. 1d). Adverse events were infrequently reported. Yet, pain occurred in 100% of the patients in two studies, but pain intensity was not reported (Fig. 3d) [51, 52]. Furthermore, scarring was present in 3–4% of the patients in two studies [72, 75].

3.4.5 Other Laser and Light Modalities

Five studies (138 patients) using other laser and light modalities were included: the KTP+/KTP laser, the CO2 laser, a 577-nm (pro-yellow) laser, and a dual sequential wavelength laser. The CO2 laser was applied for hypertrophic CMs with nodules. These lasers showed inferior results to the PDL, as only 10–27% of the patients achieved ‘excellent’ (≥ 75%) clearance scores (Fig. 1e) [28, 35, 53]. Hyperpigmentation and scarring were most common after the KTP+ and KTP laser (100% and 20%, respectively) (Figs. 2e and 3e).

3.4.6 Medical Tattooing

Medical tattooing was investigated in only three studies, including 65 patients. Between three and ten treatment sessions were needed and follow-up duration was up to 3.1 years. According to a patient/parent assessment panel in the study by Grabb et al., a ≥ 75% improvement was attained in two out of 19 patients (11%) (Fig. 1f) [40]. Van der Velden et al. reported that all patients achieved good results without loss of pigment color with a mean follow-up of 19.8 months [65]. None of the individual studies reported on our predefined secondary outcomes.

3.5 Overall Quality of the Evidence (GRADE)

The overall quality of the evidence (GRADE) could only be determined for the PDL regarding effectiveness and was rated as very low (Table 4 of the ESM). This rating was mainly due to a high risk of bias in the outcome assessment and publication bias, use of non-validated outcome measurement instruments, relatively small sample sizes (imprecision), and the heterogeneity of treatment outcomes (inconsistency).

4 Discussion

This systematic review describes 51 studies, in which 3068 patients with treatment-naive CMs were treated with either laser or light therapy, PDT, or medical tattooing. The quantitative improvement of CM lesions and reported adverse events varied widely between studies and treatment modalities. Pulsed dye laser was most frequently used, but less than half of the patients receiving PDL showed a ≥ 75% improvement. Other therapies proved to be less effective. The majority of PDL studies reported higher hyperpigmentation rates than studies of other treatment modalities. Overall, hypopigmentation and scarring were uncommon, but results are based on only a few studies. If reported, pain occurred frequently after PDT and IPL in both children and adults, but pain intensity was not adequately described. Patient satisfaction was rarely assessed.

With this systematic review, we aimed to provide a comprehensive overview of treatment effectiveness and the safety of frequently used therapies for untreated CMs of the head and neck region. This evidence is essential to support patients with CMs in the treatment decision-making process. However, as no studies on cosmetic camouflage and surgery could be included in this review, there is still insufficient evidence regarding all available therapies to adequately support patients with CMs in the treatment decision-making process. Particularly for cosmetic camouflage, it seems striking that little evidence on its effectiveness exists, as it is still commonly used by patients with CMs. Currently, surgical therapy is predominantly used for hypertrophic lesions, in which laser treatment has a limited beneficial effect [77, 78]. Novel, mostly experimental therapies do exist, such as laser therapy with a topical adjunct or site-specific pharmaco-laser therapy [79, 80]. These, however, constitute combinations of therapies, which were excluded from this review as their separate effects cannot be ascertained.

Previously published (systematic) reviews on CMs mainly compared different laser or light modalities and assessed therapeutic effectiveness [16, 81]. Van Raath et al. recently concluded that laser treatment outcomes have not improved over the last decades and that 30.5% of previously untreated patients achieve 75–100% clearance [16]. Our pooled estimate of 43% in treatment-naive patients with CMs suggests better outcomes for the PDL. This is conceivably owing to the fact that only studies with similar patient and treatment characteristics were included in our pooled analysis, resulting in more case-matched outcomes.

The diverging treatment characteristics between and within studies may account for the observed differences in treatment outcomes. Because of this heterogeneity in treatment characteristics, we were not able to associate treatment outcomes with the specific treatment settings (pulse duration, energy fluence, spot size, number of passes, treatment sessions and intervals). Furthermore, the differing treatment outcomes may also depend on patient characteristics (age, skin type) and CM lesion characteristics. There is some evidence that infants under the age of 1 year achieve better clearance scores, as more patients reach a ≥75% improvement [31, 34, 39]. Yet, contradictory results have been published [82,83,84]. Furthermore, Wimmershoff et al. demonstrated that only 2% of the patients of various ages reach a >75% improvement after PDL treatment, which could be because of more difficult-to-treat darker skin types or fewer treatment sessions [69, 85]. The authors, however, did not report on this aspect. Notably, the results of two observational studies indicate that dark skin types may carry a higher risk of hyperpigmentation and scarring after laser therapy [37, 63]. This may be explained by the fact that dark-skinned patients are more prone to develop epidermal damage: with higher pigmentation levels, the laser light is predominantly absorbed by melanin before reaching hemoglobin, which may subsequently lead to inflammatory effects and post-inflammatory dyspigmentation [86,87,88]. The Pakistani study, however, did not sufficiently document skin type [37]. Furthermore, not all studies reported if epidermal cooling was applied, which could explain some of the reported hyperpigmentation cases among studies [35, 37, 44, 63, 72]. Because of limited reporting of CM lesion characteristics, no firm conclusions can be drawn about which treatment modality is best for hypertrophic CMs in particular.

This review has several limitations. First, the methodological quality of the included studies was generally low. This was mainly because confounding factors were rarely corrected for and there was a lack of validated outcome measurement instruments for CMs. More importantly, heterogeneity was high in terms of patient, CM lesion, and treatment characteristics among and within studies. Hence, only a few low-quality studies could be pooled. Both this heterogeneity and the resulting small subgroup sizes obstructed adequate subgroup analyses. Nevertheless, this review presents the best evidence currently available. As both observational and experimental studies were included, we have been able to collect evidence on therapeutic efficacy and safety.

Second, we dichotomized categorical treatment outcomes whenever possible. In many studies, however, different outcome measures were used. This prohibited pooling of the results and made a proper comparison between treatment modalities impossible. In addition, some qualitative effects were converted into quantitative outcomes to support the comparison of study results. Conversion of these outcomes is inevitably arbitrary and the results should be interpreted with caution.

One of the strengths of this review is that we provide a comprehensive overview of the available treatment options for CMs, while previous reviews mostly focused on PDL only. Additionally, our study discloses the deficiencies in existing knowledge regarding CM therapeutic efficacy studies. As outcome reporting in clinical research on CMs is anything but uniform, the evidence of clinical studies on CM treatment is hard to compare. Moreover, we still know little about which outcomes are considered most relevant by patients with CMs. As a result, the impact of CM treatments cannot be assessed sufficiently, and optimal strategies for managing CM care are still lacking, although they are urgently needed for this patient group [89].

It is of great importance that uniform and relevant characteristics regarding patients, CM lesions, and treatments are provided in studies, and results should be described per patient when characteristics vary widely. We are therefore currently preparing an e-Delphi study as part of the international COSCAM (Core Outcome Set for CApillary Malformations) project, to reach a consensus on which outcome domains should be measured and reported in clinical studies regarding CMs. This project, registered at the Core Outcome Measures in Effectiveness Trials (COMET) website (registration number 1599), is guided by the Cochrane Skin-Core Outcome Set Initiative (CS-COUSIN) that supports the standardization of outcome measurement in dermatological clinical trials [90].

5 Conclusions

Based on the currently available evidence and in the context of lacking guidelines, PDL therapy is recommended for treatment-naive CMs of the head and neck region, but at the cost of greater hyperpigmentation rates compared with other therapies. Our results are, however, based on low-quality evidence. Larger high-quality comparative studies using reliable validated methods and uniform outcome measures are therefore warranted. Based on this systematic review, clinicians and patients should be aware of the limited evidence when deciding on the available treatment options for CMs.