Introduction

White spot lesions (WSLs), also known as early caries lesions (ECLs), are the earliest evidence of enamel demineralization, and remineralization therapy is an increasingly common approach to their treatment [1,2,3]. WSLs typically fall in the 1–2 range of the International Caries Detection and Assessment System (ICDAS) II [1, 2, 4]. Under physiological conditions, demineralization and remineralization at the enamel surface are kept in balance as pH levels fluctuate [5]. If this balance is disturbed, early caries lesions appear [3]. Orthodontic treatment with fixed multibracket appliances hinders the maintenance of oral hygiene, leading to plaque accumulation and the progression of dental caries [6,7,8], and this is how orthodontic WSLs arise. Moreover, WSLs are assumed to correlate with bracket debonding time, which raises concerns about orthodontic WSLs; these lesions are considered active until the time of bracket debonding [6, 9, 10]. The management of caries is undergoing a paradigm shift towards a minimally invasive approach that emphasizes the prevention, reduction, and reversal of caries in incipient lesions [1, 11, 12]. These early lesions are considered amenable to intervention aimed at remineralization or caries arrest. If demineralization is not halted, the intact enamel surface will eventually collapse and cavitate [1, 13,14,15].

Fluoride-based strategies are the gold standard for preventing and managing WSLs [2, 16, 17]. Fluoride interacts with saliva at the surface and subsurface of the enamel, where it combines with calcium and phosphate ions to form larger new crystals with a higher fluoride content (fluorhydroxyapatite), thus improving remineralization [18]. However, current fluoride therapies have been reported to have shortcomings, especially in caries that have already manifested as white spots [6, 12, 19, 20]. Casein phosphopeptides (CPP) contain multiple phosphoryl sequences that can stabilize calcium phosphate in solution as nanocomplexes of amorphous calcium phosphate (ACP). Through these sequences, CPP binds to ACP in metastable solution, thereby keeping the calcium and phosphate ions stable in solution. Casein phosphopeptide-amorphous calcium phosphate (CPP-ACP) also serves as a reservoir of bioavailable calcium and phosphate, thereby promoting remineralization [18, 21]. Compared with fluoride, however, these properties of CPP-ACP have not translated into better treatment outcomes [22,23,24]. Likewise, a clinically significant benefit of tricalcium phosphate products over fluoride has not been demonstrated [6, 25, 26]. The self-assembling peptide P11-4 (SAP P11-4) offers a novel route to remineralization therapy of WSLs through biomimetic mineralization [6, 27,28,29]. Current findings suggest that P11-4 outperforms the gold standard fluoride in the treatment of WSLs [12, 15, 30, 31]. Resin infiltration (RI) has also emerged as an effective, minimally invasive method of treating WSLs [32, 33].

Many clinical studies have explored the differences between methods of treating WSLs, but no broadly accepted conclusion has been reached [32, 34,35,36,37]. It is unrealistic to compare all treatment modalities for WSLs in a single trial. Traditional meta-analyses have been performed to compare two or several treatments [38,39,40,41]. In contrast, network meta-analysis (NMA) combines evidence from direct and indirect comparisons across a network of interventions to produce a hierarchy of intervention effects, even where direct comparisons between two interventions are lacking [42,43,44]. A comparison of the many treatment options and standard procedures for WSLs is necessary [45]. To date, however, no network meta-analysis with reasonably sufficient evidence has compared therapies for WSLs. Therefore, this study aimed to perform a systematic review and network meta-analysis comparing the aforementioned therapies, in order to contribute to the establishment of clinical treatment guidelines for WSLs [41].

Methods and analysis

Registration

The systematic review and network meta-analysis are reported following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [46]. The study protocol was registered (registration number: CRD42022343703) with the International Prospective Register of Systematic Reviews (PROSPERO).

Search strategy

Two researchers (Xie and Yu) independently searched the following databases for articles to include in the meta-analysis: Web of Science, EMBASE, PubMed, and the Cochrane Central Register of Controlled Trials. They used Medical Subject Headings (MeSH) and free-text terms. The search covered January 2007 to June 2022. The search strategies were based on the PICOS principle and can be found in the Supplementary Table.

Study selection and eligibility criteria

The two reviewers (Xie and Yu, blinded to each other) independently screened the studies using a specifically designed data extraction form. Disagreements were resolved by a third reviewer (Li) through an internal decision process. Trials were considered eligible according to the inclusion and exclusion criteria detailed in Table 1.

Table 1 Inclusion and exclusion criteria

Conditions treated with modalities similar to those used for WSLs, such as deep caries, root caries, and fluorosis, had to be excluded [47,48,49,50,51]. Systemic and structural barriers also limit dental health for individuals with special healthcare needs (SHCN) [52]. Regarding interventions, conventional composite resin filling, in contrast to resin infiltration, runs counter to the current treatment philosophy for managing WSLs [53, 54]. We also encountered several studies that explored drug concentrations, frequency of use, and novel forms of treatment [55,56,57]; these unique forms of intervention were difficult to incorporate into a network meta-analysis. For outcomes, we favored measures that yield specific numerical values. Visual indicators such as the visual analog scale (VAS) may introduce bias, which also calls the accuracy of optical indicators into question [50, 58]. Conventional fluoride varnish has to be applied repeatedly, from once every 2 weeks to four topical applications a year, to maintain its effectiveness [31, 59]; it was therefore necessary to set a follow-up time. Non-RCT and plagiarized articles were not eligible for review.

Data extraction

The following data were extracted by two blinded reviewers using Excel: author and journal; publication year; study design; participants and groups; baseline characteristics; intervention; comparison; outcome; results; and follow-up period. Data were extracted from the full text; where data were missing, the author was contacted by email. Disagreements were resolved by Li through an internal decision process.

Risk of bias in individual studies

For clinical studies, the Cochrane ROB 2.0 tool was used for quality assessment [60]. The risk of bias was assessed across five domains: randomization process, deviations from intended interventions, missing outcome data, measurement of the outcome, and selection of the reported result. The overall risk of bias was expressed as 'low risk of bias' if all domains were categorized as low risk; 'some concerns' if a concern was raised in at least one domain and no domain was classified as high risk; or 'high risk of bias' if at least one domain was classified as high risk or if multiple domains raised some concerns [60]. The methodological quality assessment tool for the included in vitro studies was taken from previous systematic reviews of in vitro studies [61, 62]. The risk of bias in each article was evaluated according to the description of the following parameters: specimen randomization; single-operator protocol implementation; blinding of the testing machine operator; presence of a control group; standardization of the sample preparation; outcome mode evaluation; use of all materials according to the manufacturer’s instructions; and description of the sample size calculation. If the article reported a parameter, it received a “YES” for that parameter; if the information was missing, it received a “NO.” The risk of bias was classified by the number of “YES” answers: 1 to 3 indicated a high risk of bias, 4 to 6 a medium risk, and 7 to 8 a low risk. All quality assessments were carried out by two blinded researchers (Xie and Yu), with Li responsible for resolving any disputes.
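The two classification rules described in this section can be written out as a short Python sketch. It is illustrative only: the function names and string labels are assumptions introduced here, the text does not say how a score of 0 "YES" answers is handled (it is treated as high risk below), and the text gives no number for "multiple domains", so the `escalate_at` cut-off is a hypothetical parameter that is switched off by default.

```python
from typing import Optional, Sequence

IN_VITRO_PARAMETERS = 8  # number of parameters listed in the text
VALID_JUDGEMENTS = {"low", "some concerns", "high"}


def in_vitro_risk(yes_count: int) -> str:
    """Map the number of "YES" answers (out of 8) to a risk-of-bias class."""
    if not 0 <= yes_count <= IN_VITRO_PARAMETERS:
        raise ValueError("yes_count must be between 0 and 8")
    if yes_count >= 7:
        return "low"
    if yes_count >= 4:
        return "medium"
    # The text defines 1 to 3 as high; treating 0 as high is an assumption.
    return "high"


def rob2_overall(domains: Sequence[str], escalate_at: Optional[int] = None) -> str:
    """Combine per-domain judgements into an overall judgement.

    domains: each entry is "low", "some concerns" or "high".
    escalate_at: hypothetical cut-off for "multiple domains with some
    concerns"; with the default None, no escalation to "high" happens.
    """
    for judgement in domains:
        if judgement not in VALID_JUDGEMENTS:
            raise ValueError(f"unknown judgement: {judgement!r}")
    if any(judgement == "high" for judgement in domains):
        return "high"
    concerns = sum(judgement == "some concerns" for judgement in domains)
    if concerns == 0:
        return "low"
    if escalate_at is not None and concerns >= escalate_at:
        return "high"
    return "some concerns"
```

For example, `in_vitro_risk(5)` returns `"medium"`, and a clinical study with one domain rated "some concerns" and four rated "low" is classified as "some concerns" overall.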

Data analysis

We performed a network meta-analysis of the direct and indirect comparisons among the six therapies and the control treatment, using a multivariable meta-analysis model in STATA 15.1 (StataCorp, College Station, Texas, USA).

The outcome of interest was the change (from baseline to endpoint) in the absolute value of a lesion metric, such as QLF (quantitative light-induced fluorescence), LF (laser fluorescence, measured with the DIAGNOdent pen), or lesion area, which is typically measured by image analysis. Where studies did not report the standard deviation (SD) of the change in outcome, it was estimated using a correlation coefficient (r) of 0.5 and the following equation:

$$SD_{change}=\sqrt{SD_{baseline}^{2}+SD_{final}^{2}-2r\times SD_{baseline}\times SD_{final}}$$

This imputation of the SD of the change follows the Cochrane Handbook [63]. Because the changes were continuous outcomes measured with different instruments, effect sizes were calculated as standardized mean differences (SMDs) with 95% confidence intervals (CIs). The difference between two treatments was considered significant when the 95% CI of the SMD did not include 0 (equivalent to P < 0.05). We conducted an inconsistency analysis to explore differences between the direct and the various indirect effect estimates for the same comparison [42, 64]. Inconsistency between direct and indirect comparisons may indicate a violation of transitivity that is not immediately obvious [42, 65]. The side-split test was used to analyze local inconsistency. A consistency model was then used for the network meta-analysis. To rank the treatment regimens, we used the surface under the cumulative ranking curve (SUCRA) [66]. A SUCRA of x% indicates that the intervention achieves x% of the effectiveness of an imaginary intervention that is always the best; thus, larger SUCRAs indicate more preferable interventions [42]. The forest plot was based on the consistency model. Additionally, publication bias was assessed using a comparison-adjusted funnel plot.
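As a minimal sketch of the two calculations described above, the following Python functions impute the SD of the change score with r = 0.5 and check whether a 95% CI excludes 0. The function names are assumptions made for illustration and do not correspond to the STATA routines used in the analysis.

```python
import math


def sd_change(sd_baseline: float, sd_final: float, r: float = 0.5) -> float:
    """Impute the SD of the change score from the baseline and final SDs."""
    variance = sd_baseline**2 + sd_final**2 - 2 * r * sd_baseline * sd_final
    return math.sqrt(variance)


def ci_excludes_zero(lower: float, upper: float) -> bool:
    """Return True when the whole interval lies on one side of 0."""
    return lower > 0 or upper < 0


# With equal SDs of 2.0 and r = 0.5: sqrt(4 + 4 - 4) = 2.0
print(sd_change(2.0, 2.0))

# Intervals reported in the Discussion (SMD versus control):
print(ci_excludes_zero(-0.96, -0.15))  # P11-4: interval excludes 0
print(ci_excludes_zero(-0.51, 0.02))   # FV: interval includes 0
```

With r = 0.5 and equal baseline and final SDs, the imputed SD of the change equals the common SD, which is a convenient sanity check on extracted data.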

Results

Search results

Our search strategy identified 3032 studies from the four primary databases. We identified ten additional studies by reviewing the reference lists of all eligible articles and recent systematic reviews. After the removal of 660 duplicate records, 2382 records were screened. After 2033 non-RCT records were removed, 349 articles underwent the final eligibility assessment. Of these, 42 studies fulfilled the requirements of the systematic review [1, 2, 6, 12, 15, 22, 26, 30,31,32, 34, 36, 37, 67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95]. Eleven studies were excluded from the NMA because the format of their results or the number of interventions was not suitable for NMA [22, 26, 34, 67, 71,72,73, 76, 81, 83, 86]. Among the 31 studies included in the NMA, the following treatment conditions were evaluated: CPP-ACP (10 studies); CPP-ACFP (6 studies); Control (22 studies); FV (21 studies); P11-4 (7 studies); P11-4 + FV (5 studies); RI (5 studies). As 2 studies provided 2 additional outcomes, 33 results from 31 studies were included in the meta-analysis.
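The record counts reported in this paragraph can be cross-checked with simple arithmetic. The short Python sketch below uses only the numbers stated in the text; the variable names are hypothetical labels chosen for readability.

```python
# Counts as reported in the text
database_records = 3032
reference_list_records = 10
duplicates_removed = 660
non_rct_removed = 2033
included_in_review = 42
excluded_from_nma = 11
additional_outcomes = 2

identified = database_records + reference_list_records       # 3042
screened = identified - duplicates_removed                   # 2382
eligibility_assessed = screened - non_rct_removed            # 349
included_in_nma = included_in_review - excluded_from_nma     # 31
results_in_nma = included_in_nma + additional_outcomes       # 33

print(identified, screened, eligibility_assessed, included_in_nma, results_in_nma)
```

Each derived value matches the figure given in the text (2382 records screened, 349 assessed, 31 studies and 33 results in the NMA).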

The flow chart of the literature retrieval process is shown in Fig. 1.

Fig. 1 Flowchart of randomized controlled trials of WSL/ECL therapies

Characteristics of the included studies

The data extraction results are displayed in Table 2. All articles were RCTs. In the articles that reported sex ratios, the ratios were relatively balanced. In most articles, the participants were children and adolescents. Of the 36 in vivo articles, 29 focused on permanent teeth and 7 on primary teeth. Six studies did not provide the age of the participants because they used lesions engineered in vitro. Five articles compared different kinds of toothpaste used in daily life; because they lacked the appropriate interventions, they were not included in the NMA [2, 71, 73, 76, 81]. Two articles provided additional records for the NMA [12, 89]. Across the 42 studies, the lesions comprised orthodontic WSLs (35.7%), non-orthodontic WSLs (4.8%), WSLs without further specification (16.7%), ACLs (7.1%), ECLs (19.0%), molar incisor hypomineralization (MIH; 2.4%) [75], and artificial lesions (14.3%). A single ACL study was included in the NMA because it described its lesions as ICDAS = 2 [68]. Because the use of fluoride toothpaste as part of oral hygiene education was reported inconsistently (some articles describe it explicitly, others do not), we did not consider the efficacy of fluoride toothpaste in this study. Two studies addressed occlusal surface lesions [15, 30], and their outcomes were generally consistent with those for smooth surface lesions. One study examined both occlusal and smooth surfaces [95] and observed a mild difference between them, while the remaining studies addressed smooth surface lesions or smooth surface lesions associated with orthodontic brackets.

Table 2 Characteristics of the studies included in the systematic review

Results of ROB assessment

Thirty-six clinical articles were evaluated with ROB 2.0 for risk of bias. Figure 2 provides the details of the ROB evaluation for each included clinical study. Overall, 12 articles were judged to be at low ROB, 22 at moderate ROB, and the remaining two at high ROB. Most studies received a "yellow" rating because they gave no information on allocation concealment. Others received it because blinding of the outcome assessor could not be confirmed or because it was unclear whether the analysis followed a pre-specified plan. Further concerns arose from the absence of a specific description of potential bias in the outcomes. Of the two high-risk studies, one was rated high risk for allocation concealment, and the other could not rule out an effect of loss to follow-up. Table 3 shows the ROB results for the six in vitro studies. Most of these were rated as having a medium or low risk of bias. The main sources of risk were the sample size calculation, single-operator, and operator-blinding parameters.

Fig. 2 Risk of bias for clinical studies. According to the ROB 2 tool, the risk of bias was evaluated across five domains. Green means low risk, yellow means some concerns, and red means high risk. The overall evaluation results and a bar chart are also shown

Table 3 Risk of bias for in vitro studies

Network meta-analysis

This network meta-analysis included a total of 1906 participants and 33 outcomes. Figure 3 shows the network map.

Fig. 3 Network map

Network plot

The network of direct treatment comparisons for the change in the absolute value of the WSL outcomes is illustrated in Fig. 3. The size of each node reflects the number of trials that included the treatment. As shown in the network plot, the ‘FV’ (22 outcomes) and ‘Control’ (22 outcomes) groups were included in the largest number of treatment comparisons, followed by ‘CPP-ACP’ (10 outcomes) and ‘P11-4’ (7 outcomes), while the ‘CPP-ACFP’ (6 outcomes), ‘RI’ (5 outcomes), and ‘P11-4 + FV’ (5 outcomes) groups appeared in fewer. The lines link treatments that were compared directly, and the thickness of each line represents the number of trials comparing the two therapies. There were 15 pairwise direct comparisons. The most frequent comparison in the included literature was FV versus Control (12 direct comparisons), followed by CPP-ACP versus Control (8 direct comparisons), CPP-ACFP versus Control (5 direct comparisons), FV versus P11-4 (5 direct comparisons), and FV versus P11-4 + FV (5 direct comparisons). The numbers for the remaining comparisons are given in Table 4.

Table 4 Results of the side-split test

Consistency and inconsistency analysis

We performed an inconsistency analysis to identify potential inconsistencies between direct and indirect comparisons. The results showed no significant difference between the direct and indirect comparisons (χ2 = 9.05, P = 0.9388). We also tested local inconsistency: the side-split test (Table 4) showed no significant difference between the indirect and direct estimates for any of the 15 comparisons (P > 0.05). Six comparisons had no direct evidence and were informed by indirect comparisons only, as can also be seen in Fig. 3.
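The count of six indirect-only comparisons follows from the size of the network: seven treatments give 21 possible pairs, of which 15 have direct evidence. A small Python sketch of this bookkeeping, with variable names chosen here for illustration:

```python
from math import comb

# The seven nodes of the network as named in the Results
treatments = ["CPP-ACP", "CPP-ACFP", "Control", "FV", "P11-4", "P11-4 + FV", "RI"]

possible_pairs = comb(len(treatments), 2)  # all pairwise comparisons among 7 nodes
direct_pairs = 15                          # pairs with direct evidence, from the text
indirect_only = possible_pairs - direct_pairs

print(possible_pairs, indirect_only)
```

This prints 21 and 6, consistent with the six comparisons that rely on indirect evidence alone.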

Forest plot with the result of NMA

Figure 4a shows the NMA forest plot from the consistency model, with the SMD as the effect size. As shown in Fig. 4a, four groups (P11-4, P11-4 + FV, RI, and CPP-ACFP) differed significantly from the ‘Control’ group (the entire 95% CI of the SMD lay below 0). The ‘P11-4 + FV’ and ‘RI’ groups also differed significantly from the ‘FV’ and ‘CPP-ACP’ groups (the entire 95% CI of the SMD lay below 0). No significant differences were found for the other comparisons. Point estimates and confidence intervals of the relative effects of the interventions against a common comparator are displayed in Fig. 4b [96]. According to the inconsistency analysis, the direct and indirect estimates for the comparisons between these interventions and the control group did not differ significantly.

Fig. 4 a Forest plot, b Forest plots for comparison with the control group

SUCRA ranking

Figure 5 shows the SUCRA curves of the seven therapies. The hierarchy of WSL treatments and the SUCRA values are shown in Table 5. The higher the SUCRA value, the higher the ranking. The SUCRA values were 89.7% for P11-4 + FV, 88.7% for RI, 61.9% for P11-4, 50.5% for CPP-ACFP, 31.9% for FV, 24% for CPP-ACP, and 3.3% for Control. Figure 6 shows the changes in the absolute value of the outcome associated with the seven therapies.
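To make the SUCRA statistic concrete, the sketch below computes it from a treatment's rank probabilities using the standard definition (the mean of the cumulative rank probabilities over the first a - 1 ranks, where a is the number of treatments). The rank probabilities of the present analysis are not reported in the text, so the inputs used here are toy values, and the function names are assumptions made for illustration.

```python
from typing import Sequence


def sucra(rank_probs: Sequence[float]) -> float:
    """SUCRA for one treatment.

    rank_probs[j] is the probability that the treatment takes rank j + 1
    (rank 1 = best); the probabilities must sum to 1.
    """
    n_treatments = len(rank_probs)
    if n_treatments < 2:
        raise ValueError("at least two treatments are required")
    if abs(sum(rank_probs) - 1.0) > 1e-9:
        raise ValueError("rank probabilities must sum to 1")
    cumulative = 0.0
    total = 0.0
    # Sum the cumulative probabilities over ranks 1 .. a - 1
    for p in rank_probs[:-1]:
        cumulative += p
        total += cumulative
    return total / (n_treatments - 1)


def mean_rank_from_sucra(sucra_value: float, n_treatments: int) -> float:
    """Mean rank implied by a SUCRA value: a - SUCRA * (a - 1)."""
    return n_treatments - sucra_value * (n_treatments - 1)


# Toy inputs: a treatment that is always best, always worst, or equally
# likely to take any of three ranks.
print(sucra([1.0, 0.0, 0.0]))
print(sucra([0.0, 0.0, 1.0]))
print(sucra([1 / 3, 1 / 3, 1 / 3]))
```

A treatment that is certain to be best has a SUCRA of 1 (100%) and one that is certain to be worst has a SUCRA of 0, which is why the Control value of 3.3% places it at the bottom of the hierarchy.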

Fig. 5 Surface under the cumulative ranking curve (SUCRA) values of the seven therapies. The horizontal axis is the rank from 1 to 7. The vertical axis is the cumulative probability. Interventions are ranked by SUCRA. The larger the area under the curve and the faster the curve rises, the greater the probability of being the most efficacious treatment. The calculated SUCRA values are shown in Table 5. Con: Control group; FV: Fluoride varnish; P11-4: self-assembling peptide 11–4

Table 5 Hierarchy of seven therapies by SUCRA
Fig. 6 Changes in the absolute value of the outcome associated with the seven therapies

Publication bias

The comparison-adjusted funnel plot was symmetrical around the zero line, indicating no evidence of publication bias. The publication bias plot is shown in Fig. 7.

Fig. 7 Publication bias plot

Discussion

We sought to compare the effects of common therapies for white spot lesions and searched as much of the literature as possible for this network meta-analysis. Several findings from this network analysis may inform standardized procedures for the treatment of WSLs. First, the effects of conventional fluoride-based and CPP-ACP-based remineralization strategies did not differ significantly from those of the control. Second, resin infiltration and P11-4-based treatment strategies ranked high. Finally, combining agents improved the effectiveness of remineralization therapy in WSLs; in particular, the combination of the self-assembling peptide P11-4 and fluoride varnish showed the greatest efficacy.

Based on the SUCRA probabilities, we created a hierarchy of therapeutic effects. The ‘P11-4 + FV’ and ‘resin infiltration’ interventions had more effective outcomes than the other interventions, followed by ‘P11-4’, ‘CPP-ACFP’, ‘FV’, ‘CPP-ACP’, and ‘Control’. This result suggests that fluoride varnishes were not significantly more effective than the control [ES: -0.25, 95% CI (-0.51, 0.02)], even though fluoride strategies are currently the gold standard for managing WSLs [2, 16, 17]. Deficiencies in current fluoride therapies have been reported, chiefly that they are ineffective in caries that have already manifested as white spots [6, 12, 19, 20]. It has been suggested that these shortcomings arise because the effects of fluoride are restricted to the enamel surface layer [6, 97].

The NMA results for CPP-ACP are also consistent with the current clinical evidence [35, 98,99,100], with no significant difference compared with either FV [ES: 0.07, 95% CI (-0.29, 0.44)] or the control group [ES: -0.18, 95% CI (-0.49, 0.14)]. CPP-ACP allows for the remineralization of deep lesions [101, 102]. The similarity of CPP-ACP to the fluoride strategy suggests that other factors contribute to the remineralization effect. In addition, SAP P11-4, which can form scaffolds on the enamel surface [6, 27,28,29, 103], exhibited remineralization properties superior to those of the control group [ES: -0.56, 95% CI (-0.96, -0.15)]. The effectiveness of P11-4 in randomized studies, conventional meta-analyses, and this NMA suggests that establishing micro-scaffolds suitable for remineralization matters more than supplying the ions required for remineralization [12, 15, 30, 31, 69, 103,104,105].

The effects of resin infiltration therapy should be interpreted more cautiously, even though it ranks very highly in this analysis [ES: -0.94, 95% CI (-1.46, -0.43) compared with the control group]. Unlike remineralization therapy, resin infiltration, as a minimally invasive etch-adhesive system, can penetrate deep into the lesion and significantly improve the aesthetic appearance of the lesion surface [50, 106, 107]. This means that resin infiltration does not regenerate enamel, although its effectiveness has been supported by many clinical studies and meta-analyses [37, 40, 50, 108]. Visual indicators such as the visual analog scale (VAS) may introduce bias, which also calls the accuracy of optical indicators into question [50, 58]; this is why we did not include these outcomes in the current study. In our results, CPP-ACFP tended to outperform CPP-ACP, and the P11-4 + FV combination tended to outperform P11-4 alone. Combination therapy therefore appears to be more appropriate for the treatment of WSLs. The combined application of P11-4 and fluoride varnish held the highest ranking [ES: -0.96, 95% CI (-1.44, -0.48) compared with the control group], probably because precursor scaffolds form while the ion pool required for remineralization is supplied. In summary, precursor scaffolds and remineralization ion pools together facilitate the management and treatment of WSLs.

We would like to stress the importance of this study and some of its methodological choices. First, there is still no network meta-analysis of WSLs, and in particular a comprehensive evaluation of multiple remineralization therapies and resin infiltration is lacking. Second, there is an urgent need to standardize current clinical strategies for WSLs, and our study provides an important reference for this. In addition, to match the standardized definition of WSLs, we chose 2007 as the starting year for the search. The ICDAS II criteria were discussed in theory at the ICDAS workshop in 2005 [4, 109, 110], but changes in caries-related clinical decision-making strategies need to be promoted [111], and this often takes time. It was at the 54th ORCA Congress in 2007 that the ICDAS II criteria became a keyword in the diagnostic section, compared with the ICDAS criteria at the 53rd ORCA Congress [112, 113]. Finally, SUCRA alone is not adequate for comparing treatment outcomes in an NMA. Therefore, we used an inconsistency test (Table 4), the SUCRA statistic (Table 5 and Fig. 5), and visual displays of point estimates and confidence intervals of the relative effects of interventions against a common comparator (Fig. 4b) to aid interpretation [96].

We have also carefully considered the limitations of this study. Most notably, there remains a paucity of trials that can inform direct comparisons, in particular among the top-ranked interventions. The vast majority of direct comparisons are against FV or control groups. In addition, we did not analyze potential influencing factors for WSLs, such as gender, age, follow-up time, and outcome measurement tool, because the data on these factors are difficult to harmonize. Finally, we recognize the potential bias that comes from setting language limits, although there was no regional selection bias in this study. To identify possible bias, we also consulted systematic reviews in other languages that were not included [114, 115].

Overall, this systematic review and network meta-analysis points to the clinical advantages of resin infiltration and SAP P11-4 (alone or in combination with fluoride varnish). This study clarifies the hierarchy of multiple therapies for WSLs and informs clinical strategies for their management. We plan to analyze confounding factors in the future to provide further reference value for the standardization of WSL treatment.

Conclusions

Our study compared and evaluated the effects of treatments for WSLs. Both resin infiltration and SAP P11-4 had a positive therapeutic effect on WSLs. The clinical efficacy of CPP-ACP and of fluoride varnish was not statistically significant compared with the control. The combination of SAP P11-4 and fluoride varnish is a better strategy for treating WSLs.