Introduction

White spot lesions (WSLs) result from subsurface enamel demineralization caused by acid-producing, caries-associated bacteria in plaque. WSLs manifest as chalky white opacities of the enamel and are a common, undesirable complication of orthodontic treatment [1]. Several studies have reported a substantial increase in the prevalence of WSLs in orthodontic patients, with estimates ranging from 2 to 97% [2]. Although these lesions are believed to regress or even disappear after appliance debonding owing to the remineralizing potential of saliva [3], some may persist much longer [4]. The significant increase in the prevalence of these lesions during fixed appliance treatment is attributed to the greater number of plaque-retentive areas, which hinder routine oral hygiene measures and further increase the plaque load around the brackets.

Apart from plaque accumulation, fixed orthodontic appliances induce alterations in the oral microbiota: increased levels of Streptococcus mutans and Lactobacillus species have been detected in the oral cavity after bonding of orthodontic attachments [5, 6]. Furthermore, analysis by the checkerboard DNA-DNA hybridization technique has shown colonization of metallic brackets by several bacterial species, including cariogenic microorganisms, soon after bonding [7]. In addition, a recent study based on RT-PCR quantification of salivary caries-associated bacteria revealed higher levels in patients with fixed orthodontic appliances than in non-orthodontic patients [8]. The increase in plaque, coupled with elevated caries-associated bacterial counts in biofilm and saliva [5], eventually reduces the pH, resulting in enamel demineralization.

Recently, there has been an increase in aesthetic demands among patients seeking orthodontic treatment [9]. Clear aligners (CA) are transparent, removable thermoplastic trays that are believed to be safe, aesthetic and comfortable orthodontic appliances. They enable patients to carry out routine oral hygiene procedures, thereby reducing the negative effects of orthodontic appliances on periodontal health [10]. However, a 2.85% overall incidence of new WSLs has been reported with the use of CA, and 28% of the patients were affected by at least one new WSL considering all the assessed teeth [11]. In addition, WSLs that develop during CA treatment have been found to have a larger surface area but less mineral loss than those that develop during fixed appliance treatment [12]. This can be attributed to the fact that patients are advised to wear aligners for approximately 22 h a day for optimal results, which interrupts the self-cleansing activity of the orofacial soft tissues, allows further accumulation of plaque under the aligner [13], and hampers the cleansing, buffering and remineralizing properties of saliva. Another study reported an increase in caries-causing microbes, namely Streptococcus and Lactobacillus, within 24 h of CA wear [14].

Although many studies have examined periodontal health status, the incidence of WSLs and salivary caries-associated bacterial levels in patients undergoing treatment with CA and fixed orthodontic appliances, some controversies remain [15]. Shokeen et al. [16] recently reported that CA treatment has a less negative impact on clinical oral health outcomes than fixed orthodontic appliances. However, Chhibber et al. [17] and Pango et al. [18] reported no significant difference in oral hygiene levels between CA and conventional brackets during long-term orthodontic treatment. Mummolo et al. [19] reported a greater abundance of Streptococcus mutans during fixed appliance treatment than during CA treatment, whereas another study based on the 16S rRNA gene found no significant variation in the relative abundance of Streptococcus between aligner and fixed appliance treatment [20].

To the best of our knowledge, no systematic review has compared conventional fixed (CF) orthodontic appliances solely with CA while focussing collectively on plaque accumulation and salivary caries-associated bacteria (SCB), which have a direct influence on the development and severity of WSLs. Hence, this review was conducted with the objective of systematically synthesizing all the available evidence regarding the following research question: Is there a difference in plaque accumulation measured by plaque index (PI), SCB, and incidence and severity of WSLs (outcomes) in orthodontic patients (population) undergoing CA (intervention) and CF orthodontic appliance (control) treatment?

Materials and methods

Protocol and registration

This review was conducted in accordance with the Cochrane Handbook for Systematic Reviews of Interventions and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 statement [21]. The protocol of the systematic review was registered in the Open Science Framework registries (DOI: https://doi.org/10.17605/osf.io/kcpvb).

Eligibility criteria

The eligibility criteria applied for the systematic review are presented in Table 1.

Table 1 The eligibility criteria applied for the systematic review and meta-analysis

Information sources, search strategy and study selection

A 3-month comprehensive electronic database search was conducted from January 2022 up to May 2023. Literature (1990 to May 2023) from the relevant databases (PubMed, Scopus, Embase and Google Scholar) was included in the review. The grey literature was also searched in OpenGrey and the ProQuest Dissertation Abstracts and Thesis database. An additional search was performed in ClinicalTrials.gov (www.clinicaltrials.gov). A combination of index terms (Medical Subject Headings (MeSH) for PubMed and other related terms pertaining to the databases) and keywords was used to perform the search. The detailed search strings are presented in Table 2. Hand searching was performed from the reference lists of included articles and published systematic reviews.

Table 2 The search strategy used in the review using MeSH terms (PubMed), keywords and terms related to other databases

All the identified records were imported into reference management software (desktop version of EndNote®, version X9; Clarivate Analytics). After removal of duplicates, two reviewers (S.R and E.A) independently screened the articles based on titles and abstracts using Rayyan Systematic Review Screening Software (https://www.rayyan.ai). The full texts of potentially eligible studies and of those with insufficient information in the abstract were retrieved and read in full for the final selection. Articles that failed to meet one or more of the inclusion criteria were excluded. Any disagreements in screening and including potentially relevant articles were resolved by a third reviewer (S.A).

Data extraction

Data extraction was performed by two reviewers (S.R and E.A) independently using a standardized data extraction form that comprised the following items:

  • Study information: Author, year of publication, study design, study setting and sample size and funding.

  • Population: Age, gender.

  • Intervention and control: Type of appliance.

  • Outcomes: Outcome measures (pertaining to the review), method of obtaining the outcome measures, follow-up periods and results. For a few studies [16, 22], WebPlotDigitizer (https://automeris.io/WebPlotDigitizer/website) was used to extract the data from graphs and plots. Some of the authors were contacted by email to obtain clarifications and missing or additional data.

Risk of bias (ROB) assessment in individual studies

The revised Cochrane risk-of-bias tool (ROB 2) [23] and ROBINS-I [24] were used to assess the risk of bias in Randomised Controlled Trials (RCTs) and non-randomized studies of interventions (NRSIs), respectively. The tool for RCTs evaluates the risk of bias across 5 domains: the randomization process, deviations from the intended intervention, missing outcome data, measurement of the outcome and selection of the reported result. The overall judgment can be ‘low risk of bias’, ‘high risk of bias’ or ‘some concerns’. The tool for NRSIs evaluates the risk of bias across 7 domains: confounding, selection of participants, classification of interventions, deviations from the intended intervention, missing outcome data, measurement of the outcome and selection of the reported result. The overall judgment can be ‘low’, ‘moderate’, ‘serious’ or ‘critical’ risk of bias, or ‘no information’. All the extracted data were cross-verified by 2 reviewers (S.R and E.A); any discrepancies were resolved by the third reviewer (N.S).
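How domain-level judgements roll up into an overall judgement can be sketched as follows. This is a simplified illustration, not the reviewers' procedure: it assumes the overall rating is simply the most severe domain rating, whereas both tools also allow assessors to raise the overall rating when several domains raise concerns. The function names and the example domain patterns are hypothetical.

```python
from typing import List

# Ordered from least to most severe (labels as used by the two tools).
ROB2_LEVELS = ("low", "some concerns", "high")
ROBINS_I_LEVELS = ("low", "moderate", "serious", "critical")


def rob2_overall(domains: List[str]) -> str:
    """Simplified overall ROB 2 judgement: the worst of the 5 domain ratings."""
    if any(d not in ROB2_LEVELS for d in domains):
        raise ValueError("unknown ROB 2 judgement")
    return max(domains, key=ROB2_LEVELS.index)


def robins_i_overall(domains: List[str]) -> str:
    """Simplified overall ROBINS-I judgement from the 7 domain ratings.

    'no information' is returned when a domain lacks information and no
    rated domain is already serious or critical (an assumption of this sketch).
    """
    allowed = ROBINS_I_LEVELS + ("no information",)
    if any(d not in allowed for d in domains):
        raise ValueError("unknown ROBINS-I judgement")
    rated = [d for d in domains if d != "no information"]
    worst = max(rated, key=ROBINS_I_LEVELS.index) if rated else None
    if "no information" in domains and worst not in ("serious", "critical"):
        return "no information"
    return worst


# Hypothetical domain patterns, for illustration only.
rct_example = ["high", "some concerns", "low", "high", "some concerns"]
nrsi_example = ["moderate", "low", "low", "low", "low", "moderate", "moderate"]
print(rob2_overall(rct_example), robins_i_overall(nrsi_example))
```

Under this rule a single high-risk domain, such as unblinded outcome assessment, is enough to rate a trial as high risk overall.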

Summary measures and synthesis of results

For the effect size calculation, the mean and standard deviation (SD) were extracted from all the included studies. Where means and SDs were not reported, they were derived from the reported medians, inter-quartile ranges or confidence intervals (C.I.s). As the eligible studies assessed the PI at varying and multiple time-points, the data for the maximum time-point from each of the included studies were used for the meta-analysis. The influence of the time-point on the effect size was examined as a continuous moderator (assessment duration in months) in the meta-regression analysis. Sub-group analysis was conducted with study design (RCT or NRSI) as a categorical moderator. In addition, three separate meta-analyses were conducted based on the PI follow-up duration (at 3, 6 and 12 months) to assess the effect size at these time-points.
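The text does not state which conversion formulas were used, so the following is only a sketch of two widely used large-sample approximations: estimating the mean and SD from the quartiles (in the style of Wan et al.), and recovering the SD from a 95% C.I. of a mean (as described in the Cochrane Handbook). The function names and input values are hypothetical.

```python
import math
from typing import Tuple


def mean_sd_from_quartiles(q1: float, median: float, q3: float) -> Tuple[float, float]:
    """Approximate mean and SD from the median and inter-quartile range.

    Large-sample approximation assuming roughly normal data:
    mean ~ (q1 + median + q3) / 3 and SD ~ IQR / 1.35.
    """
    mean = (q1 + median + q3) / 3.0
    sd = (q3 - q1) / 1.35
    return mean, sd


def sd_from_ci(lower: float, upper: float, n: int, z: float = 1.96) -> float:
    """Approximate SD from a 95% C.I. of a group mean with sample size n.

    Uses SE = (upper - lower) / (2 * z) and SD = SE * sqrt(n); for small
    samples a t-quantile would replace z.
    """
    standard_error = (upper - lower) / (2.0 * z)
    return standard_error * math.sqrt(n)


# Hypothetical plaque index summaries, for illustration only.
print(mean_sd_from_quartiles(q1=0.8, median=1.1, q3=1.6))
print(sd_from_ci(lower=0.9, upper=1.3, n=30))
```

Both conversions assume approximately symmetric data, so strongly skewed plaque scores would make the derived values less reliable.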

The standardized mean difference (SMD) and its 95% C.I. were used as the summary measure for PI. Random-effects meta-analysis with the restricted maximum likelihood method was conducted. Statistical heterogeneity was first examined through visual inspection of the C.I.s for the treatment effects on forest plots. The chi-square test (a p-value below 10% was considered indicative of significant heterogeneity) and the I² statistic (a value greater than 50% was considered substantial heterogeneity) were then applied. Prediction intervals (95% P.I.) were calculated to incorporate existing heterogeneity and to provide a range of possible effects in future studies. All the analyses were performed using STATA 17 software (StataCorp, College Station, TX).
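As a rough illustration of these quantities, the sketch below computes a bias-corrected SMD (Hedges' g) for each study and pools the estimates under a random-effects model. Note the substitution: it uses the closed-form DerSimonian-Laird estimator of the between-study variance rather than the restricted maximum likelihood method named above, which needs iterative fitting, so its output would not match STATA exactly. The study data are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def hedges_g(m1: float, sd1: float, n1: int,
             m2: float, sd2: float, n2: int) -> Tuple[float, float]:
    """Return the SMD (Hedges' g) and its sampling variance."""
    df = n1 + n2 - 2
    sd_pooled = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (m1 - m2) / sd_pooled
    j = 1.0 - 3.0 / (4.0 * df - 1.0)            # small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2.0 * (n1 + n2))
    return j * d, j ** 2 * var_d


def random_effects_dl(effects: List[float], variances: List[float]) -> Dict[str, float]:
    """Random-effects pooling with the DerSimonian-Laird tau^2 estimator."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))   # Cochran's Q
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0
    return {"pooled": pooled, "se": se, "q": q, "tau2": tau2, "i2": i2,
            "ci_low": pooled - 1.96 * se, "ci_high": pooled + 1.96 * se}


# Hypothetical (mean, SD, n) for CA versus CF groups in three studies.
studies = [((0.6, 0.3, 20), (1.2, 0.4, 20)),
           ((0.9, 0.5, 35), (1.0, 0.5, 36)),
           ((0.4, 0.2, 15), (1.5, 0.6, 15))]
g_and_var = [hedges_g(*ca, *cf) for ca, cf in studies]
result = random_effects_dl([g for g, _ in g_and_var], [v for _, v in g_and_var])
print(result)
```

With the CA group entered first, a negative SMD corresponds to less plaque with CA, matching the direction used in the forest plots.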

Risk of bias across the studies

A contour-enhanced funnel plot was planned to assess publication bias if at least 10 studies were included in the meta-analysis.

Quality of evidence

The GRADE approach was used to rate the certainty of evidence (high, moderate, low or very low) for the estimates derived from the meta-analysis, using GRADEpro GDT software (https://www.gradepro.org). The GRADE summary of findings was categorized by study design. Accordingly, two tables were created for plaque accumulation (RCT and NRSI). Two reviewers (S.R and N.S) independently assessed the confidence in effect estimates for the outcomes synthesized quantitatively using the following categories: risk of bias, inconsistency, indirectness, imprecision and publication bias.

Results

Study selection and characteristics

The results of the search and study selection are shown in Fig. 1. A total of 1862 reports were identified (1858 from the databases and registry and 4 from grey literature). After removal of 435 duplicates, 1427 reports underwent title and abstract screening. Of these, 1403 from the databases and 2 from grey literature were excluded, leaving 22 reports (20 from the databases and 2 from grey literature) eligible for full-text screening. Because the full texts of 2 database reports were not available, 20 reports (18 from the databases and 2 from grey literature) were screened. The full texts of these potentially eligible articles were assessed by the two reviewers (S.R and E.A), and 6 (5 from the databases and 1 from grey literature) were excluded for various reasons. The list of excluded studies, along with the reasons for exclusion, is summarized in online resource 1. Finally, 14 articles [12, 16, 17, 18, 19, 22, 25, 26, 27, 28, 29, 30, 31, 32] were included for qualitative analysis, of which 8 were suitable for meta-analysis.
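The counts in this paragraph can be checked with simple arithmetic. The sketch below only re-traces the figures quoted above; the variable names are illustrative.

```python
# Records identified, as reported in the text.
identified = {"databases_and_registry": 1858, "grey_literature": 4}
total_identified = sum(identified.values())

duplicates_removed = 435
screened = total_identified - duplicates_removed

# Excluded at title/abstract screening: 1403 database reports + 2 grey literature.
excluded_at_screening = 1403 + 2
sought_for_full_text = screened - excluded_at_screening

full_text_not_available = 2
assessed_in_full = sought_for_full_text - full_text_not_available

# Excluded after full-text assessment: 5 database reports + 1 grey literature.
excluded_after_full_text = 5 + 1
included = assessed_in_full - excluded_after_full_text

print(total_identified, screened, sought_for_full_text, assessed_in_full, included)
```

Each intermediate value agrees with the corresponding count reported in the text (1862, 1427, 22, 20 and 14).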

Fig. 1 The PRISMA 2020 flow diagram of article retrieval

The characteristics of the included studies are presented in Tables 3, 4 and 5. This systematic review included 5 RCTs [12, 17, 26, 27, 31] and 9 NRSIs [16, 18, 19, 22, 25, 28, 29, 30, 32]. Four of the included studies [12, 29, 30, 31] investigated the incidence of WSLs, eight studies [16, 17, 18, 22, 26, 27, 28, 32] investigated plaque accumulation as one of the primary outcomes, and two studies [19, 25] investigated SCB as the primary outcome and plaque accumulation as an additional outcome.

Table 3 Characteristics of included studies reporting incidence and severity of WSLs comparing CAs and CF orthodontic appliances
Table 4 Characteristics of included studies reporting Plaque accumulation using Plaque Index (PI) comparing CAs and CF orthodontic appliances
Table 5 Characteristics of included studies reporting Salivary Cariogenic Bacteria (SCB) comparing CAs and CF orthodontic appliances (Included for quantitative synthesis)

Of the 4 studies that assessed WSLs, two were RCTs [12, 31] and two were NRSIs [29, 30]; of the NRSIs, one followed a retrospective design [29] and the other a prospective design [30]. The methods used to assess incidence differed across the studies. Two studies [29, 31] used digital photographs, 1 study used Quantitative Light Fluorescence [12] and 1 study assessed WSLs by visual examination [30]. Among these, only 2 studies [12, 31] assessed the severity of WSLs in terms of surface area and 1 study [12] in terms of the depth of lesions.

Of the 10 studies that evaluated plaque accumulation (as a primary or additional outcome) using PI, three were RCTs [17, 26, 27] and the others [16, 18, 19, 22, 25, 28, 32] were observational studies. The indices used to measure the outcome and the time points of plaque quantification varied across the studies. Two studies [18, 26] used the PI of Silness and Loe, five studies [17, 22, 27, 28] used the Modified plaque index of Loe, one study [16] used the Turesky Modified Quigley Hein Plaque Index and two studies [19, 25] did not mention the index used. The two observational studies [19, 25] that investigated salivary caries-associated bacteria used the number (percentage) of subjects with Streptococcus mutans and Lactobacilli counts greater than 10^5 CFU/ml as the outcome measure. The outcomes were measured at different time points ranging from 1 to 18 months.

Risk of bias within studies

All the included studies had methodological limitations that contributed to bias. The overall risk of bias of all included RCTs [12, 17, 26, 27, 31] and three NRSIs [22, 29, 32] was graded as high, whereas that of the other five NRSIs [16, 18, 19, 25, 30] was graded as moderate. The risk of bias of one study [28] could not be judged as there was no information on missing data.

RCTs

The risk of bias assessment and the overall judgement are shown in Fig. 2. All included RCTs [12, 17, 26, 27, 31] were at high risk of bias in measurement of the outcome, mainly due to lack of blinding of assessors, and one study [27] was additionally graded as high risk of bias arising from the randomization process. Four studies [12, 26, 27, 31] showed some concerns regarding bias due to deviations from the intended intervention, and all studies showed some concerns regarding bias in selection of the reported result. All studies were at low risk of bias due to missing outcome data.

Fig. 2 Risk of bias summary outlining judgement of ROB items of randomized controlled trials using ROB 2

NRSIs

The risk of bias assessment and the overall judgement are shown in Fig. 3. Three of the included NRSIs [22, 29, 32] were graded as high risk of bias and five other studies [16, 18, 19, 25, 30] were graded as moderate risk due to confounding factors. All studies were at moderate risk of bias in domains 6–7 and at low risk of bias in domains 2–5.

Fig. 3 Risk of bias summary outlining judgement of ROB items of non-randomised studies of interventions using ROBINS-I

Results of individual studies and meta-analysis

a. Plaque accumulation

Among the 10 included studies [16, 17, 18, 19, 22, 25, 26, 27, 28, 32], eight [16, 19, 22, 25, 26, 27, 28, 32] reported less plaque accumulation and better oral hygiene maintenance with CA, and only two [17, 18] found no difference between the two groups. Eight studies [16, 17, 18, 19, 22, 26, 27, 32] were included in the meta-analysis, of which 3 were RCTs and 5 were NRSIs. The observed SMDs ranged from −3.92 to −0.12, with all estimates negative (100%), favouring CA. The estimated average SMD based on the random-effects model was −1.58 (95% CI: −2.57 to −0.58), which differed significantly from zero (z = −3.11, p = 0.002) and indicated less plaque accumulation with CA, as depicted in the forest plot (Fig. 4).

Fig. 4 Meta-analysis (random effects model): Forest plot comparing PI in patients with CA (Treatment) to those with CF orthodontic appliances (Control) (N: No. of samples, SD: Standard deviation, CI: Confidence interval)

According to the Q-test, the true outcomes appear to be heterogeneous (Q(7) = 102.76, p < 0.0001, tau² = 1.91, I² = 93.85%). Subgroup analysis based on study design (Fig. 5) revealed the same trend favouring CA, and heterogeneity was high in both subgroups (RCT: SMD −1.79, 95% C.I.: [−3.629, 0.043], I² = 93.6%; NRSI: SMD −1.457, 95% C.I.: [−2.766, −0.147], I² = 94.91%). Meta-regression analysis using duration as a continuous moderator yielded a residual I² of 94.57%, which still suggests high heterogeneity (online resource 3). In addition, duration did not influence the effect size (z = −0.09, p = 0.93). The 95% P.I. (online resource 2) was −5.181 to 2.025, which indicates that the estimate could be positive in future studies even though the average outcome is estimated to be negative.
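The reported test statistic and prediction interval can be approximately re-derived from the pooled SMD, its 95% C.I. and tau². The sketch below makes two assumptions: the C.I. is a normal-approximation interval, so the standard error can be back-calculated from its width, and the prediction interval takes the usual form of the pooled estimate ± t(k − 2) × sqrt(tau² + SE²). The t critical value for 6 degrees of freedom is hard-coded rather than computed.

```python
import math

# Values reported in the text for the plaque index meta-analysis.
pooled_smd = -1.58
ci_low, ci_high = -2.57, -0.58
tau2 = 1.91
k = 8                                   # studies in the meta-analysis

# Back-calculate the standard error from the 95% C.I. width (assumption).
se = (ci_high - ci_low) / (2 * 1.96)

# Wald z statistic and two-sided p-value from the standard normal distribution.
z = pooled_smd / se
p_value = math.erfc(abs(z) / math.sqrt(2))

# 97.5th percentile of Student's t with k - 2 = 6 degrees of freedom.
T_CRIT_DF6 = 2.447
half_width = T_CRIT_DF6 * math.sqrt(tau2 + se ** 2)
pi_low, pi_high = pooled_smd - half_width, pooled_smd + half_width

print(f"z = {z:.2f}, p = {p_value:.3f}, 95% P.I. = {pi_low:.2f} to {pi_high:.2f}")
```

The prediction interval is much wider than the confidence interval because it adds the between-study variance tau² to the uncertainty of the pooled mean, which is why it crosses zero although the C.I. does not.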

Fig. 5 Subgroup analysis of studies reporting plaque accumulation based on study design. CA (Treatment); CF orthodontic appliances (Control) (N: No. of samples, SD: Standard deviation, CI: Confidence interval, RCT: Randomized controlled trial, NRSI: Non-randomised studies of interventions)

Three separate forest plots showing the pooled effect size with 95% C.I. for the 3-, 6- and 12-month time-points are presented in Fig. 6a-c. Five studies each contributed data at 3 months and at 6 months, and three studies contributed data at 12 months.

Fig. 6 a-c: Separate meta-analyses (random effects model): Forest plots comparing PI in patients with CA (Treatment) to those with CF orthodontic appliances (Control) at 3 months, 6 months and 12 months (N: No. of samples, SD: Standard deviation, CI: Confidence interval)

At all three time points, the effect size favoured CA and was statistically significant (p < 0.05).

b. Salivary caries-associated bacteria

Both of the included studies reported a higher concentration of caries-associated bacteria with CF orthodontic appliances than with CA [19, 25]. Meta-analysis of the SCB outcome was not carried out because there were too few studies.

c. White spot lesions

Only four studies [12, 29, 30, 31] that assessed the incidence of WSLs were available. Three studies [12, 29, 30] reported a lower risk of developing WSLs with CA than with CF orthodontic appliances, whereas one study reported no difference in the incidence and severity of WSLs between CA and CF orthodontic appliances [31]. As each study adopted a different methodology, meta-analysis of the incidence and severity of WSLs was not possible.

Risk of bias across studies

Because few datasets assessing PI were included in the synthesis, funnel plots were not generated to assess publication bias.

Quality of evidence

The GRADE summary of findings (strength of evidence for interventions) substantiated the evidence for less plaque accumulation in the clear aligner patients (Table 6).

Table 6 Certainty of evidence - Based on the Grading of Recommendations Assessment, Development and Evaluation Approach – (GradePro GDT)

In view of the I² statistics for heterogeneity and the ROB 2/ROBINS-I assessments of risk of bias, the evidence for plaque accumulation was downgraded in the inconsistency and risk of bias domains. On the other hand, the evidence rating was upgraded for strong association, as the quantitative pooling of the results showed a large effect. The quality of evidence for plaque accumulation was graded as “moderate” for both RCTs and NRSIs.

Discussion

Adult orthodontic patients tend to prefer CA over CF appliances because CA satisfies their aesthetic demands and has been shown to have a positive impact on quality of life (QoL) [9]. Generally, QoL has been reported to decline during orthodontic treatment, and the type of orthodontic appliance is said to influence patients functionally and psychologically [33]. However, CAs have been shown to cause less physical and psychological disability than fixed appliances [33]. Based on the existing literature, an increase in the quantity of plaque and in caries-associated bacteria in saliva, a reduction in salivary pH and the resultant enamel demineralization are believed to be unwanted sequelae of orthodontic treatment that jeopardise the aesthetic outcome it offers. Individual studies have compared plaque accumulation, SCB levels, and the incidence and severity of WSLs in patients treated with CA and with CF orthodontic appliances [12, 16, 17, 19, 29]. One review was identified that compared clear aligners with fixed orthodontic appliances in terms of the 3 variables (WSLs, PI and SCB) [34]. That review lacked robust eligibility criteria: the intervention group included any orthodontic treatment with aligners, and the comparator group included fixed orthodontic treatment, other aligner treatment or removable appliances. Its comparator group included not only conventional fixed appliances but also self-ligating and lingual appliances, which differ among themselves in the quantity of plaque accumulation and SCB because of bracket design and placement. These differences among the types of fixed appliance would in turn be reflected in the incidence of WSLs. Hence, this review was conducted to synthesize explicit evidence of any possible link between the incidence and severity of WSLs, plaque accumulation and SCB, in an attempt to distinguish these parameters between CA and CF.

Only studies that compared CA with CF orthodontic appliances were included. Other types of fixed orthodontic appliance, such as self-ligating or lingual appliances, were excluded because of the controversies in the literature regarding the influence of bracket type on WSLs, the quantity of plaque accumulation and cariogenic microbial colonization [35,36,37,38,39,40,41,42]. PI and SCB levels (Streptococcus mutans, in particular) were considered appropriate outcome variables as they have been reported to be the best predictors of WSLs [43].

The qualitative assessment of the included studies indicated a comparatively lower incidence and severity of WSLs and lower SCB levels with CA than with CF orthodontic appliances, which could be attributed to fewer plaque retention sites and the ease of oral hygiene maintenance with CAs. In addition, plaque accumulation as assessed with PI was lower with CA than with CF orthodontic appliances. These findings are consistent with those of previous reviews [15, 34, 44]. However, those reviews included self-ligating and lingual appliances in addition to CF orthodontic appliances, and their inclusion was regarded as a reason for heterogeneity. Also, one of the systematic reviews made no attempt to explore heterogeneity, and many potentially eligible articles were missing from its meta-analysis (only 4 studies were included) even though the search was run until May 2021 [34].

Owing to the large variability in the methods of WSL evaluation and recording criteria (tooth-based incidence [12, 30] and patient-based incidence [29, 31]) adopted in the included studies, and to the scarcity of studies that assessed SCB, meta-analysis of these two outcomes was not feasible. However, quantitative analysis was conducted on the studies that assessed plaque accumulation. Only eight of the ten studies with PI as an outcome measure were considered for the analysis. One study was excluded from the analysis because of lack of data [28]. Of the two studies [19, 25] that assessed PI as a secondary outcome, only the one with the larger sample size [19] was included, because the study populations may have overlapped: both studies were conducted in a similar setting and published in the same year by the same authors. Plaque accumulation in the included studies was assessed using different plaque indices and different types and numbers of teeth. Therefore, it was considered prudent to adopt the SMD as the summary measure for PI instead of the mean difference used in the previous review [44]. The findings of this review indicated that CA was associated with less plaque accumulation, fewer salivary caries-associated bacteria and a reduced incidence and severity of white spot lesions compared with CF orthodontic appliances.

The primary meta-analysis included all eligible studies irrespective of variation in the duration of PI assessment, and the influence of duration was examined as a continuous covariate in the meta-regression model. This allowed us to pool the maximum number of studies, thereby increasing the power of the meta-analysis. It is important to emphasize that only one dataset (the dataset for the maximum time-point) was taken from each of the included studies to avoid dependency in the data, which was not considered in the previous review [44]. Our meta-regression analysis revealed no relationship between plaque accumulation and follow-up duration. In addition, separate meta-analyses at 3 time points (3, 6 and 12 months) did not change the direction of the results. Furthermore, subgroup analysis by study design (RCT versus NRSI) did not identify the source of heterogeneity. The 95% prediction interval (ranged from − 3.89 to 1.05) revealed that although the average value favoured CA, there may be no effect, or the true effect may lie in the opposite direction.

It is crucial to emphasize that this review was based largely on studies assessed as being at high risk of bias. Lack of blinding of assessors (not possible due to the nature of the interventions), failure to incorporate a random element in generating the allocation sequence, selective reporting, and failure to consider or adjust for confounding factors (age, gender, type of malocclusion, oral hygiene status) were among the reasons for this assessment. As all the included studies were at high risk of bias, stratified analysis based on ROB was not possible. Therefore, all available data were included in the meta-analyses as suggested by the Cochrane Handbook [45].

Assessment of publication bias was not feasible due to the scarcity of studies that assessed plaque accumulation. Only one study [31] out of 4 reports from the grey literature was included. Although this number is limited, it adds to the strength of this review: it indicates that the possibility of having missed relevant studies is minimal and that the number of papers published in non-indexed journals is limited.

Limitations

The study-level limitations include the high risk of bias among the included studies and the clinical and/or methodological heterogeneity across the studies, which affected the certainty and generalizability of the evidence. Furthermore, the inability to access all eligible studies (non-English reports and unavailable full texts) and the scarcity of primary studies that investigated WSLs and SCB could be regarded as limitations at the review level.

Conclusion

  • Based on moderate quality evidence, CA is associated with less plaque accumulation than CF orthodontic appliances. In addition, salivary caries-associated bacteria were found to be less with CA which may be related to the reduced incidence and severity of WSLs in CA as opposed to CF orthodontic appliances.

  • Future research should aim to conduct high-quality RCTs to detect the direct association between WSLs, plaque accumulation and SCB, following standardized protocols in terms of study design (randomization), selection of subjects, and method of evaluation of WSLs and SCB (quantitative, with appropriate follow-up time points). To ensure generalizability of the results, a multi-centre study would be preferable. Additionally, RCTs employing a pre-post design (before commencement and immediately after completion of orthodontic treatment) would minimize the risk of bias due to lack of blinding.