Background

When clear aligners were first introduced in orthodontics, they were intended for mild crowding. With advances in materials and in the computer-aided design of tooth movement, their indications have broadened to moderate and severe malocclusions, in both extraction and non-extraction cases [1].

However, in extraction cases treated with clear aligners, molar anchorage control and central incisor retraction were not achieved as predicted. Bodily movement is still difficult to obtain with this technique, although elastic chains and other auxiliaries can improve it [2, 3].

Clear aligners have improved patients’ esthetics, comfort, and hygiene. On the other hand, they have limitations in controlling certain tooth movements [4].

Since these devices were introduced as a therapeutic option for orthodontic treatment, orthodontists have been investigating the effectiveness of treatment and the accuracy of tooth movements by comparing the predicted result with the achieved result on digital models. Aligners offer the advantage of prospective three-dimensional digital planning, but the predictability of some dental movements is limited [3,4,5,6,7,8,9].

For treatments with aligners to be effective, there must be a correspondence between the planned and actual results. All the studies on this topic agree that in most cases, there are some discrepancies between the digital setup and the clinical result [5, 9].

According to Buschang, virtual models do not accurately reflect the patients’ final occlusion immediately at the end of active treatment: they overestimated alignment, rotations with occlusal contacts, and occlusal relations [10].

As reported in a recent systematic review [11], additional refinements are needed in almost all cases to achieve the prescribed setup results. In other words, the proportion of patients requiring an unplanned correction, or even resorting to fixed orthodontics, to obtain the results indicated in the virtual model is close to 70–80% [5].

Crowding can be resolved with clear aligners by proclining the teeth, expanding the arch, performing interproximal enamel reduction (IPR), or, in the most severe cases, extracting teeth, that is, with the same approaches used in conventional orthodontic treatment [10, 11]. However, pure translation movements in extraction cases are less predictable with aligners [2].

In 2012, Krieger et al. [12] assessed arch length, intercuspid distance, overbite, overjet and midlines by comparing initial and final casts, which were measured with a clinical caliper. They provided a general conclusion that clear aligner treatment effectively resolved anterior crowding by incisor proclination, but overbite correction was difficult to achieve [13].

However, the expansion efficiency decreased from the canine to the first molar [14,15,16], while the amount of enamel removed in vivo did not correspond with the amount of IPR planned [17, 18].

Knowing how accurately the software predicts tooth movements could help clinicians to overcorrect where necessary or to stage the movement in smaller increments, thereby reducing refinements, mid-course corrections, and treatment time. Orthodontists must therefore be fully aware of the features and weaknesses of this device in order to choose the correct indications and obtain good treatment results [19,20,21].

The current study aims to assess the predictability of crowding resolution, the efficacy of the different strategies to gain space during clear aligner treatment, and their correlation, in order to provide a suitable protocol to achieve predictable results.

Materials and methods

This retrospective observational study was approved by the Institutional Review Board of the University of Campania Luigi Vanvitelli (No. 308 20/512019).

Subject recruitment

The sample size was estimated based on preliminary data [17]. A minimum sample of 39 subjects was needed to achieve 80% power, with an alpha of 5% to detect a 0.5 mm difference (SD 0.5 mm).

The sample was obtained from 10 orthodontists selected using the Doctor Locator (DL) with the method and inclusion criteria described in the previous paper [17]:

  1. at least 8 years of experience in clear aligner treatment (CAT);
  2. manual execution of IPR;
  3. at least 100 patients treated with clear aligners (CA) in the last year.

Ten Italian ZIP codes were randomly drawn and entered into the DL application. The first ten doctors on the resulting provider lists for those areas who agreed to participate were included.

For each doctor, the last four consecutive patients treated with clear aligners were selected according to the following inclusion criteria:

  1. adult patients with full permanent dentition;
  2. non-extraction orthodontic treatment in both arches;
  3. use of composite attachments;
  4. manual IPR digitally planned (between 0.1 mm and 0.5 mm per tooth);
  5. virtual digital planning with Clincheck (Invisalign; Align Technology, Santa Clara, Calif);
  6. digital models at the beginning of treatment and after the first aligner sequence;
  7. good compliance with aligner;
  8. treatment started in 2014 or later (after the introduction of the SmartTrack material).

Exclusion criteria were as follows:

  1. interruption of aligner sequence;
  2. poor compliance;
  3. multiple and/or advanced caries;
  4. supernumerary teeth;
  5. cleft lip and/or palate.

Each orthodontist placed the attachments without restrictions, according to personal preference. Aligners were changed on average every 10 days.

For each patient, we evaluated the predictability of the three different strategies to gain space:

  1. transversal arch expansion;
  2. arch length;
  3. interproximal enamel reduction.

3D casts and measurement protocol

Records were collected from the Invisalign Doctor Web Site. Digital models were exported as STL (Standard Triangulation Language) files at the beginning of treatment (T0), at the end of the virtual digital planning (vT1), and at the end of the corresponding aligner sequence (T1), and were uploaded into the OrthoAnalyzer program (version 1.7.1.4; 3Shape, Copenhagen, Denmark).

One operator (AF) identified on each tooth, from first molar to first molar, all the landmarks listed in Table 1 that were needed to carry out the measurements (Fig. 1).

Table 1 Landmarks
Fig. 1 Measurements: upper and lower arch widths, arch length, and interproximal enamel reduction. A–C Intercuspid width: linear distance between inner lingual points; interpremolar width: linear distance between inner lingual points; intermolar width: linear distance between inner lingual points; D Arch length: the perpendicular to a line connecting the contact point passing through the buccal point; E Interproximal enamel reduction: the difference between mesiodistal diameter at T0 and T1

The arch measurements were performed in a similar way to those proposed by Raucci [22], to evaluate the changes between the three time points (T0, vT1, T1) (Fig. 1).

Then, to evaluate the IPR actually performed in each arch, the mesiodistal tooth dimensions were measured from second premolar to second premolar before and after IPR (Fig. 1E). The full-arch amount of IPR performed was obtained as the difference between the sum of the mesiodistal tooth diameters before and after treatment [17].

This value was then compared with the planned IPR shown in the virtual digital planning.
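As an illustration of this comparison, the following Python sketch computes the IPR performed in one arch as the difference between the summed mesiodistal diameters at T0 and T1, and then relates it to the planned amount. The function names and the tooth widths in the usage note are hypothetical and are not taken from the study data.

```python
def ipr_performed(widths_t0_mm, widths_t1_mm):
    """Full-arch IPR performed (mm): sum of mesiodistal widths at T0 minus sum at T1."""
    if len(widths_t0_mm) != len(widths_t1_mm):
        raise ValueError("T0 and T1 must list the same teeth")
    return sum(widths_t0_mm) - sum(widths_t1_mm)


def ipr_accuracy(performed_mm, planned_mm):
    """Ratio of performed to planned IPR (1.0 means the full prescription was carried out)."""
    if planned_mm <= 0:
        raise ValueError("planned IPR must be positive")
    return performed_mm / planned_mm
```

For two hypothetical teeth measuring 7.0 mm each at T0 and 6.9 mm and 6.8 mm at T1, `ipr_performed([7.0, 7.0], [6.9, 6.8])` gives about 0.3 mm; against a planned 0.6 mm, `ipr_accuracy` returns about 0.5, that is, half of the prescription.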

Dental crowding was quantified by Little’s Irregularity Index [23].
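Little's Irregularity Index is commonly described as the sum of the five linear displacements between adjacent anatomic contact points of the six anterior teeth. A minimal Python sketch under that assumption follows; the function name and the sample displacements are hypothetical.

```python
def irregularity_index(displacements_mm):
    """Sum of the five contact-point displacements (mm) between the six anterior teeth."""
    if len(displacements_mm) != 5:
        raise ValueError("expected five contact-point displacements (canine to canine)")
    if any(d < 0 for d in displacements_mm):
        raise ValueError("displacements are distances and cannot be negative")
    return sum(displacements_mm)
```

For hypothetical displacements of 1.0, 2.0, 0.5, 1.5, and 2.0 mm, the index is 7.0 mm.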

The following formula was used to quantify the predictability of each measurement:

predictability = [real post-treatment (T1) − initial pre-treatment (T0)] / [ideal post-treatment (vT1) − initial pre-treatment (T0)]

Thus, an index of predictability was obtained for each strategy: the closer the value is to 1 (100% of the prescription), the more precisely the aligner achieved the planned change.
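The predictability ratio can be sketched in a few lines of Python. The function name and the example widths are hypothetical; the guard against a zero planned change is an added assumption, since the ratio is undefined when no change was prescribed.

```python
def predictability(t0, vt1, t1):
    """Achieved change (T1 - T0) divided by planned change (vT1 - T0)."""
    planned_change = vt1 - t0
    if planned_change == 0:
        raise ValueError("no change was planned, so the ratio is undefined")
    achieved_change = t1 - t0
    return achieved_change / planned_change
```

For a hypothetical arch width of 24.0 mm at T0, 26.0 mm planned at vT1, and 25.2 mm measured at T1, `predictability(24.0, 26.0, 25.2)` evaluates to approximately 0.6, meaning that about 60% of the prescribed expansion was achieved. The ratio also works for measurements that are planned to decrease, because numerator and denominator then share the same sign.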

To evaluate the correlation between each space-gaining strategy and crowding resolution, treatment outcomes were assessed as the change in each measurement between the start (T0) and the end of treatment (T1) (delta, Δ).

Statistical analysis

The statistical analysis was performed with jamovi 2.3.2 statistical software (The jamovi project, Sydney, Australia). Descriptive statistics were calculated for each measurement. Because the data were normally distributed, a paired t test was used to determine whether there was a statistically significant difference between vT1 and T1 (p < 0.05).

To assess reliability, the parameters were remeasured by the same examiner 1 month after the initial measurements. Intraclass correlation coefficients showed moderate intraobserver reliability, with a Cronbach's alpha of 0.723 for linear measurements.
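How the reliability statistic was configured in the software is not stated. As a hedged illustration only, the sketch below applies the standard Cronbach's alpha formula, alpha = k/(k − 1) × (1 − sum of session variances / variance of the per-item totals), treating the two measurement sessions as the k items. The function name and data layout are assumptions.

```python
from statistics import variance


def cronbach_alpha(sessions):
    """Cronbach's alpha for k measurement sessions, each a list of the same n measurements."""
    k = len(sessions)
    if k < 2:
        raise ValueError("at least two sessions are required")
    n = len(sessions[0])
    if n < 2 or any(len(s) != n for s in sessions):
        raise ValueError("each session needs the same number (>= 2) of measurements")
    session_variance_sum = sum(variance(s) for s in sessions)
    # Total score per measured item, summed across sessions
    totals = [sum(values) for values in zip(*sessions)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("totals have zero variance; alpha is undefined")
    return k / (k - 1) * (1 - session_variance_sum / total_variance)
```

Two identical sessions give an alpha of 1; the more the repeated measurements disagree, the lower the value.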

In the presence of normally distributed data, a paired t test was used to compare the T1-T0 changes and the vT1-T0 changes. The level of significance was set at 5%.
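Because (T1 − T0) − (vT1 − T0) = T1 − vT1, comparing achieved and planned changes with a paired t test is equivalent to comparing T1 with vT1 directly. The sketch below computes only the t statistic and its degrees of freedom with the Python standard library; the p value would come from a t distribution table or a statistics package. The function name is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(achieved, planned):
    """Paired t statistic and degrees of freedom for two matched samples."""
    if len(achieved) != len(planned) or len(achieved) < 2:
        raise ValueError("need two samples of equal length with at least two pairs")
    diffs = [a - p for a, p in zip(achieved, planned)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    t_stat = mean(diffs) / (sd / sqrt(n))
    return t_stat, n - 1
```

A negative t statistic here would indicate that, on average, the achieved change was smaller than the planned one.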

A Pearson's product-moment correlation was run to assess the correlation between the change in the Irregularity Index (before and after treatment) and the outcomes of the strategies to gain space.
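For reference, the Pearson coefficient between two series of per-patient changes (for example, Δ arch width and Δ Irregularity Index) can be computed as in the sketch below. The function name is hypothetical and the implementation uses only the standard library.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length with at least two values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy / sqrt(sxx * syy)
```

A negative coefficient means that one quantity tends to decrease as the other increases, which is the pattern expected if crowding falls as arch width grows.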

Results

A total of 40 adult Caucasian subjects (18 men, 22 women; mean age 34 ± 7 years), for a total of 80 dental arches, were included.

All patients completed treatment in keeping with the research protocol. Patient compliance forms were collected at the end of treatment; all patients reported wearing their aligners for 21 to 23 h per day. For all measurements, statistically significant differences were found between the various stages, as reported in Table 2.

Table 2 Comparison between T0, vT1, and T1 in upper arch

Preliminary analyses showed the relationship to be linear, with both variables normally distributed, as assessed by the Shapiro–Wilk test (p > 0.05).

The predictability of crowding resolution, measured as the decrease in the Irregularity Index, was very high: 87% (p < 0.04) in the upper arch and 81% (p < 0.03) in the lower one.

Conversely, changes in arch diameters were less predictable, varying between 59 and 83% in the upper arch and between 49 and 67% in the lower one (Tables 2, 3).

In the upper arch, the most accurate prediction was for the first molar, with a predictability of 83% (p < 0.00), while the least accurate was for the canine, at 59% (p < 0.03) (Table 2).

In the lower arch, the most accurate prediction was for the second premolar, with a predictability of 68% (p < 0.09), while the least accurate was for the canine, at 49% (p < 0.03) (Table 3).

Table 3 Comparison between T0, vT1, and T1 in lower arch

Variations in sagittal incisor position were predictable, with a value of 68% in the upper arch (p < 0.04) and 69% in the lower one (p < 0.02) (Table 4).

Table 4 Summary of the variables' predictabilities

Lastly, IPR was the least accurate procedure, with a predictability of 49% in the upper arch and 42% in the lower arch.

In the lower arch, a Pearson's product-moment correlation was run to assess the relationship between the different strategies to gain space and the Irregularity Index. There was a statistically significant, high inverse correlation between arch width (Δ 3-3) and Δ Irregularity Index (r = −0.41; p < 0.05) (Fig. 2). Moreover, there was a statistically significant, moderate inverse correlation between arch width (Δ 4-4) and Δ Irregularity Index (r = −0.35; p < 0.05) (Fig. 3).

Fig. 2 Lower arch: linear correlation between Δ irregularity index and Δ 3-3

Fig. 3 Lower arch: linear correlation between Δ irregularity index and Δ 4-4

There was no statistically significant correlation between the other strategies and the Irregularity Index (Table 5).

Table 5 Correlation between space gain strategy and crowding in the lower arch

Moreover, in the upper arch, there was a statistically significant, high inverse correlation between arch width (Δ 3-3) and Δ Irregularity Index (r = −0.45; p < 0.05) (Fig. 4).

Fig. 4 Upper arch: linear correlation between Δ irregularity index and Δ 4-4

There was no statistically significant correlation between the other strategies and the Irregularity Index (Table 6).

Table 6 Correlation between space gain strategy and crowding in the upper arch

Discussion

The aim of this investigation was to assess the predictability of the crowding resolution, the efficacy of the different strategies to gain space, and their correlation to provide a suitable protocol to achieve predictable results.

The sample in this study had a mean Irregularity Index of 7.26 mm in the upper arch and 8.13 mm in the lower one. These data indicate that the recruited patients had severe crowding in both arches, which suggests that orthodontists are confident in treating this problem with clear aligners.

At the end of digital planning (vT1), a residual crowding of 1.83 mm in the upper arch and 1.40 mm in the lower arch was still present. It can therefore be deduced that planning errors are often present and could affect the final outcomes.

At the end of the first set of aligners (T1), the crowding was statistically significantly reduced, but an Irregularity Index of 2.81 mm in the upper arch and 2.66 mm in the lower arch was still found. These data show a high value of predictability (87% in the upper arch and 81% in the lower one), confirming that clear aligner treatment is effective in reducing crowding [24]. Unfortunately, a single plan, even if very effective, is not enough to resolve the crowding entirely.

Several studies of clear aligner patients have compared the treatment outcomes that needed refinement with the digital planning to evaluate the effectiveness of tooth movement. The results of these studies were similar to ours [12, 15, 23].

Therefore, additional aligners should always be considered to resolve the residual crowding.

The predictability of crowding resolution with clear aligners is a multifactorial issue. Some factors are related to the aligners, such as the protocol used, the features and thickness of the aligner material, the planning software, and the space-gaining strategies planned.

Others depend on the operator, such as the clinician's experience in case selection [19], the accuracy with which the space-gaining strategies are planned, and the accuracy of the IPR procedures [17].

As this study shows, the individual strategies to gain space reached different levels of predictability, although together they all contribute to resolving crowding. Our results showed large variability in these data.

For example, the predictability of transverse changes in the upper arch ranged between 59 and 83%, depending on the diameter considered.

This indicates that aligners do not fully achieve the planned expansion, as already reported in the literature [20, 21]. The least accurate transverse diameter changes, as also reported in previous studies [15, 16, 20, 25], were for the canine and the first premolar, with predictabilities of 59% and 60%, while the most accurate was for the first molar, with a predictability of 83%. This could be related to the different extension of the lingual surface and the anatomy of each tooth, which affect the retention and fit of the clear aligner. Indeed, the slender, conical shape of the canine may hinder control of the planned expansion movement.

In the mandibular arch, predictability is even lower, with the lowest value again found for the canine diameter (49%). Mandibular canines have the longest root, a crown shape with few undercuts, and a small lingual surface due to their usually more vestibular position; this reduces the retention of the aligner and its ability to push the tooth buccally.

The data regarding the sagittal incisor position (arch length) indicate that, on average, the virtual digital planning (vT1) aimed to control the arch length [12, 26].

Instead, at T1, the slight lingual movement of the incisors, through coronal tipping, was predictable, with a value of about 70% in both the upper and lower arch.

Kravitz also reported [27] that aligners are more accurate in retracting the incisors than in moving them labially.

Obviously, a decreased or stable arch length is associated with posterior expansion of the dental arch, and it might be due to the maxillary arch form [28].

These results highlight that, with clear aligners, it is possible to plan crowding resolution without buccal tipping, thus avoiding the side effects of lower incisor proclination that are often seen with fixed appliances.

Clinicians should keep these outcomes in mind and plan crowding resolution with less canine expansion and incisor proclination, relying primarily on IPR.

Regarding IPR, after the aligner sequence, less than half of the planned amount was performed in the upper arch (0.55 mm), while in the lower arch a larger amount was performed (0.82 mm). These data put the accuracy of IPR at 49% in the upper arch and 42% in the lower one, values that confirm the results of the previous study [17].

These data indicate that in these patients, less than half of the IPR planned with the software at the start of treatment was actually performed.

The predictability of IPR is a multifactorial issue. The amount of enamel removed depends on several factors. Some are related to characteristics of the tooth, such as the hardness of the enamel, the anatomy of the crown, or the position of the tooth itself. Others are related to the operator, such as experience, the technique used for IPR, and the pressure exerted during the procedure [17].

IPR is therefore the least accurate strategy to gain space, and it needs to be performed more precisely. Doing so could improve the predictability of crowding resolution and bring it closer to the planned values.

Moreover, the correlation between the Irregularity Index and the strategies to gain space was not high; however, all the strategies contribute to creating the space needed to resolve the crowding.

In both arches, a high inverse statistically significant correlation was found between intercuspid diameters (Δ 3-3) and Δ Irregularity Index. This result indicates that crowding decreases as 3-3 diameter increases.

Incisor proclination (Δ arch length) was not correlated with Δ Irregularity Index, because the digital planning did not include proclination to gain space.

Moreover, the correlation between Δ IPR and Δ Irregularity Index was positive, although not statistically significant. Indeed, crowding resolution improved as the amount of enamel reduction increased, although IPR has rather low efficacy rates. The results of this study showed that digital planning was not a predictor of the final tooth position; the virtual tooth position may not be the final tooth position achieved. It is therefore important to know the limits of digital planning so that tooth movements can be overcorrected when necessary, decreasing the need for refinement [12]. Knowing the strengths and weaknesses of clear aligners will help the clinician in selecting the best orthodontic appliance to treat a specific malocclusion.

Therefore, orthodontists play an important role in modifying the virtual digital plan with their clinical experience, programming the movements at the proper steps, and adding the features to improve the predictability of the tooth movement with the aligners [29].

To limit the risk of selection bias arising from the retrospective nature of the study, patients were selected consecutively for each orthodontist. This study included only adult patients to avoid bias due to normal transverse growth of the jaws and because they currently represent most of the patients who request orthodontic treatment with clear aligners. Moreover, these patients generally show better compliance than adolescents, thus reducing a possible source of bias [9, 17].

However, retrospective studies do not allow patient cooperation to be estimated precisely. Thus, to verify patient cooperation, the charts were reviewed at the end of treatment to check whether the aligners had been changed at regular intervals, but this may have been influenced by patient statements.

Future studies should add other records, such as lateral cephalometric radiographs or volumetric three-dimensional cone-beam imaging, to assess the predictability of the different strategies to gain space. Such studies will allow for the evaluation of posterior tooth movement and address questions regarding root movement with clear aligners. Prospective randomized clinical trials would provide greater certainty regarding the predictability of treatment with clear aligners.

Conclusions

This study showed that the predictability of crowding resolution with clear aligners is 87% in the upper arch and 81% in the lower one.

The predictability of transverse arch expansion ranges between 59 and 83% in the upper arch and between 49 and 67% in the lower one, decreasing from the molars to the canines in both arches.

Sagittal arch length is less predictable, with similar values in both arches.

IPR is the least accurate strategy to gain space.

A correlation between the three space-gaining strategies and crowding was found only for the increase in transverse arch diameters. Specifically, the changes in 3-3 diameters were correlated with crowding resolution in both the upper and lower arch, whereas 4-4 only in the upper one.