Background

Changes in tooth color may be caused by several factors, for instance, by extrinsic (external) and intrinsic (internal) discolorations, or by aging [1]. Further causes of color changes are dental treatments, including bleaching or restorative therapy [2]. In addition, tooth color can be changed by the acid-etching process used for bonding orthodontic brackets [3]. Two main causes have been reported: the formation of white spots and the irreversible penetration of resin tags that remain in the enamel [4,5,6,7]. Therefore, multibracket appliance (MBA) treatment may be associated with enamel discoloration due to changes in the enamel caused by tooth cleaning, enamel conditioning procedures (etching), and the debonding and subsequent polishing processes [8, 9].

Whether tooth color changes are associated with the bonding and debonding procedures of multibracket appliance (MBA) treatment is controversial. Some studies [4, 10, 11] have shown that enamel color variables were significantly affected by bonding and debonding procedures, whereas other investigations [3, 12,13,14] did not find a clinically important influence of these procedures on enamel discoloration.

The purpose of this in vivo study was to investigate how tooth color is affected by multibracket appliance (MBA) treatment, especially whether: (1) the change in tooth color during MBA treatment is clinically important; (2) the color change differs by bracket (body) and non-bracket (gingival) tooth segments; and (3) the change is substantially the same for the conventionally used 2D system and the scientifically favorable 3D system.

Methods

Subjects and clinical examination procedure

All subjects expecting MBA treatment were regular patients of the orthodontic department and participated on a voluntary basis. All measurements were performed during regular visits. All procedures performed in this study were in accordance with the ethical standards of the institutional research committee Ärztekammer Mecklenburg-Vorpommern (Reg. Nr.III UV 15/08). Informed consent was obtained from the patients and parents before the start of the study. Initially, 26 patients were included. The inclusion criteria were good oral hygiene, non-carious and restoration-free permanent teeth, and no white spots. The multibracket appliances had been present in situ for 2.0 (SD ± 0.3) years (individual study period of each patient). The entire period of study data collection lasted from 2005 to 2009. Time points of measurements were the start of MBA treatment (baseline - T0), the end of MBA treatment (2 years SD ± 0.3 - T1), and 3 months after the end of MBA treatment (T2) (Fig. 1). The complete clinical procedure was performed by an experienced orthodontist under standardized conditions (color-neutral setting: same room and light conditions, patient covered by a drape, tooth surfaces always saliva-wet) according to the standardized bonding protocol of the orthodontic department. Enamel was etched with 35% orthophosphoric acid (Scotchbond, 3M Unitek) for 10 s, rinsed with air-water spray for 20 s and dried for 10 s. Transbond XT™ Light Cure Primer (3M Unitek) was used in conjunction with Transbond XT™ Light Cure Adhesive (3M Unitek) according to the manufacturer’s instructions for bonding Mini-Mono – .022 Roth Technique Stainless Steel Brackets (Forestadent, Germany). The bracket was then pressed firmly onto the enamel surface and the excess adhesive resin was removed with a probe. Light curing was performed with the LED source Starlight Pro (Mectron, Germany) for 10 s.
For study purposes, the protocol was slightly modified by the additional advice “avoiding etching of the gingival segment”. Each tooth was divided into the gingival (S1), body (S2), and incisal (S3) segments (Fig. 2). To standardize the measurements, the middle segment S2 was measured at the facial axis point (FA point) used for placing the bracket, determined with a Dental Bracket Placement Gauge according to the MBT™ technique; for the gingival segment S1, the tip of the measuring probe was placed perpendicularly 1 mm above the midpoint of the gingival line of the corresponding tooth (Fig. 3). The probe was moved slightly around the defined measurement points, automatically taking four measurements and returning an overall value at the end. The incisal segment S3 was not included in the analysis because of its transparency. All measurements were performed by an examiner calibrated in a pilot study [15].

Fig. 1 CONSORT flow diagram

Fig. 2 Measuring report by Shade Inspector™

Fig. 3 Shade Inspector™ - Measurement of the gingival segment (S1)

During the entire study period, 11 patients were lost. Reasons for dropout were poor oral hygiene leading to discontinuation of fixed orthodontic treatment, relocation, repeatedly missed appointments, and withdrawal of informed consent.

Electronic color measurement

Tooth color was measured electronically with the spectrophotometer Shade Inspector™ (Schuetz Dental, Rosbach, Germany; no longer available). The tooth color measuring device operates independently of light on the principle of spectral photometry. For color determination, the color data of the test specimen are compared with manufacturer-furnished color rings. The tested spectrophotometer is calibrated by the manufacturer (Schuetz Dental, Rosbach, Germany) with a factory-provided selection of the industrially fabricated color reference scales VITAPAN Classical® and VITA 3D-Master®. In the present study, the color references VITAPAN Classical® and VITA SYSTEM 3D-Master® were selected from the device software. The VITAPAN® Classical Color System has a two-dimensional structure that enables the description of hue (category A to D) and lightness including chroma (group 1 to 4) [16, 17]. It serves as the standard shade guide for visual color assessment in dental practice. The VITA 3D-Master® Color System has a three-dimensional structure that enables the separate description of lightness (1 to 5 and 0 for bleaching), chroma (1 to 3, including half points), and hue (M, L, R) [18]. It was developed to obtain a method for systematic and ordered color determination and a better hit rate. The examiners were provided with device operating instructions to ensure observance of the manufacturer’s specifications and were calibrated in a pilot study [15]. Within a measurement range of 1 mm diameter, the probe measures 26 standard colors and three bleaching colors from the VITA 3D-Master® color ring as well as 16 standard colors and 48 intermediate colors (calculated) from the VITA Classical® color ring. The measuring probe was protected by a detachable hygiene cap. During the measurements the probe was placed perpendicular to the tooth surface (Fig. 3).

Statistical methods

As the 3D-system (VITA 3D-Master) is “a more ordered shade guide” than the 2D-system (VITAPAN® Classical) [16], we considered the 3D-system as the primary outcome [19, 20].

Besides lightness and chroma, we analyzed color distributions in terms of L* (CIE lightness) and C*ab (CIE chroma) after having assigned VITA 3D-Master® shades to the values given in Table 1 in Ahn et al. [21] via data analysis syntax. In addition to L*, values for a* and b* were calculated from the values of C*ab and h (in degrees) given in Ahn et al. and were then used to calculate ΔE, defined as ΔE = √[(ΔL*)² + (Δa*)² + (Δb*)²] [22]. For example, the change from 1 M2 to 2 L2.5 was calculated in two steps. First, the a* and b* values were calculated: a*(1 M2) = 8.7 × cos(89.4 × 2π/360) = 0.09 and b*(1 M2) = 8.7 × sin(89.4 × 2π/360) = 8.70. Then ΔE = √[(65.0 − 61.3)² + (0.09 − 0.82)² + (8.70 − 13.5)²] = 6.1 was calculated, which can also be found in Table III in Ahn et al. [21]. Because ΔE is restricted to non-negative values, we additionally computed the distance of each shade to 0 M1, denoted by d(0 M1). A positive change in d(0 M1) indicates a darker or stronger color; a negative change indicates a lighter or purer color.
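
The two-step calculation in the example can be retraced with a short script. This is only an illustrative sketch, not the data analysis syntax used in the study; the function names are hypothetical, and the numeric inputs are the values quoted in the example above.

```python
import math


def lab_from_lch(lightness, chroma, hue_deg):
    """Convert CIE L*, C*ab and hue angle h (degrees) to (L*, a*, b*)."""
    hue_rad = math.radians(hue_deg)
    return (lightness, chroma * math.cos(hue_rad), chroma * math.sin(hue_rad))


def delta_e(lab1, lab2):
    """Euclidean distance between two colors in L*a*b* space."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab1, lab2)))


# 1 M2 as quoted in the text: L* = 65.0, C*ab = 8.7, h = 89.4 degrees.
shade_1m2 = lab_from_lch(65.0, 8.7, 89.4)
# 2 L2.5 as quoted in the text: L* = 61.3, a* = 0.82, b* = 13.5.
shade_2l25 = (61.3, 0.82, 13.5)

print("a*, b* of 1 M2:", round(shade_1m2[1], 2), round(shade_1m2[2], 2))
print("Delta E 1 M2 -> 2 L2.5:", round(delta_e(shade_1m2, shade_2l25), 1))
```

Rounded to one decimal, the script gives the ΔE of 6.1 stated in the example.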

Table 1 Description of color distributions on tooth level (n = 120)

In the 2D-system, the shade group B is ordered by C*ab (CIE chroma) but not by L* (CIE lightness), for which B2 > B1 > B4 > B3 [16]. Therefore, we analyzed color distributions only in terms of L*, C*ab, and ΔE after having assigned VITAPAN® Classical shades to the values given in the D65 columns of Table I in Park et al. [16], as described for the 3D system. Because the second shade designation numbers of the 2D-system were assessed on a five-point scale at quarter points, extrapolation to five and interpolation to quarter points were applied.
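
The interpolation to quarter points can be illustrated with a small sketch. It assumes linear interpolation between neighboring shades, which the text does not state explicitly, and it uses the L* and C*ab values for B2 and B3 reported in the Results; the function and variable names are hypothetical.

```python
def interpolate(start, end, fraction):
    """Linear interpolation between two values (assumed scheme)."""
    return start + fraction * (end - start)


# Endpoints for B2 and B3 as reported in the Results of this paper.
L_B2, L_B3 = 61.0, 55.8   # CIE L*
C_B2, C_B3 = 9.8, 14.9    # CIE C*ab

# Keys are "B2", "B2.25", "B2.5", "B2.75", "B3"; values are (L*, C*ab).
quarter_points = {
    f"B{2 + f:g}": (interpolate(L_B2, L_B3, f), interpolate(C_B2, C_B3, f))
    for f in (0.0, 0.25, 0.5, 0.75, 1.0)
}

for shade, (lightness, chroma) in quarter_points.items():
    print(f"{shade}: L* = {lightness:.2f}, C*ab = {chroma:.3f}")
```

Under this assumption the interpolated values agree, up to rounding, with those listed for B2.25, B2.5, and B2.75 in the Results.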

As the American Statistical Association [23] recommends avoiding over-reliance on p-values, we estimated and interpreted confidence intervals [24]. Treatment effects were corrected for tooth level and subject level by using multilevel modeling [25], and adjusted for tooth type and quadrant. The group difference in change from baseline was calculated in order to estimate treatment effects. Originally, a difference in shade ≥3.7 CIELAB units had been prespecified as clinically meaningful both for changes within groups and for treatment effects [16]; this threshold was later revised to ≥2.7 [26]. The treatment group difference in change (change in S1 versus change in S2) was estimated by linear multilevel models with Kenward-Roger correction for small samples [27] via the procedure “mixed” in Stata software, release 14.2 (Stata Corporation, College Station, TX, USA); changes within groups were computed afterwards using the command “margins”. The relative treatment effect of the difference in change was estimated by ordinal logistic multilevel models via Stata’s procedure “meologit”. Odds ratios in ordinal logistic regression can be interpreted like those in binary logistic regression, whatever the cutoff point of the ordinal outcome is [28]. Box plots and descriptive statistics, including quantiles and Gini’s mean difference (Gmd) as a robust measure of dispersion [28], were generated using R, release 3.3.3 (R Core Team (2017). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria. https://www.r-project.org), especially the “ggplot2” package [29].
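
For readers unfamiliar with Gini’s mean difference, the following sketch shows the usual definition: the mean absolute difference between all pairs of observations. It is an illustration in Python rather than the R code used in the study, and the sample values are hypothetical.

```python
from itertools import combinations


def gini_mean_difference(values):
    """Mean absolute difference over all unordered pairs of observations."""
    pairs = list(combinations(values, 2))
    if not pairs:
        raise ValueError("at least two observations are required")
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


# Hypothetical Delta E values for a handful of teeth (not study data).
example_delta_e = [0.0, 1.5, 2.0, 6.1]
print(gini_mean_difference(example_delta_e))
```

Because it averages pairwise differences instead of squared deviations, a single extreme value influences Gmd less than it influences the standard deviation.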

Results

Subjects, teeth, and observations

The initial study sample consisted of 26 consecutive patients. Eleven patients were excluded from the study for different reasons, including lack of oral hygiene, decalcification, or relocation. The multibracket appliances had been present in situ for 2.0 years (SD ± 0.3). At the end of MBA treatment, tooth color data were available for 120 teeth of the upper jaw (#14 to #24) of 12 female and 3 male patients, resulting in a total of 720 observations for each color system (120 teeth, 2 tooth segments, 3 time points). All patients were Caucasian, aged 11 to 18 years.

Measurement results

2D-system

At baseline, 13 different shades were measured (Fig. 4a). Five shades with a frequency greater than 30 occurred: B2, B2.25, B2.5, B2.75 and B3 (Fig. 4a). Coordinates (CIE L*, a*, b*) of quarter points for the second shade designation number were interpolated to (61.0, 59.7, 58.4, 57.1, and 55.8) for L* of B2, B2.25, B2.5, B2.75 and B3, respectively, and to (9.8, 11.1, 12.4, 13.6, and 14.9) for C*ab of B2, B2.25, B2.5, B2.75 and B3, respectively (Fig. 5). Note that B2.25, B2.5, and B2.75 lie in a space not well covered by the 3D-system (Fig. 5). Gingival segments were darker (L*) and stronger (C*ab) than body segments (P = 0.004 and P = 0.031, respectively; Table 1).

Fig. 4 a, b Frequencies of 2D and 3D shades in gingival and body segments of 120 teeth at baseline

Fig. 5 Scatterplot of CIE L* and C*ab values for 2D shades (blue) and 3D shades (orange)

Changes within segments S1 and S2 from baseline to 3 months after MBA treatment (T0 – T2) were at worst 1.97 ≈ 2.0 units (ΔE for gingival segment; Table 2), which is less than the threshold of 2.7 units for a clinically meaningful difference (Fig. 6a). Moreover, confidence intervals for the treatment effects in terms of the difference in change indicated no clinically important differences between body and gingival segments (Table 2).

Table 2 Treatment effects in terms of the difference in change using linear multilevel models to account for 15 subjects and 120 teeth, and relative treatment effects of the change in terms of the odds ratio of the body segment referred to the gingival segment using ordinal multilevel models

Fig. 6 a, b Box plots showing the distribution of ΔE for the 2D-system (a; left) and the 3D-system (b; right) on tooth level. Orange circle: mean; bold line: median; box: interquartile range (between 25 and 75%); whiskers: range between 12.5 and 87.5%; grey dots: the 120 observations; red line: clinically important difference at 3.7 units or 2.7 units

3D-system

At baseline, 13 different shades were measured (Fig. 4b). Four shades with a frequency greater than 30 occurred: 1 M2, 2 L2.5, 2 M3, and 3R2.5 (Fig. 4b). Note that shades 2 L2.5, 2 M3, and 3R2.5 limit a space that is not well covered by the 3D system (Fig. 5; 3R2.5 is nearest neighbor of 3 L2.5). Chroma of gingival segments was stronger than that of body segments (P = 0.018; Table 1); differences in lightness were uncertain (P = 0.17; Table 1).

Changes within segments S1 and S2 from baseline to 3 months after MBA treatment (T0 – T2) were at worst 2.28 ≈ 2.3 units (ΔE for body segment; Table 2), which is less than the threshold of 2.7 units for a clinically meaningful difference. Figs. 6b and 7 illustrate that ΔE is prone to information bias (measurement error). The value of ΔE = 9.9 for T0 – T1 and T1 – T2 at the gingival segment shown in Fig. 6b resulted from a change from 1 M2 to 3 L2.5 and back to 1 M2 at T0, T1, and T2, respectively. This change is more appropriately described in terms of d(0 M1): values of 9.2, 19.0, and 9.2 at T0, T1, and T2, respectively, correspond to a change in d(0 M1) of 9.8 and −9.8 for T0 – T1 and T1 – T2, respectively, because d(0 M1) allows negative values to describe purer or lighter changes. Moreover, confidence intervals for the treatment effects in terms of the difference in change indicated no clinically important differences between body and gingival segments (Table 2).
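
The difference between the unsigned ΔE and the signed change in d(0 M1) can be made concrete with the three d(0 M1) values quoted above. The sketch below only retraces that arithmetic; the helper name is hypothetical.

```python
# d(0 M1) at T0, T1, T2 as quoted in the text (shades 1 M2, 3 L2.5, 1 M2).
distances = [9.2, 19.0, 9.2]


def signed_changes(values):
    """Change between consecutive time points (later minus earlier)."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


changes = signed_changes(distances)      # T0 -> T1, then T1 -> T2
unsigned = [abs(c) for c in changes]     # what an unsigned metric would show

print("signed change in d(0 M1):", [round(c, 1) for c in changes])
print("magnitude only:", [round(u, 1) for u in unsigned])
```

The signed changes cancel over T0 – T2, whereas a magnitude-only measure reports two equally large changes and hides that the second one reverses the first.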

Fig. 7 Box plots showing the distribution of the change in distance from 0 M1 for the 3D system on tooth level. Orange circle: mean; bold line: median; box: interquartile range (between 25 and 75%; 50% of the values); whiskers: range between 12.5 and 87.5% (75% of the values); grey dots: the 120 observations; change > 0 indicates darker or stronger colors; change < 0 indicates lighter or purer colors; red line: clinically important difference at 3.7 units or 2.7 units

Discussion

During MBA treatment, color changes in bracket (body) and non-bracket (gingival) tooth segments were not clinically relevant. Moreover, body and gingival tooth segments differed in change in tooth color only slightly and possibly not at all. The extent of change in color depended on the color metric (2D, 3D); nevertheless, our findings were sufficiently robust across color metrics insofar as color change during MBA treatment was not clinically relevant, even when using thresholds as small as 2.3 units for a clinically relevant difference (ΔE).

Methods of the study

In this study, we preferred electronic measurements to visual measurements for several reasons. First, it was assumed that problems due to regression to the mean [30], which is “one of the most important of all phenomena regarding data and estimation” [31], could not have been substantially reduced by repeated visual measurements, because the rater would be biased after the first measurement. Second, we aimed to use measurements of two systems (2D and 3D), for which raters would have introduced bias regarding the second measurement. Third, the four measurements used internally by the electronic device to compute the overall value increased the reliability according to the Spearman-Brown formula. Fourth, by using quarter points, electronic 2D measurements could have been more accurate than visual 2D measurements. Finally, it could be expected that our adolescent patient group (11–18 years) was homogeneous concerning tooth colors, especially in terms of B color shades of the 2D system. Therefore, it could be assumed that a systematic measurement error would be substantially the same in this highly homogeneous group, an assumption which would not be justified in a sample with a wide age range (and more frequent color shades other than B of the 2D system). This is a crucial point because, in the presence of a constant systematic measurement error, the validity of the measurement of change is not threatened. In short, we looked for a trade-off between reliability and validity issues, including regression to the mean.
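
The gain in reliability from averaging several measurements follows from the Spearman-Brown formula, r_k = k·r / (1 + (k − 1)·r), where r is the reliability of a single measurement and k the number of measurements. The sketch below evaluates it for k = 4; the single-measurement reliability of 0.5 is a purely hypothetical value, since the device's actual reliability is not reported here.

```python
def spearman_brown(single_reliability, k):
    """Predicted reliability of the mean of k parallel measurements."""
    return k * single_reliability / (1 + (k - 1) * single_reliability)


# Hypothetical single-measurement reliability; k = 4 as used by the device.
assumed_r = 0.5
print(spearman_brown(assumed_r, 4))
```

With these illustrative inputs the predicted reliability rises from 0.5 to 0.8, which shows why an overall value built from four readings is preferable to a single reading.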

Nevertheless, there are some limitations concerning electronic measurement methods, including light conditions, calibration of the measurement device, reproducibility of the measurements, and the visual threshold, as discussed in the literature [32]. The spectrophotometer Shade Inspector™ was used in our study because of its good results regarding reproducibility of lightness and chroma found in pilot studies [15, 33]. Other studies investigating dental color measuring devices showed reliable results as well [34,35,36,37,38].

The Shade Inspector™ is calibrated with a factory-provided selection of industrially fabricated color reference scales (VITAPAN® Classical and VITA 3D-Master®). Color scales originating from different batches were read in and the measurements averaged. Therefore, variations in measurements due to the calibration process are conceivable [39]. The study of Kohlmeyer and Scheller evaluating VITAPAN® Classical color scale samples revealed that the individual color scale samples failed to invariably correspond to the respective primary color [40]. In addition, unequivocal findings were reported on color consistency among shade guides from the same manufacturer [41, 42]. One in vitro study found that the repeatability and accuracy of a dental color measuring instrument (ShadeScan) were influenced by the shade guide systems used for testing [43]. In our study, the complete clinical procedure was performed by an experienced orthodontist under standardized conditions (color-neutral setting: same room, same dental unit and same light conditions by dental unit lamp, patient covered by a drape, tooth surfaces always saliva-wet). The electronic measurements were performed by an examiner calibrated in a pilot study [15]. The tooth color measuring device itself operates independently of light on the principle of spectral photometry. However, in a study evaluating the effect of different illuminants (natural daylight, dental unit lamp, and daylight lamp), the matching repeatability of 2 intraoral spectrophotometers was not completely satisfactory for clinical practice [44]. Therefore, our measurements were taken under standardized conditions as described before. Thus, we do not assume relevant effects of the surrounding light conditions.

Our study has methodological strengths. Notably, two measurements (2D, 3D) at each time point were used, thereby reducing problems due to regression to the mean, which is here the tendency of tooth segment colors at the extremes to have less extreme values on subsequent measurements [30]. To reduce the influence of extreme values at the first measurement, it is common to discard the first of three blood pressure measurements of the same examination [45] or to measure the periodontium with the Florida probe a third time when the first two measurements disagree. Importantly for interpreting the analysis of change as done herein, the second measurement was performed with the 3D-system, which was considered the primary outcome. Moreover, we used mixed models as a shrinkage approach and “a way of discounting observed variation that accounts for regression to the mean” [31]. Second, the 2D-system measured at quarter points for the second shade designation number. As the 3D-system did not cover the space of the most frequent 2D shades, the 2D-system added essential information, although limited by regression to the mean. Third, tooth type as a potentially substantial confounder can only be considered in multilevel analysis. Further, it is not possible to address confounding due to tooth type by the study design, because tooth type cannot be randomized in an MBA study; analysis restricted to the subject level can be misleading. Fourth, we presented not only the original codes of the 2D- and 3D-system but also the transformed values based on the CIE system. As B2 > B1 > B4 > B3 on the L* scale [16], the shade designation numbers of the original 2D codes cannot be well interpreted. Finally, we used not only ΔE to estimate the treatment effect but also the measure d(0 M1) to allow for purer or lighter changes.
In terms of L*, ΔE does not differentiate a lighter change from a darker change given the same ΔE; in terms of C*ab, ΔE does not differentiate a purer change from a stronger change. The 3D shade 0 M1 as the new origin of the coordinate system enables us to differentiate lighter/purer changes from darker/stronger changes. 0 M1 as the new origin of the 3D-system is justified by its lightest lightness and its purest chroma, including the purest red (a*) and the purest yellow (b*). For the 2D-system, no shade has these properties [16].

Unfortunately, there was no sample size calculation for this study. However, we accounted for subject and tooth level to increase statistical power. Moreover, other studies included similar numbers of participants [10, 46]. Besides this limitation, it was not sensible to adjust for baseline values [47,48,49], because segments could not be randomized to treatment groups. Therefore, we compared the difference in change from baseline between segments [28, 50, 51].

Discussion of results

Confidence intervals for the treatment effect for both color systems indicated no clinically important differences between body and gingival segments. Further, changes from baseline to 3 months after MBA treatment (T0 – T2) were at worst 2.3 units for the 3D-system and 2.0 units for the 2D-system, which are less than the threshold of 2.7 units for a clinically meaningful difference.

Previous studies [4, 14] have shown that the enamel color variables are affected by orthodontic bonding and debonding procedures due to tooth cleaning [52], enamel conditioning procedures (etching) [53], and enamel scratches [54]. Other effects, such as staining of the enamel and of the resin material used for bonding the brackets, may also induce color change of teeth during orthodontic treatment. These color changes may be the result of demineralization [55] or of direct staining by food dyes [12, 56]. The staining of the resin material is associated with the color instability of the polymer [57].

Several experimental studies [3, 4, 12,13,14, 58] investigated the impact of the bonding process on tooth color. Three studies [3, 12, 14] investigating color change after bonding of extracted teeth did not find any indication of a significant influence of the bonding process on tooth color. In another experimental study [13] assessing color changes in bracket areas, significant differences in ∆E were found. Despite the significance of the results, the authors did not consider the color changes visually perceivable for the majority of examiners. Eliades et al. [4] reached similar conclusions when examining the influence of different bonding materials. Furthermore, enamel color alterations might also derive from the irreversible penetration of resin into the enamel surface [4]. Moderate evidence exists that shorter resin tag penetration produces less change in enamel color following the clean-up procedure and polishing [58]. Self-etching primers produce less resin penetration, and these systems produce less iatrogenic color change in enamel following orthodontic treatment [58]. In our study, 35% phosphoric acid was used.

The results of a prospective clinical trial conducted by Karamouzos et al. [10] showed significant changes of tooth color (2.1 to 3.6 ∆E units) after orthodontic treatment. The value for the parameter lightness (L*) decreased, whereas the values for the parameters a* (value for green-red) and b* (value for blue-yellow) increased. These changes indicated a decrease in tooth lightness as well as a change in hue, which may be perceptible if a threshold of 1.2 is assumed [26]. In our study, however, we did not find ∆E values greater than 2.7 units, which are considered clinically relevant [26]. Nevertheless, our results are in accordance with a recently published review by Chen, which found no strong evidence that orthodontic treatment with fixed appliances alters the original color of enamel [8].

Conclusion

Within the limitations of this study, MBA treatment can be regarded as a safe method with respect to tooth color.