CAT-DB in Eating Disorders

Dysfunctional cognitions and schemas have a significant impact on human affect and behavior, as well as on the perception and interpretation of the environment (Beck et al., 2017). Furthermore, cognitive factors are crucial for the etiology and maintenance of eating disorders (Hatoum et al., 2022; Stice et al., 2017; Treasure et al., 2020), as dysfunctional beliefs and distorted body perception represent vital characteristics and major symptoms of these disorders (Keizer et al., 2016; Marco et al., 2013; Pike et al., 2010). Dysfunctional core beliefs are part of a person's schemas (Beck, 1976; Young, 1994), which influence the individual's cognitive content and the way they process information consistent with existing schemas (Wenzel, 2012). In contrast to dysfunctional core beliefs, automatic thoughts are more specific and triggered by various cues in certain situations (Jones et al., 2007). Automatic thoughts, in turn, trigger specific behaviors in eating disorders (Zarychta et al., 2014). Primarily, restrained eating behaviors, vomiting, other compensatory behaviors, and extreme weight control behaviors occur due to these specific dysfunctional cognitions (Fairburn et al., 2003). For example, intensification of food-related restrictions is often assumed to result from the occurrence of automatic thoughts (e.g., "I'm fat."). Thoughts like these can result from the overvaluation of food, figure, and weight (e.g., "If I am skinny, everyone else will like me.") (Cooper et al., 1998). In many cases, this leads to the development of maladaptive, dysfunctional attitudes such as "I have to be perfect in every way." (Fairburn et al., 2003) or self-deprecating, body-related thoughts such as "I am fat and plump" (Trunk et al., 2019). Topics of "figure" and "body weight" have a strong influence on the self-esteem of those affected by eating disorders (Lampard et al., 2013; Smink et al., 2018).
Patients with eating disorders often exhibit black-and-white thinking related to themselves and the importance of their body and weight (Garner et al., 1982).

Within the treatment of eating disorders, cognitive-behavioral therapy (CBT) has proven to be particularly effective (Atwood & Friedman, 2019). CBT focuses on changing maladaptive, negative patterns of thinking and perception, as well as strengthening functional cognitive patterns (Fenn & Byrne, 2013). The affected person should successfully identify dysfunctional beliefs and replace them with a more rational mindset (Corstorphine, 2006), thus reducing the impact of weight and outward appearance on self-esteem. Deacon et al. (2011) achieved significant improvements in body image satisfaction and perceived accuracy of negative body-related thoughts by using cognitive restructuring techniques. Legenbauer et al. (2011) reported a reduction of restrictive eating and binge-eating behaviors and of dysfunctional body- and self-esteem-related thoughts as a result of therapeutic reduction of diet-related thoughts.

Leff et al. (2014) developed an innovative approach integrating cognitive elements in an avatar-based exposure therapy for patients with auditory hallucinations, making hallucinations tangible for patients. Patients were able to engage in dialogue with the avatar representing the hallucinations and learned to contradict the personal insults (Craig et al., 2018). Kocur et al. (2021) adapted this approach, utilizing the avatar as a representation of dysfunctional beliefs, and validated their computer-assisted avatar-based treatment for dysfunctional beliefs (CAT-DB) in a sample of depressive inpatients. Patients were confronted with their individual dysfunctional beliefs by a virtual avatar and were asked to contradict them with congruent functional alternative beliefs. Compared to the treatment as usual (TAU) group, the experimental group experienced significantly greater reductions in their belief scores of the dysfunctional cognitions and their symptom severity after the intervention, with the effects remaining stable at a two-week follow-up.

Since eating disorders show high comorbidity with depression (Milos et al., 2003; O'Brien & Vincent, 2003; Schulze & Kölch, 2020) and dysfunctional beliefs are one of the main characteristics of both disorders (Beck & Dozois, 2011; Legenbauer et al., 2007), an adaptation of the CAT-DB for eating disorders seems appropriate. Furthermore, an "eating disorder voice" as a representation of the disorder is frequently reported in patients with eating disorders (Aya et al., 2019; Pugh et al., 2018). In accordance with the rationale of avatar therapy by Leff et al. (2014), CAT-DB could help to decrease the impact of "inner voices" in eating disorders by providing the opportunity to visualize and interact with them. To our knowledge, no studies have evaluated the usefulness of such a therapy tool for patients with eating disorders so far. More specifically, there is a lack of studies examining the simultaneous confrontation with and processing of dysfunctional eating disorder-specific or body-related cognitions through a virtual avatar. In this study, we investigated whether cognitive restructuring techniques through a virtual avatar lead to a change in participants' automatic dysfunctional, body-related thoughts and eating disorder-specific symptomatology. In contrast to Kocur et al. (2021), we aimed at modifying the automatic, disorder-specific cognitions, rather than the fundamental core beliefs, as automatic thoughts directly interact with the disorder-specific behavior and can be addressed more easily (Jacobson et al., 1996). Based on Kocur et al. (2021), the following hypotheses and associated subtests were formulated:

1. CAT-DB leads to a significantly greater reduction in eating disorder-specific symptom burden and expression of related psychopathological variables, measured with the respective Eating Disorder Inventory 2 (EDI-2; Paul & Thiel, 2005) subscales, compared to the control group.

2. CAT-DB leads to significantly greater improvements in the conviction and evaluation regarding individual and global body-related cognitions, assessed via conviction ratings and the Questionnaire for the Assessment of Dysfunctional Cognitions in eating disorders (FEDK, Legenbauer et al., 2007) subscale "Body and Self-esteem," compared to the control group.

Materials and Methods

Participants

Considering that treatment effects are smaller in subclinical samples than in clinical samples (Cuijpers et al., 2014), the sample size was calculated based on an effect size of Cohen’s d = 0.2. With a target power of 0.95, N = 66 participants were required. Potential participants were screened before the start of the study to determine the extent of deviating eating behavior and associated variables. Subjects scoring below an a priori defined cut-off value (< 0.33) on the eating disorder scale of the ICD-10-Symptom-Rating (ISR; Tritt et al., 2008) were excluded. Out of the 80 participants who were assessed for eligibility, 19 declined to participate, and 13 were excluded due to a pretreatment ISR score below 0.33. This resulted in a final sample of 48 participants, who were randomly assigned to the experimental conditions (see Fig. 2). Five participants from the control group did not complete the follow-up; to include their data in our results, an intention-to-treat (ITT) analysis was performed. Accordingly, the present study relies on data from 48 individuals (37 women, 11 men; Mage = 31.46, SDage = 12.2, Mdnage = 25, age range: 21–64; see Table 1). After being screened for subclinical eating disorder symptoms, the 48 enrolled participants had a mean score of 0.92 (SD = 0.53) on the eating disorder scale of the ISR at pretreatment, signifying mild symptom severity on average.

Table 1 Sociodemographic characteristics of the sample

The CAT-DB and control group did not differ significantly with respect to age, gender or the scores obtained in any of the pretreatment measurements.

Study Design and Outcomes

The randomized controlled trial (RCT) was designed as a multifactorial 2 × 3 experimental design with a two-level between-subjects factor (group: CAT-DB and control group) and a three-level within-subjects factor (measurement time: pre, post, follow-up). Primary outcome variables represent the scores of the respective subscales of the questionnaires (i) EDI-2 and (ii) FEDK. Secondary outcome variables include the conviction rating of the self-formulated dysfunctional and alternative, functional cognitions, as well as the difficulty in contradicting the avatar.

Procedure

Data collection took place between February 11th, 2021, and April 2nd, 2021. Potential participants were recruited via social media platforms, university forums, and e-mail distribution lists. Participants did not receive any compensation (i.e., money or other rewards) for their participation. The study was approved by the local Ethics Committee of the Department of Psychology at the PFH – Private University of Applied Sciences (Ethics application number: 251981/3).

The study consisted of a screening, a pre-measurement, three intervention sessions, a post-measurement, and a two-week follow-up. After receiving all necessary instructions, participants were able to successfully complete the intervention sessions autonomously from home. The sessions were scheduled within four weeks. In the screening, participants completed a demographic questionnaire and the German version of the ICD-10 symptom rating (Tritt et al., 2008). After evaluating the exclusion criteria, suitable subjects were then randomly assigned to either the experimental or the control group. To randomize the subjects into one of the two groups, they were numbered, and the numbers were randomly assigned to a group using an open-source online tool. Participants were then invited to the pre-measurement, in which they completed the respective subscales of the EDI-2 and FEDK before they were redirected to the virtual environment, starting with the first avatar intervention session. The intervention session included a standardized psychoeducational text explaining the concept of dysfunctional cognitions as well as alternative, functional ones. Subsequently, the subjects formulated three individual automatic thoughts related to body or food and rated the conviction of each cognition on a scale from 0 (not convinced at all) to 100 (extremely convinced). To help identify individual automatic thoughts and create functional alternatives for each of the dysfunctional cognitions, participants were provided with instructions and examples in a help section. As this study was conducted online, participants did not receive any external assistance from the investigator. A test run ensured that participants were comfortable with the procedure before beginning the first avatar intervention session (which lasted approximately 15 min; see Fig. 1). During this test run, participants practiced the cognitive restructuring process using standardized sample statements provided. 
In the experimental group, the avatar confronted participants with their previously formulated individual automatic thoughts (e.g., "You're more attractive when your stomach is flat.") using a synthesized voice. Participants then had to contradict the avatar using a contradiction composed of a negation (e.g., "No, that's not true.") and the alternative, functional statement (e.g., "I am beautiful and satisfied as I am.").
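For illustration, the allocation step described in this procedure (numbering the eligible subjects and assigning the numbers to a group at random) could be sketched as follows. This is a minimal sketch and not the online tool that was actually used: the function name `randomize`, the seed, and the split into two equally sized groups are illustrative assumptions based on the group sizes reported in this study (n = 24 each).

```python
import random


def randomize(participant_ids, seed=None):
    """Shuffle numbered participants and split them into two equal groups."""
    rng = random.Random(seed)  # the seed is a hypothetical choice for reproducibility
    ids = list(participant_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return {"CAT-DB": sorted(ids[:half]), "control": sorted(ids[half:])}


# 48 eligible subjects, numbered 1 to 48
allocation = randomize(range(1, 49), seed=2021)
print(len(allocation["CAT-DB"]), len(allocation["control"]))
```

Fixing the seed makes the allocation reproducible for auditing; an actual trial would additionally need to conceal the sequence from whoever enrolls participants.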

Fig. 1

Interface and experimental setup. Note. Top panel: Virtual avatar at the beginning of the intervention. Pressing the "Play" button starts the confrontation. Middle panel: Initial situation after the verbal confrontation by the virtual avatar. Participants had to click a checkbox and a button before the start of the next confrontation to ensure that they had verbally contradicted the avatar (translation of the contradiction: "No, that's not true. I don't have to be perfect to be liked by others."). Bottom panel: Experimental setup. The participant sits in front of a computer screen showing the CAT-DB with the virtual avatar

In the control condition, the avatar also confronted participants, but instead of self-related statements with high emotional valence, general false statements were used (e.g., "Paris is the capital of Germany."). Participants also had to contradict the avatar (e.g., "No, that's not true. Berlin is the capital of Germany.").

The subjects participated in a total of three intervention sessions, each consisting of six blocks of cognitive restructuring. Every block included three confrontations (each statement once) by the avatar. After each session, all participants, regardless of their group affiliation, again rated the accuracy of their dysfunctional as well as their alternative, functional cognitions on a scale from 0 to 100. Furthermore, participants were asked to rate how difficult it was to contradict the virtual avatar (rating each statement on a scale from 0 to 100). These ratings were collected to gain information on the aversiveness of the avatar and to test exploratively for possible habituation effects with regard to the contradiction. We expected habituation in the treatment group but not in the control group, because in the control condition the avatar did not represent the individual automatic thoughts but general false statements without emotional valence.
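The amount of practice implied by this structure can be made explicit: three sessions of six blocks with three confrontations each yield 54 confrontations per participant, that is, 18 repetitions of each statement. The sketch below only spells out this arithmetic. The function name `build_schedule`, the placeholder statements, and the fixed order of statements within a block are assumptions, since the presentation order is not specified here.

```python
from collections import Counter
from itertools import product

SESSIONS = 3  # intervention sessions per participant
BLOCKS = 6    # blocks of cognitive restructuring per session


def build_schedule(statements):
    """List every confrontation as a (session, block, statement) tuple."""
    return [
        (session, block, statement)
        for session, block in product(range(1, SESSIONS + 1), range(1, BLOCKS + 1))
        for statement in statements  # each statement once per block
    ]


# Placeholder labels standing in for the three individual automatic thoughts
schedule = build_schedule(["thought A", "thought B", "thought C"])
print(len(schedule))
print(Counter(statement for _, _, statement in schedule))
```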

The post-measurement included the EDI-2 and FEDK. In the follow-up measurement after 14 days, the participants were asked to complete the EDI-2 and FEDK again (see Fig. 2 for the procedure of this study).

Fig. 2

Flowchart of the procedure. Note. Eating-Disorder-Inventory subscales (EDI-2): [DT] = drive for thinness, [B] = bulimia, [BD] = body dissatisfaction, [P] = perfectionism, [IA] = interoceptive awareness

Material

The ICD-10 symptom rating (Tritt et al., 2008) evaluates the state and the severity of mental disorders by participants’ self-assessment on a 5-point Likert scale ranging from 0 (“does not apply”) to 4 (“extremely applies”). The ISR was used to screen for deviant eating behavior (ES: Eating Disorder Scale, three items, e.g., "I think a lot about food and constantly worry about gaining weight."). The ISR consists of six subscales screening for a variety of disorder symptomatology. Since the present study focuses on a sample burdened by eating disorder symptomatology, only the subscale measuring eating disorder symptomatology was used for this screening. According to Fischer et al. (2010), the ISR total score has a good internal consistency of Cronbach's α = 0.92, and the syndrome scales have good internal consistencies ranging from Cronbach's α = 0.78 to α = 0.86, supporting the reliability of the ISR.

The Eating Disorder Inventory 2 (EDI-2; Paul & Thiel, 2005) is a self-report assessment tool for pathological eating behavior as well as for the multidimensional assessment of other relevant psychological variables in eating disorders and associated personality traits on a 6-point Likert scale (1 = “never” to 6 = “always”). The EDI-2 contains eleven subscales (“Drive for Thinness”, “Bulimia”, “Body Dissatisfaction”, “Ineffectiveness”, “Perfectionism”, “Interpersonal Distrust”, “Interoceptive Awareness”, “Maturity Fears”, “Asceticism”, “Impulse Regulation”, “Social Insecurity”) and represents a valid test instrument for assessing the degree of and the change in symptom severity (Kappel et al., 2012). Internal consistency of the scale scores ranged from Cronbach's α = 0.73 to α = 0.93 in a group of female patients with anorexia and bulimia nervosa, and test–retest reliability of all scales ranged from rtt = 0.81 to rtt = 0.89 at a time interval of 7 days (Paul & Thiel, 2005). For this present study, only five out of the eleven subscales of the EDI-2 were used: “drive for thinness”, “bulimia”, “body dissatisfaction”, “perfectionism” and “interoceptive awareness”. It is possible to interpret the subscales on their own since they mirror unique characteristics, even if a higher total score reflects a higher degree of psychopathology (Paul & Thiel, 2005).

The German version of the Questionnaire for the Assessment of Dysfunctional Cognitions in eating disorders (FEDK, Legenbauer et al., 2007) was used to assess eating disorder specific cognitions on a 4-point Likert scale (1 = “not at all” to 4 = “always”). Out of the three subscales of the FEDK (i.e., “Restrictions and diet rules”; “Eating and loss of control”; “Body and Self-esteem”), only the subscale “Body and Self-Esteem” (11 items; e.g., “If I lose weight, others will be more interested in me”) was utilized, as these items represent self-referential statements with high emotional valence. The subscale “Body and Self-Esteem” captures the extent of maladaptive schemas during the last month. Items reflect the negative body and self-image of eating disordered patients in the context of their dysfunctional cognitions. The focus lies on the patient's own physical appearance and the desire of attractiveness as well as social recognition by striving for thinness. The FEDK scales achieve good internal consistencies of Cronbach's α = 0.85 and α = 0.92 in a clinical sample with an eating disorder. In the present study, the FEDK allows the assessment of eating disorder specific cognitions in addition to the individual dysfunctional automatic thoughts that are already surveyed within the intervention on a single item base, thus methodologically improving the design of Kocur et al. (2021). The questionnaire is also used to monitor potential time effects in the control group.

The female avatar was based on the Genesis 8 model from Daz3D, with the avatar's lip movements being simulated using the SALSA LipSynch v2 Suite. The actual intervention was created with the Unity3d engine.

Statistical Analysis

Data analyses were performed using IBM SPSS Statistics (Version 22; IBM Corp, 2013). The statistical analysis included a 2 × 3 repeated-measures ANOVA with the between-subjects factor group (CAT-DB, control group) and the within-subjects factor session (pre-treatment, post-treatment, follow-up).

If required, the degrees of freedom of the F-statistics were adjusted by multiplying them by the Greenhouse–Geisser estimate to correct for violations of sphericity. Each ANOVA with a significant session effect or interaction effect was followed by the same set of post-hoc t-tests or, depending on variance homogeneity, Welch tests. First, for both groups, it was separately tested whether the means of an outcome were different at pretreatment from posttreatment and at pretreatment from follow-up. These tests helped to discriminate the contribution of each group to potential session effects. Second, the differences of an outcome from pretreatment to posttreatment and from pretreatment to follow-up were calculated. Subsequent t-tests were used to compare whether the means of these differences significantly differed between the two groups. These t-tests were used to analyze possible Group × Session interaction effects and further explore whether both groups evolve differently over time.
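As a worked illustration of the Greenhouse–Geisser adjustment, the sketch below multiplies both degrees of freedom of an F-test by an estimate ε. The value ε ≈ 0.8155 is not reported in this study; it is back-calculated from the corrected degrees of freedom given for the “drive for thinness” subscale in the Results (uncorrected df = 2 and 92) and should be read as an assumption. The function name is hypothetical as well.

```python
def greenhouse_geisser_df(df_effect, df_error, epsilon):
    """Scale both degrees of freedom of an F-test by the sphericity estimate."""
    if not 0 < epsilon <= 1:
        raise ValueError("epsilon must lie in (0, 1]")
    return df_effect * epsilon, df_error * epsilon


# Uncorrected df of the within-subjects terms in a 2 x 3 design with N = 48
df_effect = 3 - 1               # sessions - 1
df_error = (48 - 2) * (3 - 1)   # (N - groups) * (sessions - 1)

# epsilon is an assumed value inferred from the reported F(1.63, 75.03)
corrected = greenhouse_geisser_df(df_effect, df_error, epsilon=0.8155)
print(tuple(round(df, 2) for df in corrected))
```

With ε = 1 (no violation of sphericity) the degrees of freedom remain 2 and 92, which matches the uncorrected tests reported for the other subscales.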

Scores below the first quartile minus three times the interquartile range or above the third quartile plus three times the interquartile range were considered outliers and removed from the analyses. Furthermore, an intention-to-treat (ITT) analysis was performed as five participants from the control group did not complete follow-up. To address the missing data, the means of the valid surrounding values were used as a method of simple imputation.
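The two data-cleaning rules can be illustrated with a short sketch. It rests on assumptions that go beyond what is stated here: quartiles are computed with the "inclusive" method of Python's `statistics.quantiles` (statistical packages may use a different quartile definition), and "valid surrounding values" is read as the nearest valid values before and after a missing entry in an ordered series. All function names and example values are hypothetical.

```python
import statistics


def extreme_value_fences(values, k=3.0):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_extreme_values(values, k=3.0):
    """Keep only the values that lie within the fences."""
    low, high = extreme_value_fences(values, k)
    return [v for v in values if low <= v <= high]


def impute_from_neighbours(series):
    """Fill each None with the mean of the nearest valid values around it."""
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        before = next((v for v in reversed(series[:i]) if v is not None), None)
        after = next((v for v in series[i + 1:] if v is not None), None)
        valid = [v for v in (before, after) if v is not None]
        if valid:  # leave the gap if no valid neighbour exists
            filled[i] = statistics.mean(valid)
    return filled


scores = [1, 2, 3, 4, 5, 6, 7, 8, 100]  # hypothetical scores
print(extreme_value_fences(scores))
print(remove_extreme_values(scores))
print(impute_from_neighbours([10, None, 14]))
```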

Results

Eating Disorder Inventory (EDI-2)

Descriptive data analysis revealed no extreme scores for any of the participants on the EDI-2 scales. Figure 3 illustrates a decrease in the mean scores of the EDI-2 between pretreatment and posttreatment and from posttreatment to the follow-up in the experimental group.

Fig. 3

EDI-2 scores for the CAT-DB and control group. Note. Mean scores of the Eating-Disorder-Inventory (EDI-2) for the CAT-DB group (n = 24) and the control group (n = 24) at pretreatment, posttreatment and at follow-up. Error bars show standard errors. Repeated Measures ANOVA for the Eating-Disorder-Inventory (EDI-2), F(2, 92) = 8.85, p < 0.001, η2 = 0.16. Note that for the present study only five out of the original eleven subscales were utilized: “drive for thinness”, “bulimia”, “body dissatisfaction”, “perfectionism” and “interoceptive awareness”. The trend thus reflects the means of these subscales

The 2 × 3 repeated-measures ANOVAs for each subscale revealed a statistically significant Group x Session interaction for the subscales “drive for thinness” (F(1.63, 75.03) = 8.40, p = 0.001, η2 = 0.15), “bulimia” (F(2, 92) = 4.62, p = 0.012, η2 = 0.09), “body dissatisfaction” (F(2, 92) = 9.15, p < 0.001, η2 = 0.17), and “interoceptive awareness” (F(2, 92) = 5.55, p = 0.005, η2 = 0.11). There was no significant interaction for the subscale “perfectionism” (F(2, 92) = 1.35, p = 0.263, η2 = 0.03). The Group x Session interaction for the sum score that was calculated by adding all subscales showed statistical significance as well (F(2, 92) = 8.85, p < 0.001, η2 = 0.16). There was no significant main effect for Group or Session for any of the subscales. For the ANOVA statistics of the subscales see Appendix A1. A detailed view of the means and standard deviations of the EDI-2 subscales is provided in Table 2.
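The variant of η² is not specified here. The reported values are, however, consistent with partial eta squared recomputed from each F-statistic and its degrees of freedom, as the following sketch illustrates. This is a plausibility check under that assumption, not a statement about how the values were originally obtained, and the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F-statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error) as reported for the Group x Session interaction
reported = [
    ("drive for thinness", 8.40, 1.63, 75.03),
    ("bulimia", 4.62, 2, 92),
    ("body dissatisfaction", 9.15, 2, 92),
    ("interoceptive awareness", 5.55, 2, 92),
    ("perfectionism", 1.35, 2, 92),
    ("sum score", 8.85, 2, 92),
]

for label, f_value, df_effect, df_error in reported:
    print(label, round(partial_eta_squared(f_value, df_effect, df_error), 2))
```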

Table 2 Means and standard deviations for EDI-2

Paired post-hoc t-tests for the control group revealed that the mean scores of the EDI-2 subscales at pretreatment did not significantly differ from the mean scores of the EDI-2 subscales at posttreatment. The mean scores of the EDI-2 subscale “body dissatisfaction” significantly differed between pretreatment and follow-up in the control group (t(23) = -2.361, p = 0.027, d = -0.48). For the CAT-DB group, the mean EDI-2 score for the subscale “interoceptive awareness” at pretreatment was significantly higher than the mean score at posttreatment (t(23) = 2.38, p = 0.026, d = 0.49). The paired post-hoc t-tests for the other subscales and the sum score revealed no significant differences between pretreatment and posttreatment. However, the mean EDI-2 score of the subscales “drive for thinness” (t(23) = 2.956, p = 0.007, d = 0.60), “body dissatisfaction” (t(23) = 3.494, p = 0.002, d = 0.71), “interoceptive awareness” (t(23) = 4.602, p < 0.001, d = 0.94), and the sum score (t(23) = 3.655, p = 0.001, d = 0.75) in the CAT-DB group were significantly higher at pretreatment than at follow-up. The post-hoc t-tests for each subscale are provided in Appendix A2.

Independent post-hoc t-tests on the difference scores revealed no significant group differences in the change from pretreatment to posttreatment. Between pretreatment and follow-up, the mean differences significantly differed between the control group and the CAT-DB group for the subscales “drive for thinness” (t(46) = -3.21, p = 0.002, d = -0.93), “bulimia” (t(46) = -2.41, p = 0.020, d = -0.70), “body dissatisfaction” (t(46) = -4.15, p < 0.001, d = -1.20), “interoceptive awareness” (t(34.35) = -2.99, p = 0.005, d = -0.86), and the sum score (t(46) = -3.76, p < 0.001, d = -1.09). A detailed view of the results is provided in Appendix A3.
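The effect sizes of the post-hoc tests can be related to the t-statistics by two standard conversions: d = t / √n for paired tests and d = t · √(1/n₁ + 1/n₂) for independent tests. Assuming n = 24 per group, the sketch below reproduces the reported values. Whether these conversions were actually used is an assumption; the function names are hypothetical, and for the Welch test on “interoceptive awareness” the second formula is only an approximation.

```python
import math


def d_from_paired_t(t_value, n_pairs):
    """Cohen's d for a paired t-test, expressed via the t-statistic."""
    return t_value / math.sqrt(n_pairs)


def d_from_independent_t(t_value, n1, n2):
    """Cohen's d for an independent-samples t-test, expressed via the t-statistic."""
    return t_value * math.sqrt(1 / n1 + 1 / n2)


N_PER_GROUP = 24  # group size reported for both conditions

# Paired test, CAT-DB group, "drive for thinness", pretreatment vs. follow-up
print(round(d_from_paired_t(2.956, N_PER_GROUP), 2))

# Between-group comparison of change scores, "drive for thinness"
print(round(d_from_independent_t(-3.21, N_PER_GROUP, N_PER_GROUP), 2))
```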

Questionnaire for the Assessment of Dysfunctional Cognitions in eating disorders (FEDK)

Descriptive analysis revealed no extreme scores for any of the participants on the FEDK subscale “Body and Self-esteem”. Figure 4 illustrates a decrease in the mean score of the FEDK subscale between pretreatment and posttreatment and follow-up in the CAT-DB group.

Fig. 4

FEDK “Body and Self-esteem” score for the CAT-DB and control group. Note. Mean scores of the Questionnaire for the Assessment of Dysfunctional Cognitions in eating disorders (FEDK) subscale “Body and Self-esteem” for the CAT-DB group (n = 24) and the control group (n = 24) at pretreatment, posttreatment and at follow-up. Error bars show standard errors. Repeated Measures ANOVA for the FEDK, F(2, 92) = 6.21, p = 0.002, η2 = 0.12. Note that for the present study only one out of three original subscales was used: “Body and Self-Esteem”

A 2 × 3 repeated-measures ANOVA revealed a significant Group x Session interaction effect, F(2, 92) = 6.21, p = 0.002, η2 = 0.12. There was no significant main effect for the factors Group, F(1, 46) = 0.41, p = 0.526, η2 = 0.01 and Session, F(2, 92) = 0.11, p = 0.898, η2 < 0.01. A detailed view of the means and standard deviations for the FEDK subscale “body and self-esteem” and the following conviction and difficulty ratings is provided in Table 3.

Table 3 Means and standard deviations for the FEDK, conviction ratings for dysfunctional and functional cognitions and difficulty rating

Paired post-hoc t-tests for the control group revealed that the mean score of the FEDK at pretreatment did not differ significantly from the mean scores at posttreatment or follow-up. For the CAT-DB group, the mean FEDK scores between pretreatment and posttreatment showed no significant differences. However, the paired post-hoc t-test for the FEDK revealed that the mean score at pretreatment was significantly higher than the mean score at follow-up (t(23) = 2.583, p = 0.017, d = 0.53).

Post-hoc t-tests revealed no significant mean differences between pretreatment and posttreatment for the groups. Between pretreatment and follow-up, the mean differences significantly differed between the control group and the CAT-DB group (t(46) = -3.00, p = 0.004, d = -0.87).

Conviction Ratings for the Dysfunctional Beliefs

Descriptive analysis revealed that no participant had extreme belief scores regarding their dysfunctional cognitions. Figure 5 illustrates a decrease in the mean score of the conviction ratings for the dysfunctional cognitions of the CAT-DB group from pretreatment to posttreatment and from posttreatment to follow-up.

Fig. 5

Mean scores of the conviction ratings for the dysfunctional cognitions of the CAT-DB and control group. Note. Mean scores of the conviction ratings for the dysfunctional cognitions of the CAT-DB group (n = 24) and the control group (n = 24) at pretreatment, posttreatment and at follow-up. Error bars show standard errors. Repeated Measures ANOVA, F(2, 92) = 0.69, p = 0.505, η2 = 0.01

A 2 × 3 repeated-measures ANOVA revealed no statistically significant Group x Session interaction, F(2, 92) = 0.69, p = 0.505, η2 = 0.01. There was no significant main effect for the factors Group, F(1, 46) = 2.20, p = 0.144, η2 = 0.05 and Session, F(2, 92) = 1.67, p = 0.194, η2 = 0.04.

Conviction Ratings for the Functional Cognitions

Descriptive analysis revealed that no participant had extreme belief scores regarding their alternative functional cognitions. Figure 6 illustrates an increase in the mean score of the conviction ratings for the alternative thoughts across all measurements within the CAT-DB group.

Fig. 6

Mean scores of the conviction ratings for the functional cognitions of the CAT-DB and control group. Note. Mean scores of the conviction ratings for the functional cognitions of the CAT-DB group (n = 24) and the control group (n = 24) at pretreatment, posttreatment and at follow-up. Error bars show standard errors. Repeated Measures ANOVA, F(2, 92) = 1.79, p = 0.173, η2 = 0.04

A 2 × 3 repeated-measures ANOVA revealed no statistically significant Group x Session interaction, F(2, 92) = 1.79, p = 0.173, η2 = 0.04. There was no significant main effect for the factors Group, F(1, 46) = 3.76, p = 0.058, η2 = 0.08 and Session, F(2, 92) = 0.83, p = 0.440, η2 = 0.02.

Difficulty in contradicting the avatar

Descriptive analysis revealed that no participant had extreme scores regarding their difficulty ratings. As shown in Fig. 7, there was a decrease in the perceived difficulty of contradicting the avatar in the CAT-DB group, especially between pretreatment and posttreatment.

Fig. 7

Mean scores for the difficulty ratings of the CAT-DB and control group. Note. Mean scores of the difficulty ratings for the CAT-DB group (n = 24) and the control group (n = 24) at pretreatment, posttreatment and at follow-up. Error bars show standard errors. Repeated Measures ANOVA, F(1.73, 79.47) = 0.09, p = 0.888, η2 < 0.01

A 2 × 3 repeated-measures ANOVA revealed no statistically significant Group x Session interaction, F(1.73, 79.47) = 0.09, p = 0.888, η2 < 0.01. There was no significant main effect for the factor Session, F(1.73, 79.47) = 0.32, p = 0.696, η2 = 0.01. There was a significant main effect for the factor Group, F(1, 46) = 13.93, p < 0.001, η2 = 0.23. A Bonferroni-corrected Tukey post-hoc test showed a significant difference (p < 0.001) in the difficulty rating between the control group and the CAT-DB group (17.92, 95% CI[8.26, 27.59]), indicating that the control group had less difficulty in contradicting the avatar than the CAT-DB group.

Discussion

Despite the limitations to consider, our findings are consistent with previous research that has suggested the potential effectiveness of virtual reality remote psychotherapy and computer-assisted cognitive-behavioral therapy in treating various mental disorders, such as addiction (Carroll et al., 2008), anxiety (Cooney et al., 2018), depression (Kocur et al., 2021), and eating disorders (Matsangidou et al., 2020). However, more research is needed to determine the extent to which our results apply to patients with comorbid conditions.

The primary objective of our study was to assess the effectiveness of computer-assisted avatar therapy in reducing dysfunctional body-related cognitions in individuals with subclinical eating disorder symptoms. Our goal was to decrease the severity of eating disorder-specific symptoms and promote functional thoughts as an alternative. In line with our expectations, participants in the experimental group exhibited a significant decrease in eating disorder symptom severity and related psychopathological variables compared to those in the control group. Post-hoc t-tests revealed that the mean differences between pretreatment and posttreatment for the EDI-2 subscale scores did not significantly differ between the groups; however, the mean differences between pretreatment and follow-up significantly differed between the groups for the subscales “drive for thinness”, “bulimia”, “body dissatisfaction”, “interoceptive awareness”, and the sum score, indicating a delayed treatment effect. According to Cohen (1988), these results are supported by medium to large effect sizes (between Cohen’s d = -0.70 and d = -1.20). Additionally, the CAT-DB group showed greater improvements in body-related cognitions compared to the control group. When comparing pretreatment and follow-up, a significant difference was observed in the FEDK subscale “Body and Self-esteem”. The effect size for this result can be classified as large (Cohen’s d = -0.87). Apart from that, no significant interaction effect was found for the conviction ratings of the dysfunctional or the alternative thoughts. Nor did a significant main effect of group emerge for these ratings, although the group effect for the functional cognitions approached significance (p = 0.058). Descriptively, within the CAT-DB group, the conviction regarding the dysfunctional cognitions decreased, while the conviction regarding the functional cognitions increased, albeit non-significantly; these tendencies did not show in the control group. These results could be interpreted as a first indication of the efficacy of the intervention.
Regarding the difficulty of contradicting the virtual avatar, a significant group effect was found, with the experimental group having greater difficulty in contradicting the avatar, underlining the impact of the context of the avatar's statements (automatic thoughts vs. general false statements).

It was expected that significant improvements in the individual cognitions would lead to improvement in disorder-specific symptomatology. However, since no significant interaction for the individual cognitions was found, it is uncertain which factors are responsible for the observed changes. A possible working mechanism explaining the symptom reduction could lie in a potential increase in subjects’ self-efficacy as a result of contradicting the avatar. A study by Keshen et al. (2017) found that early changes in self-efficacy during an eating disorder treatment program predicted symptom severity at the end of the program. Despite the insufficient power of the current study, a trend could be observed in favor of a change in the individual cognitions that is consistent with the hypothesized mechanism. Therefore, in further studies with a larger sample size, the use of a mediation model is recommended to clarify which mechanisms caused the observed improvements in symptoms.

Compared with the findings of Kocur et al. (2021), the use of CAT-DB seems to be effective regarding symptom reduction, as a significant Group × Session interaction effect occurred in our sample with effect sizes similar to those in the depressive inpatient sample. However, in contrast to Kocur et al. (2021), significant group differences only emerged with a time delay at the 14-day follow-up. Furthermore, in contrast to Kocur et al. (2021), no interaction effect for conviction ratings of dysfunctional and alternative cognitions could be observed. This might be due to Kocur et al. (2021) directly targeting core beliefs rather than disorder-specific automatic thoughts as in the present study. Dysfunctional core beliefs are deeply anchored in the cognitive structure of a person, as they are acquired and strengthened over the years (Wenzel, 2012). As underlying core beliefs trigger situation-specific automatic thoughts, it is possible that working only on these automatic thoughts leaves the underlying self-schemas unchanged, so that they are easily triggered in other contexts and situations and alternative beliefs do not generalize. Regarding the sustainability of treatment effects, it might be necessary to work on dysfunctional core beliefs (Jones et al., 2007) that target transdiagnostic factors such as self-esteem, rather than to focus on situation- and disorder-specific automatic thoughts. Due to the similarity of dysfunctional core beliefs across different disorder conditions, CAT-DB might even be useful for patients with high comorbidity, which is quite common in eating disorders (Keski-Rahkonen & Mustelin, 2016).

In addition to the treatment of core beliefs in Kocur et al.'s (2021) study and the level of automatic thoughts addressed in the present study, future research could also consider a perspective on other cognitive levels. For instance, further studies could focus on the distinction between irrational cognitive processes and rational thoughts (Tecuta et al., 2021).

Limitations

It is essential to emphasize the potentially limited reliability of the collected data in the current pilot study. In three cases, there were technical issues during a session, leading to a repetition of these sessions. Since this only affects three subjects, it might not have a big impact on the overall results but certainly limits reliability.

Furthermore, due to time constraints only particular subscales of the EDI-2 and FEDK were used in this study, so a generalization of results to other subscales is not possible, and validity is limited.

It is important to note that our study sample consisted mostly of females (77.1%). However, this gender imbalance is representative, as the prevalence of eating disorders is three times as high in females compared to males in the general population (see Qian et al., 2022). Furthermore, due to the nature of this pragmatic pilot study, broad eligibility criteria for study participation were applied instead of testing highly selected patient populations. The main aim of this study was to provide initial empirical evidence for the theoretically expected effect of our avatar-based intervention for individuals with eating disorder symptomatology. The possibility that participants received medication or were currently under psychotherapeutic treatment, which could have had a confounding effect, cannot be ruled out. Besides, even though participants were screened for eating disorder symptomatology to attain a burdened sample, our sample was not significantly affected by eating disorder symptoms, compared to the EDI-2 reference sample (Paul & Thiel, 2005). This might account for potential ceiling effects.

Moreover, a possible explanation for the hypothesis-compliant but non-significant results could be an insufficient sample size, as the required sample size was N = 66, whereas the actual sample size was N = 48 due to drop-out (n = 19). However, according to a post-hoc power analysis, the sample is sufficient to reveal statistically significant differences with medium to large effects (Cohen’s d = 0.48 and d = -1.20) between conditions at α = 0.05 with a power of 0.99. This is in line with findings from previous work investigating the efficacy of mental health interventions using virtual characters to treat anxiety (Provoost et al., 2017), mood disorders (Ma et al., 2019), eating disorders (Matsangidou et al., 2020), and phobias (Cieślik et al., 2020) based on cognitive behavioral therapy.

The missing treatment effect of the dysfunctional cognitions might also be due to these cognitions having to be formulated anew in each CAT-DB session. An analysis regarding the formulated automatic thoughts revealed that the content differed between sessions in some participants, thus some individuals were confronted with more than three different statements over time. The effectiveness of the CAT-DB might benefit from formulating only three dysfunctional and matching functional cognitions for the most relevant contents. A standardization of three thoughts in total could lead to an earlier onset of effect, with the specific content of each confrontation remaining the same across all sessions, possibly positively affecting the treatment process as particular cognitions are repeatedly targeted and trained. This approach would also improve the validity and comparability of the cognition ratings. Measurements would reflect actual treatment effects on individual maladaptive cognitions, rather than ratings of individual cognitions, which may differ depending on the session.

Furthermore, it must be mentioned that there was a time overlap across our measuring points due to the effort to use a valid, appropriate measurement tool for assessing eating disorder specific cognitions. The subscale “Body and Self-Esteem” of the questionnaire FEDK asks about maladaptive schemas over a period of one month. In the present pilot study, however, only a period of three weeks was considered. Therefore, in a further study it should be ensured that a questionnaire is used that matches our measurement points.

Another limitation addresses the design of the control group. In the classic process of cognitive restructuring, the recognition of cognitive errors and the development of alternative explanations are central steps (Clark, 2013). As both groups in the current pilot study had to formulate dysfunctional and functional statements in the beginning, this could already have had a therapeutic effect. Several studies have demonstrated an enhanced effect of cognitive-behavioral therapies with the addition of virtual reality (VR) in participants with eating disorders (Ferrer-García et al., 2017; Gutiérrez-Maldonado et al., 2016; Marco et al., 2013). The emotional engagement and thus the effectiveness of the CAT-DB is significantly influenced by the immersion and the individual sense of presence of the participants in the virtual reality (Diemer et al., 2015). Low or decreasing perceived presence over time can be interpreted as confounding factors and as possible reasons for non-significant effects. Also, whether a high sense of presence is related to higher treatment success might be of interest. Further adaptations of the virtual avatar may increase the avatar realism and emotional involvement of the participants. For example, by programming additional interactive elements, facial expressions, gestures, and general movement, the immersion could be enhanced.

Besides, the avatar was only presented with a female appearance. The effectiveness of the treatment may be positively influenced by customizing the appearance of the virtual avatar (Marcoux et al., 2021). The individual customization of the avatar would be helpful to increase personal and emotional engagement (Birk et al., 2016; Turkay & Kinzer, 2015), improving therapy adherence (Birk & Mandryk, 2019) and therefore possibly counteracting the high drop-out (n = 24) observed in this study. The participants could additionally be given the opportunity to represent their dysfunctional cognitions as real objects or abstract shapes (e.g., a dark rain cloud) to more adequately correspond to their dysfunctional cognitions than a virtual person. Craig et al. (2018) and Percie du Sert et al. (2018) also indicated the potential benefits that could be achieved from such further developments of avatar-based therapy methods.

Lastly, the aim of the present study was to target automatic thoughts that are situation-specific and therefore more easily accessible to participants than underlying core beliefs. Nevertheless, the identification and formulation of those individual automatic thoughts might have been difficult for some participants depending on their mood state (see Kërqeli et al., 2013; Miranda et al., 1990; Roberts & Kassel, 1996). Even though standardized psychoeducational information was provided at the beginning of every session, including six examples of maladaptive cognitions characteristic of eating disorders, people learn more successfully through visual delivery of content (Berney & Betrancourt, 2016). To make the cognitive constructs more tangible for participants, the use of various explanatory videos could be helpful in future studies. Furthermore, the intervention process could be improved by presenting cognitions using visuals, e.g., photographs, drawings, or classical imagery techniques (DeCoster & Dickerson, 2013; Hales et al., 2015). Also, in addition to providing examples of specific dysfunctional cognitions, it might be helpful to identify specific situations in which such cognitions arise to increase accessibility. Additionally, participants could be instructed to keep a diary for a few days to record dysfunctional beliefs and cognitions (see Utley & Garza, 2011).

Implications

Taken together, the presented results reveal the potential that could be expected by exploiting the benefits of CAT-DB. Although web-based interventions can be effective in promoting health and health-related behaviors (Taylor et al., 2021) in eating disorders, lack of adherence is a common problem that needs to be addressed. Intervention characteristics and persuasive technology elements can be used to explain a substantial amount of the variance in adherence (Kelders et al., 2012). As a solution approach, developing a mobile, offline-based application for the smartphone could be the aim of future research. A mobile application could provide tailored guidance and therefore help to increase adherence, which can be low in internet-based interventions (Fuhr et al., 2018; Musiat et al., 2022). Transferring the CAT-DB to an app-based format would increase accessibility, make processing more user-friendly, and eliminate the aforementioned technical problems as especially tailored apps have proven to be effective for the treatment of eating disorder symptoms (Tregarthen et al., 2019). Mobile applications and smartphone apps have already been successfully used to reduce automatic thoughts in depression (Hur et al., 2018) and as a transdiagnostic cognitive-behavioral intervention for eating disorder psychopathology (Linardon et al., 2020). Avatar-based virtual reality tools seem to be especially appealing to young patients (Falconer et al., 2019), which might be of great use for eating disorders as these often occur during adolescence (Sergentanis et al., 2020).

An app-based version of CAT-DB could also be used in the context of therapeutic homework, thereby bridging the gap between therapy sessions by allowing patients to make progress even without active guidance from their therapist, strengthening the treatment effect (Karyotaki et al., 2018). Mobile apps are also already successfully used for relapse prevention in eating disorders (Devakumar et al., 2021), as relapse is a common issue in the treatment of these disorders (Berends et al., 2018). Moreover, self-guided intervention tools have the potential to buffer the existing supply gap concerning psychotherapy vacancies, at least in the short term (Machado & Rodrigues, 2019).

Conclusion

The present pilot study suggests that the CAT-DB intervention may be a valuable addition to the already available therapeutic tools for modifying dysfunctional body-related cognitions and decreasing eating disorder symptom burden. However, further research is required to clarify the underlying mechanisms that link the CAT-DB intervention to dysfunctional thoughts and symptom severity. Nonetheless, these data can support the further study of treating cognitive variables in eating disorders. Based on this, future studies with larger samples and in different clinical settings (e.g., inpatient, outpatient, treatment concomitant, relapse prophylactic, homework) could provide important evidence to further explore and confirm the efficacy and external validity of the CAT-DB, as an additional tool in CBT.