Introduction

The tendency to misinterpret ambiguous situations, termed interpretation bias, is proposed to maintain a range of emotional disorders (e.g., Mathews and MacLeod 2005). Consistently, interpretation biases are considered to be transdiagnostic phenomena (Craske et al. 2009; Harvey et al. 2004; Mansell et al. 2008), emerging from aberrant information processing (see Gladwin and Figner 2014; Ouimet et al. 2009; Strack and Deutsch 2004) that is guided by underlying dysfunctional beliefs and current concerns. Such current concerns are linked to pursuing a goal (e.g., achieving attractiveness) and prompt the preferential processing of goal-relevant stimuli (e.g., others' reactions to one's appearance; Klinger and Cox 2011). Over time, current concerns may thereby shape associative memory networks through learning processes (Anderson 1983; Collins and Loftus 1975; Foa et al. 2006) and produce disorder-specific cognitive profiles—including interpretation biases—that elicit emotional states (e.g., Beck and Clark 1988; Beck and Haigh 2014).

To date, these disorder-specific cognitive profiles have mainly been investigated by contrasting self-reported cognitive content. A classic meta-analysis in this field yielded mixed results, demonstrating that anxiogenic and depressogenic cognitions were related to symptoms of both disorders, with only the latter showing a specific association with depression (Beck and Perkins 2001). However, more recent research supported the assumption of cognitive profiles, indicating that some emotional disorders are linked to distinct underlying core beliefs (Cooper et al. 2006; Dozois et al. 2008; Schulz et al. 2008).

Considering interpretation biases, prior studies have mainly focused on their assessment in single disorders, e.g., depression (e.g., Hindash and Amir 2012; Wisco and Nolen-Hoeksema 2010), social anxiety disorder (SAD; e.g., Beard and Amir 2009; Hirsch and Clark 2004; Huppert et al. 2003; Voncken et al. 2003), generalized anxiety disorder (GAD; e.g., Hazlett-Stevens and Borkovec 2004), and eating disorders (e.g., Rosser et al. 2010). Conversely, few studies have systematically compared interpretation bias patterns (e.g., Buhlmann et al. 2002; McManus et al. 2000; Voncken et al. 2007). For example, Buhlmann et al. (2002) showed that individuals with body dysmorphic disorder (BDD) favored more threatening and less non-threatening interpretations for appearance-related (e.g., “While talking to some colleagues, you notice that some people take special notice of you.”) and social scenarios (e.g., “You are having a conversation with some friends. You say something and the conversation stops.”) than individuals with obsessive–compulsive disorder and mentally healthy controls. However, both clinical groups exhibited a negative interpretation bias for generally ambiguous scenarios (e.g., “A letter marked “URGENT” arrives.”), potentially reflecting an overarching vulnerability factor for psychopathology (Buhlmann et al. 2002). Similarly, Voncken et al. (2007) demonstrated that individuals with SAD (vs. individuals with depression and mentally healthy controls) displayed a distinct negative interpretation bias for social scenarios. Further, individuals with depression, compared to non-clinical controls, exhibited a more global negative bias encompassing various situation categories.

In sum, the studies cited above indicate that maladaptive, concern-specific interpretation biases characterize a range of emotional disorders. Nevertheless, prior research has mainly relied on restrictive self-report formats, such as forced-choice, which allow unlimited time for evaluation and may be prone to confounds (e.g., response selection bias and demand effects; Mathews and MacLeod 2005). Further, such measures preclude a clear differentiation between tendencies to endorse negative and reject positive interpretations (i.e., pronounced negative bias and a lack of positive bias). However, these tendencies have been shown to represent two distinct, equally pertinent factors in bias phenomenology, being tied to different behavioral implications that warrant further investigation (Huppert et al. 2003; Steinman et al. 2020). Specifically, while negative interpretation bias has been linked to behavioral avoidance, a lack of positive bias has been suggested to dampen positive affect during behavioral approach (Amir et al. 2012; Kuckertz and Amir 2017).

Additionally, most cognitive theories posit that interpretation bias is characterized by a more implicit, reaction time-based (RT) component (Hirsch and Clark 2004), requiring assessment via opaque, time-limited task designs (e.g., Schoth and Liossi 2017). Within such implicit designs, RT is conceived to index the speed by which situational interpretations are accessible and activated within semantic memory (see de Houwer et al. 2009). Hence, when resolving ambiguity, RT theoretically quantifies the associative strength between a situation and its valent interpretation. However, it has not been studied yet whether observed RT differences indeed emerge from relatively faster information uptake, or confounds, such as low response thresholds or general response slowing (Voss et al. 2015). Overall, examining the aforementioned bias indices and dissecting their underlying cognitive components within associative memory organization appears critical to further characterize interpretation biases phenomenologically.

The Word Sentence Association Paradigm (WSAP; Beard and Amir 2009) was designed to yield different explicit and implicit interpretation bias indices. In this task, participants are asked to judge as fast as possible if an ambiguous sentence and a positive or negative interpretation are related. Importantly, both decision rates (i.e., the explicit component) and congruent RT (i.e., the more implicit component) may be recorded. Using the WSAP, Beard and Amir (2009) demonstrated that participants high (vs. low) in social anxiety exhibited a pronounced negative and a lack of positive interpretation bias. Further, individuals high (vs. low) in social anxiety also showed this bias implicitly, as they were slower in endorsing positive and rejecting negative, and faster in endorsing negative and rejecting positive interpretations. Since this study, interpretation biases have been investigated using the WSAP in various emotional disorders (see Gonsalves et al. 2019, for a review). However, to our knowledge, the WSAP has never been used to compare interpretation biases across disorders and current concerns. Such comparative assessments are paramount to determining common and disorder-specific bias features that may be addressed within interventions, such as Cognitive Bias Modification for Interpretation (CBM-I; see Cristea et al. 2015; Hallion and Ruscio 2011; Jones and Sharpe 2017; Menne-Lothmann et al. 2014, for meta-analyses).

The present study investigated interpretation biases for different current concerns across three disorders: BDD, SAD, and GAD. The key characteristics of these disorders involve preoccupation about subjectively perceived bodily flaws in BDD, fear and avoidance of social situations in SAD, as well as anxiety and worry about various domains in GAD (American Psychiatric Association 2013). BDD, SAD and GAD can be represented on a continuum of phenomenological proximity. In this respect, BDD and SAD can be considered phenomenologically proximal. Both disorders are characterized by social anxiety and avoidance, similar onset and trajectories, and high mutual comorbidity (Fang and Hofmann 2010; Pinto and Phillips 2005). BDD and SAD might further relate on a cognitive level as they potentially share maladaptive social and appearance-related interpretation patterns, given their overlap in anxiety and appearance-related concerns during social situations (Fang and Hofmann 2010). Indeed, cognitive-behavioral models of BDD and SAD propose that interpretation biases maintain symptoms (e.g., Hofmann 2007; Wilhelm et al. 2013). Relatedly, SAD-specific cognitive-behavioral interventions have been found to improve BDD symptoms (Fang et al. 2013), suggesting common underlying factors (e.g., fear of negative evaluation or self-focused attention; Fang and Hofmann 2010). Thus, investigating interpretation bias profiles in BDD and SAD would further elucidate their role in symptom maintenance, which remains unstudied at present.

GAD can be viewed as phenomenologically distal to BDD and SAD. Despite some commonalities, such as trait anxiety, excessive worry, avoidance, and safety behaviors to reduce anxiety (American Psychiatric Association 2013; Craske et al. 2009; Turk et al. 2005), GAD appears distinct on a cognitive level as it is associated with various current concerns (i.e., different worry domains, primarily concerning a potentially aimless future, relationships, work incompetence, and physical threat; Dugas et al. 1998). The differences between GAD, BDD and SAD are further reflected in their low mutual comorbidity rates and diverging age of onset (e.g., Gunstad and Phillips 2003). Moreover, cognitive-behavioral models of GAD identify intolerance of ambiguity as a catalyst for habitual negative interpretations and worry (see Hirsch et al. 2016, for an overview). However, it remains unclear how individuals with GAD respond to concerns present in other disorders.

Addressing these questions, we explored interpretation bias patterns in individuals fulfilling self-report DSM-5 criteria for BDD, SAD, GAD, and non-clinical controls (NC). Using an adapted version of the WSAP (Hindash and Amir 2012), we assessed decision rates and RT for positive and negative interpretations in three categories central to these disorders: appearance-related, social, and generally threatening situations. We used a multilevel approach (MLM) and Wiener diffusion models (Ratcliff and McKoon 2008; Voss et al. 2013) to determine cognitive parameters within RT. In reference to classic cognitive models, we explored the contributive value of decisional processes in implicit interpretation bias, which may be interpreted as indices of associative memory underlying interpretations (McKoon and Ratcliff 2012; White et al. 2010).

Consistent with prior evidence, we hypothesized that (1) clinical groups would overall exhibit maladaptive interpretation biases across situation categories, endorsing more negative and fewer positive interpretations than NC (transdiagnostic hypothesis). Within clinical groups, we predicted that (2) individuals with BDD would exhibit an appearance-related interpretation bias (vs. SAD and GAD) and a social interpretation bias (vs. GAD). Further, we expected that (3) SAD (vs. GAD) would show an appearance-related and social interpretation bias, and last, (4) individuals with GAD would show a general interpretation bias compared to NC only (current concern hypotheses). We further assumed that these disorder-specific patterns would be reflected in concurrent RT, i.e., faster endorsement and slower rejection of negative interpretations, and the reverse pattern for positive interpretations, as compared to NC.

Methods

Participants

Participants for clinical groups were recruited via online advertisements in disorder-specific Internet fora and social networks. Non-psychological fora and social networks were used for the recruitment of controls. Participation was not reimbursed. However, participants could enter a lottery comprising six Amazon vouchers worth 150€ in total.

Inclusion criteria were: (1) aged between 18 and 65 years, (2) no previous or current self-reported diagnosis of psychotic disorder, bipolar disorder or substance abuse/dependency, (3) no acute suicidality (as indicated by the Patient Health Questionnaire, Depression Module [PHQ-9; Gräfe et al. 2004], item 9 ≤ 3), and (4) fluent in German. Clinical groups had to fulfill inclusion criteria and the self-report DSM-5 criteria of BDD, SAD, or GAD (excluding mutual comorbidity). NC had to meet inclusion criteria with scores below the clinical cut-offs of 14 on the Body Dysmorphic Symptoms Inventory ('Fragebogen körperdysmorpher Symptome,' FKS; Buhlmann et al. 2009), 5.7 on the Generalized Anxiety Disorder Questionnaire-IV (GAD-Q-IV; Newman et al. 2002), 19 on the Social Phobia Inventory (SPIN; Sosic et al. 2008), and 11 on the PHQ-9 (Gräfe et al. 2004), while not fulfilling clinical diagnoses of BDD, SAD or GAD (see Footnote 1).
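For illustration, the group assignment implied by these criteria can be sketched as follows. This is a minimal sketch and not the screening code used in the study: the record type, field names, and function name are hypothetical, and acute suicidality is passed in as a precomputed flag rather than derived from the PHQ-9 item.

```python
from dataclasses import dataclass

# Cut-offs as reported in the text; controls must score below all of them.
NC_CUTOFFS = {"FKS": 14, "GAD_Q_IV": 5.7, "SPIN": 19, "PHQ9": 11}


@dataclass
class Screening:  # hypothetical record; field names are illustrative
    age: int
    excluded_diagnosis: bool  # self-reported psychotic, bipolar, or substance-related
    acute_suicidality: bool   # precomputed flag (assumption)
    fluent_german: bool
    scores: dict              # questionnaire sum scores keyed like NC_CUTOFFS
    dsm5_met: frozenset       # subset of {"BDD", "SAD", "GAD"}


def assign_group(p):
    """Return 'BDD', 'SAD', 'GAD', 'NC', or None if the case is excluded."""
    eligible = (
        18 <= p.age <= 65
        and not p.excluded_diagnosis
        and not p.acute_suicidality
        and p.fluent_german
    )
    if not eligible:
        return None
    if len(p.dsm5_met) == 1:  # exactly one disorder, i.e., no mutual comorbidity
        return next(iter(p.dsm5_met))
    if len(p.dsm5_met) > 1:   # comorbid case
        return None
    below_all = all(p.scores[k] < cut for k, cut in NC_CUTOFFS.items())
    return "NC" if below_all else None  # otherwise a subclinical case
```

Under this sketch, a participant who meets no DSM-5 criteria but scores at or above any cut-off is treated as subclinical and excluded.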

One thousand one hundred sixty-two individuals accessed the study’s landing page, and 600 participants (51.64%) started the experiment. Some data sets had to be discarded due to: missing data related to participant dropout or technical difficulties (N = 161), non-fulfillment of inclusion criterion 1 (N = 2), and inclusion criterion 3 (N = 43). We also excluded comorbid and subclinical cases as per the aforementioned criteria (N = 275). The final sample (N = 119) comprised the following groups: BDD (N = 29), SAD (N = 36), GAD (N = 22), and NC (N = 32).
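The participant flow reported above can be checked with a few lines of arithmetic. The figures are taken from the text; the variable names and dictionary labels are illustrative only.

```python
accessed = 1162
started = 600

# Exclusions after starting the experiment, as reported in the text.
exclusions = {
    "missing data (dropout or technical difficulties)": 161,
    "inclusion criterion 1": 2,
    "inclusion criterion 3": 43,
    "comorbid or subclinical": 275,
}
groups = {"BDD": 29, "SAD": 36, "GAD": 22, "NC": 32}

start_rate = round(100 * started / accessed, 2)  # percentage who started
final_n = started - sum(exclusions.values())     # remaining after exclusions
```

The remaining sample size computed this way should equal the sum of the four group sizes.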

Measures and Materials

Unless otherwise specified, higher scores on the following scales indicate higher symptom severity.

Diagnostic Criteria (DSM-5)

We rephrased the DSM-5 criteria of BDD, SAD and GAD (American Psychiatric Association 2013) into questions, using a dichotomous response format (“Yes”/“No”). Some criteria were broken down into two questions to improve comprehensibility. Accordingly, we administered six questions for BDD, ten questions for SAD, and eight questions for GAD. Internal consistencies for these self-report items were acceptable to high (BDD: KR-20 = 0.76; SAD: KR-20 = 0.96; GAD: KR-20 = 0.94).
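As a reminder of how the Kuder-Richardson coefficient for dichotomous items is obtained, a minimal sketch follows. It uses the standard KR-20 formula with population variances; the function name and the toy data are hypothetical and unrelated to the study data.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for a list of per-participant lists of 0/1 answers."""
    n = len(responses)
    k = len(responses[0])
    # Proportion of "Yes" (1) answers per item.
    p = [sum(r[j] for r in responses) / n for j in range(k)]
    item_var = sum(pj * (1 - pj) for pj in p)            # sum of item variances
    total_var = pvariance([sum(r) for r in responses])   # variance of sum scores
    return k / (k - 1) * (1 - item_var / total_var)


# Toy example: four respondents, three dichotomous items.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
toy_kr20 = kr20(toy)
```

For dichotomous items, KR-20 is algebraically equivalent to Cronbach's α.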

Body Dysmorphic Symptoms Inventory (Fragebogen körperdysmorpher Symptome; FKS)

To measure BDD symptom severity, we administered the FKS (Buhlmann et al. 2009). Sum scores on this scale range between 0 and 64. Internal consistency in this sample was high (Cronbach’s α = 0.89).

Social Phobia Inventory (SPIN)

We used the SPIN (Connor et al. 2000; Sosic et al. 2008) to assess SAD symptom severity. This scale yields sum scores between 0 and 68. Internal consistency in the present sample was high (α = 0.90).

Generalized Anxiety Disorder Questionnaire-IV (GAD-Q-IV)

We administered the GAD-Q-IV (Newman et al. 2002) to assess GAD symptom severity. Sum scores on this scale range between 0 and 12. Internal consistencies for the dichotomous (KR-20 = 0.78) and interval items (α = 0.86) were high in this sample.

Patient Health Questionnaire, Depression Module (PHQ-9)

To assess depression and suicidality, we administered the PHQ-9 (Gräfe et al. 2004; Kroenke and Spitzer 2002). Sum scores of this scale range between 0 and 27. Internal consistency was high in the present sample (α = 0.92).

Spielberger Trait Anxiety Inventory (STAI-T, Form Y)

We assessed trait anxiety as a possible covariate of interpretation bias using the STAI-T, Form Y (Spielberger et al. 1983). Sum scores on this inventory range between 20 and 80. Internal consistency in the present sample was high (α = 0.92).

State Scales

To rule out unintended affect changes induced by the SWAP, which could influence subsequent symptom measures, we assessed state distress, self-esteem, and body dissatisfaction before and after the SWAP.

Affect Scales

We assessed state anxiety, shame, sadness, disgust, and frustration, each on a single-item, 10-cm visual analog scale ranging from 0 (“not at all”) to 100 (“extremely”). Affect scales were administered before and after the SWAP. Single-item scores were summed per time point (see Dietel et al. 2018).

State Body Dissatisfaction and Self-Esteem (BISS and RSES-S)

We administered the 6-item Body Image State Scale (BISS; Cash et al. 2002) and 4-item Rosenberg Self-Esteem Scale, State Version (RSES-S; e.g., Nezlek and Kuppens 2008). Higher scores on the RSES-S reflect higher state self-esteem. Internal consistencies were high in the present sample at pre-assessment (α = 0.83 for the affect scales, α = 0.85 for the BISS, and α = 0.90 for the RSES-S).

Sentence Word Association Paradigm (SWAP)

To assess interpretation bias, we used a version of the WSAP (Hindash and Amir 2012) presenting the ambiguous sentence before the valent word (named Sentence Word Association Paradigm, SWAP; see Dietel et al. 2018).

The situation set consisted of 240 sentence-word combinations extracted from a pre-validated stimulus pool (see Appendix for examples). This stimulus pool was compiled from pre-existing stimulus sets of interpretation bias assessment studies in BDD (Buhlmann et al. 2002), SAD (Beard et al. 2011; Beard and Amir 2008) and GAD (Ogniewicz et al. 2014). Stimulus sets were kindly provided by the authors and translated. Further stimuli were generated based on expert consensus. Stimulus pools for appearance-related, social and general situations were then pre-validated in separate online surveys (see Footnote 2). Negative and positive interpretations in the present situation sets did not differ in word length for all situation categories (all ps > 0.17).

Participants initially saw ten practice trials, followed by 3 (appearance-related, social, general threat) × 80 sentence-word combinations. Each sentence was presented once. Half of all trials contained a positive word. Participants thus received 240 trials in two blocks of 120 trials, separated by a 45-s pause with an on-screen countdown to enhance concentration. Participants could skip this pause.

Trials began with a central black fixation cross displayed against a white background for 500 ms. An ambiguous sentence then appeared in the center of the screen for 3500 ms. It was subsequently replaced by a positive or negative word. Participants were requested to indicate as fast as possible whether the sentence and word were related (i.e., pressing “L” for “Yes”; “S” for “No”). The next trial was initiated after each decision. Decisions and RT were recorded. Internal consistencies per situation category were acceptable to excellent for decision rates (α = 0.83–0.92) and RT (α = 0.70–0.95).
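To make the two recorded outcomes concrete, the following sketch shows one possible trial record and the computation of percentage endorsement rates per situation category and valence. The record layout and names are hypothetical and do not reflect the Inquisit output format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Trial:  # hypothetical record of a single SWAP trial
    category: str   # "appearance", "social", or "general"
    valence: str    # "positive" or "negative"
    endorsed: bool  # True if sentence and word were judged as related
    rt_ms: float    # time from word onset to key press, in ms


def endorsement_rates(trials):
    """Percentage of endorsed trials per (category, valence) cell."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [endorsed, total]
    for t in trials:
        cell = counts[(t.category, t.valence)]
        cell[0] += t.endorsed
        cell[1] += 1
    return {key: 100 * e / n for key, (e, n) in counts.items()}
```

Reaction times would be summarized in the same cells, additionally split by decision (endorse vs. reject).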

Procedure

The experiment was written in Inquisit Web Version 4 (Millisecond Software 2015) and accessible via desktop browsers. Upon informed consent, participants received screening questions for inclusion criteria and the PHQ-9. In case of a study exclusion, participants were directed to a webpage providing contact information of the principal investigator and additional mental health services. Upon meeting the inclusion criteria, participants completed the state scales (affect scales, BISS, RSES-S), SWAP, and state scales once again. Participants then received the STAI-T, FKS, DSM-5 BDD criteria, SPIN, DSM-5 SAD criteria, GAD-Q-IV, and DSM-5 GAD criteria. Upon completion, participants were redirected to a separate webpage to enter their e-mail address for the lottery. The experiment lasted 36.27 min on average.

Design and Statistical Analyses

Data analyses were conducted using (1) SPSS Statistics Version 24 and (2) R (R Core Team 2018) with the RStudio interface (RStudio Team 2018). SPSS was used to initially investigate between-group demographic and psychometric differences via analyses of variance (ANOVA), Welch's t-tests and χ2 tests, as well as frequentist analyses. Non-parametric alternatives were employed when basic assumptions of ANOVA were violated. R was used for Bayesian MLM-based analyses, treating individual responses as nested within both participants and items assuming a maximal multilevel structure (see OSF for the exact specification of the multilevel structure).

Bayesian Multilevel Models (MLM) and Wiener Diffusion Models

We employed MLM as they offer greater flexibility and test power when treating interdependent, incomplete data sets that do not meet sphericity assumptions (e.g., Hoffman and Rovine 2007; Quené and Van den Bergh 2004). Inter alia, this approach is advantageous regarding the previously discussed problem of interdependency in SWAP-based RT data (see Dietel et al. 2018; Möbius et al. 2015).

Within this framework, we assumed the binary responses to be Bernoulli distributed (i.e., applying multilevel logistic regression) and the corresponding response times to be distributed according to an exponentially modified Gaussian distribution, which is a common choice for modeling RT in cognitive tasks (e.g., Balota and Yap 2011; Heathcote et al. 1991). We applied several tidyverse packages (Wickham 2017) for data preparation and plotting as well as the brms package (Bürkner 2017, 2018), which is based on the probabilistic programming language Stan (Carpenter et al. 2017). Overall, we employed a joint MLM of both binary decisions and RT, including a supplementary analysis via Wiener diffusion modeling to disentangle cognitive components underlying group-wise RT differences (e.g., Link and Heath 1975; Ratcliff 1978; Ratcliff and McKoon 2008; Wagenmakers 2009).
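The exponentially modified Gaussian can be understood as the sum of a normally distributed and an exponentially distributed component, which produces the right skew typical of RT data. The following sketch simulates such values; parameter names and values are illustrative assumptions and are not estimates from the present data.

```python
import random
from statistics import mean, median


def sample_ex_gaussian(mu, sigma, tau, n, seed=0):
    """Draw n values as Normal(mu, sigma) + Exponential(mean = tau)."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) + rng.expovariate(1 / tau) for _ in range(n)]


# Illustrative values in ms; the theoretical mean is mu + tau.
rts = sample_ex_gaussian(mu=600, sigma=80, tau=300, n=20000)
rt_mean, rt_median = mean(rts), median(rts)
```

Because of the exponential tail, the mean of such a sample exceeds its median.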

Wiener diffusion models allow for an in-depth analysis of information contained within RT distribution, i.e., information processing components (Wagenmakers 2009). These models conceptualize stimulus processing as a noisy accumulation of evidence over time, where one of the responses is given when the evidence reaches a set boundary (see Fig. 1). This accumulation may be represented within a bivariate probability density function, incorporating both RT and response options (i.e., decision rates) comprehensively (Vandekerckhove et al. 2011). The probability density function ultimately yields four parameters that may be mapped onto cognitive processes: drift rate (i.e., speed of information uptake), boundary separation (i.e., amount of information considered before a decision), starting point or initial bias (i.e., a priori biases in decision thresholds), and non-decision time (i.e., encoding and execution processes unrelated to decisions; Voss et al. 2013). Importantly, diffusion models assume parameters to vary from trial to trial, allowing for trial-based analyses and thereby avoiding information loss. They have been previously applied in the mechanistic analysis of binary choice tasks, e.g., the Implicit Association Test (Klauer et al. 2007; van Ravenzwaaij et al. 2011), and other cognitive bias assessments, e.g., the dot-probe paradigm (Price et al. 2019).
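The accumulation process described above can be illustrated with a simple simulation. This is a didactic sketch using an Euler approximation with unit diffusion noise, not the estimation procedure used in the present analyses; mapping the upper boundary to endorsement ("related") and the lower boundary to rejection is an assumption made for illustration, as are all parameter values.

```python
import math
import random


def simulate_trial(v, a, z, t0, rng, dt=0.001, max_t=10.0):
    """Simulate one decision.

    v: drift rate, a: boundary separation, z: relative starting point (0-1),
    t0: non-decision time in s. Returns (endorsed, rt_in_s).
    """
    x = z * a                  # evidence starts between the boundaries 0 and a
    t = 0.0
    sd = math.sqrt(dt)         # noise scaling for a unit diffusion constant
    while 0.0 < x < a and t < max_t:
        x += v * dt + rng.gauss(0.0, sd)
        t += dt
    # If max_t is reached without a boundary hit, the trial counts as a rejection.
    return x >= a, t0 + t


def endorsement_share(v, n=500, seed=1):
    """Share of simulated trials ending at the upper boundary."""
    rng = random.Random(seed)
    hits = sum(simulate_trial(v, a=1.5, z=0.5, t0=0.3, rng=rng)[0] for _ in range(n))
    return hits / n
```

In this sketch, a positive drift rate makes endorsement both more frequent and faster, a negative drift rate favors rejection, and a larger boundary separation lengthens RT regardless of the response.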

Fig. 1 Depiction of the Wiener diffusion model and its parameters. Figure adapted from van Ravenzwaaij et al. (2012). Optimal decision making in neural inhibition models. Psychological Review, 119, 201–215. Copyright 2012 by American Psychological Association. Adapted with permission

In this analysis, we predicted all four parameters of the Wiener diffusion model using a maximal multilevel structure. For drift rate, non-decision time, and boundary separation, the intercept, as well as the effects of item type and valence, were modeled as varying across participants. The intercept was also set as varying across items. Since the initial bias is, by definition, not affected by item properties, we only assumed a varying intercept across participants. All varying effects were allowed to correlate across participants and items. We included RT between 100 and 5000 ms (Luce 1986; Whelan 2008).

Frequentist Analyses

To improve consistency with prior studies, we performed mixed repeated-measures analyses of variance (ANOVAs) on endorsement rates, RT, and state scales (Dietel et al. 2018; Möbius et al. 2015), using two-tailed testing at an alpha level of 0.05. For the affect induction analysis, log-transformed post-assessment state measures were submitted to an analysis of covariance (ANCOVA) with group (BDD, SAD, GAD, NC) as a four-level between-subjects factor and pre-assessment measures as a covariate (Dietel et al. 2018). Percentage endorsement rates were submitted to a 4 (group: BDD, SAD, GAD, NC) × 3 (category: appearance-related, social, generally ambiguous) × 2 (valence: positive, negative) mixed ANOVA. We explored critical interactions using mixed ANOVAs and multiple comparison post hoc tests. Given sample size imbalance and heterogeneous variance, we report Games-Howell post hoc test outcomes (see Jaccard et al. 1984). For RT analysis, we eliminated the first and last percentile of observed RT, i.e., below 468.61 ms and above 5108.56 ms (Dietel et al. 2018; Möbius et al. 2015). Medians were subjected to a 4 (group: BDD, SAD, GAD, NC) × 2 (valence: positive, negative) × 2 (decision: endorse, reject) mixed repeated-measures ANOVA per situation category. We followed up significant interactions using Bonferroni-corrected t-tests (see Footnote 3).
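The RT preprocessing for the frequentist analyses (percentile-based trimming followed by cell medians) can be sketched as follows. Function names and the tuple layout are hypothetical, and percentile definitions differ between software packages, so the bounds obtained here need not match those produced by SPSS.

```python
from collections import defaultdict
from statistics import median, quantiles


def percentile_bounds(rts):
    """Return the 1st and 99th percentile of all observed RT."""
    cuts = quantiles(rts, n=100)  # 99 cut points; default 'exclusive' method
    return cuts[0], cuts[-1]


def cell_medians(trials, low, high):
    """Median RT per (valence, decision) cell after trimming.

    trials: iterable of (valence, decision, rt_ms) tuples.
    """
    cells = defaultdict(list)
    for valence, decision, rt in trials:
        if low <= rt <= high:  # keep only RT within the percentile bounds
            cells[(valence, decision)].append(rt)
    return {cell: median(values) for cell, values in cells.items()}
```

Medians rather than means are used per cell because they are less sensitive to the remaining skew in RT distributions.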

A Priori Power Analysis

The a priori power analysis using G*Power (Faul et al. 2007) was based on a WSAP-based interpretation bias assessment study with a similar setup, comparing mentally healthy controls and individuals with clinical SAD (Amir et al. 2012). This study reported large between-group effect sizes for negative (Cohen’s d = 1.23) and positive interpretations (d = 1.19). Assuming this large between-group (i.e., mentally healthy vs. clinical individuals) effect size for positive and negative interpretations (analysis parameters: 1 − β = 0.95, α = 0.05) yielded a sample size of n = 20 per group (hence: N = 80 in total) to observe these effects. Given the lack of WSAP-based studies comparing interpretation patterns between clinical groups, effect sizes for these comparisons could not be estimated a priori.
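The order of magnitude of this sample size can be reproduced with a common normal approximation. The sketch below assumes a two-tailed independent-samples comparison and adds a standard small-sample correction term; G*Power relies on exact noncentral distributions, so results may differ slightly. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.95):
    """Approximate n per group for a two-tailed, two-sample comparison."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-tailed alpha
    z_beta = z.inv_cdf(power)           # quantile corresponding to 1 - beta
    # Normal approximation plus small-sample correction (z_alpha^2 / 4).
    n = 2 * (z_alpha + z_beta) ** 2 / d ** 2 + z_alpha ** 2 / 4
    return ceil(n)


n_small_d = n_per_group(1.19)  # using the smaller of the two reported effect sizes
total_n = 4 * n_small_d        # four groups
```

Using the smaller of the two reported effect sizes is the conservative choice, since it yields the larger required sample.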

Results

Graphical and numerical summaries (including 95% credibility intervals of the effects) illustrating estimates for all MLM-based analyses can be found in the OSF repository.

Baseline Measures

Groups did not differ in gender, age, or years of education (all ps > 0.08). However, there was a significant difference in psychotherapy status, χ2(3) = 11.36, p = 0.01, V = 0.31. Numerically, clinical groups received psychotherapy more frequently than non-clinical controls, with a significant difference in psychotherapy status between GAD and NC. As expected, clinical groups additionally differed from non-clinical controls regarding scores on the PHQ-9, STAI-T, SPIN, FKS, GAD-Q-IV, affect scales, BISS, and RSES-S (see Tables 1, 2).
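The reported effect size for psychotherapy status follows from the test statistic and the sample size. The sketch below assumes a 4 (group) × 2 (psychotherapy: yes, no) contingency table, which is consistent with the three degrees of freedom; the function name is hypothetical.

```python
from math import sqrt


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V from a chi-square statistic and the table dimensions."""
    return sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))


v = cramers_v(chi2=11.36, n=119, n_rows=4, n_cols=2)
v_rounded = round(v, 2)
```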

Table 1 Demographics and questionnaire measures for all groups
Table 2 Mean state ratings (SD) for all assessment points

Unintended Affect Induction

We conducted multilevel linear regression models on the mean BISS, sum distress, and sum RSES state scales to investigate unintended SWAP-based affect induction. Overall, there were no substantial pre-post differences across all groups and state scales, indicating no unintended change in state distress, body dissatisfaction, or self-esteem (see Table 2; see also Footnote 4).

Decision Rates

Appearance-related Situations

As shown in Fig. 2, for positive appearance-related interpretations, the BDD and SAD group showed substantially lower endorsement rates (vs. GAD and NC). Conversely, for negative appearance-related interpretations, the BDD group exhibited substantially higher endorsement rates than all other groups. Similarly, the SAD and GAD group (vs. NC) demonstrated substantially higher endorsement rates for this category. All other differences remained non-substantial within a 95% credibility interval.

Fig. 2 Mean endorsement rates (%) per situation category. Error bars represent 95% credibility intervals of the mean. BDD body dysmorphic disorder, SAD social anxiety disorder, GAD generalized anxiety disorder, NC non-clinical controls

Social Situations

For positive social interpretations, endorsement rates were substantially lower in the BDD and SAD group, compared to GAD and NC. For negative social interpretations, both the BDD and SAD group showed substantially higher endorsement rates than all other groups. Further, the GAD group endorsed more negative social interpretations than the NC group. All other differences were non-substantial within a 95% credibility interval.

General Situations

For positive general situations, endorsement rates were substantially lower in the BDD and SAD as compared to the NC group. For negative general interpretations, the BDD, SAD, and GAD group exhibited substantially higher endorsement rates than NC. All other differences remained non-substantial within a 95% credibility interval (see Footnote 5).

Reaction Times: Multilevel Analysis

Appearance-related Situations

As evident in Fig. 3, the BDD and SAD groups endorsed positive appearance-related interpretations substantially more slowly than the GAD group. Conversely, the GAD and BDD groups showed substantially faster rejection of positive appearance-related interpretations than the NC group. For negative appearance-related interpretations, the BDD, SAD and GAD groups demonstrated faster endorsement (vs. NC), while the BDD and SAD groups exhibited slower rejection RT (vs. GAD). All other differences remained non-substantial within a 95% credibility interval.

Fig. 3 Mean reaction times (in ms) per situation category and decision. Error bars represent 95% credibility intervals of the mean. BDD body dysmorphic disorder, SAD social anxiety disorder, GAD generalized anxiety disorder, NC non-clinical controls

Social Situations

For positive social interpretations, the SAD group showed substantially slower endorsements than the GAD group. Upon rejecting negative social interpretations, the BDD and SAD group were substantially slower than the GAD group. All other differences remained non-substantial within a 95% credibility interval.

General Situations

The GAD group rejected positive general interpretations substantially faster than the NC group. Conversely, the GAD and the BDD group endorsed negative general interpretations substantially faster than the NC group. All other differences remained non-substantial within a 95% credibility interval (see Footnote 6).

Reaction Times: Wiener Diffusion Models

Figure 4 displays mean SWAP effects on drift rate, boundary separation, initial bias, and non-decision time, including 95% credibility intervals of the effects.

Fig. 4 SWAP effects across participants on drift rate, initial bias, non-decision time and boundary separation. Error bars represent 95% credibility intervals of the mean. BDD body dysmorphic disorder, SAD social anxiety disorder, GAD generalized anxiety disorder, NC non-clinical controls

Drift Rates

As shown in Fig. 4 (upper left panel), there were non-zero SWAP effects for drift rates across all categories and groups. More positive values indicated relatively faster endorsement, and more negative values reflected faster rejection of the valent word. For negative appearance-related interpretations, drift rate values were overall negative (i.e., reflecting faster rejection), with the BDD group showing less negative values than all other groups. Further, the SAD and GAD group exhibited less negative values than the NC group in this situation category. For positive appearance-related interpretations, drift rates were overall positive (i.e., reflecting faster endorsement), with the BDD and SAD group showing less positive values than the other groups.

Similarly, for negative social interpretations, drift rates for the BDD and SAD group were less negative than GAD and NC drift rates. Again, for positive social interpretations, the BDD and SAD group exhibited less positive values than GAD and NC. The latter pattern was identically evident for positive general interpretations. All other between-group and between-category differences were non-substantial.

Non-decision Time, Boundary Separation and Initial Bias

Figure 4 shows minimal SWAP effects across categories and groups, with no substantial between-group or between-category differences for all other Wiener diffusion model indices (i.e., non-decision time, boundary separation, and initial bias). Individuals thus did not differ across groups and situation categories with regard to non-decisional components, such as response caution or general response slowing (Voss et al. 2013).

Discussion

This study investigated the phenomenology of interpretation bias across different clinical disorders (i.e., BDD, SAD, GAD, vs. NC) and current concerns. Using the SWAP paradigm, we examined explicit and more implicit, RT-based bias components for positive and negative interpretations. Further, we proposed a multilevel, diffusion model-based approach in analyzing SWAP-based RT bias indices to examine the relative contribution of underlying cognitive processes.

Explicit interpretation bias (i.e., based on decision rates) was present in BDD, SAD, and GAD, consistent with the transdiagnostic hypothesis. However, bias patterns were shaped by content-specific differences between these groups, in line with the current concern hypothesis. As expected, individuals with BDD and individuals with SAD, compared to those with GAD and NC, showed diminished positive appearance-related interpretation bias. BDD participants, however, displayed a pronounced negative appearance-related interpretation bias relative to all other groups. Similarly, both the BDD and SAD groups, vs. GAD and NC, exhibited reduced positive and enhanced negative social interpretation bias patterns. Only the BDD group endorsed fewer positive general interpretations than the GAD and NC groups, while all clinical groups demonstrated a more pronounced negative general interpretation bias than NC. These results are largely in line with the findings of Buhlmann et al. (2002), showing that BDD is associated with reduced positive and enhanced negative interpretation bias for appearance-related, social, and general situations. Importantly, the similarities of interpretation profiles further support the postulate that BDD and SAD are cognitively proximal (Fang and Hofmann 2010; Fang and Wilhelm 2015). Future research might aim to identify contributive factors driving these commonalities, such as fear of negative evaluation (Fang and Hofmann 2010). Examining dimensional relationships between these factors, bias patterns, behavioral implications (e.g., approach and avoidance), and symptom severity represents a critical next step to empirically explore the etiological role of interpretation biases within emotional disorders.

Relatedly, all clinical groups endorsed more negative general interpretations, suggesting a common vulnerability factor reflected in this pattern (Buhlmann et al. 2002). Nonetheless, it should be noted that general situations encompass relatively heterogeneous current concerns (e.g., romantic relationships, finances, health concerns), which might, in sum, be of broader relevance across mental disorders as compared to more circumscribed concerns. In this respect, future studies might disentangle response patterns to specific current concerns contained in general situation sets.

In line with the current concern hypothesis, our results indicate a pronounced negative but no lack of positive interpretation bias for general situations in GAD compared to NC. One WSAP-based study had previously demonstrated this pattern in an unselected sample, showing that explicit threat bias, and not reduced positive bias, was predictive of GAD symptoms (Ogniewicz et al. 2014). The present study extends these findings by demonstrating the identical relationship in a sample meeting self-report DSM-5 GAD criteria.

Concerning the implicit, RT-related bias component, MLM analyses revealed a relatively heterogeneous pattern. For appearance-related and social situations, the BDD and SAD groups (vs. GAD) exhibited differences in RT, with slower endorsement of positive and slower rejection of negative interpretations being the most consistent findings. RT differences for these groups (vs. GAD and NC) were non-substantial for general situations. Moreover, findings for the GAD group reflected a differentially faster endorsement of negative and faster rejection of positive general interpretations compared to NC. We thus identified differential RT to both positive and negative general interpretations in GAD, whereas only a negative explicit interpretation bias was present. Overall, patterns of decision rates and RT partially diverged, which, within methodological limitations, might be due to several reasons, such as quick interpretation changes or conflicting evaluation of individuated behaviors (see Rydell et al. 2008). In sum, these discrepancies highlight the benefit of RT-based indices as a supplement in understanding the architecture of interpretation bias. Nevertheless, the more implicit RT results only partially supported the current concern hypothesis, as clinical groups showed alterations that were specific to some, but not all, current concerns. Hence, further research is needed to reinvestigate the relationship between implicit and explicit interpretation bias, e.g., within larger, clinically diagnosed samples and lab-based settings.

The examination of cognitive components via Wiener diffusion models revealed that between-group and between-content RT differences originated mainly from alterations in drift rate, that is, in the speed of information uptake. As all other parameters did not vary substantially across groups and concerns, RT differences cannot be attributed to factors such as differential response caution (i.e., boundary separation) or response bias (i.e., initial bias). Drift rate relates to the recognition and classification speed of stimuli and thus, inter alia, to associative strength in memory (McKoon and Ratcliff 2012; White et al. 2010). Hence, the current findings suggest that a faster reaction to an interpretation results from closely associated sentence-word combinations being processed more easily than more distal combinations. In sum, as content-specific drift rate differences and endorsement rates were meaningfully associated with clinical group status, these results suggest that associative memory networks might indeed be organized as per disorder-relevant current concerns. Future studies should validate this novel application and interpretation of Wiener diffusion model indices for this task within larger replication studies.
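
The link between drift rate and response speed described above can be illustrated with a small simulation. The following is a minimal sketch, not the multilevel estimation procedure used in this study: it simulates single trials of a Wiener diffusion process with the Euler-Maruyama scheme, and all names (`simulate_trial`, `summarise`) and parameter values (boundary separation 1.5, unbiased starting point 0.5, non-decision time 0.3 s, noise scaling fixed at 1) are hypothetical choices made for illustration. Treating the upper boundary as endorsement follows the sign convention described for Fig. 4 and is otherwise an assumption.

```python
import random


def simulate_trial(drift, boundary, start_bias, non_decision, rng,
                   dt=0.002, max_time=10.0):
    """Simulate one Wiener diffusion trial with the Euler-Maruyama scheme.

    Returns (response, rt). response is "upper", "lower", or "none" if no
    boundary was reached within max_time seconds of decision time.
    """
    evidence = start_bias * boundary   # starting point between 0 and boundary
    step_sd = dt ** 0.5                # within-trial noise fixed at s = 1
    elapsed = 0.0
    while 0.0 < evidence < boundary:
        if elapsed >= max_time:
            return "none", None
        evidence += drift * dt + rng.gauss(0.0, step_sd)
        elapsed += dt
    response = "upper" if evidence >= boundary else "lower"
    return response, elapsed + non_decision


def summarise(drift, n_trials=1000, seed=1):
    """Return (proportion of upper responses, mean RT of upper responses)."""
    rng = random.Random(seed)
    # Hypothetical parameter values, chosen only for illustration.
    trials = [
        simulate_trial(drift, boundary=1.5, start_bias=0.5,
                       non_decision=0.3, rng=rng)
        for _ in range(n_trials)
    ]
    upper_rts = [rt for response, rt in trials if response == "upper"]
    return len(upper_rts) / n_trials, sum(upper_rts) / len(upper_rts)


if __name__ == "__main__":
    for drift in (0.5, 2.0):
        p_upper, mean_rt = summarise(drift)
        print(f"drift={drift}: P(upper)={p_upper:.2f}, "
              f"mean upper RT={mean_rt:.2f} s")
```

With boundary separation, starting point, and non-decision time held constant, a larger drift rate should yield more frequent and faster upper-boundary responses, whereas a negative drift rate should favor the lower boundary. This mirrors the reading of drift rate given above, in which RT differences arise from evidence accumulation rather than from response caution or response bias.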

This study has some limitations. To reach a heterogeneous range of participants, it was entirely web-based and thus conducted in a less controlled environment, with clinical diagnoses based on self-report DSM-5 criteria. Previous research indicates no validity differences between in-lab and web-based settings for most experimental tasks (e.g., Hilbig 2016; Ramsey et al. 2016; Semmelmann and Weigelt 2017), which is in line with the good psychometric properties found for all measures used in this study. Nonetheless, regarding self-report diagnoses, it remains unclear how accurately participants were able to respond to diagnostic criteria, which might have affected validity. Overall, replication studies using clinician-administered interviews appear warranted.

Relatedly, the present study design did not allow for an assessment of comorbidities, which are prevalent in the disorders tested (Kessler et al. 2005). Depression, in particular, has been shown to be associated with a medium-sized effect on interpretation bias patterns (Everaert et al. 2014). Hence, future research should investigate the influence of comorbidities determined through standardized clinical interviews. In this respect, diagnostic procedures might especially aid in differentiating GAD from depression. The considerable overlap between these disorders (e.g., concerning repetitive negative thinking) has been discussed in the literature (e.g., Mennin et al. 2008), highlighting problems in effectively distinguishing them through self-report. Notably, depression scores for the GAD group in this study were significantly higher than for all other clinical groups and might have influenced interpretation bias patterns.

Further, it is noteworthy that WSAP interpretation options might be somewhat heterogeneous, potentially affecting the association between interpretation bias and symptoms. For instance, while some interpretations reflect attributions (e.g., “funny” vs. “stupid”), others more indirectly refer to behavioral responses (e.g., “help” vs. “avoid”; see Supplementary Table A1 in the Appendix). This heterogeneity is an inherent characteristic of most interpretation bias stimulus sets and potentially yields a more comprehensive picture of bias features. Nonetheless, future studies might focus on specific aspects of interpretation bias (e.g., attribution) to test their association with symptom severity. Last, although our study was powered to detect large between-group differences in mentally healthy vs. clinical populations, statistical power might not have been sufficient to observe small to medium-sized effects (e.g., between clinical groups). Detecting effects of this size will require replication in a larger sample.

In sum, this study is the first to provide a comparative account of interpretation bias across BDD, SAD, and GAD, showing its transdiagnostic presence and specific modulation by current concerns, most coherently for explicit bias patterns. Our findings also demonstrate a shared propensity of BDD and SAD to misinterpret ambiguous appearance-related and social situations. Further, GAD appears to be characterized by an explicit and implicit negative general interpretation bias, with a lack of positive general interpretation bias evident only in the implicit, RT-based component. Additionally, mechanistic insights from RT-based results indicate a differential structural organization of associative memory within different disorders and current concerns, which is consistent with pertinent models of information processing (e.g., Beck and Haigh 2014). These findings further underscore the importance of identifying and targeting disorder-relevant interpretation bias profiles in cognitive-behavioral therapy, for example, via functional analyses, idiosyncratically tailored cognitive interventions, and CBM-I. In this respect, clinicians should attend to potential cognitive overlap, e.g., for appearance-related and social scenarios in BDD and SAD. Prospective CBM-I programs should equally incorporate such overlap and flexibly address different current concerns within training rationales to enhance intervention efficacy.