Introduction

Cognitive behavioral therapy (CBT) has proven effective for many common mental disorders (Carpenter et al., 2018; Hofmann et al., 2012). Although many patients are helped by CBT, a considerable proportion do not respond sufficiently, and relapse is common (e.g., Ginsburg et al., 2014; Loerinc et al., 2015). This is a considerable challenge that needs to be tackled. Incorporating recent findings and approaches from neuroscience represents one promising route forward (see also the call for a mental health science of psychological treatments by Holmes et al., 2014). The 9th World Congress of Behavioural and Cognitive Therapies (WCBCT), held in Berlin, Germany, in 2019, offered several symposia on topics at the crossroads of neuroscience and psychological therapies (e.g., Craske, 2019; Lueken, 2019; Månsson, 2019). Here, we present a selection of findings from three areas where neuroscience can offer novel perspectives to better understand (a) how CBT works on a biological level (i.e., characterizing CBT-induced mechanisms of change), (b) how we can enrich CBT with neuroscience-informed techniques (i.e., augmentation of CBT), and (c) why some patients may respond better to CBT than others (i.e., identifying moderators and prognostic markers of CBT outcome). See also Fig. 1.

Fig. 1

Illustration of the current article’s theme, describing how cognitive behavioral therapy (CBT) can be enriched by neuroscience, including assessments of behavior, brain, cells, and genes. Neuroscience can offer novel perspectives to better understand (a) how CBT works on a biological level (i.e., characterizing CBT-induced mechanisms of change), (b) how we can enrich CBT with neuroscience-informed techniques (i.e., augmentation of CBT), and (c) why some patients may respond better to CBT than others (i.e., identifying prognostic markers of CBT outcome). The figure was created in Keynote v. 10 (Apple Inc., CA, USA).

Characterizing CBT-Induced Mechanisms of Change

Two commonly used tools to image the living brain are magnetic resonance imaging (MRI) and positron emission tomography (PET). These techniques can be used to map brain structure and function at the macroscopic level, including the wiring and cross-talk between brain regions, and thus to investigate neural correlates of CBT. Pioneering work in this field was performed by Baxter and colleagues (Baxter Jr et al., 1992), who in the early 1990s used PET to show that responders to CBT for obsessive-compulsive disorder, like responders to SSRI treatment, showed a reduced resting metabolic rate of glucose in the striatum. This was followed by Furmark et al. (2002), who showed that CBT and the SSRI citalopram produced common reductions in amygdala activation during anxiogenic public speaking in patients with social anxiety disorder. Work in depression soon followed, with Goldapple and co-workers finding changes in prefrontal cortex resting glucose metabolism in CBT-treated patients (Goldapple et al., 2004). Following the advent of functional MRI (fMRI) and increased accessibility to neuroimaging facilities, the number of CBT treatment studies has surged. An initial fMRI study of spider phobia, in which patients viewed films of spiders, was conducted by Paquette et al. (2003), who reported CBT-induced dampening of both prefrontal and limbic activity. These early studies highlight that CBT response maps to neural changes and that these changes may be disorder-specific or perhaps symptom-specific. Linking behavior and CBT outcome to regional brain activity is interesting because it conveys information about the biological representation. If we better understand where, what, and when things change, this could enrich CBT by providing therapeutic targets.
For instance, activity in subcortical limbic structures such as the amygdala can be interpreted as related to threat detection or fear, whereas the prefrontal cortex might be related to regulatory functions such as higher-order cognitive control. However, these links are not trivial, as brain-behavior relationships are probably best described as many-to-many: no single brain region is responsible for a specific behavior or function, and each brain region is involved in many different behaviors or functions. Moreover, the meaning of increased activity is not straightforward, as the same increase in activity may signal either improved performance or compensatory mechanisms. In summary, the vast majority of studies detecting treatment-related changes in the brain have been conducted on anxiety and depressive disorders. They broadly support the notion of a dual-process model of psychotherapy in anxiety disorders, in which abnormally increased limbic activation decreases and prefrontal activity increases following treatment. Partly overlapping findings are reported for depression, albeit with a stronger focus on prefrontal activation following treatment (Lueken & Hahn, 2016; Marwood et al., 2018). The results are in accordance with the notion that CBT enhances emotion regulation capacities, resulting in increased top-down prefrontal control of structures conveying emotional and autonomic arousal (Marwood et al., 2018). On the other hand, prefrontal hyperactivity has been reported as a feature of anxiety symptomatology, perhaps indicating excessive recruitment of maladaptive emotion regulation strategies (Reinecke et al., 2015). Key insights into how we learn and inhibit pathological forms of fear come from research on fear conditioning and extinction. Study Box 1 therefore addresses this basic learning mechanism and exemplifies how neuroimaging can help us to better understand what happens during exposure-based CBT.

Study Box 1 How understanding fear conditioning and extinction can help us improve exposure techniques

In addition to macro-level MRI and PET assessments of the brain, neuroscience also provides tools to investigate therapy-induced effects on cells, proteins, and enzymes, e.g., inflammatory cytokines, markers of oxidative stress (Chen et al., 2011), and telomere biology (Månsson et al., 2019). These studies broadly support the notion that CBT not only can ameliorate psychiatric symptoms but also reduce the expression of inflammatory cytokines and improve cellular protection. Although there are relatively few CBT studies investigating effects at the cellular level, it is interesting to note there is a tentative overlap between effects of psychopharmacological interventions and CBT also on these measures (Lindqvist et al., 2015; Verhoeven et al., 2014). However, current study designs preclude separating primary effects of treatment from secondary effects of symptom improvement. To mitigate this limitation, we propose innovative sampling protocols such as multiple sampling over the course of treatment as outlined below under Challenges for the Future. Similar and dissimilar effects of SSRIs and CBT are particularly interesting considering the novel developments on enhancing the effects of CBT with pharmacological agents. Next, we will review some of the developments along these lines.

Neuroscience-Based Augmentation of CBT

Elucidating the neural effects and mediators of CBT may also open up novel neuroscience-based treatment options. As mentioned in Study Box 1, extinction does not erase the amygdala-based fear memory but rather induces an inhibitory safety memory conferred via the ventromedial prefrontal cortex (vmPFC) that exerts top-down control of amygdala activity and hence fear memory expression (Dunsmoor et al., 2015). Thus, fear expression is determined by the interaction between the fear memory and the safety memory. Notably, the safety memory is labile, as reflected by the return of the fear response after a change of context, the passage of time, or stress provocation, and as seen in the relapse of patients treated with CBT (Vervliet et al., 2013). Recently, targeted manipulation of the underlying memory processes, i.e., directly weakening the fear memory or strengthening the safety memory, has been proposed and tested. For example, after recall, a memory enters a destabilized state before being reconsolidated, a process that takes a couple of hours. Memories are therefore not fixed once encoded but remain open to change during this reconsolidation window after activation. This was elegantly demonstrated in animals (Nader et al., 2000) and replicated in humans using both experimentally induced fear memories (Ågren et al., 2012) and long-term fear of spiders (Björkstrand et al., 2016). Interestingly, fear memory activation and subsequent extinction within, but not outside, the reconsolidation window attenuated fear responses by directly targeting the amygdala fear memory trace (Ågren et al., 2012). These findings open a window of opportunity for enhancing CBT, e.g., by destabilizing fear memories and then timing the exposure to fall within the reconsolidation window. Reconsolidation of fear memories may also be targeted using various pharmacological compounds.
Interesting contributions to this end have been achieved using the beta-blocker propranolol to block the reconsolidation of spider fear memories in phobic patients (Soeter & Kindt, 2015). Although interfering with reconsolidation shows some promise in attenuating excessive fear and anxiety, small sample sizes and a recent non-replication of initial findings (Chalkia et al., 2020) call for pre-registered trials with larger samples to delineate the effects.

Combining CBT with pharmacological cognitive enhancers can be used to boost inhibitory learning and the safety memory. Among the cognitive enhancers, D-cycloserine (DCS) is one of the most studied in mental illness (e.g., Andersson et al., 2015). DCS is an antibiotic that also acts as a partial agonist at the N-methyl-D-aspartate (NMDA) receptor, a receptor involved in memory formation. Work on DCS and the strengthening of extinction memory is a successful result of translational research, in which findings in animals have been translated to human experimental studies employing Pavlovian fear conditioning, and then to clinical studies of anxiety and related disorders. As with many pharmacological agents, timing the plasma concentration to coincide with learning and memory formation is crucial. The administration of DCS about 1–2 h prior to Pavlovian fear extinction has been shown to enhance extinction learning and attenuate both neural and subjective fear responses (Ebrahimi et al., 2020). In anxiety and related disorders, DCS has similarly been shown to facilitate exposure learning (Mataix-Cols et al., 2017), whether administered before or after the exposure session (Smits et al., 2020). One of the potential drawbacks of DCS and other cognitive enhancers is that they enhance learning regardless of the success of the exposure session. To safeguard against unwanted strengthening of fear memory, it has been suggested that DCS be administered only after exposure sessions with adequate fear reduction. It should be mentioned that DCS is but one example, and there is a large range of potential pharmacological agents that could be used to potentiate inhibitory learning during CBT (see, e.g., Singewald et al., 2015, for a review). See also Study Box 2.

Study Box 2 Cognitive enhancers: a paradigm shift in the combined psycho-pharmacological treatment of mental disorders?

Pharmacological interventions often affect broad brain regions as well as non-brain targets, which may result in unwanted side effects. More targeted modulation of neural activity with non-invasive brain stimulation, such as transcranial magnetic stimulation (TMS) and transcranial direct current stimulation (tDCS), can overcome these limitations (Burger, 2019). These techniques can modulate neural activity and be used to target brain regions involved in emotion regulation and extinction learning, such as the prefrontal cortex. Indeed, there are initial reports of non-invasive brain stimulation augmenting exposure therapy for PTSD (Isserles et al., 2013) and spider phobia (Herrmann et al., 2017). However, a common limitation of current non-invasive brain stimulation techniques is that they only reach superficial cortical sites, whereas many vital brain functions depend on deeper structures, e.g., limbic structures like the amygdala and hippocampus, and the medial prefrontal cortex. To overcome this limitation, deep regions can be targeted indirectly by stimulating superficial cortical sites that are functionally connected to them. As an example, Raij et al. (2018) used functional connectivity fMRI to identify cortical regions connected to the vmPFC. The authors then targeted the vmPFC indirectly by applying TMS to the functionally connected cortical region and showed that this augmented extinction of responses to conditioned stimuli.

Recent work has used real-time fMRI to deliver feedback on current brain activity to patients. In this work, patients can use the neurofeedback to adapt their emotion regulation techniques to match a predefined target, e.g., either enhancing activity in frontal regulatory regions or attenuating amygdala activity. Although routine clinical application is still not available, promising results have been reported in several diagnostic groups, including spider phobia (Zilverstand et al., 2015) and depression (Young et al., 2018). Even single neurofeedback sessions may have lasting effects on emotion regulation strategies used in everyday life, as shown by MacDuffie et al. (2018).

Tools and approaches from neuroscience are well suited to potentiate CBT, and indeed, many promising avenues are being pursued that may see clinical applications, as outlined in this section. While improving CBT remains important, we must also consider the possibility that CBT will not fit everyone. Thus, it is important to address the question of for whom the treatment will work, and ideally to know this in advance. In other words, we need to develop objective methods to guide the selection of the treatment that best fits a particular individual, i.e., precision psychiatry and psychotherapy. Hence, we now turn to recent advances in machine learning and statistics and how they can inform clinical decision-making in the areas of diagnosis, prognosis, and treatment response for the individual patient (Gabrieli et al., 2015).

Identifying Prognostic Markers of CBT Outcome

Behavioral research on treatment outcome prediction has existed for many years, and we believe it will become even more important in the future. Compared with current biological assessments, behavioral data are superior from a feasibility perspective, e.g., they are easy and inexpensive to collect. However, reports on behavioral treatment outcome predictors are contradictory, and the reported effects are usually small. Forsell et al. (2020) reported on 4310 patients treated for depression, panic disorder, or social anxiety disorder with Internet-delivered CBT and found that non-response to CBT could be predicted with about 70% accuracy. However, this accuracy was only achievable halfway through the treatment (week 6 of 12), and pre-treatment subjective symptom severity ratings did not significantly predict post-treatment outcomes (Forsell et al., 2020). Similarly, Hilbert et al. (2020) predicted CBT outcome in a naturalistic sample of 2147 patients who received face-to-face CBT at a university-based outpatient center. Predictors based on routinely available sociodemographic and clinical data did not exceed 59% prediction accuracy. Thus, there is room for improvement and a need to develop accurate pre-treatment predictors of outcome. In this respect, neuroscience may contribute data at a level more appropriate for predicting treatment outcome. To date, a variety of methods have been used with the aim of predicting treatment outcomes, e.g., therapygenetics and genome-wide association studies (Andersson et al., 2018; see Rayner et al., 2019, for a meta-analysis). As non-invasive brain imaging with structural and functional MRI has become increasingly accessible, the number of studies employing such data to predict treatment outcome is steadily growing.

Brain network configurations in emotion-regulating circuits that are recruited by CBT techniques may act as predictors of treatment response. The vmPFC and anterior cingulate cortex (ACC) are densely connected to the amygdala and have been associated with different forms of emotion regulation, among them fear inhibition. Translational research (Milad & Quirk, 2002) shows that behavioral exposure, a key CBT technique commonly employed in the treatment of anxiety disorders and beyond, is very likely to recruit this brain circuit (see Study Box 1). Accordingly, an emerging body of research shows that CBT response seems to be associated with pre-treatment ACC functionality in disorders along the internalizing spectrum such as panic disorder (Hahn et al., 2015; Lueken et al., 2013), social anxiety disorder (Månsson et al., 2015), post-traumatic stress disorder (Szeszko & Yehuda, 2019), and unipolar depressive disorders (Siegle et al., 2006, 2012). Moreover, in some studies, neuroimaging data have produced more accurate predictions of treatment outcome than demographic and clinical data (Frick et al., 2020; Månsson et al., 2015). See also reviews on treatment outcome prediction in anxiety disorders (Lueken et al., 2016; Shin et al., 2013). Such biomarkers could make it possible to identify patients who are unlikely to respond to CBT and who should therefore be offered another treatment, or behavioral or neuroscience-informed augmentation strategies such as neurofeedback training in addition to standard CBT.

Precision psychotherapy will, however, only become clinically relevant if we can predict the treatment outcome at the level of a single patient. Most predictions to date have been achieved using univariate, group-based methods that neither mirror the multivariate nature of predictors (e.g., inter-dependency of variables) nor yield predictions suitable for individual patients. In contrast, novel methods embedding multivariate data within a machine learning framework make it possible to model the complex, interdependent structure of these data and to train an algorithm to detect a pattern of predictive value (see also Study Box 3). Further, unbiased estimates that generalize to future individuals are needed. Single-patient prediction studies typically use a cross-validation framework in which a subsample of the study participants is used to train a model (training set), which is then applied to a subset of subjects who were held out (test set). However, it should be acknowledged that the majority of neuroprediction studies have used fewer than 50 patients and most often employed biased cross-validation schemes (i.e., leave-one-out cross-validation), and robust prognostic markers are still largely lacking (Poldrack et al., 2019; Varoquaux, 2017).
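The cross-validation logic described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic (not patient data), the nearest-centroid classifier is a deliberately simple stand-in for whatever model a given study uses, and the function names are hypothetical. The point is only that the model is fitted on the training folds and scored on participants it has never seen.

```python
import random
from statistics import mean


def make_synthetic_patients(n, n_features=5, seed=0):
    """Hypothetical data: (features, label) pairs; label 1 = responder."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        # Assumed effect: responders are shifted slightly on every feature.
        features = [rng.gauss(0.8 * label, 1.0) for _ in range(n_features)]
        data.append((features, label))
    rng.shuffle(data)
    return data


def fit_centroids(train):
    """Nearest-centroid 'model': the mean feature vector of each class."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in train if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def k_fold_accuracy(data, k=5):
    """Train on k-1 folds, test on the held-out fold, repeat for each fold."""
    folds = [data[i::k] for i in range(k)]
    accuracies = []
    for i in range(k):
        test = folds[i]
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        model = fit_centroids(train)  # fitted without access to the test fold
        correct = sum(predict(model, x) == y for x, y in test)
        accuracies.append(correct / len(test))
    return accuracies


patients = make_synthetic_patients(100)
fold_accuracies = k_fold_accuracy(patients, k=5)
print("Per-fold accuracy:", [round(a, 2) for a in fold_accuracies])
print("Mean accuracy:", round(mean(fold_accuracies), 2))
```

The spread of the per-fold accuracies gives a rough impression of how unstable such estimates can be at small sample sizes, which is one reason the cited critiques favor larger samples and repeated k-fold schemes.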

Study Box 3 What is machine learning and how may it support clinicians in the future?

Another drawback of most studies to date is that they have used only one treatment, precluding assessments of specificity of findings; i.e., it is unknown if responders would respond to any treatment or specifically to the tested treatment. Establishing predictors of treatment outcome remains important, but clinicians are mainly interested in knowing what treatment will be best suited for the individual patient they have in front of them. To answer this question, a possible next step in neuroprediction is modality prediction, e.g., is this patient more likely to respond to CBT than SSRI or vice versa? In a recent study (Frick et al., 2018), we showed that patients with social anxiety who had high pre-treatment activity in the dorsal ACC were more likely to respond to combined SSRI + CBT than CBT monotherapy, whereas patients with low pre-treatment activity in this brain region were more likely to respond to monotherapy. If these findings are replicated, they may represent an important clinical application of fMRI to inform treatment selection.

Challenges for the Future

While mechanistic studies using neuroimaging techniques usually focus on a high degree of internal validity (thus enhancing the causal strength to infer a given mechanism), the clinical application of neuroscience-based findings crucially calls for a stronger focus on external validity, specifically at the individual patient level. Using biomarkers in routine clinical care requires high sensitivity and specificity. In addition, we need to estimate the degree to which these findings (usually inferred from highly controlled clinical samples with strict inclusion and exclusion criteria) can be generalized to patients with comorbid disorders, varying sociodemographic factors, and diversity in general. From a clinical perspective, what is important is that the performance of the biomarker generalizes to the next patient entering your clinic. Currently, tests of the robustness of findings, including replications and validations in new samples, are the exception rather than the rule. Establishing robustness will require novel study designs, including more heterogeneous samples from multiple sites. Also related to the robustness of findings is the current discussion within the neuroimaging community about the low reliability of many fMRI paradigms (Elliott et al., 2020) as well as the multitude of analysis pipelines employed, which sometimes lead to contradictory results (Botvinik-Nezer et al., 2020). It is important to be aware of the shortcomings of the field but at the same time to recognize that the field is being transformed toward greater emphasis on reproducibility, including code and data sharing, preregistration of studies and analyses, and more robust findings. These measures will improve both mechanistic and treatment-prediction studies of CBT.
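For readers less familiar with the two metrics named above, the following sketch computes sensitivity and specificity from predicted and observed outcomes using their standard definitions. The labels are invented for illustration, and treating "non-responder" as the positive class (coded 1) is an assumption of this example, not a convention taken from the studies cited here.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    sensitivity = TP / (TP + FN): share of true positives that were detected
    specificity = TN / (TN + FP): share of true negatives correctly cleared
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical example: 1 = non-responder (positive class), 0 = responder.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

sens, spec = sensitivity_specificity(observed, predicted)
print(f"Sensitivity: {sens:.2f}")  # 3 of 4 non-responders detected
print(f"Specificity: {spec:.2f}")  # 4 of 6 responders correctly cleared
```

Reporting both values matters because overall accuracy alone can look acceptable even when one of the two groups is poorly identified, particularly when responders and non-responders are unequally frequent.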

Moreover, we believe that innovative sampling strategies will further enhance our understanding of neural factors involved in CBT. For example, measures of symptoms are often collected at numerous timepoints during treatment, but neuroimaging is almost exclusively limited to pre- and post-treatment assessments. Adding scans during the course of treatment would allow for a more mechanistic understanding of the treatment process and of how brain networks change over the course of treatment. This could resolve current debates regarding the order of changes in cognitions, behavior, the brain, and symptoms. There is a trend in the fMRI field of complementing large samples in which individuals are scanned once with smaller samples with multiple and/or longer scans per individual. This would fit well with the single-subject designs (e.g., A-B-A) employed in CBT studies, where assessments are performed, e.g., weekly, and interventions are delivered at random timepoints to tie the change in symptoms to the intervention.
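One way to see why a randomly chosen intervention timepoint helps tie symptom change to the intervention is a randomization test. The sketch below is a simplified illustration under stated assumptions: it uses a two-phase (A-B) design rather than A-B-A, the weekly scores are invented, and the function names and the minimum phase length of three observations are hypothetical choices. It compares the observed baseline-to-treatment difference with the differences that would have resulted from every other start point that could have been drawn.

```python
from statistics import mean


def phase_difference(scores, start):
    """Mean baseline score minus mean treatment-phase score for a start index."""
    return mean(scores[:start]) - mean(scores[start:])


def randomization_test(scores, actual_start, min_phase_len=3):
    """One-sided randomization test for a single-case A-B design.

    The p-value is the share of admissible start points whose phase
    difference is at least as large as the one actually observed.
    """
    # Start points that could have been drawn, leaving at least
    # min_phase_len observations in each phase.
    candidates = range(min_phase_len, len(scores) - min_phase_len + 1)
    if actual_start not in candidates:
        raise ValueError("actual_start is not an admissible start point")
    observed = phase_difference(scores, actual_start)
    as_extreme = sum(phase_difference(scores, s) >= observed for s in candidates)
    return observed, as_extreme / len(candidates)


# Hypothetical weekly symptom scores; the intervention began at index 5.
weekly_scores = [22, 23, 21, 22, 24, 15, 13, 12, 11, 10, 11, 9]
observed_diff, p_value = randomization_test(weekly_scores, actual_start=5)
print(f"Observed reduction: {observed_diff:.2f}, p = {p_value:.3f}")
```

With only seven admissible start points in this example, the smallest attainable p-value is 1/7, which illustrates why such designs are usually run with longer series or combined across several participants.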

Furthermore, neuroscience research so far has predominantly focused on individual factors (e.g., genetics) and how these may alter brain systems. From the perspective of a bio-psycho-social understanding of mental disorders and their treatment, the social (e.g., environmental) component has most often been neglected. Neuroimaging can enhance our understanding not only of bio-psycho but also of bio-social interactions and hence unravel the impact of social/environmental risk and resilience factors on our brains. For example, environmental factors such as living in an urban area appear to act on amygdala reactivity: current city living was associated with increased amygdala activity, whereas urban upbringing affected the anterior cingulate cortex, a key region for regulation of amygdala activity, negative affect, and stress (Lederbogen et al., 2011). Notwithstanding the limitations of this cross-sectional approach, these findings are in line with urban living being a risk factor for common mental disorders such as mood, anxiety, and substance use disorders. Neuroscience may help to better understand the potential mechanisms underlying this association.

Summary

In summary, tools and approaches from neuroscience offer plenty of opportunities to identify mechanisms of therapeutic change induced by CBT, to augment CBT and achieve better treatment outcomes, and to guide clinical decision-making. Here, we have highlighted a few of these opportunities and outlined both challenges and promises within the field. We briefly reviewed the early days of neuroimaging studies on CBT and moved forward to fear conditioning and extinction as an exemplary mechanism for studying pathological forms of fear as well as their amelioration via exposure therapy. We also highlighted that the earlier focus on habituation as a mechanism of exposure therapy has to be revised in light of recent evidence from neuroscience. We mentioned a few intriguing observations on similar effects of CBT and pharmacological interventions and moved forward to research on cognitive enhancers, brain stimulation, and neurofeedback to boost the effect of CBT. We have also acknowledged that there may never be one single treatment that is superior to all others, and thus, we should focus on what treatment works best for whom. Along these lines, we introduced the field of neuroprediction, with studies using novel techniques from machine learning to achieve personalized treatment selection.

We believe there will be a surge of studies using neuroscience approaches to better understand CBT, and that the near future will see clinical implementations that will benefit our patients.