A total of 333 studies were extracted from the online databases and other sources (e.g., bibliographies of all included studies). After the removal of duplicates and ineligible records, 276 records were screened. Based on the evaluation of the titles and abstracts, 237 articles were excluded because they were not relevant; common reasons for exclusion included no breast cancer survivors, no RCT study design, no CBT intervention, no data on FCR, and abstract only. Then, 25 full-text articles were assessed in a secondary screening, of which eight were excluded for the reasons described in Fig. 1. After review and discussion among the authors, 17 articles were finally selected for the systematic review (see Fig. 1 for the PRISMA flow diagram of the literature review).
General characteristics of the interventions
A summary of the general characteristics of the study participants, including the sample size, age, sex, ethnicity, cancer type, cancer stage, and time since diagnosis, is provided in Table 1. Most of the included studies were conducted in the U.S. (n = 8), the Netherlands (n = 2), or Germany (n = 2); the other countries represented were Spain, Belgium, Canada, Australia, and Japan. The total sample size across the studies was 2288 participants, with the sample sizes of individual studies ranging from 24 to 322. The participants were predominantly non-Hispanic White females, with an average age of 53.1 years. Most studies focused only on breast cancer survivors (n = 12), while some included mixed cancer populations (n = 5). Most of the studies recruited cancer survivors diagnosed at stages 0 to 3 or stages 1 to 3, but two studies included people with stage 4 cancer. The mean time since diagnosis for the experimental and control groups was 4.48 and 4.49 years, respectively, which indicated that the study participants had completed their treatment rather recently.
Content and methodological strategies of the interventions
First, the included interventions utilized a wide range of CBT techniques with some variations, such as mindfulness awareness practices (MAPS), acceptance and commitment therapy (ACT) [26, 38], cognitively based compassion training (CBCT) [39, 40], mindfulness-based stress reduction (MBSR) [27, 41], mindfulness-based cognitive therapy (MBCT), attention and interpretation modification, cognitive-existential psychotherapy, blended CBT, and CBT-based online self-help training.
More than half of the studies used group-based interventions, while six studies adopted an individual format. One of the group intervention studies targeted breast or gynecological cancer survivors and their partners. Most interventions were delivered face-to-face, and one study combined in-person delivery with an online method. A few studies used either telephone communication [47, 48] or online communication. The frequency and duration of the interventions varied greatly, from a single 20- to 45-min session to 15 weekly 2-h sessions. The most common intervention duration was 6 or 8 weeks (n = 9). The CBT interventionists included health care professionals such as psychotherapists, psychologists, or nurses. The threat of selection bias was low because all the studies utilized a pretest-posttest control group design with random assignment. Most studies included an active comparison group and/or a control group with usual care during the intervention period. In particular, Herschbach et al. compared a CBT group with a comparison group (supportive-experiential group) and a control group, and Butow et al. compared a CBT group with a comparison group that received a relaxation training program. Johns et al. included both a comparison group (e.g., survivorship education) and a control group and compared these groups with a CBT group. Six studies [27, 37, 39, 41, 42, 44] compared a CBT group with a waitlisted control group. Last, most studies had limited external validity (e.g., small and highly homogeneous samples, and similar settings and regions in the U.S. or European countries). However, Germino et al.’s study targeted both non-Hispanic Whites and African Americans, and Butow et al. recruited samples from 17 oncology centers in Australia.
Five different instruments were used to measure FCR in the selected studies: the Cancer Worry Scale (CWS), the Concerns about Recurrence Scale (CARS), the Fear of Cancer Recurrence Inventory (FCRI), the short form of the Fear of Progression Questionnaire (FoP-Q-SF), and the Quality of Life in Adult Cancer Survivors (QLACS)-FCR subscale. Among these instruments, the FCRI (42 items) and the CARS (30 items) were the most frequently used in the included studies. Eight studies [24, 26, 38, 39, 40, 44, 45, 49] used either the FCRI total scale or some of the seven FCRI subscales (i.e., triggers [8 items], severity [9 items], psychological distress [4 items], coping strategies [9 items], functioning impairments [6 items], insight [3 items], and reassurance [3 items]). Six studies [27, 41, 42, 43, 47, 48] used the 30-item CARS, which is composed of two measures: overall fear and a range of problems. Van de Wal et al. used both the 8-item CWS and the FCRI, and other studies measured FCR by using either the 12-item FoP-Q-SF [23, 46] or the 4-item QLACS-FCR subscale. Table 3 provides a summary of the instruments used to assess FCR.
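The item counts reported for the FCRI subscales can be checked against the 42-item total with a few lines of Python. This is only an illustrative sketch: the item counts are taken from the text above, while the names `INSTRUMENT_ITEMS`, `FCRI_SUBSCALE_ITEMS`, and `total_items` are hypothetical and do not come from any of the included studies.

```python
# Item counts per instrument, as stated in the text.
INSTRUMENT_ITEMS = {
    "CWS": 8,
    "CARS": 30,
    "FCRI": 42,
    "FoP-Q-SF": 12,
    "QLACS-FCR": 4,
}

# Item counts for the seven FCRI subscales, as stated in the text.
FCRI_SUBSCALE_ITEMS = {
    "triggers": 8,
    "severity": 9,
    "psychological distress": 4,
    "coping strategies": 9,
    "functioning impairments": 6,
    "insight": 3,
    "reassurance": 3,
}


def total_items(subscales: dict[str, int]) -> int:
    """Return the total number of items across all subscales."""
    return sum(subscales.values())


if __name__ == "__main__":
    total = total_items(FCRI_SUBSCALE_ITEMS)
    # The subscale counts should add up to the FCRI total scale length.
    print(f"FCRI subscale items: {total}; FCRI total scale: {INSTRUMENT_ITEMS['FCRI']}")
```

The seven subscale counts add up to the 42 items of the FCRI total scale, so the figures reported in the text are internally consistent.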
The FCR-related outcomes of all studies are listed in Table 4. FCR was included as an outcome variable in most studies except Lengacher et al.’s study, which treated FCR as a mediator. Three studies [24, 41, 49] assessed FCR only at baseline or pretest (T0) and posttest (T1), and the other studies assessed FCR scores at baseline or pretest, posttest, and one or two follow-up periods. Most studies showed significant reductions in the FCR scores of the intervention groups at the postintervention and follow-up time points, but some results were not statistically significant. In particular, three studies [45, 47, 48] reported no significant between-group differences in FCR across time. The common aspects of these studies were a telephone or online format and brief sessions of less than 1 h. Seven studies [23, 26, 27, 41, 42, 44, 46] reported both significant main effects in the intervention groups and significant group-by-time interaction effects on all FCR scores over time. In general, these interventions consisted of four to eight 60- to 120-min face-to-face group sessions delivered over at least 4 weeks, and the interventionists were trained health care professionals such as psychotherapists and psychiatrists. In addition, seven studies using either the FCRI or the CARS subscales [24, 38, 39, 40, 43, 47, 49] showed partially significant improvements in a couple of subscales among the intervention groups. Interestingly, two studies [23, 38] were three-arm RCTs that compared CBT with a comparison group (e.g., supportive-experiential group therapy, survivor education) and a control group; they found that supportive-experiential group therapy was comparable to the CBT intervention, while survivor education produced minimal reductions in FCR over time.
Table 5 and Additional file 1 summarize the results of the study quality assessment. The reporting quality of the included RCTs was assessed using the CONSORT 2010 checklist. Four items of the checklist were excluded from the analyses because they were not applicable. Not all of the studies complied with the CONSORT 2010 checklist. The average percentage of articles that reported each applicable item on the checklist was 77.3%.
Among the six sections of the checklist, the included RCT studies had the highest average reporting percentage for the items related to the introduction (100%), followed by those related to the title and abstract (91.2%), the discussion (70.6%), the results (69.9%), other information (66.7%), and the methods (65.6%). Ten of the 33 applicable items on the checklist were reported by 100% of the included studies, while ten items were reported by less than 60% of the RCT studies. Specifically, very few studies reported important changes to methods after trial commencement with reasons (5.9%), an explanation of any interim analyses and stopping guidelines (5.9%), or generalizability (11.8%). None of the studies reported important harms or unintended effects (0%).
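The item-level percentages are consistent with simple proportions of the 17 included studies. The following minimal Python sketch illustrates that arithmetic; the function name `reporting_percentage` is hypothetical, and the counts of one and two studies are assumptions inferred from the reported percentages rather than values taken from the study data.

```python
N_STUDIES = 17  # number of RCTs included in the review (from the text)


def reporting_percentage(n_reporting: int, n_studies: int = N_STUDIES) -> float:
    """Percentage of studies reporting a checklist item, rounded to one decimal."""
    if not 0 <= n_reporting <= n_studies:
        raise ValueError("n_reporting must be between 0 and n_studies")
    return round(100 * n_reporting / n_studies, 1)


if __name__ == "__main__":
    # Assumed counts (inferred, not reported directly in the text).
    for n in (0, 1, 2, N_STUDIES):
        print(f"{n} of {N_STUDIES} studies: {reporting_percentage(n)}%")
```

Under these assumptions, an item reported by a single study yields 5.9% and an item reported by two studies yields 11.8%, which matches the figures given for changes to methods, interim analyses, and generalizability.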
In general, the studies of face-to-face CBTs with an intervention duration of at least 1 month were more likely to provide detailed descriptions according to the checklist and to comply with the CONSORT 2010 guidelines. In contrast, the studies of telephone-based CBTs with a single session or a few sessions did not provide sufficient information regarding the CONSORT 2010 items, and they especially underreported issues such as trial design, the allocation concealment mechanism, blinding, recruitment, registration, and protocol.