Background

Health behaviour change interventions are typically complex and often consist of multiple, interacting components [1]. This complexity is magnified by the fact that these interventions are often context-dependent, delivered across multiple settings, by multidisciplinary healthcare professionals, to a range of intervention recipients [2–4]. As a result, ensuring consistency in the implementation of behaviour change interventions is challenging [5]. Despite this, less attention is given to the implementation of behaviour change interventions than to their design and outcome evaluation [6–8].

Intervention fidelity is defined as the ‘ongoing assessment, monitoring, and enhancement of the reliability and internal validity of an intervention or treatment’ [9, 10]. Monitoring intervention fidelity is integral to accurately interpreting intervention outcomes, increasing scientific confidence and furthering understanding of the relationships between intervention components, processes and outcomes [6–10]. For example, if an intervention is found to be ineffective, this may be attributable to inadequate or inconsistent fidelity of delivery by the intervention deliverer, rather than to the intervention components or design [10]. Potentially effective interventions may thus be discarded when inadequate implementation is in fact responsible (described by some as a ‘Type III error’) [11]. Moreover, assessing fidelity can support the wider implementation of interventions in clinical practice by identifying aspects of intervention delivery that require improvement, and intervention deliverer training needs that may form the basis of quality improvement efforts [3]. The importance of assessing intervention fidelity has been emphasised in the recently developed UK Medical Research Council Guidance for conducting process evaluations of complex interventions [12].

Several conceptual models of fidelity have been proposed, and there is no consensus on how best to divide the study of implementation into key components [13]. Proposed models differ in the number and nature of the components argued to represent fidelity. In an attempt to synthesise and unify existing conceptual models of fidelity, a Treatment Fidelity Workgroup of the National Institutes of Health (NIH) Behaviour Change Consortium (BCC) proposed a comprehensive framework comprising five components of intervention fidelity: design, training, delivery, receipt and enactment [9] (see Bellg et al. (2004) [9] and Borrelli et al. (2005) [10] for full definitions of these components). This framework has since guided a considerable amount of health research [14–17].

The current review examines the methods used to address receipt in health interventions. Patients are now more commonly regarded as active participants in healthcare than as passive recipients [18], particularly with the advent of self-management support in chronic conditions [19]. This active role requires that they engage fully with the intervention, understand it, and acquire intervention-related skills, so that they may subsequently apply these skills in their day-to-day lives (i.e. enactment). As such, receipt is the first recipient-related condition that needs to be fulfilled for the outcomes of an intervention to be influenced as intended, and enactment depends on this condition being fulfilled.

According to the original BCC framework papers [9, 10, 20], a study that addresses receipt includes one or more strategies to enhance and/or assess participants’ understanding of the intervention and/or their performance of intervention-related skills. The 2011 update [20] added the consideration of multicultural factors in the development and delivery of the intervention as a strategy to enhance receipt. Receipt is also defined as the accuracy of participants’ understanding in Lichstein et al.’s (1994) [21] framework, and as ‘the extent to which participants actively engage with, interact with, are receptive to, and/or use materials or recommended resources’ in the frameworks of Linnan and Steckler (2002) [22] and Saunders et al. (2005) [23]. In addition, Saunders et al. (2005) [23] suggest receipt may also refer to participants’ satisfaction with the intervention and the interactions involved. The role of receipt, or dose received, in these other fidelity, process evaluation, or implementation frameworks further supports its importance in health research.

Despite this recognised importance, however, systematic reviews to date indicate that receipt has received little research attention. Borrelli et al. [10] first examined the extent to which the BCC recommendations to address receipt were followed in health behaviour change research published between 1990 and 2000. Assessments of participants’ understanding and of performance of skill were found in 40% and 50% of papers, respectively. Strategies to enhance these were found in 52% and 53% of papers, respectively. In subsequent reviews [14–17] the proportion of papers addressing receipt varied between 0% and 79% (see Table 1). In general, strategies to enhance receipt have been included in studies more often than assessments of receipt (see Table 1).

Table 1 Proportion (%) of papers from past systematic reviews addressing receipt as defined in the BCC framework

There are limitations to the reviews described above. First, they examined fidelity in relation to specific clinical contexts. There is therefore a need to examine the extent to which receipt has been addressed in the wider health intervention research, a little more than a decade after the publication of the original BCC fidelity framework in 2004 [9]. A second limitation, which also applies to Borrelli et al.’s review [10], is that limited attention is given to describing the methods used to address receipt. Comparability and coherence in the methods used across studies are advantageous, however, particularly for the effective interpretation and use of systematic reviews in decision-making [13]. Providing a synthesis of the fidelity methods used so far would be valuable in guiding future work.

This systematic review was designed to address these limitations. It aimed to describe 1) the frequency with which receipt, as defined in the BCC framework, has been addressed in health intervention studies reporting on fidelity and published since 2004, and 2) the methods used to address receipt. Since receipt is a component of fidelity frameworks other than the BCC framework, and because it can be reported on in papers without reference to a specific framework, the second aim of this review was broader in scope and examined methods used to address receipt irrespective of whether, or which, guiding framework was used.

Methods

Search strategies

Two electronic searches were used to address the aims of this review. First, to determine the frequency with which receipt, as defined in the BCC framework, has been addressed in health intervention studies since 2004, a forward citation search was conducted using the two seminal BCC framework papers [9, 10]. It was applied to Web of Science and Google Scholar and covered the 2004–2014 period. Results of the second search described below were not used to address this aim, as the focus in search terms on receipt would have introduced bias towards papers reporting on this fidelity component.

Second, to identify methods used to assess receipt in the wider literature (i.e. without focus on the framework(s) used), results from the forward citation search described above were combined with those of a second search performed in five electronic databases (CINAHL, Embase, PsycINFO, Medline, and Allied and Complementary Medicine) using four groups of terms. These comprised synonyms of: i) fidelity, ii) intervention, iii) receipt, and iv) health (see Table 2 for a complete list of search terms). Within each group, synonyms were combined using the OR operator, and the groups were combined using the AND operator. Terms for receipt and health were searched in all fields (e.g. title, abstract, main body of article), whereas terms for fidelity and intervention were restricted to titles and abstracts, so as to increase the specificity of the search and identify studies whose main focus was to report on intervention fidelity.

Table 2 Search terms
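To illustrate the search logic just described, the sketch below assembles a query of the stated form. It is a minimal sketch only: the synonym lists are hypothetical placeholders (the actual terms are those listed in Table 2), and the field-restriction suffixes loosely follow Ovid-style syntax rather than the exact syntax of any one of the five databases.

```python
# Minimal sketch of the Boolean search structure described above.
# Synonym lists are hypothetical placeholders; see Table 2 for actual terms.
groups = {
    "fidelity": ["fidelity", "treatment integrity"],
    "intervention": ["intervention", "programme"],
    "receipt": ["receipt", "dose received", "understanding"],
    "health": ["health", "patient", "clinical"],
}

# Per the Methods: fidelity/intervention terms restricted to title and
# abstract ("ti,ab"); receipt/health terms searched in all fields ("af").
fields = {"fidelity": "ti,ab", "intervention": "ti,ab",
          "receipt": "af", "health": "af"}

def build_query(groups, fields):
    """OR terms within each group, then AND the four groups together."""
    clauses = []
    for name, terms in groups.items():
        or_clause = " OR ".join(f'"{t}"' for t in terms)
        clauses.append(f'({or_clause}).{fields[name]}.')
    return " AND ".join(clauses)

print(build_query(groups, fields))
# -> ("fidelity" OR "treatment integrity").ti,ab. AND ... AND ("health" OR ...).af.
```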

Paper selection

Papers published in English since 2004 and reporting data on receipt of a health intervention were included in this review. A full list of inclusion and exclusion criteria, applicable to results from both searches conducted, is presented in Table 3. These were applied first at the title and abstract level, and then at the full-text level. They were piloted by the research team on 80 papers, with inter-rater agreement of κ = 0.82 (Cohen’s kappa [24]). They were refined as appropriate and verified on a further 40 papers. Discrepancies in screening outcomes were discussed until agreement was reached.
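For reference, Cohen’s kappa corrects the raw proportion of agreement between screeners for agreement expected by chance. A minimal statement of the statistic follows, with a purely hypothetical worked illustration (the underlying agreement proportions for the 80 pilot papers are not reported):

```latex
\kappa = \frac{p_o - p_e}{1 - p_e}
% Hypothetical illustration: observed agreement p_o = 0.91 and
% chance-expected agreement p_e = 0.50 would give
% \kappa = (0.91 - 0.50) / (1 - 0.50) = 0.82, the value reported above.
```

where p_o is the observed proportion of agreement and p_e the proportion of agreement expected by chance.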

Table 3 Inclusion and exclusion criteria

Data extraction

A standardised data extraction form was developed and used to extract data in relation to: i) Study aims, ii) Study design, iii) Recipients/participants, iv) Intervention description, v) Information on receipt (guiding fidelity framework, assessment methods, enhancement strategies, etc.), and vi) Data collection details (e.g. timing of measurement(s), sample involved, reliability/validity, etc.). Data were extracted by one researcher and subsequently verified by a second researcher. A third reviewer was involved in instances of disagreement, and these were resolved through discussion.

Analysis and synthesis

All reviewed papers were examined to investigate how receipt was addressed. This investigation first focused on whether receipt as defined in the BCC framework had been addressed (assessments or strategies to enhance participants’ understanding and performance of skill, and consideration of multicultural factors) and then on any other method reported to assess receipt.

A narrative synthesis of the studies reviewed was performed. The proportion of papers citing the BCC framework and addressing receipt as defined in this framework is presented first, followed by the frequency with which different methods were used to address receipt in the wider literature.

Results

A PRISMA flow diagram is presented in Fig. 1. Of the 629 papers identified in the forward citation search, 555 were screened following duplicate removal. Thirty-three of these were found to fit the eligibility criteria for this review and were used to address the first aim of this review.

Fig. 1

PRISMA Diagram. *The 168 papers reporting data on any type of fidelity from the forward citation search (left-hand-side flow) can be calculated as the sum 52 + 83 + 33. Search strategies were conducted consecutively; duplicates removed from the electronic database search results therefore included papers that had already been identified in the forward citation search

Of the 2345 papers identified in the electronic database search, 2282 were screened following duplicate removal. Twenty-two of these papers were selected for inclusion in the review. Combined with the forward citation search results, this resulted in a total of 55 papers being used to address the second aim of this review.

A summary of basic study characteristics (study designs, intervention deliverers and recipients, level and mode of delivery) is presented in Table 4 (detailed information on study characteristics available in Additional file 1).

Table 4 Summary of characteristics of included studies (n = 55)

The fidelity research reported was embedded in RCT or cluster RCT designs in most cases (28 studies, 50.9%), but pilot/feasibility designs were also common (15 studies, 27.3%). All interventions included multiple components. The most common components were education or information provision, in 19 studies (34.5%) [25–43], and behavioural skills rehearsal or acquisition, in 8 studies (14.5%) [25, 26, 30, 38–40, 44, 45]. The largest group of intervention recipients (17 studies, 30.9%) was people with health conditions, including adults, women and children [33, 34, 43, 44, 46–58]. It was unclear who intervention deliverers were in 12 studies (21.8%) [26, 39, 46, 50, 51, 55, 59–64], but in studies where this information was identifiable, deliverers were most frequently nurses (10 studies, 18.2%) [33, 35–37, 40, 47, 52, 65–67]. With regard to level and mode of delivery, interventions were most frequently delivered at the individual (25 studies, 45.5%) [27–29, 33, 34, 40, 41, 45, 46, 48, 50–52, 54, 56, 60, 63, 65, 66, 68–73] and group level (19 studies, 34.5%) [26, 31, 32, 38, 39, 42, 43, 49, 53, 55, 58, 61, 62, 64, 67, 74–77]. Face-to-face delivery was the most common mode (28 studies, 50.9%) [27, 29, 31, 32, 35–38, 41–45, 49, 50, 56, 58, 60–62, 66–68, 74–78].

Papers citing the BCC framework and addressing fidelity of receipt as per BCC definition

Of the 629 forward citation search results, 168 papers reported on fidelity of a health intervention (see the notes under Fig. 1 to locate these in the PRISMA diagram), 33 (19.6%) of which addressed receipt (studies 1–33 in Table 5). Although all 33 papers cited the BCC framework, 5 (15.2%) were not worded in a way that suggested this framework had informed the fidelity or process evaluation reported [28, 39, 66, 67, 77].

Table 5 Methods of assessment and enhancement of fidelity of receipt

Twenty-five (75.8%) of these 33 studies addressed receipt in one or more ways consistent with the definitions proposed in the BCC framework. An assessment of participants’ understanding was included in 20 (60.6%) studies [25, 29, 31, 33–37, 39, 45, 47, 48, 50, 57, 61, 65, 67, 73, 75, 78] and an assessment of participants’ performance of intervention-related skills in 14 (42.4%) studies [33–36, 45, 47, 48, 51, 54, 56, 57, 65, 75, 78]. With regard to strategies to enhance receipt, 4 (12.1%) studies reported using a strategy to enhance participants’ understanding [41, 48, 56, 57], and 7 (21.2%) a strategy to enhance performance of intervention-related skills [39, 41, 44, 47, 48, 56, 57]. Four (12.1%) studies reported having considered multicultural factors in the design or delivery of the intervention [25, 29, 31, 64].

Methods used to assess receipt

To address the second aim of this review, eligible studies identified through both electronic searches (55 studies) were examined. Information on the methods used to assess receipt in these studies is displayed in Table 5 (further details can be found in Additional file 2).

Frameworks used

As a consequence of the forward citation search’s focus on the BCC framework, this was the framework used to inform planning and/or evaluation in the majority of studies (28 studies, 50.9%); none of the studies included from the electronic database search reported using the BCC framework. Other frameworks that informed the studies reviewed included Linnan and Steckler’s (2002) process evaluation framework [22] in 11 (20.0%) studies [27, 46, 52, 53, 55, 60, 66, 68, 69, 71, 74], Lichstein et al.’s Treatment Implementation Model (TIM) [21] in 4 (7.3%) studies [28, 39, 40, 67], Saunders et al.’s framework [23] in 5 (9.1%) studies [26, 30, 46, 49, 59], the Reach, Efficacy, Adoption, Implementation, and Maintenance (RE-AIM) framework [79] in 2 (3.6%) studies [46, 70], Dane and Schneider’s framework [80] in 2 (3.6%) studies [38, 76], Dusenbury et al.’s framework [81, 82] in 2 (3.6%) studies [38, 62], and Baranowski et al.’s framework [83] in 1 (1.8%) study [52]. A brief description of how receipt is defined in each of these frameworks is available in the notes below the Table in Additional file 2. More than one of the above frameworks informed the study in 2 (3.6%) of the 55 reviewed studies [46, 52], with a maximum of 3 frameworks being used, none of them the BCC framework. In 4 studies (7.3%), there was no suggestion that a framework had been considered [32, 72, 77, 84].

Operationalisations of receipt

Given the focus of the forward citation search on the BCC framework, the two most common ways of assessing receipt in the 55 studies reviewed were measurements of understanding, included in 26 (47.3%) studies [25, 29–31, 33–37, 39, 40, 45, 47–50, 57, 60–62, 65, 67, 70, 73, 75, 78], and of performance of skills, included in 16 (29.1%) studies [33–36, 45, 47, 48, 51, 54, 56, 57, 65, 70, 71, 75, 78].

Receipt was also operationalised in relation to intervention content (e.g. intervention components received or completed, problem areas discussed, advice given) in 9 (16.4%) studies [28, 32, 44, 60, 61, 67–70], satisfaction in 8 (14.5%) studies [27, 41, 49, 52, 55, 59, 65, 66], engagement (level of participation, involvement, enjoyment, or communication) in 8 (14.5%) studies [30, 39, 52, 55, 57, 66, 73, 76], attendance in 8 (14.5%) studies [31, 43, 56, 58, 64, 73, 74, 76], acceptability in 6 (10.9%) studies [26, 42, 48, 49, 63, 75], use of materials (e.g. website use, homework completed) in 4 (7.3%) studies [28, 46, 47, 51], behavioural change and/or maintenance in 4 (7.3%) studies [25, 54, 67, 71], receptivity or responsiveness in 3 (5.5%) studies [38, 62, 77], receipt of intervention materials in 3 (5.5%) studies [39, 59, 84], intention to implement learnings from the intervention in 2 (3.6%) studies [52, 60], telephone contacts during intervention delivery in 2 (3.6%) studies [48, 64], reaction to the intervention or feedback on the programme in 2 (3.6%) studies [32, 39], self-efficacy or confidence in 2 (3.6%) studies [30, 61], exposure (e.g. awareness of the intervention) in 2 (3.6%) studies [59, 71], and use of skills learnt in 2 (3.6%) studies [45, 74]. Operationalisations of receipt used in only 1 study (1.8%) each were attitude in relation to the intervention topic [61], perceived effects of exposure [36], treatment received with respect [70], feasibility [26], adherence to commitments made [52], adequacy of communication methods used [40], and availability of hardware to use intervention materials [48].

Studies using the same framework operationalised receipt in many ways, some of which were not consistent with the conceptualisation of receipt proposed in the respective framework. One example is the 11 studies using the Linnan and Steckler framework [22], in which dose received is defined as ‘the extent to which participants actively engage with, interact with, are receptive to, and/or use materials or recommended resources’. These studies included measures of engagement in 4 studies [52, 53, 55, 66], of exposure to or use of intervention materials in 3 studies [46, 71, 74], of behaviour change following the intervention in 1 study [71], and of intention to implement the intervention in 2 studies [52, 60]. Other measures were used that were less consistent with the framework’s definition of receipt. These included measures of satisfaction in 4 studies [27, 52, 55, 66], intervention content in 3 studies [60, 68, 69], attendance in 1 study [74], and adherence to commitments made in 1 study [52].

A second example is the 4 studies using Lichstein et al.’s [21] framework, in which receipt is defined as the accuracy of participants’ understanding of the intervention. These studies included measures of receipt that related to intervention content (problem areas discussed [28], accuracy of recall of intervention content [67]), contacts [28], participants’ receipt of intervention materials [39], level of participation [39], feedback on the intervention [39], and adequacy of communication methods used [40]. The same applies to studies using other frameworks (see the frameworks and measures used in Additional file 2).

Assessments of receipt

Five (9.1%) studies included only an objective assessment of receipt [43, 44, 46, 58, 76], whilst 7 (12.7%) combined this with a subjective assessment [31, 38, 48, 51, 56, 64, 73]. The majority of studies (43 studies, 78.2%) included only a subjective assessment of receipt (i.e. collected on intervention deliverers or recipients) [25–30, 32–37, 39–42, 45, 47, 49, 50, 52–55, 57, 59–63, 65–72, 74, 75, 77, 78, 84].

Objective assessments

In the 12 (21.8%) studies that included an objective assessment of receipt [31, 34, 38, 43, 44, 46, 48, 51, 58, 64, 73, 76], this was measured using the number of participants reached during the intervention and the number of participants needing to borrow hardware to use intervention materials in 1 study [48], website monitoring of module or chapter completion in 2 studies [46, 51], website logins in 1 study [46], records from intervention sessions in 1 study [44], or attendance logs in 8 studies [31, 34, 38, 43, 58, 64, 73, 76].

Subjective assessments

In total, 50 (90.9%) of the 55 studies included a subjective assessment, 21 (42.0%) of which used qualitative methods [25, 28, 32, 33, 36, 39, 40, 42, 45, 47, 50, 52–54, 57, 63, 66, 67, 69, 73, 75] and 38 (76.0%) of which used quantitative methods [26, 27, 29–32, 34, 35, 37–42, 45, 48, 49, 51, 52, 55–57, 59–62, 64–66, 68, 70–72, 74, 75, 77, 78, 84].

Fourteen (28.0%) of the 50 studies included a subjective assessment collected on the intervention deliverer [26, 28, 30, 33, 34, 53, 56, 57, 62, 64, 69, 73, 78, 84], 25 (50.0%) on the intervention recipient [27, 29, 31, 32, 35–38, 40–42, 45, 47, 51, 54, 59–61, 63, 65, 70–72, 74, 77], and 11 (22.0%) on both of these [25, 39, 48–50, 52, 55, 66–68, 75].

Assessments collected on intervention deliverers

Twenty-five (45.5%) of the 55 studies that included a measurement of receipt collected these data on the intervention deliverer. Although collected on intervention deliverers, these assessments were generally about intervention participants. Equal numbers of these assessments involved the collection of qualitative data (14 studies, 25.5%) and quantitative data (14 studies, 25.5%). The qualitative data consisted of individual interviews, focus groups or reports in 4 studies [50, 52, 67, 69], field notes and comments in 3 studies [39, 53, 66], audio or videotapes of intervention sessions in 3 studies [66, 73, 75], participant observations in 2 studies [33, 48], documentation in participants’ care plans in 1 study [25], records of contacts kept during the intervention in 1 study [28], and active questioning of participants in 1 study [57]. Quantitative data were collected via self-report questionnaires, surveys or checklists in 8 studies [26, 30, 49, 52, 55, 62, 68, 84], checklists or ratings completed during or following participant observations in 5 studies [34, 53, 56, 57, 78], and the number and length of phone contacts with participants in 1 study [64].

Assessments collected on intervention recipients

In total, 36 (65.5%) studies included a measure of receipt taken on intervention participants. Thirteen (23.6%) studies included an assessment of receipt performed using qualitative methods. These included interviews in 4 studies [40, 50, 63, 67], focus groups in 3 studies [32, 36, 75], reports in 2 studies [25, 67], audio recordings in 2 studies [45, 54], verbal confirmation of participants’ understanding in 1 study [25], confirmation of receipt of information on intervention requirements in 1 study [39], data on meeting discussions in 1 study [42], daily journals in 1 study [45], and review of participants’ skills and understanding through demonstrations and practice in 1 study [47]. Quantitative data were collected in just over half of the studies (29 studies, 52.7%) via questionnaires/surveys [27, 29, 31, 32, 35, 37–42, 45, 48, 49, 51, 52, 55, 59–61, 65, 66, 68, 70–72, 74, 75, 77].

Validity and reliability of subjective assessments

In only 13 (26.0%) of the 50 studies that included a subjective assessment was some consideration given to the reliability or validity of the methods used to assess receipt [26, 29, 37, 42, 45, 48, 53, 54, 61, 63, 65, 69, 75].

Such considerations were reported in relation to quantitative methods (surveys, questionnaires, or checklists) in 10 (26.3%) of the 38 studies making use of these [26, 29, 37, 42, 45, 48, 53, 61, 65, 75]. They included reporting Cronbach’s alpha, or justifying its absence [45, 48, 53, 65], providing information on psychometric properties [29, 37, 75], reporting on construct/content validity [42, 61], or reporting on blinding [26].
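For reference, Cronbach’s alpha, the internal-consistency statistic mentioned above, is computed for a scale of k items as shown below; values of about 0.70 or above are conventionally taken to indicate acceptable internal consistency.

```latex
% sigma_i^2: variance of item i; sigma_t^2: variance of the total scale score
\alpha = \frac{k}{k - 1}\left(1 - \frac{\sum_{i=1}^{k} \sigma_i^2}{\sigma_t^2}\right)
```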

Such considerations were reported in relation to qualitative methods in 4 (19.0%) of the 21 studies using these [45, 54, 63, 69]. Data were coded by more than one person [54, 63], the coder was blinded to group allocation [45], or the score attributed to each participant on the basis of the qualitative data collected was calculated independently by 2 researchers, with the kappa coefficient for their agreement reported [69].

Sample selection for receipt assessment

The majority of the 55 studies reviewed (38 studies, 69.1%) [25–30, 33, 35, 36, 38–47, 49, 51, 52, 55–62, 64, 67, 68, 72, 74, 76–78] collected receipt data on all (100%) intervention deliverers or participants. There were 4 (7.3%) studies in which the proportion of the sample on which data were collected varied by assessment measure, one of these proportions being less than 100% [48, 50, 73, 75]. For the 15 (27.3%) studies in which receipt was assessed on less than 100% of the sample, the selection of the subsample was related to missing data or participant withdrawal in 4 studies [63, 65, 66, 70], invitations issued (no further details provided) [50], purposive sampling [54], random selection [56, 73], convenience sampling [53], specific eligibility criteria defined to select the cluster to assess [32], a representative sampling method [69], one in every 5 participants being assessed [71], only one of the intervention groups being assessed [48], or a subset of people randomly selected from one of the clusters assessed [84]. In one study this information was unclear [75].

Timing of receipt assessments

In 23 (41.8%) of the 55 studies reviewed, the assessment(s) of receipt were conducted during the intervention period (e.g. during/after each intervention session) [25, 27, 28, 30, 33, 34, 43, 44, 46, 47, 50, 54–59, 62, 64, 68, 73, 76, 78]. A slightly lower number of studies (15 studies, 27.3%) included an assessment of receipt performed after the intervention [26, 29, 32, 36, 38, 40, 41, 60, 63, 69–72, 74, 77]. Others (14 studies, 25.5%) included assessments of receipt taken at different time points: 4 (7.3%) studies included pre- and post-intervention assessments [31, 35, 37, 61], one of which combined these with an assessment during the intervention [31]. Nine (16.4%) studies included assessments taken both during and after the intervention [39, 42, 45, 48, 49, 52, 66, 67, 75]. Another, less frequent, combination consisted of assessments taken before as well as during the intervention, found in 1 study [51]. In 2 (3.6%) studies the timing of the receipt assessments was unclear [65, 84].

Assessments of receipt such as those based on attendance logs, documentation in care plans, field notes, comments, meeting data, recordings, daily journals, observations, records of contacts, demonstrations of skills or completion of practice logs, logins/website monitoring, were generally collected during the intervention period.

Assessments of receipt collected after the intervention were generally those that required participants’ exposure to the intervention, for example measures of satisfaction, acceptability, feasibility, recall of intervention content, feedback forms, use of or receptivity to intervention materials/skills, and interviews/focus groups on intervention content or experiences of using the intervention. Assessments based on pre- and post-intervention measurements were used to examine effects of the intervention on variables such as knowledge or self-efficacy.

Discussion

The first aim of this review was to identify the frequency with which receipt, as defined in the BCC framework, is addressed in health intervention research. Only 19.6% of the studies identified in the forward citation search as reporting on fidelity were found to address receipt, compared with 33% in a recent review on clinical supervision [85]. Amongst the studies identified, 60.6% assessed receipt in relation to understanding (compared with 0–69% in other reviews [10, 14–17]) and 42.4% in relation to performance of skill (39–65% in other reviews [10, 14–17]). Strategies to enhance understanding were present in only 12.1% of studies (0–79% in other reviews [10, 14–17]) and strategies to enhance performance of skill in 21.2% (50–69% in other reviews [10, 14–17]). These results suggest that there has been little improvement over time in the frequency with which receipt is addressed in health intervention research, and that there is a need to continue to advocate for better quality evaluations that focus and report on this fidelity component. These results were further supported by our examination of the wider literature (i.e. not only BCC-related studies), in which understanding was found to be assessed in 47.3% of the 55 studies reviewed and performance of skill in 29.1%. As suggested by Prowse and colleagues [86], integrating fidelity components into the list of recommended information in reporting guidelines may help increase the proportion of studies addressing and reporting on receipt. Some reporting guidelines have encouraged reporting on fidelity of receipt (e.g. the Template for Intervention Description and Replication checklist [87]) but others have not. The Consolidated Standards of Reporting Trials (CONSORT) checklist for RCTs [88], for example, emphasises the importance of external validity with regard to generalisability, but does not include the importance of reporting on fidelity. Similarly, the CONSORT extension for non-pharmacological trials [89] does underline the importance of reporting implementation details, but the emphasis is on intervention delivery and not on fidelity of receipt. Consistency across reporting guidelines would help ensure receipt is addressed and reported more consistently.

The proportions reported in our findings are considerably lower than those found in other reviews (see Table 1) that examined receipt using the BCC framework as a guide, particularly with regard to strategies to enhance receipt. Possible explanations may relate to differences in the methods used to conduct these systematic reviews. Previous reviews have excluded papers on the basis of study design. Preyde et al. [17], for example, focused only on RCTs and quasi-experimental designs, whilst Garbacz et al. [14] required the presence of a comparison or control group. Similarly, McArthur et al. [16] included only RCTs and controlled studies. In contrast, our review was inclusive of all study designs, and a considerable proportion of included studies (27.3%) were, for example, pilot or feasibility studies. In a further 5 papers (9.1%) the study design was unclear. Higher quality studies, and those aiming to test hypotheses, may be more likely to monitor and report on fidelity components. Maynard and colleagues [90], for example, found that RCTs were 3 times more likely to measure fidelity than studies with designs of lower quality. In this review, studies were not excluded on the basis of study design. We believe that addressing fidelity components is important in study designs such as pilot or feasibility studies, and the proportion of these designs included in our review suggests this belief is not uncommon. These trials play a fundamental role in determining the methods and procedures used to assess and implement an approach that will subsequently be used in a larger study, and they can help refine an intervention and its implementation to increase its probability of success when evaluated in a larger RCT [91].

Another explanation for some of the differences between this and other reviews lies in the method used to assess the presence or absence of assessments of, or strategies to enhance, receipt. In other reviews [10, 15–17], fidelity components were judged to be ‘present’, ‘absent (but should be present)’, or ‘not applicable’ (the particular fidelity strategy was not applicable to the paper in question). In this review, the denominator used to calculate proportions was the total number of studies, not only those studies where receipt was deemed applicable; our proportions are therefore conservative estimates. Similar to Garbacz et al. [14], our review did not account for studies where receipt was not deemed applicable. Performance of a skill, for example, may not have been relevant in all the studies we reviewed; an intervention aiming only to provide information on health benefits (e.g. Kilanowski et al. [31] in this review) is one example. As most interventions reviewed involved multiple components and targeted behaviour change, it is unlikely this difference in methods significantly affected our findings. In line with this, future work may benefit from developing guidance for researchers on methods to address fidelity components that is specific to different intervention types, populations, or evaluation methodologies. Some researchers have begun this process by working towards the identification of features that are unique to the fidelity of technology-based interventions [92].
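To make the effect of this denominator choice concrete, the sketch below contrasts the two calculations using this review’s own figure for assessments of understanding (20 of the 33 BCC-citing studies); the number of ‘not applicable’ studies is hypothetical, since it was not recorded in this review.

```latex
% Denominator used in this review: all studies
\frac{n_{present}}{N_{total}} = \frac{20}{33} \approx 60.6\%
% Denominator used in some earlier reviews: applicable studies only
% (assuming, hypothetically, 3 studies deemed 'not applicable')
\frac{n_{present}}{N_{applicable}} = \frac{20}{33 - 3} \approx 66.7\%
```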

An important challenge in the field of fidelity is the varying nature of interventions, and the tailoring of intervention fidelity plans that this requires [90]. This is compounded by a further challenge: the lack of reliable methods available to measure intervention fidelity [93]. The second aim of this review was to describe the methods used to address receipt. Our main findings are that receipt has been operationalised in a variety of ways across studies, and that operationalisations are not always consistent with the framework reported to be guiding the evaluation. Such inconsistencies in the operationalisation of receipt make it difficult to synthesise evidence of receipt and to build a science of fidelity. Clearer reporting of methods to address receipt is also required and may help improve consistency in this field. In this review, a third reviewer was involved in data extraction for 18 (32.7%) papers to help reach agreement on the methods used to assess receipt. One common problem was the lack of clear differentiation between fidelity components and other constructs measured and reported on. Ensuring constructs are clearly labelled and differentiated from one another is recommended for future work. A recent meta-evaluation of fidelity work in psychosocial intervention research supports our review’s findings: it found strong variation in whether authors defined fidelity, that the use of different fidelity frameworks and terminology tended to generate confusion and make comparisons difficult, and that the operationalisation of receipt varied greatly [94]. The BCC framework was an attempt to build consistency in the science of fidelity, but ten years later this attempt does not appear to have been entirely successful. As underlined by Prowse and colleagues [94], there is a need for standardisation in the field of fidelity, but this standardisation must not increase complexity.

A subjective assessment of receipt was included in 90.9% of the studies reviewed, carried out using quantitative (76.0%) and/or qualitative (42.0%) methods. Both quantitative and qualitative methods have been recognised to provide valuable process evaluation data [13], so the combination found in this review is not surprising. One important finding from our review, however, was that only 26.0% of studies using subjective assessments of receipt reported on the reliability and validity of the measurement tools or qualitative methodology used. More specifically, 26.3% of studies using quantitative methods and 19.0% of those using qualitative methods were found to provide such information. A similar pattern was found in a previous review of fidelity, in which none of the studies addressing fidelity reported on reliability [90]. The lack of information on these issues limits the utility and value of the measures used and their potential to inform evidence-based practice and policy.

Strengths and limitations of the review

A strength of this review lies in the search strategies used. A forward citation search on the two seminal papers presenting the BCC framework was performed to determine the frequency with which healthcare intervention studies citing this framework assessed receipt. This has been shown to be an effective search strategy for identifying literature pertaining to a specific framework or model [95]. Its use in this review was therefore well suited to the exhaustive identification of relevant papers. Citation searching has been shown to help locate relevant work that traditional database searching sometimes fails to identify [96, 97] but is not commonly used in reviews. The second strategy combined the results from the forward citation search and a database search to examine methods used to assess receipt in healthcare interventions. Another strength of this review is the range of health interventions it covered. Previous reviews on fidelity have focused on specific fields of intervention research and populations (e.g. second-hand smoking [15], mental health [16], and psychosocial oncology [17]). Although Borrelli and colleagues [10] examined a broad range of interventions, their review was published over 10 years ago. To the best of our knowledge, the current review is the first to focus specifically on fidelity of receipt. It was therefore considered more appropriate to broaden the intervention focus as much as possible, to reach an overall understanding of the current state of this field of research. Finally, the methods used to address receipt have not been investigated before. Earlier reviews [98, 99] have reported on methods to assess fidelity, but these focused on delivery.

This review is not without limitations. First, our first research question focused on the BCC framework. Other fidelity frameworks have been used, and the study of their applications may have yielded findings that could have added to our understanding of receipt in intervention research. Nonetheless, the BCC framework was chosen for its comprehensiveness, as it was developed to unify previously proposed frameworks of fidelity, and to enable comparison with previous reviews that have examined fidelity using this framework. Furthermore, our second research question was broad in scope and examined the use of several other frameworks. This was to account for the emerging science of fidelity assessment [100], and the likely variability in fidelity conceptualisations and practices.

Second, this review included only published work. The reporting of complex health interventions is often incomplete [101, 102], and the absence of fidelity assessments from published manuscripts does not necessarily imply their omission from evaluation designs. Consulting the grey literature may have identified a higher frequency with which fidelity of receipt was assessed. Finally, our examination of how receipt was addressed in the literature was applied to the intervention group and not to control groups [20]. We agree that it is important for fidelity to be assessed in control groups; however, we did not feel this was within the scope of this review.

Furthermore, it should be noted that fidelity of interventions is part of a broader process in which context is an important consideration, both in terms of how it affects the implementation of the intervention (e.g. adaptations and alterations to the intervention) and the mechanisms of impact (e.g. participants’ responses to and interactions with the intervention) [13]. For example, in interventions to increase vaccination uptake, both media scares (context) and individual differences in cognitive and emotional antecedents of vaccine uptake (individual beliefs and fears) may be important considerations. If such interventions are not successful in improving participants’ understanding of vaccination, or their skills in cognitive reframing regarding vaccination in a context of collective fear, then vaccination is unlikely to be enacted and fear will remain. Conversely, participants with improved understanding and skills in challenging unhelpful beliefs would be more likely to vaccinate. Tailoring an intervention to the individual and their social and cultural context will therefore plausibly lead to better receipt of the intervention, which in turn will result in improved outcomes. Future studies should examine the extent to which intervention receipt is the mediating mechanism between tailored interventions and enactment, and how these factors impact on outcomes.

Conclusion

Addressing intervention fidelity is a fundamental part of conducting valid evaluations in health intervention research, and receipt is one of the fidelity components to address. This systematic review examined the extent to which, and the methods by which, receipt has been addressed in health intervention research in the last ten years. The results indicate a need for receipt to be more frequently integrated into research agendas. The review also identified issues and concerns relating to the ways in which receipt has been addressed to date, with operationalisations of receipt lacking consistency. We recommend that information on the reliability and validity of receipt measures be reported in future fidelity research.

Box 1: Lessons learnt and recommendations from this review

Lessons learnt

• Fidelity of receipt (as defined in the BCC framework, i.e. assessments of participants’ understanding and performance of skill, and strategies to enhance these) remains poorly assessed in health intervention research

• Reporting of strategies to enhance receipt, i.e. participants’ understanding and performance of skill, remains particularly low.

• Frameworks other than the BCC framework have been used to guide fidelity/process evaluation work, but operationalisations of receipt do not always match the definitions of receipt provided in these frameworks

• The reporting of methods used to assess receipt requires improvement. Reporting was unclear in a number of papers, requiring readers to read manuscripts attentively several times to identify how receipt was operationalised, and providing no information on the validity/reliability of the methods used

• Quantitative and qualitative methods, or a combination of both, have been used to address fidelity of receipt in health intervention research.

Recommendations for future work

• In the early stages of study design, consider how to address fidelity of receipt, both in relation to assessments and to strategies to enhance it

• Select one or more fidelity frameworks to guide fidelity work (or use an overarching model) and ensure the methods used to assess receipt are consistent with the definitions of receipt in the chosen framework(s) (provide definitions of receipt)

• Clearly differentiate between fidelity components and other constructs when writing papers (e.g. receipt and enactment are different constructs, therefore the methods used to assess them, as well as the corresponding results, need to be described separately)

• Address and report on the reliability and validity of the methods used to assess receipt