Key Points

Compliance with injury prevention interventions can significantly affect study outcomes.

There is considerable heterogeneity in the way that sports injury prevention studies have measured, defined and reported compliance. More uniformity is needed in future studies to better progress sports injury prevention.

1 Introduction

It is widely recognised that participation in regular sports and physical activity has the potential to improve health [1]. However, involvement in such activities also entails a risk of sustaining an injury. Serious sport injuries that take a considerable time to heal can force those involved not only to withdraw from the activity, but also to seek medical care and invest in medication and assisting materials—such as tape, braces and crutches. They can even prevent someone from continuing work or study activities. As a result, injuries lead not only to an individual burden, but also to substantial societal direct and indirect cost [2].

Numerous studies have been performed to evaluate the efficacy of interventions to prevent sport injuries or to reduce the risk of recurrent injury [3]. Although a variety of efficacious preventive interventions have been proposed, implementation of these interventions faces the challenge of persuading participants to follow instructions as prescribed. Establishing the effectiveness of any injury prevention intervention requires knowledge of the percentage of the targeted population that actually complied with the prescribed protocol. Especially in an intention-to-treat (ITT) approach, insight into compliance with the intervention provides valuable and, arguably, necessary information to judge the efficacy of an intervention [4].

When one incorrectly assumes that the entire study population has complied with the intervention protocol, the preventive effect of any intervention can be either over- or underestimated. Unfortunately, many different definitions of compliance have been reported in the sports medicine literature [3]. The constructs of compliance and adherence have been used interchangeably to describe the complete and correct following of a prescribed intervention. Nonetheless, the two terms are not synonymous. Compliance refers to participant obedience in a study where a clinician/researcher prescribes the intervention, with little to no right of consultation on behalf of the participant. It can thus be defined as “the athletes’ correct following of the prescribed intervention” [4]. Adherence implies a more collaborative environment in which a clinician/researcher and a study participant cooperate to develop an intervention that fits with the participant’s opportunities and constraints [5, 6]. Research, ideally performed in a more or less controlled setting, therefore implicitly focuses on compliance, rather than on adherence.

In addition to using correct definitions, the operationalisation of compliance requires attention. A comprehensive assessment of study results will only be possible if there is thorough insight into the way compliance has been defined, measured and adjusted for. If there is no, or incomplete, information available on the extent to which participants have complied with the intervention, it will remain unclear as to whether the intervention has been truly efficacious or not. Therefore, it is important that researchers, who aim to present studies of high quality with a low risk of bias, acknowledge the importance of compliance, and measure and report on compliance and its effects on study outcomes.

A number of study reporting guidelines, such as the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) statement and the CONSORT (CONsolidated Standards Of Reporting Trials) statement, recognise the importance of compliance and include specific items on the topic in their guidelines [7–9]. The STROBE statement addresses cohort, case-control and cross-sectional studies; the CONSORT statement specifically addresses the quality of reports of randomised controlled trials (RCTs).

Until 2010, the CONSORT statement advocated the use of ITT analysis for RCTs. ITT analysis does not include the measurement of compliance but assumes full adherence to the prescribed intervention [4]. However, as mentioned in the CONSORT statement, strict ITT analysis is often hard to achieve for two main reasons: missing outcomes for some participants and non-adherence to the protocol. Therefore, since 2010, the CONSORT statement has replaced the mention of ITT with the requirement of “more information on retaining participants in their original assigned groups” [7]. As an alternative to an ITT analysis, it has been suggested that per-protocol analysis (PPA), sometimes referred to as ‘modified ITT’, can be used [4]. In this approach, the analysis is performed only on those participants who have fully complied with the programme. A PPA can provide a measurement of efficacy in that it gives the result of a prescribed programme that is implemented exactly as the researcher originally developed it. It is currently unclear to what extent RCTs on sport injury prevention have followed the guidelines provided by the CONSORT statement and to what extent compliance measures have been addressed. This systematic review therefore aims to assess the extent to which sport injury prevention RCTs have defined, measured and adjusted their results for compliance with the trialled intervention(s).
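The contrast between the two analyses can be illustrated with a minimal Python sketch. The `Participant` record, the `complied` flag and all counts below are hypothetical and are not taken from any included study; the sketch only shows which participants enter each analysis.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    assigned: str   # group at randomisation: "intervention" or "control"
    complied: bool  # hypothetical flag: followed the prescribed protocol in full
    injured: bool   # primary outcome


def injury_risk(group):
    """Proportion of participants in `group` who sustained an injury."""
    return sum(p.injured for p in group) / len(group)


def itt_risk_ratio(participants):
    """ITT: everyone is analysed in the group they were randomised to."""
    intervention = [p for p in participants if p.assigned == "intervention"]
    control = [p for p in participants if p.assigned == "control"]
    return injury_risk(intervention) / injury_risk(control)


def per_protocol_risk_ratio(participants):
    """PPA: intervention participants who did not fully comply are left out."""
    intervention = [p for p in participants
                    if p.assigned == "intervention" and p.complied]
    control = [p for p in participants if p.assigned == "control"]
    return injury_risk(intervention) / injury_risk(control)


# Hypothetical trial with 20 participants per arm (made-up numbers).
# The compliance flag is not used for the control arm.
trial = (
    [Participant("intervention", True, i < 1) for i in range(10)]     # 1 of 10 injured
    + [Participant("intervention", False, i < 4) for i in range(10)]  # 4 of 10 injured
    + [Participant("control", True, i < 5) for i in range(20)]        # 5 of 20 injured
)

print(f"ITT risk ratio: {itt_risk_ratio(trial):.2f}")
print(f"PPA risk ratio: {per_protocol_risk_ratio(trial):.2f}")
```

With these made-up numbers the ITT risk ratio is 1.00 and the per-protocol risk ratio is 0.40, which shows how the apparent effect can depend on whether non-compliant participants are retained. Because the per-protocol subset is selected after randomisation, the two arms may no longer be fully comparable, so such a result should be read with caution.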

2 Methods

2.1 Research Questions

This review answers the following questions to provide a detailed analysis on how compliance has been reported in sport injury prevention studies:

  1. How and how often was compliance defined?

  2. When defined, how was compliance measured?

  3. When defined and measured, how was the outcome adjusted for compliance in the analysis?

2.2 Electronic Searches

Seven electronic databases were systematically searched for peer-reviewed publications on sport injury prevention interventions: PubMed (to October 2014), MEDLINE (1966 to October 2014), SPORTDiscus (1949 to October 2014), the Cochrane Central Register of Controlled Trials (to October 2014), CINAHL (Cumulative Index to Nursing and Allied Health Literature; 1982 to October 2014), PEDro (The Physiotherapy Evidence Database; to October 2014) and Web of Science (to October 2014). A standardised search strategy, based on a word string, including relevant sports injury terms and study designs, was used. The following keywords, and various combinations of those words, were used in the search: sport injury/ies, athletic injury/ies, prevention, preventive, preventi*, randomiz/s/ed, randomiz/s/ed controlled trial. Reference lists and related citations of included studies and relevant systematic reviews were also hand-searched for applicable publications.

2.2.1 Inclusion Criteria

Only RCTs, quasi-RCTs and cluster-RCTs were considered eligible for inclusion. The reason for including only (cluster- and/or quasi-)RCTs is that these studies maximise internal validity, which can be seen as a prerequisite for external validity. Trials were included that involved physically active individuals of either sex and of all ages. To be selected, studies had to examine the effects of an intervention aimed at the prevention of sport- or physical activity-related injuries. The primary outcome of the studies had to be a measure of sports- or physical activity-related injury (i.e. injury rate, time to first injury or the number of injured individuals). Only English-language publications were considered.

2.2.2 Exclusion Criteria

Studies that did not assess prevention of sports injury, that were not an RCT, quasi-RCT or cluster-RCT, or did not involve a physically active population were excluded from this review.

2.3 Definitions

Compliance in this review was defined as “the athletes’ correct following of a prescribed intervention” [4]. It is acknowledged that a number of terms have been used in the scientific literature, referring to comparable constructs. As such, for the purpose of this current review, we considered all text referrals to participants’ following of an intervention as compliance. Other examples of phrases equivalent to compliance commonly used in publications are ‘use’, ‘cooperation’ and ‘adoption’ [4]. In this review, all studies included were scrutinised thoroughly to identify the specific form/phrase used by the authors. This ensured that all accounts of compliance were included.

2.4 Methodological Quality

Potentially eligible studies were initially screened by title and abstract by the primary author. When eligibility was unclear, full-text articles were retrieved. In order to assess the methodological quality and risk of bias, all included studies were assessed based on ten out of 12 criteria as recommended by Furlan et al. [10]. These were the method of randomisation, concealed allocation, blinding of participants, blinding of care providers, blinding of outcome assessors, dropout rate, analysis according to allocated group, baseline similarity of the groups, compliance and timing of outcome assessment. This was done to assess whether there were differences in the risk of bias between studies that did and did not report compliance. It is possible that studies that did not report compliance also failed to report other important methodological and design properties. Two criteria from Furlan et al. [10] were omitted (freedom from selective outcome reporting and avoidance of co-interventions), as these criteria were not considered to be distinctive for risk of bias between the included studies.

Each criterion was scored as ‘yes’, ‘unclear’ or ‘no’. Furlan et al. [10] defined studies with more than 6 points (yes = 1 point) as having “low risk of bias”. As two criteria were omitted, the original scoring was adjusted. Hence, more than 5 points was considered as the cut-off for “low risk of bias”.
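The adjusted scoring rule can be written out as a short Python sketch. The criterion labels and the example ratings are illustrative assumptions; only the ten criteria, the one-point-per-‘yes’ rule and the cut-off of more than 5 points come from the text above.

```python
# The ten retained criteria; the key names are illustrative labels,
# not an official coding scheme.
CRITERIA = (
    "randomisation",
    "concealed_allocation",
    "blinding_participants",
    "blinding_care_providers",
    "blinding_outcome_assessors",
    "dropout_rate",
    "analysed_as_allocated",
    "baseline_similarity",
    "compliance",
    "timing_of_outcome_assessment",
)
VALID_RATINGS = ("yes", "unclear", "no")
LOW_RISK_CUTOFF = 5  # more than 5 'yes' ratings counts as low risk of bias


def risk_of_bias_score(ratings):
    """Count 'yes' ratings; 'unclear' and 'no' contribute 0 points."""
    for criterion in CRITERIA:
        if ratings.get(criterion) not in VALID_RATINGS:
            raise ValueError(f"missing or invalid rating for {criterion!r}")
    return sum(ratings[criterion] == "yes" for criterion in CRITERIA)


def has_low_risk_of_bias(ratings):
    """Apply the adjusted cut-off used in this review."""
    return risk_of_bias_score(ratings) > LOW_RISK_CUTOFF


# Hypothetical study: four criteria met, one not met, five unclear.
example = dict.fromkeys(CRITERIA, "unclear")
example.update(
    randomisation="yes",
    dropout_rate="yes",
    baseline_similarity="yes",
    compliance="yes",
    analysed_as_allocated="no",
)

print(risk_of_bias_score(example), has_low_risk_of_bias(example))
```

In this hypothetical case the study scores 4 points and therefore does not reach the low-risk threshold.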

To familiarise the authors with the risk of bias assessment, three reviewers (MvR, IV and EAV) scored ten studies that were randomly selected from all studies. Examining the disagreement in the assessment of these ten studies allowed the reviewers to identify possible incongruities in scoring. Thereafter, the total number of studies (n = 110) was randomly divided into two equal-sized sets (n = 55) and two reviewers (MvR and IV) each independently assessed risk of bias for one set. For the coding reliability assessment, 19 studies were randomly selected from each of the sets. Both reviewers scored these 38 studies. It was agreed that if the agreement (kappa) score for these 38 studies was >0.9, agreement was acceptable and there was no need for the reviewers to score all studies separately. Of the 380 items that were scored twice, there was agreement on 370 items. This resulted in an agreement (kappa) score of 0.95. Based on this high level of agreement, it was decided that the remainder of the manuscripts did not need to be assessed by both reviewers.
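For readers unfamiliar with chance-corrected agreement, the sketch below shows one common way to compute a kappa statistic in Python. It assumes Cohen's kappa for two raters, which the text does not state explicitly, and the ratings are invented. Raw agreement in the review was 370/380 (about 97.4 %); the reported kappa of 0.95 cannot be reconstructed here because the per-category counts are not given.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters scoring the same items.

    kappa = (p_observed - p_expected) / (1 - p_expected), where p_expected
    is the agreement expected by chance from each rater's category frequencies.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(ratings_a)
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_expected == 1:
        return 1.0  # both raters used a single identical category throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings of ten items; the raters disagree on one item.
reviewer_1 = ["yes"] * 5 + ["no"] * 5
reviewer_2 = ["yes"] * 4 + ["no"] * 6

print(f"kappa = {cohen_kappa(reviewer_1, reviewer_2):.2f}")
```

In this invented example, observed agreement is 0.90 and chance agreement is 0.50, giving a kappa of 0.80.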

2.5 Data Extraction

One reviewer (MvR) scrutinised the included studies for all terms referring to compliance. Thereafter, for the studies that mentioned compliance, details about the definitions, the methods of compliance measurements and the corresponding outcomes were extracted. Finally, all studies were examined for adjustment of the main outcome in their analyses by compliance rates.

3 Results

3.1 Search Results

The search strategy initially yielded 1902 studies, of which a total of 289 full-text articles were retained after initial screening for eligibility. A total of 180 studies were then excluded (Fig. 1), resulting in 109 studies being included in this review. The primary reasons for exclusion were that studies did not involve an RCT or did not use injury as an outcome measure. For five studies, full-text articles could not be retrieved [11–15]. Electronic Supplementary Material Appendix S1 provides an overview of the studies included in the final review. Figure 2 describes the included studies in terms of their mentioning of, measurement of and/or adjustment for compliance.

Fig. 1 Literature search flow chart. RCT randomised controlled trial

Fig. 2 Annual trends in compliance reporting. Note: a study can be categorised into more than one of the four categories shown

3.2 Risk of Bias Scores

The 109 included studies scored an average of 4.1 ± 1.8 ‘yes’ ratings (out of 10), 2.8 ± 1.3 ‘no’ ratings and 3.3 ± 1.8 ‘unclear’ ratings on the risk of bias assessment instrument. It can thus be concluded that, in general, the included studies demonstrated a fairly high ‘risk of bias’. The 21 studies that explicitly adjusted for compliance rates in their study outcomes, and hence provided the most details on compliance in their report, scored an average of 4.7 ± 1.6 on the risk of bias assessment, compared with an average score of 3.9 ± 1.8 for the 88 studies that did not account for compliance. This suggests that the studies that accounted for compliance had a slightly higher methodological quality than those studies without such adjustment. Electronic Supplementary Material Appendix S1, Sect. 1 provides an overview of the risk of bias score of each of the included studies.

3.3 Compliance

3.3.1 Terms Used for Compliance

Of all studies, 78 (71.6 %) mentioned compliance or a related term. Most common was the use of the term ‘compliance’ (n = 57; 52.3 %). Other terms used were ‘use’ (n = 8), ‘adherence’ (n = 6), ‘attendance’ (n = 2), ‘cooperation’ (n = 1) and ‘participation’ (n = 1). Some studies used multiple terms by switching between ‘compliance’ and ‘adherence’ (n = 2), ‘compliance’ and ‘exposure’ (n = 1) or ‘compliance’ and ‘internal dropout’ (n = 1). Electronic Supplementary Material Appendix S1, Sect. 2 provides an overview of the terms used in the included studies.

3.3.2 Measurements of Compliance

The majority of the 78 studies that mentioned compliance (75; 68.8 % of all studies included) provided details on how they measured compliance. Compliance rates were recorded using diverse methods. Studies that concerned supervised exercises derived compliance rates from a written or online report by a supervisor, e.g. a trainer, coach or designated team member (n = 15) [16–30]. Studies of home-based or individual exercises made use of a written or online self-report (n = 12) [31–42]. In studies relating to the use of protective equipment (orthoses, wrist protectors, etc.) or supplements, this use was recorded by either the participant (n = 4) [43–46] or a supervisor (n = 5) [47–51]. In 15 studies [47, 52–65] the wearing/usage of protective equipment was only checked visually. In three studies [52, 54, 62], a lack of compliance with wearing/usage of material resulted in prohibition to participate; these studies therefore suggested 100 % compliance for people who remained in the study. For example, the participants who were designated to wear a helmet during football were visually checked before they entered the field; non-compliance with wearing the helmet resulted in the prohibition to play [52].

In 24 studies, researchers verified the reported compliance rates by multiple methods. These included combining self-report with the report of a supervisor [66–70], combining a report of a supervisor with random visits [5, 71–78], combining a report of a supervisor with phone calls and visits [79–81], combining self-report with random visits [82], combining a report of a supervisor with phone calls and emails [83] or combining self-report with phone calls [71].

Thirty-one studies included in this review were conducted in a military setting. Although it might be expected that a military setting would make it easier to report on compliance—with many supervised activities and a highly compliant environment—these studies did not provide more details on compliance than other studies. Slightly less than half of the military studies (n = 14) provided details on compliance measurements. In eight of these 14 studies it was reported that the participants were visually checked or supervised while carrying out the intervention. Two of those eight studies provided no further details on compliance rates [53, 54], two studies excluded participants from the analysis when they did not comply [55, 61] and the other four studies reported compliance rates of between 57 and 100 % [47, 56, 57, 60]. Electronic Supplementary Material Appendix S1, Sect. 3 provides an overview of ways in which studies have reported compliance rates.

3.3.3 Compliance Data and Adjustments for Compliance Rates

Of the 75 studies that provided information on compliance measurement, only 56 studies (51.4 % of all included studies) provided compliance data. These data were presented in heterogeneous ways. Nine studies [5, 16, 67, 71, 74, 79, 81, 84, 85] created subclasses of participants in which high, intermediate and low rates of compliance were defined. However, the (arbitrary) cut-off percentage that was considered for high versus low compliance varied considerably between studies.

For example, in a cluster-RCT on the FIFA 11+ injury prevention programme, low, middle and high compliance were defined respectively as performing <24.7, 24.8–48.1 or >48.2 % of all exercises [84]. This resulted in the categorisation of 18 % of teams in the low compliance category, 41 % of teams in the middle compliance category and 41 % of teams in the high compliance category. In another neuromuscular training intervention cluster-RCT, high compliance was defined as carrying out three (of three) sessions in a first intensive intervention period, two sessions in the second intervention period and one session in the third/maintenance period [16]. In this study, 36 % of the teams were considered as highly compliant, 43 % as irregularly compliant and 21 % as having interrupted compliance.
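A graded compliance variable of this kind can be derived with a few lines of Python. The sketch below uses the cut-offs reported for the FIFA 11+ trial [84] as defaults; how values falling exactly between the reported bands are handled is an assumption, and the team names and percentages are invented.

```python
from collections import Counter


def compliance_category(pct_completed, middle_from=24.8, high_above=48.1):
    """Classify the percentage of prescribed exercises performed.

    Defaults follow the bands reported in [84]; treating the small gaps
    between the reported bands as shown here is an assumption.
    """
    if not 0 <= pct_completed <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    if pct_completed < middle_from:
        return "low"
    if pct_completed <= high_above:
        return "middle"
    return "high"


# Hypothetical team-level compliance percentages.
team_compliance = {
    "team_a": 12.0,
    "team_b": 30.5,
    "team_c": 47.9,
    "team_d": 63.0,
    "team_e": 80.2,
}

categories = {team: compliance_category(pct) for team, pct in team_compliance.items()}
counts = Counter(categories.values())

for label in ("low", "middle", "high"):
    print(f"{label}: {counts[label]} of {len(team_compliance)} teams")
```

Passing different values for `middle_from` and `high_above` makes explicit how strongly the resulting group sizes depend on the chosen cut-offs, which is the source of the between-study variation noted above.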

Other studies chose to report compliance for each player [5, 73, 75, 78, 79, 81, 84], for the team as a whole [17–20, 72, 74, 75, 78, 79, 81] or as a seasonal compliance rate [20, 79]. In addition, some studies combined compliance rates of the intervention and the control group, which were presented as one overall compliance rate [21, 22, 57, 66, 70, 82, 86]. Electronic Supplementary Material Appendix S1, Sect. 4 provides an overview of the studies that reported compliance data.

In addition to providing compliance rates, a mere 21 studies [5, 16, 17, 20, 23, 31, 32, 43, 58, 67, 71, 74, 76, 77, 79, 83–87] (19.3 % of all included studies) analysed the effect of different compliance rates on study outcomes. As the studies used heterogeneous methods to report these analyses, it is impossible to provide a pooled effect of compliance rates. Therefore, Table 1 presents the details of the effect of measured compliance rates on their study outcome in these 21 studies.
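The proportions reported in Sect. 3.3 can be recomputed from the study counts with a short Python sketch. The counts are taken from the text; the variable names and level descriptions are illustrative.

```python
TOTAL_INCLUDED = 109  # studies included in the review

# Number of studies reaching each level of compliance reporting (from Sect. 3.3).
reporting_levels = {
    "mentioned compliance or a related term": 78,
    "described how compliance was measured": 75,
    "provided compliance data": 56,
    "analysed the effect of compliance on outcomes": 21,
}


def percentage_of_included(count, total=TOTAL_INCLUDED):
    """Share of all included studies, rounded to one decimal place."""
    return round(100 * count / total, 1)


percentages = {level: percentage_of_included(n) for level, n in reporting_levels.items()}

for level, pct in percentages.items():
    print(f"{level}: {reporting_levels[level]}/{TOTAL_INCLUDED} = {pct} %")
```

The computed values (71.6, 68.8, 51.4 and 19.3 %) match those stated in the text and show the progressive drop from mentioning compliance to analysing its effect.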

Table 1 Studies that analyse the effect of compliance rates on study outcome

4 Discussion

4.1 A Lack of a Uniform Definition of Compliance

In the studies presented in this review, various methods were employed to define, measure and analyse the effect of compliance. The most important finding is that, although the majority of studies mention the concept of compliance, there is a large degree of heterogeneity in the manner in which studies deal with this concept. Some studies merely mention compliance in either the introduction or discussion without providing further details on compliance assessment and compliance data. As can be seen from Fig. 2, there are more studies that provide compliance data than there are studies that give an explicit definition of compliance or one of the related constructs. In other words, whilst many report compliance, a majority do not define this term or explicitly state how they operationalised it.

The majority of the studies report minimal details on (1) the definition of compliance; (2) how compliance was measured; (3) the frequency by which compliance was measured (every day, week, month); and (4) how compliance affected study outcomes.

From 1970 onwards there was a clear increase in the number of sport injury prevention RCT studies. However, in the last few years (2011–2014) this trend has not continued and the number of injury prevention RCTs has actually decreased. It is likely that after numerous efficacy studies, research now focuses on implementation of prevention measures in non-RCT studies. As these non-RCT studies are not the topic of this review, they will not appear in Fig. 2.

4.2 The Importance of Compliance Reporting

In order to evaluate study outcomes in the context in which they are examined, it is essential that studies report the percentage of participants who have actually complied with the prescribed intervention. Compliance with an intervention significantly influences the outcomes of intervention studies, which is clearly illustrated by a number of studies included in this review [5, 23, 32, 71, 74]. For example, in the study by Steffen et al. [5] that assessed compliance rates with a neuromuscular injury prevention programme, high, intermediate and low compliance groups were defined. The authors’ PPA found that only the high-compliance group benefited significantly from the intervention.

In the study by Emery et al. [71] evaluating home-based balance training, participants who had conducted more than 18 sessions (of the recommended 42 sessions) in 6 weeks achieved a significant improvement in static balance skills. Participants with lower compliance rates did not improve their static balance skills. Gabbe et al. [23] evaluated eccentric hamstring exercises in amateur football players; only 4 % of the players who were compliant with the intervention sustained an injury, whereas players who were not compliant showed no reduced injury risk when compared with the control group. Hagglund et al. [74] reported similar outcomes, showing that a significant reduction in injury rates was found only in teams with the highest compliance with a neuromuscular training programme. Finally, the study of Hupperets et al. [32], in which only 23 % of participants were fully compliant, suggested that higher compliance would have resulted in fewer injuries. In a secondary analysis in a subsequent paper, it was indeed shown that the small group of participants with high compliance was responsible for the positive effect of the exercise programme on recurrent injury risk [4].

Information on the rate of compliance and its effect on study outcomes can be shaped into a clear message for the target groups involved; they should be informed about the number of training sessions they should at least participate in to reduce their risk of sustaining an injury. Providing information on compliance rates and the effect of those different rates on study outcomes might increase the practical usability of study results for the target group.

4.3 Acknowledgment of the CONSORT Statement

The CONSORT statement argues that, in order to evaluate both efficacy (with the assumption of full compliance and no recognition of implementation barriers) and effectiveness (the real-life adoption of an intervention), researchers should analyse study results using ITT, PPA and a graded compliance measure [7]. The latter refers to the extent to which participants have complied with the programme and what effect this has had on the outcome.

In addition to the diversity by which compliance is defined, measured and adjusted for in the analysis, the studies included in this review show a large degree of heterogeneity in the use of ITT, PPA or graded compliance.

Thirty-seven studies have used one or more of the recommended analyses. Twenty-eight studies [16, 17, 27, 29, 32, 34, 37–42, 44, 50, 52, 71, 72, 75–82, 84, 88, 89] used ITT analysis, one used PPA [19] and eight studies [23, 31, 43, 47, 58, 86, 90, 91] used both analyses (see Electronic Supplementary Material Appendix S1). It is clear that, although the CONSORT statement clearly acknowledges the importance of compliance and, hence, provides a step forward in improving the quality of intervention studies, there is still a lack of uniformity. What is needed is a uniform way in which compliance is dealt with.

4.4 Further Research

Further research needs to confirm which measures provide the most valid and reliable assessment of compliance. Although various methods have been used to measure compliance (e.g. the use of written, vocal or online self-reports, supervision and/or unscheduled visits), each method has its own limitations. Participants can incorrectly recall their activities or provide socially desirable reports on self-reported measures of compliance. In addition, a uniform definition of compliance and a categorisation of compliance rates might increase the possibility of comparing the effectiveness of different injury prevention programmes. The main weakness of the current study is that it only included RCTs. It would be of interest to conduct a similar review that includes both RCTs and less-controlled studies to identify adherence to sport injury intervention studies in which the setting is less controlled.

5 Conclusion

Injury prevention studies vary significantly in the way they define, measure and adjust for compliance. While the majority of these studies mention the concept of compliance, only one-fifth of the studies gave a more detailed account of how compliance rates influenced their study results. The studies that did account for compliance demonstrate that the level of compliance can have a significant effect on study outcomes. Valid and reliable tools to measure and report compliance need to be developed, matched to a uniform definition of compliance. Although current guidelines for reporting of studies have increased awareness of the need for compliance measurements, the way these measurements are executed and reported still shows a large degree of heterogeneity.