Key Points

  • The systematic collection of a range of physical activity outcomes is required in both clinical and research settings to effectively monitor and support post-arthroscopy recovery, building a more comprehensive activity profile of patients that moves beyond athletic classification.

  • Physical activity outcomes are important but diverse and poorly captured in the current literature. The appropriateness of the patient-reported outcomes most commonly employed to measure physical activity is questionable and the range limited.

  • The majority of patients feel better in relation to their ability to undertake physically active tasks including sports, but fail to progress to ‘feeling good’ or a patient-acceptable symptom state.

Background

Hip arthroscopy is an increasingly common surgical intervention for young and middle-aged adults with hip-related pain or dysfunction [1,2,3,4]. Indications for hip arthroscopy most frequently include persistent pain and altered bony morphology associated with femoroacetabular impingement syndrome (FAIS) in addition to labral tears, chondral defects and ligamentum teres injuries [5, 6]. Young and middle-aged adults undergoing hip arthroscopy have high expectations for returning to physical activity to support their social and cultural roles [7]. Despite this expectation, physical activity-related outcomes are only reported in approximately a quarter of studies investigating surgical intervention for FAIS [8], with return to sport or play being the predominant outcome assessed. A high level of return to sport/return to play following hip arthroscopy (88–91%) has been reported in a number of systematic reviews [9,10,11,12,13,14,15,16]; however, recent study findings suggest the need for a more expansive analysis, beyond these simplified nominal criteria, to assess the wider impact of hip arthroscopy on physical activity. When adding the further consideration of level to sports status, Ishøi et al. [17] identified a relatively low return to pre-injury sport at pre-injury level of 57%, and Thorborg et al. [18] identified that at 1 year post-arthroscopy, only 25% of patients met physical activity reference scores commensurate with those expected in a healthy population.

Dichotomous return-to-sport or return-to-play outcomes only provide a narrow perspective of physical activity, which comprises multiple constructs such as the type, quantity, intensity and quality of activity, as well as physical activity-related impairments such as pain or discomfort. As these multiple dimensions imply, capturing comprehensive physical activity data is challenging and unlikely to be attained using a single measure [19]. One potential method of capturing data is through the use of patient-reported outcome measures (PROMs). Recommended PROMs with adequate clinimetric properties for patients following hip arthroscopy include the Copenhagen Hip and Groin Outcome Score (HAGOS), International Hip Outcome Tool (iHOT-33) and Hip Outcome Score (HOS) [20,21,22]. While subscales of these PROMs primarily provide information on the degree of difficulty that patients experience with sport-related activities, other PROMs such as the Hip Sport Activity Scale (HSAS) provide information on the level of activity undertaken [23]. In addition to questionnaires, with advancing technology, potential exists to gather objective information relating to physical activity. Duration and intensity of physical activity may be captured through the use of motion sensors, accelerometry and mobile phone applications. Although an overview from ClinicalTrials.gov [24] lists over 1500 trials using accelerometry as an outcome measure, only 118 of these are related to musculoskeletal problems and fewer than 5 are related to the hip. The extent to which these newer technologies are being used and reported in relation to the outcomes following hip arthroscopic surgery has yet to be described.

To gain a comprehensive understanding of the impact of hip arthroscopy on the physical activity of patients, it is necessary to consider a range of outcomes and include both competitive and non-competitive (recreational) physical activity. Within the context of this review, physical activity is deemed to be an activity exceeding that which is required for normal activities of daily living, interpreting sport in a wider community context [25]. While arthroscopic interventions continue to evolve and increase in popularity [2, 4, 26], our current understanding of post-arthroscopy outcomes, in terms of physical activity, remains limited.

Review Aim:

The primary aim of this systematic review is to examine quantitative primary research, reporting level IV evidence or above, to assess the impact of hip arthroscopy, undertaken for hip-related pain and dysfunction, on the physical activity of young and middle-aged adults. This will be assessed via the study outcomes presented. In addition, an overview of the outcomes used will be described.

Methods

Protocol and Registration

The protocol for this review was registered with the International Prospective Register of Systematic Reviews (PROSPERO, registration no. CRD42017080527). Amendments were made to the original protocol to (i) clarify exclusion criteria and (ii) modify outcomes in light of literature published during completion of the current review.

Eligibility Criteria for Inclusion in the Review

Pre-specified inclusion and exclusion criteria are identified in Table 1.

Table 1 Inclusion and exclusion criteria

Literature Search Strategy and Study Selection

A comprehensive search strategy was developed for the following databases: Scopus, MEDLINE, CINAHL, PubMed, AUSPORT, SPORTDiscus, PEDro and PsycINFO. The search was restricted to articles from January 1st 1990, due to the limited literature on hip arthroscopic surgery prior to this date, through to January 16th 2018. The search was updated through to December 5th 2019.

The search was conducted independently by two reviewers (DMJ, JJH), with the strategy adapted as appropriate for the requirements of each database. An example of the full search strategy is given in Additional file 1. Citation tracking of key articles was undertaken using Web of Science and Google Scholar. A manual check of reference lists of key articles was also undertaken. References were imported into EndNote X6 (Thomson Reuters, Carlsbad, California, USA) and duplicates removed. Title, abstract and full-text screening was undertaken by two teams of independent reviewers (DMJ, JJH, BFM). Any disagreements were resolved by a fourth independent reviewer (JLK).

Study appraisal

All included papers were assessed using an adaptation of the assessment form for observational studies created by Siegfried et al. [27], utilising further examples from Ganderton et al. [28, 29]. Copies of the appraisal form are given in Additional file 2. The tool considers biases relevant to observational studies in general and those specific to the research question. To address the research-specific biases, four authors (DMJ, JLK, KMC, JJH) compiled a list of potential confounding factors such as age, sex and the degree of degenerative change in the hip joint. As the majority of studies were non-randomised controlled trials, this approach was undertaken to align with good practice guidelines outlined by the non-randomised studies methods group of the Cochrane Collaboration [30]. This tool was used to assess methodological quality of all included studies by two teams of reviewers (DMJ, KD, MO, BM). Disagreements were resolved through discussion and, where necessary, consensus agreed with an independent arbitrator (JLK). Agreement between raters was determined using percentage-observed agreement and Cohen’s Kappa (κ). Itemisation and display of each aspect was presented in its raw form for each study. An assessment of level of evidence was made against the Oxford Centre for Evidence-Based Medicine criteria [31].

Data extraction, synthesis and analyses

Data for each included study were extracted independently by two teams of reviewers (DJ, KD, MO, BM) using a standardised form adapted from the Cochrane Effective Practice and Organisation of Care (EPOC) criteria [32]. Inconsistencies were resolved by consensus discussion with arbitration from a third reviewer (JLK) if needed. Study authors were approached by email with requests for further data if required.

Data regarding study design, participant demographics (age, sex, physical activity attributes), outcome measures, duration of follow-up, arthroscopic findings and intervention were extracted and collated. The primary indication for surgery was noted (if specified). Where sufficient data were available, sports activities were categorised using previously established criteria in which activities are grouped based on the mechanical load placed on the hip joint (Table 2) [33, 34].

Table 2 Categories of sports activities, based on hip joint load

To accommodate heterogeneity in the reporting of duration of follow-up, data collection points were collated under the following time frames: ≤ 6 months, 7–12 months, 13–18 months, 19–24 months, ≥ 25 months. Improvements in activity-specific subscales are known to be limited beyond 2 years post-arthroscopy [11, 35].
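The grouping of follow-up durations described above can be sketched as a simple lookup. This is a minimal illustration rather than the review's actual procedure: the function name `follow_up_time_frame` and the ASCII labels are hypothetical, and the sketch assumes follow-up is expressed in whole months.

```python
def follow_up_time_frame(months: int) -> str:
    """Map a follow-up duration (whole months, an assumption) to a time frame."""
    if months < 0:
        raise ValueError("follow-up duration cannot be negative")
    if months <= 6:
        return "<=6 months"
    if months <= 12:
        return "7-12 months"
    if months <= 18:
        return "13-18 months"
    if months <= 24:
        return "19-24 months"
    return ">=25 months"


# Hypothetical follow-up durations, for illustration only
for duration in (6, 7, 24, 25):
    print(duration, follow_up_time_frame(duration))
```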

Reported outcomes were assessed to identify the direction and consistency of effect, and where appropriate data were available, standard paired differences (SPD) were calculated to present a magnitude of effect between time points. This was determined by the within-group difference between time points, divided by the pre-score standard deviation (SD). Where standard errors (SE) were reported, SD was calculated (SD = SE*√number of participants). The magnitude of SPDs was interpreted as large effect (≥ 0.8), moderate effect (0.5–0.79) and weak effect (0.2–0.49) [36]. The 95% confidence intervals for SPDs were calculated. Where appropriate summary scores were available for whole cohorts in studies with more than one arm, these data were used in preference to group data. Where data were insufficient for SPDs to be calculated, relevant study conclusions were reported where available.
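The SPD calculation described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' analysis code: the function names are hypothetical, the input values are invented, and the sign convention (pre-score minus post-score, so that an improvement on a higher-is-better scale gives a negative SPD) is an assumption. The text does not state how values below 0.2 are labelled, so the sketch leaves them unclassified.

```python
import math


def sd_from_se(se: float, n: int) -> float:
    """Convert a standard error to a standard deviation: SD = SE * sqrt(n)."""
    return se * math.sqrt(n)


def standard_paired_difference(pre_mean: float, post_mean: float, pre_sd: float) -> float:
    """Within-group difference between time points divided by the pre-score SD.

    Assumption: the difference is taken as pre minus post.
    """
    return (pre_mean - post_mean) / pre_sd


def effect_magnitude(spd: float) -> str:
    """Interpret the absolute SPD using the thresholds given in the text."""
    size = abs(spd)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "moderate"
    if size >= 0.2:
        return "weak"
    return "below 0.2 (not classified in the text)"


# Hypothetical study group: pre-score SE of 2.5 reported for 64 participants
pre_sd = sd_from_se(se=2.5, n=64)  # 2.5 * 8 = 20.0
spd = standard_paired_difference(pre_mean=45.0, post_mean=70.0, pre_sd=pre_sd)
print(spd, effect_magnitude(spd))
```

The confidence interval is deliberately omitted here because the text does not specify the formula used for it.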

To provide a visual representation of HOS-SS outcome scores, all data points from study groups were plotted against the minimal clinically important difference (MCID) and patient-acceptable symptom state (PASS) for this subscale (a change of 6 points and score of 75 points, respectively [21, 37, 38]). These scores were interpreted as ‘feeling better’ (MCID) and ‘feeling good’ (PASS) [39].
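The two thresholds can be read as separate tests, one on the change in score and one on the follow-up score itself. The sketch below is an illustration only: the function name `classify_hos_ss` and the example scores are hypothetical, and treating both thresholds as inclusive is an assumption.

```python
HOS_SS_MCID = 6.0   # minimal clinically important difference: change of 6 points
HOS_SS_PASS = 75.0  # patient-acceptable symptom state: score of 75 points


def classify_hos_ss(pre_score: float, post_score: float) -> dict:
    """Flag whether a study group meets the MCID and PASS thresholds.

    Assumption: both thresholds are inclusive (>=).
    """
    change = post_score - pre_score
    return {
        "feeling_better": change >= HOS_SS_MCID,    # sufficient improvement
        "feeling_good": post_score >= HOS_SS_PASS,  # sufficiently high end score
    }


# Hypothetical group that improves by 18 points but finishes below 75
print(classify_hos_ss(pre_score=50.0, post_score=68.0))
```

A group can therefore meet the MCID without meeting the PASS, which is the pattern the two labels are intended to separate.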

Pooling of data was undertaken where outcomes were statistically and clinically homogeneous. Any studies with potential replication of participants were excluded from this analysis. Where no responses were offered from authors to enable discrete cohorts to be identified, the study encompassing the widest time frame with the greatest number of participants was chosen from studies generated within the same research setting, utilising the same outcome measures and database. Where more than one outcome was reported in a study, the most frequently occurring outcome score across all studies was chosen to be reported in pooled data. Studies reporting number of participants or number of hips were included in the pooled data. Where reporting was unclear, a conservative approach was taken with calculations being made in relation to the lowest number of potential participants. Pooled data were examined using forest plots (Review Manager (RevMan) [Computer program]. Version 5.3. Copenhagen: The Nordic Cochrane Centre, The Cochrane Collaboration, 2014). Duration of follow-up categories were further merged to provide pooled data for the following time frames: 6 to 12 months, 13 to 24 months and ≥ 25 months. Studies were only reported once in each time frame.

Results

Search Strategy

The number of records considered at each stage of the review and the reason for exclusions are shown in Fig. 1. In total, 120 studies were included in the review. A list of excluded studies is provided in Additional file 3.

Fig. 1 PRISMA flow chart

Study Characteristics

The included studies [6, 17, 18, 35, 37, 40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154] comprised two randomised controlled trials (RCTs), 24 prospective studies and 94 retrospective studies, of which 41 were single-arm case series (Additional file 4: Characteristics and outcomes of included studies). Author requests were made in relation to 51 (43%) studies to attain unreported data and query potential replication of participant data between studies. Additional information was supplied for five studies [18, 89, 99, 100, 112].

One hundred and twelve (93%) studies were conducted on a single site and/or involved the patients of one surgeon (Table 3). One hundred studies (83%) were from North America, 12 from Europe (10%) and 3 from Australia (2.5%). Three studies were from Korea, 1 from China and 1 from Israel.

Table 3 Summary of study quality assessment

A mix of reporting approaches was used, the majority of studies providing data based on participants (20,154 participants), the remainder recording 1,446 hips/procedures. We were unable to exclude the possibility of participants appearing in more than one study due to the high number of studies retrospectively reviewing databases. The number of participants in studies ranged from 11 to 1835. The mean (± SD) age of participants was 34 ± 7 years with 58% of the data pertaining to women. Seventy-two percent of studies specified FAI/FAIS as the primary inclusion pathology.

One study [154] reported objective measures of physical activity utilising accelerometry. The majority (n = 99, 83%) presented the Hip Outcome Score-sport-specific subscale (HOS-SS, Fig. 2). The ‘Function in Sport and Recreation’ subscale of the Hip disability and Osteoarthritis Outcome Score (HOOS-SR) and the two relevant subscales (‘Physical Function in Sport and Recreation’, ‘Participation in Physical Activities’) of the Copenhagen Hip and Groin Outcome Score (HAGOS-SR; HAGOS-PA) were presented in 8 (7%) and 8 (7%) studies, respectively. An overview of PROMs is included in Additional file 5. The ‘Sports and Recreational Activities’ subscale of the International Hip Outcome Tool (iHOT-33 SR), Tegner Activity Scale (Tegner) and Hip Sports Activity Scale (HSAS) were reported in 2 (2%) of studies, while the UCLA Activity Score and Functional Activity Score (FAA) were each reported in a single study (Additional file 4). Outcome scores for studies with multiple time points of data collection can be found in Additional file 6. All but two studies reported pre- and post-arthroscopy results. Kemp et al. [89] provided an assessment of two post-arthroscopy time points; Tijssen et al. [124] reviewed changes from pre-injury to post-arthroscopy.

Fig. 2 Hip Outcome Score-Sport Scale (HOS-SS) outcome scores for study groups at all time points. Points above the MCID (minimal clinically important difference) line represent a sufficient change in HOS-SS score pre- to post-arthroscopy to identify ‘feeling better’. Points to the right of the PASS (patient-acceptable symptom state) line represent a sufficiently high HOS-SS score at follow-up to identify ‘feeling good’

Thirty-four (28%) of the reviewed studies included some assessment of physical activity attributes of the cohort, such as type of activity (e.g. ‘recreational’, ‘professional’; work activity or Tegner Activity Scale), with a similar proportion providing sufficient data to enable categorisation of activity type (as identified in Table 2; n = 30, 25%). A summary of inclusion/exclusion criteria for each study, arthroscopic intervention and findings is given in Additional file 7.

Quality assessment scores

Observed agreement between quality assessors was 99.6% (1554 out of 1560 items), where κ = 0.53, representing moderate inter-rater agreement [155].
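The two agreement statistics answer different questions: observed agreement counts matching ratings, whereas Cohen's κ discounts the agreement expected by chance. The sketch below shows how both are computed from a cross-tabulation of two raters. The table values are entirely hypothetical (they are not the review's item-level data) and the function names are illustrative. It also shows that when nearly all items fall into one category, very high observed agreement can coexist with a moderate κ; whether this explains the values reported here is an assumption, not something the text states.

```python
def observed_agreement(table: list) -> float:
    """Proportion of items on which both raters gave the same rating."""
    total = sum(sum(row) for row in table)
    agreed = sum(table[i][i] for i in range(len(table)))
    return agreed / total


def cohens_kappa(table: list) -> float:
    """Cohen's kappa for a square table (rows: rater A, columns: rater B)."""
    k = len(table)
    total = sum(sum(row) for row in table)
    p_o = observed_agreement(table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Chance agreement: product of the raters' marginal proportions per category
    p_e = sum(row_totals[i] * col_totals[i] for i in range(k)) / total ** 2
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings of 1000 items: [[both yes, A yes/B no], [A no/B yes, both no]]
hypothetical_table = [[990, 3], [3, 4]]
print(observed_agreement(hypothetical_table))
print(round(cohens_kappa(hypothetical_table), 2))
```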

All studies employed PROMs; however, the reporting of validity and reliability of these outcomes was deemed adequate in only 26 (22%) of the studies. Complete quality assessment scores are provided in Additional file 8 and a summary is provided in Table 3. Blinding of those assessing data was poorly addressed in all but six studies (5%) and only six studies (5%) provided clearly identifiable time points in which all follow-up outcomes related to analogous time frames. Although the mean age of participants in all studies met the current inclusion criteria, 108 studies (90%) included some participants outside this age range or failed to report sufficient information.

Main Findings

Large effect sizes for patient-reported physical activity (where able to be calculated) were seen in all studies at latest follow-up for the HOS, HOOS, HAGOS and iHOT-33 subscales, with the exception of ten study groups for the HOS [44, 80, 85, 97, 98, 138, 142, 144, 146, 147] and one for the HAGOS [17], in which effect sizes were moderate pre- to post-arthroscopy. In assessing progress between two post-arthroscopy time points, Kemp et al. [89] determined a small effect size for the HOOS-SR. The direction of change was consistently toward improvement across studies. Table 4 summarises the range (minimum SPD and maximum SPD) of effect sizes for each outcome across all studies. The full set of SPD results is contained in Additional file 4.

Table 4 Range of effect sizes for each instrument across all studies (pre- to post-arthroscopy)

Pre- to post-arthroscopy change in the HSAS was assessed in four studies [6, 99, 118, 131]. No effect and small effect were evident at 6 months post-arthroscopy in the RCT conducted by Bennell et al. [131] compared to a moderate effect size at 6 months post-arthroscopy reported by Sansone et al. [118] (SPD [95% CI]; 0 [−0.79 to 0.79]; 0.12 [−0.89 to 0.65]; −0.63 [−0.94 to 0.33] respectively). Two studies [6, 99] showed small effect sizes at approximately 2 years (SPD [95% CI]; −0.33 [−0.49 to 0.16]; −0.41 [−0.48 to 0.34]). Bennell et al. [131] was the only study to assess pre- and post-arthroscopy Tegner scores, finding large-to-moderate effect sizes at 6 months post-arthroscopy (SPD [95% CI]; −0.90 [−1.74 to 0.07]; −0.64 [−1.43 to 0.15]).

A visual representation of all HOS-SS outcome scores is presented in Fig. 2. Two studies [49, 100] had outcome scores sitting below the MCID and PASS scores (3% of all included data points). Sixty percent of outcome data points failed to reach the PASS score. For data points relating to a follow-up duration of ≥ 25 months, 64% failed to reach the PASS score.

Data were pooled for HOS-SS, HOOS-SR, HAGOS-SR and iHOT-33 SR and grouped according to time frame (Fig. 3). A large effect was evident for SPDs at each time frame (SPD [95% CI]; −1.22 [−1.41 to −1.03]; −1.06 [−1.24 to −0.88] and −1.35 [−1.61 to −1.09] at 6–12 months, 13–24 months and ≥ 25 months, respectively). Considerable heterogeneity was evident between studies in all time frames (I² 79% to 92%).

Fig. 3 Pooled effect sizes of pre- to post-arthroscopy including Hip Outcome Score-Sport Scale (HOS-SS), Hip disability and Osteoarthritis Outcome Score-Function in Sport and Recreation (HOOS-SR), The Copenhagen Hip and Groin Outcome Score-Physical Function in Sport and Recreation (HAGOS-SR) and International Hip Outcome Tool-Sports and Recreational activities (iHOT-33 SR) at 6–12 months (a), 13–24 months (b) and ≥ 25 months (c) post-arthroscopy, showing standard paired difference (SPD) and 95% confidence intervals (CI). Weightings relate to study size. Randomised controlled trials are indicated with *

Eight studies [73,74,75, 95, 116, 124, 126, 154] reported quantified changes in physical activity. Methods used in these studies were largely sport-specific, e.g. change in swimming distances pre- to post-arthroscopy [74] or number of holes of golf played per week [126]. Decreases were evident in all measures, although this change was not statistically significant in five of the studies [73,74,75, 116, 126]. Significant decreases were reported in running mileage [95] (P < 0.001) and sport frequency [124] pre-injury to post-arthroscopy. Kierkegaard et al. [154] identified a self-reported four-fold increase in hours of physical activity per week, but no significant differences were reported between pre-arthroscopy and 1 year post-arthroscopy for accelerometry-derived activity data such as the percentage of time spent undertaking moderate or high physical activity, step count or percentage of time running (Additional file 4).

Discussion

This systematic review evaluated the impact of hip arthroscopy, undertaken for hip-related pain and dysfunction, on the physical activity of young and middle-aged adults. A limited range of relevant outcomes was reported, with PROMs, specifically the HOS-SS, predominating and only one study using objective measures to monitor physical activity. Consistency was seen across PROMs for improvements post-arthroscopy; however, the majority of HOS-SS scores did not reflect a patient-acceptable symptom state. In interpreting the evidence, it should be noted that considerable heterogeneity was evident between study designs and eligibility criteria. The majority of studies (78%) were retrospective, with a preponderance of level IV evidence, and thus had the potential to inflate positive outcomes and effect sizes.

Pooled data showed large effect sizes for the PROM subscales included in the analysis (HOS-SS, HOOS-SR, HAGOS-SR, iHOT-33 SR), depicting improvements in patients’ perceived difficulties with sport-related activities. This was consistent within each time frame for data covering 6 to ≥ 25 months post-arthroscopy. Across all pooled data, four studies demonstrated extreme positive effects. Three of these studies [54, 101, 120] involved participants undertaking high-level physical activity with elevated post-arthroscopy scores. Conversely, Michal et al. [102] reported very low pre-arthroscopy scores in a cohort who underwent surgery for subspinal decompression. Excluding these studies from the analysis did not alter the large pooled effect sizes. While the pooled data reflect a positive trend of patient-reported improvements in relation to physical activity impairments, isolated analysis of the HOS-SS raised questions about whether the magnitude of improvement was sufficient to be perceived by patients as satisfactory recovery of physical activity. The failure of 64% of reported HOS-SS scores to meet the PASS level for this scale beyond 2 years post-arthroscopy echoes previously identified deficits in the HAGOS-SR and HAGOS-PA scores for patients at 1 year post-arthroscopy compared to their healthy peers [18]. These findings should encourage clinicians to monitor and support patients’ return to physical activity for extended time spans following hip arthroscopy. The heterogeneity of the study cohorts, in relation to number of participants, age range, diagnosis, surgical procedures, physical activity background and time point at which data were gathered, potentially underlies the spread of outcomes depicted in Fig. 2, although this speculation also requires further investigation into the suitability of the outcome measure for the population.

Our findings indicate the need for more in-depth analysis of the impact of surgery on sport and activity involvement at an individual level. The limited range of outcomes utilised within studies was insufficient to answer questions about how much activity patients are undertaking and at what level of involvement. Despite the rising interest in and accessibility of wearable technology in health and fitness [156], and the increasing use of activity monitors within health research [24, 157], we found only one study utilising objective monitoring of physical activity for hip arthroscopy patients. Without the collection of more robust data to identify the type and quantity of activity undertaken, we are unable to determine if patients are participating in sufficient physical activity to meet guidelines of minimal activity requirements for health.

The limited range of frequently used PROMs identified in the current review reflects the findings of Reiman et al. [8] and Renouf et al. [158]. Both these reviews identified that PROMs with appropriate clinimetric evidence to support their use in the population of young to middle-aged adults with hip-related pain and dysfunction, such as the iHOT-33 and the HAGOS, were utilised in less than 5% of studies assessing outcomes following hip arthroscopy and surgery for FAIS. The utility of the HOS-SS in this population has yet to be clearly established. In a recent review of PROMs for hip-related pain [20], the HOS was not recommended as it lacked content validity, an issue that likely also applies to the individual subscales. As Kemp et al. [21] also observed ceiling effects for the HAGOS-PA subscale, limiting its ability to identify improvements over time in hip arthroscopy patients, further research is needed to identify which PROMs are best suited to capture physical activity gains in this cohort. PROMs that provide information on levels of activity, such as the HSAS and the Tegner, were also infrequently utilised. The HSAS was assessed in four studies [6, 99, 118, 131], identifying no to moderate effect at 6 months [118, 131] and small effect sizes at approximately 2 years post-arthroscopy [6, 99]. Although the number of studies is limited, the smaller effect sizes may be indicative of less profound changes in relation to improvements in activity levels following surgery. Similarly, although only seven of the included studies sought to quantify the amount of activity undertaken in specific sports, the negative trends depicted indicate the importance of tracking more than one domain of physical activity. This is reiterated in the findings of Kierkegaard et al. [154], with the lack of agreement between objective and subjective reports of activity change.

Only a quarter of the studies reported on the activity profile of participants, although information about the type of activity undertaken would be of value in identifying potential barriers and facilitators to physical activity participation post-arthroscopy.

This study offers insights into the effect of hip arthroscopy on physical activity, based on a comprehensive search strategy across eight databases utilising a rigorous screening and review process; however, there are a number of limitations that should be acknowledged. The methodological quality of the included studies was variable, many being retrospective studies with low participant numbers. This may increase potential for bias and magnification of positive effects [159]. Additionally, a number of studies were based on reviews of archived databases. The reliability of evidence emanating from these sources depends upon the quality of the database. National registries such as those developed in Sweden and Denmark, for which criteria, planning, monitoring and ongoing quality assurance are transparent [3, 160], provide data with high external validity. While single-site/single-surgeon registries offer a convenient tool for internal audit, the external validity and applicability of these data in the wider field are limited. When pooling study data in this review, a conservative approach was taken to data that were potentially derived from the same database. While this reduced the number of studies contributing to the pooled data, it minimised the potential for data from the same participant to be duplicated in the analysis. It should be noted that in the visual representation of all HOS-SS outcomes, all studies were included. The high incidence of the HOS-SS may be an artefact of the number of studies emanating from North America and the dominance of a limited number of surgical centres, exacerbated by the omission of non-English language studies in this review. The predominance of North American studies also limits the cultural perspective of the data, with potential biases arising from influences on the manner in which participants complete patient-reported outcomes.

Conclusion

The current level of information regarding physical activity for post-arthroscopy patients is limited in scope. Within the framework of patients’ perceived difficulties with sport-related activities, there is a consistent trend of post-arthroscopy improvement. However, the limited percentage of study participants achieving a score commensurate with ‘feeling good’, rather than ‘feeling better’, indicates a need for more in-depth analysis to identify potential barriers and facilitators, both physical and psychological, to achieving a more satisfactory return to physical activity.

Although the HOS-SS was the most frequently utilised PROM in this review, questions remain regarding its utility for this cohort. A greater range of outcome measures is needed to identify changes in other domains of physical activity. Objective measures, such as step count data, are rarely utilised in studies despite their use in contemporary practice, and warrant further investigation.

This review generates a compelling case for higher quality, sufficiently powered observational studies and RCTs. While RCTs remain the gold standard, purposefully designed, quality controlled, multicentre or population-level databases offer the opportunity for large-scale, comprehensive data collection. However, a more expansive view of physical activity profiles needs to be established with the routine collection of data about type and volume of physical activity undertaken beyond the traditional focus on ‘sport’-related physical activity.