Introduction

Evaluation of patient-reported outcome (PRO) in early osteoarthritis (OA) of the knee is difficult [1,2,3,4], with a large range of severity measurements and a variety of symptomatic criteria that patients present with [5, 6]. However, objective measurement of quality of life in mild or moderate OA is of growing interest and can play an important role in the development of joint preservation therapy [5, 7, 8]. Conventional scoring systems are based on objective parameters such as range of motion or radiographic findings, which reflect the surgeon’s point of view rather than the patient’s.

The “Forgotten Joint Score” (FJS) was originally developed as a measurement tool in patients after arthroplasty of the hip or knee joint [9]. Awareness of the joint during activities of daily living is the central criterion of the FJS [10]. Considering the patient’s evaluation of the loss of awareness of the knee joint is a paradigm shift in PRO measurement [9, 11,12,13,14] relative to more traditional measurements of pain or activity level. Conventional scoring instruments frequently show ceiling effects leading to limited content validity [14]. For evaluation of further therapeutic improvements, it will be necessary to discern between good and excellent results. Reflecting the patient’s joint awareness, the FJS has shown high discriminative power in patients after arthroplasty of the hip and knee [15]. Therefore, the interpretation of the patient’s joint awareness measured by the FJS is seen as a new dimension in PRO measurement.

Established measurement tools focus on one of the two major patient groups in knee surgery: First, young and physically active patients sustaining sports injuries without any signs of OA [16,17,18,19], and second, older patients with advanced OA of the knee designated for knee arthroplasty [20,21,22]. Recently, a study by Behrend et al. [23] demonstrated that the FJS is a viable instrument for PRO measurement in patients after anterior cruciate ligament (ACL) reconstruction. The FJS could serve as an ideal PRO measurement for other sports-related knee injuries resulting in increased risk of developing OA. Accordingly, the FJS could become an invaluable measurement tool in evaluating long-term outcomes in patients sustaining tibial plateau fractures, who are predisposed to posttraumatic OA of the knee joint. In this study, we intended to investigate the relationship between the FJS and mild to moderate posttraumatic OA at long-term follow-up. For this reason, we chose to validate the score in a specific patient population. A group of patients after knee joint fracture with long-term follow-up seemed to be feasible in this context.

The purpose of this study was to validate a German version of the “Forgotten Joint Score” (FJS) according to the COSMIN (COnsensus based Standards for the selection of health status Measurement INstruments) checklist. For determination of construct validity, we investigated the correlation between the FJS and long-term radiographic development of OA as measured by the Kellgren-Lawrence score (KLS) in patients after surgical treatment of tibial plateau fractures following a skiing accident.

Materials and methods

The COSMIN checklist (COnsensus based Standards for the selection of health status Measurement INstruments) is a consensus-based checklist to evaluate the methodological quality of studies on measurement properties of health status measurement instruments based on an international Delphi study in 2010 [24]. The COSMIN checklist was utilized in this study to ensure high methodological quality [25]. This study was carried out in accordance with the Declaration of Helsinki and approved by the ethics committee at the University of Regensburg in December 2015 (Institutional Review Board Number 15–101-0241). We obtained written informed consent from all study participants.

Study design

We identified 108 consecutive German-speaking patients who sustained an intra-articular tibial plateau fracture in a skiing accident between 03/2000 and 12/2006 (T0).

Inclusion criteria were:

  1) Patients with history of undergoing open reduction and internal fixation (ORIF) of an intra-articular tibial plateau fracture,
  2) No relevant concomitant injuries,
  3) No preexisting mental disorder,
  4) Minimum follow up was 8 years past trauma,
  5) Age between 18 and 70 years,
  6) Minimum light sports activity level (Tegner Activity Scale ≥3) at time of injury,
  7) Sufficient German reading and comprehension capacity, and
  8) Consent to participate in this study.

77 patients met the inclusion criteria. For characterization of the patient population, we recorded relevant clinical data and reviewed pre- and initial postoperative x-rays (T0). For the validation study (T1 and T2), the patients were asked to answer the following questionnaires according to their current status and return the forms by mail. We reminded all patients who did not answer within two weeks by telephone. For evaluation of test–retest reliability, the patients completed a second questionnaire after a minimum of two weeks (T2).

Materials

Forgotten joint score knee (FJS)

The FJS is a self-administered questionnaire comprising 12 items concerning the patient’s lack of awareness of the knee joint in everyday life [9]. The loss of awareness of a joint is widely regarded as the ultimate goal in achieving maximum patient satisfaction [9]. Developed in 2012, the FJS has shown high internal consistency, construct validity and responsiveness in long-term PRO [9, 11, 12, 15, 23, 26,27,28,29,30]. The FJS has been validated in patients after arthroplasty of the knee or hip, and after ACL reconstruction [23]. The total score ranges from 0 (low degree of forgetting) to 100 (high degree of forgetting).

Lysholm knee scoring scale (LH) [3]

The LH is a well-established 8-item PRO tool to evaluate the functional status of the knee in physically active patients [19]. The values of the individual questions are summed to give a total score ranging from 0 points (representing extreme limitations and worst outcome) to 100 points (representing full function and best outcome). The score has been previously validated in German [31].

Tegner activity scale (TAS)

The TAS is a 10 level activity scale reflecting the patient’s currently highest level of sports activity or other routine activities [18]. It was designed to complement other functional scores for the knee joint, and is the most commonly used activity-scoring tool for patients with knee disorders. A German version is available [32].

EuroQol-5D 3L (EQ-5D)

The EQ-5D is a global quality of life questionnaire consisting of a 5-item assessment of the health status regarding mobility, self-care, usual activities, pain/discomfort, and anxiety/depression combined to an EQ Index ranging from −.21 (low quality of life) to 1.00 (high quality of life) [33]. The second part of the EQ-5D consists of a visual analogue scale (EQ VAS) concerning the patient’s assessment of the current global health status ranging from 0 (worst health status) to 100 (best health status).

Subjective assessment

The patient was asked at T2 to rate whether the condition of his or her knee joint was ‘better’, ‘somewhat better’, ‘unchanged’, ‘somewhat worse’ or ‘worse’ compared to T1. This item was used as the anchor variable for the test-retest reliability of the FJS.

Radiologic assessment

Radiologic assessments were based on plain radiographs of the knee in two planes. We evaluated preoperative x-rays, postoperative control x-rays (T0) and radiographs at the time of follow-up (T2). A single experienced independent observer evaluated the degree of degeneration according to the clinically relevant classification of the KLS: 1) no OA (KLS = 0), 2) mild OA (KLS = 1 or 2), 3) severe OA (KLS = 3 or 4) [34]. These parameters were rated at three time points and separately for all joint compartments (medial, lateral, and patellofemoral).

Statistical analysis

Statistical analysis was performed using the software package SPSS (Version 24, SPSS Inc., Chicago, Illinois). The level of significance was defined at p < .05 for all tests. Descriptive data are given as frequencies (n) and percentages (%) for categorical variables, means (m) and standard deviations (±) for continuous, normally distributed variables, and medians (med) and quartiles (Q1/Q3) for continuous variables that were not normally distributed. Normal distribution was assessed with the Shapiro-Wilk test.

Methodological testing according to the COSMIN checklist

Studies evaluating measurement properties have to meet a high methodological quality [25]. The COSMIN checklist (COnsensus based Standards for the selection of health status Measurement INstruments) is an international consensus-based checklist to evaluate the methodological quality of health status measurement instruments [24]. Based on the COSMIN checklist, we evaluated the reliability (internal consistency, test-retest reliability) and validity (construct validity, clinical validity, content validity) of the FJS.

Internal consistency is described as the degree of interrelatedness among items [35]. Sufficient internal consistency was assumed for a Cronbach’s α > .70 [25]. Test–retest reliability is the extent to which results of the same patient in the same health condition remain unchanged over time [35]. According to the recommendation of the COSMIN guidelines, the retest was performed after a minimum of two weeks after primary consultation to avoid recollection of the answers and relevant changes in health condition. Intraclass correlation coefficient (ICC) was calculated for all patients indicating an unchanged condition of their knee joint since the primary evaluation. For an ICC > .70 sufficient test-retest reliability was assumed [25].
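The two reliability measures described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SPSS procedure used in this study: the function names `cronbach_alpha` and `icc_agreement` are hypothetical, and because the ICC model is not specified in the text, the sketch assumes a two-way random-effects, absolute-agreement, single-measurement ICC, often written ICC(2,1).

```python
from statistics import mean, variance

# Cut-off for both Cronbach's alpha and the ICC, as stated in the text [25].
SUFFICIENT = 0.70


def cronbach_alpha(item_scores):
    """Cronbach's alpha; one row per patient, one column per questionnaire item."""
    k = len(item_scores[0])
    # Sample variance of each item across patients.
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Sample variance of the patients' summed scores.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def icc_agreement(ratings):
    """ICC(2,1), an assumed model; one row per patient, one column per time point."""
    n, k = len(ratings), len(ratings[0])
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]
    # Two-way ANOVA sums of squares: patients, time points, and total.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
```

In this sketch, `cronbach_alpha` would receive the 12 FJS item scores of each patient, and `icc_agreement` the FJS totals at T1 and T2 for patients who reported an unchanged knee condition. Either result would then be compared against `SUFFICIENT`. A different ICC model would give somewhat different values for the same data.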

Since there is no gold standard in the measurement of PRO, validity was rated as construct, clinical and content validity. Construct validity is the degree to which the score of the FJS is consistent with the scores of questionnaires (LH, TAS, EQ Index, EQ VAS) intended to measure the same construct (congruent validity) [35]. Construct validity was measured by Spearman’s rank correlation. Correlation coefficients ≥ .40 were taken to indicate congruent validity. Clinical validity of the FJS was measured by known-groups comparisons: the Kruskal-Wallis test was used for differences between OA degrees, and the Mann-Whitney U test was used for differences between symptomatic and asymptomatic patients. Content validity is met by the absence of floor and ceiling effects. If more than 15% of patients achieve the highest (100; ceiling effect) or lowest (0; floor effect) possible FJS value, extreme outcome values might not be represented adequately and the questionnaire might not be able to reflect changes [25].
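As a minimal sketch of two of the criteria above, the following Python code computes Spearman's rank correlation (as the Pearson correlation of tie-averaged ranks) and checks the 15% floor/ceiling rule. The function names are hypothetical and the code only illustrates the calculations; it is not the SPSS routine used for the analysis.

```python
from statistics import mean

CONGRUENT_CUTOFF = 0.40   # correlation cut-off named in the text
EXTREME_LIMIT = 0.15      # floor/ceiling limit named in the text [25]


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values starting at i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation between two equally long score lists."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def has_extreme_effect(n_at_extreme, n_total, limit=EXTREME_LIMIT):
    """True if more than `limit` of patients scored the extreme value."""
    return n_at_extreme / n_total > limit
```

For example, `has_extreme_effect(8, 77)` returns `False`, because 8 of 77 patients is about 10%, below the 15% limit. This assumes that all 77 included patients completed the FJS at T1.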

Results

Demographic data and generalizability

Demographic and clinical data

77 patients (51% women) after surgical treatment of tibial plateau fractures following a skiing accident were included in the study. All patients were treated operatively at Spital Davos (CH) with open reduction and internal fixation (ORIF) 1.4 days ±1.2 (range 0–6) after the accident. For stabilization, 40% received only compression screw fixation, and 60% received an angular stable plate osteosynthesis with or without additional compression screw fixation. Operative management was carried out according to the AO-principles. The postoperative regimen was equal for all patients with partial weight-bearing for 6 weeks. The median time span between accident (T0) and first FJS assessment (T1) was 13 years (Q1/Q3 = 12/15, range = 9–13). The mean age at T1 was 63.2 ± 12.2 years (range 36–87). Figure 1 shows two example patients 9 and 12 years after a tibial head fracture.

Fig. 1 Radiographs showing preoperative, postoperative, and long-term condition after tibial head fracture. Patient 1: nine years after a bicondylar tibial head dislocation fracture. Patient 2: 12 years after a lateral depression type tibial head fracture

Reliability

Cronbach’s alpha of .96 showed high internal consistency for the FJS. The item total correlation ranged between .95 and .96. The ICC(67) was .91 (95%-CI = .85, .95) for all patients indicating an unchanged condition of their knee joint since their primary evaluation (T1). The median time span between first (T1) and second (T2) FJS assessment was 26 days (Q1/Q3 = 24/32, range = 2–113).

Validity

There was no floor effect (no patient had a minimum score of 0) and no relevant ceiling effect (10% of patients (n = 8) had a maximum score of 100) for the FJS (T1).

Construct validity (T2) could be confirmed between FJS and LH (rs = .71, p < .001) as well as between FJS and EQ VAS (rs = .51, p < .001), indicating that these questionnaires/scales measure the same construct. The correlation coefficient between FJS and EQ Index (rs = .35, p = .002) fell short of the cut-off of ≥ .40, indicating that the scales are not conceptually related. TAS showed a low but significant correlation with FJS (rs = .28, p = .013): the higher the activity, the higher the degree of forgetting the joint.

The Kruskal-Wallis test demonstrated significant differences between groups of patients with different degrees of OA in FJS values at T2 (H(2, 75) = 6.370, p = .041). Figure 2 shows the relation between KLS and FJS. At T2, asymptomatic patients had significantly higher FJS values (med = 81.3, Q1/Q3 = 62.0/91.7) than symptomatic patients (med = 54.2, Q1/Q3 = 41.7/75.0, p < .001).
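The Kruskal-Wallis statistic reported above can be illustrated with a short Python sketch. The function names are hypothetical, and the p-value uses the chi-square approximation, which for three groups (2 degrees of freedom) reduces to exp(−H/2). This is an assumption about how the p-value was obtained, not a reproduction of the SPSS output.

```python
from math import exp


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the standard correction for ties."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    # Collect the 1-based sorted positions of every distinct value.
    positions = {}
    for pos, v in enumerate(pooled, start=1):
        positions.setdefault(v, []).append(pos)
    # Tied values share the mean of their positions as rank.
    rank_of = {v: sum(p) / len(p) for v, p in positions.items()}
    h = 12 / (n * (n + 1)) * sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)
    ties = sum(len(p) ** 3 - len(p) for p in positions.values())
    return h / (1 - ties / (n ** 3 - n))


def chi2_upper_tail_df2(h):
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return exp(-h / 2)
```

Under this approximation, `chi2_upper_tail_df2(6.370)` evaluates to about .041, which is consistent with the reported p-value for the three OA groups.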

Fig. 2 Relation between Kellgren-Lawrence Score (KLS) and Forgotten Joint Score (FJS) at T2

Discussion

In this prospective study, we demonstrate that the FJS is a valid and reliable PROM-tool in patients after surgical treatment of tibial plateau fractures following a skiing accident. The FJS correlated with the radiologic degree of joint degeneration at long-term follow-up (Kellgren-Lawrence Score) and was able to distinguish between clinically symptomatic and asymptomatic patients. This is the first study following the complete COSMIN checklist validating FJS in long-term results after joint fracture.

Study design and patient population

Early OA of the knee joint is defined as knee pain with radiographic changes or arthroscopically visualized cartilage damage [36]. Early OA is a disabling condition with morphologic degenerative changes, but with a certain capacity for compensation and regeneration [5, 8, 37]. Patients with mild to moderate OA can present with a variety of signs and symptoms. Moreover, the kinetics of joint degeneration vary greatly, which makes it difficult to characterize this patient population and to compile comparable study populations [5, 8, 36]. If OA is the consequence of an acute event, as in posttraumatic OA, a pro-inflammatory response is triggered initially in addition to the osteochondral injury. After the remodeling of damaged cartilage areas, there can be a long asymptomatic steady state in post-traumatic joint disease before further progression of degenerative disease. Only 12% of patients with OA of the knee have a relevant knee injury in their medical history [38]. However, this patient population represents an ideal opportunity to study early indicators of the progression of OA. Hence, we chose to validate the FJS as a long-term PRO after tibial plateau fracture. To minimize confounding, we set our inclusion criteria to ensure a homogeneous patient population with a similar level of activity. Our demographic data are comparable to other studies on sports-related tibial plateau fracture with an average age around 50 years [39]. Originally, the FJS was designed for patients after arthroplasty of the hip and knee joint [9]. Thienpont et al. [26] validated the score for patients with advanced OA designated for arthroplasty and recorded a mean preoperative FJS of 24 points, indicating significant joint awareness in advanced OA of the knee. Unfortunately, they do not provide data on the preoperative radiologic degree of OA.

Although originally utilized for older patients, the FJS has been shown to be equally reliable in younger patients in recent studies [23, 26]. Behrend et al. [23] recently published validation data of the FJS on mid- and long-term results after ACL-reconstruction in 115 patients, demonstrating an increased joint awareness of 20 points after ACL-reconstruction compared to matched healthy control subjects. Patients after ACL-reconstruction had a mean FJS-value of 71.6 (mid-term) and 70.1 (long-term). These results are comparable to the findings in the present study on long-term outcome after tibial plateau fracture, with a mean FJS of 70 points.

Reliability

The FJS has been validated in English and has been adapted into French, Dutch, Danish, Japanese, and German [9, 26,27,28,29,30]. All publications confirmed internal consistency with a Cronbach’s alpha of 0.95–0.97. In the present study, we recorded a Cronbach’s alpha of 0.96. According to Terwee et al. [25], a positive rating for internal consistency can be given if Cronbach’s alpha is between 0.7 and 0.95. Greater values reflect higher correlations among the items and might indicate redundancy of two or more items [25]. Cronbach’s alpha also depends on the number of items, with higher values for scores with more items. However, the FJS consists of only 12 items, so the number of items is unlikely to explain the high value. Rather, the concept of the FJS, in which every question asks about awareness of the joint, might be somewhat prone to high correlation among the items. Test-retest reliability for the FJS has been documented to be between 0.80 and 0.94 [27, 28, 30]. We could confirm excellent test-retest reliability with an ICC(68) of 0.91. We investigated a long-term result with a minimum follow-up of eight years after injury. Therefore, a stable medical condition can be expected, which makes the ICC relatively robust.

Validity

The LH and the TAS seemed most appropriate for evaluation of construct validity on a functional basis, because they are widely used and validated in German for sports-related injuries and arthroplasty patients [31,32,33].

A major issue in outcome measurement of the knee joint is the correlation between clinical and radiologic results. Especially in mild to moderate OA, conventional outcome measurement tools often fail to reflect the radiological status of OA [1, 40]. However, large cohort studies like the ROAD study [41] have shown that there is an impairment of disease-specific and generic health-related quality of life (HRQoL) scales [7, 41]. Considering this, a PRO-measurement tool for early posttraumatic OA should reflect the disease-specific impairment of HRQoL.

The FJS showed good correlation to the Kellgren-Lawrence score in our patient population and was able to distinguish between symptomatic and asymptomatic patients. In addition, we saw significant differences in FJS values between groups of patients with no OA (KLS = 0), mild OA (KLS = 1 or 2), and severe OA (KLS = 3 or 4) indicating good construct validity.

Limitations

The results of this study should be interpreted in light of some limitations. First, the study design specifically investigates mild to moderate posttraumatic OA, making our results less generalizable to primary OA. In addition, with this study design we could not control for degeneration due to natural aging of the joint or overuse. Another limitation is that no conservatively treated patients were included, as all patients in the study were managed operatively. However, the majority of intra-articular tibial plateau fractures are treated operatively.

Conclusion

This is the first study on validation of the FJS as a long-term indicator of progression to mild or moderate post-traumatic OA after intra-articular joint fracture. We demonstrate good psychometric properties in our patient population and confirm a correlation between the radiologic degree of OA and the disease-specific PRO-score result of the FJS. The FJS was able to distinguish between symptomatic and asymptomatic patients, as well as between mild and severe forms of radiographically diagnosed OA.