Introduction

Osteoarthritis (OA) is associated with joint pain and functional limitation and is a leading cause of disability among older people. OA is considered the most common form of arthritis, from which 15–18% of the population suffers [1]. Approximately 22% of the general population suffers from knee pain, and knee and hip pain are even more common in older people [2, 3]. This generally leads to consultation with a physician: e.g., 33% of the population with knee pain in the UK consults a general practitioner (GP) in primary care [4]. One reason for consultation is that patients with knee pain are looking for a definite diagnosis [5]. However, no clear clinical diagnostic tools are available for primary care. Diagnosis of OA is often based on radiological evidence and/or on recommendations formulated by OA experts active in secondary care [6].

The diagnosis of OA in patients suffering from knee or hip pain in primary care would become easier if well-defined criteria were used. The American College of Rheumatology (ACR) has developed different criteria for the classification of OA of the knee and hip in order to promote uniformity in reporting OA in epidemiological and intervention studies. These criteria were developed using combinations of clinical, clinical/laboratory, and clinical, laboratory and radiographic criteria [7, 8]. Although these criteria were developed primarily for epidemiological purposes rather than for clinical use, the ACR criteria are commonly used as a diagnostic tool in secondary care. Because the criteria were developed in secondary care with patients with (mostly) rheumatoid arthritis (RA) in the control group, these criteria might primarily distinguish patients with OA from patients with RA. Furthermore, it has been suggested that the criteria are probably mainly diagnostic for late-stage OA [9]. More uniform and early diagnosis of OA would provide a better window of opportunity for interventions and a clear diagnosis could also help to motivate patients for often-difficult lifestyle changes involved with such a diagnosis.

The present study aimed to evaluate how often subjects with knee and hip complaints fulfill the ACR criteria and whether those who do not will develop evident OA according to the ACR criteria for hip and knee OA. This study also aimed to determine predictive factors for the development of knee/hip OA according to the ACR criteria during 5-year follow up. These predictive factors may help to diagnose OA at an earlier stage in primary care and thereby promote earlier treatment according to established guidelines.

Patients and methods

Study design

The CHECK (cohort hip and cohort knee) study is a prospective cohort study of 1002 individuals who first presented with knee and/or hip pain. Details of the protocol are published elsewhere and a summary is presented below [10]. No ethical approval is required for a prognostic cohort without interventions in the Netherlands.

Study population

Patients who potentially fulfilled the inclusion criteria were invited to join the study when they visited their GP. In addition, participants were recruited through advertisements, articles in local newspapers, and via the website of the Dutch Arthritis Association. Individuals were eligible to participate if they had pain and/or stiffness of the knee and/or hip, were aged between 45 and 65 years, and had not yet consulted their physician for these symptoms, or the first consultation was within the preceding 6 months.

First presenters with pathological, previously diagnosed conditions that obviously explained the existing symptoms (e.g. other rheumatic disease, isolated tendinitis/bursitis, previous hip or knee joint replacement, congenital dysplasia, osteochondritis dissecans, intra-articular fractures, septic arthritis, Perthes’ disease, ligament or meniscus damage, plica syndrome or Baker’s cysts (sign of more advanced OA)) were excluded. Other exclusion criteria were co-morbidity that precluded physical evaluation and/or follow up for at least 10 years, malignancy in the last 5 years, and inability to understand the Dutch language [10].

Physicians at the participating centers checked whether referred patients and patients from their outpatient clinic fulfilled the inclusion criteria. All patients underwent radiographic assessment and a physical examination, and filled out an extensive questionnaire at baseline and at 2-year and 5-year follow up.

Outcome measures

OA of the hip/knee was determined using the ACR criteria for hip and knee OA [7, 8]. We determined the clinical classification criteria, and the combined clinical and radiographic classification criteria. The clinical classification of OA of the hip was determined using hip flexion measured during physical examination instead of the erythrocyte sedimentation rate (ESR), as ESR was only available at baseline. Therefore, we followed the alternative proposed by the ACR when the ESR is not available; these alternative criteria were reported to be equally sensitive and specific [8].

For the classification of knee OA, the clinical criteria and the combined clinical and radiographic criteria were determined. The criteria were first defined per joint (i.e. left and right hip/knee) separately. A participant was classified as having hip or knee OA when at least one of the two joints fulfilled one of the ACR criteria at 2-year and/or at 5-year follow up for the hip and the knee separately, e.g. patients fulfilling ACR criteria at 2 years and not at 5 years of follow up would be classified as patients with OA.
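The classification rule described above can be sketched in code. This is a minimal illustration only, not the study's actual implementation; the `JointVisit` structure, the `has_oa` function and the dictionary layout are hypothetical names introduced here. The rule is applied to the hip and the knee separately.

```python
from dataclasses import dataclass


@dataclass
class JointVisit:
    """ACR status of one joint at one follow-up visit (hypothetical structure)."""
    clinical: bool  # fulfils the clinical ACR criteria
    combined: bool  # fulfils the combined clinical/radiographic ACR criteria


def has_oa(visits):
    """Return True if at least one joint fulfils one of the ACR criteria
    at the 2-year and/or 5-year follow up.

    `visits` maps (side, follow_up_year) -> JointVisit for ONE joint type
    (hip or knee), e.g. ("left", 2).
    """
    return any(v.clinical or v.combined for v in visits.values())


# Example from the text: criteria fulfilled at 2 years but not at 5 years.
visits = {
    ("left", 2): JointVisit(clinical=True, combined=False),
    ("left", 5): JointVisit(clinical=False, combined=False),
    ("right", 2): JointVisit(clinical=False, combined=False),
    ("right", 5): JointVisit(clinical=False, combined=False),
}
print(has_oa(visits))  # True: the participant is classified as having OA
```

Under this rule a participant is never "unclassified" by a later negative visit, which matches the example given in the text.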

Predictors

The predictors assessed were factors available at consultation with the GP, and consisted of demographic factors (age, gender and body mass index (BMI)), anamnestic factors (site of pain, pain score in the last week and morning stiffness), co-morbidity (lower back pain, previous surgery in the knee or hip, use of analgesics, unilateral or bilateral hip or knee pain), factors from physical examination (pain on hip/knee flexion, pain and reduced range of motion (ROM) on internal rotation (ROM < 15° vs. ≥ 15°) and hip flexion (ROM > 115° vs. ≤ 115°), presence of Heberden’s nodules, palpable warmth of the knee, patellofemoral grinding, joint line tenderness, bony enlargement of the knee) and simple diagnostic tests such as Kellgren and Lawrence grade (K&L 0 vs. ≥ 1) on conventional radiography and ESR (< 20 mm/h). An overview of all tested variables (19 in the hip cohort and 21 in the knee cohort) is presented in Additional file 1: Table S1.

Data analysis

To reduce bias and improve efficiency, we performed multiple imputation of missing values at baseline. We generated 10 imputed datasets using chained equations implemented in the R routine MICE. All analyses were done separately on the 10 imputed datasets, and a weighted mean outcome (as proposed by Rubin) was calculated [11]. Separate logistic regression models were constructed for participants with hip or knee complaints at baseline who were not classified at baseline as having OA according to the ACR classification criteria for hip and knee OA. The predictors used are described in Additional file 1: Table S1. Because of the large number of measured predictors, a data reduction method was used. Predictors related to the outcome (p < 0.2) were divided into five categories (i.e. demographics, complaints and symptoms, co-morbidities, physical examination, and diagnostic interventions). For participants with knee or hip complaints, a multiple logistic or linear regression (enter method) analysis was performed per category with the predictors that were univariately associated with the outcome (p < 0.2). All predictors selected in the different categories were then entered into the final logistic or linear regression analysis to build the final model (p < 0.05). The results are presented as odds ratios (OR) with 95% confidence intervals (CI). Predictive values and likelihood ratios were calculated [12]. All analyses were performed with the SPSS software package (version 22.0.0.0).
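As an illustration of the pooling step, the sketch below combines one regression coefficient across imputed datasets using Rubin's rules. It is a simplified sketch under stated assumptions, not the analysis code of this study: the function names and the input numbers are hypothetical, and the confidence interval uses a normal approximation (z = 1.96) rather than the t reference distribution that Rubin's rules formally prescribe.

```python
import math


def pool_rubin(estimates, variances):
    """Pool one coefficient (e.g. a log odds ratio) over m imputed datasets.

    Returns the pooled estimate and its total variance:
    total = within + (1 + 1/m) * between.
    """
    m = len(estimates)
    q_bar = sum(estimates) / m                                # pooled estimate
    within = sum(variances) / m                               # mean within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total


def odds_ratio_ci(log_or, variance, z=1.96):
    """Convert a pooled log OR to an OR with an approximate 95% CI (assumes normality)."""
    se = math.sqrt(variance)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical log ORs and squared standard errors from 10 imputed datasets.
log_ors = [0.85, 0.90, 0.88, 0.92, 0.80, 0.87, 0.91, 0.86, 0.89, 0.84]
variances = [0.14] * 10

pooled, total_var = pool_rubin(log_ors, variances)
print(odds_ratio_ci(pooled, total_var))
```

The between-imputation term widens the interval in proportion to how much the estimates disagree across imputed datasets, which is how the uncertainty due to missing data enters the reported CI.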

Results

The baseline characteristics of the study population are presented in Table 1. Of the 1002 participants in the CHECK cohort, 79.0% were female and mean age was 55.9 years. Of the total study population, 58.7% (n = 588) had hip complaints, either stiffness or pain, at baseline. Of these, 27.6% (n = 162) were classified as having hip OA at baseline according to the ACR clinical criteria, 50.0% (n = 295) according to the combined clinical/radiographic criteria for hip OA, and 62.9% (n = 370) met either one or the other of these criteria. Knee complaints were reported at baseline by 82.7% of patients (n = 829), of whom 81.3% (n = 674) were classified as having knee OA at baseline according to the ACR clinical criteria, 73.1% (n = 606) according to the combined clinical/radiographic criteria for knee OA, and 91.7% (n = 760) met either one or the other of these criteria.

Table 1 Baseline characteristics of the CHECK study population

Predictive factors in participants with hip complaints and development of OA according to the ACR criteria

Of the 198 participants with hip complaints who were not classified as having hip OA at baseline according to the ACR clinical and/or combined criteria and were not lost to follow up, 80 fulfilled the ACR criteria at 2-year and/or 5-year follow up. Based on the 19 potential predictive factors measured at baseline, 8 univariately significant factors were included in the final multivariate logistic regression model. This model identified the following baseline factors: morning stiffness (OR 2.39; 95% CI 1.14–4.98, positive likelihood ratio (LR+) 1.56), painful internal rotation (OR 2.53; 95% CI 1.23–5.19, LR+ 1.71), hip flexion < 115° (OR 2.33; 95% CI 1.17–4.64, LR+ 1.47) and ESR < 20 mm/h (OR 2.94; 95% CI 1.13–7.61, LR+ 0.77) (Table 2).
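Reported odds ratios and confidence intervals can be checked for internal consistency on the log scale, where a Wald-type interval is symmetric around the estimate. The sketch below is illustrative only: the function name is hypothetical, and it assumes a normal-approximation 95% CI (z = 1.96), which may differ slightly from the interval obtained after pooling over imputed datasets.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Recover the implied point estimate and log-scale standard error from a CI.

    Assumes the interval is exp(log_or -/+ z * se), i.e. symmetric on the log scale.
    """
    implied_or = math.sqrt(lower * upper)                    # geometric mean of the bounds
    se_log_or = (math.log(upper) - math.log(lower)) / (2 * z)
    return implied_or, se_log_or


# Painful internal rotation: reported OR 2.53, 95% CI 1.23 to 5.19.
implied_or, se = log_scale_summary(1.23, 5.19)
print(round(implied_or, 2), round(se, 2))
```

For this predictor the implied estimate agrees with the reported OR of 2.53 to two decimals, as expected when the interval was built on the log-odds scale.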

Table 2 Multivariate regression analysis for hip OA at 2-year and/or 5-year follow up according to the ACR classification criteria (n = 198, 80 cases, a priori risk = 0.40)

Combinations of these factors provided even higher likelihood ratios. Individuals with both morning stiffness and painful internal rotation had an LR+ of 4.03 (positive predictive value (PPV) 0.73, negative predictive value (NPV) 0.64, negative likelihood ratio (LR-) 0.83). When individuals presented with morning stiffness, painful internal rotation and hip flexion < 115°, the LR+ was 15 (PPV 0.91, NPV 0.63, LR- 0.88). Addition of ESR < 20 mm/h as a predictor did not enhance the predictive value (LR+ 12.66, LR- 0.89, PPV 0.9, NPV 0.61).
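The relation between the a priori risk, the likelihood ratios and the predictive values reported above can be made explicit with a short worked example. This is a sketch of the standard odds-based conversion, not code from the study; the function name is hypothetical, and the a priori risk of 0.40 is taken from Table 2.

```python
def post_test_probability(prior, likelihood_ratio):
    """Convert a prior probability to a post-test probability via odds."""
    prior_odds = prior / (1 - prior)
    post_odds = prior_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


prior = 0.40  # a priori risk of hip OA in this subgroup (80 of 198)

# Morning stiffness plus painful internal rotation: LR+ 4.03, LR- 0.83.
ppv = post_test_probability(prior, 4.03)
npv = 1 - post_test_probability(prior, 0.83)
print(round(ppv, 2), round(npv, 2))  # 0.73 0.64

# Adding hip flexion < 115 degrees: LR+ 15.
print(round(post_test_probability(prior, 15), 2))  # 0.91
```

The computed values reproduce the reported PPV and NPV, which shows that the predictive values depend on the 40% a priori risk and would change in a population with a different baseline risk.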

Predictive factors in participants with knee complaints

A total of 64 participants with knee pain were not classified as having knee OA at baseline according to the ACR clinical and/or combined criteria and were not lost to follow up. Of these, 35 fulfilled the ACR criteria at 2-year and/or 5-year follow up. In this group, 21 potential predictive factors were measured at baseline (Additional file 1: Table S1). Age, morning stiffness, joint line tenderness and ESR < 20 mm/h were included in the final multivariate logistic regression model. In this small sample no variable was statistically significant. Morning stiffness in the knee lasting < 30 min had a positive likelihood ratio (LR+) of 4.97 (PPV 0.86, NPV 0.49, LR- 0.08) (Table 3).

Table 3 Multivariate regression analysis for knee OA at 2-year and/or 5-year follow up according to the ACR classification criteria (n = 64, 35 cases, a priori risk = 0.55)

Discussion

This study demonstrates that the majority of patients presenting for the first time with hip pain fulfill the ACR hip OA criteria (clinical or combined), and that 40% of patients not fulfilling those ACR criteria will develop evident OA according to the clinical or combined ACR criteria for the hip after 5 years. For this last subgroup we identified the following predictive factors: morning stiffness, painful internal rotation, hip flexion < 115° and an ESR < 20 mm/h. Combinations of these signs and symptoms have an even higher predictive value. In first presenters with knee pain, up to 92% fulfill the clinical or combined ACR criteria at baseline. For this reason, the number of participants with knee symptoms not fulfilling the ACR criteria was, in fact, too small to assess predictors of OA development. This study is unique in having such a large group of first presenters. We would like to argue that the CHECK cohort represents people in (Dutch) primary care presenting for the first time with hip and knee complaints and suspected of having early OA.

We were surprised by the large percentage of participants fulfilling ACR criteria at baseline in participants with hip complaints, and that this was even more pronounced in participants with knee complaints. In a previous open population-based knee pain cohort that included persons with chronic knee pain, 47% were not diagnosed with OA at baseline [13]. This proportion is larger than our proportion of participants without OA at baseline. This difference could be due to the younger age (mean age 45 years) and lower BMI in that cohort. In that same study, the majority (86%) of persons developed OA during the 12-year follow up [13]. In our study, a smaller proportion of participants with pain in the hip (40%) and knee (55%) were diagnosed with either hip or knee OA according to the ACR criteria during follow up. However, this result could be related to the shorter follow-up period in our study.

The predictive factors we identified to be associated with the development of hip OA are consistent with the previous literature. Morning stiffness and limited internal rotation are known predictors for total hip replacement in primary care [14, 15]. Age and pain levels, however, were not statistically significant in the final model in the current study whereas other studies found these to be predictive [14, 15]. This could be explained by our relatively young cohort with generally quite low pain levels (Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) pain score 27.2 on a scale of 0–100, numeric rating scale (NRS) 3.7, Table 1), as can be expected in a cohort with early OA. Limited hip flexion and ESR < 20 mm/h were not identified previously as risk factors for the development of hip OA (HOA) but were statistically significant in our final model. A possible explanation could be that higher ESR was related to inflammatory diseases at baseline that were not evident at the time of inclusion.

In contrast to previous studies, we were unable to identify predictors of knee pain developing into knee OA [16, 17], even when we performed a separate analysis for the clinical and the combined ACR criteria. In these subgroup analyses, too, no variables were significantly associated with development of knee OA, except for borderline significant results for morning stiffness. The large percentage of patients with knee pain who fulfill the ACR criteria at baseline is probably the main reason for not finding significant predictive factors based on OR. However, the predictive values show that morning stiffness would probably have had good prognostic value if we had had greater statistical power.

As expected, the criteria associated with fulfilling the ACR criteria at follow up, in either the combined or separate analysis for the clinical and the combined ACR criteria, were all sub-items of the ACR criteria. This indicates that pain in combination with one or more of these sub-items of the ACR criteria might be indicative of future OA.

There are currently no clear diagnostic criteria for OA in primary care, e.g. the ACR criteria are widely used in epidemiologic research but not validated in primary care. Most discussions focus on the use of radiographic outcomes [18]. For example, Kellgren and Lawrence (K&L) grade ≥ 2 is accepted as a cutoff for OA in epidemiological studies and (possibly) in secondary care. The cutoff of K&L grade ≥ 1 is useful in epidemiologic studies to predict progression, but its use is not advised in primary care because knee radiography has no additional value in the assessment of individual patients with knee pain [19,20,21,22]. However, in the present study we chose to examine not only clinical features but also radiographic features, because of the availability and still frequent use of radiography in primary care. Our study clearly showed that radiographic features do not predict fulfillment of ACR criteria, not even when assessed in subgroups of clinical or combined ACR criteria (data not shown).

The prevalence, incidence, and predictors of the incidence of OA are clinically important findings, because they imply that most persons aged 45–65 years presenting to a GP with no other hip or knee disease could be diagnosed with clinical OA at that time or are prone to developing clinical OA within the following years. This could help to provide a clear diagnosis, which contributes to early treatment according to the guidelines that are available for both hip and knee, whereas undiagnosed knee and/or hip pain is usually treated according to the best insight of the individual physician [23,24,25]. For patients diagnosed with OA, first-step treatments (e.g. education, lifestyle advice, and acetaminophen) should be started, due to their beneficial effects in the early stage of the disease process [26].

Our study offers a unique population to study hip and knee pain in first presenters, because the patients included are comparable with patients who would present to a primary care physician; this study therefore helps in addressing the diagnostic challenge of hip and knee pain in primary care. A limitation of our study is that a substantial number of variables were tested in the analysis. Due to the limited number of OA cases identified, we could justify testing only 2–5 variables per analysis per category when building the explorative models. However, clinically relevant variables were used (defined prior to our analyses) that were previously applied in epidemiological/clinical research and no new predictors were introduced. Further, data reduction methods were used by means of restrictions based on p values by pre-analyzing the predictors in their categories. Other predictors of OA could remain undetected due to this lack of power.

Conclusion

The majority of people presenting with hip pain for the first time fulfill the clinical or combined ACR criteria and 40% of the patients not fulfilling those ACR criteria will develop OA according to the clinical or combined ACR criteria for the hip after 5 years. Predictive factors for the development of HOA are morning stiffness, painful internal rotation, hip flexion < 115° and an ESR < 20 mm/h. In first presenters with knee pain, up to 92% already fulfill the clinical or combined ACR criteria. No predictive characteristics were identified for the development of knee OA in those not fulfilling ACR criteria.

Recommendations

We suggest that future studies validate whether patients with hip complaints aged > 45 years with the characteristics of morning stiffness, painful internal rotation, hip flexion < 115° and an ESR < 20 mm/h indeed have early OA. It also needs to be validated whether first presenters with knee complaints aged > 45 years indeed have early knee OA.