Hospital variation in revision rates after primary knee arthroplasty was not explained by patient selection: baseline data from 1452 patients in the Danish prospective multicenter cohort study, SPARK

Purpose Revision rates following primary knee arthroplasty vary by country, region and hospital. The SPARK study was initiated to compare primary surgery across three Danish regions with consistently different revision rates. The present study investigated whether the variations were associated with differences in the primary patient selection. Methods A prospective observational cohort study included patients scheduled from September 2016 to December 2017 for primary knee arthroplasty (total, medial/lateral unicompartmental or patellofemoral) at three high-volume hospitals, representing regions with 2-year cumulative revision rates of 1, 2 and 5%, respectively. Hospitals were compared with respect to patient demographics, preoperative patient-reported outcome measures, motivations for surgery, implant selection, radiological osteoarthritis and the regional incidence of primary surgery. Statistical tests (parametric and non-parametric) comprised all three hospitals. Results Baseline data were provided by 1452 patients (89% of included patients, 56% of available patients). Patients in Copenhagen (Herlev-Gentofte Hospital, high-revision) were older (68.6 ± 9 years) than those in the low-revision hospitals (Aarhus 66.6 ± 10 years and Aalborg (Farsø) 67.3 ± 9 years, p = 0.002). In Aalborg, patients had a higher Body Mass Index (mean 30.2 kg/m² versus 28.2 (Aarhus) and 28.7 kg/m² (Copenhagen), p < 0.001), were more likely to be male (56% versus 45 and 43%, respectively, p = 0.002), and exhibited fewer anxiety and depression symptoms (EQ-5D-5L) (24% versus 34 and 38%, p = 0.01). The preoperative Oxford Knee Score (23.3 ± 7), UCLA Activity Scale (4.7 ± 2), range of motion (Copenhagen Knee ROM Scale) and patient motivations were comparable across hospitals but varied with implant type.
Radiological classification ≥ 2 was observed in 94% (Kellgren-Lawrence) and 67% (Ahlbäck) and was more frequent in Aarhus (low-revision) (p ≤ 0.02), where unicompartmental implants were utilized most (49% versus 14 (Aalborg) and 23% (Copenhagen), p < 0.001). In the Capital Region (Copenhagen), the incidence of surgery was 15–28% higher (p < 0.001). Conclusion Patient-reported outcome measures prior to primary knee arthroplasty were comparable across hospitals with differing revision rates. While radiographic classifications and surgical incidence indicated higher thresholds for primary surgery in one low-revision hospital, most variations in patient and implant selection were contrary to well-known revision risk factors, suggesting that patient selection differences alone were unlikely to be responsible for the observed variation in revision rates across Danish hospitals. Level of evidence II, Prospective cohort study.


Introduction
Assessment of the quality of knee arthroplasty (KA) surgery is traditionally based on cumulative revision rates (CRR) [29]. According to data from national arthroplasty registries, CRRs differ significantly between countries, and large, statistically significant differences also exist within countries and between hospitals [7,33]. These observations are rarely discussed, and attempts to explain the variation often focus on implant selection. Data from the Danish Knee Arthroplasty Register show a statistically significant variation across the five administrative regions in Denmark for 1-, 2-, 5- and 10-year CRRs [42]. The CRR of the Capital Region has persistently been the largest, and lower rates have been seen with increasing distance from the capital, Copenhagen (Fig. 1). For instance, in 2015 when this study was initiated, the 2-year CRR was 5.0% in the Capital Region, 2.2% in Central Denmark Region and 1.0% in North Denmark Region [42]. Variations among regions or hospitals can occur by chance, but consistent differences in CRRs could indicate systematic differences in the indications for the primary procedure, patient demographics, the quality of surgery including implant selection, or indications for revisions, or combinations of these. Demographics, preoperative knee symptoms and the severity of radiographic knee osteoarthritis (OA) are all factors that are associated with the degree of postoperative patient satisfaction and the risk of revision [6,11,14,15,29,35]. These variables, however, have not specifically been compared across hospitals with varying revision rates following KA.
For the three Danish regions in question, register data provided no explanation for the CRR differences, and apart from undocumented assertions of cultural differences between regions, there were no hypotheses regarding the factors that might be responsible. This motivated the initiation of the prospective observational cohort study, SPARK ("Variation in patient Satisfaction, Patient-reported outcome measures, radiographic signs of Arthritis, and Revision rates in Knee arthroplasty patients in three Danish regions"). The present part of the SPARK study aims to compare patient characteristics, knee radiographs, implant selection and patient-reported outcome measures (PROMs) obtained before primary KA in a large hospital of each region, and to investigate whether hospital variations in patient selection were associated with the CRR differences. Postoperative outcomes will be reported in a separate publication.

Materials and methods
The National Committee of Health Research Ethics provided ethical approval (Protocol no. 16038343, 2 September 2016) and all patients gave their written consent to participate. Reporting adheres to the STROBE guidelines for observational cohort studies.

Patient inclusion
This prospective observational cohort study invited the largest knee arthroplasty university hospital in each of the three Danish regions that differed most in revision rates after KA surgery: Aarhus University Hospital in the Central Denmark Region, Aalborg University Hospital Farsø in the North Denmark Region and Copenhagen University Hospital Herlev-Gentofte in the Capital Region. Revision rates for each of the three hospitals were comparable to those for the region as a whole (Table 1) [42]. All hospitals were public (94% of primary KAs were performed in public hospitals in 2017) [43]. From 1 September 2016 to 31 December 2017, patients who were scheduled for primary KA, i.e., total (TKA), medial/lateral unicompartmental (MUKA/LUKA) or patellofemoral arthroplasty (PFA), were eligible for inclusion. Participation did not interfere with implant selection or surgical routines. Exclusion criteria were knee tumors, hemophilia, severe developmental lower limb deformities, dementia or language barriers that could not be overcome by help from relatives. Patients unable to answer questionnaires online were excluded, with the exception of the final 6 months of the inclusion period (July 2017-Dec 2017), during which participation via paper questionnaires was permitted.
Patients were recruited for the study by the surgeon (Aarhus and Aalborg) or by an employed medical student (Copenhagen). Two days later, patients received an email with a unique link to the preoperative PROM set or a letter with the same content. Up to two email reminders were sent, three days apart, if necessary. To avoid confusion among patients with bilateral knee trouble, the email specified that the knee scheduled for surgery was the object of the study. Patients planned for surgery on both knees could participate twice if the operations were conducted on separate occasions, while patients with simultaneous bilateral surgery were asked to choose which knee to participate with in advance [24]. Since PROMs were the cornerstone of this study, patients who failed to complete the questionnaire prior to surgery were excluded.
Post-hoc quantification of inclusion rates and demographic comparisons between participants and non-participants were conducted. As the time from inclusion to surgery varied, these analyses were based upon registered surgical activity during a certain time period (1 Jan to 31 Dec 2017) [36].
Patients reported their height and weight as well as additional health and lifestyle information, including their degree of urbanization ("city/suburb", "small town/village" or "countryside"), daily smoking ("yes"/"no"), and alcohol consumption (more or less than two standard drinks (12 g alcohol) per day). Patients were asked whether the knee was their main physical disability, and "How often do you take painkillers due to your knee?" with five answer options ranging from "more than once per day" to "rarely or never" (full wording in Table 3).

Radiographic classification of knee osteoarthritis
The severity of tibiofemoral OA was assessed in blinded preoperative weight-bearing postero-anterior knee radiographs with the knee flexed 15-30° [2]. Patients listed for PFA or LUKA and those with predominantly lateral OA on radiographs were excluded from this analysis because the radiographic basis for surgery could not be fairly assessed without tangential (Skyline, Merchant) or flexed (Rosenberg) views, respectively.
Two radiologists with expertise in musculoskeletal radiology viewed the radiographs in a random sequence. First, the Ahlbäck classification (0-5, 5 severe), and secondly, in a new round of random order, the Kellgren-Lawrence classification (K-L, 0-4, 4 severe) was recorded for each patient [1,13,17]. In case of disagreement, both radiologists reevaluated each radiograph together and reached a consensus. Using a novel heuristic-based method, radiographs were evaluated free of classifications by 13 experienced knee arthroplasty surgeons from all five Danish regions. Each surgeon was presented with the knee radiographs in pairs and was asked to choose the radiograph that they expected would cause the most severe knee symptoms, not considering any formal grading system but instead using their personal experience and heuristics, i.e., "rule-of-thumb". These thousands of comparisons resulted in a complete ranking of all radiographs [28].

Incidence of surgery and implant selection
The incidence of primary KA on a regional level was retrieved from the National Patient Register by NOMESCO procedure code KNGB (age > 40 years and subgroup 60-79 years). The CRRs for the hospitals (Table 1) were retrieved from the Danish Knee Arthroplasty Register (97% completeness). On an individual level, the medical record was consulted in case of a mismatch in laterality or implant type from inclusion to postoperative registration.

Statistics
Sample size and inclusion period were determined by clinical relevance and feasibility. Throughout the study period, around 1800 operations were anticipated and with a 75% inclusion rate and 80% response rate, 1080 responses would be ready for analysis. Any regional variations that were not detectable in a sample of this size were considered clinically irrelevant to the overall study question.
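The expected number of analysable responses follows directly from the planning figures quoted above. The short Python sketch below simply reproduces that arithmetic; the variable names are illustrative and do not come from the study protocol.

```python
# Planning figures quoted in the Statistics section.
anticipated_operations = 1800
inclusion_rate = 0.75   # share of operated patients expected to be enrolled
response_rate = 0.80    # share of enrolled patients expected to complete the PROM set

expected_included = anticipated_operations * inclusion_rate
expected_responses = expected_included * response_rate

print(round(expected_included))   # 1350
print(round(expected_responses))  # 1080
```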
All analyses were based on the null hypothesis that patient selection was identical across the three hospitals. Due to the explorative nature of the study, additional data-driven analyses were allowed [30]. All tests were unpaired as though each knee belonged to a unique participant [32]. OKS and EQ-5D data were treated as numeric variables [24], as were knee flexion and extension [21], while Ahlbäck, K-L, surgeons' ranking, and UCLA ratings were ordinal. A separate article describes the statistical details of heuristics-based assessment of radiographs using the Bradley-Terry model for paired comparisons [28].
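To illustrate how paired comparisons can be turned into a complete ranking, the following Python sketch fits a basic Bradley-Terry model with the standard minorisation-maximisation (MM) update. It is a generic textbook version under stated assumptions, not the implementation described in [28]: the function name, the toy comparisons and the fixed number of iterations are all hypothetical, and every item is assumed to win at least once and to be compared with at least one other item.

```python
from collections import defaultdict

def bradley_terry(comparisons, n_items, iterations=200):
    """Estimate Bradley-Terry strengths from (winner, loser) pairs via the MM update."""
    wins = [0] * n_items
    pair_counts = defaultdict(int)
    for winner, loser in comparisons:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1

    strengths = [1.0] * n_items
    for _ in range(iterations):
        updated = []
        for i in range(n_items):
            denom = 0.0
            for pair, n in pair_counts.items():
                if i in pair:
                    (j,) = pair - {i}
                    denom += n / (strengths[i] + strengths[j])
            updated.append(wins[i] / denom)
        total = sum(updated)
        strengths = [s / total for s in updated]  # normalise so strengths sum to 1
    return strengths

# Hypothetical toy data: "winner" = radiograph judged to be the more severe one.
toy = ([(0, 1)] * 3 + [(1, 0)] + [(1, 2)] * 3 + [(2, 1)]
       + [(0, 2)] * 3 + [(2, 0)])
strengths = bradley_terry(toy, n_items=3)
ranking = sorted(range(3), key=lambda i: -strengths[i])  # position 0 = most severe
print(ranking)
```

In this toy example, radiograph 0 is chosen as the more severe one most often and therefore ends up first in the ranking, mirroring the convention that rank 1 denotes the most severe radiograph.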
Unless otherwise specified, statistical tests compared all three centres, not one against the mean. The choice of significance test depended on the type and structure of the data: Chi-square test for dichotomized variables, unpaired t-test or one-way analysis of variance (ANOVA) for parametric variables, and Mann-Whitney U or Kruskal-Wallis test (> 2 groups) for nonparametric (ordinal) data. General linear regression models were used to estimate the effects of independent numerical variables on dependent variables, and when adjustment for confounders was relevant, multiple linear regression analyses were conducted (noted in text). Aarhus was selected as the reference hospital as it was situated between the two other hospitals in terms of geography, urbanization and CRR, i.e., the disparities known prior to inclusion. The level of significance was set to 0.05 (two-sided) and 95% confidence intervals (CI) were supplied when relevant. Data collection and Case Report Forms etc. were handled by Procordo Software Aps, Copenhagen. In Mar 2019, analyses were conducted using R (RStudio) [31].
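As an illustration of a test that compares all three centres at once, the sketch below computes a Pearson Chi-square test for a dichotomized variable across three hospitals. It is written in plain Python rather than R, which the authors used, and the counts are hypothetical toy values, not study data. With three rows and two columns the test has 2 degrees of freedom, for which the p-value has the closed form exp(−χ²/2).

```python
import math

def chi_square_3x2(table):
    """Pearson chi-square test for a 3 x 2 table of counts (df = 2)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    # With 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    p_value = math.exp(-statistic / 2)
    return statistic, p_value

# Hypothetical counts (yes / no) for three hospitals; NOT taken from the study.
toy_table = [[30, 70], [50, 50], [40, 60]]
statistic, p_value = chi_square_3x2(toy_table)
print(f"chi2 = {statistic:.2f}, p = {p_value:.3f}")
```

The closed-form p-value only holds for 2 degrees of freedom; tables of other dimensions would need a general chi-square distribution function from a statistics library.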

Results

Patient inclusion
Questionnaires were sent to 1704 patients (Fig. 2), 52 of them by letter. In 32 cases, the email address or laterality was wrong, or a technical error occurred, and 48 patients had their procedure cancelled or postponed beyond the research period. Consequently, 1624 patients received a questionnaire, and 1452 patients (89%) completed the PROM set at a mean of 29 days before surgery, spending an average of 12 min 30 s each. The 53 patients who participated with separate knees accounted for 7.3% of responses.
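The figures in this paragraph can be cross-checked with a few lines of arithmetic. The Python sketch below does so; the variable names are illustrative, and it assumes that each of the 53 patients who participated with separate knees contributed two responses, an interpretation that reproduces the reported 7.3%.

```python
questionnaires_sent = 1704
delivery_or_technical_failures = 32
cancelled_or_postponed = 48
completed = 1452
patients_with_two_knees = 53  # assumed to contribute two responses each

received = questionnaires_sent - delivery_or_technical_failures - cancelled_or_postponed
response_rate = completed / received
bilateral_share = 2 * patients_with_two_knees / completed

print(received)                   # 1624
print(f"{response_rate:.0%}")     # 89%
print(f"{bilateral_share:.1%}")   # 7.3%
```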
Males and younger patients more often agreed to participate in the SPARK study than females and older patients (Table 2). Further analyses (not shown) found that the distribution of implant types within each hospital did not differ between participants and non-participants (p ≥ 0.2, Chi-square). SPARK participants from Copenhagen had a mean age of 68.6 years, 1.4-2.0 years older than those in Aalborg (67.3 y.) and Aarhus (66.6 y.), respectively (p = 0.002, ANOVA) (Table 3). Male sex was more prevalent in Aalborg (56%) than in Copenhagen (43%) and Aarhus (45%) (p = 0.002, Chi-square). In Aalborg, males (68.8 y.) were 3.5 years older than females (65.3 y.) (CI 1-6, t-test), whereas in the other hospitals, there was no significant difference. BMI (mean 29.5 ± 5 kg/m²) was lower in older patients (−0.13 kg/m² per year, CI −0.16 to −0.11, linear regression) and higher in females (+0.69 kg/m², CI 0.2-1.2) as well as in Aalborg patients (+1.5-1.7 kg/m², p < 0.001), even after adjusting for age and sex (+1.4-1.9 kg/m², adjusted). There were no differences between hospitals for smoking, alcohol consumption, physical activity level (UCLA) or self-reported general health (EQ Index and VAS). Except for smoking, males reported significantly higher levels of these parameters (Table 3) as compared to females (significant on hospital level for UCLA and alcohol consumption only). In sub-analyses of EQ-5D-5L items, 76% of Aalborg patients were "neither anxious or depressed", compared to 66% in Aarhus and 62% in Copenhagen (p = 0.01, Kruskal-Wallis). This hospital difference was only significant among females (p females = 0.03, males = 0.3). The 41 patients who responded by letter (75.9 y) were 8.1 years older than those who responded via email (67.8 y) (CI 6-10, t-test) and 29 (71%) were female (54% in the email group, p = 0.05, Chi-square).

Patient-reported outcome measures (PROMs) at baseline
OKS at baseline did not differ among patients in the three hospitals (23.3 ± 7, p = 0.9, ANOVA) (Table 3, Fig. 3), even after adjusting for age, sex and BMI (multiple linear regression). The same was true for use of analgesics, knee flexion and the global knee anchor (Table 3). Extension deficits were more prevalent in Aalborg (62 vs. 45-46%, p = 0.007, Chi-square). Males scored 2.8 OKS points higher than females in all hospitals (CI 2-4, t-test) and reported less frequent use of analgesics, while the sex difference in the overall perception of the knee condition (global knee anchor) was not significant (p = 0.1, Mann-Whitney U). OKS was significantly lower in obese patients (BMI > 30; −2.6 points, CI −3 to −2, t-test) and in smokers (−1.5 points, CI −3 to −0.4, t-test). There were no hospital differences in patients' motivations for surgery (p ≥ 0.1, Chi-square), but stratification by implant and sex revealed significant variation (Table 4).

Radiographic classification of knee osteoarthritis
Exclusions were made for 50 PFA, 7 LUKA patients and 167 patients with predominantly lateral OA. 177 radiographs were unavailable due to logistical matters unrelated to the patient, leaving 1051 radiographs (86% of those possible) ready for analysis. The two radiologists reached a moderate interobserver agreement of 0.59 (weighted Kappa) for both K-L and Ahlbäck [17]. Prior to consensus, they disagreed in 29% (K-L) and 41% (Ahlbäck) of cases, respectively. The surgeons' heuristics-based evaluations (17,767 comparisons) ranked all radiographs from number 1 (most severe) to number 1051 [28].
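For readers unfamiliar with the agreement statistic, the Python sketch below computes a weighted Cohen's kappa for two raters grading on an ordinal scale. The text does not state which weighting scheme was used, so linear disagreement weights |i − j| are an assumption here, and the two grade lists are hypothetical toy data rather than the radiologists' readings.

```python
def weighted_kappa(ratings_a, ratings_b, n_categories):
    """Cohen's kappa with linear disagreement weights |i - j| for ordinal grades 0..n-1."""
    n = len(ratings_a)
    observed = [[0.0] * n_categories for _ in range(n_categories)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1 / n
    marg_a = [sum(row) for row in observed]
    marg_b = [sum(col) for col in zip(*observed)]
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            weight = abs(i - j)
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * marg_a[i] * marg_b[j]
    return 1 - observed_disagreement / expected_disagreement

# Hypothetical K-L grades (0-4) from two raters; NOT study data.
rater_1 = [0, 1, 2, 3, 4]
rater_2 = [0, 1, 2, 3, 3]
kappa_perfect = weighted_kappa(rater_1, rater_1, n_categories=5)
kappa_partial = weighted_kappa(rater_1, rater_2, n_categories=5)
print(kappa_perfect, round(kappa_partial, 2))
```

Weighting penalises a disagreement of several grades more heavily than a disagreement of one grade, which suits ordinal scales such as K-L and Ahlbäck.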

Incidence of surgery and implant selection
In the Capital Region, the incidence of primary KA surgery in patients aged 60-79 years in 2017 was 28% higher than in Central Denmark Region and 15% higher than in North Denmark Region (Table 5). A total of 22 surgeons treated the SPARK patients: 4 in Aarhus, 6 in Aalborg and 12 in Copenhagen. All surgeons were exclusively occupied with joint replacement surgery, except for five surgeons in training programs, who were responsible for fewer than six operations each and were evenly distributed among hospitals. With the exception of one surgeon in each hospital, the staffs had remained stable over the preceding years. Implant selection varied widely across hospitals (Table 3). Overall, MUKA patients (67.0 ± 9 y) were 1.7 years younger (CI 0.6-3, t-test) than TKA patients (68.8 ± 9 y), more likely to be male (52 vs. 44%, p = 0.01, Chi-square), had a lower BMI (28.1 vs. 29.2 kg/m², difference −1.1 kg/m², CI −1.7 to −0.5, t-test) and reported 1.4 points higher OKS (24.3 vs. 22.9, CI diff. 0.6-2, t-test) and 3.9 points better general health (EQ-VAS 64.5 vs. 60.6, CI 1-6, t-test). In Aarhus, which had the highest frequency of MUKA use (40% MUKA, 51% TKA), there was no difference in age, sex or BMI between the two patient groups (Table 6). In contrast, group differences were more pronounced in self-reported health (EQ-VAS), global knee anchor and patient-reported knee range of motion, e.g., preoperative flexion was 0.5 points better in MUKA patients (equivalent to approximately 5-10 degrees) [21].

Discussion
All hospitals had comparable preoperative PROM scores, indicating comparable symptom states prior to primary knee arthroplasty. Four findings in particular were unexpected in relation to commonly accepted revision risk factors: a very high percentage of patients from a low-revision hospital (Aarhus) were treated with unicompartmental implants, patients in both low-revision hospitals were younger than those in the high-revision hospital (Copenhagen), and the mean BMI and percentage of male patients were greater in one low-revision hospital (Aalborg) than elsewhere. Based on the literature, a higher risk of revision was expected in these four situations [9,11,29,39]. In contrast, the more
(< 18.5 kg/m²) comprised only 2 patients, who were thus included in the "normal" group). OKS Oxford Knee Score (0-48 version, 48 best). UCLA Activity Scale range 1-10 (10 most active).
The summarized findings show that the historical differences in revision rates among the three centres studied cannot easily be explained by variations in preoperative patient characteristics (Table 7).

Strengths and limitations
Due to the observational nature of the study, causal conclusions cannot be drawn. Also, when a number of parameters are investigated, some significant differences will be discovered that are not necessarily reproducible or clinically important, as may be the case for e.g. the small difference in knee extension [21]. Similarly, the magnitude and clinical relevance of hospital differences in age or BMI may be debatable.
It is an important strength that the results were based on patients treated in routine clinical settings. Surgeons were not aware of any changes to patient selection practices during (or leading up to) the study period, so it was assumed that the study reflected standard hospital practice. However, the [20,25]. Response rates were relatively high. Numerous PROMs from nine out of ten participants in conjunction with radiographic OA classifications should provide a valuable reference set for future comparisons [42]. However, not all potential candidates were included, inevitably resulting in bias. To make the inclusion process feasible, no information was collected regarding patients who were not invited or declined participation and the reasons why. The surgeons and medical students in charge of patient recruitment reported that inclusion was occasionally overlooked or not prioritized, but patients were eager to participate. One could argue that the electronic collection of PROMs posed a threat to patient representation. However, Danish citizens are among the most IT-literate in Europe (2 out of 3 Danish citizens > 65 years used the internet daily in 2017) [45] and knee OA patients have previously preferred electronic questionnaires over paper ones [10]. Though the demography of the SPARK cohort largely resembled the surgical population of 2017 and the underlying hospital differences in demography were reflected in the SPARK cohort, males and young patients were overrepresented in the study. Participants without an email address were 8 years older than others and were only allowed to participate in 6 of the 16 inclusion months. Therefore, it must be assumed that some of the oldest and possibly least resourceful patients were excluded, resulting in additional inclusion bias. Objective information regarding comorbidity or socioeconomic factors could have revealed important hospital differences in baseline health [40]. As a proxy of socioeconomic factors, 10% of men and 8% of women in age group 65-74 years reported daily smoking; this proportion was lower than the 17% and 14% reported in the National Health Profile 2018 [12]; however, smoking is associated with a lower risk of OA (Relative Risk 0.80) [16]. In Aalborg, the low inclusion rate threatened the generalizability of results. A low level of self-reported anxiety and depression (especially among females) here may be a reflection of daily practice or may result from inclusion bias. The high proportion of males among patients undergoing KA surgery was a general tendency in Aalborg.
Test results refer to comparisons within the hospital group. When nothing else is stated, means [and medians] ± SD are reported. For some nonparametric variables, means and ± SD have been reported to aid comparison. TKA Total Knee Arthroplasty. MUKA Medial Unicompartmental Knee Arthroplasty. BMI Body Mass Index. UCLA Activity Scale range 1-10 (10 most active). OKS Oxford Knee Score (0-48 version, 48 best). Global knee anchor Patients' overall knee assessment (VAS 0-100, 100 best). OA osteoarthritis. K-L Kellgren-Lawrence. CKRS Copenhagen Knee ROM Scale Flexion 0-6 (6 is max), Extension 0-5 (5 is max), see text for details. Surgeons' ranking radiographic knee OA severity, total range 1-1051 (1 is most severe).
In this study, urban-rural variations in radiographic classifications were minimal. This may be due to the relatively small geographical distances in Denmark: almost all citizens live within a 1.5 h drive of a KA centre [27,34]. In Aarhus, which is located in a region with a KA incidence 18-22% lower than the Capital Region, fewer patients with mild degrees of radiological OA underwent surgery. This would suggest that not all patients in the Capital Region would have been offered (or accepted) primary KA surgery if they had lived in the Central (or North) Denmark Region. Utilization of primary KA is known to vary across economies and countries, for example by a factor of ten between countries in the Organization for Economic Co-operation and Development (OECD) alone [27,29]. In welfare countries, the utilization of KA varies by a factor of two [26], and there are large regional variations within countries (Finland 1.6, Germany 1.8 and Spain factor of 27) [8,19,34]. In this light, the Danish variation in KA incidence by a factor of 1.3 is negligible. Regional variations in the threshold for primary KA surgery are not necessarily explained by the actions of knee surgeons alone [38]; expectations for surgery and risk aversion among patients, physicians and other caregivers (e.g. physiotherapists) also influence the number of patients admitted for orthopaedic evaluation [18]. Therefore, the optimal comparison of patient selection should also include knee OA patients treated outside of hospitals and with non-surgical methods.

Conclusions
The observed hospital variations in patient selection prior to primary knee arthroplasty were not associated with well-known revision risk factors to an extent that could reasonably explain the persistent differences in revision rates among three Danish high-volume hospitals. These baseline data provide the basis for comparing postoperative outcomes within the same cohort.
Funding Open access funding provided by Royal Danish Library.

Conflict of Interest The authors have reported no conflicts of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.