Introduction

Trochlear dysplasia refers to an abnormal trochlear morphology characterized by a shallow groove and is known to be a major risk factor for patellofemoral instability. The exact type of trochlear dysplasia is important for surgical treatment planning, e.g., whether a medial patellofemoral ligament reconstruction is performed alone or in combination with a trochleoplasty [1].

Trochlear dysplasia was first described on strictly lateral conventional radiographs by Maldague and Malghem in 1985 [2]. In 1987, Henri Dejour defined a classification of trochlear dysplasia based on the level of the crossing sign on lateral knee radiographs [3,4,5]. In 1998, his son David Dejour expanded this classification to include the supratrochlear spur and double-contour sign by cross-referencing lateral radiographs with axial slices on computed tomography (CT) imaging [6, 7]. Today, the Dejour classification is the most commonly used system to categorize and describe trochlear dysplasia; it distinguishes four types (A, B, C, and D) [7]. The 1998 Dejour classification has been criticized for poor intra- and inter-observer agreement [8,9,10]. Moreover, the description of the criteria for analyzing the axial image differs from the original publication and has changed over the years. Currently, there is no clear standard for applying the Dejour classification to magnetic resonance imaging (MRI), and different authors have used different methods, resulting in poor agreement between conventional radiography and MRI and low reproducibility [11,12,13,14,15]. Table 1 (adapted) probably represents the current consensus for the Dejour classification on radiographs and axial images. Experienced musculoskeletal radiologists in our department also reported uncertainties and poor reproducibility when using the Dejour classification for MRI in clinical routine. Therefore, the aim of our study was to simplify and standardize the 1998 Dejour classification for MRI by using axial and sagittal MRI planes to achieve better intra- and inter-rater reliability.

Table 1 Trochlear dysplasia, Dejour grading, findings on lateral radiographs and axial images

Materials and methods

This retrospective study was approved by the local ethics committee (BASEC No. 2022-01534). Our institutional PACS system (Carestream Vue PACS, Version 12.2.2.1025, Carestream Health, Inc, Rochester, NY, USA) was searched between December 2020 and August 2022 for patients with a knee MRI examination and the RIS system (MEDAVIS RIS, Version 5.3.7.2, Karlsruhe, Germany) for diagnosis of trochlear dysplasia (by a radiologist not involved in the following evaluations: ABR) in the corresponding report. All initial MRI reports were done by board-certified radiologists with more than 8 years of experience in MSK radiology. The following inclusion criteria were applied: age between 6 and 99 years, knee MRI examination between December 2020 and August 2022, and diagnosis of “trochlear dysplasia” in the conclusion of the MRI report. Exclusion criteria were as follows: incomplete examination, qualitatively non-diagnostic examination, post trochlear surgery, and missing informed consent for research purposes.

MRI examination

All patients were scanned on 1.5 or 3 Tesla scanners (Sola/Avanto/Aera or Vida/Skyra; Siemens Healthcare, Erlangen, Germany) with the institutional routine knee MRI protocol; an example protocol for the Sola scanner can be found in Table 2. The parameters for the protocols of the other scanners were slightly adapted according to scanner characteristics.

Table 2 Routine protocol for native 3 T-MRI of the knee in our institution

Image evaluation

Two radiologists (NJL (Reader 2) with 5 years of experience in MSK radiology and FMB (Reader 1) with 18 years of experience in MSK radiology) performed three evaluations of the cohort (4 months between each evaluation to exclude memorization of images). The readers were blinded to the original MRI reports. The study flowchart is shown in Fig. 1.

Fig. 1

Study flowchart

First evaluation (current description of Dejour classification)

In the first evaluation, the two readers assessed the trochlear dysplasia according to the Dejour classification types A–D using the current established criteria and illustrations in the paper by Kazley and Banerjee (see Table 1 in this study and Fig. 4 in Reference [9] – weblink in References):

  • Type A: fairly shallow trochlea

  • Type B: flat or convex trochlea

  • Type C: asymmetry of trochlear facets with a hypoplastic medial condyle

  • Type D: asymmetry of trochlear facets plus vertical joint/cliff pattern

The entire distal femur was analyzed by the two readers (not only a specific slice).

Second evaluation (adapted scheme)

In the second evaluation (4 months after the first evaluation), the two readers classified the knees again analyzing the entire distal femur on axial and sagittal MRI slices using the following criteria.

Novel adapted scheme according to the presence of a spur or cliff (see Fig. 2)

Fig. 2

Novel adapted classification scheme. Adapted scheme showing each classification category (A, B, C, or D) with illustrations on the left side and corresponding MRI examples (sag proton-density (pd) weighted and axial proton-density-with-fat-saturation (pdfs) weighted) on the right side. Checkmark means the finding is present. Cross sign (X) means the finding is absent

  • Type A: shallow trochlea, spur: no, cliff: no

  • Type B: shallow trochlea, spur: yes, cliff: no

  • Type C: shallow trochlea, spur: no, cliff: yes

  • Type D: shallow trochlea, spur: yes, cliff: yes

Definition of a spur presence

  • Visible spur on mid-sagittal images (= within the second and third quarter of the trochlea) at the level of the epiphysis: yes/no

  • Clear spur (> 2.5 mm)

  • Spur size measured in mm: maximal distance from tangent at anterior trochlear cortex (see Fig. 3)

Fig. 3

Presence of a spur (arrow) on a mid-sagittal proton-density-weighted image (A) in a patient with trochlear dysplasia. Same image (B) shows measurement of spur height by drawing a tangent at the anterior cortical bone of the trochlea and a parallel line at the spur tip (yellow dotted lines). The distance between the lines measures the height of the spur (green line). Note the level of the epiphysis (dotted white line) adjacent to the spur

Definition of a cliff presence

  • Visible cliff (> 3.0 mm) on transverse plane: yes/no

  • Step-like transition from lateral facet to anterior cortical bone of the femur

  • Cliff height in mm (maximum on transverse images (see Fig. 4))

Fig. 4

Axial pdfs image (A) showing a small cliff (arrow) in a patient with trochlear dysplasia. Same image (B) showing the measurement of the cliff by drawing a horizontal line adjusted to the bony contour of the medial trochlea and a parallel line at the cortical bone of the central trochlea (yellow dotted lines). The distance between the lines measures the height (in mm) of the cliff (green line)
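The adapted scheme described above reduces to two yes/no findings, each tied to a cut-off. The following Python sketch illustrates that decision rule only; it is not software used in this study. The function and constant names are hypothetical, and the sketch assumes that the knee has already been judged to have a shallow (dysplastic) trochlea and that a height of 0 mm is entered when no spur or cliff is visible.

```python
# Illustrative sketch of the adapted scheme; names are hypothetical.
SPUR_CUTOFF_MM = 2.5   # a spur counts as present only above this height
CLIFF_CUTOFF_MM = 3.0  # a cliff counts as present only above this height

# (spur present, cliff present) -> adapted type
TYPE_BY_FINDINGS = {
    (False, False): "A",
    (True, False): "B",
    (False, True): "C",
    (True, True): "D",
}


def classify_adapted(spur_height_mm: float, cliff_height_mm: float) -> str:
    """Return the adapted type (A-D) for a knee with a shallow trochlea.

    Assumption: heights are measured as described in the text (spur on
    mid-sagittal images, cliff on transverse images), with 0.0 meaning
    that no spur or cliff is visible.
    """
    if spur_height_mm < 0 or cliff_height_mm < 0:
        raise ValueError("heights must be non-negative")
    spur_present = spur_height_mm > SPUR_CUTOFF_MM
    cliff_present = cliff_height_mm > CLIFF_CUTOFF_MM
    return TYPE_BY_FINDINGS[(spur_present, cliff_present)]


if __name__ == "__main__":
    # Made-up example measurements, not study data.
    for spur, cliff in [(0.0, 0.0), (3.1, 0.0), (1.0, 4.2), (5.0, 3.5)]:
        print(spur, cliff, classify_adapted(spur, cliff))
```

Both comparisons are strict, so a spur of exactly 2.5 mm or a cliff of exactly 3.0 mm is treated as absent, matching the "> 2.5 mm" and "> 3.0 mm" wording of the definitions.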

Third evaluation (adapted scheme)

Four months after the second evaluation, the two readers assessed the included knees again (using only the adapted classification) in order to calculate intrareader reliability for the adapted classification.

Statistics

Study data were collected and managed using REDCap® electronic data capture tools (Version 13.5.4; Vanderbilt University, USA) hosted at the Schulthess Clinic, Zurich, Switzerland [18]. Descriptive statistics for basic demographic data and the summary of evaluations per reader were created using the “Stats & Charts” tool in REDCap®: age (years: mean and range), gender (%), side (left in %), dysplasia type (A, B, C, or D in %), and spur and cliff sizes (mean ± SD). The inferential analysis was conducted with SPSS (Version 28.0. Armonk, NY: IBM Corp) and MATLAB (The MathWorks Inc. (2022b), Natick, Massachusetts). P-values below 0.05 were considered statistically significant.

Cohen’s kappa values for inter-reader and intra-reader reliability were interpreted as follows: 0.01–0.20 indicating slight agreement beyond chance, 0.21–0.40 indicating fair agreement beyond chance, 0.41–0.60 indicating moderate agreement beyond chance, 0.61–0.80 indicating substantial agreement beyond chance, and 0.81–1.00 indicating almost perfect agreement beyond chance [19, 20]. Inter-reader reliability of continuous measurements (i.e., spur and cliff size) was quantified in terms of the intraclass correlation coefficient (ICC) [21] based on a two-way random effects model assessing the absolute agreement of a single-measure approach and the related standard error of measurement (SEM). ICC values were interpreted as follows: values less than 0.5 indicate poor reliability, values between 0.5 and 0.75 indicate moderate reliability, values between 0.75 and 0.90 indicate good reliability, and values greater than 0.9 indicate excellent reliability. In the two categories with a positive finding of a bone spur, we assessed potential differences in spur size based on the coincident presence or absence of a bony cliff (i.e., category B vs. D). The analogous procedure was applied to assess for a difference in cliff size, provided a cliff was present, as a function of the presence or absence of a spur (i.e., category C vs. D). These comparisons were conducted using nonparametric tests (Mann-Whitney U-tests; two-sided).
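For readers who wish to reproduce these reliability measures outside SPSS or MATLAB, the following Python sketch shows one way to compute an unweighted Cohen's kappa, map it to the agreement bands listed above, and compute a two-way random effects, absolute agreement, single-measure ICC (often written ICC(2,1)). All function names are hypothetical, the example values are made up, and the sketch is not the code used in this study; the SEM is not computed because the text does not state which formula was applied.

```python
from collections import Counter


def cohen_kappa(ratings_1, ratings_2):
    """Unweighted Cohen's kappa for two raters rating the same cases."""
    if len(ratings_1) != len(ratings_2) or not ratings_1:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(ratings_1)
    observed = sum(a == b for a, b in zip(ratings_1, ratings_2)) / n
    counts_1, counts_2 = Counter(ratings_1), Counter(ratings_2)
    # Chance agreement from the raters' marginal category frequencies.
    expected = sum(counts_1[c] * counts_2[c] for c in counts_1) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined if both raters use one category")
    return (observed - expected) / (1.0 - expected)


def kappa_label(kappa):
    """Agreement band as listed in the text (values below 0.01 not covered)."""
    bands = [(0.81, "almost perfect"), (0.61, "substantial"),
             (0.41, "moderate"), (0.21, "fair"), (0.01, "slight")]
    for lower_bound, label in bands:
        if kappa >= lower_bound:
            return label
    return "below listed bands"


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    scores: list of rows, one row per knee, one column per reader.
    """
    n, k = len(scores), len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a rectangular table with >= 2 rows and columns")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum((scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
                 for i in range(n) for j in range(k))
    ms_err = ss_err / ((n - 1) * (k - 1))
    denom = ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    if denom == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_rows - ms_err) / denom


if __name__ == "__main__":
    # Made-up ratings and measurements, not study data.
    kappa = cohen_kappa(["A", "A", "B", "B"], ["A", "A", "B", "A"])
    print(kappa, kappa_label(kappa))
    print(icc_2_1([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]))
```

In the second example the readers differ by a constant 1 mm offset; the absolute-agreement ICC is reduced by such a systematic shift, whereas a consistency-type ICC would not be.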

Results

One hundred seventy-one knees of 161 patients (65.5% female) were included in the patient cohort, with a median age of 34.3 years (range 11.3–79.2 years). The right side was examined in 47.4% of knees and the left side in 52.6% (see Table 3).

Table 3 Basic demographics and summary of evaluations

This patient cohort was used for all evaluation rounds (1–3). The readout results are presented using the 4-type trochlear dysplasia classification (types A–D), the 2-type trochlear dysplasia classification of low (type A) and high grade (types B, C, and D), and whether a supratrochlear spur was present or not (types A and C versus types B and D).
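The three ways of reporting the readout described above can be expressed as a simple mapping from the four-type result. The Python sketch below is illustrative only; the function name and the dictionary keys are hypothetical.

```python
def collapse_dejour(dejour_type: str) -> dict:
    """Derive the 2-type grade and the spur grouping from a 4-type result.

    Groupings follow the text: low grade = type A, high grade = types B-D;
    spur present = types B and D, spur absent = types A and C.
    """
    t = dejour_type.strip().upper()
    if t not in {"A", "B", "C", "D"}:
        raise ValueError(f"unknown Dejour type: {dejour_type!r}")
    return {
        "type": t,
        "grade": "low" if t == "A" else "high",
        "spur_present": t in {"B", "D"},
    }


if __name__ == "__main__":
    for t in "ABCD":
        print(collapse_dejour(t))
```

Note that type C is high grade in the 2-type grouping but falls into the group without a spur, so the two collapsed groupings are not interchangeable.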

High-grade dysplasia was assigned more often using the established description (reader 1: 64.9%; reader 2: 60.2%) than with the adapted scheme (both readers: 51.5%). Using the adapted scheme, type A was the most frequently assigned type for both reader 1 (48.5%) and reader 2 (48.5%). All other detailed results for Evaluations 1 and 2 can be found in Table 3.

Table 4 shows that classification differences between the readers were far more frequent in Evaluation 1 (established description) than in Evaluation 2 (adapted scheme). Accordingly, inter-reader reliability was much higher for the adapted scheme (Table 5).

Table 4 Differences in classification between Evaluation 1 (established description) and 2 (adapted scheme)
Table 5 Inter-reader reliability for established description and the adapted scheme

Intra-reader reliability comparing the differences between Evaluation 2 and Evaluation 3 was almost perfect for both readers with kappa values between 0.88 and 0.95 (see Table 6).

Table 6 Intra-reader reliability for the adapted scheme

Inter-reader reliability analysis of spur and cliff size measurements yielded very good reliability (see Table 7). Spur size differed significantly between types B and D, and cliff size differed significantly between types C and D (both p < 0.001; Table 8). Both spurs and cliffs were significantly larger in Dejour type D.

Table 7 Inter-reader reliability for size of spur and cliff
Table 8 Comparing inter-type spur sizes and cliff sizes based on measurements of reader 1: median (inter-quartile range) spur size between types

Discussion

Radiologists’ inter-reader reliability for the Dejour classification was significantly higher in this study when using the adapted scheme. Intra-reader agreement was excellent for the adapted approach. This study is the first to use specified sagittal MR images for grading and also to introduce cut-off values for the size of a spur or cliff on MR images.

The original title of the paper describing the Dejour classification from 1998 was “Douleurs et instabilité rotulienne. Essai de classification” [6]. The article was published in the journal “la revue Médecine et Hygiène,” which was founded in Geneva, Switzerland, in 1943. Today, this journal is known as the “Revue Médicale Suisse.” The original description included conventional and cross-sectional criteria. The translation of the original description from French is as follows:

  • Simple dysplasia: crossover sign but normal morphology of trochlear facets on CT

  • Simple dysplasia with a supratrochlear spur: increased radiological prominence and an overall convex trochlea on CT

  • Complex dysplasia: double contour ending below the crossing sign on plain radiography and hypotrophy of the medial side and convexity of the lateral side on axial CT scan

  • Complex dysplasia with supratrochlear spur: increased radiological prominence and disappearance of the medial side and hypertrophy of the lateral side

However, the description of the criteria for analyzing the axial cross-sectional images and the graphical schemes have been modified repeatedly and therefore markedly changed over the years (current established description: see Table 1 [9]). In the novel adapted scheme, we have integrated criteria for the conventional X-ray images and simplified the criteria for sectional imaging including not only axial but also sagittal MR images.

The low inter-reader agreement with k-values between 0.11 and 0.43 in our first evaluation using the current established description [9, 22] is as expected from other recent studies [8, 10, 23], for instance the paper by Sharma et al. with k-values between 0.24 and 0.46 based on cross-sectional images [14]. The rate of inter-reader disagreement in our study was very high using the established description (in 47.4% of cases, so almost as reliable as tossing a coin) and significantly lower when using the adapted scheme (in 12.9% of cases). For both the established description and the adapted scheme, most differences between the readers were seen between types A and B (58.0% versus 72.7%). Differences between types B and C were a major problem for the established description (23.5%, n = 19), but not for the adapted scheme (9.1%, n = 2). Trochlear dysplasia type A (= low grade) was the most common type when using the adapted scheme in our study cohort. This seems plausible since we searched the database for all patients with a stated “trochlear dysplasia” in the knee MRI report and not for a subgroup of patients with known trochlear instability or a history of patellar dislocation.
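The percentages in this paragraph can be cross-checked with simple arithmetic. The sketch below is a consistency check only: the total of 171 knees and the counts n = 19 and n = 2 are taken from the text, whereas the disagreement counts of 81 and 22 knees and the A-versus-B counts of 47 and 16 are not reported in the text; they are back-calculated assumptions chosen because they reproduce the stated percentages. The helper name `pct` is hypothetical.

```python
TOTAL_KNEES = 171  # reported in the Results


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100.0 * part / whole, 1)


# Back-calculated (assumed) numbers of knees with inter-reader disagreement.
disagree_established = 81  # assumption: 47.4% of 171
disagree_adapted = 22      # assumption: 12.9% of 171

checks = {
    "disagreement, established": pct(disagree_established, TOTAL_KNEES),
    "disagreement, adapted": pct(disagree_adapted, TOTAL_KNEES),
    # n = 19 and n = 2 are reported; denominators are the assumed counts.
    "B vs C, established": pct(19, disagree_established),
    "B vs C, adapted": pct(2, disagree_adapted),
    # 47 and 16 are back-calculated assumptions for A vs B disagreements.
    "A vs B, established": pct(47, disagree_established),
    "A vs B, adapted": pct(16, disagree_adapted),
}

if __name__ == "__main__":
    for name, value in checks.items():
        print(f"{name}: {value}%")
```

Under these assumptions, the type-specific percentages are shares of the disagreeing cases rather than of the whole cohort.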

In 2000, Pfirrmann et al. [24] found that the presence of trochlear dysplasia can be reliably diagnosed with MRI, but they did not further classify the severity of the dysplasia. Tscholl et al. [11] found only fair agreement between trochlear dysplasia measured on radiographs and at different levels on axial MR images, especially when the supratrochlear region of the distal femur is not analyzed on axial MRI. They concluded that an MRI evaluation limited to the cartilaginous trochlea alone tends to underestimate the severity of dysplasia. Therefore, we also included the distal femur in the analysis for signs of trochlear dysplasia and included not only axial but also specified mid-sagittal MRI slices in our study.

Various other publications have criticized the reliability and reproducibility of Dejour’s classification as too low for routine use in research and clinical practice [9, 22, 25]. Zimmerer et al. [26] underlined the need for a reliable and valid classification to grade trochlear dysplasia and recommended that the Dejour classification be grouped only into low grade (type A) and high grade (types B to D). Another study by Fucentese et al. [1] modified Dejour’s classification by combining patients with a spur (types B and D) as high-grade and patients without a spur (types A and C) as low-grade trochlear dysplasia. The authors stated that patients with a spur had better clinical outcomes after trochleoplasty regarding subjective knee pain and pain during sporting activities. Therefore, we also tested the reliability of our adapted scheme regarding low-grade versus high-grade dysplasia in this study and found excellent intra- and inter-observer reliabilities for both variants.

To date, there are no clear thresholds for the presence of a spur or cliff on MR images. A supratrochlear spur is defined as prominence of the trochlea above the anterior femoral cortex [9]. A clear spur indicates an elevation of the trochlear floor with loss of relative lateral trochlear height and increased susceptibility to patellar instability [9]. For the evaluation of spurs, we adapted the published measurement methods for sagittal radiographs to sagittal MR images [9, 27, 28]. To increase the specificity, we introduced a cutoff value of > 2.5 mm for the spur since tiny bony irregularities (< 2 mm) are frequently found at the distal femur. Formation of osteoarthritis-related osteophytes at the distal femur is a minor differential diagnosis for spur presence since a spur is visible in the center of the trochlea, while the osteophytes usually occur at the trochlear edges. In orthopedic literature, Dejour types B and D with a spur larger than 5 mm are considered (together with other factors such as symptomatic patellar instability) as indications for proximal trochlear resection therapy [29]. A cliff is present when the medial facet is missing or hypoplastic and there is a vertical transition from the lateral facet in the axial plane [9]. For the evaluation of a cliff, only a step-like transition from the lateral facet to the anterior cortical bone of the femur > 3 mm was rated as a clear cliff in order to avoid overdiagnosis. We found significantly larger cliff and spur sizes for Dejour type D compared to types C and B, respectively. For the adapted scheme, the smaller spur size in type B compared to type D might explain why the readers had some difficulty differentiating between spur presence and absence for types A and B, but no difficulty distinguishing type D from the other types.

Recent studies have tried to correlate and expand the Dejour classification with quantitative measurements (e.g., trochlear depth or lateral trochlear inclination [30, 31]) or 3D printing [32]. Some even introduced other classifications for the grading of trochlear dysplasia [14, 15, 33, 34]. To date, however, no other classification system for trochlear dysplasia has become established in the orthopedic community, and referring physicians continue to ask for the Dejour classification in MRI reports. The proposed adapted scheme in this study might be used with both MR and CT images. However, since our data only included MR images, a further evaluation using CT images is recommended in the future.

In summary, the adapted scheme shows an almost perfect intra-reader reliability and a significantly higher inter-reader reliability than the established description and can therefore be a reliable tool for the daily diagnostic work of radiologists.