The reliability of posterior malleolar ankle fracture assessment: a unique perspective

Aim This study aims to elucidate the pathology of PMFs in the South African population, establish correlations between fracture patterns and international classification guidelines and evaluate the interobserver reliability of current classifications. Methods A retrospective review was conducted in a multicentre analysis over a one-year period from January 2019 to December 2019 at our institution. Computer tomography scans for foot and ankle injuries were reviewed, and posterior malleolus fractures were included. Pathoanatomical data was collected and analysed according to known classification systems and subsequent treatment modalities evaluated. A panel of observers individually reviewed radiographic data to determine interobserver reliability. Results A total of 71 patients were included with a mean age of 41 ± 13.4 years (range 18–78) and a female predominant population (69%). A greater proportion of injuries were high energy (23.9%), with significant fragment comminution (53.5%), and half (52.1%) of all injuries were subluxated/dislocated at presentation. A total of 93% of injuries were managed operatively, despite theatre access limitations resulting in significant delays to fixation (19.1 days). Despite good pathoanatomical agreement with most international classifications, interobserver reliability was poor (Krippendorff α-coefficient < 0.667). Inconsistent treatment patterns in operative and non-operative strategies are reported. Conclusion A unique patient population of younger, female individuals incurred posterior malleolar fractures due to higher energy mechanisms of injury. Whilst injury patterns were mostly comparable, significant interobserver variability was noted. Resource limitations, diagnostic challenges, poorly defined and inconsistent treatment strategies, inevitably impact outcomes within the South African population. Level of evidence Level III.


Introduction
Posterior malleolus ankle fractures (PMF) pose ongoing controversies in orthopaedic literature and clinical practice [1,2].Despite varying incidences ranging from 7 to 46%, the specific characteristics of these injuries in the African continent remain poorly quantified [1,[3][4][5][6].Biomechanically, the posterior malleolus plays a crucial role in maintaining syndesmotic stability and ensuring uniform tibiotalar contact stressors [7,8].Injury to the posterior malleolus typically occurs through low-velocity, twisting-type mechanisms, as indicated by the Lauge-Hansen classification, which signifies the systematic and progressive disruption of key ankle restraints [3,9,10].Consequently, joint congruency and stability are compromised within the tibiotalar joint and syndesmosis [11].
Previous studies have reported improved clinical outcomes with surgical repair of the posterior malleolus when more than 25% of the articular surface was involved, establishing a controversial size-based threshold for surgical intervention [12].However, recent literature highlights the significance of joint congruity and syndesmotic stability as critical factors in treatment planning [2,13].Computed tomography (CT) scanning has emerged as the gold standard for pre-operative evaluation of PMFs, owing to the unreliable and inaccurate interpretation of plain film radiographs [14][15][16].CT scans guide the preferred surgical approach and enhance direct fragment fixation method [17].Various systemic classification systems by Haraguchi et al., Bartonicek et al., Mason et al. and Lu et al., based on 2D axial CT cuts have been proposed, but their interpretation lacks consensus regarding optimal management and reliability [9,11,[18][19][20][21][22].
This study aims to elucidate the pathology of PMFs in the South African population, establish correlations between fracture patterns and international classification guidelines, and evaluate the interobserver reliability of current classifications.By addressing these objectives, the study seeks to enhance our understanding of PMFs, provide valuable insights into their management and guide treatment decisions in the South African context.

Methods
This retrospective study reviewed the records of all patients with ankle fractures involving the posterior malleolus between January 2019 and December 2019 at three public centres, including a tertiary, district and regional hospitals.Institutional ethics committee and hospital board approval were obtained prior to data collection.
All adult patients who underwent foot and ankle computed tomography scans during the study period were considered for inclusion.Patients with posterior malleolar or trimalleolar ankle fractures were included (Fig. 1).Data were collected regarding patient demographics, injury characteristics, radiological evaluation and treatment.
The above flow diagram represents the inclusion and exclusion process of 217 CT scans performed in a one-year period for variable foot and ankle pathology.After exclusion, 71 patients remained eligible for further examination.Four distinct processes were followed, namely, i) establishing the patient and injury descriptive data; ii) mapping the fracture patterns and measuring fracture variables to establish data points that were used to compare with known classification data points; iii) fractures were classified on CT imaging and interobserver reliability testing performed, in addition to comparing clinical observations to measured data points (measured control classification) and iv) surgical fixation methods reviewed on fluoroscopic imaging, then compared to individual classification types and to international surgical fixation rates and methods.
Standard imaging protocols with one millimetre (fine slice) CT images were evaluated in the coronal, sagittal and axial planes (standard 3 mm above the joint line).Sagittal films were examined in the slice that revealed the greatest cross-sectional area of the posterior malleolus fragment, and coronal images sectioned at the transmalleolar line.No 3D reconstructions were examined.Individual fracture morphology was mapped by importing CT images into SketchAndCalc online software and analysed according to size, shape, fracture pattern, angle of primary and secondary fracture lines, overall plafond area-sagittal and axial, fracture fragment area-sagittal and axial, incisural extension, any noted comminution, displacement of the fragment and widening of the talofibular clear space (Fig. 2).The exact parameters required to fulfil the criteria of each classification system were measured to establish an objective data set (Control Classification), used as a control in further comparisons.
Figure 2A Axial Computer Tomography (CT) scan images taken 3 mm above the joint line.Examples in this image show (purple) the area measurement of the plafond and the fragment; (blue) the axial length measurement of the fragment, perpendicular to the fracture line; (yellow) the axial length measurement of the fragment, perpendicular to the trans-malleolar axis; (orange) the displacement of the fragment, in relation to the tibiofibular clear space; (green) the proportion of incisural joint surface attached to the fracture fragment; and (red) the primary fracture angle, as measured from the trans-syndesmotic plane.
Figure 2B Mid-sagittal CT scan images depicting the fracture fragment.Examples in this image show (red) the height of the fracture fragment, at the fracture line; (yellow) the amount of anteroposterior (AP) displacement; (green) the sagittal length measurement of the fragment, in the direct AP plane; (blue) the sagittal length measurement of the fracture fragment, along the tibiotalar joint surface; and (purple) the joint space was measured at multiple intervals, inconsistencies with varying lengths, denoting sagittal tibiotalar joint subluxation.
Figure 2C Mid-coronal CT scan images depicting medial malleolar fractures and coronal tibiotalar incongruity.Examples in this image show (purple) the medial tibiotalar clear space; and (green) the incongruity of the tibial and talar joint surfaces, denoting coronal tibiotalar subluxation.
Images are for descriptive purposes only.Actual measurements were measured using computer software (Sketchand-Calc), calibrated to reference parameters utilizing a digital grid overlay and standardized across the series.
The primary researcher and four orthopaedic trauma specialists with a minimum of six years' experience analysed and classified all the images independently.Injuries were classified according to the Dennis-Weber [27], Lauge-Hansen [10,28] and Association of Osteosynthesis (AO) [29] classifications for plain film radiographs and the Haraguchi [18], Mason [19] and Bartonicek [9] Fig. 1 Detailed methodology process flow of all participants reviewed in the posterior malleolus fracture study over a one-year period classifications for CT images.Plain film radiographs were utilized as preliminary imaging, aiding in the assessment of CT images and classifications upon which further analysis and comparison were conducted.Finally, intraoperative fluoroscopic images were reviewed, and the fixation methods were recorded for the medial malleolus, posterior malleolus, syndesmosis and fibula.
Data were analysed using Stata v15.1 software and are described as means ± standard deviations or medians (interquartile ranges, IQR), depending on distribution, or as frequencies (counts).Normality was assessed using the Shapiro-Wilk test, and the Spearman's Rank correlation was utilized to investigate correlations between continuous data.The Spearman's Rank coefficient (Rho or ρ) was interpreted as < 0.10 being a negligible correlation, 0.10-0.3being a weak correlation, 0.40-0.69being a moderate correlation, 0.70-0.89being a strong correlation and 0.90-1.00being a very strong correlation [30].Differences between classifications in the present study and the original work of Bartonicek, Haraguchi and Mason were investigated with a Chi-square test.
Agreement between observers against the control measurement is reported as percentage agreement, whilst the level of agreement on the classification of the fracture patterns between the five raters was measured using Krippendorff's alpha statistic, using the methods described by Klein [31,32].An alpha level of > 0.8 was classified as a "high" level of agreement, whilst a level < 0.667 was regarded as "no agreement".Levels between 0.667 and 0.800 were considered 'tentative agreement'.These conservative cutoff values were based on the recommendations of Krippendorf for the interpretation of values that influence important decision-making processes, such as medical interventions [33].

Results
Over the one-year study period, 217 patients underwent CT scans for injuries to the foot and ankle.Of these, 114 patients were excluded, with injuries to other structures within the foot and ankle.Additionally, 32 patients were excluded as the injuries were better described as "pilon" or "plafond" fractures.(Table 1) The final cohort comprised 71 patients, with a female predominance (n = 49, 69%) and a mean age of 40.9 ± 13.4 years (range 18-78).Most patients were unemployed (n = 46, 64.8%).A limb  The predominant mechanism of injury was low-velocity falls (n = 47, 66.2%), followed by pedestrian-vehicle accidents (n = 17, 23.9%) and assault (n = 2, 2.8%) (Table 2).Four patients (5.6%) sustained open injuries.Posterior subluxation was seen in 31 patients (43.7%), with mediolateral subluxation noted in 37 (52.1%).Four patients (5.6%) required temporary external fixation to maintain reduction of the ankle joint.The mean period from injury to definitive fixation was 19.1 days (IQR, 1-45).Contributory factors to this delay included limited theatre availability, poor soft tissue condition, inter-facility transfers and pre-operative medical optimization.Two patients (2.8%) sustained isolated posterior malleolar injuries.Fibula fractures occurred in 67 patients (94.4%), and the medial malleolus required fixation in 59 cases (83.1%).
Four methods were used to measure the posterior malleolus fragment's AP length, and median values were used to compare central tendencies.Sagittal fragment measurements included the most commonly measured direct AP length of 12.5% (IQR, 0-54.5) and joint surface length of 18.9% (IQR, 0-57.5) of the plafond.On axial images, the AP length was 21.62% (IQR, 4.37-50.0),and the perpendicular length to the fracture line was 20% (IQR, 4.2-48.8) of the plafond.The fragment size as a percentage of the axial area was 9.6% (IQR, 0.47-34.9).A strong correlation in the sagittal plane between the AP length and surface length was observed (ρ [rho] = 0.855, p < 0.001), whilst moderate correlations were observed between the sagittal AP length and both the axial plane AP length (ρ [rho] = 0.642, p < 0.001) and perpendicular length (ρ [rho] = 0.595, p < 0.001), respectively.Moderate correlation was observed in the fragment size when comparing the sagittal AP length and the percentage area of the fragment in the axial plane (ρ [rho] = 0.595, p < 0.001).The primary fracture fragment angle from the trans-malleolar line was 21.83 ± 11.36 degrees (IQR, 0.1-52.3),with secondary fracture angles in multi-fragmented fractures of −15.69 ± 13.80 degrees (IQR, −48.2-4.9).
Sixty-six patients (93%) received operative fixation, and five (7%) were managed non-operatively (Table 4).All fracture types had varied methods of fixation used, with only Bartonicek type 4 fractures always being treated using a posterior tibial plate.No discernible pattern of fixation was identified across the classification types.
Xu et al. in a high-volume study, observed that PMF's are typically low-velocity injuries, resulting from a fall or stumble, and rarely due to high mechanisms of injury such as falls from height or road traffic accidents, contrary to what was found in this study, with nearly a quarter of all PMFs were due to high-velocity pedestrian-vehicle accidents [12,34].Considering the prevalence of these higher energy injuries, it is unsurprising that joint subluxation was noted in almost half (43.7% sagittal subluxation; 52.1% coronal subluxation) with a subsequent increased rate of post-traumatic arthritis expected [35,36].Thus, our higher energy, subluxated injuries pose a significant risk to poor outcomes and residual functional limitations in our population as an independent pathomechanical variable.Additionally, when comparing fracture patterns and fragment numbers to morphological studies, with no differences noted [18].However, 53.5% of the injuries in the current series showed some degree of comminution.This is likely due to pathomechanical variables, and whilst no comparisons are available in the literature, this is thought to be high.
The current series measured the size of the posterior malleolus by four means: direct AP (sagittal), along the joint surface (sagittal), perpendicular to the trans-malleolar line (axial), and perpendicular to the fracture line (axial).Variations in fragment size were noted, however all studies showed variations in mean fragment size (range 16 -25%), the current series appears consistent with prior studies.
The Bartonicek, Haraguchi and Mason classifications are the most well-known CT-based classifications, and typically used in clinical practice [9,18,19].The authors' individually described parameters were used to establish an objective control classification type for each fracture within each classification.It is not known if this method has been used to obtain an objective classification type, but we believe this form of measured assessment to be more accurate than observation alone.When the control classifications of the current cohort were compared to cited  [9,19].Hence, the agreement of our population with that of Bartonicek's is surprising, with a greater proportion of type four (axial compression) injuries anticipated.Whilst the exact position of the foot in relation to the tibia, the amount of rotatory energy transferred across the tibiotalar joint and the direction of force being unknown in pedestrian-vehicle accidents, a rotatory talar moment must still predominate, leading to similar fracture morphologies.
When comparing the observations of the five clinicians, unanimous agreement was seen in less than half of the cases (range 42-52%), with interrater agreement rated as "poor" for all the classification systems.Subsequent comparison to the control classification revealed that when observers were unanimously agreeable, there was a high accuracy and likelihood that they were correct in their assessment (range 84-100%).Furthermore, whilst still poor, the Haraguchi classification had the highest agreeability rates, likely due to it being the simplest classification, with distinct, clinically discernible variables and no size or percentage-based classification type variations.It is thus reasonable to state that whilst a correct diagnosis is possible, at acceptable rates, this is only possible to achieve once a unanimous diagnosis is made.In contrast to recent literature which notes substantial reliability and agreement to all three of the commonly utilized classification systems, our study observed "poor" interobserver reliability [37,38].The general poor agreement may be due to the complexity of the classifications themselves, clinical difficulties in recognizing these variables, or both.The need for clinical observations to be reliable requires no justification, as misdiagnosis leads to inconsistent treatment, as seen in this study, and ultimately poorer anticipated outcomes.
In recent literature, the concept of tibiotalar congruity, incisural reduction, syndesmotic stability and restoration of the posterior buttress is generally accepted as treatment goals [2,12].However, recent literature, including systematic reviews, found no consensus regarding the fragment size as an isolated threshold for surgical fixation with ankle joint stability, irrespective of fragment size gaining favour and well supported in literature [1,11,20,21,39,40].Whilst Gardner et al. found this to be the most significant surgical indicator, this opinion was only elicited in 56% of surgeons [13].A standardized surgical indication or morphological threshold is not available at present.Subsequently, there are significant inconsistencies in treatment strategies.
In the current series, 93% of all PMFs underwent surgical fixation.Our disproportionately high rate of surgical intervention, with mean smaller fracture fragment sizes, indicates that we are potentially overtreating our patients.Some literature support fixation for most PMFs; however, the lack of consensus is evident in the high fixation rates noted in the current series.The fixation strategies observed in our study varied, with no identifiable predilection of one fixation method with type 4 Bartonicek fractures the only fracture pattern exclusively managed with posterior buttress plating.The lack of clear surgical indications and established treatment algorithms has resulted in the management of these injuries being highly inconsistent and an area of concern.
Investigating the short-and long-term outcomes of the patients included in this study was beyond the scope of the investigation, however, whilst not measured-they are thought to be sub-optimal-significantly affecting a unique demographic, which is a crucial area requiring further study in future.

Conclusion
This study identified a unique, heterogeneous, at-risk, vulnerable population group in a resource-limited health care system.These patients are incurring higher mechanism injuries, high rates of subluxation at presentation and greater fragment comminution with representative fracture patterns to international literature.Despite this, they are poorly and unreliably classified.The lack of well-established treatment standardization, algorithms or consensus has resulted in sub-optimal management strategies and likely sub-optimal outcomes, which require further study.
Funding Open access funding provided by Stellenbosch University.

Declarations
Conflict of interest The authors declare that they have no conflict of interest.Human and animal rights statement This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent
No informed consent for the collection of retrospective data was required.A waiver of consent for this data was granted by the Stellenbosch University Health Research Ethics Committee (C22/08/023).
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material.If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.To view a copy of this licence, visit http:// creat iveco mmons.org/ licen ses/ by/4.0/.

Fig. 2
Fig. 2 Examples of the fracture mapping done on all posterior malleolus fractures in a multicentre study over a one-year period (color figure online) The authors declare that this submission is in accordance with the principles laid down by the Responsible Research Publication Position Statements as developed at the 2nd World Conference on Research Integrity in Singapore, 2010.Prior to the commencement of the study ethical approval was obtained from the following ethical review board: Stellenbosch University Health Research Ethics Committee (C22/08/023)-(HREC Reference #: N18/09/105).

Table 2
Injury description of all participants with posterior malleolus fractures in a multicentre study over a one-year study period a Spearman Rank correlation ρ = 0.855, p < 0.001 b Spearman Rank correlation ρ = 0.625, p < 0.001 c Spearman Rank correlation ρ = 0.642, p < 0.001 d Spearman Rank correlation ρ = 0.595, p < 0.001

Table 3
Classification accuracy between five observers across three different CT classification systems for posterior malleolus fractures Data are described as percentage of unanimous observer agreement with the total number of agreed cases compared in parentheses.The Krippendorff alpha coefficient includes multiple agreement variables and represents an overall agreement rate.It is interpreted as: < 0.667 indicating no agreement; 0.667-0.8indicatingtentative agreement; > 0.8 indicating high level of agreement as per the definitions ofKrippendorff et al.

Table 4
Methods of fixation of all patients that underwent operative management for posterior malleolus fractures over a one-year period , there was no significant difference between the current series and those of Bartonicek et al. or Haraguchi et al.The current series showed similar numbers and fracture patterns to studies conducted in predominantly European and Asian populations.The Mason classification, however, was dissimilar due to very few fractures in the present study having a pure coronal fracture line. classifications