Classifications of posterior malleolar fractures: a systematic literature review

Introduction Complex ankle fractures frequently involve the posterior malleolus. Many classifications describing posterior malleolar fractures (PMF) exist. The aim of this study was to provide a systematic literature review to outline existing PMF classifications and estimate their accuracy. Methods The databases PubMed and Scopus were searched without time limits. Only specific PMF classifications were included; general ankle and/or pilon fracture classifications were excluded. Selection and data extraction were performed by three independent observers. The systematic literature search was performed according to the current criteria of Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). The methodological quality of the included studies was quantified using the modified Coleman score. Results A total of 110 studies with a total of 12,614 patients were included. Four main classifications were identified: those describing the size of the posterior malleolar fracture (n = 66), Haraguchi (n = 44), Bartoníček/Rammelt (n = 21) and Mason (n = 12). The quality of the studies was moderate to good with a median Coleman score of 43.5 (14–79) and a weighted median Coleman score of 42.5 points. All classifications achieved substantial to almost perfect inter- and intraobserver reliability, with Mason scoring the lowest in comparison. Conclusions None of the reviewed PMF classifications has been able to establish itself decisively in the literature. Most of the classifications are insufficient in terms of a derivable treatment algorithm or a prognosis with regard to outcome. However, because the Bartoníček/Rammelt classification has the greatest potential owing to its treatment algorithm and its reliability in combination with consistent predictive values, its use in clinical practice and research appears advisable. Supplementary Information The online version contains supplementary material available at 10.1007/s00402-022-04643-7.

Julia Terstegen and Hanneke Weel have contributed equally. In memoriam of Alexej Barg.

The decision to fix a PMF remains highly debated and has traditionally often been based on fracture size measured on radiographs, a method that lacks accuracy and has poor reliability [11–18]. Nowadays, it is generally believed that the morphology of the fragment is more closely related to the fracture pattern and is, therefore, more important in classifying the fracture [14,19–21]. Consequently, with regard to the proportion of the affected joint surface and the recommendation for surgical fixation of PMF, there is a shift away from the 1/3 dogma [7,17,22–28]. With increasing understanding of fracture morphology and the routine use of computed tomography (CT), efforts have been made in recent years to establish new classification systems based on CT imaging [14,29,30]. To date, there is no international consensus regarding classification and treatment of PMF [24,31,32]. A good classification system helps the orthopedic surgeon to identify and characterize a problem, suggest a potential prognosis, and offer guidance in determining the appropriate treatment method for a particular condition. To achieve optimal therapeutic results, a complete understanding of the morphology is indispensable.
Therefore, the aims of this systematic review were: first, to determine how many studies use a classification of the PMF; second, to identify and describe which classifications of PMF exist; third, to examine which classification system has the highest inter- and intraobserver reliability; and fourth, to evaluate the predictive value of the classifications in terms of postoperative outcomes.

Search strategy
The study protocol was registered in the PROSPERO database (CRD42021264268). The review was performed and reported according to the PRISMA 2020 checklist [33].
The electronic databases of the Cochrane Central Register of Controlled Trials, MEDLINE via PubMed and Scopus were searched systematically. The search was performed on 20 March 2021. The following search algorithm was used: (posterior AND ankle AND fracture) OR (posterior AND (malleolus OR malleolar) AND fracture) OR (ankle AND volkmann) OR (trimalleolar AND fracture) OR (posterior AND pilon AND fracture). A final update of the search was conducted on 12 May 2022 using the same search string. Furthermore, reference lists of relevant reviews and included articles were screened for additional articles. A bidirectional citation search was used, including backward and forward citation search methods [34]. There were no limitations on journal or publication date of the article.
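The logical structure of the search string above can be made explicit as a small predicate. The following is only an illustrative sketch: the function name `matches_search` is hypothetical, and it assumes plain case-insensitive whole-word matching, which is a simplification of how PubMed, Scopus or the Cochrane register actually interpret a query (field restrictions, term mapping and plural handling are not modelled).

```python
import re


def matches_search(text: str) -> bool:
    """Return True if `text` satisfies the Boolean search string.

    Assumption: case-insensitive, exact whole-word matching only.
    Real database engines apply their own indexing and term mapping.
    """
    words = set(re.findall(r"[a-z]+", text.lower()))

    def has(*terms: str) -> bool:
        # True only if every given term occurs as a whole word.
        return all(term in words for term in terms)

    malleolus_or_malleolar = "malleolus" in words or "malleolar" in words

    return (
        has("posterior", "ankle", "fracture")
        or (has("posterior", "fracture") and malleolus_or_malleolar)
        or has("ankle", "volkmann")
        or has("trimalleolar", "fracture")
        or has("posterior", "pilon", "fracture")
    )


if __name__ == "__main__":
    for title in (
        "Posterior malleolar fracture fixation",
        "Volkmann fragment of the ankle",
        "Lateral ankle sprain",
    ):
        print(title, "->", matches_search(title))
```

Writing the query this way mainly shows that the five OR-branches are independent: a record is retrieved as soon as any one branch is fully satisfied.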

Study selection
Studies reporting data on classification systems of trimalleolar ankle fractures were screened for the use of a PMF classification. Inclusion and exclusion criteria were cross-checked by three reviewers (HW, JT, EM), first by screening the title and abstract, and second by reading the full text. Clinical studies were included for data extraction. Cadaveric studies, review articles, case reports with fewer than 10 cases, studies that did not include a posterior malleolus-specific classification, and studies not written in English were excluded.

Study quality assessment
The methodological quality of the included studies was quantified using a modified Coleman score [35]. The modified Coleman score was applied by two independent reviewers (HW, JT) (Online Resource 1). The score is composed of two parts. Part A assesses study size, average follow-up time, percentage of patients with follow-up, number of interventions, study type, diagnostic certainty, description of surgical method, and postoperative rehabilitation. Part B comprises outcome criteria, the procedure for assessing outcomes, and the description of the subject selection process. The maximum achievable score is 100 points.

Statistical analysis
The data were processed descriptively; therefore, no meta-analysis was performed.

Included studies
Evaluation of the databases revealed 3,377 studies potentially relevant for inclusion. After excluding duplicates, the titles and abstracts of the remaining studies were assessed, and 380 studies were eligible for full-text analysis. After applying the exclusion criteria (no clinical study, case reports < 10 patients, no classification/no PMF-specific classification), the remaining 110 relevant studies were included in this review. The selection process was performed according to the "Preferred Reporting Items for Systematic Reviews and Meta-Analyses" (PRISMA) and is shown in Fig. 1 [33].

PMF classification according to fracture size
Sixty-six studies that used the size of the PMF in relation to the joint surface as a classification could be included. Of these, 35 studies used radiographs and 30 studies used CT to estimate size; one study did not provide a clear statement in this regard. Only one study examined inter- and intraobserver reliability, measuring substantial kappa values of 0.64 and 0.63, respectively [13]. The majority of these studies used either a cut-off value of 25% for fixation of the PMF (26 studies) or fixed the posterior malleolus regardless of size (28 studies). The remaining studies used either 20% (4 studies), 30% (2 studies), or > 1/3 of the joint area (5 studies) as the cut-off value; 1 study fixed the PMF from 10% in young patients or in the presence of subluxation, and 3 studies did not provide any information (Table 3). Nine studies reported a better outcome with reduction of smaller posterior malleolus fragments [4,6,7,46–51], whereas seven studies reported no difference between fixation and no fixation of smaller posterior malleolus fragments [52–58].

Haraguchi classification
The first CT-based classification found was developed in 2006 by Haraguchi et al., who classified PMF into 3 distinct types [14]. Type I is described as a posterolateral-oblique wedge-shaped fragment involving the posterolateral corner of the tibial plafond, type II as a transverse medial-extension fracture line extending from the fibular notch to the medial malleolus, and type III as a small-shell type fragment at the posterior lip of the tibial plafond (Fig. 2). So far, Haraguchi's classification has been mentioned in 101 studies and was applied in 44 of them, which were, therefore, included and can be seen in Table 4. Three studies reported on the reliability of the classification, all showing substantial interobserver reliability (Fleiss kappa 0.70/Cohen's kappa 0.799/Cohen's kappa 0.797) and substantial to almost perfect intraobserver reliability (Fleiss kappa 0.77/Cohen's kappa 0.985) [24,32,59]. Modifications of the Haraguchi classification were found three times. Kumar … malleolar fracture was then categorized based on the fragment's location into "postero-lateral", "postero-medial" and "postero-central" [62]. In terms of predictive values, type II fractures were reported to show worse outcomes [19,59,63], to have a higher prevalence of osteoarthritis [59], and to be more likely to require placement of 2 syndesmotic screws [41].

Bartoníček/Rammelt classification
Another CT-based classification was presented by Bartoníček/Rammelt in 2015 [29]. Five different fracture types were defined: type 1 as an extraincisural fragment with an intact fibular notch, type 2 as a posterolateral fragment including the fibular notch, type 3 as a posteromedial two-part fragment extending to the medial malleolus, type 4 as a posterolateral fragment larger than one-third of the notch, and type 5 as irregular osteoporotic fragments (Fig. 3). It also includes a treatment algorithm. The Bartoníček/Rammelt classification has been found 46 times in the literature; of these, 21 studies used it as a classification system, were included in this study, and are shown in Table 5. There is one modification made by Tucek et al., who divided Bartoníček type 4 into three subtypes: subtype 1 as a fracture line that passes laterally past the malleolar groove, subtype 2 as a fracture line that involves the malleolar groove, and subtype 3 as an intercollicular fracture line or a line involving the posterior colliculus [66]. Two studies reported the reliability of the classification, both showing substantial interobserver reliability (Fleiss kappa 0.78/Cohen's kappa 0.744) and almost perfect intraobserver reliability (Fleiss kappa 0.81/Cohen's kappa 0.936) [24,32]. Regarding the predictive outcome value, type 1 fractures were shown to have a better outcome than type 2 fractures [65], and a significantly improved clinical outcome was achieved in type 4 fractures when they were surgically fixed [54]. With increasing fracture type, clinical outcome became worse [1,21,63].

Mason classification
In 2017, Mason et al. developed a CT-based classification of PMF ascending in severity of injury [30]. Mason described type 1 as an extra-articular avulsion fracture following a rotational force applied to the foot when the ankle is in plantarflexion and the talus unloaded. Rotational forces applied to a loaded foot result in a type 2A fracture in the form of a primary triangular posterolateral fragment. A type 2B fracture with a secondary posteromedial fragment, usually angled at 45° to the primary fragment, occurs when the talus continues to rotate in the mortise. A type 3 fracture is characterized by a coronal fracture line that involves the entire posterior plafond due to axial loading of a plantarflexed talus (Fig. 4). To date, Mason's classification has been mentioned 22 times in the literature and used for classification in 12 studies, which were included and can be found in Table 6. One modification of the Mason type 2B fracture was found: Vosoughi et al. divided it into a large intra-articular pilon fragment and a small extra-articular fragment [67]. Interobserver reliability ranged from substantial to almost perfect values (Cohen's kappa 0.919/Fleiss kappa 0.61/Cohen's kappa 0.717), as did intraobserver reliability (Fleiss kappa 0.65/Cohen's kappa 0.957) [24,30,32]. As for the predictive outcome value, type 3 fractures tend to show a worse postoperative outcome [68].
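The verbal labels attached to the kappa values in the preceding sections ("substantial", "almost perfect") match the widely used Landis and Koch bands. The sketch below maps a kappa value to such a label; it assumes these bands are the ones intended, which the text does not state explicitly, and the function name `interpret_kappa` is purely illustrative.

```python
def interpret_kappa(kappa: float) -> str:
    """Map a kappa coefficient to a Landis-and-Koch-style label.

    Assumption: the conventional bands (<0 poor, 0-0.20 slight,
    0.21-0.40 fair, 0.41-0.60 moderate, 0.61-0.80 substantial,
    0.81-1.00 almost perfect) are the ones used in the review.
    """
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0:
        return "poor"
    # Upper bound of each band, checked in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"  # not reached; kept for completeness


if __name__ == "__main__":
    # Interobserver values reported above for the Mason classification.
    for value in (0.919, 0.61, 0.717):
        print(value, interpret_kappa(value))
```

Under this assumption, the lowest value reported for any classification (0.61) still falls just inside the "substantial" band.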

Quality assessment of included studies
The Coleman score achieved a total median value of 43.5 points (14-79), composed of Part A with a median of 26 points, and Part B with 18 points. Based on the number of patients included, the weighted median total Coleman score was 42.5. Coleman score points are shown in Table 1.
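How a patient-weighted median differs from a plain median can be illustrated with a short sketch. The exact definition used for the weighted median is not given in the text; the version below assumes the common "lower weighted median" (the smallest score at which the cumulative patient count reaches half of all patients), and the scores and cohort sizes in the usage example are invented for illustration only.

```python
def weighted_median(scores, weights):
    """Return the lower weighted median of `scores`.

    Assumption: each study's score is weighted by its number of
    patients, and the result is the smallest score whose cumulative
    weight reaches at least half of the total weight.
    """
    if len(scores) != len(weights) or not scores:
        raise ValueError("scores and weights must be non-empty and equal in length")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")

    half_total = sum(weights) / 2
    cumulative = 0
    for score, weight in sorted(zip(scores, weights)):
        cumulative += weight
        if cumulative >= half_total:
            return score
    raise RuntimeError("unreachable for valid input")


if __name__ == "__main__":
    # Hypothetical Coleman scores and cohort sizes, not data from the review.
    example_scores = [40, 50, 60]
    example_patients = [10, 30, 5]
    print(weighted_median(example_scores, example_patients))
```

With weighting by cohort size, large studies pull the median towards their own score, which is why the weighted and unweighted medians can differ.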

Discussion
By reviewing the literature, 4 classifications were found describing PMF: a classification based on the fragment proportion in relation to the distal tibial joint surface [45] and the three CT-based classifications according to Haraguchi, Bartoníček/Rammelt, and Mason [14,29,30]. The earliest and most commonly used classification was the PMF classification according to fracture size as first specified by Nelson and Jensen, who recommended treatment of PMF with a fragment size exceeding 1/3 of the articular surface on lateral radiographs, based on a study sample of 8 patients [45]. With 66 included studies, this classification accounts for the largest proportion of classifications used by surgeons in clinical practice. In the included studies, the most used cut-off value was 25%, but values of 20%, 30% or 1/3 of the articular surface were also used. Opinions on osteosynthetic treatment of PMF remain controversial [69]: McDaniel and Wilson demonstrated that if a PMF of less than 25% of the tibial joint area was not reduced, it did not significantly affect the overall outcome [58]. De Vries et al. and Xu et al. found no evidence for fixing PMF smaller than 25%, as outcome scoring systems showed no significantly better outcome [52,53], as did Guo et al. for PMF in tibial spiral fractures [54]. When the outcome of fixing PMF of less than 25% was compared with that of not fixing it, no significant difference in the AOFAS score was found [55–57]. On the other hand, a trend toward better clinical and radiological outcome was observed in patients in whom the PMF was fixed and, therefore, authors recommend fixation of even smaller fragments that cannot be satisfactorily reduced by ligamentotaxis [6,46,47,49,50]. Baumbach et al. and Tosun et al. even postulated that in PMF of all sizes, syndesmotic stability is significantly more likely to be restored if treated by open reduction and internal fixation [48,51].
In relation to the total number of studies using this classification, the number of studies reporting predictive outcome values is rather limited. The available evidence on inter- and intraobserver reliability is also meager: Büchler et al. were the only ones to study this, reporting good results with inter- and intraobserver kappa values of 0.64 and 0.63, respectively [13]. Of all studies that assess the PMF classification according to fracture size, all but two [6,49] are of retrospective design. Especially in the earlier studies, the evaluation of the fracture was not optimal, since it was done mainly on the basis of lateral radiographs.
Radiographs were found to be of limited use for accurate size estimation of PMF [12,14,18,70]; therefore, computed tomography (CT) has recently been used increasingly in the diagnosis of trimalleolar ankle fractures [18,47,54]. Subsequently, the conviction has grown that fracture morphology, rather than size, is crucial for the improvement of outcome [19]. Factors such as syndesmotic stability, joint congruity, postoperative step-off, reconstruction of the incisura, intercalary fragments and talar subluxation are thought to be of prognostic importance when treating PMF [7,23,48,50,51,53,58,63,71–76]. Hence, a paradigm shift has occurred [21,24,31,77], as the systematic review by Odak et al. has also previously shown [22]. This is where the three CT-based classifications come to the fore. The classification used in the majority of studies is the one proposed by Haraguchi [14], most probably because it was the first CT-based classification and because of its simple and clear structure dividing the fracture into three types. Since 2015, however, a preference for the Bartoníček/Rammelt classification has emerged, with the main strengths of this classification being its ascending severity and the derived therapy recommendations [29]. After noting that the Haraguchi classification did not map the mechanism of injury, Mason developed the most recent classification, which also considers the injury mechanism [30]. Some objections against Haraguchi's classification have arisen over time. First, the classification is not based on severity and thus does not relate to functional outcome [78].
Second, the classification was based only on axial images, so that fractures were assessed in only one plane and vertical extent was not estimated [31]; medial injuries were not evaluated, which may lead to misjudgments [17,32]; and the extent of involvement of the tibial incisura was not specified, so that type I fractures include a wide range of both small and large posterolateral fragments [59]. Most multi-fragmentary fractures cannot be defined using this classification [79]. Also, the three modifications found [61,62,80] may suggest that Haraguchi's classification is not sufficiently comprehensive to represent all fracture types. Regarding the predictive value of the classification in terms of postoperative outcomes, some authors have shown that type II fractures have worse clinical outcomes [19,59,63], whereas Mertens et al. observed an improvement in the AOFAS score from type I to type III [65].
The Bartoníček/Rammelt classification was developed on the basis of a larger patient population. It ascends in severity and contains a therapy recommendation [29,81]. Zhang et al. were able to show that the Bartoníček/Rammelt classification is also applicable to distal tibial spiral fractures with associated PMF [82]. One objection is the imprecise definition of type 5 fractures, which includes all fractures that cannot be classified as type 1-4. We were not able to find an image of such a type 5 fracture, neither in the original article nor in our own fracture database. Another objection is the difficulty of estimating 1/3 of the tibial incisura to distinguish between a type 2 and a type 4 fracture [32]. Most authors agree that outcome worsens with increasing fracture type [1,63,65]. Only Neumann et al. saw an increase in the AOFAS score and no difference in the Olerud and Molander ankle score (OMAS) [21].
The authors of the Mason classification see its advantage in the ascending degree of severity, which takes the injury mechanism into account. They have also introduced treatment recommendations based on their classification, and Gandham et al. even made a recommendation on the appropriate operative approaches [30,68,83]. However, they described the classification using schematic drawings and did not define the tibial incisura [32]. In … [24,32]. However, none of the classifications can adequately describe the complexity of posterior malleolus fractures, as factors such as the extent of articular surface impaction, the degree of dislocation or intercalary fragments, among others, are not taken into account [32,79].
Several important classifications were excluded because they are not PMF-specific. These include the AO classification, originally published in 1987 by Müller/AO, which is a universal classification covering all skeletal injuries. It is a valuable international classification that has its justification and has been used for years [84,85]. With the routine use of CT imaging to reliably diagnose and classify trimalleolar fractures [9], authors have shown that all fractures involve the articular surface of the distal tibia [14,29,81]. This is in contrast to Heim's specification of the AO classification, which divides posterior malleolar fractures into extra- and intra-articular fractures [86]. The AO classification, based on standard plain radiographs, is therefore not suitable for considering the significance and morphology of PMF, nor is it applicable in addressing specific questions regarding PMF [24,48,87].
Classification systems of posterior pilon fractures were also considered to be non-PMF-specific. The differentiation of pilon fractures from trimalleolar ankle fractures still often causes difficulties in clinical practice [75,88,89]. This has led to the emergence of a subset of PMF, also known as the "posterior pilon" variant, which has recently gained popularity [61,87,90–93]. However, there is still no clear definition, and the understanding of it varies [75,81,94,95]. In addition, there are studies showing that posterior pilon fractures are a separate entity due to morphological differences [61,94].
Other excluded classifications addressed sub-entities of PMF, for example a classification of PMF in tibial shaft fractures (TSF) [96,97] and one also involving talar subluxation [98].
Several limitations are worth noting, chiefly the limited quality of the included studies. Limitations affecting the Coleman score include the predominantly retrospective nature of the included studies and small patient cohorts. Therefore, the results of this study could only be presented in a descriptive manner. Only studies written in English were considered, which excluded potentially useful contributions written in other languages.
In conclusion, this review demonstrates that there has been a shift from use of the PMF classification by fracture size to the newer CT-based classifications; however, none has been able to establish itself in the literature so far. Summarizing all of the previously described points, we believe that, to date, no classification is able to adequately describe the complexity of the PMF. Also, the classifications are weak in terms of a derivable treatment algorithm or prognosis of outcome. According to this review, the Bartoníček/Rammelt classification has the most potential to prevail in the literature and in clinical practice due to its treatment algorithm and its reliability in combination with consistent predictive outcome values.
Funding Open Access funding enabled and organized by Projekt DEAL. No funding was received for conducting this study.

Declarations
Conflict of interest All authors declare that they have no conflict of interest.
Ethical approval The study was approved by the local ethics committee of the medical board in Hamburg, Germany (WF-093/21).
Informed consent This study did not involve human participants.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.