Introduction

Midfacial trauma is a frequent cause for presentation at the emergency department [1,2,3]. The epidemiology of midfacial fractures varies depending on the population studied and may be the result of cultural, social, and environmental differences [4,5,6]. Leading causes include activities of daily living, sports, assault, and traffic-related accidents [4, 6]. Knowledge of these epidemiological properties may help the emergency physician to deliver more accurate care to patients [5]. The assessment of midfacial trauma can be particularly challenging in a coexisting multi-trauma setting [5, 7,8,9]. Moreover, midfacial fractures present with varying degrees of severity, ranging from common non-dislocated nasal fractures to gross comminution in Le Fort type fractures, in which patients require immediate airway control due to midface instability and oropharyngeal obstruction [10,11,12]. Upon entering the emergency department, each trauma patient is assessed by the principles of Advanced Trauma Life Support (ATLS) to resuscitate and identify all the potential injuries, including fractures in the midfacial region [11,12,13].

The anatomy of the midface is known for its complexity [14]. The midfacial skeleton is often conceptualized as a framework of buttresses that are responsible for the width and height of the facial profile and establish functional support for the dental arch and globe [14,15,16]. As a consequence, the midface is particularly known for its specific physical examination findings. Zygomaticomaxillary complex fractures, for example, are associated with sensory disturbances due to compression of the infra-orbital nerve [17,18,19]. Also, orbital floor fractures are known to cause entrapment of the inferior rectus muscle, leading to upward gaze limitations and diplopia [20]. In addition, the broad range of potential fracture patterns, including frontal sinus, maxillary sinus, nasal bone, nasoorbitoethmoid complex, Le Fort type I, II, and III, and maxillary dentoalveolar complex fractures, can complicate the physical examination [6, 21]. Understanding these fracture patterns is necessary as they are related to particular physical examination findings which are used to guide the need for radiological imaging.

Computed tomography (CT) and cone beam computed tomography (CBCT) are considered the gold-standard imaging modalities for the diagnosis of midfacial fractures [2, 5, 22,23,24,25,26,27]. The scanners produce volume datasets with submillimetre-sized voxels in all dimensions [22, 28]. The image data can be used for orthogonal plane reconstruction and three-dimensional volume rendering [29,30,31,32]. Both scanning systems are associated with risks related to exposure to ionizing radiation [25, 29, 33,34,35,36,37], which is of concern because of the exponential increase in the use of these systems over the last few decades. The estimated effective radiation dose of scan protocols for midface trauma is considered to be 0.9 to 3.6 mSv [25, 36, 38]. The effective dose of a CBCT is known to be lower, ranging from 0.08 to 0.21 mSv on average, depending on the field of view that is used [34]. However, the effective dose of both a CT and CBCT can vary significantly based on a multitude of factors such as the system type, scan range, size of the patient, and scan protocol parameters [25, 34, 36, 39]. Hence, there is interest in investigating whether physical examination can be used to diagnose a fracture so as to reduce unnecessary imaging, health care costs, and exposure to ionizing radiation [40, 41].

Although oral and maxillofacial surgeons are specifically trained to assess maxillofacial trauma patients, the initial diagnostic management is mostly performed by emergency physicians and specialized trauma surgeons [1, 5]. An awareness of how physical examination findings can predict midfacial fractures would enable adequate stratification of patients requiring radiological imaging. To date, no systematic review has been published on this topic. The aim of this systematic review and meta-analysis, thus, was to assess the diagnostic accuracy of physical examination findings and related clinical decision aids, in comparison to CT and CBCT, for the diagnosis of midfacial fractures.

Material and methods

Protocol

This systematic review was conducted following the recommendations of the Cochrane Handbook for Systematic Reviews of Interventions and reported according to the Preferred Reporting Items for a Systematic Review and Meta-Analysis of Diagnostic Test Accuracy Studies (PRISMA-DTA) [42, 43]. The study protocol was registered in the international prospective register of systematic reviews (PROSPERO, registration number CRD210040).

Search strategy

An initial literature search was conducted on March 11, 2020, and updated on March 23, 2021, using the electronic databases of MEDLINE, EMBASE, CINAHL, and Cochrane Controlled Trial Register. Relevant search terms regarding midfacial fractures, physical examination findings, and their diagnostic accuracy were used and matched to relevant MeSH (MEDLINE, Cochrane) and EMTREE (EMBASE) terms, and to free text words according to the syntax rules of each database (Supplementary material S1). The search strategy was conducted in collaboration with a medical information specialist. In addition, the references of the included studies were screened.

Study eligibility

The results of the literature search were imported into an EndNote X9.2 software environment (Clarivate Analytics, Philadelphia, Pennsylvania, USA) and duplicates were removed. The research question was defined using the PICOS format and, subsequently, the inclusion and exclusion criteria were determined (Table 1). The publications were assessed for eligibility in two rounds. In the first round, two reviewers (RR and MD) independently assessed the titles and abstracts according to the inclusion and exclusion criteria. The publications were allocated as “included” or “excluded” and in case of an indecisive verdict, publications were included for full text assessment. Publications selected for full text selection were independently assessed by the same two reviewers for final inclusion using the same selection criteria. After each selection round, discrepancies between the two reviewers were resolved in a consensus meeting. A third reviewer (BvM) was consulted to give a final judgement on any persisting disagreement. The interobserver agreement was calculated as the percentage of agreement, Cohen’s κ coefficient and Gwet’s AC1 statistic [44,45,46].
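
The three agreement statistics named above can be sketched as follows for two reviewers making a binary include/exclude decision. This is a minimal illustration only: the function name and the example counts are hypothetical and are not taken from this review, and the review itself computed these statistics in SPSS.

```python
def agreement_stats(both_include, only_a, only_b, both_exclude):
    """Return (percent agreement, Cohen's kappa, Gwet's AC1) for two raters.

    Arguments are the four cells of the reviewer-by-reviewer table
    (hypothetical layout: counts of publications).
    """
    n = both_include + only_a + only_b + both_exclude
    p_obs = (both_include + both_exclude) / n  # observed agreement

    # Proportion of publications each reviewer marked as "included"
    p_a = (both_include + only_a) / n
    p_b = (both_include + only_b) / n

    # Cohen's kappa: chance agreement from the product of the marginals
    # (undefined if chance agreement equals 1)
    p_e_kappa = p_a * p_b + (1 - p_a) * (1 - p_b)
    kappa = (p_obs - p_e_kappa) / (1 - p_e_kappa)

    # Gwet's AC1 (two raters, two categories): chance agreement 2*pi*(1-pi),
    # where pi is the mean "included" proportion of the two reviewers
    pi = (p_a + p_b) / 2
    p_e_ac1 = 2 * pi * (1 - pi)
    ac1 = (p_obs - p_e_ac1) / (1 - p_e_ac1)

    return p_obs, kappa, ac1


# Hypothetical screening round in which almost everything is excluded
p_obs, kappa, ac1 = agreement_stats(10, 5, 5, 980)
```

When one category dominates, as is typical in title and abstract screening, Cohen's κ can be modest despite a very high percentage of agreement, whereas Gwet's AC1 tends to stay close to the observed agreement. This is one reason to report the statistics side by side.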

Table 1 Inclusion and exclusion criteria

Risk of bias assessment

The risk of bias of all the included studies was independently assessed by the same two reviewers using the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool [47]. This tool consists of four key domains covering patient selection, index test, reference standard, and flow and timing each including signaling questions focusing on the judgment of bias and concerns regarding applicability. A version applicable to this review is provided in Supplementary material S2. Disagreements were resolved through discussion.

Data collection

Data were extracted using a pre-defined standardized form including the year of publication, study design, study set-up, single-center or multi-center study design, trauma center level according to the American College of Surgeons classification [48], the study's patient population, patient demographics, level of consciousness according to the Glasgow Coma Scale (GCS), the reference test used, fracture prevalence, the type of fracture outcome, reported physical examination findings (i.e., any finding related to the visual appearance of the patient, outcomes of the nasal and ocular assessment, intra-oral examination, sensory disturbances, and palpation of the midface), and any proposed clinical decision aids developed from a combination of the reported physical examination findings. Only those physical examination findings that were specifically related to the midfacial region were collected. Two-by-two tables were constructed. If insufficient data were reported to produce two-by-two tables, backward calculations were performed using the provided sensitivity, specificity, pre-test probability, positive predictive value, negative predictive value, positive likelihood ratio, and negative likelihood ratio with the corresponding 95% confidence intervals [49]. In case of missing data or inconsistencies in the calculations, the authors of the included studies were contacted with a minimum of two email attempts.
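
A backward calculation of this kind can be sketched as below for the simplest case, in which the sample size, pre-test probability (prevalence), sensitivity, and specificity are reported. The function name and the example values are hypothetical; the exact procedure used in this review follows reference [49] and may differ in detail.

```python
def reconstruct_two_by_two(n_total, prevalence, sensitivity, specificity):
    """Rebuild (TP, FP, FN, TN) counts from reported summary measures.

    Rounding is needed because published values are rounded; the result
    should be checked against any other reported measure (e.g. PPV, NPV).
    """
    with_fracture = round(n_total * prevalence)
    without_fracture = n_total - with_fracture

    tp = round(with_fracture * sensitivity)     # finding present, fracture
    fn = with_fracture - tp                     # finding absent, fracture
    tn = round(without_fracture * specificity)  # finding absent, no fracture
    fp = without_fracture - tn                  # finding present, no fracture
    return tp, fp, fn, tn


# Hypothetical study: 200 patients, 40% fracture prevalence
tp, fp, fn, tn = reconstruct_two_by_two(200, 0.40, 0.25, 0.95)
```

Because the inputs are rounded, a reconstructed table should be treated as an approximation and, where possible, confirmed with the study authors.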

Statistical analysis

Interobserver agreement was calculated using the Statistical Package for the Social Sciences version 23 (SPSS, IBM Corp., Armonk, New York, USA). A meta-analysis was performed to calculate the pooled sensitivity, specificity, and diagnostic odds ratio for all the physical examination findings that were reported more than once for the same fracture outcome, using the R package for Meta-Analysis of Diagnostic Accuracy (MADA version 0.5.10, R Foundation for Statistical Computing, Vienna, Austria) [50]. Physical examination findings were only combined if the reported phraseology was plausibly about the same finding (e.g., infra-orbital nerve hypoesthesia and reduced sensation in the maxillary division of the trigeminal nerve). Regarding the diagnostic odds ratio calculations, 0.5 was added to all the cells of the contingency table in case of a zero cell count [51]. Testing for publication bias was performed using Deeks’ funnel plot asymmetry test by a regression of the diagnostic log odds ratio against the inverse of the square root of the effective sample size [52, 53]. The statistical significance of the slope coefficient was defined as a p-value < 0.05. A meta-regression analysis was undertaken if more than ten studies reported physical examination findings with the same outcome.
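
The zero-cell correction described above can be illustrated for a single study as follows. This is a sketch under stated assumptions, not the pooling procedure of the meta-analysis: the function name and example counts are hypothetical, and the 95% confidence interval uses the standard large-sample approximation on the log scale.

```python
import math


def diagnostic_odds_ratio(tp, fp, fn, tn):
    """Return (DOR, lower 95% limit, upper 95% limit) for one 2x2 table.

    If any cell is zero, 0.5 is added to all four cells before the
    calculation, as described in the text.
    """
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (cell + 0.5 for cell in (tp, fp, fn, tn))

    dor = (tp * tn) / (fp * fn)
    # Standard error of ln(DOR): square root of the sum of reciprocal cells
    se_log = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    lower = math.exp(math.log(dor) - 1.96 * se_log)
    upper = math.exp(math.log(dor) + 1.96 * se_log)
    return dor, lower, upper


# Hypothetical tables: one without and one with a zero cell
dor_plain, _, _ = diagnostic_odds_ratio(20, 6, 60, 114)
dor_corrected, _, _ = diagnostic_odds_ratio(10, 0, 30, 60)
```

Without the correction, a zero in the false-positive or false-negative cell would make the odds ratio undefined, so the affected study could not contribute to the pooled estimate.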

Results

Study identification and selection

The initial and updated literature search identified a total of 3171 publications (Fig. 1). After removing the duplicates, 2367 publications were screened by title and abstract. The percentage of agreement, kappa, and Gwet’s AC1 statistic were 98%, 0.55, and 0.98, respectively. A remaining total of 32 publications was eligible for full text screening. Twenty articles were excluded because they did not fulfil the inclusion or exclusion criteria (Supplementary material S3). The percentage of agreement, kappa, and Gwet’s AC1 statistic of the full text selections were 97%, 0.93, and 0.94, respectively. After the second round, a total of 12 publications were finally included for both qualitative and quantitative syntheses. It was not necessary to consult the third reviewer for a consensus.

Fig. 1

Flowchart of the study identification and selection process

Methodological quality

Figure 2 presents the quality assessment of the included studies according to the QUADAS-2 tool. High risk of bias in patient selection was detected in three studies (25%). Unclear risk of bias was found for the “index test” (75%), “reference standard” (50%), and “flow and timing” (75%) domains of the majority of the studies. Additionally, high concerns regarding applicability were found for “patient selection” in five studies (41.7%) and “reference standard” in eleven studies (91.7%), whereas the applicability of the “index test” was unclear for most of the studies (75%).

Fig. 2

Risk of bias assessment

Study characteristics

The included publications consisted of eight retrospective studies, three prospective studies, and one case control study (Table 2). All 12 studies included emergency department patients; eleven studies investigated patients from a single center and one study had patients from two centers. Among the single-center studies, eight studies included patients from level I trauma centers, two studies included patients from level II trauma centers and one study included patients from a level III center. The two-center study included patients from both a level I and II trauma center.

Table 2 Study characteristics

Patient characteristics

The number of patients in the studies ranged from 47 to 2262, resulting in a total of 9017 patients of whom 6007 were male and 3010 female. The reported mean age was 37.1 years, and the reported median age ranged from 28 to 50. The study population included midfacial trauma patients (n = 1) [58], maxillofacial trauma patients (n = 4) [56, 60, 61, 64], orbital trauma patients (n = 2) [57, 62], head and orbital trauma patients (n = 3) [54, 55, 65], minor head injury patients with a black eye (n = 1) [59], and traumatic brain injury patients with facial trauma (n = 1) [63]. All the studies had used CT as a reference test and thus no studies were included where CBCT was used as a reference test. Any midfacial fracture was used as an outcome by one study [58], whereas any midfacial or mandibular fracture was used as an outcome by seven studies [54, 56, 59,60,61, 63, 64], and orbital fracture was used as an outcome by four studies [55, 57, 62, 65]. In one study, midfacial and mandibular fracture outcomes were stratified as frontal sinus, zygoma, orbital floor, naso-ethmoidal, nasal, maxilla, and mandibular fractures [61]. The fracture prevalence ranged from 13.8 to 91.2%, resulting in an average of 41.2%.

Physical examination findings

A total of 42 distinct physical examination findings were identified and categorized into 5 distinct groups: visual appearance, nasal assessment, ocular assessment, intra-oral assessment, and findings related to functional and palpation assessment. The diagnostic accuracy of each individual physical examination finding is presented in Table 3. For 30 findings, the diagnostic accuracy was reported in more than one study. Meta-analysis was feasible for a total of 31 physical examination findings (Fig. 3).

Table 3 Diagnostic accuracy of individual physical examination findings
Fig. 3

a Forest plots showing study-specific and pooled specificity, sensitivity, and diagnostic odds ratio of the physical examination findings related to visual appearance for (a) swelling, (b) peri-orbital swelling or hematoma, (c) hematoma, (d) forehead hematoma, (e) peri-orbital hematoma, (f) nasal hematoma, (g) malar hematoma, (h) laceration, (i) forehead laceration, (j) peri-orbital laceration, (k) nasal laceration, (l) malar laceration, (m) peri-oral laceration, (n) asymmetry in diagnosing midfacial fractures. b Forest plots showing study-specific and pooled specificity, sensitivity, and diagnostic odds ratio of the physical examination findings related to nasal assessment for (a) epistaxis in diagnosing midfacial fractures. c Forest plots showing study-specific and pooled specificity, sensitivity, and diagnostic odds ratio of the physical examination findings related to ocular assessment for (a) subconjunctival hemorrhage, (b) diplopia, (c) extra-ocular movement limitation, and (d) visual acuity change in diagnosing midfacial fractures. d Forest plots showing study-specific and pooled specificity, sensitivity, and diagnostic odds ratio of the physical examination findings related to intra-oral assessment for (a) intra-oral laceration, (b) tooth avulsion, and (c) malocclusion in diagnosing midfacial fractures. e Forest plots showing study-specific and pooled specificity, sensitivity, and diagnostic odds ratio of the physical examination findings related to functional and palpation assessment for (a) facial pain, (b) infra-orbital nerve paresthesia, (c) palpable step-off, and (d) open fracture in diagnosing midfacial fractures

Findings related to visual appearance

A total of 24 distinct physical examination findings were identified as being related to the visual appearance of the patient and were reported 52 times in the included studies [54,55,56,57,58, 60,61,62,63,64,65]. The outcomes of the findings were any midfacial or mandibular fracture (n = 40), any midfacial fracture (n = 2), any orbital fracture (n = 7), orbital floor fracture (n = 1), and zygoma fracture (n = 2). The identified findings included swelling, hematoma, laceration, asymmetry, globe position change, and malar eminence flattening. Regarding swelling, hematoma, and laceration, the diagnostic accuracy was also reported for specific regions of the midfacial skin. For swelling, diagnostic accuracy was also reported specifically for the peri-orbital region [56, 57, 60, 64]. The region-specific findings for hematoma included the forehead [56, 60], peri-orbital region [54, 56,57,58, 60, 62, 65], eyelid [54, 55], nasal region [56, 60], malar region [56, 60], and the facial or scalp region [54]. For laceration, region-specific findings included the forehead [56, 58, 60, 63], peri-orbital region [56, 57, 60], eyebrow [54], eyelid [54], conjunctiva [54], nasal region [54, 56, 60], malar region [56, 60], peri-oral region [56, 60], and the lip [54]. Among the physical examination findings related to swelling, hematoma, and laceration, high pooled specificity was found for eyelid hematoma, eyebrow laceration, conjunctival laceration, nasal laceration, and malar laceration, ranging from 0.19 to 0.98 (Table 3 & Fig. 3a). The diagnostic odds ratio for these physical examination findings ranged from 1.10 to 3.48. Regarding asymmetry, globe position change, and malar eminence flattening, the specificity, PPV, and LR + were found to be high.

Findings related to nasal assessment

Epistaxis was the only reported physical examination finding related to the nasal assessment and was reported in 6 studies [56,57,58,59,60, 63]. The outcomes included any midfacial or mandibular fracture (n = 4), any midfacial fracture (n = 1), and any orbital fracture (n = 1). The pooled specificity was found to be high (0.94) and the pooled sensitivity remained low (0.25). The diagnostic odds ratio was 5.43 (Table 3 & Fig. 3b).
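
As a rough illustration of what a high specificity combined with a low sensitivity implies, the sketch below derives likelihood ratios from the pooled estimates for epistaxis and converts a pre-test probability into post-test probabilities. The function names are hypothetical, the pre-test probability of 0.412 is simply the average fracture prevalence reported across the included studies, and the derived values are illustrative arithmetic rather than results of this review.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Apply a likelihood ratio to a pre-test probability via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Pooled estimates for epistaxis from the text: sensitivity 0.25, specificity 0.94
lr_pos, lr_neg = likelihood_ratios(0.25, 0.94)

# Assumed pre-test probability: average reported fracture prevalence (41.2%)
p_if_present = post_test_probability(0.412, lr_pos)
p_if_absent = post_test_probability(0.412, lr_neg)
```

Under these assumptions, a positive finding raises the probability of a fracture considerably, whereas a negative finding lowers it only slightly. The pattern is the one expected from a finding with high specificity and low sensitivity.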

Findings related to ocular assessment

A total of 6 distinct physical examination findings were identified in relation to the ocular assessment and reported 23 times in the included studies [54, 56, 57, 59,60,61,62, 65]. The outcomes were any midfacial or mandibular fracture (n = 11), any orbital fracture (n = 10), orbital floor fracture (n = 1), and zygoma fracture (n = 1). The identified findings included subconjunctival hemorrhage [54, 56, 57, 59,60,61, 65], hyphema [57], diplopia [56, 57, 59, 60, 62, 65], extra-ocular movement limitation [56, 57, 60, 65], extra-ocular movement pain [57], and visual acuity change [56, 60, 65]. The pooled specificity of all the physical examination findings was high, ranging from 0.89 to 0.94, and the pooled sensitivity was low, ranging from 0.09 to 0.36 (Table 3 & Fig. 3c). The diagnostic odds ratio ranged from 1.79 to 3.27. Although the outcomes varied, most of the studies reported a high PPV and LR + for the findings related to the ocular assessment, with two individual studies reporting a PPV of 100 and infinite LR + for diplopia and visual acuity change [60, 65].

Findings related to the intra-oral assessment

A total of 3 distinct physical examination findings were identified as being related to the intra-oral assessment and were reported 10 times in the included studies [54, 56, 60, 63, 64]. All of these reported physical examination findings were studied using any midfacial or mandibular fracture as the outcome (n = 10). Identified findings included malocclusion [56, 60, 64], intra-oral laceration [54, 56, 60], and tooth avulsion [56, 60, 63, 64]. The pooled specificity was high, ranging from 0.92 to 0.98, and the sensitivity was low for all findings, ranging from 0.10 to 0.21 (Table 3 & Fig. 3d). The diagnostic odds ratio ranged from 3.41 to 6.64. The PPV was higher than 80.0 in almost all of the studies, with one study reporting a PPV of 100 and an infinite LR + for malocclusion and tooth avulsion [64]. The NPV was low in all studies.

Findings related to functional assessment and palpation of the midface

Regarding findings related to the functional assessment and palpation of the midface, a total of 8 distinct physical examination findings were identified that were reported 24 times in the included studies [56, 57, 59,60,61,62, 64, 65]. The outcomes were any midfacial or mandibular fracture (n = 12), any orbital fracture (n = 8), orbital floor fracture (n = 1), nasal bone fracture (n = 1), and zygoma fracture (n = 2). The identified findings included facial pain [56, 60], infra-orbital nerve paresthesia [56, 57, 59,60,61,62, 65], subcutaneous emphysema [59, 62], tenderness on palpation [57, 61], palpable step-off [56, 57, 59, 60, 64], trismus [57, 61], mandible locked open [57], and open fracture [56, 60]. The pooled specificity was high for infra-orbital nerve paresthesia, subcutaneous emphysema, palpable step-off, trismus, mandible locked open, and open fracture, ranging from 0.69 to 0.99. The pooled sensitivity remained low for the findings, ranging from 0.04 to 0.39 (Table 3 & Fig. 3e). The diagnostic odds ratio ranged from 1.39 to 11.38. A high PPV and LR + were found for infra-orbital nerve paresthesia, subcutaneous emphysema, palpable step-off, and open fracture. Individual studies reported a PPV of 100 and a corresponding infinite LR + for infra-orbital nerve paresthesia, palpable step-off, and open fracture [56, 60, 64]. A high NPV was found for tenderness on palpation. The NPV of the other physical examination findings was low.

Publication bias

Deeks’ funnel plot asymmetry tests showed significant publication bias for subconjunctival hemorrhage with any midfacial or mandibular fracture as the outcome (Supplementary material S4). The statistical significance of the publication bias could not be assessed for 15 physical examination findings because only two studies provided data.

Clinical decision aids

Clinical decision aids were reported in 8 studies (Table 4). Four studies assessed the Wisconsin criteria [56, 60, 61, 64]. The criteria were defined as the presence of any of the following: a bony step-off or instability, malocclusion, tooth absence, peri-orbital swelling or contusion, or a Glasgow Coma Scale score of less than 14, using any midfacial or mandibular fracture as an outcome [56]. The sensitivity of these criteria ranged from 80.2 to 98.2%, and the specificity ranged from 22.3 to 41.2%. Clinical decision aids specifically for orbital fractures were presented in 2 studies [55, 65]. One study focused on the need for a facial CT for head injury patients [55] and constructed a clinical decision aid that produced a sensitivity of 55.1% and a specificity of 100.0% in the presence of any of the following: blepharohematoma in one or both orbits, palpable fracture line, infra-orbital nerve hypesthesia, ocular motility disturbance, skin emphysema, enophthalmos or exophthalmos, impaired pupil reaction, or decrease in vision. Another study focused on identifying head injury patients who would benefit from including the orbits in the head CT [65]. Its clinical decision aid was based on unbounded subconjunctival hemorrhage, reduced sensation in the distribution of the infra-orbital nerve, change in the position of the globe, reduced visual acuity, or any two of the following: peri-orbital bruising, diplopia, and limited eye movement. The presence of any of these findings produced a sensitivity of 80.0% and a specificity of 75.0%. Two studies produced a clinical decision aid for orbital fractures using a risk score [57, 62]. In one study, the risk score consisted of assigning a point for orbital rim tenderness, peri-orbital emphysema, subconjunctival hemorrhage, impaired extra-ocular movement, painful extra-ocular movement, and epistaxis [57]. The other study assigned one point for male sex, etiology other than assault, peri-orbital ecchymosis, peri-orbital emphysema, infra-orbital nerve hypoesthesia, and diplopia [62]. One study introduced clinical decision aids, which were referred to as the Stony Brook University Hospital (SBUH) criteria, for orbital floor fractures, zygoma fractures, and nasal fractures [61]. The respective sensitivities and specificities were 92.0% and 75.0% for orbital floor fractures, 88.9% and 51.3% for zygoma fractures, and 87.5% and 87.8% for nasal fractures. Contingency tables of the physical examination findings and clinical decision aids are presented in Supplementary material S5.

Table 4 Reported clinical decision aids

Discussion

The assessment of midfacial and mandibular injury is characterized by particular physical examination findings. Understanding the predictive value of each finding may help emergency physicians to deliver more optimal diagnostic management. In this systematic review and meta-analysis, we synthesized the best available evidence regarding the diagnostic accuracy of the physical examination findings and the accompanying clinical decision aids. The meta-analysis provided evidence of high specificity and low sensitivity for most of the individual physical examination findings related to the visual appearance of the patient; nasal, ocular, and intra-oral assessments; and findings related to the functional assessment and palpation of the midface. This indicates that the absence of any physical examination findings can be used to successfully identify patients who do not have a midfacial fracture, whereas the presence of individual findings does not necessarily mean that patients have a midfacial fracture. Among these physical examination findings, we observed a high diagnostic odds ratio for epistaxis, tooth avulsion, malocclusion, infra-orbital nerve paresthesia, and palpable step-off, indicating that the likelihood of diagnosing a midfacial fracture is high when these findings are present during the physical examination. Also, particular findings had a high PPV and corresponding LR +. From a clinical perspective, emergency department physicians are blinded to the potential presence of a fracture during the physical examination, so these individual findings are especially useful for identifying patients at risk of a midfacial fracture, and radiological imaging should be strongly considered for these patients. The NPV and LR- remained low for almost all the physical examination findings. Hence, the individual findings were unable to identify patients with a low risk of midfacial fractures who did not require radiological imaging. However, this should be interpreted with caution due to the low number of included studies and the high degree of risk of bias and concerns regarding the applicability of most of the studies.

Clinical decision aids

It is of particular interest how a combination of physical examination findings performs as a clinical decision aid. Accordingly, the studies included in this systematic review proposed a variety of clinical decision aids using any midfacial or mandibular fracture, orbital fracture, orbital floor fracture, nasal fracture, and zygoma fracture as an outcome. The University of Wisconsin produced a clinical decision aid with sufficient diagnostic accuracy for patients suspected of midfacial or mandibular fractures [56]. However, validation of these criteria was unsuccessful in three other studies due to lower diagnostic accuracy outcomes [60, 61, 64]. The other studies focused on clinical decision aids for the identification of specific midfacial fractures, five of which were for orbital fractures [55, 57, 61, 62, 65]. The relevance of specifically studying the latter is emphasized for two reasons. First, orbital fractures are commonly found in patients presenting with a head injury and, therefore, it is often discussed whether the orbits should be included when performing a head CT [7, 55, 63]. Second, orbital fractures are associated with complications, such as entrapment of the extraocular muscles or retrobulbar hemorrhage, that require immediate surgical intervention and should therefore not be missed [15, 66,67,68,69]. Three of the five studies successfully produced a clinical decision aid with this focus, whereas the other two produced a score to stratify patients into risk categories for the presence of orbital fractures [57, 62]. One study based the risk score on physical examination findings only [57], whereas the other study also included sex and the mechanism of injury [62]. Although these scores identified the patients at high risk of fracture, the authors emphasized that further research is needed to determine a weighted cut-off. Nevertheless, patients with a high score were strongly suspected of having orbital fractures. None of these clinical decision aids were validated.

Most importantly, this systematic review did not identify a clinical decision aid that used any midfacial anatomy as an outcome. Yet, both the midface and mandible are known for their characteristic and complex anatomy, consequently each producing region-specific physical examination findings. Hence, we believe that both the midfacial and mandibular region should have a dedicated clinical decision aid, and we suspect that false positive findings might be more likely in studies where any midfacial or mandibular fracture is used as an outcome. For instance, the Wisconsin criteria score was positive for patients suffering peri-orbital hematoma while being diagnosed with a mandibular fracture. Conversely, malocclusion is considered to be a more common finding in mandibular trauma patients due to changes to the temporomandibular joints and the more prominent position of the alveolar process. Dedicating a clinical decision aid to midfacial fractures would allow it to be focused on physical examination findings related to the midfacial region, making it more easily reproducible. This is especially appreciated because a majority of midfacial trauma patients are initially assessed by emergency physicians and trauma surgeons who are not specifically trained to assess these patients.

Radiological imaging

Our systematic review did not find any studies that used CBCT as a reference test. CBCT scanners are dedicated to the oral and maxillofacial region and datasets are acquired while the system rotates around the patient [22, 33, 70]. A probable explanation is that the system can only be used on patients with isolated midfacial trauma, or patients for whom the initial management did not provide evidence of additional injuries [71]. For that reason, the availability of CBCT scanners in the emergency department is usually limited, and the systems are mostly used in outpatient clinics. A CT, on the other hand, is able to scan multiple body parts resulting in single data acquisition by transporting the patient through the gantry in synchrony with continuous data acquisition [72]. This is especially appreciated for midfacial trauma patients with concomitant cervical spine and head injuries which force the patients into a supine position [7, 73,74,75,76]. Nevertheless, both CT and CBCT have the major advantage that they overcome superimposition of structures that inevitably occurs with conventional radiography [22, 30, 32].

Quality of evidence and bias

In most of the included studies, there was an unclear risk of bias for the domains of the index test, reference standard, and flow and timing. Information regarding either the blinded interpretation of physical examination findings or the blinded interpretation of CT data was not reported in these studies. Unblinded interpretation introduces important biases; for example, emergency department staff may be more likely to record physical examination findings as present if they are already aware of a fracture diagnosed on CT. This type of bias cannot be controlled and was therefore judged as unclear in the studies. Unclear risk of bias was found for the flow and timing domain because no information was provided regarding the interval between the assessment of the physical examination findings and the CT. The accuracy of the interpretation decreases as the interval increases, and the interval should therefore be as short as possible. However, it is likely that in an emergency department setting the majority of patients are assessed within hours after the trauma, and a CT is conducted within the same time frame. High applicability concerns were found for the patient selection and reference standard domains. Regarding the selection of patients, several studies focused on head injury patients only who, one would expect, were injured more severely, therefore introducing selection bias and affecting the interpretation of the physical examination findings. Concerns regarding the applicability of the reference standard were due to the use of an outcome other than ‘any midfacial fracture’. Concerns regarding the applicability of the index test were unclear in many studies (i.e., the standardization, handling, or interpretation of the physical examination findings). It was especially unclear how the scoring of the chart review was handled by the retrospective studies, and whether the data were reported systematically. Not reporting data as an absent physical examination finding could result in bias due to false negative outcomes. Also, the included studies did not report how “not assessable” physical examination findings were handled, for instance the inability to score ocular-related findings in patients with severe peri-orbital swelling.

Strengths and limitations

The strengths of this review are the detailed literature search, the eligibility assessment of studies by two independent reviewers, the good inter-observer agreement, the structured risk of bias assessment using the QUADAS-2 tool, and the conduct and reporting of the analyses according to the Cochrane handbook and PRISMA statement. A major limitation is the interpretation of the pooled outcomes, given the low or unclear quality of the studies as well as the high concerns regarding applicability. The likely sources of these concerns were the patient selection and the fracture outcomes. Also, most of the studies were single-center trials, thereby potentially introducing geographic and demographic biases. Another limitation is that we were unable to perform a meta-regression analysis of the midfacial fracture subgroups due to the limited number of studies and data.

Implications and future research

Future research should focus on the diagnostic accuracy of the physical examination findings using ‘any midfacial fracture’ as an outcome. Particular attention should be paid to the QUADAS-2 domains where high and unclear risk of bias was observed. Studies should include a consecutive population of midfacial trauma patients, and inappropriate exclusions, such as the exclusion of multi-trauma patients, should be avoided. A standardized set of physical examination findings should be reproducible and should be assessed before the CT outcome is known. The CT datasets should be interpreted by either a board-certified radiologist or an oral and maxillofacial surgeon. Ideally, the study should be conducted as a prospective multi-center trial to avoid geographical bias. Data from a large population of midfacial fracture patients should allow for a regression analysis to study how physical examination findings can predict fracture subtypes, such as orbital or zygomaticomaxillary complex fractures. Above all, the aim of identifying relevant individual findings would be to produce a clinical decision aid that reduces the exposure of patients to unnecessary radiological imaging.

Conclusions

Based on all the currently available evidence, the present systematic review and meta-analysis identified the diagnostic accuracy of individual physical examination findings related to visual appearance, nasal and ocular assessment, intra-oral assessment, and functional and palpation assessment of midfacial fractures compared to CT. The high specificity reveals that the absence of physical examination findings can aid in identifying patients who do not have a midfacial fracture, whereas the low sensitivity is evidence that the presence of individual findings cannot be used to accurately identify patients with midfacial fractures. Although various clinical decision aids and risk scores were presented in the reviewed studies, none focused on the identification of any midfacial fracture. The results herein should be interpreted with caution due to the limited number of studies as well as the high risk of bias and the concerns regarding applicability.