Introduction

Distal forearm fractures in the pediatric population represent a significant share of musculoskeletal injuries, comprising around 30% of all fractures in children [1, 2]. The conventional diagnostic strategy predominantly involves radiography. However, this method presents notable limitations, including the potential for ionizing radiation exposure, the need for patient mobility during imaging, and challenges in identifying subtle fractures within the pediatric skeletal structure [3,4,5].

In recent years, ultrasound has emerged as a promising diagnostic tool for pediatric distal forearm fractures. Unlike conventional radiography, ultrasound offers a safer option for the pediatric population by eliminating ionizing radiation and its potential adverse effects [3, 6]. Its bedside applicability not only eliminates the need for patient transportation but also enables real-time imaging, potentially expediting the diagnostic process [7, 8]. Additionally, ultrasound presents advantages such as lower cost and higher availability [4, 9].

Recent studies have investigated the application of ultrasound in diagnosing pediatric forearm fractures, highlighting its capacity for dynamic, multi-planar visualization of both radial and ulnar cortices [10,11,12,13]. This dynamic capability effectively addresses challenges related to overlapping bone structures and provides advantages in identifying cortical disruptions and subperiosteal hematomas [7]. Moreover, the mobility of ultrasound facilitates bedside examinations, proving valuable in non-hospital settings and situations involving multiple-trauma patients [7, 8]. However, despite promising findings, the incorporation of diagnostic ultrasound into the clinical management of pediatric distal forearm fractures remains limited [4].

This study aims to conduct a meta-analysis on the diagnostic effectiveness of ultrasound in pediatric distal forearm fractures. Through a systematic review of the current literature, our objective is to offer insights into the viability of ultrasound as a diagnostic tool in the pediatric emergency setting.

Methods

This systematic review follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [14]. On October 1, 2023, a comprehensive literature search was conducted across four major databases: PubMed, Scopus, Web of Science, and Embase. Search terms were adapted for each database, combining (“radius” OR “radial” OR “ulna” OR “ulnar” OR “forearm”) AND (“fracture*” OR “displace*”) AND (“sonograph*” OR “ultrasonograph*” OR “ultrasound” OR “POCUS”) AND (“pediatric*” OR “paediatric*” OR “child*” OR “neonat*” OR “infant*” OR “toddler*” OR “preschool” OR “pre-school” OR “juvenile” OR “young adult*”). Additionally, the reference lists of the included studies were manually searched so that no relevant papers were overlooked. Each article’s title, abstract, and/or full text was reviewed. Two co-authors independently conducted this review, resolving uncertainties or ambiguities through consultation with a senior co-author. The AutoLit platform (Nested Knowledge, St. Paul, Minnesota, USA) facilitated deduplication, screening, and data extraction.
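The Boolean structure of the search string can be sketched as follows. This is only an illustration of how the four concept blocks combine; the helper name `build_query` is hypothetical, and no database-specific field tags or syntax are added because the source does not report them.

```python
# Concept blocks copied from the search strategy described above.
CONCEPT_BLOCKS = [
    ["radius", "radial", "ulna", "ulnar", "forearm"],
    ["fracture*", "displace*"],
    ["sonograph*", "ultrasonograph*", "ultrasound", "POCUS"],
    ["pediatric*", "paediatric*", "child*", "neonat*", "infant*",
     "toddler*", "preschool", "pre-school", "juvenile", "young adult*"],
]


def build_query(blocks):
    """Join terms within a block with OR, then join the blocks with AND."""
    groups = []
    for terms in blocks:
        quoted = " OR ".join(f'"{term}"' for term in terms)
        groups.append(f"({quoted})")
    return " AND ".join(groups)


query = build_query(CONCEPT_BLOCKS)
print(query)
```

Keeping the terms in a list per concept makes it easier to adapt the same strategy to each database without retyping the full string.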

All studies pertinent to the topic of interest, presenting at least one of the following diagnostic accuracy measures in pediatric patients (under 21 years old), were considered for inclusion: sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), likelihood ratio (LR), diagnostic odds ratio (DOR), and area under the receiver operating characteristic curve (AUC). No constraints were imposed on publication date, country of origin, patient characteristics, reference standard type, or study design. Non-English literature, case reports, case series with fewer than five eligible patients, conference abstracts, editorial comments, and review articles were excluded from the study.
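For readers less familiar with these measures, the sketch below shows how each one is derived from a single study's 2×2 table. The counts are hypothetical and are not taken from any included study; the class and function names are illustrative only. AUC is omitted because it is a property of the summary curve rather than of one table.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TwoByTwo:
    """Counts for one study: ultrasound (index test) vs. the reference standard."""
    tp: int  # fracture present, ultrasound positive
    fp: int  # fracture absent, ultrasound positive
    fn: int  # fracture present, ultrasound negative
    tn: int  # fracture absent, ultrasound negative


def accuracy_measures(t: TwoByTwo) -> Dict[str, float]:
    """Compute the standard diagnostic accuracy measures from a 2x2 table."""
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        # Algebraically equal to (tp * tn) / (fp * fn).
        "dor": lr_pos / lr_neg,
    }


# Hypothetical counts, for illustration only.
measures = accuracy_measures(TwoByTwo(tp=90, fp=5, fn=10, tn=95))
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Note that PPV and NPV depend on the fracture prevalence in the study sample, whereas sensitivity, specificity, and the likelihood ratios do not.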

The Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool was applied to evaluate the quality of included studies [15]. The four primary domains of the QUADAS-2 tool (patient selection, index test, reference standard, and flow and timing) were independently assessed for risk of bias and concerns regarding applicability. Evaluations for each domain were based on specific criteria outlined in the tool, such as the representativeness of the study population, blinding of test results, and completeness of outcome data. Ratings of “low,” “high,” or “unclear” were assigned to each domain to determine the overall rigor and reliability of the evidence synthesis.

Statistical analysis

The primary analytical approach utilized a random effects diagnostic test accuracy (DTA) model, specifically the bivariate model developed by Reitsma et al. [16]. Summary receiver operating characteristic (SROC) curves were generated using this bivariate meta-analysis data. For visualization purposes, study-specific estimates were relatively weighted in SROC plots based on the weights within the random effects univariate DOR model. AUC and its confidence interval (CI) for each subgroup were calculated using a 2000 sample bootstrapping technique based on the bivariate model [17].

To assess heterogeneity, the I² metric was employed following the approach by Holling et al. [18]. Heterogeneity was considered significant when the I² confidence interval lay above 25%, in which case sensitivity analyses were performed through the DOR univariate meta-analysis to identify potential outliers and re-analyze the data without them.
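To make the univariate DOR analysis more concrete, the following is a minimal sketch under stated assumptions. It uses the classic Cochran's Q, the Higgins-Thompson I², and a DerSimonian-Laird random-effects estimate on the log DOR, which is a simplification and not the bivariate I² of Holling et al. or the exact routine in the R packages used here. The study counts and the 0.5 continuity correction are illustrative assumptions.

```python
import math

# Hypothetical (TP, FP, FN, TN) counts, for illustration only.
STUDIES = [
    (45, 3, 2, 50),
    (80, 10, 6, 104),
    (30, 1, 4, 40),
    (60, 12, 3, 75),
]


def log_dor(tp, fp, fn, tn, correction=0.5):
    """Return (log DOR, variance), adding a continuity correction to every cell."""
    a, b, c, d = (x + correction for x in (tp, fp, fn, tn))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def cochran_q_i2_tau2(effects, variances):
    """Cochran's Q, Higgins-Thompson I^2 (in %), and DerSimonian-Laird tau^2."""
    weights = [1 / v for v in variances]
    total = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = 100 * max(0.0, (q - df) / q) if q > 0 else 0.0
    scale = total - sum(w ** 2 for w in weights) / total
    tau2 = max(0.0, (q - df) / scale)
    return q, i2, tau2


def random_effects(effects, variances, tau2):
    """Pooled estimate, its standard error, and normalized study weights."""
    weights = [1 / (v + tau2) for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return pooled, math.sqrt(1 / total), [w / total for w in weights]


effects, variances = zip(*(log_dor(*s) for s in STUDIES))
q, i2, tau2 = cochran_q_i2_tau2(effects, variances)
pooled, se, rel_weights = random_effects(effects, variances, tau2)
low, high = pooled - 1.96 * se, pooled + 1.96 * se
print(f"Q = {q:.2f}, I^2 = {i2:.1f}%, tau^2 = {tau2:.3f}")
print(f"Pooled DOR = {math.exp(pooled):.1f} "
      f"(95% CI {math.exp(low):.1f} to {math.exp(high):.1f})")
```

The normalized weights in `rel_weights` are the kind of relative study weights that can be used to scale points in an SROC plot; a study with a large residual and a large weight is a natural candidate for the outlier check.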

The study also investigated the influence of various covariates on the reported rates using subgroup meta-analysis and meta-regression techniques. Covariates included training status and roles of the individuals involved, image acquisition methods (either four or six views), and the age of patients. Considering the variability in reporting metrics, with some studies reporting by patient number and others by the number of bones scanned, these aspects were treated as covariates for further subgroup analyses. Subgroup comparisons were made between studies reporting on a patient-wise basis vs. those reporting on a bone-wise basis. Separate subgroup analyses were also conducted for studies reporting diagnostic accuracy metrics for fractures in the radius and ulna.

Fagan plots and LR scattergrams were utilized to assess the clinical applicability of findings. Positive LRs above ten signified suitability for confirmation, whereas negative LRs below 0.1 indicated exclusion suitability. Fagan nomograms were constructed for pre-test prevalences of 25%, 50%, and 75%, based on the bivariate Reitsma model, as detailed by Zwinderman et al. [19].
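The likelihood ratio thresholds above define the four quadrants of an LR scattergram. A minimal sketch of that classification follows; the function name and the example values are hypothetical, and only the cut-offs of 10 and 0.1 come from the text.

```python
def lr_quadrant(lr_pos, lr_neg, confirm_threshold=10.0, exclude_threshold=0.1):
    """Classify a test by the scattergram quadrant its likelihood ratios fall in."""
    confirms = lr_pos > confirm_threshold   # a positive result rules the fracture in
    excludes = lr_neg < exclude_threshold   # a negative result rules the fracture out
    if confirms and excludes:
        return "exclusion and confirmation"
    if confirms:
        return "confirmation only"
    if excludes:
        return "exclusion only"
    return "neither exclusion nor confirmation"


# Hypothetical likelihood ratio pairs, for illustration only.
for pair in [(14.5, 0.06), (6.9, 0.06), (34.0, 0.22), (4.0, 0.30)]:
    print(pair, "->", lr_quadrant(*pair))
```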

Publication bias was scrutinized using an adaptation of Egger’s regression test for DTA meta-analysis, involving the analysis of funnel plot asymmetry with 2000 sample bootstrapping, as recommended by Noma et al. [20].

All statistical procedures were conducted using R (version 4.2.1, R Foundation for Statistical Computing, Vienna, Austria), utilizing packages such as “mada,” “MVPBT” [21], “dmetatools” [17], “metafor” [22], and “meta” [23].

Results

Article screening and selection process

A systematic literature search employing a predetermined strategy identified 1570 articles. Upon removing duplicates, 746 papers underwent screening based on title and abstract. This screening process resulted in the exclusion of 718 articles, comprising 38 review articles, 19 editorials, 15 conference papers, 49 non-English articles, and 597 articles deemed irrelevant to the topic of interest. The full text of the remaining 28 papers was meticulously reviewed. After a thorough examination, 14 articles were excluded because they failed to report at least one diagnostic accuracy measure for ultrasound in the evaluation of pediatric distal forearm fractures. Ultimately, 14 articles meeting the inclusion criteria were identified and incorporated. The screening process and eligibility criteria adhered to PRISMA guidelines, with a flow diagram presented in Fig. 1.

Fig. 1

PRISMA flow diagram showing the review process. PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses

Study and patient characteristics

The analysis encompassed 14 studies involving a total of 1377 patients. The predominant methodology across these studies was prospective and single center. Distal forearm fractures were diagnosed using ultrasound, with radiography as the reference standard. Diagnostic criteria for fractures on ultrasound included the identification of cortical disruptions, protrusions, deviations, or hematomas. The studies utilized a diverse range of ultrasound equipment, with a linear transducer being the most commonly employed. Additionally, a 6-view imaging protocol was predominantly implemented. The ultrasound examinations were conducted by various professionals, such as traumatology residents, pediatric emergency physicians, nurses, and radiologists, each with varying levels of training in musculoskeletal ultrasound examination. Detailed sensitivity and specificity percentages, along with additional characteristics of the included studies, can be found in Table 1.

Table 1 Characteristics of the included studies

Quality assessment

The methodological quality of the incorporated studies is detailed in Table 2 and Supplementary Fig. 1. A notable risk of bias was observed, mainly related to the diversity in ultrasound performers’ training levels. Additionally, five studies lacked explicit clarification regarding the ultrasound criteria for diagnosing forearm fractures.

Table 2 Methodological quality assessment of the included studies

Meta-analysis

In the evaluation of pediatric distal forearm fractures across 14 studies, the pooled sensitivity and specificity were 94.5% (95% CI 92.7–95.9) and 93.5% (95% CI 89.6–96.0), respectively (Fig. 2). The SROC curve demonstrated an AUC of 0.94 (95% CI 0.92–0.97) (Supplementary Fig. 2). Supplementary Fig. 3, depicting a scattergram of positive and negative likelihood ratios, suggests a high level of performance, suitable for both exclusion and confirmation. According to the Fagan plot analysis, for pre-test probabilities of 25%, 50%, and 75% for distal forearm fractures in children, the positive post-test probabilities were 83%, 94%, and 98%, while the negative post-test probabilities were 2%, 6%, and 15%, respectively (Supplementary Fig. 4).
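The post-test probabilities follow from Bayes' theorem in odds form: post-test odds equal pre-test odds multiplied by the likelihood ratio. The sketch below is a rough cross-check, assuming likelihood ratios computed directly from the pooled point estimates of sensitivity and specificity rather than from the bivariate model, so small rounding differences from the Fagan plot are possible. The function name is illustrative.

```python
def post_test_probability(pre_test, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test / (1 - pre_test)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Pooled point estimates reported above.
SENSITIVITY = 0.945
SPECIFICITY = 0.935

# Assumption: LRs derived from the point estimates, not the bivariate model.
lr_pos = SENSITIVITY / (1 - SPECIFICITY)
lr_neg = (1 - SENSITIVITY) / SPECIFICITY

results = {}
for pre in (0.25, 0.50, 0.75):
    after_positive = post_test_probability(pre, lr_pos)
    after_negative = post_test_probability(pre, lr_neg)
    results[pre] = (after_positive, after_negative)
    print(f"pre-test {pre:.0%}: positive scan {after_positive:.0%}, "
          f"negative scan {after_negative:.0%}")
```

Because the likelihood ratio is fixed, the post-test probability must rise monotonically with the pre-test probability, which is a quick sanity check for any reported triplet of values.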

Fig. 2

Forest plot and summary statistics of the diagnostic test accuracy (DTA) meta-analysis encompassing all included studies. CI, confidence interval; EP, emergency practitioner; N/S, not specified; PW, patient-wise; BW, bone-wise; R, radius; U, ulna

Meta-regression and subgroup analyses

Additional examination involved meta-regression and subgroup assessments, focusing on various covariates.

  • Ultrasound performers’ training status

    In studies where individuals with ultrasound training conducted the examinations, the sensitivity and specificity were 94.4% (95% CI 91.2–96.5) and 94.9% (95% CI 91.9–96.9), respectively. In contrast, studies with untrained performers showed sensitivity and specificity of 95.1% (95% CI 92.3–96.9) and 86.3% (95% CI 80.5–90.6), respectively (Fig. 3). The AUC of the SROC was 0.95 (95% CI 0.93–0.98) and 0.94 (95% CI 0.86–0.97) for studies with trained and untrained ultrasound performers, respectively. Analyzing with a bivariate model indicated significantly higher diagnostic accuracy in the subgroup with trained performers (p = 0.03), as depicted in Supplementary Fig. 5. Furthermore, post hoc analysis revealed notably higher specificity in this subgroup (p < 0.01).

    Supplementary Fig. 6 illustrates a scattergram of positive and negative LRs for each subgroup.

    Moderate heterogeneity was noted in studies with untrained ultrasound performers. A sensitivity analysis identified the study by Herren et al. [13] as a significant outlier. After excluding this study, the overall diagnostic accuracy and specificity remained significantly higher in the subgroup with trained ultrasound performers (Supplementary Figs. 7 and 8).

  • Ultrasound interpreter

    In studies where radiologists interpreted the ultrasound, the sensitivity and specificity were 94.5% (95% CI 92.7–95.9) and 93.5% (95% CI 89.6–96.0), respectively. In studies with ultrasound interpreters other than radiologists, the sensitivity and specificity were 94.2% (95% CI 92.1–95.8) and 92.6% (95% CI 88.0–95.6), respectively (Fig. 4). The AUC of the SROC was 0.99 (95% CI 0.93–0.99) for studies with radiologist interpreter and 0.95 (95% CI 0.92–0.97) for studies with interpreters other than radiologists. The diagnostic accuracy did not differ significantly between the two subgroups (Supplementary Fig. 9).

  • Bone under examination (radius vs. ulna)

    The sensitivity and specificity of ultrasound examination for detecting radius fractures were 97.1% (95% CI 94.2–98.6) and 94.1% (95% CI 85.8–97.7), respectively. Additionally, the sensitivity and specificity of ultrasound examination for detecting ulnar fractures were 78.9% (95% CI 61.6–89.7) and 97.7% (95% CI 94.6–99.0), respectively (Fig. 5). The AUC of the SROC was 0.98 (95% CI 0.94–0.98) for studies investigating radius fractures and 0.97 (95% CI 0.79–0.99) for studies investigating ulnar fractures. Utilizing a bivariate model, it was observed that the diagnostic accuracy was significantly higher in the subgroup examining radius fractures (p < 0.001), as illustrated in Supplementary Fig. 10. Additionally, a post hoc analysis revealed a notably higher sensitivity in this subgroup (p < 0.001).

    Supplementary Fig. 11 displays a scattergram of each subgroup’s positive and negative LRs.

    Considerable heterogeneity was observed in both subgroups. A sensitivity analysis was conducted to identify potential outliers and investigate the source of this heterogeneity. This analysis identified the study by Herren et al. [13] as a significant outlier. After excluding this study, the overall diagnostic accuracy and sensitivity remained significantly higher in the subgroup of studies investigating radius fractures. However, post hoc analysis revealed that the specificity of ultrasound in the ulna subgroup was significantly higher than in the radius subgroup (p < 0.01) (Supplementary Figs. 12 and 13).

  • Ultrasound views (4-view vs. 6-view)

    The sensitivity and specificity of the 4-view ultrasound examination for detecting distal forearm fractures were 95.3% (95% CI 90.5–97.7) and 94.4% (95% CI 88.1–97.5), respectively. Additionally, the sensitivity and specificity of the 6-view ultrasound examination were 94.4% (95% CI 92.2–95.9) and 92.9% (95% CI 87.9–96.0), respectively (Supplementary Fig. 14). The AUC of the SROC was 0.96 (95% CI 0.92–0.98) for the 4-view subgroup and 0.94 (95% CI 0.92–0.97) for the 6-view subgroup. The diagnostic accuracy did not differ significantly between the two subgroups (Supplementary Fig. 15).

  • Reporting method (bone-wise vs. patient-wise)

    The pooled sensitivity and specificity were examined in two subgroups. In studies reporting results on a bone-wise basis, the sensitivity was 93.8% (95% CI 90.04–96.1), and the specificity was 97.0% (95% CI 93.2–98.7). In contrast, in studies reporting results on a patient-wise basis, the sensitivity was 94.7% (95% CI 92.2–96.4), and the specificity was 90.6% (95% CI 86.3–93.6) (Supplementary Fig. 16). The AUC of the SROC was 0.97 (95% CI 0.92–0.98) for the bone-wise subgroup and 0.96 (95% CI 0.91–0.97) for the patient-wise subgroup. Utilizing a bivariate model, the diagnostic accuracy was marginally higher in the bone-wise subgroup, although the difference did not reach statistical significance (p = 0.07), as demonstrated in Supplementary Fig. 17. However, a post hoc analysis indicated significantly higher specificity in the bone-wise subgroup (p = 0.04).

  • Age

    In the meta-regression analysis, using a bivariate model to assess the impact of age, no significant influence of the mean age of the study samples was observed as a factor explaining heterogeneity (p = 0.53).

Fig. 3

Forest plot and summary statistics of diagnostic test accuracy (DTA) subgroup meta-analysis comparing studies with trained/untrained ultrasound performers. CI, confidence interval; EP, emergency physician; US, ultrasound

Fig. 4

Forest plot and summary statistics of diagnostic test accuracy (DTA) subgroup meta-analysis comparing studies based on the professional background of ultrasound interpreters. CI, confidence interval; EP, emergency physician; US, ultrasound

Fig. 5

Forest plot and summary statistics of diagnostic test accuracy (DTA) subgroup meta-analysis comparing the performance of ultrasound examination to detect fractures of radius and ulna. CI, confidence interval

Funnel plot analysis

In Supplementary Fig. 18, paired funnel plots illustrating publication bias and the small study effect are presented. The adapted Egger’s regression analysis with 2000 bootstrap samples revealed no significant asymmetry (p = 0.23), providing no evidence of publication bias or a small study effect.

Discussion

This systematic review and meta-analysis extends prior investigations of the diagnostic effectiveness of ultrasound in detecting distal forearm fractures in pediatric cases. It shows that ultrasonography has both high sensitivity and high specificity for detecting distal forearm fractures in pediatric patients: it accurately identifies cases with fractures and effectively excludes those without. The pooled positive and negative LRs further underscore its reliability for both confirmation and exclusion. These findings position ultrasonography as a valuable tool in clinically assessing distal forearm fractures in pediatrics, aiding healthcare practitioners in making well-informed decisions about patient care.

As an effective diagnostic tool for distal forearm fractures, ultrasonography leverages factors such as good image quality resulting from thin soft tissue and the short distance between the transducer and bone [6]. In contrast to radiography, ultrasonography enables the assessment of soft tissues, including muscle edema, tendons, and joint functions, providing exceptional spatial resolution and multiplanar imaging capabilities. Furthermore, ultrasonography proficiently localizes soft tissue interpositions between fracture fragments [7, 13]. Crucially, it adeptly addresses time constraints in emergency scenarios by facilitating rapid examinations. Ultrasonography eliminates the need for transporting patients to radiology suites and bypasses the subsequent wait for radiologist readings. This expedited process makes a substantial contribution to reducing the overall patient length of stay, effectively mitigating concerns associated with prolonged emergency department wait times [6]. The absence of ionizing radiation, coupled with the opportunity for families to view images during the initial physician visit, underscores the efficiency and patient-centric benefits of ultrasonography in diagnosing distal forearm fractures [7].

Research consistently indicates that ultrasound for detecting pediatric distal forearm fractures is less painful than radiographic imaging. In the study by Chaar-Alvarez et al., the pain score during ultrasound was significantly lower than during radiographic imaging [7]. This aligns with broader literature where patients consistently reported similar or lower pain scores for point-of-care ultrasound (POCUS) compared to X-ray [5]. Furthermore, in the study by Xo et al., the results suggest that pain with ultrasonography is unlikely to limit its use in this diagnostic context [9]. The minimal discomfort associated with POCUS can be attributed to its non-invasive nature, as all images can be obtained without requiring the child to move the affected extremity [5, 9]. This indicates that clinicians can reassure caregivers and children that ultrasonography is generally not a painful procedure.

Our findings indicate that the diagnostic accuracy of ultrasound did not significantly differ between the subgroups of radiologist interpreters and non-radiologist interpreters. However, a significantly higher diagnostic accuracy was observed in the subgroup with trained performers, particularly showcasing notably higher specificity. No difference was noted between trained and untrained individuals in terms of sensitivity. This suggests that ultrasound can effectively rule out distal forearm fractures in children regardless of the operator’s training level. Nevertheless, a crucial distinction emerges when confirming fractures, emphasizing the necessity of specific training. This underscores the importance of skill and expertise in accurately confirming fractures through ultrasound examination. Studies have reported that a standardized training duration of approximately 1 h is sufficient [25,26,27], indicating that the learning curve for bone ultrasonography is not excessively complex, at least for pediatric distal forearm fractures. The study by Galletebeitia Laka et al. demonstrates that a pediatric resident, even without prior POCUS experience, achieved diagnostic accuracy comparable to an experienced emergency department pediatrician after receiving basic training [3]. Similarly, Epema et al. demonstrated that inexperienced physicians can master POCUS of the forearm after a short training period [8]. Moreover, Chaar-Alvarez et al.’s findings confirm the accuracy of ultrasound as a diagnostic tool for nonangulated pediatric forearm fractures, even when reviewed by an experienced, blinded professional relying solely on ultrasound images for diagnosis. In contrast, unblinded bedside diagnoses by less experienced physicians were accurate but to a lesser extent. This indicates that greater ultrasound experience was more important for accurate diagnosis, and potentially less prone to misleading results, than additional bedside clinical information [7].

We noted a significant increase in diagnostic accuracy within the subgroup focused on examining radius fractures, with the analysis highlighting notably higher sensitivity in this category. This trend persisted even after the exclusion of an outlier study. However, upon removing this study, the specificity of ultrasound for the ulna subgroup was found to be significantly higher. Consequently, ultrasound demonstrates greater reliability in excluding radius fractures and proves more effective in confirming ulna fractures. This variability may be attributed to anatomical differences, operator experience and training, variability in patient positioning, characteristics of the study population, and differences in equipment/technology. Further investigation into these factors is necessary to gain a deeper understanding of the nuanced reasons behind the observed differences in diagnostic accuracy for fractures in distinct forearm bones in pediatric cases.

Although the current study’s findings highlighted the reliability of ultrasonography in detecting pediatric distal forearm fractures, it is crucial to acknowledge that this imaging modality may not be universally suitable, especially in cases involving patients with obesity or other factors affecting image quality. Consequently, clinical judgment should guide the choice of the most appropriate imaging modality for each individual patient. Additionally, it is worth noting that ultrasound operators might have been influenced by visible deformities or physical signs during image interpretation, potentially leading to an overestimation of ultrasound accuracy.

While this systematic review centered on the diagnostic utility of ultrasound in pediatric distal forearm fractures, questions about its role post-diagnosis—particularly its potential to inform surgical decisions and support follow-up assessments—remain unexplored. Understanding these aspects is crucial for advancing clinical practice and encourages further investigation into the broader utility of ultrasound in managing pediatric distal forearm fractures.

Conclusions

This study demonstrated the reliability of ultrasonography as an imaging modality for detecting distal forearm fractures in pediatric patients, exhibiting high sensitivity and specificity. Notably, trained performers displayed significantly higher diagnostic accuracy, particularly in terms of enhanced specificity, underscoring the importance of expertise in accurately confirming fractures through ultrasound examination. Further research should specifically address the observed differences in diagnostic accuracy of ultrasound between fractures in the radius and ulna.