Background

Technological advancements have led to increasing availability of high-quality, low-profile ultrasound devices at reduced costs [1]. One area that has seen tremendous growth is that of focused cardiac ultrasound (FoCUS), which describes point-of-care ultrasound that is intended to provide a rapid qualitative assessment of cardiac function. The use of FoCUS has expanded to a variety of practice settings, including emergency medicine, critical care, anesthesia, internal medicine, and primary care, owing largely to its relative ease of use [2]. Prior studies suggest that trainees and non-cardiologist physicians with limited prior ultrasonographic experience can gain proficiency in FoCUS with brief training, such as a 1-day workshop and 20–50 practice scans [3, 4]. FoCUS has proven useful for the assessment of ventricular function, valvular abnormalities, and volume status, as well as for the detection of cardiac tamponade, aortic dissection or aneurysm, and pulmonary embolism [5]. The use of FoCUS has been shown to alter management in perioperative [6, 7], critical care [8, 9], and emergency [10, 11] settings and has been shown to improve outcomes in select patients [12].

While FoCUS can be beneficial for patient care and more effective allocation of healthcare resources, there is potential for harm with inappropriate use [13]. The implications of relying on a false negative exam could include delayed or missed diagnoses. Similarly, false positive findings or misinterpretations could lead to unwarranted testing or procedures and increased healthcare spending. Despite the potential for such consequences, formal training programs have not been widely embraced, and quality control metrics are often lacking [14, 15]. Surveys have revealed the fear of missed diagnoses and the lack of training or certification as important barriers to the adoption of FoCUS [16]. The adoption of robust parameters for assessing competency in image acquisition, analysis, and interpretation among physicians is needed to effectively train learners and ensure appropriate use [17].

Leaders in ultrasonography have recognized the need for training standards and have supported the development of structured certification programs for FoCUS as well as quantitative transthoracic echocardiography (TTE), shown in Table 1 [24, 31]. Current certifications in TTE require completion of 75 to 250 scans and passage of one or more standardized examinations, while certification in FoCUS typically requires between 20 and 50 supervised scans. However, many of these recommendations are based on guidance developed for the use of FoCUS and/or TTE in emergency and critical care settings, and their applicability outside of these settings has not been well-demonstrated. There is also no consensus on the optimal method of training in FoCUS or the appropriate metrics for determining skill development. Many small-scale studies have documented and compared strategies for FoCUS education and evaluation among various sub-populations and clinical environments [32]. Among these are studies on trainees and licensed physicians working in intensive care units, medical wards, emergency departments, and perioperative areas for which very different scanning protocols are employed. The heterogeneity of studies has made it difficult to draw conclusions, and thus, the type and duration of training to allow most learners to achieve competency in FoCUS remains undetermined. We conducted a systematic review and meta-analysis to examine existing strategies for FoCUS training and to gain insight into the optimal amount and type of training that will allow for attainment of basic competency in adult FoCUS.

Table 1 Published accreditations in focused cardiac ultrasound and transthoracic echocardiography

Methods

This systematic review conformed to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [33].

Data search

Our search strategy utilized PubMed, Embase, and Cochrane Library databases from inception until June 2020. The following search terms were used: “echocardiography” or “transthoracic echocardiography” or “TTE” or “bedside ultrasound” or “cardiac ultrasound”, and “doctors” or “physicians” or “residents” or “fellows” or “medical students” or “attending” or “intensivist” or “internist” or “hospitalist”, and “competence” or “competency” or “certification” or “accreditation” or “evaluation” or “assessment” or “curriculum”. These terms were identified in the title or abstract (PubMed and Embase) or in the title, abstract, or keywords (Cochrane). We also examined the lists of references from relevant studies and review articles for any additional articles that might have been missed in our initial search.
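The search strategy above combines three concept groups (imaging modality, learner population, and competency outcome) with OR within each group and AND across groups. A minimal sketch of how such a query can be assembled is shown below; the `[tiab]` field tag follows PubMed title/abstract syntax and the group names are illustrative, not the exact strings submitted to each database.

```python
# Sketch of the boolean query construction: OR within each concept
# group, AND across the three groups (hypothetical assembly; the
# actual searches were entered into each database's own interface).

modality = ["echocardiography", "transthoracic echocardiography", "TTE",
            "bedside ultrasound", "cardiac ultrasound"]
population = ["doctors", "physicians", "residents", "fellows",
              "medical students", "attending", "intensivist",
              "internist", "hospitalist"]
outcome = ["competence", "competency", "certification", "accreditation",
           "evaluation", "assessment", "curriculum"]

def or_group(terms, field="tiab"):
    """Join one concept group with OR, tagging each term with a search field."""
    return "(" + " OR ".join(f'"{t}"[{field}]' for t in terms) + ")"

# Three OR-groups joined by AND.
query = " AND ".join(or_group(g) for g in (modality, population, outcome))
print(query)
```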

Inclusion and exclusion criteria

Studies were included only if standardized training on FoCUS was provided followed by a formal assessment of competence, such as by expert review or comparison to an expert-performed echocardiogram. An expert was designated as a physician or sonographer with extensive training and/or certification in adult echocardiography. Included studies were required to have at least 5 learners who were medical students, trainees, or attending physicians without expertise or formal certification in transthoracic or transesophageal echocardiography. For inclusion, each study was required to outline the type and duration of training, describe which parameters were assessed, and identify a comparator for assessment of competency. Studies within pediatric populations and on non-physician learners were excluded.

Study selection and data extraction

Titles and abstracts were assessed independently by two reviewers (LEG and PJL) and were included in the full text review if selected by either. The same two reviewers performed full text review, with discrepancies resolved by a third reviewer (MGC). Two authors (LEG and GAW) independently extracted the following data using a standardized form: number and training level of learners, ratio of learners to instructors during training, type and duration of training, total study duration, views and pathology taught, ultrasound device used, clinical setting, selection of subjects for assessment, parameters assessed, measurement of competency, and outcomes.

Risk of bias assessment

The ROBINS-I tool [34] was used to assess risk of bias in our cohort of non-randomized studies of interventions. Risk of bias in seven pre-specified categories was independently assessed by two reviewers (LEG and GAW), with disputes resolved through joint discussion with a third reviewer (MGC).

Study outcome

The primary outcome was the performance of medical students or physicians in acquiring and/or interpreting cardiac and hemodynamic parameters using FoCUS relative to that of expert echocardiographers.

Data analysis

Summary tables are provided for included studies, accompanied by a qualitative discussion and evaluation of risk of bias. The relationship between three training parameters (didactic hours, hands-on practice hours, and scans performed) and reported level of agreement (kappa coefficient, κ) between learners and expert echocardiographers on identifying cardiac pathology was assessed by linear regression (SPSS version 24, IBM Corp.). Analysis was performed for parameters in which assessments were relatively uniform across studies, and for data sets containing ≥ 8 studies in order to minimize the likelihood of sampling error. The Pearson correlation coefficient (r) and p value for the linear fit were reported. The kappa coefficient (κ) was interpreted as [35]: perfect agreement (κ = 1), near perfect agreement (κ = 0.81 to 1), substantial agreement (κ = 0.61 to 0.8), moderate agreement (κ = 0.41 to 0.6), fair agreement (κ = 0.21 to 0.4), and slight to no agreement (κ = 0 to 0.2).

Results

Search results and study selection

Our search yielded 1479 unique studies for screening, of which 1301 were excluded, leaving 178 studies for full-text review. Of these, 23 met inclusion criteria and were included in this systematic review (Fig. 1). Many studies met multiple criteria for exclusion.

Fig. 1

Flow diagram showing the selection of studies for inclusion

Quality of included studies

All studies included in the analysis were non-randomized, observational studies of an intervention and thus were expected to have a substantial and unavoidable bias due to confounding. We identified several consistent sources of bias in the selection of participants, amount of training received by learners within each training program, number of scans performed, pre-existing knowledge of the clinical status of patient subjects, and interobserver variability. Bias was assessed for each study using the ROBINS-I tool (Table 2) [34].

Table 2 Risk of bias for each included study assessed using the ROBINS-I tool [34]

Study participants

Data were collected on a total of 292 learners across 23 studies (see Tables 3 and 4). Participants ranged from medical students to subspecialty physicians with up to 29 years of attending-level experience [43]. The most represented group was internal medicine residents (n = 174, 59.6%), followed by critical care fellows (n = 32, 11.0%), hospitalists (n = 27, 9.25%), emergency medicine residents (n = 23, 7.88%), emergency medicine attendings (n = 15, 5.14%), medical students (n = 10, 3.42%), intensivists (n = 6, 2.05%), trauma surgeons (n = 6, 2.05%), and anesthesia residents (n = 5, 1.71%). For the majority of learners, participation was on a voluntary basis. At least 9 learners (3.08%) across all included studies had some prior training in echocardiography, but none had expertise or formal certification.

Table 3 Characteristics of the learner population, training program, device used, and study duration for 23 included studies
Table 4 Characteristics of ultrasound skill assessment and overall findings for 23 included studies

Training format and duration

All studies had a standardized training program that included some combination of didactic and practical hands-on learning. Where reported, the didactic component ranged from 45 min to 18 h, and from 7 to 80% of the dedicated training time. Didactics included a component of in-person lectures, review of pre-recorded cases, and/or bedside demonstration in 21 of 23 studies (91%) and consisted of remote learning only with handouts or online modules in 2 of 23 studies (8.7%). Practical learning was reported either as a duration of time spent in small groups or 1-on-1 performing supervised echocardiograms, or as the number of supervised exams or exams performed independently with feedback. Where reported, the time spent on practical training ranged from 30 min to 20 h or from 1 to 50 exams, with the exception of one study in which learners were encouraged to perform 100 independent exams as part of their training [51].

Subjects for assessment

Learners performed FoCUS on a total of 3794 subjects, which included 3785 patients, 4 healthy volunteers, and 5 simulated patient cases. Patients were examined in a variety of clinical settings, including the intensive care unit (n = 1077, 28.5%), inpatient medicine floor (n = 1002, 26.5%), intermediate care unit (n = 408, 10.8%), emergency department (n = 385, 10.2%), outpatient clinic (n = 257, 6.79%), and short-stay unit (n = 175, 4.62%). A total of 524 patients (13.8%) were on mechanical ventilation at the time of the exam. Clinical setting was not specified for 481 patients (12.7%). Most patients were selected for study inclusion based on having a clinical indication for FoCUS, and many were excluded due to the presence of injuries requiring immediate intervention, inability to tolerate repositioning, the sonographers’ inability to obtain adequate windows, or a prolonged duration (typically > 48 h) between learner and expert examinations.

Parameters for assessing competency

Learners were assessed on their skills in acquiring and/or interpreting images. Parameters of acquisition ability included whether or not learners were able to obtain adequate images to make a diagnosis, the time required to obtain images, a subjective assessment of image quality, or an efficiency score (quality/time). One study also reported self-perceived workload for performing FoCUS [44]. Parameters of interpretation ability included accuracy in quantitative measurements (chamber or vessel sizes, ejection fraction, E/A ratio) and diagnostic accuracy (normal or abnormal function, presence or absence of pathology). Competency in these areas was assessed by comparison against the performance of an expert echocardiographer. This was typically a board-certified cardiologist or a physician who had completed level 2 or 3 certification by the American Society of Echocardiography, although in two studies this was a cardiology fellow [50] or intensivist [55] with formal training and experience in echocardiography but without certification. Ideally, exams performed by learners were compared to a similar exam performed by an expert, with both exams performed using either a portable or traditional ultrasound. However, only in 8 of the 23 studies [40, 41, 45, 47, 49, 50, 56, 57] was the learner's exam compared to another focused exam performed on the same or very similar type of device. Most studies included comparison of a learner-performed FoCUS exam with a standard TTE, and often with the learner performing the exam with a portable device with limited functionality and poorer image resolution than a traditional ultrasound machine. One study [37] compared learner and expert performance on an ultrasound simulator, while two others examined healthy volunteers [40, 56].

Quantitative assessment of training parameters

Of the 23 studies included in this review, 11 calculated a kappa coefficient (κ) for inter-rater reliability between learner and expert interpretation of at least one cardiac ultrasound finding and could be included for quantitative analysis. The most frequently assessed pathologies were left ventricular (LV) systolic dysfunction and pericardial effusion, followed by regional wall motion abnormalities, valvular abnormalities, and hypovolemia. LV systolic function and the presence of pericardial effusion were assessed in at least 8 studies, providing the largest sample sizes for meta-analysis. The other parameters had limited sample sizes with measures that were relatively less uniform across studies. The level of agreement with experts on learner assessment of LV systolic function (Fig. 2, left panel) and pericardial effusion (Fig. 2, right panel) is shown based on the number of didactic hours (Fig. 2a), number of hands-on practice hours (Fig. 2b), and total number of exams performed (Fig. 2c). Learners achieved near perfect agreement (κ > 0.8) with expert echocardiographers on the assessment of LV systolic function after 6 didactic hours and 6 h of hands-on training, and substantial agreement (κ > 0.6) after 2 h of didactics and 2 h of hands-on training. There was no correlation between number of scans performed and agreement with experts on the identification of LV systolic dysfunction. Learners achieved substantial agreement (κ > 0.6) with experts on the identification of pericardial effusion after 3 h of didactics, 3 h of hands-on training, and at least 25 scans. For the assessment of LV systolic function, agreement between learners and experts correlated with the amount of time (1 to 6 h) spent on didactics (r = 0.79, p < 0.05) and performing hands-on practice (r = 0.82, p < 0.05). 
For the identification of pericardial effusion, agreement between learners and experts correlated with the amount of time (1 to 6 h) spent on didactics (r = 0.82, p < 0.005) and the number of scans performed in each study (r = 0.51, p < 0.05).

Fig. 2
figure 2

Relationship between a number of didactic hours, b number of hands-on practice hours, and c number of scans performed during a standardized training phase on learner agreement with expert echocardiographers for the detection of left ventricular systolic dysfunction (left panel, navy) and pericardial effusions (right panel, light blue). The Pearson correlation coefficient (r) and p value for the linear fit are reported for each data set, and regression lines are shown with 95% confidence intervals (dashed lines). Agreement is expressed by the kappa coefficient, κ

Discussion

FoCUS is intended to provide qualitative or semi-quantitative assessment of major cardiac abnormalities, such as identifying LV systolic dysfunction, pericardial effusion, or valvular abnormalities [59]. As a goal-directed tool, the data obtained need to be reliable, as they are used to guide immediate clinical management. Thus, the development of a FoCUS training platform that ensures competency is necessary for safe and meaningful use. Our systematic review has shown that existing training programs vary substantially in their duration of training (45 min to over 20 h), type of training provided, skills taught, and clinical setting in which FoCUS skills were assessed. Our analysis also showed that a short duration of training, i.e., 2–3 h of didactics and 2–3 h of hands-on training, may be sufficient for most learners to achieve substantial agreement with experts in identifying two major cardiac abnormalities: LV systolic dysfunction and pericardial effusion. Meanwhile, near perfect agreement (κ > 0.8) for detecting these abnormalities could be achieved after 6 h of didactics and 6 h of hands-on training. Identification of other pathologies, particularly wall motion abnormalities, valvular lesions, and IVC enlargement, was often more difficult, and most learners were only able to achieve fair to moderate agreement with experts after brief training.

Many studies included in our review involved comparison of data obtained through FoCUS exams performed using a small portable or handheld device to data obtained from a TTE performed using ultrasound machines with high resolution and advanced features. FoCUS is not performed for the same diagnostic purpose, nor should it be expected to match the precision of a comprehensive TTE. Yet we felt that comparison to a well-established standard was likely to be the most reliable metric to assess learner competency and that results yielded from this higher benchmark should be interpreted within a margin of non-inferiority. FoCUS training should also include education on the intended use and inherent limitations of FoCUS versus TTE.

Our review examined the effect of three training parameters on learner performance. We showed that substantial agreement (κ > 0.6) between learners and experts on the assessment of LV systolic function could be achieved with only 2 h each of didactic and hands-on practice and a minimum (4–10) number of scans. Similarly, substantial agreement with experts on the identification of pericardial effusion could be achieved with only 3 h each of didactic and hands-on practice. The greater amount of time required for identifying pericardial effusions may be due to misidentification of pericardial fat as an effusion, or to the fact that small effusions can be missed in some views. Regardless, these findings are impressive, given that only moderate (κ > 0.4) to substantial (κ > 0.6) agreement exists between trained experts for assessments of LV function by FoCUS [60]. We also show that learner performance for identifying LV systolic dysfunction improves with time spent on didactics and time spent performing hands-on practice, at least for up to 6 h each, whereas the total number of scans performed did not correlate with improvement in identifying LV dysfunction. This may be due to the fact that there was already substantial agreement (κ > 0.6) between learners and experts after very few (4–10) scans. Also, identifying LV dysfunction by FoCUS is a skill that may be best taught through a combination of didactics and supervised practice, while the actual number of exams performed may be less important. In contrast, identification of pericardial effusion improved with time spent on didactics as well as with the number of scans performed, and substantial agreement with experts could be achieved after 25 scans. 
This suggests that the detection of pericardial effusion is a skill that is gained through additional experience rather than supervised practice and supports the completion of between 20 and 30 focused exams for achieving competency in FoCUS as recommended by existing governing bodies (Table 1). Overall, our quantitative findings confirm that learners may be able to achieve reasonable competency using ultrasound to assess LV function and identify pericardial effusion after a very short (4–6 h) duration of training that includes equal portions (2–3 h each) of didactic and hands-on learning. Our findings also suggest that a small number of scans (20–30) may be sufficient for learners to gain basic competency in FoCUS.

To our knowledge, ours is the first systematic review and meta-analysis to be published on training in FoCUS. A prior systematic review by Rajamani et al. [61] examined 42 studies with an aim of evaluating the quality of point-of-care ultrasound training programs and their ability to determine competence. Roughly half of all studies did not include a comparator group against which to assess learner competency. Another prior systematic review by Kanji et al. [32] examined 15 studies in the critical care setting, most of which assessed learning based on pre- and post-training test scores and also did not include assessment of competency against an accepted standard as was required in our review. In addition to requiring a comparator for assessing competency, we also took a broader approach in examining the training of a diverse group of learners. As FoCUS adoption continues to expand, we wanted to report findings that might guide appropriate guidelines for the education of providers from different backgrounds and skill levels.

Our review is the first to provide quantitative evaluation of the impact of various training parameters on learner performance. While established curricula exist for FoCUS training in critical care and in emergency medicine, such standards do not currently exist for other specialties. By including a heterogeneous population of learners in our review, we hope that the findings may be generalizable to learners in other specialties such as internal medicine, anesthesiology, and general surgery who may be examining patients in settings ranging from the outpatient clinic to the operating room to the postoperative hospital wards. Studies also ranged in their scope of training and parameters assessed, emphasizing that the determination of competency in performing and interpreting FoCUS is a challenging judgment that depends heavily on the clinical context. Because the goal of FoCUS will vary based on the clinical context to which it is applied, the specific metrics for competency will also vary [17]. For example, sensitivity for the detection of a reduction in left ventricular ejection fraction needs to be high in outpatient settings, such as in the study by Croft et al. [41], when determining the need for specialty referral and tailored management of chronic conditions. Meanwhile, a lower sensitivity is likely acceptable in the emergency department, such as in the studies by Farsi et al. [42] and Carrie et al. [39], when determining the presence of a cardiogenic cause for hemodynamic instability.

When considering the wide range of potential clinical applications for FoCUS, it is important to recognize that training clinicians with different skill levels for the use of FoCUS in a variety of settings is unlikely to be successful with a single standardized curriculum. Rather than content-based training that uses completion of a set of material as an endpoint, a competency-based program recognizes that learners will progress at different speeds and that some will require additional material to reach the same level of competency. Competency-based programs enable learners to move through topics at their own pace, progressing when they are comfortable with a new skill and deemed competent by their supervisor(s). This form of training has been successful for teaching other clinical skills such as central line placement and orotracheal intubation, in which clinical competency is not strictly linked to a number of lines placed or intubations performed and no formal accreditation is needed. The future practice of FoCUS may benefit from a convergence on competency-based training that is tailored to a particular application and/or specialty, rather than from pursuit of formal accreditation across specialties.

When considering the most effective ways to train physicians on the use of FoCUS, it is also important to recognize that the co-existent clinical demands on physician-learners can impede skill acquisition. Some of the strategies to support learners that were adopted by the studies in this review include offering one-on-one or small group sessions for additional supervised practice, providing supervision during clinical application, and establishing processes that give learners access to ongoing feedback from experts. Flexibility in training availability and integration of FoCUS practice with existing clinical workflows were two recurring strategies that seemed to cater to the needs of physician-learners.

The need to train new generations of physicians in adult FoCUS presents the opportunity for future study in this field. An important consideration when designing a training program is the prevention of skill decay, which has been noted to occur rapidly (within 1–3 months) after the completion of a brief training program [62]. One study [56] found that learners retained their imaging skills at 6 months post-training, but there was no data on skill retention beyond 6 months in any included studies. The duration of the training phase may be inversely related to the rate of decay, suggesting that longitudinal support through deliberate practice and mentored review may help learners to retain their skills [56]. By making ultrasound devices readily available and easily accessible within clinical environments, physicians can develop ways to incorporate FoCUS into their daily practice. Training programs must find ways to support learners beyond the initial training period in a manner that is structured yet flexible.

Limitations

It is important for the reader to recognize that all of the studies identified were non-randomized, observational studies with critical levels of bias. First, selection bias was often evident in both the selection of participants, many of whom were volunteers, and the selection of patient subjects for exams. For example, patients requiring urgent evaluation and treatment are those who are also most likely to benefit from rapid, point-of-care ultrasound, and yet many of these patients were excluded from learner examinations. Three studies reduced subject selection bias by using standardized patients or an ultrasound simulator [37, 40, 56], but at the expense of external validity. Second, few studies [39, 44, 45, 50, 55] acknowledged exams performed by each learner as dependent data points, and even fewer accounted for this through the use of linear modeling [45, 50]. Third, most studies were conducted in actual clinical settings, where time constraints, patient factors, and learner motivation are expected to introduce bias into the results. And lastly, while we report the minimum hours required for learners to assess LV systolic function and identify the presence of pericardial effusion, we were unable to determine the minimum training period required to achieve competency in other aspects of cardiac assessment due to insufficient data.

Conclusion

FoCUS is an important diagnostic tool and will likely soon be considered a standard skillset for any practicing physician. A formal training program that includes 2–3 h of didactic learning, 2–3 h of hands-on training, and 20–30 practice scans is likely to be adequate for most learners to achieve competency in the detection of gross LV systolic dysfunction and pericardial effusion. Additional training is necessary for skill retention, efficiency in image acquisition, and the detection of more subtle abnormalities. The finding that reasonable proficiency can be obtained after only brief formal training should encourage physicians at any career level to pursue training in FoCUS.