Background

Evaluating the potential benefits of emergency medical services (EMS) missions is crucial for allocating EMS resources purposefully and for focusing the dispatch of advanced-level prehospital units on missions where patients are likely to benefit from their advanced skills. Due to the multifaceted nature of prehospital missions, the benefits of prehospital care are difficult to evaluate [1, 2], and the benefits of advanced prehospital care remain subject to debate [3]. Existing scoring systems, such as the National Advisory Committee for Aeronautics (NACA) severity score [4], estimate the severity of patients' injuries or illnesses. The NACA score focuses on the severity of an incident and on patient characteristics and does not consider the impact of prehospital care, which limits its use in benchmarking and benefit assessments.

The helicopter emergency medical services (HEMS) Benefit Score (HBS) is a nine-level scoring system developed in Finland in the 1990s to evaluate the benefits of HEMS missions [5]. Each category is defined by a written description along with exemplar interventions, which can be used to guide the scorer's choice of category. The highest HBS category is reserved for the most advanced prehospital interventions, but the idea is to evaluate the benefit produced by the whole EMS system, not only by HEMS units. The scoring system has been used in the Finnish HEMS units since 1997, originally to monitor the benefit of the HEMS launched at that time, but nowadays also to compare individual national HEMS units and to collect data for administrative purposes.

Despite the everyday use of the HBS in Finnish HEMS for over two decades, its validity has not been studied at all, and its reliability has been studied only recently [5, 6]. In those studies, the HBS's inter-rater reliability varied from poor to substantial or almost perfect, and the mean differences between raters and reference values were substantial [5, 6]. As the scoring is guided by exemplar interventions, it can be argued that reliability could be improved by more detailed and comprehensive examples. Additionally, it has been suggested that the exemplar interventions should be updated to meet current treatment guidelines [5].

Methods

Aim

The aim of this study is to develop a score to measure the benefits of prehospital interventions to a single patient. The score is based on the HBS, but the old exemplar interventions are replaced by more relevant examples. The purpose of these updated instructions is to cover the most common prehospital mission types and to make evaluating the effectiveness of prehospital treatments easier and more accurate. Because this evaluation tool is appropriate for the whole EMS system, the score is renamed the EMS Benefit Score (EBS).

Design and setting

This is a four-round, web-based, international Delphi study using expert panel consensus. The technique involves a panel of experts who are asked to complete a series of questionnaires focusing on their opinions, predictions and judgements on a topic of interest. The Delphi technique is widely used in health research to obtain consensus in serial surveys, which are referred to as “rounds”. Key elements of the technique are (1) expert participants, (2) anonymity and individuality, and (3) a summary of results of the former round at the start of each round [7, 8]. The data collection, Delphi rounds and data analysis of the current study were performed from 3.12.2018 to 19.11.2020. A pilot study was performed prior to the actual study to evaluate the study setting. The pilot study participants consisted of Finnish and Danish prehospital physicians who did not participate in the planning of the study or in the actual study.

The work of the expert panel and the commentary board was carried out in four Delphi rounds as follows:

  1.

    Each expert panellist was asked to list both common and rare examples of prehospital treatments and interventions and to locate them based on their current knowledge and personal experience into HBS categories 3–8 as comprehensively as possible in subsections based on ten complaint-based diagnoses: “acute neurology excluding stroke”, “breathing difficulties”, “cardiac arrest”, “chest pain”, “infection”, “obstetrics including child birth”, “other”, “psychiatry including intoxication”, “stroke” and “trauma”. These diagnosis groups are recommended in prehospital reporting [9]. The answers were collected anonymously into an electronic data sheet by a data-collection officer who did not participate in the example selection but gathered suggestions in a common table. HBS categories 0–2 were excluded from the study because they are used for scoring when a prehospital intervention is deemed unnecessary or the patient was not met. A commentary board commented on the data gathered from the first Delphi round on the diagnosis groups related to their individual specialties. These comments were shown to the expert panel in the second Delphi round to help them rate the examples on a 5-point Likert scale. Identical suggestions from the first round were combined and overlapping examples removed for the second Delphi round.

  2.

    The examples from the first Delphi round with the commentary board’s opinions were set in a table and sent back to the panellists, who were asked to rate each item on a 5-point Likert scale from 1 (strongly disagree) to 5 (strongly agree). A content validity index (CVI) was calculated for each example, and at least 70% of the experts were required to assign a suggested example a high-agreement score (4 or 5) for it to be included in the third Delphi round. Overlapping examples were then removed.

  3.

    In the third Delphi round, the remaining examples were listed in their suggested HBS categories. The expert panellists were asked to assign each of these remaining examples one of the following labels: “Accept”, “Delete” or “Relocate to EBS category number __”. An acceptance rate of 70% or more was required to assign an example to a category. Examples with acceptance rates below 70% were either deleted or relocated to the category with the most “Relocate” suggestions, whichever option had the higher percentage.

  4.

    In the final Delphi round, the EBS was revealed to the prehospital expert panellists, who were offered an opportunity to comment on it or accept it in that form.

In addition to these Delphi rounds, each phase included an opportunity for free comments on the exemplar interventions and category descriptions.
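The third-round decision rule described above can be sketched in code. This is only an illustrative reading of the rule, not the procedure actually used in the study: the function name `decide_round_three` and the vote encoding (`"accept"`, `"delete"`, or an integer target category for a "Relocate" vote) are hypothetical. The sketch also assumes that the share of "Delete" votes is compared against the share of votes for the single most-suggested relocation category, and that a tie between the two results in deletion; the text does not specify either point.

```python
from collections import Counter
from typing import List, Optional, Tuple, Union

Vote = Union[str, int]  # "accept", "delete", or an int target category (relocate)

ACCEPT_THRESHOLD_PERCENT = 70  # acceptance rate required to keep an example


def decide_round_three(votes: List[Vote]) -> Tuple[str, Optional[int]]:
    """Return ("accept", None), ("delete", None) or ("relocate", category)."""
    if not votes:
        raise ValueError("at least one vote is required")
    for v in votes:
        if not (v in ("accept", "delete") or isinstance(v, int)):
            raise ValueError(f"unrecognised vote: {v!r}")

    n = len(votes)
    n_accept = sum(1 for v in votes if v == "accept")
    # Integer arithmetic avoids floating-point surprises exactly at 70%.
    if n_accept * 100 >= ACCEPT_THRESHOLD_PERCENT * n:
        return ("accept", None)

    n_delete = sum(1 for v in votes if v == "delete")
    relocations = Counter(v for v in votes if isinstance(v, int))
    if relocations:
        # If several categories tie, the first one encountered wins (assumption).
        target, n_top_relocate = relocations.most_common(1)[0]
        if n_top_relocate > n_delete:  # a tie falls through to deletion (assumption)
            return ("relocate", target)
    return ("delete", None)


# Hypothetical votes from an 18-member panel for one example
votes = ["accept"] * 10 + ["delete"] * 2 + [6] * 6
print(decide_round_three(votes))
```

With these hypothetical votes, the acceptance rate (10 of 18) is below 70%, and relocation to category 6 has more support than deletion, so the example would be relocated.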

Participants

Two expert groups were formed for the study: a prehospital expert panel and a separate commentary board. Experts were recruited with open letters: the prehospital expert panel via the European Prehospital Research Alliance (EUPHOREA) and the commentary board via national Finnish specialty societies. The participants were selected based on their individual clinical and scientific experience. The prehospital expert panel ultimately included 18 prehospital physicians from Scandinavia and Northern Europe, and the commentary board included 11 Finnish in-hospital physicians from seven specialties. The total number of study experts was 29. Table 1 presents the characteristics of the 18 prehospital expert panellists. Physicians from intensive care, traumatology, cardiology, neurology, neurosurgery, paediatrics and obstetrics were recruited for the commentary board. Members of the commentary board were recruited to give an in-hospital viewpoint, and therefore they did not have prior or current prehospital experience.

Table 1 Characteristics of the 18 prehospital expert panellists

Statistical methods

This study used the Delphi method and expert consensus. Data handling and collection were performed using Webropol 3.0 by the Webropol Group. A 5-point Likert scale was used in the second Delphi round, and a CVI was calculated for the collected data by Webropol 3.0. Agreement was defined as at least 70% of the experts rating a suggested example with a high-agreement score (4 or 5) [10].
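As an illustration of the agreement criterion, the following minimal sketch computes an item-level CVI as the proportion of experts who gave a rating of 4 or 5, and keeps the items that reach the 70% threshold. The study itself performed this calculation in Webropol 3.0; the function names, the data layout and the example ratings below are hypothetical and serve only to show the arithmetic.

```python
from typing import Dict, List

HIGH_AGREEMENT = (4, 5)   # Likert ratings counted as agreement
THRESHOLD_PERCENT = 70    # minimum share of experts required for inclusion


def count_agreeing(ratings: List[int]) -> int:
    """Validate 5-point Likert ratings and count those that are 4 or 5."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    return sum(1 for r in ratings if r in HIGH_AGREEMENT)


def item_cvi(ratings: List[int]) -> float:
    """Item-level CVI: proportion of experts rating the item 4 or 5."""
    return count_agreeing(ratings) / len(ratings)


def retained_items(ratings_by_item: Dict[str, List[int]]) -> List[str]:
    """Items whose share of high-agreement ratings is at least 70%."""
    kept = []
    for item, ratings in ratings_by_item.items():
        # Integer comparison keeps the 70% boundary exact.
        if count_agreeing(ratings) * 100 >= THRESHOLD_PERCENT * len(ratings):
            kept.append(item)
    return kept


# Hypothetical ratings from an 18-member panel (not study data)
example = {
    "example A": [5] * 9 + [4] * 4 + [3] * 5,   # 13 of 18 agree
    "example B": [5] * 6 + [4] * 6 + [2] * 6,   # 12 of 18 agree
}
for name, ratings in example.items():
    print(name, round(item_cvi(ratings), 2))
print(retained_items(example))
```

In this made-up data, 13 of 18 agreeing experts clears the 70% threshold, whereas 12 of 18 (about 67%) does not.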

Ethics

By Finnish law, no ethical approval was needed for this study because no patients or personal data were involved. The study permission was requested and granted by Turku University Hospital (decision number TP2/010/18). The study subjects participated voluntarily. The Standards for Reporting Qualitative Research (SRQR) guidelines by the EQUATOR network were followed in reporting the study.

Patient and public involvement

No patients were involved.

Results

The first Delphi round resulted in 1284 examples from 18 expert panellists, divided into HBS categories 3–8 in ten complaint-based subsections. Seven of the respondents gave free comments (each Delphi round included sections for free written comments). Figure 1 describes the course of the Delphi rounds, and Additional files 1 and 2 present the materials of the second and third Delphi rounds.

Fig. 1 The course of the Delphi rounds in the study

Table 2 presents the final form of the scoring system, and the additional materials present the expert panellists’ free comments. The definitions of the score categories were kept in their original forms, and no free comment was related to the content of these written definitions. In the fourth Delphi round, one participant suggested moving “Administration of tranexamic acid” from EBS 4 to EBS 6 based on current scientific evidence, and this change was made.

Table 2 The EBS

Discussion

In this study, we updated the HEMS Benefit Score by using the Delphi method to meet the current needs of prehospital emergency care. The structure of nine-level numerical scoring categories, inherited from the original HBS, remained intact, but the exemplar interventions in each category were completely revised. With this renewal, the scoring system was expanded from HEMS usage to cover all prehospital emergency care, including non-HEMS units, and to better meet present-day needs. The renamed score, the EBS, better represents the fundamental features of this scoring system and encourages non-HEMS units to utilise it in their practice.

The EBS focuses on interventions that are performed prehospitally and considers the impact of these interventions on the treated patients. In this way, the EBS aims to evaluate the true benefit of EMS for single patients. In contrast, other scores and classifications used in prehospital settings, such as the American Society of Anesthesiologists Physical Status Classification System (ASA-PS) or NACA [5, 6, 9], describe patient background characteristics and acute clinical status. These scores do not evaluate the influence of prehospital care and were not originally built or implemented for prehospital use, so their reliability in prehospital settings is questionable [6].

The revised scoring examples are expected to improve the selection of the correct benefit category. After each EMS mission, the EMS personnel responsible for documenting the mission choose a suitable benefit category depending on the individual mission circumstances. Even though the revised examples present the consensus opinion of the experts and give guidance for selecting the benefit category, the scoring is ultimately based on the subjective judgement of the person doing the documentation. This is because the revised examples, although versatile, are not comprehensive. Additionally, it is justifiable to deviate from the score suggested by the exemplar interventions if the patient has, for example, benefited from several interventions or fast air transport or if, on the other hand, the interventions performed have been unnecessary or ineffectual. Despite the subjective nature of the EBS, it can serve as a valuable tool for gathering information on one aspect of prehospital missions, as the effectiveness of prehospital emergency care is highly complex and a fully inclusive scoring system for this purpose does not exist.

During the Delphi process, the benefit category examples were revised, but the numerical scoring categories remained intact, as it was judged unreasonable to evaluate the number of categories during the same process. These numerical categories were originally developed based on practical experience, so they lack a scientific basis, and the categories themselves or their number might be inappropriate. This issue must be taken into account in future studies, which should assess whether the categories need to be revised.

To evaluate the effectiveness of prehospital care, various quality indicators and measurement protocols have been launched [1, 11, 12, 13], but few studies have focused on their implementation or outcomes. A single scoring system does not solve the absence of process control in EMS systems, but combined with other measures, the EBS can support internal quality improvement. For example, EMS unit dispatch codes and criteria can be compared with EBS values, that is, with the benefit that EMS produced for prehospitally treated patients as interpreted by the treating clinician. Beyond accurately dispatching the proper level and number of EMS units, however, EMS system coverage and the geographic location of units remain challenges [14, 15]. The type and number of missions that have historically occurred in the areas under observation are important aspects in locating EMS units and bases. With the EBS, additional information on regional missions can be gathered. However, far-reaching conclusions based on the EBS are not justified until its reliability and validity have been studied in various settings.

Strengths and limitations

The international expert panel improved the EBS’s generalisability. Despite variations in EMS systems between countries, the EBS evaluates the potential advantages for prehospital patients regardless of the level of the treating EMS unit, the only exception being the highest EBS category, which is reserved for treatments usually offered by only advanced-level units.

The Delphi technique in this study enabled a panel of 18 experienced panellists to express their opinions freely and impersonally, guided by the opinions of 11 in-hospital experts from seven specialties. This method limits dominance by eminent, eloquent or highly opinionated individuals in their respective fields of expertise [7, 8], and the panel moderator is less likely to bias the work of the panel. The Delphi method gives panellists substantial time to express their ideas, reflect on their answers and make changes, and it avoids geographical constraints. On the other hand, the Delphi method itself is vulnerable to a loose definition of an expert, and biases might influence participant selection. The method is also dependent on questionnaire design [7, 8].

A major limitation of this study is that there are limited data on the impact of several prehospital interventions, such as prehospital airway management [16, 17]. An intervention may or may not be life-saving, depending on context. However, in the absence of thorough research-based data on the impact of different interventions, a consensus opinion of experts is meaningful. In addition, no evidence currently exists of paramedics' ability to predict mortality.

Implications

The EBS is based on the subjective opinion of an attending prehospital clinician. To make the scoring system less dependent on individual variation, the renewed exemplar interventions in each EBS category support the selection of the appropriate category. The revised EBS can be used to benchmark different types of units, enabling quality control, which in turn allows EMS efficiency to be developed. The assigned EBS scores can be compared with in-hospital interventions and patient outcomes to evaluate the adequacy of prehospital care. For example, a person unconscious due to alleged alcohol intoxication may have been given EBS 2 on paramedic evaluation but need rapid sequence intubation upon arrival in the emergency department. In such a case, the EBS could be used to detect the discrepancy and to study why it occurred, thereby supporting system quality control. Moreover, if patients with a low EBS receive intensive care or emergency procedures in hospital, this should raise questions about the quality of the prehospital evaluation of the patients’ condition. Finally, this scoring system can be used to categorise prehospital interventions in clinical studies on EMS performance and to obtain more data on where and in which types of missions patients are likely to benefit most. In the future, the EBS could optimally be linked to the care patients receive in hospital and to their later level of performance. However, further reliability and validity studies are needed before wide-scale implementation.

Conclusion

Using the Delphi method, the new scoring system, the EBS, was formed by a panel of experienced experts from across Northern Europe. We recommend implementing the EBS in every EMS system as part of routine reporting.