Introduction

Despite improvements in vehicle technology, emergency services, and trauma management, traffic accidents remain one of the leading causes of death, with a shifting mortality trend towards the prehospital phase [1]. According to European law, since 2018, every new passenger car model must be equipped with automatic emergency call systems (eCall) [2]. Currently, eCall transmits neither medical nor occupant-specific information to emergency call centers.

Our project team is developing an automated injury prediction device for car accidents and therefore an injury prediction tool for traffic accidents in another project. After entering clinical, quickly assessable accident parameters such as seating position, impact against the passenger cell, seatbelt usage, impact direction, EES (energy equivalent speed), vehicle class, and airbag deployment, an expected injury pattern of the vehicle occupants is generated based on an injury risk function derived from logistic regression (unpublished results from our research group). Although the data required for injury prediction are technically measurable, they are currently not comprehensively collected or stored by vehicles or rescue teams. For these reasons, it is necessary for the input data to be initially captured at the accident scene by the attending personnel. This is currently being done in a prospective, multicenter offline testing of the injury prediction tool.

The aim of this subproject is to investigate the ability to collect traffic accident–specific data at the accident scene and reveal potential interobserver variability in data collection quality among different professional groups using real accident scenarios. Furthermore, the time required to conduct data collection will be assessed. Based on these findings, user recommendations for the offline tool will be formulated.

Insights from this study can help evaluate the ability of different professional groups to identify potential accident-independent parameters at accident scenes. This could include, among other things, relaying information to dispatchers at rescue control centers, also within the context of telemedicine approaches. Additionally, the time required and the learning effects from training sessions could provide guidance for possible user training in such scenarios.

Materials and methods

This study is a prospective descriptive cohort study. It has received a favorable vote from the ethics committee (BO-EK-207042021).

Participants

A total of 50 participants were surveyed, divided into five groups with ten individuals each. The positive control group consisted of employees of the Traffic Accident Research Institute at the Technical University of Dresden (hereinafter referred to as TAR, Traffic Accident Researcher). These individuals have professional experience in technical accident reconstruction, are familiar with the variables to be collected, and regularly review images of real accident scenarios. The negative control group comprised ten laypersons with no experience in emergency medicine or the technical assessment of accident scenarios (LAY). Other professional groups included emergency physicians (EP), hospital physicians from a university-level trauma center (HP), and emergency services (EC; each with n = 10). The survey was conducted pseudonymously, and data on the participants’ age, gender, professional group, and years of professional experience were collected.

Survey and training

During the participant survey, group sessions of approximately ten individuals per survey session were conducted. Real accident scenario images were presented both before and after a training presentation using PowerPoint as exemplified in (Fig. 1) (Microsoft Corporation, Washington, USA). Participants were asked to visually assess the required input parameters for injury prediction (similar to a real accident scenario) and were given a time limit of 120 s per case. The following accident parameters had to be evaluated, as they are currently necessary for injury prediction: vehicle class (compact, mid-size, luxury), whether the collision involved a rigid obstacle (e.g., tree, wall), rollover, main deformation area of the vehicle to determine the impact direction (front, right, left, rear), impact on the passenger compartment (impact between A and B pillars), energy equivalent speed (EES), seating position of the affected person (driver or passenger), seatbelt usage, age (categorized as 0–17 years, 18–64 years, > 65 years), gender, and airbag deployment (differentiating between curtain, frontal, knee, and side airbags). Additionally, two complex accident scenarios (rollover, multiple collisions) were demonstrated before and after the training.

Fig. 1
figure 1

Exemplary images from the participant survey. Depicted is a crash vehicle, showing the damage pattern, seating position, and airbag status

During the training session, which took place after the first ten cases, the following parameters were defined and explained using real accident images in a presentation: vehicle class, rigid obstacle, rollover, impact on the passenger compartment, impact point (front, right, left, rear), seatbelt status, four airbags (curtain airbag, frontal airbag, knee airbag, side airbag), and energy equivalent speed (EES). Participants had the opportunity to ask questions during the training.

The parameters (driver/passenger), age, gender, and seatbelt status were provided to participants both before and after the training, as they could not be determined from the accident images. Estimated values were recorded on questionnaires. The processing time for all 24 cases was recorded by 23 randomly selected participants (2 laypersons, 3 traffic accident research staff, 5 emergency medical service personnel, 7 hospital doctors, 6 emergency physicians).

Participants were informed about the purpose and procedure of the survey and provided written consent.

Scenarios

The scenarios used are part of the GIDAS database (German In-Depth Accident Study) collected by the Traffic Accident Research Institute at TU Dresden, and actual values of the variables to be collected are available to the study team as a reference. Fully reconstructed scenarios were utilized (n = 10 per professional group).

Out of the ten cases to be assessed, six had an EES of 0–30 km/h, three had an EES of 30–50 km/h, and one case had an EES of > 50 km/h. The selection of cases was done randomly and based on the relative distribution of EES in the GIDAS dataset (extraction 2010–2021). In this dataset, for maximum AIS 3 or more severe injuries (AIS3 +) following a traffic accident, the EES distributions were as follows: EES 0–30 (n = 1.309, 57%), EES 30–50 km/h (n = 609, 27%), EES > 50 km/h (n = 369, 16%). The relative distribution of impact direction from GIDAS (52% frontal, 27% side impact, ratio approximately 2:1) was also considered. Seven cases with frontal impact and three cases with side impact were selected for each group. Additionally, two complex cases (multiple collisions, rollover) were demonstrated before and after the training.

Statistics

In addition to the descriptive analysis of data regarding the demographics of the respondents, prediction accuracy, and time required, the quality of individual variable assessments was compared between professional groups. Furthermore, a comparison was made between the quality of assessments before and after the training. Except for the EES value, the response options for the other 13 variables were considered binary (true vs. false response). For the EES estimation, the deviation from the actual reconstructed numerical value was considered, and the correctness of the EES estimation was assessed with a tolerance of 10 km/h. If the response fell within this range, it was considered correct. The Wilcoxon test was used for non-normally distributed, paired samples to calculate significances in differences between professional groups before and after the training. The Kruskal–Wallis test was applied to calculate significances in differences between the assessed parameters before and after the training. If significant differences were observed, professional group-specific differences were calculated using the Mann–Whitney U test. Differences were recognized as statistically significant at a significance level of p < 0.05. Additionally, a comparison was made between front and side impact. No imputation methods were applied for missing data. Excel for Mac (v16.46, Microsoft Corporation, Washington, USA), GraphPad Prism version 9.0.1 for Mac (GraphPad Software Inc., San Diego, CA, USA), and SPSS (release 23 for Windows, IBM, Armonk, NY, USA) were used to calculate and visualize the results.

Results

A total of 50 participants were surveyed. Demographic data were available for 47 participants (94%). Complete data were available for the LAY, TAR, and EC groups (Table 1). In the HP group, data were missing for two participants and for one participant in the EP group. Of all study participants, 81% (n = 38) were male, and 19% (n = 9) were female. The mean age of all participants was 34 years (SD 9.08). The lowest mean age was 25 years in the LAY group, and the highest mean age was 43 years in the EP group. The mean professional experience was 8 years, with employees of TAR (12 years), EC (11 years), and EP (14 years) having a significantly longer average professional experience compared to HP (3 years).

Table 1 Demographic parameters of the surveyed participants (gender, age, and professional experience)

Processing time per case

The average processing time per case was 68 s, with case 1 taking 96 s (ranging from 57 to 140 s) and the last processed case, case 22, taking 60 s (ranging from 37 to 98 s). The median processing time for the 10 non-complex cases before training (cases 1–10) was 73 s, and after training (cases 13–22), it was 59 s (p = 0.003, two-way ANOVA).

Figure 2 shows the decreasing trend in processing time with the number of cases performed. The complex cases required longer processing times. Cases 11 and 12 (before training) were completed on average after 81 s (ranging from 49 to 119 s). Cases 23 and 24 (after training) took an average of 72 s (ranging from 38 to 118 s).

Fig. 2
figure 2

Processing time per case (mean and standard deviation). A trend of reduced processing time with case progression is evident, with the complex cases 11, 12, and 23, 24 (boxes) standing out as outliers. Cases before training (1–10) were significantly processed more slowly than after training (13–22, *p = 0.003, two-way ANOVA)

Impact of user training

Before training, of all participants, regardless of professional group and impact direction, 629 of the binary answers (10.7%, median 37 per parameter) were answered incorrectly. After training, this reduced to 340 answers (5.5%, median 11 per parameter). The sum of the differences between estimated and actual EES before training was 4250 km/h and decreased by 28.8% to 1223 km/h after training.

Regarding the quality of the input parameter assessments before vs. after training, regardless of impact direction, the following relative distributions were observed: vehicle class, 74% correct answers before and 74% after training; rigid obstacle collision, 90% before and 99% after training; rollover, 98% before and 98% after training; impact side, 91% before and 93% after training; damage to the passenger compartment, 71% before and 95% after training; EES within ± 10 km/h, 52% before and 74% after training; seatbelt usage, 97% before and 98% after training; curtain airbag, 88% before and 93% after training; front airbag, 96% before and 99% after training; knee airbag, 93% before and 94% after training; and side airbag, 80% before and 91% after training. Seat position, age group, seatbelt usage, and gender were predetermined for the participants (Table 2).

Table 2 Correct assessments by professional group, input parameters before and after user training, and the difference, with a breakdown by impact direction (relative distribution in %)

All impact directions

Regardless of the impact direction, after training, LAY (− 6%), TAR, and hospital doctors (− 2% each) showed slightly poorer assessments of the vehicle class (not significant, p > 0.315). Significant improvements were observed in the assessment of a rigid obstacle collision by LAY (+ 6%, p = 0.034) and hospital doctors (+ 22%, p = 0.01), with a significant overall improvement among all professional groups (+ 8.4%, p < 0.01). After training, all professional groups exhibited significantly better assessments of passenger compartment damage (+ 20%, p < 0.007). The estimation of EES was also significantly better for EC (+ 30%, p = 0.009), hospital doctors (+ 35%, p = 0.019), EP (+ 35%, p = 0.011), and in the overall assessment (+ 22%, p < 0.001). In terms of assessing airbag deployments, significantly better results were obtained for the assessment of front airbag deployment by LAY (+ 8%, p = 0.02), as well as the assessment of side airbag deployment by LAY (+ 17%, p = 0.048), hospital doctors (+ 11%, p = 0.09), and EP (+ 10%, p = 0.04).

Frontal impact

Looking at the cases after a frontal collision, in addition to the overall assessment mentioned earlier, significant differences were observed in the assessment of impact direction by LAY (p = 0.023), EC (p = 0.008), and hospital doctors (p = 0.02). Laypersons also showed a significantly better estimation of the EES value (p = 0.024). Compared to the overall assessment (Table 2), differences in front and side airbag assessments by LAY were not significant.

Side impact

For side impact, after training, EC showed a significantly better assessment of rigid obstacle collision (p = 0.046), impact direction (p = 0.02), and side airbag (p = 0.03). Hospital doctors (p = 0.046) and EP (p = 0.025) also had better scores for impact direction. Compared to the overall assessment (Table 2), LAY did not show significant differences in the assessment of rigid obstacle collision, front and side airbags. For TAR, knee airbag, and EP, side airbag assessments were not significant.

Complex cases

Compared to the overall assessment (Table 2), LAY showed significantly better results after training for impact direction (p = 0.023), EES (p = 0.025), seatbelt usage (p = 0.008), and curtain airbag (p = 0.034). The significance for front and side airbags was not observed. TAR did not show significant differences, but EC showed significant differences in impact direction (p = 0.014) and curtain, front, and knee airbags (p = 0.046), with EES not being significant. Hospital doctors showed significant differences in rollover (p = 0.038) and impact direction (p = 0.005), while passenger compartment, EES, and side airbag assessments were no longer significant. Emergency physicians performed significantly better in assessing impact direction (p = 0.024) and front airbag (p = 0.025), with differences in EES and side airbag no longer being significant.

Interdisciplinary comparison before and after user training

The following analysis pertains to the 20 non-complex cases. When considering differences in the assessed input parameters before training, significant differences between professional groups are observed in the assessment of vehicle class (p = 0.047), rigid obstacle collision (p = 0.036), impact side (p = 0.032), EES (p = 0.006), seatbelt usage (p = 0.024), front airbag (p = 0.01), and side airbag (p = 0.017). Of these, only the assessment of front airbag remains as the sole significant difference after training (p = 0.014; Table 3).

Table 3 Comparison between input parameters before and after user training

Interdisciplinary differences before training

In the further differentiation of the significantly differing input parameters, the interdisciplinary differences before training are presented in Table 4. Regarding the assessment of the vehicle class, TAR significantly outperformed LAY (p = 0.023) and EC. For rigid obstacle collision, hospital doctors performed significantly worse than LAY (p = 0.023), TAR (p = 0.007), and EP (p = 0.043). Impact side was rated significantly better by TAR compared to hospital doctors (p = 0.011). The EES was also significantly better estimated by TAR than by LAY (p = 0.009), EC (p = 0.005), hospital doctors (p = 0.001), and EP (p = 0.011). Seatbelt usage was predetermined; however, EP outperformed EC (p = 0.023). Concerning the front airbag, LAY were inferior to TAR (p = 0.023) and EP (p = 0.043). Side airbag was significantly better assessed by TAR compared to LAY (p = 0.019), EC (p = 0.003), and hospital doctors (p = 0.019).

Table 4 Interdisciplinary differences before user training, subdivided by input parameters (which showed significances in the Kruskal–Wallis test, Mann–Whitney U test)

Interdisciplinary differences after training

For the assessment of the front airbag (Table 3), no significant interdisciplinary difference dependent on training was observed in the Mann–Whitney U test (p ≥ 0.28).

Discussion

In this study, a possible interobserver variability in the assessment quality of technical accident parameters was examined using real accident scenarios from GIDAS database collected by the Traffic Accident Research Institute at TU Dresden. The surveyed participants had varying degrees of experience in technical aspects and emergency medicine. Positive controls for technical understanding were employees of the TAR, while EP served as positive controls for preclinical emergency medicine knowledge. Laypersons with no prior experience in both fields served as the negative control group.

As the number of assessed cases increased, participants tended to require less time for parameter assessment, implicating a trainings effect. After user training, the processing time was significantly reduced (see Fig. 1). This suggests both a training effect and a positive impact of the training on processing time. After training, the median processing time for non-complex cases was 59 s.

A significantly positive training effect on the assessment quality was observed both in the overall analysis of all answers, regardless of parameters and professional group, and for the parameters rigid impact opponent, affected passenger compartment, EES, and front and side airbags. This indicates the necessity of training, especially for these parameters. These parameters were analyzed in a parallel project having significant impact on the prediction value of injury severity. Notably, both laypersons and medical professionals (doctors and emergency personnel) benefited from the training, indicating that prior medical knowledge played a relatively minor role. In contrast, TAR, who possessed technical knowledge, demonstrated superiority in the assessment, as expected. This was supported by their significant advantage in interdisciplinary comparison before training. For various parameters, such as passenger compartment damage, the superiority of TAR remained even after training, except for seatbelt usage (a preset value).

EC benefited significantly in the assessment of EES and damage to passenger compartment, possibly due to their frequent experience with accident scenarios in preclinical settings. This could suggest that EES and damage to the passenger compartment are of lesser concern and may have been easily defined through training. Vaca et al. postulate comparable results. The prediction of injury profiles by paramedics was sensitive, whereas the assessment of vehicle-specific crash variables was less accurate [3]. Doctors working in preclinical emergency medicine performed significantly better in three parameters, while clinic physicians and LAY performed better in four of the parameters assessed. This might be representative of their experience with traffic accident scenarios.

The training did not result in a significant improvement in the assessment quality for vehicle class, rollover, impact side, curtain airbag, or the predefined parameters of seat position, seatbelt usage, age, and gender. For vehicle class, it is plausible that distinguishing between compact, midsize, and luxury vehicles may not be straightforward, leading to minimal or even slightly worse assessments despite training. The other parameters mentioned already had high accuracy rates before training, so while training tended to improve them, it did not lead to significant improvements, especially for predefined parameters.

When examining complex scenarios, the significant advantages in assessing EES for EC, HP, and EP were no longer apparent. This confirms the expectation that in such scenarios, determining the main impact and its EES may be more challenging. Assessing the side airbag in complex collisions also appeared to be problematic, as it was no longer significantly better after training.

After training, significant interdisciplinary differences were only observed in the assessment of the front airbag, which were not confirmed in the Mann–Whitney U test. Accordingly, the training, which defined the parameters using image examples (not used in subsequent case scenarios following the training), led to a partial and significant improvement in the quality of assessment. On the other hand, the interdisciplinary differences before training were neutralized.

The results of our study reveal the practicability of preclinical assessment of injury severity predictive accident-related parameters. Following appropriate training, any person, regardless of medical or technical knowledge, can assess the accident parameters considered in this study, necessary for innovate injury severity prediction at the accident site. Without training, the assessment should primarily be conducted by individuals with technical expertise and emergency physicians. They performed best without training in assessing the rigid impact opponent, seatbelt usage, and airbag deployment. The EES should be trained for all professional groups if no prior knowledge is available.

Contrary to the seemingly problematic prehospital assessment of whether a severe injury is present [4,5,6], the technical parameters after a traffic accident appear to be well assessable through appropriate training.

In the absence of comparable studies, it is not possible to contextualize this work. However, it is known that training leads to improved performance of the participants [7, 8].

These facts offer the opportunity in advance the automatic eCall to a valid injury severity prediction tool allowing injury adapted alert of emergency medicine staff (HEMS, emergency physician, paramedics) and may help to decide which trauma center level is needed for the adequate treatment for the casualties.

With proper guidance from rescue center dispatchers or driver training, it would be conceivable to query these parameters in the context of an emergency call or eCall to make early inferences about potential accident outcomes and initiate appropriate primary alerting of rescue resources. A currently ongoing prospective multicenter study is investigating the feasibility and predictive accuracy of the aforementioned tablet-based injury prediction tool for prehospital optimization of care for trauma patients following traffic accidents. In this study, the mentioned parameters are being collected by various professional groups within the emergency medical services at both ground and air-based locations across multiple federal states to test the predictive validity. Results are pending. Regardless, this study provides robust results that after appropriate training, valid assessments of accident parameters can be made at the accident scene, regardless of professional background. For example, within a pre-project phase preceding the automated prediction tool (qualified eCall), dispatchers could query the necessary input parameters from the emergency medical service personnel present at the accident scene. This could be done using specific questionnaires to assess injury patterns and make appropriate decisions regarding the required rescue resources or disposition.

Limitations

It is important to note that, in real accident scenarios, the views of the vehicles necessary for assessment may not be readily available, as they were during training. Therefore, a longer time commitment can be expected under real conditions, particularly in potentially stressful and chaotic accident scenes.

Furthermore, the statistical comparisons of individual collision directions are limited in their utility due to the low number of data points (two complex cases, three side impacts before and after training for each group). The parameters of seatbelt status, age, gender, and seating position were predefined in the survey, as they could not be determined from some accident images. Therefore, an analysis for these parameters is not possible.

The lower average professional experience of clinic physicians could potentially influence the comparability to the other professional groups, as they have less experience.

Conclusions

In this study, we proofed the practicability of preclinical injury prediction based on relevant technical accident parameters. Significant differences in the quality of technical accident parameter assessment were observed depending on participants’ technical and medical backgrounds. Employees of the TAR were the most accurate in assessing parameters, while LAY, clinic physicians, EP, and emergency medical services (EC) personnel benefited from training in evaluating the specified parameters. Following the user training, the initial interdisciplinary differences were equalized, and all professional groups provided comparable assessment results without significant differences. It can be assumed that, after training or definition of these accident parameters using appropriate visual materials, any participant in the chain of care can make these assessments.

The time required for the complete assessment of the 14 accident parameters considered here decreased as the number of cases processed increased. Additionally, after training, the average time was significantly shorter than before, indicating a substantial learning effect through application and training. Thus, the collection of such parameters is trainable and can be done within a reasonable timeframe.

For future telemedical, preclinical queries of accident parameters, such as in the development of eCall or the establishment of injury prediction models, user training should be recommended to ensure an adequate assessment by users. After such training, the profession does not appear to have an impact on the quality of accident parameter assessment.