Introduction

Simulation-based training for post-graduate clinical staff is a highly effective training modality [1,2,3]. It usually follows the scenario-debrief training approach [4], with a debriefing style designed to scaffold the exploration of trainees’ internal beliefs and assumptions. This method is particularly effective in the learning of human factors skills, which are as essential for healthcare professionals as technical task-based skills [5,6,7].

Human factors skills are well defined and include situational awareness, communication, teamwork, leadership, decision making and care and compassion [8, 9]. Evaluation of the extent to which these skills are developed during simulation training is needed to ensure the effectiveness of training programmes. The Human Factors Skills for Healthcare Instrument (HuFSHI) is a validated and reliable instrument for evaluating changes in clinical learners' confidence in their human factors skills pre- and post-training [6]. It is valid and reliable across physical and mental healthcare settings and for clinically trained learners.

The delivery of patient care is a team activity. Within a hospital setting, a patient’s care journey relies on the behaviour of both clinical and non-clinical professionals including hospital porters, domestic staff and administrative and managerial staff working together to achieve common goals. Simulation training has recently begun to consider the importance of including non-clinical professionals who must interact with patients and with healthcare professionals [10, 11].

In mental healthcare in particular, non-clinical professional groups such as hospital and primary care administrators, social workers, probation officers, police officers and ambulance and hospital security staff are often the first contact for patients, particularly those experiencing deterioration in their mental health [12]. These non-clinical professionals are also involved at different stages of the processes of assessment, diagnosis and treatment and are important supports for eventual discharge into the community [13]. They are often regarded as auxiliary members of the healthcare team [14] and are frequently present at the very start of the patient’s contact with the healthcare system, a particularly critical phase as evidence suggests that negative experiences at the first contact with mental health services are associated with delays in help-seeking and resistance to treatment [15]. Simulation training is increasingly being used to educate this population about mental healthcare and effective team working in this environment [10, 15,16,17].

The inclusion of non-clinical professionals in healthcare simulation training courses is an important development that acknowledges the importance of good teamwork at all levels of the patient journey. Such multidisciplinary training is relevant across a range of healthcare settings where patients with physical or mental health problems present, or where clinical and non-clinical staff work together.

Evaluating the learning of human factors skills by non-clinical trainees remains challenging, with no validated methods available [18, 19]. The Human Factors Skills for Healthcare Instrument (HuFSHI) provides a reliable and valid method of assessing clinical trainees' human factors skills self-efficacy across acute and mental health settings, and is sensitive to change following training [6]. However, HuFSHI has been validated for use with clinical professionals; it uses healthcare language and refers to clinical settings and tasks, which may be a barrier to its use with non-healthcare professionals. A further potential problem is that it was developed for degree-qualified health professionals and so employs language suited to this audience that may be difficult for others to understand. Indeed, informal feedback from using HuFSHI with non-clinical learners indicated that the content and language were not easily understood. Therefore, the aims of this study were to:

  1. Develop a new version of the Human Factors Skills for Healthcare Instrument for auxiliary healthcare (non-clinical) trainees by adapting the language used to describe human factors skills for non-clinical team members

  2. Test the validity, reliability and sensitivity of the HuFSHI Auxiliary version (HuFSHI-A)

  3. Identify the factor structure of the new instrument

Methods

Setting

The study took place in a large mental health simulation centre in South London: Maudsley Simulation at the South London and Maudsley NHS Foundation Trust. The centre provides simulation training for people working with mental health populations in community and acute settings. Ethical approval was provided by King’s College London ethics committee (RESCMR-15/16-1561).

Participants

Participants were trainees (n = 188) attending simulation training at Maudsley Simulation during an 11-month period (June 2017–April 2018). All were non-clinical professionals whose job roles involved contact with clinical populations and healthcare teams: hospital/primary care administrators (n = 53, 28%), police officers (n = 112, 59%), probation officers (n = 13, 7%) and social workers (n = 10, 5%). Most participants were female (n = 110, 59%) and from White ethnic backgrounds (n = 144, 77%).

Item generation

The initial pool of items was drawn from the development of the HuFSHI [6], which used iterative cycles of psychometric testing to choose the best items. Similarly, we anticipated that psychometric testing would identify the most effective items for this new instrument and reduce the pool of items. Therefore, to allow for this process, the initial item pool comprised the 18 core human factors skills items used to generate the HuFSHI [6]. The wording of the 18 items was revised by psychologists [GR and ML] to be more relevant and understandable for a non-clinical audience, while ensuring they reflected the same core human factors skills. During this process, the items were reviewed by non-healthcare professionals for face and content validity, readability and relevance. The stem question remained the same as in the original HuFSHI: 'Please rate how confident you are that you can manage the following effectively'. Participants were asked to respond on a scale from 1 to 10. The 18 items from HuFSHI, together with the non-clinical versions, are shown in Table 1; items in italics are those finally included in each instrument.
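For concreteness, the sketch below shows one way responses to such an instrument could be represented and scored. The pandas representation, column names and example values are illustrative assumptions; scoring each trainee as the mean of their item ratings matches the mean scores reported in the Results.

```python
# Hypothetical sketch of instrument responses; not the authors' data pipeline.
import pandas as pd

# Each row is one trainee; each column one candidate item, rated 1-10 against
# the stem 'Please rate how confident you are that you can manage the
# following effectively'.
responses = pd.DataFrame(
    {f"item_{i}": [7, 5, 9] for i in range(1, 19)}  # three hypothetical trainees
)

# One plausible instrument score per trainee: the mean of their item ratings.
scores = responses.mean(axis=1)
```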

Table 1 The 18 items that were piloted in development of the original Human Factors Skills for Healthcare Instrument (HuFSHI) are displayed alongside the comparable items piloted for inclusion in the HuFSHI Auxiliary version. Items in italics are those included in each of the final 12-item instruments

Procedure

Participants completed the 18-item instrument before and after attending simulation training. Participants were trainees attending 11 different one-day simulation training courses, described in more detail in Table 2. All courses employed the scenario-debrief approach, addressed the important roles of non-clinical staff and contained learning objectives relating to human factors skills. All training was delivered by experienced trainers and clinical educators at the training centre.

Table 2 Details of the simulation training courses participants attended

Participants were informed about the study both verbally and through a participant information sheet. The instrument was completed by consenting participants at the start of the training day (pre-training) and at the end of the training day (post-training).

Statistical analysis

Item selection

Initial analyses of the pool of 18 items were conducted using IBM SPSS (V.24) [20]. Evaluation of the items proceeded in four steps:

  1. Participant responses to each item were examined descriptively to identify ceiling and floor effects.

  2. As sensitivity to change pre- and post-training was a critical feature of the instrument, paired samples t tests assessed the change in item scores from pre- to post-training, and Cohen's d effect sizes were calculated for each item; items with a small effect size (d < .3) were eliminated from the instrument. This is standard practice in item selection and instrument development: multiple items representing the same construct are proposed and tested, items that show sensitivity to change are retained and those that do not are eliminated [21] (this step is sketched in code after the list).

  3. Inter-item correlations were examined to assess redundancy between items, balanced against theoretical justifications for item selection.

  4. An exploratory factor analysis (EFA) was conducted using a maximum likelihood factor extraction method. Only factors with eigenvalues over 1 were extracted.
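As a concrete illustration of steps 1 and 2 (the screening sketch referenced above), the code below implements the descriptive checks and the effect-size filter. The study used SPSS, so the language, the data layout and the repeated-measures convention for Cohen's d (mean difference divided by the standard deviation of the paired differences) are assumptions, not the authors' procedure.

```python
# Sketch of screening steps 1-2; illustrative, not the authors' SPSS syntax.
import pandas as pd
from scipy import stats

def screen_items(pre: pd.DataFrame, post: pd.DataFrame, min_d: float = 0.3):
    """pre/post: matched (participants x items) frames; returns retained items."""
    kept = []
    for item in pre.columns:
        # Step 1: descriptive check of each item's distribution via skewness.
        skew = stats.skew(pre[item].dropna())
        # Step 2: paired samples t test of pre- vs post-training scores.
        t, p = stats.ttest_rel(pre[item], post[item], nan_policy='omit')
        diffs = (post[item] - pre[item]).dropna()
        d = diffs.mean() / diffs.std(ddof=1)   # one common repeated-measures d
        print(f"{item}: skew={skew:.2f}, t={t:.2f}, p={p:.4f}, d={d:.2f}")
        if abs(d) >= min_d:                    # items with d < .3 are eliminated
            kept.append(item)
    return kept
```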

Instrument sensitivity to change

The sensitivity of the final instrument was explored in a further two steps:

  5. Paired samples t tests compared pre- and post-training scores on the final instrument for the whole sample and for the two most common professional groups (administrators and police officers).

  6. The development of the clinical HuFSHI compared pre- to post-training change between experienced and novice healthcare professionals based on their years of clinical experience. However, due to the diversity of professional groups in our sample, experience was not deemed an appropriate index. Instead, hierarchical cluster analysis explored the characteristics of groups of participants based on their pre- and post-training scores. Ward's method was employed, with squared Euclidean distance measuring the distance between cases; this method iteratively combines individual cases into clusters. The relative distance between clusters informed the number of distinct clusters present. Differences between clusters in participant characteristics and instrument scores were compared using appropriate inferential statistics (a sketch of this step follows the list).
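A minimal sketch of the clustering in step 6 follows, using SciPy in place of SPSS; the tool choice and the synthetic data are assumptions, while the two-cluster cut and the proportional improvement formula mirror the Results.

```python
# Sketch of step 6; SciPy stands in for SPSS here (an assumption).
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

rng = np.random.default_rng(0)
pre_total = rng.uniform(5, 9, size=175)            # hypothetical instrument scores
post_total = pre_total + rng.uniform(0, 1.5, size=175)
scores = np.column_stack([pre_total, post_total])

# Ward's method; SciPy's implementation operates on Euclidean distances,
# giving the same merge order as Ward with squared Euclidean distance.
Z = linkage(scores, method='ward')

# Relative merge distances (the dendrogram) suggest how many distinct
# clusters are present; the study identified two.
tree = dendrogram(Z, no_plot=True)
labels = fcluster(Z, t=2, criterion='maxclust')

# Proportional improvement score, as defined in the Results.
prop_improvement = (post_total - pre_total) / pre_total * 100
```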

Results

Item selection

Step 1

No ceiling or floor effects were observed for the 18 items; all displayed normal distributions, with skewness within normal levels (range −1.7 to −0.59).

Step 2

Paired samples t tests assessed the change in item scores from pre- to post-course, and Cohen's d effect sizes were calculated for each item. Nine participants were excluded from this analysis as they did not complete the post-course questionnaires. Five items with small effect sizes (d < .3) were identified and removed: item 2 (I can ask colleagues from other professions for help if I need it); item 3 (When I disagree with a colleague, I can still work well with them); item 4 (When many things are happening at once, I can work out what needs doing first); item 6 (When making decisions, I can ask my colleagues for help and advice); and item 8 (I can ask colleagues for things I need, even if they are busy). Thirteen items remained after this step.

The items eliminated in this step may have shown small effect sizes for a number of reasons, including item phrasing or a lack of conceptual clarity for participants. However, the human factors skills these items referred to were also represented by other items retained in the pool. Thus, although these specific items were not sensitive to change, this does not mean that the skills they represent did not improve, merely that these items were not the most effective at detecting such changes [21].

Step 3

Inter-item correlations for the remaining 13 items were computed from the pre-course questionnaires and are displayed in Table 3. Reliability analysis of these 13 items revealed a Cronbach's alpha of .93. All items were significantly positively correlated, with item-total correlations ranging from r = .61 to r = .80. Two items showed high item-total correlations of r = .80: item 14 (I can choose the key facts I need to tell a colleague during a good handover) and item 15 (When I am under pressure, I can still make important decisions). Despite the high item-total correlations, these items were retained to ensure that the six human factors skills (situational awareness, decision making, communication, teamwork, leadership and care) were adequately represented in the final tool.

Table 3 Pearson's correlations between items, together with each item's item-total correlation, Cronbach's alpha if item deleted and exploratory factor analysis factor loading

Item 18 (I can show others that I care, even when I am under stress) was one of three items representing the human factors skill of care. This item had the lowest item-total correlation (r = .61), and deleting it had no impact on Cronbach's alpha; it was therefore removed from the final instrument to reduce redundancy. Twelve items were retained after this step.
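For reference, the reliability statistics in this step can be computed directly; a minimal sketch is shown below. Cronbach's alpha follows its standard formula, while the corrected item-total correlation (each item against the total of the remaining items) is one common convention and an assumption about how the reported values were obtained.

```python
# Sketch of the step-3 reliability statistics; conventions are assumptions.
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))."""
    k = items.shape[1]
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - items.var(ddof=1).sum() / total_var)

def item_total_correlations(items: pd.DataFrame) -> pd.Series:
    """Correlate each item with the total of the remaining items."""
    totals = items.sum(axis=1)
    return pd.Series({c: items[c].corr(totals - items[c]) for c in items})
```

Applied to a participants-by-items frame of the 13 retained items, these functions would yield statistics of the kind reported above (Cronbach's alpha and item-total correlations).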

Step 4

Reliability analysis of the remaining 12 items revealed a Cronbach's alpha of .934. An exploratory factor analysis of pre-training scores was conducted using a maximum likelihood method, extracting factors with eigenvalues greater than 1. The Kaiser-Meyer-Olkin measure of sampling adequacy was .94, and Bartlett's test of sphericity was highly significant (chi-squared = 1399.29, df = 66, p < .0001).

The scree plot produced in the exploratory factor analysis revealed a one-factor solution that explained 58.3% of the variance. The factor loadings of each item are displayed in Table 3 (range .63 to .84).
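The sketch below reproduces the logic of this step in Python rather than SPSS (an assumption about tooling); the synthetic one-factor data are purely illustrative. Note that Bartlett's test for a 12-variable correlation matrix has p(p − 1)/2 = 66 degrees of freedom.

```python
# Sketch of the step-4 EFA; NumPy/scikit-learn stand in for SPSS (assumption).
import numpy as np
from scipy.stats import chi2
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
latent = rng.normal(size=(175, 1))                 # one hypothetical factor
X = latent @ rng.uniform(0.6, 0.9, (1, 12)) + rng.normal(scale=0.6, size=(175, 12))

n, p = X.shape
R = np.corrcoef(X, rowvar=False)                   # 12 x 12 correlation matrix

# Kaiser criterion: retain factors whose eigenvalues exceed 1; the leading
# eigenvalue over p gives the proportion of variance explained.
eigvals = np.linalg.eigvalsh(R)[::-1]
n_factors = int((eigvals > 1).sum())
var_explained = eigvals[0] / p

# Bartlett's test of sphericity: chi2 = -(n - 1 - (2p + 5)/6) * ln|R|,
# with p(p - 1)/2 degrees of freedom.
bartlett = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
p_value = chi2.sf(bartlett, p * (p - 1) // 2)

# Maximum likelihood factor extraction on standardised items; with one
# factor, components_ corresponds to the factor loadings.
Xz = (X - X.mean(0)) / X.std(0)
loadings = FactorAnalysis(n_components=n_factors).fit(Xz).components_.T
```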

Sensitivity to change

Step 5

Comparing mean scores for the final 12-item instrument pre- and post-training revealed that participants’ scores significantly improved post-training (p < .0001) overall and at the professional group level (p < .001) with large effect sizes (d > .7) (see Table 4).

Table 4 Paired samples t test comparisons of mean 12-item instrument scores by professional group

Step 6

Participants' pre- and post-training scores on the final 12-item instrument were entered into the cluster analysis. A further four participants were excluded as outliers identified during the clustering process; these cases are described in more detail below (see the "Outliers" section). After exclusion of these cases, the final dendrogram revealed two clear clusters: cluster 1 contained 104 cases and cluster 2 contained 70 cases. Chi-squared analyses revealed that the clusters did not differ significantly in participant characteristics (Table 5). Independent samples t tests revealed that participants in cluster 1 had significantly higher pre- and post-training scores than participants in cluster 2. A proportional improvement score was calculated for each participant as the difference between pre- and post-training scores, divided by the pre-training score and multiplied by 100 (i.e. [((post-score − pre-score)/pre-score) × 100]; see the worked example below). Proportional improvement scores were significantly greater for participants in cluster 2 (M = 11.23, SD = 12.28; range −18 to 43) than for cluster 1 (M = 5.11, SD = 6.54; range −10 to 27) (z = 3.23, p = .001) (see Fig. 1).
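Written out, with hypothetical pre- and post-training scores of 6.0 and 7.5:

$$\text{proportional improvement} = \frac{\text{post} - \text{pre}}{\text{pre}} \times 100, \qquad \text{e.g. } \frac{7.5 - 6.0}{6.0} \times 100 = 25\%$$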

Table 5 Comparisons of instrument scores and participant characteristics by identified clusters
Fig. 1

Mean proportional improvement of participants' scores post-training by identified cluster, alongside participants identified as outliers

Outliers

The four outliers identified comprised two administrators and two police officers. The pre- and post-training scores of these participants were particularly low (pre-training M = 5.15, SD = .97, range 3.90–6.30; post-training M = 7.52, SD = 1.35, range 5.90–9.00). However, the proportional improvement of these participants was greater than the upper limit in either cluster 1 or cluster 2 (M = 55.5, SD = 5.4; range 50–61) (see Fig. 1).

Discussion

The aim of this study was to develop an instrument to evaluate the learning of human factors skills by non-clinical staff working in healthcare settings or with clinical populations. The new instrument, the Human Factors Skills for Healthcare Instrument-Auxiliary version (HuFSHI-A), is a 12-item instrument with a single-factor structure. It is reliable, has face and content validity and is sensitive to change following training. This is the first instrument specifically developed for use with non-clinical populations receiving simulation training in the context of healthcare, and it will enable better design and evaluation of simulation training for these populations.

This instrument has been developed and validated in a healthcare educational setting in which the training focused on human factors skills in a mental healthcare context. Although the courses differed in content, a focus on improving participants' human factors skills in scenarios relevant to their daily work was maintained throughout the development, delivery and review of all courses. Human factors skills were embedded in the course learning outcomes, the scenario design and the debrief approach, and training was delivered by experienced trainers. Therefore, it would be expected that participants' human factors skills self-efficacy would improve following such training, providing a rationale for evaluating the instrument in this robust training environment. The diversity of course learning objectives across the 11 training courses (Table 2), alongside the diversity of the participants' professional backgrounds (primary and secondary care administrators, police officers, probation officers and social workers), provides reassurance that the final instrument is widely applicable.

Data gathered during the instrument development phase showed that non-clinical trainees' self-efficacy in their human factors skills increased significantly post-training and improved more for those whose pre-course scores were low. This pattern of low pre-training scores and higher improvement scores was particularly pronounced for the four trainees in this cohort who were identified as outliers. Participants' pre-training scores and improvement scores were not associated with their professional group, years qualified or any demographic characteristics. For trainees with lower pre-training scores, human factors skills may have been a concept they had not encountered before the training. This study provides evidence that exposure to and practice of these skills through simulation training leads to significant improvements, especially when participants are less confident to begin with. Given that non-clinical populations clearly benefit from simulation training, there is a strong rationale for including them in training programmes.

HuFSHI-A has both face and content validity [22]. Due to a lack of available tests of this type, it was not possible to measure criterion validity at this stage [22]. A contemporary approach to validity is addressed through Kane's framework [23], which examines an instrument's validity in the context of its specific purpose and comprises four steps: scoring, generalisation, extrapolation and implication. Kane's framework focuses on the development of instruments to be used in assessment decisions. As such, it emphasises the implications of assessment tools, instruments and approaches that measure individual attainment for the purposes of admission, progression or award decisions. The HuFSHI-A is not an assessment tool; its purpose is to help educators determine the extent to which simulation training is effective at improving non-clinical learners' self-efficacy in human factors-oriented skills. It is not intended for individual assessment. Therefore, extrapolating from individual scores to real-world performance (step three in Kane's framework) would not be an appropriate validation of this instrument (although previous studies have shown that similar self-efficacy measures do correlate with work-related performance [24]). Similarly, Kane's final validation step, implication, evaluates the consequences or impact of the assessment on the learner, whereas HuFSHI-A has no implications for individual learners. Rather, we argue, its results help to inform decisions regarding training design and delivery.

Despite some lack of fit between Kane's framework and the purpose of the current instrument, HuFSHI-A does meet Kane's first two steps of validity (i.e. scoring and generalisation). Scoring validity is evidenced by rigorous item selection procedures [6] and by item scoring consistent with theory and practice in self-efficacy instrument design [21]. Generalisation validity is evidenced by diversity in the pilot data, both in training content and in trainee professional groups. Thus, we claim that this instrument has a valid scoring procedure and is generalisable to the populations for which it was designed (i.e. non-clinical professionals working in healthcare settings, or with clinical populations). We do not claim that this instrument has validity beyond this context; however, we believe this context is sufficiently broad to support its use in similar settings for the purposes of evaluating and informing simulation training.

The clinical training content in the current sample focused on mental health. However, the human factors-oriented learning objectives of the training courses, which were the focus of this evaluation instrument, were not specific to mental health but are general human factors skills (i.e. situational awareness, communication, teamwork, leadership, decision making and care and compassion) that traverse all aspects of working within clinical contexts, irrespective of the nature of the clinical situation (e.g. mental or physical health). As such, the clinical topics of the training courses provide the context for practising these skills, but the skills themselves are not bound to any clinical situation: they are transferable across all aspects of team working in a healthcare context. For this reason, we anticipate that the HuFSHI-A would be applicable for evaluating human factors skills learning following training in any context where clinical and non-healthcare professionals work together.

Joint training of clinical and non-clinical staff has received little attention but is increasingly necessary as non-healthcare professionals are recognised as some of the first-line contacts in patient care [12]. Mental healthcare simulation programmes are leading the development of such multidisciplinary training because this is where the need is greatest [10, 25]. However, human factors skills are the building blocks of effective communication and team working. Effective patient care in any healthcare context relies on effective teamwork, not just between healthcare professionals but also with non-clinical team members, e.g. administrators, receptionists and managers. Improving human factors skills in both clinical and non-clinical staff working in any healthcare context has the potential to improve the efficiency and effectiveness of team working, leading to improvements in the provision of healthcare more broadly. Joint training of clinical and non-clinical staff is therefore an important consideration across all healthcare sectors. The provision of tools such as HuFSHI and HuFSHI-A to evaluate learning following training is a step towards broadening this practice.

Conclusions

Simulation training develops human factors skills that are essential for clear communication and good teamwork. Although instruments are available to evaluate learning following simulation, they are not tailored to the non-healthcare professionals who are increasingly included in training programmes. The strengths of this instrument are that it was empirically developed, has good validity and reliability and is brief, making it feasible to incorporate into busy training programmes. Its use will facilitate the development of effective human factors training programmes and effective teamwork between clinical and non-clinical disciplines. Further research could examine its application alongside training in other healthcare settings.