Patient safety is an important component of healthcare quality. Several studies in various countries have shown that 2.9% to 16.6% of patients in acute care hospitals experience one or more adverse events [1–9]. Approximately 50% of these adverse events are judged to be preventable. It is believed that, to improve quality and safety in healthcare, hospitals have to create a patient safety culture among their staff in addition to making structural interventions. The culture of an organisation consists of the shared norms, values, behaviour patterns, rituals and traditions of its employees [10]. Safety culture is one aspect of organisational culture. A positive safety culture guides the many discretionary behaviours of healthcare professionals toward viewing patient safety as one of their highest priorities [11]. The Institute of Medicine states that in a safety culture where adverse events can be reported without people being blamed, staff have the opportunity to learn from their mistakes, and improvements can be made to prevent future human and system errors, thus promoting patient safety [12].

Therefore, if hospitals want to improve patient safety, it is important to know more about the culture regarding patient safety. Several instruments are available to assess the safety culture in hospitals [13, 14]. One of these instruments is the Hospital Survey on Patient Safety Culture (HSOPS) of the Agency for Healthcare Research and Quality (AHRQ) [15]. Previous research has shown that the psychometric properties of the HSOPS in the USA are good [13–15]. We translated the questionnaire into Dutch for application in the Netherlands [16]. We used forward-backward translation: the questions were translated into Dutch by one translator and then translated back into English by an independent translator who was blinded to the original questionnaire [17].

The HSOPS is a commonly used instrument to measure multiple dimensions of patient safety culture. It is being used in the USA [18] and the UK [19]. At the international Patient Safety Research Conference in Porto in September 2007, it became apparent that not only the Netherlands but also other countries, including Belgium, Denmark and Norway, use a translation of the questionnaire. After translating a questionnaire into another language and applying it in a different setting, it is important to check the validity and reliability of the questionnaire in this new situation. If the psychometric properties of the Dutch version of the HSOPS are comparable to those of the original questionnaire, cross-country comparisons are possible and useful for gaining more insight into the elements of patient safety culture in specific countries.



The Dutch version of the HSOPS was distributed in eight hospitals in the Netherlands in June 2005. The hospitals differed by teaching status: four general hospitals, three teaching hospitals and one university hospital. The capacity of these hospitals varied from 530 to 1120 beds. The participating hospitals were located across the Netherlands. Within the eight hospitals, 23 units participated (two to five units per hospital): six internal medicine units, five intensive care units, three surgical units, three emergency departments, two pediatrics units, two neurology units and one psychiatry unit. Units and hospitals were not randomly selected; the participating units were about to introduce an incident reporting system and wanted to assess their patient safety culture prior to the implementation of the new system. In each unit, a random sample of about 30 healthcare providers was drawn, depending on unit size. When a unit had fewer than 30 staff members, all healthcare providers of the unit were asked to participate.

The questionnaire was disseminated on paper through the mail sorting boxes of all selected healthcare providers at the unit. A research coordinator in the hospital took care of the distribution. To allow for confidentiality, respondents could send the questionnaire directly to the researchers outside the hospital in a postage-paid return envelope. The management board and medical board of each participating hospital formally consented to participate in the study. Formal ethical approval was not needed for this study according to Dutch law.

A total of 583 respondents completed the questionnaire. Most respondents worked as registered nurses (59.8%). Other respondents worked as medical consultants (6.8%), resident physicians (6.0%), administrative staff (4.3%), nurses in training (2.6%) or in management (2.4%). These percentages give a reasonable reflection of the real distribution of disciplines at the units.


Background variables

Work-related information, e.g. the respondent's primary department in hospital, how long he/she has been working at this unit, how many hours a week and in which function.

Items on patient safety culture

Most items of patient safety culture can be answered using a five-point scale reflecting the agreement rate: from 'strongly disagree' (1) to 'strongly agree' (5), with a neutral category 'neither' (3). Other items can be answered using a five-point frequency scale from 'never' (1) to 'always' (5). In addition, there are two mono-item outcome variables: 1) Patient safety grade: measured with a five-point scale, from 'excellent' (1) to 'failing' (5), and 2) Number of events reported: how often the respondent has submitted an event report in the past 12 months (answer categories: 'none', '1–2 event reports', '3–5 event reports', '6–10 event reports' and '11–20 event reports').

The original items have been validated by the Agency for Healthcare Research and Quality (AHRQ) for the USA hospital setting [15]. Factor analysis resulted in 12 factors (dimensions). The codes in brackets after each dimension refer to the sections in the questionnaire and the numbers of the questions.

F1 Teamwork across hospital units (F2, F4, F6, F10)

F2 Teamwork within units (A1, A3, A4, A11)

F3 Hospital handoffs and transitions (F3, F5, F7, F11)

F4 Frequency of event reporting (D1, D2, D3)

F5 Nonpunitive response to error (A8, A12, A16)

F6 Communication openness (C2, C4, C6)

F7 Feedback and communication about error (C1, C3, C5)

F8 Organisational learning – continuous improvement (A6, A9, A13)

F9 Supervisor/manager expectations and actions promoting patient safety (B1, B2, B3, B4)

F10 Hospital management support for patient safety (F1, F8, F9)

F11 Staffing (A2, A5, A7, A14)

F12 Overall perceptions of safety (A10, A15, A17, A18)

Data screening and pre-analyses

Completeness of the data was checked. Five respondents were excluded from the analyses because they had completed fewer than half of all items. When a respondent had chosen two or more options for one item, which rarely occurred, the item was marked as missing. Missing values were replaced by the respondents' mean score on the item. The highest numbers of missing values were found in part D (Frequency of event reporting): 3.8% to 4.5% of the responses to these items were missing. No items were excluded based on the percentage of missing values. The distribution of only one variable, Number of events reported, was skewed. There were no variables with 80% or more answers in one category.
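The screening steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' SPSS procedure: the function name `screen_and_impute`, the use of `None` for a missing answer and the toy data are all assumptions, and the imputation is read here as "mean of the item across the retained respondents".

```python
from statistics import mean

def screen_and_impute(responses, min_fraction=0.5):
    """Drop sparse respondents, then fill gaps with the item mean.

    responses: one row per respondent; None marks a missing answer
    (hypothetical encoding chosen for this sketch).
    """
    n_items = len(responses[0])
    # Keep respondents who answered at least half of all items.
    kept = [row for row in responses
            if sum(v is not None for v in row) >= min_fraction * n_items]
    # Mean of each item over the retained respondents who answered it.
    item_means = [mean(row[j] for row in kept if row[j] is not None)
                  for j in range(n_items)]
    # Replace each missing answer by the mean of its item.
    return [[item_means[j] if v is None else v for j, v in enumerate(row)]
            for row in kept]

data = [
    [4, 5, None, 3],
    [2, None, None, None],  # fewer than half answered: excluded
    [5, 4, 2, 3],
    [3, 3, 4, None],
]
cleaned = screen_and_impute(data)
```

In this toy data set the second respondent is dropped and the two remaining gaps are filled with the mean of the corresponding item. An item that nobody answered would make `mean` raise an error, which the sketch does not handle.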

We checked whether the inter-item correlations were sufficient by examining the correlation matrix. Questions belonging to the same underlying dimension will correlate, as they measure the same aspect of patient safety culture. Items that do not correlate, or that correlate with only a few other variables, are not suited for factor analysis [20]. Bartlett's test of sphericity demonstrated that the inter-item correlations were sufficient: χ² = 6456.8; df = 861; p < 0.001.
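For readers who want to see how such a statistic arises, Bartlett's test of sphericity is commonly computed from the determinant of the correlation matrix R as χ² = −(n − 1 − (2p + 5)/6)·ln|R|, with p(p − 1)/2 degrees of freedom for p items and n respondents. Under this formula the reported df = 861 corresponds to p = 42 items. The sketch below uses only the standard library; the function names are illustrative and the p-value is not computed.

```python
import math

def log_determinant(matrix):
    """Log-determinant of a positive-definite matrix via Gaussian elimination."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    log_det = 0.0
    for i in range(n):
        pivot = a[i][i]
        if pivot <= 0:
            raise ValueError("matrix is not positive definite")
        log_det += math.log(pivot)
        # Eliminate column i from all rows below the pivot row.
        for r in range(i + 1, n):
            factor = a[r][i] / pivot
            for c in range(i, n):
                a[r][c] -= factor * a[i][c]
    return log_det

def bartlett_sphericity(corr, n_respondents):
    """Return (chi-square statistic, degrees of freedom)."""
    p = len(corr)
    chi2 = -(n_respondents - 1 - (2 * p + 5) / 6) * log_determinant(corr)
    df = p * (p - 1) // 2
    return chi2, df

# Toy example: two items correlating 0.5, answered by 100 respondents.
chi2, df = bartlett_sphericity([[1.0, 0.5], [0.5, 1.0]], 100)
```

An identity correlation matrix (no inter-item correlation at all) gives a statistic of zero; the stronger the correlations, the smaller the determinant and the larger the statistic.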

We also checked whether the opposite occurred: too much correlation between the items. Ideally, every aspect of patient safety culture uniquely contributes towards the concept of patient safety culture. A high correlation between two items means that patient safety culture aspects overlap to a large extent. The overlap in the answer patterns is about 50% when a correlation is 0.7 [20]. No correlations exceeded this boundary score.
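The 50% figure follows from the fact that the proportion of variance two items share equals the squared correlation coefficient:

```latex
r = 0.7 \quad\Longrightarrow\quad r^{2} = 0.7^{2} = 0.49 \approx 50\%
```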

In addition, the Kaiser-Meyer-Olkin Measure of Sampling Adequacy (KMO) was determined. This value can range from 0 to 1. A value near 1 indicates that there is hardly any spread in the correlation pattern, enabling reliable and distinctive dimensions by factor analysis [20]. The KMO score was 0.9, far above Kaiser's criterion of 0.5. The pre-analyses demonstrate that the data can be used for factor analysis.
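As an illustration of what the KMO measures, the usual definition compares the squared correlations with the squared partial correlations: KMO = Σr² / (Σr² + Σq²), summed over all pairs of different items, where the partial correlations q are obtained from the inverse of the correlation matrix. The following standard-library sketch assumes that definition; the function names are illustrative and it is not the SPSS implementation.

```python
import math

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity matrix.
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot_row = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot_row][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def kmo(corr):
    """Overall Kaiser-Meyer-Olkin measure for a correlation matrix."""
    inv = invert(corr)
    p = len(corr)
    sum_r2 = sum_partial2 = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                # Partial correlation of items i and j given all other items.
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                sum_r2 += corr[i][j] ** 2
                sum_partial2 += partial ** 2
    return sum_r2 / (sum_r2 + sum_partial2)

# Toy example: three items that all correlate 0.5 with each other.
toy = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
toy_kmo = kmo(toy)
```

The measure approaches 1 when the partial correlations are small relative to the raw correlations, that is, when the items share common variance rather than correlating only in isolated pairs. The sketch does not handle the degenerate case in which all correlations are exactly zero.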

Statistical analyses

Factor analysis defines which items are closely linked and refer jointly to an underlying dimension (or factor). The items can thus be reduced to the smallest possible number of concepts that still explain the largest possible part of the variance [20]. A confirmative factor analysis was performed (principal component analysis with Varimax rotation) in order to investigate whether the factor structure of the American questionnaire can be used with Dutch data. The data were also studied with explorative factor analysis (principal component analysis with Varimax rotation), in order to check whether the items form different factors in the Dutch situation. When establishing the number of factors, the eigenvalue (eigenvalue > 1: Kaiser's criterion) was taken into account, alongside the extent of explained variance, the shape of the scree plot and the possibility of interpreting the factors. Kaiser's criterion is reliable in a sample of more than 250 respondents when the average communality is at least 0.6. The shape of the scree plot gives reliable information when the sample is larger than 200 respondents [20]. The data satisfy these conditions.
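Kaiser's criterion can be illustrated with a small sketch: in a principal component analysis the eigenvalues of the correlation matrix are computed, components with an eigenvalue above 1 are retained, and the explained variance is the sum of the retained eigenvalues divided by the number of items. The code below covers only this extraction step, using a textbook Jacobi eigenvalue routine written for this example; Varimax rotation and the scree plot are not shown, and none of this reproduces the SPSS procedure the study used.

```python
import math

def jacobi_eigenvalues(matrix, max_sweeps=50, tol=1e-12):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the off-diagonal element a[p][q].
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.sqrt(theta * theta + 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)

def kaiser_retained(corr):
    """Number of components with eigenvalue > 1 and their share of the variance."""
    eigenvalues = jacobi_eigenvalues(corr)
    retained = [ev for ev in eigenvalues if ev > 1.0]
    return len(retained), sum(retained) / len(corr)

# Toy example: four items forming two unrelated pairs (r = 0.5 within each pair).
toy = [
    [1.0, 0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
]
n_factors, explained = kaiser_retained(toy)
```

For this toy matrix the eigenvalues are 1.5, 1.5, 0.5 and 0.5, so two components are retained and together they account for three quarters of the variance.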

The internal consistency of the factors was calculated with Cronbach's alpha (α), a value between 0 and 1. If different items are supposed to measure the same concept, the internal consistency (reliability) should be greater than or equal to 0.6 [20]. Since the questionnaire contains positively as well as negatively worded items, the negatively formulated items were first recoded to make sure that a higher score always means a more positive response.
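A minimal sketch of this calculation, assuming the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of the sum score) and a five-point scale that is reverse coded as 6 minus the score. The function names and the toy responses are illustrative only.

```python
from statistics import variance

def reverse_code(scores, low=1, high=5):
    """Recode a negatively worded item so that a higher score is more positive."""
    return [low + high - s for s in scores]

def cronbach_alpha(items):
    """items: one list of scores per item, respondents in the same order."""
    k = len(items)
    totals = [sum(answers) for answers in zip(*items)]  # sum score per respondent
    item_variance = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))

# Toy example: three items, five respondents; the third item is negatively worded.
item_a = [1, 2, 3, 4, 5]
item_b = [1, 2, 3, 4, 5]
item_c = [5, 4, 3, 2, 1]
alpha = cronbach_alpha([item_a, item_b, reverse_code(item_c)])
```

After recoding, the three toy items agree perfectly, so alpha reaches its maximum of 1. Without the recoding step the negatively worded item would cancel out part of the sum score and pull alpha down sharply, which is why the recoding has to come first.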

The construct validity was studied by calculating scale scores for every factor (after any necessary reverse coding) and subsequently calculating Pearson correlation coefficients between the scale scores. The construct validity of each factor is reflected in scale scores that are moderately related. High correlations (r > 0.7), however, would indicate that factors measure the same concept; such factors may be combined and/or some items could be removed. In addition, correlations of the scale scores with the outcome variable Patient safety grade were calculated. No correlations were calculated with the other outcome variable, Number of events reported, because of the lack of variability and the skewed distribution of this item (40% of the respondents indicated that they had not reported any events during the past 12 months and 41% had reported only one or two events). All statistical analyses were performed using SPSS 12.0.
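The procedure can be sketched as follows, taking the scale score to be the mean of a respondent's item scores within a factor and using the r > 0.7 threshold mentioned above. The function names and the toy data are assumptions made for this example.

```python
import math

def scale_scores(item_rows):
    """Mean item score per respondent; item_rows has one row per respondent."""
    return [sum(row) / len(row) for row in item_rows]

def pearson(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Toy data: item scores of four respondents on two hypothetical factors.
factor_a_items = [[4, 5, 4], [2, 3, 2], [3, 3, 4], [5, 4, 5]]
factor_b_items = [[3, 4], [3, 2], [4, 4], [4, 5]]

scores_a = scale_scores(factor_a_items)
scores_b = scale_scores(factor_b_items)
r = pearson(scores_a, scores_b)
# A correlation above 0.7 would suggest that the two scales overlap too much.
overlapping = r > 0.7
```

The same `pearson` function can be applied to a scale score and a single-item outcome such as Patient safety grade, provided both are coded so that higher values point in the same direction.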


Confirmative factor analysis

The 12 dimensions that resulted from the factor analysis of the AHRQ [15] have already been mentioned above. Items that formed one factor in the AHRQ study were studied in 12 separate factor analyses, to see whether each group of items also loaded on one factor with the Dutch data. All analyses showed that the items within each factor indeed loaded on a single factor. At first sight, the Dutch items appear to form the same factors as the original questionnaire.

Additionally, the internal consistency has been calculated for every factor and has been compared with the internal consistency found in the American study (see Table 1, left side).

Table 1 Characteristics of the factors after confirmative and explorative factor analysis

For each factor, the internal consistency of the Dutch items was lower than that of the original items in the AHRQ study, except for Communication openness, for which it was the same. The internal consistency of three factors was poor or even unacceptable: Organisational learning – continuous improvement (α = 0.57), Staffing (α = 0.49) and Teamwork across hospital units (α = 0.59). This prompted an explorative factor analysis to investigate whether there is a factor structure that better fits the Dutch data.

Explorative factor analysis

Eleven factors were extracted by explorative factor analysis. The items of Organisational learning – continuous improvement and Feedback and communication about error from the American study combined into one factor instead of two separate factors. To find out whether this factor would nevertheless split into two factors, a confirmative factor analysis was done with only these six items. Again, the items formed one factor. Another analysis was executed in which the number of factors was fixed at 12, in accordance with the American solution (confirmative). Once more, the factor consisted of the same six items. Moreover, items from other factors shifted in such a way that factors could no longer be interpreted properly. The version with 11 factors was the best solution. Table 2 gives the mean scores with standard deviations and factor loadings per item. One item did not have a sufficient factor loading on any of the factors (all loadings < 0.40): 'Patient safety is never sacrificed to get more work done' (A15).

Table 2 Mean scores and factor loadings of the items regarding patient safety culture

The factors jointly explained 57.1% of the variance in the responses. The internal consistency was calculated for every factor. One item turned out to decrease the reliability of a factor: 'It is often unpleasant to work with staff from other hospital units' (F6). After this item had been removed from factor 6, Communication openness, the internal consistency increased from 0.65 to 0.72. The item has not been used in further analyses. The internal consistency of ten factors was acceptable (0.64 < α < 0.79), but that of factor 10, Adequate staffing, was doubtful (α = 0.58). Table 1 (right side) gives the number of items and the internal consistency per factor after the explorative factor analysis.

Construct validity

For each of the 11 factors, scale scores were calculated by obtaining the mean of the item scores within one factor for every respondent. Next, correlations between the scale scores were calculated. Table 3 shows the mean factor scores with standard deviations, and the correlations between the factors.

Table 3 Mean factor scores, correlation with patient safety grade and intercorrelations of the 11 dimensions

The highest correlations were those between Feedback about and learning from error and Supervisor/manager expectations and actions (r = 0.47) and between Feedback about and learning from error and Hospital management support (r = 0.47), but no correlation was exceptionally high.

Additionally, correlations of the scales with the mono-item outcome variable Patient safety grade have been calculated. The factors were expected to correlate positively with this outcome measure. All correlations with Patient safety grade were significant. The highest correlation of this outcome measure was with Overall perceptions of safety (r = 0.56).


In the 11-factor model, the reliability (internal consistency) of the factors and the construct validity are acceptable. Two items of the original questionnaire have been eliminated: 'Patient safety is never sacrificed to get more work done' (A15) and 'It is often unpleasant to work with staff from other hospital units' (F6). The internal consistency of one factor, Adequate staffing, remained questionable (α = 0.58). This was no reason to remove the factor from the questionnaire, because the three items match well and they concern an important organisational characteristic that influences patient safety.

The construct validity was satisfactory for all factors; the moderate correlations between the factors show that no two factors measure the same construct. As expected, all factors correlated with the outcome variable Patient safety grade. The high correlation of Patient safety grade with Overall perceptions of safety is an indication of the validity of the latter scale. The higher the overall safety perceptions are, the higher the rating for patient safety, and vice versa.

The factor structure with 11 factors deviates slightly from the structure with 12 factors proposed by the AHRQ. The main difference is that the factor Feedback about and learning from error consisted of two separate factors in the American study: Organisational learning – continuous improvement and Feedback and communication about error. It is not surprising that the items of both factors turned out to be linked. Error feedback and improvements induced by errors are closely related. Communication with the staff plays an important role in making improvements in patient safety. There is no obvious explanation for the fact that two factors nevertheless emerged in the American study. The English wording might have pointed more strongly to the difference between talking about errors and taking action based on errors, i.e. the distinction between words and actions. The two factors might have merged in the American data too if a factor analysis fixed at 11 factors had been carried out. The remaining differences in the factor structure only concern item shifts between factors. Table 4 sums up the differences.

Table 4 Differences in the names and composition of the factors in the American and the Dutch factor structure

There was a difference within the factor Overall perceptions of safety. Instead of the item 'Patient safety is never sacrificed to get more work done' (A15), the item 'We work in "crisis mode" trying to do too much, too quickly' (A14) became part of this factor. In the American study, item A14 belonged to Staffing. However, it fits very well with the other items that measure the overall patient safety perceptions. It is not clear why the factor loading of A15 was low in this factor, but the item might point to deliberate, actively unsafe behaviour, while the other items within the factor are more related to latent (system) problems. Because the item did not load sufficiently on any of the factors, it was removed from the questionnaire.

Furthermore, two items loaded on Teamwork across hospital units, while in the American study these were part of Hospital handoffs and transitions: 'Things "fall between the cracks" when transferring patients from one unit to another' (F3) and 'Problems often occur in the exchange of information across hospital units' (F7). In the Dutch study, these items loaded just slightly more on Teamwork across hospital units than on Adequate shift changes. The distinction between the two factors might be clearer in the Dutch situation: one factor concerns cooperation and handoffs between units, while the other concerns changing shifts within units.

Finally, there was initially a difference within the factor Communication openness. The item 'It is often unpleasant to work with staff from other hospital units' (F6) belonged to this factor, while in the American study it belonged to Teamwork across hospital units. Calculation of the internal consistency of Communication openness showed that the item reduced the factor's reliability; the item was therefore left out of further analyses and deleted from the questionnaire. As a result, the structure of Communication openness in the Dutch study still matches the structure of Communication openness in the American study.


The factor structures of the Dutch and American Hospital Survey on Patient Safety Culture are almost identical. The main part of the factor structure is the same and nearly all items are kept. There are only small shifts of items across factors. The Dutch factors show, undeniably, a lower internal consistency than in the American study. The factors' internal consistency has, however, become more acceptable by removing weak elements and by shifting items.

This study demonstrates that the Hospital Survey on Patient Safety Culture is an appropriate instrument to assess patient safety culture in Dutch hospitals. Before survey results can be compared between different countries, it is important to have insight into the validity and reliability of the HSOPS in these countries. The psychometric properties of the Dutch translation of the HSOPS are promising for other countries that want to translate and use the questionnaire and for cross-country comparisons of survey results in the future. Moreover, in another publication by the authors, multilevel analysis has shown that the questionnaire measures unit culture and not just individual attitudes [21].