Background

Evidence-based practice (EBP) is an effective strategy for making conscientious and judicious clinical decisions based on the best evidence available, together with the complex clinical circumstances, clinician expertise, and patient preferences [1]. The EBP approach has contributed to better clinical outcomes, reduced unnecessary healthcare costs, and improved patient satisfaction [2, 3]. Hence, the importance of EBP in the global healthcare system is well established [4, 5]. The International Council of Nurses has identified EBP as the gold standard for high-quality healthcare [5,6,7]. Because nurses are one of the largest groups of healthcare workers, there is an increasing demand for them to incorporate the best evidence into clinical decision making, which will help institutions to achieve valued and cost-efficient evidence-based healthcare [8]. However, studies indicate that both nursing students and practicing nurses are still not ready for, or competent in, EBP [9, 10]. This may be due to a lack of understanding of EBP and/or to not knowing how to apply research evidence to practical patient care [4, 10].

Education and training in EBP are reported as basic and essential approaches to enhance EBP [9, 11, 12]. The international Sicily Statement put forward an EBP curriculum framework to facilitate EBP education and training programs [13], and research training, EBP skills, and knowledge have been embedded in nursing education [12]. Melnyk et al. [8] have also published 13 EBP competencies for clinical nurses and a further 11 for advanced nurses. Currently, nursing faculties and administrators face the important challenge of exploring new and efficient instructional and training modes to build, strengthen, and encourage an EBP culture in academic and clinical practice in the nursing community [10, 14].

Psychometrically robust assessment tools are required to ascertain the efficiency of various teaching and training methods, as well as to monitor the progress of the participants in EBP [15,16,17]. Shaneyfelt et al. [18] analyzed 104 instruments from numerous EBP education programs across the health professions and reported that rigorously developed tools were in the minority, with only 10% having established validity in ≥3 areas. Furthermore, only a limited number of self-report questionnaires on EBP knowledge, attitudes, and behavior are available for use with nurses. Of those that exist, many exclude coverage of the behavior domain of EBP [19,20,21,22,23] or lack development rigor in terms of reliability and validity [24,25,26]. The insufficiency of psychometric properties in EBP questionnaires for nurses was also acknowledged by Leung et al. [27].

The Evidence-Based Practice Profile Questionnaire (EBP2Q), developed by McEvoy et al. in Australia [28], is one of the most comprehensive self-report questionnaires that assesses EBP knowledge, skills, attitudes, and behaviors across all five steps of EBP (Ask, Acquire, Appraise, Apply, Assess). Validated on 631 participants including students, academics, and practitioners from different healthcare professions [28], the EBP2Q has shown acceptable validity and reliability properties.

Panczyk et al. [29] adapted and validated the EBP2Q in 1362 nurses, nursing students, and midwives in Poland. Similar to the English version of the EBP2Q, the Polish version has shown high internal consistencies (Cronbach’s α = 0.80–0.97) and theoretical and criterion validity were confirmed [29]. The EBP2Q was also translated into Norwegian by Titlestad et al. [30] and validated in 149 nursing students, social education students, and health and social workers. The Norwegian EBP2Q showed high internal consistencies (Cronbach’s α ≥ 0.90), except for Sympathy (Cronbach’s α = 0.66), and demonstrated criterion and responsive validity [30].

This study was designed to translate, adapt, and validate the EBP2Q for use with clinical nurses in China to evaluate the self-perception of nurses on their EBP competencies, to promote the application of EBP in clinical nursing practice, and to help managers to identify where the improvement of EBP is needed.

Methods

A validation and reliability study was undertaken at Nanfang Hospital, Southern Medical University (Guangzhou, China). This study was conducted in two stages, with stage 1 comprising the translation and adaptation of the EBP2Q into the Chinese version, and stage 2 evaluating the psychometric properties of the Chinese version of the EBP2Q. All study processes were conducted according to the Declaration of Helsinki and ethical approval for the study was obtained retrospectively on September 24th, 2019 from the Medical Ethics Committee of Nanfang Hospital of Southern Medical University (NFEC-2019-171).

Instruments

Nurse general information questionnaire

Demographic characteristics and EBP-related data were collected using a questionnaire designed by the research team. The characteristics included sex, age, years of working, present position, highest education, EBP training, level of English and whether they had conducted a research study.

Evidence-based practice questionnaire (EBPQ)

The 24-item EBPQ was initially designed to evaluate EBP uptake and implementation by nurses [25]. It comprises three domains (Practice, Attitude, and Knowledge), which are scored on seven-point Likert scales (1–7), with higher scores indicating more favorable outcomes. The EBPQ, validated on 751 clinical nurses, has good internal consistency for the overall questionnaire (Cronbach’s α = 0.87) and for the domains of Practice, Attitude, and Knowledge (Cronbach’s α = 0.85, 0.79, and 0.91, respectively) [25]. The Chinese version of the EBPQ, developed as part of a Master’s thesis, was validated in 1621 clinical nursing staff with a Cronbach’s α of 0.94 for the overall questionnaire [31]. The Chinese EBPQ was also used in a multicenter cross-sectional study of 648 Chinese registered nurses [32].

EBP2Q

The EBP2Q incorporates 58 items in five domains: Relevance (14 items), relating to an individual’s perspectives on the importance of EBP; Sympathy (seven items), relating to the perceived compatibility of EBP with the practicality of use in day-to-day work; Terminology (17 items), relating to the understanding of research terms; Practice (nine items), relating to the application of EBP in clinical circumstances; and Confidence (11 items), relating to the individual’s perception of their EBP skills [28]. There are also 16 additional non-domain items related to EBP, which have not been classified into any of the known domains and were not included in the following analysis. Each item is rated on a five-point Likert scale: from 1 = “not at all true” to 5 = “very true” in Relevance; from 1 = “strongly disagree” to 5 = “strongly agree” in Sympathy; from 1 = “never heard the term” to 5 = “understand and could explain to others” in Terminology; from 1 = “never” to 5 = “daily” in Practice; and from 1 = “not at all confident” to 5 = “very confident” in Confidence. The seven items in Sympathy are negatively worded (e.g. ‘EBP does not take into account the limitations of my day-to-day work’) and need to be reverse scored. Higher scores indicate more competent respondents in that particular EBP domain [28].
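The scoring rule described above can be sketched as follows. This is only an illustration: it assumes the usual reverse-scoring convention for a 1–5 scale (reversed = 6 − raw) and assumes that a domain score is the sum of its item scores, which the text does not state. The function names and the example responses are hypothetical.

```python
def reverse_score(raw, scale_min=1, scale_max=5):
    """Reverse a Likert response so that 1 <-> 5, 2 <-> 4, and 3 stays 3."""
    if not scale_min <= raw <= scale_max:
        raise ValueError(f"response {raw} is outside {scale_min}-{scale_max}")
    return scale_min + scale_max - raw


def domain_score(responses, reverse=False):
    """Sum the item responses of one domain (summing is an assumed rule)."""
    if reverse:
        responses = [reverse_score(r) for r in responses]
    return sum(responses)


# Hypothetical raw responses to the seven negatively worded Sympathy items.
sympathy_raw = [2, 1, 3, 2, 4, 1, 2]
sympathy_total = domain_score(sympathy_raw, reverse=True)
```

After reversal, a respondent who disagrees with the negatively worded statements receives a high Sympathy score, in line with the rule that higher scores indicate greater competence.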

The EBP2Q has demonstrated good psychometric properties, including an excellent overall Cronbach’s α (0.96), domain intraclass correlation coefficients (ICCs) ranging between 0.77 and 0.94, and confirmed convergent validity with the EBPQ regarding the three comparable domains (with the following Pearson’s correlations: Confidence, 0.66; Practice, 0.80; and Sympathy, 0.54). As indicators of criterion validity, the domains Relevance and Terminology (no training vs. ≤20 h of training; no training vs. > 20 h of training) and Confidence (no training vs. ≤20 h of training) significantly distinguished participants by exposure to EBP training [28].

Methods in stage 1: translation and adaptation of the EBP2Q

Procedure

Translation

Permission for translation and adaptation into Chinese was granted by the authors of the English EBP2Q. In accordance with established guidelines [33], the following steps were implemented. Firstly, two bilingual postgraduate nursing students independently forward-translated the EBP2Q from English to Chinese. Secondly, a synthesized version was completed after careful discussion with a third bilingual postgraduate nursing student. Thirdly, the synthesized version was independently back-translated by two other bilingual translators with no knowledge of the original EBP2Q. These translators were native Chinese speakers, each of whom had studied in Britain for one year to obtain a Master’s degree in Nursing at the University of Salford, Manchester, United Kingdom. Fourthly, these two back-translators and a bilingual nursing expert reviewed and compared the two back-translated versions and agreed on an integrated back-translation. The research team (including professionals in the fields of nursing, statistics, and questionnaire development) subsequently verified and resolved discrepancies between the forward and backward translations, and incorporated feedback from the original authors to produce a pre-final Chinese version. This version was considered the semantic and conceptual equivalent of the English EBP2Q.

Expert committee review and content validation

A panel of six experts was formed to determine the need for adaptations and item removal to ensure compatibility in Chinese populations. The experts were educated to at least Master’s degree level and included clinical nursing experts, EBP experts, and nursing research educators with senior academic titles. In addition to academic qualifications, the experts grew up, were educated, and have worked in hospitals in different provinces, potentially representative of the health, social, and cultural systems of Mainland China (see Additional file 1). The panel evaluated the relevance, clarity, and the cultural and semantic equivalence of each item on three separate four-point Likert scales (relevance: 1-not relevant, 4-highly relevant; clarity: 1-very unclear- needs total revision, 4-highly clear-does not need revision; equivalence: 1-completely different, 4-completely equivalent) and made modification suggestions [34].

Pilot testing

Thirty registered nurses, employed at Nanfang Hospital, Southern Medical University (Guangzhou, China), volunteered to participate in the pilot test of the translated 58-item Chinese version of the EBP2Q, which was conducted in line with the recommended sample size of 10–40 participants [35]. The questionnaire completion time was recorded and participants were interviewed on their understanding of each item, and asked for advice on the wording of the items and on further ways to promote the usability of the pilot instrument.

Data analysis

Descriptive statistics were used to calculate the mean time required by participants to complete the questionnaire. The content validity index (CVI) was used to quantify the content validity of the scale. A scale-level CVI (S-CVI) > 0.90 and an item-level CVI (I-CVI) ≥0.83 were considered appropriate [34].
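As an illustration of how the CVI can be obtained from expert ratings, the sketch below assumes the common definition in which the I-CVI is the proportion of experts rating an item 3 or 4 on the four-point scale, and the S-CVI is the average of the I-CVIs (the S-CVI/Ave variant). The text does not state which S-CVI variant was used, and the ratings shown are hypothetical.

```python
def item_cvi(ratings):
    """Proportion of experts who rated the item 3 or 4 on the four-point scale."""
    return sum(1 for r in ratings if r >= 3) / len(ratings)


def scale_cvi_ave(rating_matrix):
    """Average of the item-level CVIs (S-CVI/Ave; an assumed variant)."""
    item_values = [item_cvi(item) for item in rating_matrix]
    return sum(item_values) / len(item_values)


# Hypothetical relevance ratings: one row per item, one column per expert.
ratings = [
    [4, 4, 3, 4, 4, 4],  # all six experts rate 3 or 4
    [4, 3, 4, 2, 4, 3],  # five of six experts rate 3 or 4
    [3, 4, 4, 4, 3, 4],
]
i_cvis = [round(item_cvi(item), 2) for item in ratings]
s_cvi = round(scale_cvi_ave(ratings), 2)
```

With a six-member panel, a single rating of 1 or 2 yields an I-CVI of 5/6 ≈ 0.83, so 0.83 is the lowest value an item can take while still having agreement from all but one expert.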

Methods in stage 2: testing of the psychometric properties of the EBP2Q

Procedure

Participants

Non-probabilistic convenience sampling was used. Prior to recruitment, authorization was obtained from the head nurses in each of the clinical departments. Eligible participants were registered nurses employed at Nanfang Hospital, Southern Medical University. The head nurses informed the nurses verbally about the survey in advance, during nursing shifts or at a departmental meeting. The names of the nurses who were willing to participate in the research were collected by the head nurses and forwarded to the research team. Using scripted instructions (including a standardized greeting, self-introduction, and survey procedure), the research team subsequently contacted the volunteers before or after shift changes in a department meeting room or lounge. The nurses were provided with an information sheet explaining the purpose of the survey and instructions on how to complete the questionnaire. The volunteer nurses were assured that the survey was anonymous and confidential, and that they could withdraw at any time without consequences. All nurses who agreed to participate signed an informed consent form. Subsequently, the research team distributed the questionnaires to the participating nurses in a department meeting room or lounge. The questionnaires took approximately 20 min to complete. Once completed, the questionnaire was handed to the investigators, who were waiting outside. If the participating nurses could not complete the questionnaire immediately, they could complete it at home and return it to the head nurses within the following 3 days. The research team collected the remaining questionnaires 10 days later.

Construct validity-structural validity

A cross-sectional validation study using two independent convenience samples, with a total of 580 clinical nurses, was conducted in two periods (exploratory factor analysis [EFA] and confirmatory factor analysis [CFA]) between September 2018 and April 2019. The CFA was necessary because the EFA identified a factor structure different from that of the original English EBP2Q. The recommended sample size for the EFA is 5–10 times the number of items in the questionnaire [36]. Therefore, with 58 items, a minimum sample of 290 participants was required for testing, and 330 participants were recruited between September and November 2018. For the CFA, 250 participants (exceeding the minimum acceptable number of 200 for a CFA [37]) were recruited between February and April 2019.
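The sample-size reasoning above is simple arithmetic and can be checked as follows. The function name is hypothetical; the ratios (5–10 participants per item) and the CFA minimum of 200 are taken from the text.

```python
def efa_sample_range(n_items, low_ratio=5, high_ratio=10):
    """Return the (minimum, maximum) recommended EFA sample size."""
    return n_items * low_ratio, n_items * high_ratio


N_ITEMS = 58          # domain items in the EBP2Q
CFA_MINIMUM = 200     # minimum acceptable CFA sample stated in the text

efa_min, efa_max = efa_sample_range(N_ITEMS)
efa_recruited, cfa_recruited = 330, 250

efa_ok = efa_recruited >= efa_min       # 330 recruited vs. 290 required
cfa_ok = cfa_recruited >= CFA_MINIMUM   # 250 recruited vs. 200 required
total_recruited = efa_recruited + cfa_recruited
```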

Prior to the EFA, item analysis was performed on the data from the same 330 clinical nurses to evaluate each item, and items failing the criteria were to be removed.

Construct validity-convergent validity

A convenience sample of 122 participants also completed the Chinese EBPQ [31] during the second period (CFA) to assess the convergent validity of the revised Chinese EBP2Q.

Criterion validity

The Chinese EBP2Q was tested for the ability to separate groups by subscale scores based on the highest level of education, present position, EBP training, experience in conducting research, and level of English.

Test-retest reliability

Twenty-two nurses who completed the Chinese EBP2Q for the CFA voluntarily completed the questionnaire again 2 weeks later to evaluate the test-retest reliability.

Data analysis

Item analysis of the first and revised second versions of the Chinese EBP2Q

Item analysis of the Chinese EBP2Q for the EFA, and of the revised Chinese EBP2Q after the CFA, tested the following three criteria: (1) item discriminability: the critical ratio (CR) was expected to be > 3.0 [38]; (2) item homogeneity: the item-total correlation was expected to be > 0.3 [39]; (3) Cronbach’s α after deleting the item was expected to be smaller than the α of the corresponding subscale [40].
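Criteria (2) and (3) can be illustrated with a short sketch. It assumes the corrected item-total correlation (each item against the sum of the remaining items), whereas the text says only "item-total correlation", and it uses the standard formula for Cronbach’s α. The CR criterion is not covered, the function names are hypothetical, and the data are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def item_diagnostics(rows):
    """Corrected item-total correlation and alpha-if-deleted for each item."""
    results = []
    for i in range(len(rows[0])):
        item = [row[i] for row in rows]
        rest = [[v for j, v in enumerate(row) if j != i] for row in rows]
        results.append({
            "item": i,
            "item_total_r": pearson(item, [sum(r) for r in rest]),
            "alpha_if_deleted": cronbach_alpha(rest),
        })
    return results


# Hypothetical data: six respondents, four items; the last item is unrelated.
rows = [
    [1, 2, 1, 4],
    [2, 2, 3, 2],
    [3, 3, 3, 5],
    [4, 4, 3, 1],
    [5, 4, 5, 3],
    [4, 5, 4, 2],
]
alpha = cronbach_alpha(rows)
diagnostics = item_diagnostics(rows)
flagged = [d["item"] for d in diagnostics
           if d["item_total_r"] <= 0.3 or d["alpha_if_deleted"] > alpha]
```

In this toy data set only the last item is flagged: it correlates negatively with the remaining items, and removing it raises α.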

Construct validity-structural validity

The EFA and CFA were applied to establish the structural validity of the Chinese version of the EBP2Q. As recommended by Hair et al. [41], principal component analysis with varimax rotation was employed in the EFA to determine the structure of the questionnaire. Factor extraction and item retention followed these rules: (1) eigenvalue > 1; (2) factor loading > 0.50; (3) no cross-loading items with factor loading ≥0.40 [41]; (4) ≥3 items belonging to each factor [42]. The CFA with maximum likelihood estimation was performed following the EFA. Several model fit indices were checked: (1) the Chi-squared statistic divided by the degrees of freedom (χ2/df) was expected to be smaller than 3.0; (2) the standardized root mean square residual (SRMR) was expected to be smaller than 0.08; (3) the comparative fit index (CFI) was expected to be larger than 0.90 [43]; and (4) root mean squared error of approximation (RMSEA) values < 0.01, < 0.05 and < 0.08 were taken to indicate excellent, good, and mediocre fit, respectively [44].

Construct validity-convergent validity

Sixteen comparable items of the Chinese version of the EBPQ [31], which involved three domains (i.e., Sympathy, Confidence, and Practice) were matched to the corresponding items of the revised Chinese EBP2Q after the CFA to evaluate convergent validity. Considering the non-normality of the data, the Spearman’s correlation coefficient was calculated to compare the participants’ scores on the two questionnaires, on both domains and items.

Criterion validity

To assess criterion validity, t-tests or one-way analyses of variance, as appropriate, were performed for the five criteria described above, based on the mean scores of each subgroup on the domains. Post-hoc analyses were conducted using the Bonferroni-adjusted significance test, controlling for Type I error, to identify differences between sample means.
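The Bonferroni adjustment divides the family-wise significance level by the number of comparisons. The sketch below assumes that all pairwise comparisons between subgroups are tested, which the text does not specify; the function name is hypothetical.

```python
from math import comb


def bonferroni_alpha(n_groups, family_alpha=0.05):
    """Return (number of pairwise comparisons, per-comparison alpha)."""
    n_comparisons = comb(n_groups, 2)
    return n_comparisons, family_alpha / n_comparisons


# Example: a criterion with three subgroups, such as three levels of EBP training.
n_comparisons, per_test_alpha = bonferroni_alpha(3)
```

For three subgroups there are three pairwise comparisons, so each would be tested at 0.05 / 3 ≈ 0.0167 under this assumption.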

Internal consistency

Analysis of internal consistency was applied to the entire scale and the domains of the revised Chinese EBP2Q after the CFA. Cronbach’s α ≥0.70 was deemed acceptable [45]. The composite reliability estimated in the CFA was also expected to be ≥0.70 [46].
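For reference, composite reliability is commonly computed from the standardized factor loadings of a domain as (Σλ)² / ((Σλ)² + Σ(1 − λ²)). The sketch below assumes this standard formula; the text does not give the exact expression used, and the loadings are hypothetical.

```python
def composite_reliability(loadings):
    """Composite reliability from standardized factor loadings."""
    sum_loadings_sq = sum(loadings) ** 2
    error_variance = sum(1 - l ** 2 for l in loadings)
    return sum_loadings_sq / (sum_loadings_sq + error_variance)


# Hypothetical standardized loadings for a three-item domain.
loadings = [0.7, 0.8, 0.9]
cr = composite_reliability(loadings)
meets_threshold = cr >= 0.70  # threshold stated in the text
```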

Test-retest reliability

For test-retest reliability, the ICCs of the items and the domains of the revised Chinese EBP2Q after the CFA were calculated over a 2-week interval. ICC values ≥0.75 and 0.40–0.74 denoted excellent and adequate reliability, respectively [47].

Descriptive statistics were calculated for participant characteristics, items and scales. The ceiling and floor effects of the revised Chinese EBP2Q after the CFA were examined in terms of the lowest and highest item means and standard deviations (SDs).

All statistical analyses were performed using IBM SPSS version 20.0 and IBM AMOS version 22.0. P-values < 0.05 were considered statistically significant.

Results

Results of stage 1: translation and adaptation of the EBP2Q

Translation

In the forward-backward translation process from the original EBP2Q, most of the changes were related to terms and phrases that were meaningful in the context of Chinese culture. For example, in item 6, the term ‘develop’ was replaced by ‘learn’; the terms ‘accessing’, ‘acquiring’, and ‘appraising’ were replaced by ‘searching’, ‘retrieving’ and ‘evaluating’, respectively; and the term ‘area’ was replaced by ‘field’. In item 18, the phrase ‘In making decisions about my professional work … ’ was replaced by ‘When making professional decisions … ’. Other changes from the original to the Chinese version were ‘Workplace experience … ’ to ‘Experience from work practices or colleagues … ’(original item 19), ‘real world’ to ‘actual work’ (original item 21), ‘in your workplace’ to ‘at work’ (original item 46), ‘research reports’ to ‘research literature’ (item 45), and where used throughout the questionnaire, ‘client’ was changed to ‘patient’. There were also changes to some labels on the Likert scale: ‘true’ became ‘correct’; ‘neutral’ became ‘not so sure’; and ‘one month’, ‘once a fortnight’ and ‘once a week’ became ‘seldom’, ‘sometimes’, and ‘frequently’, respectively.

Content validation by the expert committee

The expert scores for content validity on the S-CVI for relevance, clarity, and equivalence were 0.99, 0.98, and 0.98, respectively. The I-CVI of all the items for relevance, clarity, and equivalence ranged between 0.83 and 1.00 (see Additional file 2).

Pilot testing

All participants completed the pre-test instrument in a mean time of 12.27 min (SD: 2.46 min; range: 8–19 min). Based on feedback, the acronym “EBP” was replaced with the full term “Evidence-based practice” to ensure the easy understanding and appropriateness of the questionnaire in practice.

Results of stage 2: testing of the psychometric properties of the EBP2Q

Participants

In the first period (EFA), 303 of the 330 questionnaires distributed were included in the data analysis. Of the 27 excluded questionnaires, nine were missing and 18 were incomplete (> 25% of items not completed). This represented a valid response rate of 91.8%. For the CFA, 240 of the 250 questionnaires distributed were included (three questionnaires were missing and seven were incomplete), representing a valid response rate of 96.0%. “Hot Deck” imputation was used for missing data in questionnaires with ≥75% of items completed [48].
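The reported rates can be reproduced from the counts given above. The function name is hypothetical; all numbers are taken from the text.

```python
def included_rate(distributed, missing, incomplete):
    """Return (questionnaires included, percentage of those distributed)."""
    included = distributed - missing - incomplete
    return included, round(100 * included / distributed, 1)


efa_included, efa_rate = included_rate(330, missing=9, incomplete=18)
cfa_included, cfa_rate = included_rate(250, missing=3, incomplete=7)
total_sample = efa_included + cfa_included
```

The two periods together give the total analyzed sample of 543 nurses.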

Table 1 presents the sociodemographic and EBP-related characteristics of the total sample of 543 nurses. The mean age of the nurses was 30.8 years (range: 22–60 years), and 91.0% were female. Most of the nurses (87.5%) held a Bachelor’s degree as their highest education qualification. Just over half (52.1%) practiced in the surgery department. Most (71.5%) had junior professional titles. Approximately one-third (35.7%) had been working for > 5–10 years. The majority were staff nurses (91.2%), while the remainder (8.8%) were nurse administrators. Although nearly half of the nurses had no EBP training experience (44.8%), 10.1% had undertaken > 20 h of EBP training. Also, 69.8% of the nurses had some basic knowledge of English. The majority had never conducted a research study (91.0%).

Table 1 Demographic Characteristics of nursing staff (n = 543)

Item analysis of the first version of the Chinese EBP2Q for the EFA

The CR values of the items were all significant (p < 0.01). Only item 15 (EBP does not take into account the limitations of my daily work) showed a CR value of < 3.0. The item-total correlations were all > 0.30, with the exception of item 15 (EBP does not take into account the limitations of my daily work: r = 0.17) and item 19 (Experience from work practices or colleagues is the most reliable and effective way to solve problems: r = 0.26). Deleting any single item did not raise Cronbach’s α above the α of the corresponding subscale, except for items 19 (Experience from work practices or colleagues is the most reliable and effective way to solve problems), 49 (Computer skills), and 50 (Ability to identify your knowledge gaps). Based on this analysis, items 15, 19, 49, and 50 were removed, leaving 54 items for evaluation in the EFA (see Additional file 3).

Structural validity-EFA

The remaining 54 items were analyzed using the EFA. Prior to the EFA, two factor analysis criteria were assessed: the Kaiser–Meyer–Olkin measure was 0.932 and Bartlett’s test of sphericity was significant (p < 0.001), which indicated suitability for the EFA [49]. The first EFA attempt extracted 10 factors, explaining 72.70% of the total variance. However, based on the item retention rule that items should not cross-load on two different factors with loadings ≥0.4, the following eight items were removed: 24 (systematic review), 47 (formally share and discuss literature/research findings with others in your department/practice, e.g. journal clubs, achievement report speech/experience sharing meeting), 25 (odds ratio), 36 (dichotomous outcomes), 34 (clinical importance), 35 (randomized controlled trial), 32 (statistical significance), and 30 (forest plot). Following this, a re-run of the model showed that two items (22, relative risk and 23, absolute risk) formed a separate factor on their own, violating the rule of three items per factor. Subsequently, item 23 (absolute risk), with a relatively high factor loading, was removed first. After re-analysis, an eight-factor solution was identified, with an explained variance of 71.0% from a total of 45 items (see Additional file 4). Moreover, all 45 items had factor loadings > 0.50, ranging between 0.55 and 0.83. The domains were renamed (maintaining initial item numbers) according to the common characteristics of their items, to summarize the concepts in each of the eight domains in the revised structure; for example, Basic Understanding describes an individual’s fundamental conception of EBP, Intention refers to the individual’s determination to strengthen EBP competencies, and Attitude means an individual’s emphasis on or valuing of EBP in practical work (see Table 2).

Table 2 Structure of the 45-item Chinese Evidence-Based Practice Profile Questionnaire

Structural validity-CFA

Considering the five-factor structure in the original EBP2Q, we tested a third-order factor model in the CFA to verify the prior EFA solution. The results revealed largely acceptable goodness-of-fit indices: χ2/df = 2.001; RMSEA = 0.065; SRMR = 0.077; and CFI = 0.884. The estimated parameters of all the items and factors were statistically significant (p < 0.001) and, with one exception, all standardized factor loadings were > 0.50 (ranging between 0.54 and 0.95): Basic Understanding, 0.73–0.91; Intention, 0.89–0.95; Attitude, 0.63–0.83; Sympathy, 0.57–0.75; EBP-related Terminology, 0.68–0.85; Clinical-Related Terminology, 0.54–0.79; Practice, 0.67–0.86; Confidence, 0.73–0.87; Relevance, 0.65–0.91; Terminology, 0.74–0.95; General EBP Competency, 0.42–0.91. The one exception was the loading of the first-order factor Sympathy on the third-order factor General EBP Competency (0.42, p < 0.001) (Fig. 1).
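Applying the cut-offs listed in the Methods to the fit indices reported here can be sketched as follows. The function names are hypothetical, and the label "outside listed bands" for RMSEA values ≥ 0.08 is an assumption, since the Methods name only three bands.

```python
def classify_rmsea(rmsea):
    """Map an RMSEA value onto the bands given in the Methods."""
    if rmsea < 0.01:
        return "excellent"
    if rmsea < 0.05:
        return "good"
    if rmsea < 0.08:
        return "mediocre"
    return "outside listed bands"  # assumed label, not named in the text


def check_fit(chi2_df, srmr, cfi, rmsea):
    """Compare each index with its cut-off from the Methods."""
    return {
        "chi2/df < 3.0": chi2_df < 3.0,
        "SRMR < 0.08": srmr < 0.08,
        "CFI > 0.90": cfi > 0.90,
        "RMSEA band": classify_rmsea(rmsea),
    }


# Values reported for the third-order CFA model.
fit = check_fit(chi2_df=2.001, srmr=0.077, cfi=0.884, rmsea=0.065)
```

Under these cut-offs, χ2/df and SRMR meet their criteria, the RMSEA falls in the "mediocre" band, and the CFI is the one index below its threshold.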

Fig. 1 Confirmatory factor analysis of the 45-item Chinese Evidence-Based Practice Profile questionnaire (n = 240)

Convergent validity

For convergent validity, the correlation coefficients between the comparable individual items in the two questionnaires ranged between 0.19 and 0.52 for Practice, 0.27 and 0.27 for Sympathy, and 0.42 and 0.65 for Confidence. While statistically significant for all, the correlation of each of the three comparable summed factor scores was moderate for Practice (r = 0.58, p < 0.001) and Confidence (r = 0.68, p < 0.001) and low for Sympathy (r = 0.32, p = 0.001).

Criterion validity

All mean factor scores of the nurses on the eight domains were significantly different in the subgroups regarding the educational background, EBP training, research study experience, and level of English (p < 0.05). In terms of the nurses’ present position, the mean factor scores of the nurse administrators were significantly different from those of the staff nurses in the Basic understanding, Intention, Attitude, EBP-related terms, and Practice domains (p < 0.05). In the post-hoc analysis, nurses who had undertaken > 20 h EBP training and had a Master’s degree and College English Test-Band 6/International English Language Test System/Test of English as A Foreign Language (CET-6/IELTS/TOEFL) qualifications scored significantly higher than all the other nurses in each of the eight domains (p < 0.05) (Table 3).

Table 3 Differences in the Revised Chinese Evidence-Based Practice Profile Questionnaire by nurses’ characteristics (n = 543)

Internal consistency and test-retest reliability

The overall Cronbach’s α for the scale was 0.96. As listed in Table 4, the α for the internal consistency of the eight domains ranged between 0.85 and 0.95, with the composite reliability from the CFA ranging between 0.82 and 0.95, indicating good internal consistency. The ICCs ranged between 0.50 and 0.91 for the items and between 0.75 and 0.96 for the domains, indicating sufficient stability over time.

Table 4 Internal consistency and Test-retest reliability of the Revised Chinese Evidence-Based Practice Profile Questionnaire

Item analysis and the ceiling-floor effect of the revised 45-item Chinese EBP2Q

In the item analysis of the revised 45 items, all items met the requirement that the α after item deletion be lower than the α of the corresponding domain, with the exception of items 26 (meta-analysis) and 44 (consider patients’ preferences when making clinical/professional decisions). All results from the item-total correlation analysis were significant (p < 0.01), and the coefficients ranged between 0.32 and 0.74. Descriptive statistics showed that the lowest and highest item means were 1.86 (SD: 0.91) and 4.13 (SD: 0.72), respectively, indicating the absence of ceiling and floor effects (see Additional file 5).

Discussion

The EBP2Q was initially designed to evaluate the EBP profile across a range of professions and different levels of experience in Australian populations, and has demonstrated strong psychometric properties [28]. Following established guidelines [33], the EBP2Q was translated and cross-culturally adapted into Chinese and validated using a sample of 543 clinical nurses. Our psychometric tests highlighted the capability of the Chinese version of the EBP2Q (EBP2Q-C) to assess EBP knowledge, attitudes, skills, and behavior in domestic nursing practice, providing evidence of valid measurement properties of the instrument.

Concerning content validity, the S-CVI and I-CVI for each of the individual items reached acceptable values, with all S-CVI values > 0.90 for relevance, clarity, and equivalence, and all I-CVI values ≥0.83. These findings suggest that the items conform well to the conceptual framework.

Based on the first item analysis, four items were deleted because they did not fulfil the threshold standards, leaving 54 items subsequently analyzed in the EFA. The EFA resulted in the removal of nine items based on a priori item retention rules. This led to a change in the questionnaire from a five- to an eight-factor structure. The differences in the two structures may have resulted from the late introduction of an EBP culture in nursing in China and the less well-developed understanding of EBP by clinical nurses [5]. The third-order factor model tested in the CFA to verify the revised 45-item structure demonstrated a reasonably good model fit. An exception was the CFI (0.884), which was slightly lower than the recommended criterion of 0.90, but still approached the value for an acceptable fit. One possible reason for the lower factor loading of “Sympathy” on “General EBP Competency” (0.42) is that, under the complex situation in China, the daily workload of clinical nurses is so heavy that they do not have enough time to carry out EBP projects, so that “Sympathy” cannot effectively reflect their “General EBP Competency”. In general, based on the EFA and CFA, items were distributed according to, and positively correlated with, the corresponding domains.

The 58 domain items from the original EBP2Q were reduced by 13 items to 45 items for the EBP2Q-C. In the item analysis stage, four items were removed because the analysis of the tested sample showed that they may be measuring a different concept or could be misleading. A further nine items were removed in the EFA stage because they did not adequately correlate with the eight subscales identified. Nevertheless, the reduction in the number of items enables a more economical measurement. Eight of the 17 items in the domain of Terminology in the original EBP2Q were removed. It may be that these eight terminology items are not meaningful to the Chinese population. Nine of the original 11 items in the EBP2Q Confidence domain were included in the EBP2Q-C, with the exclusion of ‘computer skills’ (item 49) and ‘ability to identify gaps in knowledge’ (item 50). It may be that item 50 is not considered separate from item 51, where the identified gap or information need is transferred into a clearly answerable question. Importantly, the formulation of a clearly answerable question, which is an important concept in EBP, remains in the new version. Only one item (formal sharing of research findings) of the nine in the Practice domain of the original EBP2Q was not included in the same domain of the Chinese version. The included items appear to adequately cover the application of EBP in clinical circumstances. The seven items for the domain of Sympathy in the original EBP2Q were reduced to five items in the EBP2Q-C. The two removed original items were item 15 (regarding whether EBP takes into account the limitations of daily work) and item 19 (rating the role of experience from work practice and colleagues in decision-making). It is possible that in the Chinese context, the subtle differences between items 19 and 18 (which rate the value of clinical experience in professional decisions compared with research experience) are not recognized.
Overall, it appears that the domains in the new questionnaire adequately cover those under investigation. However, caution should be exercised in comparing study results where the two different questionnaires (Chinese and original) are used, due to changes based on differences in Chinese culture, workplace relationships, and linguistic nuances.

The findings for the convergent validity of the EBP2Q-C with the Chinese EBPQ as criterion (Practice: 0.58; Sympathy: 0.32; Confidence: 0.68) suggested adequate convergent validity except for Sympathy. These correlations were weaker than those reported for the original EBP2Q with the EBPQ as criterion (Practice: 0.66; Sympathy: 0.54; Confidence: 0.80) [28]. The questionnaire revisions for the EBP2Q-C in terms of language translation, cultural interpretation, and item reduction, as discussed above for all domains and specifically for each of the three domains compared with the EBPQ, may have contributed to these weaker correlations.
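As an illustration of how a convergent validity coefficient of this kind can be obtained, the following minimal sketch computes a Pearson correlation between domain scores from two questionnaires. The choice of Pearson's r, the function name, and the example scores are assumptions for illustration only and are not taken from the study data.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of domain scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products and sums of squares around the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical Practice-domain scores of five nurses on two instruments
new_instrument = [2.1, 3.4, 2.8, 4.0, 3.1]
criterion = [2.5, 3.0, 3.2, 4.2, 2.9]
r = pearson_r(new_instrument, criterion)
```

A higher positive r between matching domains of the two instruments would be read as stronger convergent validity.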

Regarding criterion validity, all comparisons across the five key characteristics showed statistically significant differences. Nurses with higher education, more extensive EBP training, experience in conducting research studies, and a higher level of English proficiency scored significantly higher on each of the eight individual domains. These findings were consistent with the verified associations of these sociodemographic variables with EBP domains reported in previous studies [9, 50, 51]. However, whereas EBP training was statistically significant for all eight domains in the EBP2Q-C, the results for the Polish [29] and Norwegian [30] versions only demonstrated significance in the Relevance, Terminology, and Confidence or Sympathy domains. This may be due to differences in how EBP training time was categorized, with options limited to yes/no in the Norwegian version, and to none and 12 h in the Polish version. In the current study, there were three levels with greater differences in exposure (none, ≤20 h, and > 20 h). This may indicate that the duration of exposure to EBP training is of great importance to the effectiveness of the training program at a self-reporting level. As presented in Table 3, nurses in the role of administrators had significantly higher mean values in five of the domains, with significance not reached for the domains of Sympathy, Clinical-related terms, and Confidence. Previous research has also demonstrated that nurses who hold higher-level positions reported better values in the EBP domains [50]. Results from the post-hoc analysis also confirmed the significant influence of a Master’s degree, EBP training for > 20 h, and CET-6/IELTS/TOEFL qualifications on the EBP competencies of nurses. The results of the present study thus showed significant differences between groups defined by these key characteristics in the different EBP domains.
Hence, these data support the validity of the EBP2Q-C for assessing self-reported EBP in Chinese clinical nurses with different levels of training.

For internal consistency, the composite reliability from the CFA was employed as a supplementary analysis, because it has been previously reported that the Cronbach’s α coefficient can underestimate or overestimate reliability [52]. For the newly proposed domains in the EFA, both estimators exceeded the recommended standard, similar to that noted for the English EBP2Q [28], supporting the internal consistency. The temporal stability of the EBP2Q-C was confirmed by the test-retest reliability, with a 2-week interval separating the completion of the two questionnaires. The ICC for the items exceeded 0.40 and reached satisfactory values (≥0.75) for all domains, with similar values to those obtained for the English EBP2Q [28].
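The two internal consistency estimators mentioned above can be sketched as follows. This is a minimal illustration using the standard textbook formulas for Cronbach’s α and for composite reliability from standardized factor loadings; the function names and example values are hypothetical and do not reproduce the study’s data or software.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha; rows is a list of respondents, each a list of item scores."""
    k = len(rows[0])  # number of items
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])  # variance of the sum score
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def composite_reliability(loadings):
    """Composite reliability from standardized loadings of one factor.

    Assumes uncorrelated errors, so each error variance is 1 - loading**2.
    """
    sum_sq = sum(loadings) ** 2
    error = sum(1 - l * l for l in loadings)
    return sum_sq / (sum_sq + error)

# Hypothetical responses of four nurses to a three-item domain (1-5 scale)
responses = [[4, 5, 4], [2, 2, 3], [3, 3, 3], [5, 4, 5]]
alpha = cronbach_alpha(responses)
cr = composite_reliability([0.7, 0.7, 0.7])
```

Reporting both values side by side is one way to check whether α and composite reliability lead to the same conclusion for a given domain.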

In the item analysis of the 45-item EBP2Q-C (see Additional file 5), all item-to-total correlations were statistically significant and indicated a strong association with the total scale. These findings demonstrate the homogeneity of the revised 45-item Chinese questionnaire. Although the changes in α when items were sequentially excluded from the analysis suggested the removal of items 26 (meta-analysis) and 44 (consider patients’ preferences when making clinical/professional decisions), these two items were retained to ensure the comprehensiveness of the instrument. As stated in the definition of EBP [14], health care professionals should instinctively take the patients’ values, expectations, and preferences into consideration to ensure evidence-based health care. Therefore, retaining item 44 may assist in assessing the EBP behavior of participants.

Descriptive statistics of the EBP2Q-C showed that nurses scored highest in the Attitude domain and lowest in the Confidence domain. The findings suggested that, while nurses held a positive attitude towards EBP, they still lacked the necessary EBP competencies and confidence to incorporate research evidence into professional practice. This was similar to findings reported in recent studies involving nurses in Turkey and the USA [11, 53]. These findings suggest that there is ‘a long way to go’ for domestic EBP educators and training mentors in tailoring efficient instructional modes for clinical nurses. The floor and ceiling effects of each item in the EBP2Q-C were explored using the lowest and highest item means (1.86 [SD: 0.91] and 4.13 [SD: 0.72], respectively), which indicated the absence of ceiling or floor effects (see Additional file 5). McEvoy et al. reported similar results for the EBP2Q: 1.71 (SD: 1.0) and 4.09 (SD: 0.9), respectively [28].
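For readers who wish to examine floor and ceiling effects at the respondent level, a complementary check is to compute the proportion of responses at the lowest and highest scale points. The sketch below is not the procedure used in this study, which relied on item means; the 15% threshold is a commonly used convention assumed here for illustration, and the 1 to 5 response range and the function name are likewise assumptions.

```python
def floor_ceiling(scores, lowest=1, highest=5, threshold=0.15):
    """Proportion of responses at the scale extremes for a single item.

    threshold is an assumed cut-off; an effect is flagged when the
    proportion at an extreme exceeds it.
    """
    n = len(scores)
    floor_prop = sum(s == lowest for s in scores) / n
    ceiling_prop = sum(s == highest for s in scores) / n
    return {
        "floor": floor_prop,
        "ceiling": ceiling_prop,
        "floor_effect": floor_prop > threshold,
        "ceiling_effect": ceiling_prop > threshold,
    }

# Hypothetical responses of ten nurses to one item
result = floor_ceiling([1, 2, 3, 3, 4, 4, 4, 5, 3, 2])
```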

Limitations and considerations for further research

Despite the satisfactory findings, the current study had a number of limitations. Firstly, while the sample size was sufficient for the measurement of the tested properties, all nurses were recruited from a single tertiary hospital through convenience sampling, which may restrict the broader application of the study findings. Further research should involve hospitals of different levels and use stratified random sampling to expand the generalizability of the results. Secondly, all data were self-reported, which may result in the overestimation or underestimation of the actual competence of the respondents, thus leading to reporting bias. Thirdly, the responsiveness of the EBP2Q-C in EBP educational or training programs is unknown. This may be valuable to consider in future validation research.

Finally, the EBP2Q-C was validated only with nurses, in contrast to the original EBP2Q, where a range of professions were included in the development and validity testing. Further research may extend validation to other professionals, such as clinical doctors.

Conclusion

This study provides preliminary evidence for the EBP2Q-C as a psychometrically robust tool for the evaluation of EBP in nurses in China. Although consistent in terms of conceptualization, the factor structure of the EBP2Q-C differed from that of the English version, which necessitated further validation of the instrument. The final, rigorously developed 45-item EBP2Q-C has an eight-factor structure and demonstrated acceptable structural, convergent, and criterion validity, test-retest reliability, and internal consistency. The EBP2Q-C may be used in EBP education or training programs to improve the skills of participants, either as a self-assessment or as an outcome measure of learning. It may also be used by clinical EBP educators in the design of EBP courses, to develop efficient evaluations of education or training programs.