Background

As health issues have become increasingly diverse and complex, there has been a corresponding increase in administration systems and government measures. The same applies to the activities of public health nurses (PHNs), who are responsible for providing health services to the public. It is therefore critical to implement evidence-based healthcare (EBHC) so that limited financial and human resources are used effectively and efficiently to resolve health issues [1]. However, actual program implementation is marred by an evidence–practice gap [2]. One way to address this challenge is dissemination and implementation (D&I) science [3], which has developed since the 1990s mainly in the USA. In Japan, insights gained from D&I could be used to promote evidence-based program implementation (EBPI).

PHNs are the primary drivers of health programs in Japan. In order to maintain and promote the health of individuals and the whole community and to prevent diseases, public health nurses develop health activities based on community diagnosis and according to community characteristics [4]. Under the Japanese PHN qualification system, which was launched in 1941, 73.9% of PHNs (39,117 or 30.9 PHNs per 100,000 persons) are affiliated with government agencies [5]. The direction of their activities is mainly defined as the “implementation of a PDCA cycle” under the National Guidance on PHN Activities [4], thus requiring EBPI. This direction is described as “capacity of program and policy development required to find solutions to local health issues” under the National Guidelines on PHN Training [6], whereas the career ladder pertaining to expertise after graduation includes “program and policy development” capacity to solve local health issues, as one of the six areas of activities to be implemented by PHNs [7], which means that EBPI is an essential activity and capacity to be acquired by PHNs.

However, a PHN survey (n=604) [8] that measured research utilization competency (RUC) with a two-factor, ten-item scale containing five evidence-based steps: (1) Ask, (2) Acquire, (3) Appraise, (4) Apply, and (5) Assess [9], showed low scores overall and for two skill subsets in particular—“To examine the search results to solve on-site problems (that contain the contents of step 1-2)” and “To appraise, apply, and evaluate the search results (that contain the contents of step 3-5)”—namely 58.7, 63.7, and 53.7 out of 100 points, respectively. These results indicate that, despite the widespread recognition of the importance of evidence among PHNs, there are many obstacles to actually using it.

The whole process of EBPI incorporates many organizational factors in addition to the five steps mentioned above. To promote EBPI by PHNs in the future, it is necessary to grasp not only the state of RUC but also the evidence–practice gap across the whole EBPI process, and to consider an appropriate direction for problem-solving based on the results. However, the state of EBPI in Japan remains unknown, for PHNs and for other professions alike, because no comprehensive assessment sheet has been developed on this subject.

Accordingly, our research team considered the possibility of using an existing framework for D&I research as an assessment sheet. After analyzing 111 studies using various D&I conceptual frameworks, a review by Mazzucca et al. [10] reported that the Consolidated Framework for Implementation Research (CFIR) [11] and the RE-AIM Model (which has five factors: reach, efficacy, adoption, implementation, and maintenance) [12] were the most commonly used frameworks, each used in 16 of the 111 studies. After examining the most popular models, we found that CFIR and RE-AIM had an advantage in the comprehensiveness of “implementation” ranging from the adoption to introduction of evidence, and of the evaluation of interventions, respectively. Since the purpose of this study was to comprehensively understand the actual status of EBPI among public health nurses, CFIR, which emphasizes the process of adopting evidence-based interventions, was chosen as the most appropriate for this study. Since CFIR includes “1. Intervention Source, 2. Evidence Strength & Quality, 3. Relative Advantage, 4. Adaptability”, we believe that CFIR will contribute to strengthening the research utilization competency of PHNs. CFIR is a framework developed by integrating about 20 theories and models from psychology, sociology, organizational change theory, etc. The number of citations in PubMed exceeds 2600 as of July 2021, and it is being used in many countries.

Purpose of research

The purpose of this research was to develop an Implementation Degree Assessment Sheet for Health Program (IDAS) by customizing CFIR. The study design was the development and validation of the items and the scale for the implementation of health programs.

This is important in order to use the sheet to comprehensively understand the situation of EBPI among Japanese PHNs and thus clarify how to develop the capacity of public health nurses to promote EBPI going forward.

The term “program” was defined as “any program, activity, or other engagement with the targets of public health (individual, family, population, organization, local or other community), organized by you or members of your department.”

Methods (Table 1)

The development and validation of the IDAS followed the three phases of Boateng et al. [13].

Table 1 Procedure of IDAS development and validation

Phase 1: Item development (Table 1)

Step 1: Identification of domain and item

We used CFIR, which has five major domains and 38 constructs [14], as the framework for the IDAS. The items were determined through discussion with the co-researchers of the lead university on whether the contents could be used to assess the degree of health program implementation based on EBPI in Japan. CFIR was originally developed by integrating numerous existing models [11] and has been used across a wide range of studies for more than a decade [15]. Regarding the use of CFIR, we confirmed with the developer, Damschroder, that it is open and free to use.

Step 2: Content validity

Content validity was ensured through weekly consultations among the co-researchers over two months. Specifically, each of the four members read CFIR webpages and translated the CFIR constructs and guides into Japanese. We collated the translated materials and generated IDAS constructs by closely examining whether each construct is relevant to specific, hypothetical cases of health programs in Japan. We used taxonomy verbs for expression as they may be used for assessing behavior. These items were subsequently modified to fit the context of health programs in Japan by seven expert members with experience in health program implementation and qualitative research. For example, we adopted for number 11 non-competitive content in the program development of Japanese administrative PHNs because of the characteristic of emphasizing “Horizontal Development of Pioneering and Superior Practice” rather than “Peer Pressure.”

Consequently, all five domains were retained, while the constructs were ultimately reduced to 31 after consolidation and reorganization. As detailed in the footnotes to Table 3, the two CFIR items were reorganized to fit the Japanese context (IDAS; No. 7, 11), and the 11 CFIR items were integrated into the five IDAS items.

In order to make the IDAS available internationally in the future, the English version of the IDAS was prepared in accordance with the method of Wild et al. [16] by following the steps of (1) Forward Translation, (2) Back Translation, and (3) Back Translation Review.

Phase 2: Scale development (Table 1)

Step 3: Pre-testing

In order to evaluate content validity, a pre-test of a target group was conducted after participants attended a training session on program implementation. The training session lasted 90 min and consisted of a PowerPoint lecture introducing the importance of EBPI and IDAS for public health nurses, and group work to discuss their own level of implementation. There were two questions in the pre-test, whether each item was easy to understand and whether the participants recognized its importance. We selected the constructs receiving at least 70% approval regarding the two questions.

Step 4: Survey administration and sample size

The survey covered PHNs affiliated to 334 government offices and health centers of all prefectures and the cities with health centers. This method was chosen because there are two types of health centers run by government agencies in Japan: prefectural and municipal. We selected health centers by stratified random sampling and requested six PHNs per center to participate. Assuming that 50% of the institutions and PHNs would participate, we expected data collection from some 500 respondents. Thus, the survey was designed to secure a sufficient sample size of 400, with a population of 40,000, confidence level of 95%, error margin of 5%, and response rate of 0.5.
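The sample-size reasoning above can be sketched as follows. This is a minimal illustration, not the authors' actual calculation: it assumes the standard formula for estimating a proportion with a finite population correction, and it reads the stated "response rate of 0.5" as the assumed response proportion p = 0.5 in that formula. The function names are hypothetical.

```python
import math


def required_sample_size(population, z=1.96, margin=0.05, proportion=0.5):
    """Sample size for estimating a proportion, with finite population correction.

    Assumption: z = 1.96 corresponds to the 95% confidence level in the text,
    and `proportion` is the assumed response proportion (0.5 is most conservative).
    """
    n0 = (z ** 2) * proportion * (1 - proportion) / margin ** 2  # infinite-population size
    n = n0 / (1 + (n0 - 1) / population)                         # finite population correction
    return math.ceil(n)


def expected_respondents(offices, phns_per_office, office_rate, phn_rate):
    """Expected number of returned questionnaires under assumed participation rates."""
    return offices * phns_per_office * office_rate * phn_rate


needed = required_sample_size(40_000)
expected = expected_respondents(334, 6, 0.5, 0.5)
print(f"minimum sample size: {needed}")
print(f"expected respondents: {expected:.0f}")
```

Under these assumptions the minimum comes out slightly below the target of 400 stated above, and the expected number of respondents matches the "some 500" in the text.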

The survey was conducted as an anonymous, self-administered questionnaire survey by mail to each head PHN. The head PHNs were asked to minimize bias in terms of job title, years of experience, and function in distributing the questionnaire form. We asked the participants to return the questionnaire form in a separate envelope provided by the researcher, which could be returned unsigned, to ensure that the responses were voluntary. The survey was conducted in January 2020.

The survey pertained to basic attributes and the draft IDAS (five domains and 31 constructs of CFIR customized to health programs in Japan).

The question asked: “Each of the following constructs is a statement concerning behavior in adopting/implementing a new program. Does the statement apply to your work behavior?” The participants were asked to assess their behavior on a scale of 0 to 5, ranging from “0: Does not apply at all” to “5: Always applies” (with the total score falling between 0 and 155; refer to Table 3 for the items).
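The scoring rule described above (31 items, each rated 0 to 5, summed to a total between 0 and 155) can be written as a short helper. The function name `idas_total` and its validation behavior are illustrative assumptions, not part of the published instrument.

```python
N_ITEMS = 31                 # constructs in the draft IDAS
SCALE_MIN, SCALE_MAX = 0, 5  # "Does not apply at all" .. "Always applies"


def idas_total(responses):
    """Return the total IDAS score for one respondent.

    Raises ValueError if the number of answers or any answer is out of range.
    """
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    for value in responses:
        if value not in range(SCALE_MIN, SCALE_MAX + 1):
            raise ValueError(f"response {value!r} is outside {SCALE_MIN}-{SCALE_MAX}")
    return sum(responses)


# Hypothetical respondent who answers "3" to every item.
print(idas_total([3] * N_ITEMS))
```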

Step 5: Item analysis

As item analysis, we conducted a ceiling/floor effect analysis [17] and an item-total correlation analysis. The adoption criteria are shown in Table 1.

Phase 3: Scale evaluation (Table 1)

Step 6: Tests of reliability

Regarding reliability, we checked internal consistency with Cronbach’s coefficient alpha, and stability with the split-half correlations (odd/even) method.
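For readers unfamiliar with these two reliability statistics, the following sketch shows how they are commonly computed. The analysis in this study was run in SPSS; the Python functions, their names, and the toy data below are assumptions for illustration only. The split-half estimate is assumed to use the Spearman–Brown correction applied to the correlation between odd-item and even-item sums.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` holds one list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance(col) for col in zip(*rows)]  # variance of each item
    total_var = variance([sum(r) for r in rows])       # variance of total scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Pearson correlation coefficient (undefined if either input is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def split_half_spearman_brown(rows):
    """Odd/even split-half reliability with the Spearman-Brown correction."""
    odd = [sum(r[0::2]) for r in rows]   # items 1, 3, 5, ...
    even = [sum(r[1::2]) for r in rows]  # items 2, 4, 6, ...
    r = pearson(odd, even)
    return 2 * r / (1 + r)


# Hypothetical data: 5 respondents x 4 items on the 0-5 scale.
toy = [
    [5, 4, 5, 4],
    [3, 3, 2, 3],
    [1, 2, 1, 1],
    [4, 4, 3, 5],
    [2, 1, 2, 2],
]
alpha = cronbach_alpha(toy)
split_half = split_half_spearman_brown(toy)
print(f"alpha = {alpha:.2f}, split-half = {split_half:.2f}")
```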

Step 7: Tests of Validity

Validity was verified in terms of construct validity, criterion-related validity, and known groups validity (four groups depending on the years of experience and whether the respondent was a supervising PHN or not). Since the IDAS comprises CFIR’s five domains, we verified the model fitness of construct validity with a confirmatory factor analysis.

The external criterion for criterion validity was the Research Utilization Competence Scale (RUC) score, which consists of 10 items (ranging from “Does not apply at all: 1” to “Most often applies: 6”; 10–60 points) and has been validated for reliability and validity [8].

As statistical software, we used IBM SPSS ver.27 (containing Advanced Statistics, Regression, and Amos) with a significance level of 5%.

Ethical considerations

We explained the ethical considerations in a document sent with the questionnaire form. The research program was implemented with the approval of the Ethical Review Board, Osaka University Hospital (Approval No. 19285 dated November 5, 2019). The document also explained the voluntary nature of participation in the survey, protection of personal information, data processing policy, and the publication of research findings, among others. The respondents gave consent to participation in the survey by ticking the appropriate box in the document and returning it.

Results

Phase 1: Item development

Step 1: Identification of domain and item

The draft IDAS, created following a series of consultations, contained all of the five domains of CFIR, as well as 31 constructs customized to health programs in Japan (see constructs in Table 3). Although the original CFIR had 38 constructs [14], we customized them by narrowing and consolidating the meanings of those constructs which do not fit the scale of programs executed by PHNs or the structure of administrative organizations in Japan.

Step 2: Content validity

Thus, the definition of Construct I-G “Design Quality & Packaging” was narrowed down to mean “Material Quality Control.” Construct II-C “Peer Pressure,” which literally refers to pressure from companies operating in the same industry or competitors, was changed to “Horizontal Development (of Pioneering and Superior Practices)” to include positive impact. Construct III-D “Implementation Climate (1. Tension for Change, 2. Compatibility, 3. Relative Priority)” was modified to “Fostering Change Acceptance Climate” to put more weight on infrastructure development. As the scope of our survey is limited to program implementation by administrative organs, Construct A “Structural Characteristics” was narrowly defined in the spirit of Sub-constructs D5 and D4 as “17. Goals Setting and Accountability” and “18. Confirmation of Higher Goals/Incentives.” The five sub-constructs of Construct “V-B. Engaging” were modified into expressions and statements intelligible to Japanese PHNs, focusing on important factors of program implementation by administrative organs, namely internal human resource development, external partnerships, and program participants. Thus, Sub-constructs 1 “Opinion Leaders,” 2 “Formally Appointed Internal Implementation Leaders” and 3 “Champions” under Construct B “Engaging” were integrated into a construct called “Engaging: Internal Implementation Key Persons.” We also consolidated Sub-constructs B4 and B5 into “Engaging: External Change Agents/Key Stakeholders.”

Phase 2: Scale development

Step 3: Pre-testing

Seven PHNs participated in the training session; they had an average of 7.7 years of experience working as PHNs for an administrative organ (range 2–15). The respondents had no difficulty understanding any of the statements. At least five PHNs (71.4%) recognized every construct except “Trialability” as important, which supports content validity. We revised the Japanese expression of “Trialability” by adding the word “confirmation.”

Step 4: Survey administration and sample size (Table 2)

One hundred eighty-five of the 334 institutions (55.4%) agreed to participate in our survey. Of the 966 questionnaire forms distributed, 709 forms (73.4%) were collected, of which 702 responses (72.7%) were valid.

Table 2 Basic attributes of participants  N=702

The attributes of the participants (Table 2) indicated that women accounted for an overwhelming majority (95.7%), over half of whom (51.7%) had experience of 26 years or over. By affiliation, 24.9% were working for a prefectural government, and 75.1% at a health center. By job title, about 20% were supervising PHNs and 60% had a position equivalent to section head or higher.

Step 5: Item analysis (Table 3)

In our item analysis, no ceiling or floor effects were found: no item had a mean + 1SD greater than 5 or a mean − 1SD less than 0. In the item-total correlation analysis, no item showed a low correlation (below 0.5) with the total score; the coefficients ranged between 0.557 and 0.717.
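The two item-level checks can be sketched as below. This is a hedged illustration rather than the study's SPSS procedure: it assumes the conventional rule that a ceiling effect is flagged when mean + 1SD exceeds the scale maximum (5) and a floor effect when mean − 1SD falls below the scale minimum (0), uses the sample standard deviation, and computes an uncorrected item-total correlation (the item is not removed from the total). The function names and the toy data are hypothetical.

```python
from statistics import mean, stdev

SCALE_MIN, SCALE_MAX = 0, 5  # response range of each IDAS item
MIN_ITEM_TOTAL_R = 0.5       # threshold mentioned in the text


def pearson(x, y):
    """Pearson correlation coefficient (undefined if either input is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def item_analysis(rows):
    """Flag ceiling/floor effects and low item-total correlations per item."""
    totals = [sum(r) for r in rows]
    report = []
    for idx, col in enumerate(zip(*rows), start=1):
        m, sd = mean(col), stdev(col)
        report.append({
            "item": idx,
            "ceiling": m + sd > SCALE_MAX,
            "floor": m - sd < SCALE_MIN,
            "low_item_total": pearson(col, totals) < MIN_ITEM_TOTAL_R,
        })
    return report


# Hypothetical data: 6 respondents x 3 items, built so that item 1 piles up
# at the top of the scale and item 3 at the bottom.
toy = [
    [5, 3, 0],
    [5, 2, 0],
    [5, 4, 1],
    [4, 1, 0],
    [5, 5, 0],
    [5, 3, 0],
]
report = item_analysis(toy)
for row in report:
    print(row)
```

An item flagged by any of the three checks would be a candidate for revision or deletion; in the survey data reported above, no item was flagged.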

Table 3 Item analysis

Phase 3: Scale evaluation

Step 6: Tests of reliability

With regard to reliability, Cronbach’s coefficient alpha, a measure of internal consistency, stayed above 0.8 overall and for each of Domains I–V (0.95, 0.86, 0.83, 0.88, 0.89, and 0.86, respectively). Our verification of stability with the split-half (odd/even) method resulted in a Spearman–Brown reliability coefficient of 0.95.

Step 7: Tests of validity

We used confirmatory factor analysis to test whether the five domains of the IDAS fit the five-factor structure. The results initially showed inadequate fit indices of GFI: 0.78, CFI: 0.84, and RMSEA: 0.08. We added covariances between error variables at the 15 locations where the modification index was greater than 30. In doing so, we checked the content of the paired items and limited the covariances to those pairs for which a relevant context was identified. The fit indices then improved to GFI: 0.87, CFI: 0.92, and RMSEA: 0.06. Although the GFI remained slightly below the criterion of 0.9, the CFI was higher than 0.9 and the RMSEA was lower than 0.08, so the model fit reached an acceptable level.
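The comparison of the reported fit indices against their criteria can be made explicit with a small check. The index values are those reported above; the exact form of each cutoff (for example, whether 0.9 itself counts as meeting the GFI criterion) is an assumption, and the names `CRITERIA` and `check_fit` are hypothetical.

```python
# Cutoffs as described in the text; inclusiveness of each bound is assumed.
CRITERIA = {
    "GFI": lambda v: v >= 0.90,
    "CFI": lambda v: v > 0.90,
    "RMSEA": lambda v: v < 0.08,
}


def check_fit(indices):
    """Return, for each fit index, whether it meets its criterion."""
    return {name: CRITERIA[name](value) for name, value in indices.items()}


initial = {"GFI": 0.78, "CFI": 0.84, "RMSEA": 0.08}  # before adding error covariances
revised = {"GFI": 0.87, "CFI": 0.92, "RMSEA": 0.06}  # after adding error covariances

print("initial:", check_fit(initial))
print("revised:", check_fit(revised))
```

With these cutoffs, the revised model meets the CFI and RMSEA criteria while the GFI still falls short, which is the point taken up again in the Limitations.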

The IDAS score and the research utilization competency score, used as an external criterion, showed a correlation coefficient of 0.51 (p<0.001). The result of the known-group method analysis (Table 4) showed significant differences in the IDAS score among the four groups of years of experience, as well as between the two job title groups (supervising PHN or not) (p<0.001). Multiple comparisons indicated significant score differences between Group 1 (1–5 years) on the one hand and Groups 3 (16–25 years) and 4 (≥26 years) on the other, as well as between Group 2 (6–15 years) and Group 4 (≥26 years) (p<0.01).

Table 4 Relationships between known-groups N=702

Discussion

The reliability and validity of the IDAS may be described in line with the three phases [13].

Phase 1: Item Development. We believe that the confirmability of “Identification of Domain and Item Generation” was ensured by using CFIR, a framework that has been used and developed over a decade in various countries. Content validity was also ensured by ongoing evaluation and modification by expert co-researchers and by modification based on the results of the pre-test.

Phase 2: Scale Development. Our survey obtained 702 valid responses. Given that even the smallest attribute subgroup used for the analysis contained 87 respondents, we secured an adequate sample size representing the population. No ceiling or floor effect was found, and no item required consideration of deletion in the item-total correlation analysis.

Phase 3: Scale Evaluation. Internal consistency and stability were verified to warrant the reliability of the IDAS. The correlation between the IDAS and external criteria and the expected score differences among “known groups” confirmed the criterion validity (concurrent validity and known-group validity). Construct validity was also confirmed as our confirmatory factor analysis attested to the model fitness of the five IDAS domains.

From these results, we developed the IDAS, composed of five domains and 31 constructs with certain levels of verified reliability and validity.

Next, the following is a discussion of how the IDAS can be used to develop the capacity of public health nurses to promote EBPI. However, as a prerequisite, it should be noted that not all of the constructs need to be adhered to in actual work. The reason for this is that within each organization, there are different areas of responsibility depending on the division of work and position.

First, it may be used to assess individual behavior in program implementation by checking each of the constructs. For example: (1) the person responsible for implementing a health program may use the IDAS for regular self-assessment to identify (i) where the requirements have been met, (ii) where improvements have been observed, and (iii) where problems exist, thus facilitating his/her skill development going forward; or (2) at the workplace level, individual PHNs may use the results of their self-assessment (i) to exchange information on their current status by construct and (ii) to share best and worst practices, thus facilitating individual skill development and improvement. In studies that used CFIR for program evaluation [18, 19], the status of each construct was compared across facilities to identify constructs that strongly distinguished effective from ineffective implementation, so that improvements could be considered from the analysis. If this approach is applied with the IDAS, we believe that the IDAS can serve as a tool to identify constructs that require capacity building at the individual level and to support continuous self-improvement.

Second, the total score and the scores for individual domains may be used for “competency assessment” in program implementation at the workplace. Concrete examples may include (1) regularly aggregating the self-assessment results of all members at the workplace level to identify any domain in which the workplace may be experiencing problems in competency, to serve as an input to competence development planning; or (2) identifying historical trends to inform quality improvement and future planning at the workplace level. Studies using CFIR have analyzed facilitating and obstructing factors within the system or organization for community-based programs [20,21,22]. This suggests that if the IDAS is used at the workplace level for a specific program, or at the team level including related organizations, it can identify the strengths and challenges of the system in a community or organization on the basis of the aggregated scores for each domain, and point to guidelines for capacity building that should be undertaken by the entire team.

Finally, each domain or construct of the IDAS customized for health programs in Japan might serve as a framework for formative evaluation of program implementation in Japan, on the grounds that the original CFIR contains cumulative evidence [10, 15, 23] that can be used for formative evaluation of programs and activities. In doing so, interpreting high and low scores requires an open discussion of whether or not each construct was necessary for evaluating the degree of implementation of the program. We believe that continuing this research, accumulating the results, and deepening the dialogue will contribute to implementation science.

Limitations

A methodological limitation of this study is the issue of the representativeness of the population. Although a sufficient sample size was obtained, the percentage of the group with experience of 26 years or more was more than 20% higher than actual, and the percentage of the group with 6–15 years and 16–25 years was more than 10% lower. The need for further investigation and consideration of this effect is an issue for the future. Another limitation of this study is that the fitness index GFI did not meet the criterion, which we would like to discuss in future research. This research was intended to develop an assessment sheet to understand comprehensively the situation of EBPI among Japanese PHNs and was limited to the preparation of the IDAS, comprising five domains and 31 constructs. Future challenges include developing guidelines to promote EBPI and closing the evidence–practice gap by enhancing the elements of the IDAS through ongoing qualitative research that may serve as behavioral indicators. In addition, because of the limitations of IDAS, which is composed of items that fit the Japanese context, the transferability of IDAS for international use needs to be examined in the future.

Conclusions—development and validation of the IDAS

The IDAS (five domains, 31 constructs) was developed by customizing the five domains of the CFIR and the constructs included in it to assess the degree of implementation of health programs in Japan, and by going through three phases of a rigorously set procedure to ensure reliability and validity.

In the future, it is hoped that the IDAS will be used for developing the capacity of public health nurses and related professions, and for evaluating the implementation of programs. It is necessary to continue to accumulate research results to enhance the content of the IDAS as behavioral indicators, and to examine the possibility of its use in other countries.