A two-round Delphi study was conducted via the Internet using Qualtrics software (version 45433) within a 4-month time frame (July–October 2011). A flow diagram of the methods is shown in Figure 1.
The Delphi method is a systematic approach that can be used to derive consensus among experts on a topic where scientific knowledge is scarce. Its main characteristics, i.e., anonymity of experts, iteration, controlled feedback, and statistical group response, allow participants to give their opinion freely and to change it after receiving feedback, and ensure that the opinion of every expert is equally represented in the results [25, 26].
Procedures and participants
The first round was conducted to facilitate consensus among experts on the importance of factors for the specific stages of the introduction process, i.e., adoption, implementation, and continuation, and on their changeability. To this end, a variety of people with research and/or practice experience in the field of the introduction of PA interventions in PHC was recruited via research and practice networks (e.g., participants of the qualitative study, LinkedIn groups) and invited to participate by email and telephone. Participating experts were sent an email including the link to the first questionnaire. After two weeks, four weeks, and five weeks, non-respondents received a reminder. In total, 44 experts (response rate of 65%) completed the questionnaire. Completion of the questionnaire was taken to indicate consent; no separate consent was obtained from participants. All experts were Dutch and had experience with the introduction of PA interventions in PHC within the following functions: researcher (n = 12), policy maker (n = 7), intervention manager (n = 4), PHC advisor (n = 12), and PHC professional (n = 9).
The questionnaire consisted of two parts. Part one encompassed 267 structured questions (89 factors × 3 stages) on factors’ importance. Questions were based on the factors identified in the systematic literature review and qualitative study (see Additional file 1) and divided into six categories of factors that may influence the introduction process, i.e., innovation, socio-political context, organization, patient, adopting person, and innovation strategy [12, 17]. The experts were asked to rate on a 10-point Likert scale (1 = not at all important, 10 = essential) the importance of each factor for, respectively, the adoption, implementation, and continuation of PA interventions in PHC. For each category of factors, an open-ended question asked whether any factors were missing from the list. Part two included 89 structured questions on factors’ changeability. The experts were asked to rate on a 10-point Likert scale (1 = no influence at all, 10 = a lot of influence) the amount of influence they had on each factor during their involvement in the introduction of PA interventions in PHC. Piloting of the questionnaire among health promotion researchers and employees of health promotion institutes indicated that the questionnaire was well received.
Median scores were calculated as indicators of factors’ importance for each stage of the introduction process. In concordance with van Stralen et al., factors with a median score of 8 or higher were considered important. By this criterion, many factors qualified as important. To avoid burdening experts with too many items when deciding on their top-10s in the second-round questionnaire, mean scores were also calculated to identify the most important factors for each stage. For each stage, the cut-off was set at the grand mean of the mean scores of that stage’s important factors: the most important factors were thus those with a median score of 8 or higher and a mean score of at least 7.64 for the adoption stage, 7.70 for the implementation stage, and 7.76 for the continuation stage.
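The two-step selection described above (median criterion, then a grand-mean cut-off over the remaining factors) can be sketched as follows. The factor names and ratings are hypothetical, purely for illustration; the original analysis was run in SPSS, not Python.

```python
import statistics

# Hypothetical ratings: factor -> list of expert scores (1-10) for one stage.
ratings = {
    "time": [8, 9, 7, 8, 10],
    "motivation": [9, 8, 8, 9, 9],
    "funding": [6, 7, 5, 6, 7],
}

# Step 1: factors with a median score of 8 or higher count as "important".
important = {f: s for f, s in ratings.items() if statistics.median(s) >= 8}

# Step 2: the stage's cut-off is the grand mean of the important factors' means.
cutoff = statistics.mean(statistics.mean(s) for s in important.values())

# Step 3: "most important" factors satisfy both the median and mean criteria.
most_important = [f for f, s in important.items() if statistics.mean(s) >= cutoff]
```

With these toy data, "funding" fails the median criterion, and the grand-mean cut-off then separates the two remaining factors.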
Median scores were also calculated for factors’ changeability. Factors were indicated as changeable if they scored a median of 6 or higher. This cut-off value was chosen to include all factors considered to be at least somewhat changeable. The interquartile range (IQR) scores were calculated to assess the extent of agreement between the experts on the changeability of each factor. The IQR represents the distance between the 25th and 75th percentile values, with smaller values indicating a higher degree of consensus. An IQR score of 1 means that 50% of all the scores given by experts fall within one point on the scale. According to Linstone and Turoff, an IQR of 2 or smaller can be considered good consensus on a 10-point Likert scale.
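The changeability and consensus checks can be illustrated with a short sketch. The scores are hypothetical, and note that percentile conventions differ between software packages (Python's default quantile method may give slightly different quartiles than SPSS), so this only demonstrates the logic of the criteria.

```python
import statistics

# Hypothetical changeability ratings (1-10) from the expert panel for one factor.
scores = [6, 7, 7, 8, 7, 6, 8, 7, 9, 7]

median = statistics.median(scores)
q1, _, q3 = statistics.quantiles(scores, n=4)  # 25th, 50th, 75th percentiles
iqr = q3 - q1

changeable = median >= 6  # cut-off for "at least somewhat changeable"
consensus = iqr <= 2      # Linstone & Turoff criterion on a 10-point scale
```

Here the panel's median of 7 marks the factor as changeable, and the narrow IQR indicates good consensus.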
Differences between expert groups (i.e., researchers, policy makers, intervention managers, PHC advisors, and PHC professionals) with regard to their ratings of factors’ importance and changeability were explored with one-way independent ANOVAs. IBM SPSS Statistics version 19.0 was used for the analyses. The qualitative data on potentially missing factors were scored as ‘new’ or ‘already in the list’.
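A one-way independent ANOVA of this kind compares the mean rating of one factor across the expert groups. A minimal sketch using SciPy (the original analyses were run in SPSS; group sizes and scores below are hypothetical and shortened):

```python
from scipy.stats import f_oneway

# Hypothetical importance ratings of one factor, grouped by expert type.
researchers = [8, 9, 7, 8]
policy_makers = [6, 7, 6]
phc_professionals = [8, 8, 9, 7]

# One-way ANOVA: tests whether the group means differ.
f_stat, p_value = f_oneway(researchers, policy_makers, phc_professionals)
significant = p_value < 0.05  # conventional alpha level (assumed, not stated in the text)
```

In the full analysis this test would be repeated per factor and per stage, one comparison across the five expert groups each time.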
Procedures and participants
All experts who completed the first-round questionnaire (N = 44) were sent an email invitation to participate in the second round, including the link to the second questionnaire. After one week and two weeks, non-respondents received a reminder. In total, 37 experts (response rate 84%) completed the questionnaire. Of them, 11 were researchers, six were policy makers, three were intervention managers, nine were PHC advisors, and eight were PHC professionals.
The second round was conducted to identify the top-10 most important factors for the specific stages (i.e., adoption, implementation, and continuation) of the introduction process, and their changeability. The questionnaire included the factors that were scored as most important by the experts in the first round (median ≥ 8 and mean ≥ 7.64 for the adoption stage; median ≥ 8 and mean ≥ 7.70 for the implementation stage; median ≥ 8 and mean ≥ 7.76 for the continuation stage). This resulted in a list of 18 factors for the adoption stage, 23 factors for the implementation stage, and 24 factors for the continuation stage; in total, 37 different factors (see Table 1). For each stage, the experts were asked to indicate their top-10 of most important factors. Again, open-ended questions were added on whether any factors were missing. For the same set of factors, experts were asked to rate changeability on a 10-point Likert scale (1 = not changeable at all, 10 = very changeable). In contrast to the first questionnaire, which concerned their own personal influence on each factor, experts were now asked to rate factors’ changeability in general. This alteration was made because we felt that the group of experts was too heterogeneous for consensus to occur if their own personal influence was taken into account. Again, piloting indicated that the questionnaire was well received.
For changeability, the median and IQR scores were again calculated. Importance was quantified by summing the points allocated to factors through the experts’ top-10 rankings: for each expert, the factor ranked first received ten points, the factor ranked second nine points, and so on. Factors not assigned to a top-10 received no points. Differences between expert groups (i.e., researchers, policy makers, intervention managers, PHC advisors, and PHC professionals) with regard to their top-10 rankings and ratings of factors’ changeability were explored with one-way independent ANOVAs. The qualitative data on potentially missing factors were scored as ‘a factor not in the list’ or ‘in-depth information on top-10’.
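The point-allocation scheme for the top-10 rankings can be sketched as follows. The rankings are hypothetical and shortened to three-item lists for brevity; in the study, each expert submitted a full top-10 per stage.

```python
from collections import defaultdict

# Hypothetical rankings: one ordered list of factor names per expert (rank 1 first).
rankings = [
    ["time", "motivation", "funding"],
    ["motivation", "time", "support"],
    ["motivation", "funding", "time"],
]

# Rank 1 earns 10 points, rank 2 earns 9, ..., rank 10 earns 1;
# factors outside an expert's top-10 earn nothing.
points = defaultdict(int)
for top10 in rankings:
    for rank, factor in enumerate(top10, start=1):
        points[factor] += 11 - rank

# Overall importance: factors ordered by total points, highest first.
overall = sorted(points.items(), key=lambda kv: kv[1], reverse=True)
```

Summing over experts in this way weights each factor both by how often it appears in a top-10 and by how highly it is ranked.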
The Medical Ethics Committee of the Leiden University Medical Centre granted ethical approval for this study (reference number NV/CME 09/081).