Introduction

In Germany, a replication study of the US Community Youth Development Study (CYDS) was rolled out in 2021. This controlled study aims to evaluate the effectiveness of the Communities That Care (CTC) community-based prevention system in Germany. For this replication study, the assessment instruments of the CYDS were used. Most of these instruments had already been translated into German several years before the start of the replication study, adapted to the German context, pre-tested, and trialed (Groeger-Roth et al. 2015; Schwendemann et al. 2018; Jonkman 2015). However, some of these previously translated instruments had to be revised again within the replication study for various reasons (e.g., shortening, modifications, measurement errors, and biases). These revised versions were subjected to cognitive pretests and expert evaluations in an iterative process before being used in the replication study. Before the replication study started, we could not subject the revised instruments to another extensive validation study. These instruments include the Community Key Informant Interview (CKI). The CKI was developed by Arthur et al. (2001) and measures the short-term outcomes (system transformation outcomes) as postulated by the CTC theory of change (Fig. 1) (Arthur et al. 2001; Brown et al. 2008, 2014). In the CYDS, empirical validation of this model demonstrated that over 90% of the effect of CTC on risk and protective factors and adolescent problem behavior (positive youth development) was mediated by the construct adoption of science-based prevention (Brown et al. 2014). In a validation study of the CKI, this construct was also shown to correlate with the constructs community collaboration for prevention and community support for prevention (Brown et al. 2008). This correlation was interpreted to mean that the first construct depends on the other two constructs.

Fig. 1

Theoretical model of system transformation promoted by Communities That Care adapted from Brown et al. (2011)

For Germany, we assume that the effect of CTC is probably also entirely mediated by the adoption of science-based prevention. In addition, we expect that the constructs community norms and social development strategy do not mediate the effect of CTC in Germany. Therefore, the German version of the instrument does not include these two constructs. Thus, the German version of the CKI includes five system transformation constructs. These are described more precisely in the following.

(1) Stage of adoption of science-based prevention: This measure is based on the Prevention Science Framework for Community Prevention Planning and Rogers' Stage Model for Diffusion of Social Innovations (Arthur et al. 2001). The former means that science-based community prevention (1) builds on an explicit model of scientifically proven risk and protective factors for health-related behaviors; (2) bases its intervention planning on epidemiological data on the local prevalence of these risk and protective factors and health-related behaviors; (3) selects, implements, and monitors proven effective interventions to address local prevention needs; (4) evaluates the interventions in a data-driven manner and adapts them if necessary. When considering science-based community prevention as a social innovation, the stage model of diffusion of social innovations (Rogers 1995) distinguishes the following five stages: (1) knowledge, (2) persuasion, (3) decision, (4) implementation, and (5) confirmation. On this basis, the construct distinguishes the following six stages (Table 1):

Table 1 Six stages of adoption of science-based prevention
(2) Community collaboration for prevention improves the effectiveness and efficiency of community health promotion as it leads to broader political advocacy and shared responsibilities for health promotion activities, better resource pooling for health promotion, and a reduction in duplicative structures and because it counteracts jurisdictional wrangling and turf battles (Richard et al. 2002; Dearing et al. 1996; Brown et al. 2008). The construct is captured in the CKI with the following two measures:

  (a) Prevention collaboration: This measure captures the degree to which the local intersectoral health promotion coalition succeeds in networking (e.g., sharing information), coordinating (e.g., changing activities), and collaborating (e.g., sharing resources) (Brown et al. 2008). This conceptualization distinguishes the measure from related constructs, such as collaborative capacity (Goodman et al. 1998) and community readiness (Edwards et al. 2000), which conceptually capture primarily the local context for intersectoral collaboration for health promotion (Brown et al. 2008).

  (b) Sectoral collaboration: This measure assesses the extent to which stakeholders (e.g., agencies, organizations, volunteers) in the local intersectoral coalition collaborate with stakeholders from other sectors for health promotion (Brown et al. 2008). Ten sectors are included (e.g., schools, churches and other religious institutions, law enforcement, youth services, and media).

(3) Community support for prevention refers to local people and local decision-makers believing in the effectiveness of health promotion, knowing about local health promotion efforts, and wanting the community to invest in health promotion.

  (a) Community support: This measure captures the extent to which the community's population is convinced of the effectiveness of health promotion and is aware of local health promotion activities. In the original instrument, this is determined once for the community population and once for the interviewed key persons. The German version of the instrument determines this value only for the community's population.

  (b) Financial support: This measure captures the importance that key respondents attribute to primary prevention and health promotion when allocating a community budget.

The present study aims to evaluate the structural validity, concurrent construct validity, and reliability (internal consistency and inter-rater agreement) of these constructs using baseline data from the German replication study (Effectiveness of the Community Prevention System Communities That Care, CTC-EFF).

Methods

Study design

The following analyses are based on cross-sectional baseline data of the CTC-EFF study. The CTC-EFF is a quasi-experimental study to evaluate the effectiveness of the Communities That Care community-based prevention system in Germany. Small towns and rural communities, as well as city districts of large cities, are participating in the study. These are in Baden-Württemberg, Bavaria, Lower Saxony, and Rhineland-Palatinate, and are referred to below as communities. Twenty-two communities implementing CTC (IC) were recruited, along with 22 a priori-matched comparison communities (CC) that resemble them in demographics, economy, and crime. Details on the study design and recruitment of communities were published by Röding et al. (2021).

Sample and recruitment

Drawing on the Community Youth Development Study (Hawkins et al. 2008), a two-stage process was used to generate a sample of community key informants (e.g., mayors, school principals, police officers, youth-serving leaders, or health officials) in each of these communities. In stage 1, the research team used internet searches to identify community key informants for each community and invited them to participate in a survey study via email and telephone. At the end of the survey, in stage 2, respondents were asked to identify people they thought were most knowledgeable about prevention or health promotion in the community. These persons were also invited to participate in the survey. The goal was to recruit an average of 10 community key informants per community. For sample size calculation, see Röding et al. (2021).

For the baseline survey in 2021, 498 community key informants in 22 IC and 406 in 22 CC were identified and invited to participate in the survey. Of these, 111 persons from 22 IC and 89 persons from 21 CC participated. Reasons for non-participation and reasons for exclusion can be found in Fig. 2. The analysis includes 182 persons from 38 communities (101 persons from 20 IC and 81 persons from 18 CC).

Fig. 2

Flow diagram

Data collection

Data were collected using an abridged version of the Community Key Informant Interview (CKI), translated into German and adapted to the German context. First, the 2009 version of the CKI provided by the Social Development Research Group (SDRG 2009) served as the starting point. Details of the original instrument can be found elsewhere (Brown et al. 2007; Rhew et al. 2013). Second, an existing abridged version of the CKI, translated into German and adapted to the German context, which was tested as part of a feasibility study for the implementation of CTC in Germany from 2013 to 2014, served as a reference point (Jonkman et al. 2015). However, this version was not adopted because (1) it was more abridged than would be appropriate for the CTC-EFF study, (2) in some places, it was adapted very specifically to the needs of the feasibility study, and (3) it showed some need for revision (e.g., too little variance, too high item non-response rates, too little internal consistency), as the data collected with this instrument showed.

There were three main reasons we could not use the original unabridged instrument from the United States. (1) The foremost of these was that the interview was estimated to last between 90 and 120 minutes without the abridgments. We assumed that such an interview duration would have resulted in disproportionately low response rates and high item non-response rates in the context of the CTC-EFF. (2) The original instrument contains several items that were neither included in the publications of the CYDS we are replicating nor, in our view, essential for the CTC-EFF. (3) Of the constructs relevant to the replication of the CYDS that are assessed with the CKI (see the “system transformation constructs” in Fig. 1), it was shown that in the CYDS, the effects of CTC were mediated entirely by one construct, and the other constructs were not significant mediators (Brown et al. 2014). After consultation with the Crime Prevention Council of Lower Saxony and the German Prevention Congress, which are the providers of CTC in Germany, we concluded that the constructs social development strategy and community norms are also not relevant mediators in the German version of CTC and therefore do not need to be assessed. Regarding the construct community norms, there was also concern that the relevant questions might elicit negative attitudes toward the survey among respondents because the norms in Germany are different from those in the United States, where the construct was developed. This is particularly apparent regarding tolerance of the consumption of alcohol in small amounts by 16-year-olds in Germany.

Taking into account these necessary modifications, the German questionnaire was created in a team approach based on the TRAPD procedure used in the European Social Survey (ESS; Dorer 2021): Translation: in addition to the existing German first translation, a second translation was carried out by the authors; Review: discrepancies were clarified in an interdisciplinary discussion with the first translator and subject matter experts; Adjudication: the questionnaire was finalized by the team; Pretest: the translated and abridged version was subjected to cognitive pretests (n = 5) and revised in several loops based on the feedback; Documentation: the whole process is documented.

The final version is composed of seven thematic blocks: (1) characteristics of the community (e.g., social capital, openness to change), (2) prevention programs in the community (e.g., planning basis, evaluation, and monitoring), (3) collaboration for prevention (e.g., coordinating agencies, resource sharing), (4) local financial conditions for prevention (e.g., change in financial situation, assessment of the financial situation), (5) local events in the last 2 years that had an impact on local prevention activities (e.g., natural disaster, pandemic, suicides), (6) sociodemographic characteristics of respondents, (7) other community key informants in terms of snowball sampling. The first five blocks comprise 82 items or 14 constructs (e.g., multi-item scales, indices).

Data were collected primarily by computer-assisted telephone interviews (CATI). As an alternative to CATI, the key informants were provided with a digital version of the instrument in the form of a self-completion questionnaire. This could either be completed using a PDF viewer or printed out (paper-pencil) and sent by post to Hannover Medical School.

Constructs and variables

The constructs presented below are, at the individual level, a measure of how the community key informants, as experts in their community, assess their community. At the community level, they measure the (approximate) actual status of the respective community based on intersubjective assessments by experts. The respective measure for the community level is calculated from the community-specific mean value of the respective individual-level measures.
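The aggregation from the individual to the community level described above can be sketched as follows. This is a minimal illustration, not the study's actual procedure (the analyses were run in SPSS); the function name, the `(community_id, score)` record layout, and the example values are assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import mean


def community_means(records):
    """Aggregate individual-level scores to community-level means.

    records: iterable of (community_id, score) pairs. Both field names
    are hypothetical and not taken from the study's data set.
    """
    by_community = defaultdict(list)
    for community_id, score in records:
        by_community[community_id].append(score)
    # The community-level measure is the mean of its respondents' scores.
    return {cid: mean(scores) for cid, scores in by_community.items()}


# Invented example data for two communities
example = [("A", 1), ("A", 2), ("A", 3), ("B", 4), ("B", 5)]
community_scores = community_means(example)
```

With the invented data, community "A" receives the mean of its three respondents and community "B" the mean of its two.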

Construct 1: adoption of science-based prevention

In the CYDS, the indicator adoption of science-based prevention was measured by 20 closed questions and two open questions (Arthur et al. 2001). In the CTC-EFF, the two open-ended questions are not assessed because they refer to clearly titled risk and protective factor concepts that are reasonably well known in US prevention practice but not in German practice.

Based on the 20 closed questions (e.g., Were risk and protective factors prioritized in your community to be addressed with prevention measures? Did your community decide which programs to implement based on this prioritization? Have prevention programs changed as a result of the evaluation? Has your community made funding decisions based on program evaluations?), a total score between 0 and 5 was calculated for each respondent (see Footnote 1). This value indicates one of the six stages of adoption of science-based prevention (Table 1). This score, calculated for each respondent, provides a measure of how the respondent ranks their community in terms of adopting a science-based prevention approach.

Construct 2: prevention collaboration

Prevention collaboration is measured using a multi-item scale consisting of nine items (e.g., There is a network of people involved in prevention and/or health promotion who are in contact with each other. Prevention providers rarely coordinate their activities with each other. Organizations and agencies share financial and human resources for prevention. In your community, each organization has a clear role in implementing the local prevention plan.). A four-point item response scale is used: agree, tend to agree, tend to disagree, disagree. For each respondent, a mean is calculated from their responses to these items. Beforehand, all items are coded so that higher values indicate better cooperation.
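The recoding and averaging step can be illustrated with a short sketch. The numeric coding (1 = disagree to 4 = agree), the item names, and the choice of which item is negatively worded are assumptions for illustration; the source only states that all items are coded so that higher values indicate better cooperation.

```python
from statistics import mean

# Assumed numeric coding of the four-point scale: 1 = disagree ... 4 = agree
SCALE_MIN, SCALE_MAX = 1, 4


def reverse_code(value):
    """Flip a response so that 1 <-> 4 and 2 <-> 3."""
    return SCALE_MIN + SCALE_MAX - value


def scale_score(responses, reversed_items):
    """Mean of one respondent's item responses after recoding.

    responses: dict mapping item name -> coded response (hypothetical names).
    reversed_items: set of negatively worded items that must be flipped.
    """
    recoded = [
        reverse_code(value) if item in reversed_items else value
        for item, value in responses.items()
    ]
    return mean(recoded)


# Invented responses; "rarely_coordinate" is negatively worded
example = {"network_exists": 4, "rarely_coordinate": 1, "share_resources": 3}
score = scale_score(example, reversed_items={"rarely_coordinate"})
```

Here the disagreement with the negatively worded item (coded 1) is flipped to 4 before averaging, so that it counts as evidence of good cooperation.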

Construct 3: sectoral collaboration

In the CYDS, intersectoral collaboration for prevention is measured using a multi-item scale consisting of ten items. In the German version, 27 items were assessed. Respondents are asked: Please estimate to what extent your organization/institution has collaborated regarding prevention and health promotion with the following organizations in the last year. A four-point response scale is used: very much, a lot, a little, not at all. The mean value is calculated over all items belonging to the construct. The variables are coded so that higher values represent stronger intersectoral cooperation.

Construct 4: financial support for prevention

This indicator is based on the following question: You have a fixed budget for substance abuse reduction. What percentage of 100 would you allocate to the following approaches? (1) law enforcement and prosecution, (2) prevention, and (3) treatment (medical/psychotherapeutic). The indicator reflects the value of the budget share preferred for prevention and can take a value between 0 and 100.

Construct 5: support for prevention

The original six-item multi-item scale (Brown et al. 2014) was assessed with only four items in the CTC-EFF baseline survey. This is due to a misinterpretation of the item construct dictionary provided to us by the authors of the CYDS. The following four items were surveyed: (1) People in [community] believe that problem behavior among children and youth can be reduced through prevention. (2) People in [community] are aware of prevention activities locally. (3) Most people in [community] think that prevention programs for youth don't work. (4) Most people in [community] would be willing to pay more taxes to support youth prevention programs. The last item was modified for the German context. Unlike in the United States, communities in Germany have almost no ability to levy taxes. It is also not yet common in Germany for communities to introduce taxes or tax increases to finance prevention measures. Therefore, we have replaced this statement with the following: Most people in [community] believe that all people in the community are responsible for preventing drug abuse. A four-point item response scale is used: agree, tend to agree, tend to disagree, disagree. The variables were coded so that higher values represent better support. The measure is formed from the individual mean of the responses to the two items.

Confounders

Gender was assessed using the categories male, female, and non-binary. Current age was collected as well as migration background. Response categories were as follows: “Yes, I was not born in Germany”; “Yes, at least one parent is not from Germany”; “No”. The respondents' educational status was collected according to the “Demographische Standards” (Beckmann et al. 2016). To assess whether the respondents were experts on the respective study community, we asked: How many years have you lived in [community]? One answer option was “live elsewhere”. This asked whether the person lived in the study community and, if so, for how many years. Furthermore, we asked: How many years have you been working (including interruptions, if applicable) in [community]? One response option was “I do not work in the community”. This question was used to assess whether the person worked in the study community and, if so, for how many years they had worked there. To have an indicator of whether respondents were familiar with the main topic of the survey, we asked: Thinking about your previous professional and volunteer activities, have you worked in the field of prevention and health promotion among children and adolescents? Response options for this are “yes”, “no”, and “don't know”. Respondents who answered yes to this were considered to have experience in this area.

To assess the validity of the respondents' answers, the interviewers rated the respondents' willingness to cooperate, trustworthiness, and understanding of the questions at the end of the survey. The three items were each measured with a four-point Likert scale (e.g., item: “Please rate how cooperative the respondent was”; response format: 1 = “friendly and cooperative”, 2 = “generally cooperative”, 3 = “generally uncooperative”, 4 = “uncooperative and hostile”).

Missing data

For the five constructs, the item non-response rate was, on average, in the mid-single-digit percentage range. Details are shown in the Results section. These missing values were replaced by the community-specific mean of the item. For categorical data, this imputed value was rounded to the nearest whole number, with halves rounded up.
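The imputation rule can be sketched as follows. This is an illustrative reading of the description above, not the code used in the study; the data layout, the function names, and the handling of communities without any observed value are assumptions.

```python
import math
from statistics import mean


def round_half_up(x):
    """Round to the nearest integer with ties going up.

    Python's built-in round() rounds ties to the nearest even number,
    so it is not used here. Intended for non-negative values.
    """
    return math.floor(x + 0.5)


def impute_item(values_by_community, categorical=True):
    """Replace missing responses (None) by the community-specific item mean.

    values_by_community: dict mapping community -> list of responses to one
    item (hypothetical layout).
    """
    imputed = {}
    for community, values in values_by_community.items():
        observed = [v for v in values if v is not None]
        if not observed:
            # Assumption: leave values untouched if nothing was observed.
            imputed[community] = list(values)
            continue
        fill = mean(observed)
        if categorical:
            fill = round_half_up(fill)
        imputed[community] = [fill if v is None else v for v in values]
    return imputed


# Invented example: community "A" has an observed mean of 2.5
example = {"A": [1, 2, None, 4, 3], "B": [4, None, 4]}
result = impute_item(example)
```

In community "A" the observed mean of 2.5 is rounded up to 3, whereas Python's built-in `round(2.5)` would give 2.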

Statistical analyses

Analyses were conducted using SPSS 27.0 (IBM Corp., Armonk, NY, USA) and, in part, Microsoft Excel 2013 and Jamovi 2.3.18.

The structural validity of the five constructs was investigated through exploratory factor analyses (EFA) and confirmatory factor analyses (CFA). First, a series of EFA was conducted in an iterative process. In the first step of the analysis, all 27 items of the sectoral collaboration item battery, 11 items of the collaboration for prevention item battery, the four items on support for prevention, the indicator on financial support for prevention, and the index for the degree of adoption of science-based prevention (adoption score) were included in the analysis. All analyses were conducted as principal component analyses with varimax rotation. Only factors with an eigenvalue of at least 1 were extracted. Factor loadings of at least 0.4 were assessed as substantial. Factor loadings below 0.3 were suppressed in the output. Initially, the items with the lowest communality were removed step by step. Subsequently, in addition to this criterion, the interpretability of the factors was increasingly taken into account. In addition, all factors or items that did not belong to our five constructs were eliminated. According to these criteria, items were eliminated until a solution with five factors was identified. Then the factor analysis was continued with the number of extracted factors fixed at five. The Kaiser–Meyer–Olkin (KMO) criterion was used for the stepwise elimination or inclusion of items. A value of 0.6 or more is considered sufficient, a value of 0.7 or more satisfactory, a value of 0.8 or more good, and a value of 0.9 or more very good. Another criterion used was the increase in variance explained by the factors. The best-fitting five-factor model identified in this way formed the starting point for the CFA. The CFA was carried out using Jamovi.
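The KMO thresholds named above can be written as a small lookup. The function name and the label "insufficient" for values below 0.6 are assumptions; the text only defines the four ratings from 0.6 upward.

```python
def kmo_label(kmo):
    """Map a KMO value to the verbal rating used in the text."""
    if kmo >= 0.9:
        return "very good"
    if kmo >= 0.8:
        return "good"
    if kmo >= 0.7:
        return "satisfactory"
    if kmo >= 0.6:
        return "sufficient"
    # Assumption: the text gives no label for values below 0.6.
    return "insufficient"


# The value 0.854 is the KMO reported for the final EFA model in the Results.
rating = kmo_label(0.854)
```

Under these thresholds, the KMO of 0.854 reported in the Results falls into the "good" band.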
Evidence of construct validity would be realized if item-factor loadings for the nine indicator variables were large (i.e., > 0.50) and statistically significant (p < 0.05) and if the overall factor model fit the data well. The fit of the baseline CFA model was assessed using criteria suggested by Hu and Bentler (1998): comparative fit index (CFI; Bentler 1990) greater than 0.95, Tucker–Lewis fit index (TLI; Tucker and Lewis 1973) greater than 0.95, and root mean square error of approximation (RMSEA; Browne and Cudeck 1992; Steiger and Lind 1984) less than 0.06. Since the default model, according to which there is no covariance in the residuals of the items, had an insufficient model fit, covariance was accepted in steps, based on modification indices and theoretical considerations, until a sufficient model fit was achieved.
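The three cut-offs for global model fit can be expressed as a simple check. This is a sketch of the decision rule only; the fit indices themselves would come from the CFA software, and the function name is hypothetical.

```python
def meets_fit_criteria(cfi, tli, rmsea):
    """Return True if all three cut-offs named in the text are met:
    CFI > 0.95, TLI > 0.95, and RMSEA < 0.06."""
    return cfi > 0.95 and tli > 0.95 and rmsea < 0.06


# Values reported for the final CFA model in the Results section
final_model_ok = meets_fit_criteria(cfi=0.964, tli=0.956, rmsea=0.0388)

# Invented counter-example: CFI below the cut-off
weak_model_ok = meets_fit_criteria(cfi=0.94, tli=0.96, rmsea=0.05)
```

A model is accepted only if all three criteria hold at once, so a single index below its cut-off is enough to reject it.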

To assess the internal consistency of the two multi-item scales, Cronbach's alpha and McDonald's omega were calculated using a macro distributed by Hayes (Hayes and Coutts 2020). To assess the internal consistency of the five measures at the community level, the coefficient of internal consistency proposed by O'Brien (1990) was calculated to account for the two-level structure of the data. This reliability coefficient includes two variance components: the (estimated) “universal” variance and the observed variance. The first variance component includes the variance between aggregates. The observed variance refers to the dispersion of responses from individuals residing within each aggregate unit. The reliability coefficient is estimated by subtracting the mean square associated with individual-level responses within the aggregates (i.e., the residual mean sum of squares) from the mean square of the aggregates (i.e., the explained mean sum of squares). This number is then divided by the mean square of the aggregates. The resulting coefficient is a ratio between the estimated variance of the aggregate score and the observed variance of the score. Reliability thus depends on the number of respondents in each community and the variance within and between communities. This ratio is bounded between 0 and 1. The closer this reliability coefficient is to 1, the higher the reliability of the indicator at the aggregate level.
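The computation described verbally above can be sketched with the mean squares of a one-way analysis of variance. This is an illustrative reading of the description, assuming respondents nested in communities and no further facets; the function name and example data are invented, and the study's own calculation may differ in detail.

```python
from statistics import mean


def aggregate_reliability(groups):
    """Aggregate-level reliability from one-way ANOVA mean squares.

    groups: dict mapping community -> list of individual scores
    (hypothetical layout). Returns (MS_between - MS_within) / MS_between.
    """
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = mean(all_scores)
    k = len(groups)            # number of communities
    n_total = len(all_scores)  # number of respondents

    # Explained (between-community) and residual (within-community) sums of squares
    ss_between = sum(
        len(scores) * (mean(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    ss_within = sum(
        (x - mean(scores)) ** 2
        for scores in groups.values()
        for x in scores
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return (ms_between - ms_within) / ms_between


# Invented example: three communities with three respondents each
example = {"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]}
reliability = aggregate_reliability(example)
```

In the invented example the between-community mean square is 27 and the within-community mean square is 1, giving a coefficient of 26/27. With real data, a within-community mean square larger than the between-community mean square would yield a negative estimate.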

In order to assess whether key informants' ratings of their community have substantial intersubjective explanatory value, a measure of inter-rater agreement was calculated using the method proposed by Garner and Raudenbush (1991) and Sampson et al. (1997). For this, hierarchical linear models (HLM) are first used to calculate which parts of the variance in the measurements (measured constructs) are due to differences between respondents (individual level) and differences between communities (community level), respectively. A significance test performed by SPSS for this purpose can also be used to determine whether differences between communities contribute statistically significantly to the variance.

In a second step, the intraclass correlation coefficient (ICC) is then calculated from the two variance components by dividing the part of the variance attributable to differences between the communities by the total variance (sum of the variance attributable to both differences between the communities and differences between the respondents). Thus, the ICC indicates what percentage of the total variance in the measurements is attributable to differences between communities. Since the individual samples within the communities in the baseline survey show differences in the distribution of sociodemographic characteristics of the respondents, it was examined to what extent the part of the variance attributable to differences between the communities could be explained by sociodemographic differences in the composition of the community-specific samples. Therefore, aggregated variables were calculated for the sociodemographic characteristics (e.g., community-specific proportion of women among respondents, community-specific average age of respondents) and included in the HLM as macro regressors, i.e., as community-level characteristics (Baltes-Götz 2020).
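The ICC calculation from the two variance components can be sketched as follows. The variance components themselves would be taken from the fitted hierarchical linear model; the function name and the numbers used here are invented for illustration.

```python
def intraclass_correlation(var_between, var_within):
    """Share of total variance attributable to differences between communities.

    var_between: variance component at the community level.
    var_within: variance component at the respondent level.
    """
    total = var_between + var_within
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_between / total


# Invented variance components, not results from the study
icc = intraclass_correlation(var_between=0.05, var_within=0.20)
```

With these invented components, one fifth of the total variance would lie between communities.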

To assess concurrent construct validity, correlations between the five constructs were examined at the individual and community level. Pearson's correlation coefficient r was calculated for metric-scaled variables and Spearman's correlation coefficient rho for ordinal-scaled variables.
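The two correlation coefficients can be sketched in plain Python. This is a minimal illustration of the standard definitions, with Spearman's rho computed as Pearson's r on average ranks; it is not the SPSS procedure used in the study, and the example data are invented.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson's product-moment correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r applied to average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Invented data: monotone but non-linear relationship
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
r = pearson_r(x, y)
rho = spearman_rho(x, y)
```

Because the invented relationship is strictly monotone but not linear, rho equals 1 while r is slightly below 1, which illustrates why the rank-based coefficient is preferred for ordinal data.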

In addition, mean comparisons were used to examine whether respondents' ratings of their community differed systematically according to how they were rated by the interviewers in terms of their willingness to cooperate, trustworthiness, and how well they understood the questions. Since some groups here (e.g., respondents with low willingness to cooperate) had only small single-digit case numbers, only descriptive mean comparisons were made.

Results

Descriptive statistics

Table 2 describes the sample (n = 182) based on selected characteristics. Fifty-six percent of the respondents come from IC, 49% are female (no persons are non-binary), 91% have a university entrance qualification, 7% have a migration background, 54% live and 83% work in the community they were interviewed about, and 69% have acquired professional or voluntary experience in prevention or health promotion for children, adolescents, or families. The respondents are, on average, 49.6 years old and have lived in the community they were interviewed about for an average of 15.1 years.

Table 2 Sample description

Table 3 shows the means, standard deviations, and the number of missing values for the items of the five constructs. For 35 of these 42 items, the item non-response rate was in the single digits. For the construct adoption of science-based prevention, six of the 20 items had an item non-response rate between 11% and 26%. Imputation of missing values resulted in only minor changes (changes of no more than 0.01 units) in the mean and standard deviation of the items.

Table 3 Univariate statistics on the items of the five latent constructs

Table 4 shows mean values and standard deviations of the five constructs according to their measurement at the individual and community level. The standard deviations are larger at the individual level than at the community level. The values of constructs 1 and 5 have more variance than those of the other three.

Table 4 Means and standard deviations of the constructs by measurement level

Figure 3 displays the relative frequencies of the six stages of adoption of science-based prevention as reported by key informant respondents. A fixed algorithm was used to assign respondents to the stages based on information respondents provided about their community. One percent of respondents indicate that their community is in stage 0 (no prevention science awareness), 73% of respondents indicate that their communities are in stage 1 (prevention science awareness), 6% indicate that their community is in stage 2 (prevention science prevention planning), 3% of respondents indicate their community is in stage 3 (plus epidemiological analysis), 12% of respondents indicate their community is in stage 4 (plus data-driven allocation of prevention resources), and 5% of respondents indicate their community is in stage 5 (plus monitoring of interventions and replication of approach).

Fig. 3

Relative frequencies of respondents (n = 182) by stage of their community's adoption of science-based prevention

Figure 4 shows the absolute frequencies of how the 38 communities were assigned to the six stages of adopting science-based prevention. The assignment of communities to stages is based on the community-specific mean scores of the key informants surveyed, rounded to the nearest whole number (halves rounded up). Using this method, no community was assigned to stage 0 (no prevention science awareness), 18 were assigned to stage 1 (prevention science awareness), 14 were assigned to stage 2 (prevention science prevention planning), 4 were assigned to stage 3 (plus epidemiologic analyses), one was assigned to stage 4 (plus data-driven allocation of prevention resources), and one was assigned to stage 5 (plus monitoring of interventions and replication of approach).

Fig. 4

Absolute frequencies of communities (n = 38) by stage of adoption of science-based prevention

Structural validity

Based on the theoretical model (Fig. 1), a series of EFA and then a CFA were conducted to test the structural validity. An EFA with 44 indicators identifies 13 factors with an eigenvalue greater than 1 (online supplement 1). Factors 1, 2, 9, 10, and 12 represent the five constructs with minor imprecisions. An EFA with 21 indicators (Table 5) identifies five factors with an eigenvalue greater than 1. The KMO criterion for this model is 0.854. The communalities are above 0.5 for 17 of these 21 indicators and between 0.4 and 0.5 for the other four indicators. The five factors explain 60% of the variance in the indicators. The eight indicators of factor 1 represent the remaining items of the sectoral collaboration construct, the nine indicators of factor 2 represent the collaboration for prevention construct, the two indicators of factor 3 represent the support for prevention construct, and the two indicators for financial support and science-based prevention correspond to factors 4 and 5. Factor 2 corresponds to the multi-item scale used in the CYDS for the collaboration for prevention. Of factor 1, only four indicators correspond to the 10-item scale used in the CYDS for sectoral collaboration. In addition, factor 1 includes four indicators that do not appear in the CYDS 10-item scale.

Table 5 Rotated component matrix of an EFA

A CFA was carried out with the five factors of the final EFA model. All indicators load statistically highly significantly (p < 0.001) on their factor. Seventeen of the 19 indicators have a factor loading greater than 0.5 (Table 6). For the other two indicators, the factor loading is between 0.36 and 0.49. The global fit indices also support this model (CFI = 0.964; TLI = 0.956; RMSEA = 0.0388). The model assumes correlations between the factors (online supplement 2) and covariance of the residuals of certain indicators (online supplement 3).

Table 6 Factor loadings

Internal consistency and inter-rater agreement

Table 7 contains the results of the reliability analyses for the five constructs. The first two columns show measures of the internal consistency of the scales (Cronbach's alpha and McDonald's omega). These values range from 0.86 to 0.87, indicating good internal consistency for these scales. At this point, it should be recalled that the constructs do not represent characteristics of the respondents, but of the communities for which they were interviewed as experts. However, the internal consistency expressed by the first two coefficients does not take this two-level data structure into account. Therefore, an additional measure of the reliability of the measurements at the community level is reported in the third column. Since this measure can be calculated for scales, indices, and individual items, values are shown for all five constructs. The last column reports a measure of inter-rater agreement. The values are from unadjusted HLM. In addition, we ran adjusted HLM, with the result that the tested confounders (proportion of women, average age, proportion living in the community, and proportion working in the community) were not relevant to the models. Accordingly, differences in sample composition between the communities do not affect the reliability of the measurement.
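The two kinds of coefficients in Table 7 can be sketched as follows. Cronbach's alpha treats respondents as the unit of analysis, whereas an intraclass correlation relates the variance between communities to the total variance. The sketch below uses the one-way ANOVA estimator ICC(1) as an approximation; the values in Table 7 come from HLM, so this is an illustration under that assumption and not the procedure used in the study. The function names are our own.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha; rows = respondents, columns = items of one scale."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

def icc1(scores, groups):
    """ICC(1) from one-way ANOVA mean squares, allowing unequal group sizes."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    n_total, g = scores.size, labels.size
    sizes = np.array([np.sum(groups == lab) for lab in labels])
    means = np.array([scores[groups == lab].mean() for lab in labels])
    ms_between = np.sum(sizes * (means - scores.mean()) ** 2) / (g - 1)
    ss_within = sum(np.sum((scores[groups == lab] - m) ** 2)
                    for lab, m in zip(labels, means))
    ms_within = ss_within / (n_total - g)
    # Adjusted average group size for unbalanced designs
    k0 = (n_total - np.sum(sizes ** 2) / n_total) / (g - 1)
    return (ms_between - ms_within) / (ms_between + (k0 - 1) * ms_within)

# Tiny hypothetical examples
alpha = cronbach_alpha([[1, 1], [2, 2], [3, 3]])        # two identical items
icc_full = icc1([1, 1, 2, 2, 3, 3], [0, 0, 1, 1, 2, 2])  # raters agree within groups
icc_none = icc1([1, 3, 1, 3], [0, 0, 1, 1])              # no between-group variance
print(alpha, icc_full, icc_none)
```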

Table 7 Results of the reliability analyses

Concurrent construct validity

Table 8 shows the results of correlation analyses among the five constructs. Following the findings of Brown et al. (2008), we examined whether the constructs (1) science-based prevention, (2) prevention collaboration, and (3) intersectoral collaboration are statistically significantly positively correlated with each other at both the community and individual levels. The results show highly significant correlations at both levels. We recognize that theoretical justifications can also be made as to why the constructs (4) financial support and (5) community support could or should be correlated with each other and with the other three constructs. According to our data, however, constructs 4 and 5 are not correlated with each other and show correlations only with some of the other three constructs.
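The distinction between the two levels of analysis can be made concrete with a short sketch. It assumes that community-level correlations are computed on community means of the individual scores, which is one common approach; the section does not state how the community-level values in Table 8 were obtained. The data and the function name `two_level_correlation` are hypothetical.

```python
import numpy as np

def two_level_correlation(x, y, community):
    """Pearson correlation of x and y at the individual level and at the
    community level (correlation of community means)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    community = np.asarray(community)
    r_individual = np.corrcoef(x, y)[0, 1]
    labels = np.unique(community)
    x_mean = np.array([x[community == c].mean() for c in labels])
    y_mean = np.array([y[community == c].mean() for c in labels])
    r_community = np.corrcoef(x_mean, y_mean)[0, 1]
    return r_individual, r_community

# Hypothetical scores of six respondents from three communities
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]
community = [0, 0, 1, 1, 2, 2]
r_ind, r_com = two_level_correlation(x, y, community)
print(round(r_ind, 3), round(r_com, 3))
```

In this toy example, disagreement between respondents within a community lowers the individual-level correlation, while the community means are perfectly aligned.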

Table 8 Construct validity at the individual and community level

Table 9 shows mean values for the five constructs (individual level) differentiated according to how the interviewers rated the respondents' willingness to cooperate, trustworthiness, and understanding. Inferential statistical analysis of mean differences was not possible because the number of cases in which respondents were rated as having a low willingness to cooperate, low trustworthiness, or low understanding was too small. Slight descriptive mean differences should therefore be interpreted conservatively. We looked for linear patterns and, where they were found, examined whether they consistently pointed in the same direction for all five constructs and for all three interviewer ratings. Considering the score for science-based prevention separately by respondents' willingness to cooperate, a linear pattern emerges: the lower the respondents' willingness to cooperate, the lower the mean score for science-based prevention. However, this pattern does not emerge for the other four constructs. Moreover, when science-based prevention is differentiated by respondents' understanding, the gradient points in the opposite direction. The differences in the mean scores of the groups are mostly small and do not show a consistent pattern. It therefore stands to reason that the descriptive differences in mean values considered here are exclusively or at least mainly due to random fluctuations, which can be correspondingly large when the number of cases is small.

Table 9 Mean values of the constructs (individual level) according to interviewer ratings on respondents' willingness to cooperate, trustworthiness, and understanding

Discussion

This validation study is embedded in the CTC-EFF study evaluating the effectiveness of the Communities That Care community-based prevention system in Germany. The CTC-EFF is an extensive replication of the US Community Youth Development Study (CYDS), which has been ongoing as a cluster randomized controlled trial since 2003. In contrast to this design, the CTC-EFF is a non-randomized controlled trial (Röding et al. 2021). For the CTC-EFF, the assessment instruments from the CYDS were translated into German and adapted to the German context. In the current validation study, analyses were conducted on the structural validity, internal consistency, and concurrent construct validity of five community capacity measures collected as part of the baseline of the CTC-EFF using the German version of the Community Key Informant Interview (CKI).

Interpretation of the results

For our univariate results, only partially comparable figures have been published from the CYDS and its Dutch replication. For the adoption of science-based prevention, we find very similar values in our data (M = 1.65; SD = 1.26) to those reported by Brown et al. (2014) for the CYDS control group in 2004 (M = 1.65; SD = 0.93). Moreover, according to our data, 16% of respondents are from a community that is at least in stage 4. Steketee et al. (2009) reported that 20% of the control communities of the CYDS in 2007 (4.5 years after the introduction of CTC) and 11% of the control communities of the Dutch replication study in 2011 (3 years after the introduction of CTC) were at least in stage 4. Brown et al. (2008) reported values of M = 1.97 (SD = 0.57) for the intersectoral collaboration construct for the year 2001. For Germany, we find values of M = 2.52 (SD = 0.62) for this construct. Brown et al. (2011) reported a value of about 41% for financial support for prevention in 2001. We find a value of 51%. Overall, we find only minor differences in the point estimates and standard deviations.

Regarding the structural validity of the four interval-scaled constructs, our EFA results show, first, that these are four distinct constructs. Second, the results show that the sectoral collaboration construct in our data consists of three subfactors. The results of the CFA on the two multi-item scales, collaboration for prevention and intersectoral collaboration for prevention, further confirm that these scales represent two distinct constructs. Additionally, the results show that a two-factor model fits our data best. However, the CFA also reconfirms that the intersectoral collaboration factor consists of three subfactors. Our results are broadly consistent with the theoretical model of Brown et al. (2011). Yet, in our model, the sectors included in the sectoral collaboration construct differ from those included in the original model from the Community Youth Development Study. We believe that these differences reflect national differences in prevention practice. For example, businesses and religious organizations in Germany are hardly ever systematically involved in local prevention. In contrast, social work services, school social workers, and drug counseling centers in Germany are regularly and systematically involved in prevention networks. It is also noteworthy that factor 3 includes only two of the four indicators adopted from the CYDS to measure support for prevention. It should be pointed out once again that, due to a misunderstanding, this construct is only incompletely included in the German version of the CKI.

Our analyses of the internal consistency and reliability of the five constructs show that the two multi-item scales have good internal consistency, with Cronbach's alpha of 0.86 and 0.87 and McDonald's omega of 0.86 and 0.87 at the individual level. In the CYDS (Brown et al. 2011), Cronbach's alpha values of 0.77 and 0.92 were reported. For science-based prevention, community-level reliability coefficients (O'Brien 1990) reported from the CYDS ranged from 0.71 to 0.75 (Arthur et al. 2001). We find a value of 0.62 for our survey. Since this coefficient is highly dependent on the sample size within communities, and our average cluster size of 4.8 was considerably smaller than those of Arthur et al. (2001) at 6.8 and 9.9, respectively, we attribute the difference in this coefficient to the smaller cluster size. According to O'Brien (1990), a value between 0.6 and 0.7 corresponds to moderate reliability. We found reliability coefficients for the other four constructs ranging from 0 to 0.30 at the community level. No comparable values are available from the CYDS. Arthur et al. (2001) further report an inter-rater agreement for science-based prevention of ICC = 0.23 and ICC = 0.27. For the CTC-EFF, we find an ICC of 0.32 in this regard. No corresponding measures are reported from the CYDS for the other four constructs. For the CTC-EFF, we find ICCs ranging from 0.02 to 0.21 for the other constructs. Some studies that surveyed key informants in communities found similar values ranging from 0 to 0.21 (Garner and Raudenbush 1991; Sampson et al. 1997; Shapiro et al. 2015). With respect to the partially low values we found for community-level reliability and inter-rater agreement, it should be noted that the effects of CTC on risk and protective factors and problem behaviors found in the CYDS are exclusively and completely mediated by the adoption of science-based prevention in the multivariate model (Brown et al. 2014). That is, the science-based prevention construct is likely to be most relevant to the CTC-EFF. For this construct, we found relatively good values in terms of reliability and inter-rater agreement.
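The dependence of community-level reliability on cluster size can be illustrated with the Spearman-Brown relation, which expresses the reliability of a mean of k raters as a function of the ICC. This is a simplified sketch under the assumption of equal cluster sizes; it is not the coefficient of O'Brien (1990) used in this study, so the resulting values need not match those reported above. The function name `aggregate_reliability` is our own.

```python
def aggregate_reliability(icc, cluster_size):
    """Spearman-Brown reliability of a cluster mean based on
    `cluster_size` raters with single-rater agreement `icc`."""
    return cluster_size * icc / (1.0 + (cluster_size - 1.0) * icc)

# Holding the ICC fixed at 0.32 and varying only the average cluster size
# (4.8 in the CTC-EFF; 6.8 and 9.9 in Arthur et al. 2001)
reliabilities = [aggregate_reliability(0.32, k) for k in (4.8, 6.8, 9.9)]
print([round(r, 2) for r in reliabilities])
```

With a fixed ICC, the reliability of the community mean increases with the number of respondents per community, which is the mechanism invoked above to explain the lower coefficient in our data.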

Correlations between science-based prevention, prevention collaboration, and intersectoral collaboration were analyzed to test for concurrent construct validity. We find correlations between 0.24 and 0.27 at the individual level and between 0.30 and 0.42 at the community level. Brown et al. (2008) reported correlations between 0.26 and 0.37 at the individual level and between 0.64 and 0.75 at the community level. Although our data also show positive correlations between these three constructs, they are systematically weaker than in the CYDS. First, this could be partly because these constructs could be measured with slightly higher reliability in the CYDS. Second, contextual differences that are currently unclear to us could also explain these differences in correlation strength. Since theoretical justifications could be made that the other two constructs should also correlate positively with each other and with the other three constructs, we also tested for this. However, the results are less consistent here, although correlations of up to 0.33 at the community level and 0.18 at the individual level are found. Most importantly, the negative correlation between community support and science-based prevention does not fit the pattern. This should be investigated in more detail in the further course of the CTC-EFF. We also examined whether our scores on the five constructs were statistically related to how cooperative, trustworthy, and understanding our interviewers rated the respondents. We found no evidence of such relationships.

Because of some differences between the USA and Germany, we adapted some items to the German context. We found no evidence in our results that our measurements were affected by these adaptations. Previous research (Repke and Dorer 2021) indicates that such adaptations are beneficial.

Overall, the results of this validation study indicate that the German version of the CKI is able to measure three important constructs of community capacity for prevention in a valid way. Thus, a new validated instrument for measuring community capacity is available for Germany. This is important as only one instrument for measuring community capacity (Nickel et al. 2013; Nickel et al. 2018) and one for community readiness (Gansefort et al. 2020) were previously available for Germany. Quantitative measurement of these constructs is an essential complement to corresponding qualitative assessment methods (Kehl et al. 2021; Wihofszky et al. 2021). Valid instruments for measuring community capacity are needed for effectiveness studies and for testing causal models of approaches to community-based prevention and health promotion.

Limitations

Some limitations of this study are worth highlighting. First, as our study was cross-sectional, test-retest reliability and predictive validity of the CKI could not be ascertained. To investigate these properties, longitudinal research designs should be considered. Second, the target sample size of 10 could not be achieved in all communities. Although the sample size (n = 182) was considered sufficient for the analyses conducted, a larger sample size would have provided more reliable results. Because of the small sample size, it was also not possible to use multiple imputation procedures for missing data (Donner and Klar 2000; Campbell et al. 2007; Wears 2002). The imputation procedure that was used takes the cluster structure of the data into account but not the characteristics of the respondents. However, since the measured constructs are characteristics of communities and not of individuals, any resulting bias is likely to be minor. Third, the factor analyses do not yet consider the multilevel structure of the data. Corresponding analyses are planned. However, based on the results of Brown et al. (2008), we assume that they will not lead to substantially different results from those presented in this article. Furthermore, communities were not randomly selected, which may limit the generalizability of the results. Thus, our point estimates may not be generalizable to communities in other contexts. However, we would like to point out that this was not the aim of this study. Moreover, we note that survey data from key informants are widely used in community research (Brown et al. 2014; Shinn 1990).

Conclusion

With the current German version of the CKI, three important community capacity constructs can be validly measured. A revision is being conducted for the other two constructs. The instrument is available to non-commercial prevention research free of charge. Thus, this study makes a significant contribution to German prevention research.