Equity-specific re-analysis strategy
The equity-specific re-analysis strategy comprises harmonizing the choice and definitions of outcomes (step 1), exposures (step 2), socio-demographic indicators (step 3), and statistical analysis strategies (step 4) across studies by defining common criteria, as well as synthesizing the results (step 5). The following sections describe the individual steps of the strategy in detail and show how to adapt them to existing study data. To do so, we present the criteria for harmonizing and synthesizing results as defined for our convenience sample of PA intervention studies.
Step 1: harmonizing the choice and definition of outcome measures across studies
The first step includes choosing an outcome measure that adequately reflects the objectives of the kind of intervention under study and that can be defined as similarly as possible across studies. Health-promoting behaviors such as PA need to be maintained for long-term health benefits [55, 56]. Moreover, it has been shown that inequalities may initially increase after implementation of new interventions before decreasing again as time passes. Therefore, in order to draw conclusions about inequalities in long-term health benefits, both short-term and long-term outcomes of the interventions should be considered where data permit.
For our sample of PA intervention studies, we identified weekly minutes of moderate-to-vigorous PA (MVPA) at the post-intervention follow-up time point closest to the intervention end point (T1) as the primary outcome because it could be defined in a similar manner across the studies and the beneficial effects of MVPA on health are well documented. For the five studies with a further follow-up assessment, weekly minutes of MVPA at that next assessment (T2) was chosen as the secondary outcome to investigate potential changes in equity-specific intervention effects over time. This was 8 months post-intervention for Active Plus I and Active Plus II, 9 months post-intervention for PACE-Lift and PACE-UP, and 6 months post-intervention for ProAct65+. Because of their better precision and accuracy, we preferred objective PA measures over subjective measures when both were available in a study. In Active Plus I, Active Plus II, Every Step Counts!, GALM, and ProAct65+, which measured PA exclusively subjectively, physical activities of at least three metabolic equivalents (MET) were defined as MVPA, following guideline recommendations. In PACE-Lift, PACE-UP, and PROMOTE, which measured PA objectively, the standard Freedson cut-point of 1952 counts per minute, equivalent to three METs, was used to define MVPA. In addition to the main outcome, total weekly minutes of MVPA, sensitivity analyses were conducted for PACE-Lift and PACE-UP using weekly minutes of MVPA in bouts of at least 10 min.
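As an illustration of the objective MVPA definition, the following minimal Python sketch counts minutes at or above the Freedson cut-point, in total and in bouts of at least 10 minutes. It assumes one-minute accelerometer epochs and strictly uninterrupted bouts; the function names are hypothetical, and wear-time validation and scaling to a full week are deliberately left out. It is not the processing pipeline used in the contributing studies.

```python
FREEDSON_MVPA_CPM = 1952  # cut-point from the text (counts per minute, about 3 METs)


def mvpa_minutes(counts_per_minute, cutpoint=FREEDSON_MVPA_CPM):
    """Total MVPA: number of one-minute epochs at or above the cut-point."""
    return sum(1 for c in counts_per_minute if c >= cutpoint)


def mvpa_minutes_in_bouts(counts_per_minute, cutpoint=FREEDSON_MVPA_CPM, min_bout=10):
    """MVPA in bouts: minutes that belong to runs of at least `min_bout`
    consecutive epochs at or above the cut-point (assumption: no interruptions
    are tolerated within a bout)."""
    total = 0
    run = 0
    for c in counts_per_minute:
        if c >= cutpoint:
            run += 1
        else:
            if run >= min_bout:
                total += run
            run = 0
    if run >= min_bout:  # close a bout that lasts until the end of the record
        total += run
    return total


# Hypothetical record: a 12-minute bout, one sedentary minute, then 6 MVPA minutes.
example = [2000] * 12 + [100] + [2500] * 5 + [1952]
print(mvpa_minutes(example), mvpa_minutes_in_bouts(example))
```

In this made-up record, all 18 active minutes count toward total MVPA, but only the first 12 count toward MVPA in bouts, which is why the two outcome definitions can diverge.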
Step 2: harmonizing the choice and definition of exposure measures across studies
Studies of interventions may differ with regard to the number of intervention and control groups. Step two includes choosing an exposure measure that can be defined as similarly as possible across studies.
For our sample of PA intervention studies, any versus no intervention was defined as the exposure. In Active Plus I, Active Plus II, PACE-UP, ProAct65+, and PROMOTE, which included several intervention groups, the intervention groups were combined to create a single pair-wise comparison in order to avoid double-counting. The Cochrane Handbook for Systematic Reviews of Interventions recommends this approach for including studies with several intervention groups in a meta-analysis.
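A minimal Python sketch of this harmonization step follows. With participant-level data, combining arms amounts to recoding every intervention arm to a single exposure value. When only summary statistics are available, the Cochrane Handbook gives a formula for merging two groups into one, shown in the second function. The function names and the label "control" are assumptions made for illustration.

```python
import math


def any_vs_none(arm_label, control_label="control"):
    """Recode a study arm to the harmonized exposure: 1 = any intervention, 0 = control."""
    return 0 if arm_label == control_label else 1


def combine_two_groups(n1, mean1, sd1, n2, mean2, sd2):
    """Merge the sample size, mean, and SD of two groups into one group
    (formula for combining groups from the Cochrane Handbook)."""
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    variance = (
        (n1 - 1) * sd1 ** 2
        + (n2 - 1) * sd2 ** 2
        + (n1 * n2 / n) * (mean1 - mean2) ** 2
    ) / (n - 1)
    return n, mean, math.sqrt(variance)


# Hypothetical arm labels, not the labels used in any contributing study.
print([any_vs_none(a) for a in ["control", "basic", "environment"]])
```

The merged SD equals the SD that would be obtained from the pooled participant-level data, so either route leads to the same single pair-wise comparison.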
Step 3: harmonizing the choice and definition of socio-demographic indicators across studies
Step three includes harmonizing the choice and definition of socio-demographic indicators, which should be based on existing theories and evidence of equity-specific intervention effects. Several different socio-demographic indicators might be relevant. The PROGRESS-Plus framework, proposed by the Campbell and Cochrane Equity Methods Group, may help researchers identify the socio-demographic indicators relevant to their specific research question. SEP should be considered a multidimensional construct comprising diverse socio-economic indicators at the individual, household, or contextual level [64,65,66,67]. Because different indicators of SEP operate through different causal pathways and may have different relevance among individuals of varying age and gender [64,65,66,67], the choice of SEP indicator may affect findings about the presence and extent of equity-specific intervention effects. It is therefore important to consider, and clearly differentiate between, various relevant SEP indicators instead of focusing on one indicator only or using several SEP indicators interchangeably. Moreover, potential intersections between several socio-demographic indicators [68, 69], such as gender and SEP, should be considered. Applying such an intersectionality lens to the re-analysis of intervention study data, where sample size and diversity permit, could yield even more comprehensive insights into the impact of these interventions on health inequalities.
For our sample of PA intervention studies, education as a measure of SEP [64,65,66,67] and gender (only defined as female versus male) as a social construct [70, 71] were selected as the main socio-demographic indicators because both characteristics have previously been shown to moderate the effects of PA interventions [26,27,28], information on both was available in all collaborating studies, and both can be operationalized in a similar manner across studies from different countries. Education was defined according to the International Standard Classification of Education (ISCED) 2011. Based on the highest level of educational qualification or age at leaving full-time education, individuals were grouped into the categories “Low” (at most lower secondary education (ISCED 0–2) or leaving full-time education at ≤16 years), “Medium” (upper secondary and post-secondary non-tertiary education (ISCED 3–4) or leaving full-time education at 17–18 years), or “High” (tertiary education (ISCED 5–8) or leaving full-time education at ≥19 years).
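The education categories above can be expressed as a simple mapping. The Python sketch below is illustrative only: the function name and arguments are hypothetical, age is assumed to be recorded in whole years, and giving the ISCED level precedence when both pieces of information are present is an assumption rather than a rule stated in the strategy.

```python
def education_category(isced_level=None, age_left_education=None):
    """Map an ISCED 2011 level (0-8) or the age at leaving full-time education
    (whole years) to the harmonized categories "Low", "Medium", or "High".
    Returns None when neither piece of information is available."""
    if isced_level is not None:
        if not 0 <= isced_level <= 8:
            raise ValueError("ISCED 2011 levels range from 0 to 8")
        if isced_level <= 2:
            return "Low"      # at most lower secondary education
        if isced_level <= 4:
            return "Medium"   # upper secondary, post-secondary non-tertiary
        return "High"         # tertiary education
    if age_left_education is not None:
        if age_left_education <= 16:
            return "Low"
        if age_left_education <= 18:
            return "Medium"
        return "High"
    return None


print(education_category(isced_level=3), education_category(age_left_education=19))
```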
In a secondary analysis, income and area deprivation as measures of SEP [64,65,66,67] were considered. Information on household income was available in two studies (ProAct65+, PROMOTE), and information on area deprivation (index of multiple deprivation [IMD] score) was available in three studies (PACE-Lift, PACE-UP, ProAct65+). For both of these indicators, tertiles were defined separately in each study, based on the distribution in that study’s data set. This resulted in two variables, one for household income and one for area deprivation, each with the categories “Low”, “Medium”, and “High” (see Additional file 2 for details). Additionally, marital status (defined as having versus not having a partner) was considered as a socio-demographic indicator because the presence or absence of a spouse has been shown to be associated with health inequalities and PA [10, 74].
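Study-specific tertiles can be derived as in the following Python sketch, which computes the two cut-points separately for each study and labels every value relative to its own study's distribution. The quantile method (the default of `statistics.quantiles`) and the handling of values that fall exactly on a cut-point are assumptions; the original analysis may have used a different convention. The labels describe the level of the indicator itself, so "High" area deprivation denotes the most deprived third.

```python
import statistics


def tertile_labels(values):
    """Label each value "Low", "Medium", or "High" using the tertile
    cut-points of the supplied values (one study at a time)."""
    lower_cut, upper_cut = statistics.quantiles(values, n=3)
    labels = []
    for v in values:
        if v <= lower_cut:
            labels.append("Low")
        elif v <= upper_cut:
            labels.append("Medium")
        else:
            labels.append("High")
    return labels


def tertiles_by_study(values_by_study):
    """Apply the tertile split within each study's own data set."""
    return {study: tertile_labels(vals) for study, vals in values_by_study.items()}


# Hypothetical values on two different scales; each study gets its own cut-points.
print(tertiles_by_study({"study_a": [1, 2, 3, 4, 5, 6, 7, 8, 9],
                         "study_b": [90, 10, 50, 20, 80, 40, 30, 70, 60]}))
```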
Although the effects of PA interventions may also differ between individuals of different ethnic backgrounds, we did not consider ethnicity as a socio-demographic indicator due to differing ethnic compositions in the study populations and limited data availability. Potential intersections between several socio-demographic indicators were also not considered because of small sample sizes and insufficient diversity.
Step 4: harmonizing the choice and definition of statistical analysis strategies across studies
Step four comprises specifying the statistical methods and modeling strategies for the equity-specific effect analyses. Intervention reach, adherence, and dropout, and not only intervention effects, may differ by socio-demographic characteristics and should therefore be considered for a comprehensive assessment of equity-specific intervention benefits [15, 75].
Equity-specific intervention reach
In our sample of PA intervention studies, the majority lacked information on socio-demographic indicators for non-participants. This precluded the calculation of socio-demographic group-specific response rates [76, 77], so it was not possible to investigate equity-specific intervention reach. We originally aimed to consult census data and to compare the study population with the targeted population of each study, considering the studies’ specific eligibility criteria. However, as no suitable census data could be identified, we decided to calculate an overall response percentage, defined as the number of persons who completed the baseline (T0) questionnaire and were assigned to the intervention conditions, divided by the number of persons invited to participate. For Every Step Counts! and PROMOTE, only estimations of response percentages could be made because the recruitment strategies comprised advertising. For each study, the distribution of gender, education, income, area deprivation, and marital status groups as well as the mean age in the intervention and control groups at T0 were calculated.
Equity-specific intervention adherence and dropout
We calculated percentages and means to describe adherence and dropout stratified by socio-demographic indicators. Information on intervention adherence was available in Active Plus II, GALM, PACE-UP, and PROMOTE, relating to the use of intervention materials and/or attendance at group meetings. We defined dropouts as individuals with valid information on MVPA at T0 but without valid information at T1. Additionally, we calculated mean values and corresponding standard deviations (SD) of weekly minutes of MVPA at T0 for each subgroup of interest, stratified by intervention and control group, as well as by completers and dropouts.
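The dropout definition can be made concrete with a short Python sketch. It assumes each participant is a dictionary in which missing or invalid MVPA values are stored as `None`; the field names `mvpa_t0` and `mvpa_t1` and the function name are hypothetical.

```python
from collections import defaultdict


def dropout_percent_by_group(records, group_key):
    """Percentage of dropouts per subgroup. A dropout has valid MVPA at T0
    but no valid MVPA at T1; participants without valid T0 data are not
    part of the denominator."""
    counts = defaultdict(lambda: [0, 0])  # subgroup -> [dropouts, participants]
    for r in records:
        if r["mvpa_t0"] is None:
            continue
        tally = counts[r[group_key]]
        tally[1] += 1
        if r["mvpa_t1"] is None:
            tally[0] += 1
    return {g: 100.0 * d / n for g, (d, n) in counts.items()}


# Hypothetical participants, for illustration only.
sample = [
    {"gender": "female", "mvpa_t0": 120, "mvpa_t1": 150},
    {"gender": "female", "mvpa_t0": 90, "mvpa_t1": None},
    {"gender": "male", "mvpa_t0": 60, "mvpa_t1": None},
    {"gender": "male", "mvpa_t0": 200, "mvpa_t1": 180},
    {"gender": "male", "mvpa_t0": None, "mvpa_t1": 100},
]
print(dropout_percent_by_group(sample, "gender"))
```

The same function can be applied once per socio-demographic indicator and once per study arm to obtain the stratified percentages described above.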
General and equity-specific intervention effects
The general intervention effect was defined as the difference between the intervention and control groups in minutes of MVPA per week at T1 (main analysis) or T2 (secondary analysis). For this purpose, post-intervention values of weekly minutes of MVPA were regressed on intervention versus control group and minutes of MVPA per week at T0 without (minimally adjusted model) and with adjustment for age in years, gender, and education (fully adjusted model). Due to the nature of the data, in four studies, the models were additionally (multilevel-)adjusted for practice (PACE-Lift, PACE-UP, ProAct65+); household (PACE-Lift, PACE-UP); or community, valid wear-time, and season (PROMOTE). All analyses were conducted by intention-to-treat, analyzing participants according to the group to which they were originally assigned, restricting the models to individuals with complete data on all variables included (i.e., complete case intention-to-treat analysis).
Equity-specific intervention effects were investigated by adding intervention*socio-demographic indicator interaction terms to the regression models. For analyzing equity-specific intervention effects by gender, for example, post-intervention values of weekly minutes of MVPA were regressed on intervention versus control group, MVPA per week at T0, age in years, gender, and the intervention*gender interaction without (minimally adjusted model) and with adjustment for education and the intervention*education interaction (fully adjusted model). Because age is associated with most of the socio-demographic indicators and with PA levels, we decided to include it as a covariate in all models. For each model, the p-values for the interaction terms and the effect estimates with corresponding 95% confidence intervals (CI) for each subgroup of interest were computed. Following Greenland et al., precise p-values were reported.
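For orientation, the minimally adjusted gender model described above can be written out as follows. The notation is ours and not taken from the original analysis: $Y_i^{T1}$ and $Y_i^{T0}$ denote weekly minutes of MVPA at follow-up and baseline, $I_i$ indicates intervention (1) versus control (0), and $G_i$ indicates gender, with the coding 1 = female assumed here for illustration.

```latex
\begin{align}
Y_i^{T1} &= \beta_0 + \beta_1 I_i + \beta_2 Y_i^{T0} + \beta_3\,\mathrm{age}_i
            + \beta_4 G_i + \beta_5 \,(I_i \times G_i) + \varepsilon_i \\
\text{intervention effect for } G_i = 0 &:\ \beta_1 \\
\text{intervention effect for } G_i = 1 &:\ \beta_1 + \beta_5
\end{align}
```

Under this coding, $\beta_5$ is the difference in intervention effect between the two gender groups in minutes of MVPA per week, and its p-value is the interaction test. The fully adjusted model adds indicator variables for education and their products with $I_i$ to the right-hand side.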
Step 5: synthesizing the results
The last step includes synthesizing the results from the individual studies. Meta-analysis is the preferable method because it can increase the power for detecting equity-specific intervention effects, which is often limited in post-hoc analysis [33, 34]. If the number of studies permits, meta-regression should be used to investigate possible sources of heterogeneity (e.g., study quality, study design). If the sample of studies is highly heterogeneous and the data can hardly be harmonized to enable meta-analysis, there are alternative approaches to synthesize and visualize the equity-specific results of individual studies, such as the harvest plot.
In our homogeneous sample of PA intervention studies, after data had been harmonized, the estimates for the regression coefficients of the intervention*socio-demographic indicator interactions from the individual studies were pooled using random-effects meta-analysis. To be able to assess the direction of these interaction effects, in particular for any disadvantage experienced by the most disadvantaged groups, regression models were slightly modified. Education, income, and area deprivation were considered as variables with two (low versus medium/high education and income, high versus medium/low deprivation) instead of three categories resulting in one regression coefficient for each intervention*socio-demographic indicator interaction. This means that for all studies, the socio-demographic indicators were comparable in measurement and levels.
Analyses were conducted in R using the metafor package. As effect size, we chose the point estimates of the intervention*socio-demographic indicator interactions in minutes. A random-effects model was fitted using the DerSimonian and Laird method. The extent of heterogeneity was measured by the I² index. Following Higgins et al., I² values of 25, 50, and 75% were considered low, moderate, and high heterogeneity, respectively. The intervention*socio-demographic indicator interaction effect estimates and their corresponding 95% CI were presented in forest plots. Since some studies used different numbers of predictors, a sensitivity analysis was conducted estimating partial correlation coefficients. Meta-regression was deemed inappropriate due to the low number of studies.
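To make the pooling step transparent, the following Python sketch implements the standard DerSimonian and Laird estimator together with the I² index. It is a plain re-implementation for illustration and not the metafor code used in the analysis. The function names are hypothetical, standard errors are assumed to be recoverable from symmetric 95% CIs with a normal quantile of 1.96, and the input values in the example are invented.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper, z=Z_95):
    """Recover a standard error from a symmetric confidence interval."""
    return (upper - lower) / (2.0 * z)


def dersimonian_laird(estimates, std_errors, z=Z_95):
    """Random-effects pooling with the DerSimonian-Laird tau^2 estimator."""
    if len(estimates) != len(std_errors) or len(estimates) < 2:
        raise ValueError("need at least two studies with matching inputs")
    k = len(estimates)
    w = [1.0 / se ** 2 for se in std_errors]          # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))  # Cochran's Q
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)                     # between-study variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled": pooled,
        "se": se_pooled,
        "ci": (pooled - z * se_pooled, pooled + z * se_pooled),
        "tau2": tau2,
        "i2": i2,
        "q": q,
    }


# Invented interaction estimates (minutes per week) with 95% CIs, not study results.
cis = [(-30.0, 50.0), (-10.0, 40.0), (-60.0, 20.0)]
ests = [10.0, 15.0, -20.0]
print(dersimonian_laird(ests, [se_from_ci(lo, hi) for lo, hi in cis]))
```

When the observed Q statistic does not exceed its degrees of freedom, the estimated between-study variance is truncated at zero and the result coincides with a fixed-effect inverse-variance analysis.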
Risk of bias assessment
Whichever method is chosen to synthesize the results, a risk of bias assessment should be conducted. There is no specific tool for assessing the risk of bias in a result from equity-specific effect analysis. For our sample of studies, we therefore decided to assess the risk of bias regarding the general intervention effects, using the revised Cochrane risk-of-bias tool for randomized trials (RoB 2.0) and the ROBINS-I risk-of-bias tool for non-randomized studies of interventions. Each study was assessed independently by at least one researcher from the contributing study (FB, TH, SI, RM, SM, DP, MS, JV) and one researcher from the EQUAL project team (GC). Journal article(s), the published re-analysis strategy, and internal knowledge about the study were used to inform the assessment. Any discrepancies were resolved through discussion and, where necessary, by consulting the last author (GB).
Application of the equity-specific re-analysis strategy
The following sections illustrate the application of the equity-specific re-analysis strategy. To do so, we present the results from applying the criteria for adapting the strategy set out above to our convenience sample of PA intervention studies.
Risk of bias within studies
Regarding the general intervention effects, the randomized studies PACE-Lift and PACE-UP were judged to be at low risk of bias, and Active Plus I, Active Plus II, GALM, ProAct65+, and PROMOTE at high risk (Table 1). The non-randomized study Every Step Counts! was judged to be at serious risk (Table 2). The high/serious risks resulted from non-concealed randomization sequences, differing proportions of missing outcome data in the intervention and control groups, and/or participant-reported outcome measures. Further details are available in Additional file 3.
Response percentages and baseline socio-demographic characteristics
Calculated response percentages were 6% in ProAct65+, 10% in PACE-UP, 12% in GALM, 16% in Active Plus II, 23% in Active Plus I, and 30% in PACE-Lift. Response percentages of PROMOTE and Every Step Counts! were estimated to be 7% and 80%, respectively. Some differences existed between the studies regarding the socio-demographic composition of their baseline samples (Table 3). Most studies had slightly higher percentages of females, ranging from 51% in Active Plus I to 68% in Every Step Counts! (mean = 58%). There was great variation in the proportion of low-educated participants, ranging from 2% in PROMOTE to 56% in Every Step Counts! (mean = 38%). The percentages of participants without a partner ranged from 18% in Active Plus II to 42% in ProAct65+ (mean = 26%).
Equity-specific intervention adherence
Results of Active Plus II, GALM, PACE-UP, and PROMOTE with information on intervention adherence indicated no or only slight differences across gender and education subgroups, with no consistent pattern regarding the direction of differences (Table 4). For example, in GALM, slightly higher mean attendance rates of the 15 intervention sessions were observed among low educated participants. In PACE-UP, PA diary return and pedometer use were slightly higher among medium educated individuals. In PROMOTE, females attended the group meetings more often than males. We also found only marginal differences across income, area deprivation, and marital status subgroups. Further details are available in Additional file 4.
Equity-specific intervention dropout
Dropout rates from T0 to T1 varied considerably between the studies, ranging from 6% in PACE-Lift to 45% in Active Plus II. In half of the studies (Active Plus I, Active Plus II, GALM, PROMOTE), intervention group participants were more likely to drop out of the study than control group participants (Table 5). This differential dropout was largely similar across gender and education subgroups. In the other half of the studies (Every Step Counts!, PACE-Lift, PACE-UP, ProAct65+), dropout rates were comparable between intervention and control groups, both for the total sample and for the gender and education subgroups. Moreover, dropout rates in the intervention and control groups were generally comparable or differed only slightly across gender and education subgroups. For example, in GALM and PROMOTE, dropout rates in the control group differed slightly by gender, with a higher dropout among males (GALM) and females (PROMOTE), respectively.
Patterns of dropout in intervention and control groups were also similar across income, area deprivation, and marital status subgroups. Only slight differences in dropout rates in the intervention and control groups were found across these subgroups (Additional file 5).
Information on equity-specific dropout at T2 and baseline MVPA levels can be found in Additional files 5 and 6.
General and equity-specific intervention effects
The general intervention effects as well as the gender- and education-specific intervention effects at T1 derived from the fully adjusted models are shown in Table 6. Results of the minimally adjusted models are available in Additional file 7. In Active Plus II, Every Step Counts!, PACE-Lift, PACE-UP, and PROMOTE, the intervention groups did more weekly minutes of MVPA at T1 than the control groups. In Active Plus I, GALM, and ProAct65+, no differences between the groups were found.
Overall, we found no consistent pattern of differential intervention effects across the studies. For Active Plus I, an intervention*gender interaction was found, suggesting that the intervention was more effective in increasing weekly minutes of MVPA in females than in males. For PACE-UP, an intervention*education interaction was found, suggesting that the intervention was more effective among medium than high or low educated individuals.
There was no evidence of differential intervention effects by household income, area deprivation, and marital status (Additional file 7). For Active Plus II, at 8 months post-intervention, as well as for PACE-Lift and PACE-UP, at 9 months post-intervention, the intervention groups continued to have higher MVPA levels compared to the control groups, although the differences between the groups were less pronounced when compared to the main analysis (Additional file 7). For Active Plus I, at 8 months post-intervention, and ProAct65+, at 6 months post-intervention, the intervention groups tended to engage in more MVPA than the control groups. There was no evidence of differential intervention effects by any of the socio-demographic indicators examined. For PACE-Lift and PACE-UP, sensitivity analyses of MVPA in bouts of at least 10 min had little impact on the effect estimates and did not change the interpretation (Additional file 7).
Figures 1 and 2 show the estimates for the moderated effects of the interventions through gender and education at T1 for each study (fully adjusted models). The detailed results of the meta-analyses can be found in Additional file 8. The pooled estimates indicated no differences in intervention effects either by gender (5.1 (95% CI: −20.7 to 31.0), 5321 participants, 8 studies) or by education (−1.5 (95% CI: −28.9 to 25.9), 5321 participants, 8 studies). Between-study heterogeneity was moderate to high (I² = 64%) for the moderated intervention effects through gender and low to moderate (I² = 45%) for the moderated intervention effects through education.
The pooled estimates for the moderated intervention effects through income, area deprivation, and marital status at T1 indicated no differences in intervention effects by these indicators (income: 0.5 (95% CI: −10.6 to 11.6), I² = 0%, 933 participants, 2 studies; area deprivation: −27.9 (95% CI: −58.5 to 2.7), I² = 0%, 1802 participants, 3 studies; marital status: 6.9 (95% CI: −3.3 to 17.1), I² = 0%, 5341 participants, 8 studies).
At T2, the pooled estimates indicated no differences in intervention effects by gender (17.2 (95% CI: −14.6 to 49.1), I² = 18%, 4348 participants, 5 studies), education (−13.4 (95% CI: −54.3 to 27.5), I² = 38%, 4348 participants, 5 studies), area deprivation (−21.8 (95% CI: −50.4 to 6.9), I² = 0%, 1887 participants, 3 studies), or marital status (−1.7 (95% CI: −36.8 to 33.5), I² = 15%, 4366 participants, 5 studies) (Additional file 8). The sensitivity analysis using partial correlation coefficients led to comparable results (Additional file 9).