A Structural Equation Modeling Approach to Meta-analytic Mediation Analysis Using Individual Participant Data: Testing Protective Behavioral Strategies as a Mediator of Brief Motivational Intervention Effects on Alcohol-Related Problems

This paper introduces a meta-analytic mediation analysis approach for individual participant data (IPD) from multiple studies. Mediation analysis evaluates whether the effectiveness of an intervention on health outcomes occurs because of change in a key behavior targeted by the intervention. However, individual trials are often statistically underpowered to test mediation hypotheses. Existing approaches for evaluating mediation in the meta-analytic context are limited by their reliance on aggregate data; thus, findings may be confounded with study-level differences unrelated to the pathway of interest. To overcome the limitations of existing meta-analytic mediation approaches, we used a one-stage estimation approach using structural equation modeling (SEM) to combine IPD from multiple studies for mediation analysis. This approach (1) accounts for the clustering of participants within studies, (2) accommodates missing data via multiple imputation, and (3) allows valid inferences about the indirect (i.e., mediated) effects via bootstrapped confidence intervals. We used data (N = 3691 from 10 studies) from Project INTEGRATE (Mun et al. Psychology of Addictive Behaviors, 29, 34–48, 2015) to illustrate the SEM approach to meta-analytic mediation analysis by testing whether improvements in the use of protective behavioral strategies mediate the effectiveness of brief motivational interventions for alcohol-related problems among college students. To facilitate the application of the methodology, we provide annotated computer code in R and data for replication. At a substantive level, stand-alone personalized feedback interventions reduced alcohol-related problems via greater use of protective behavioral strategies; however, the net-mediated effect across strategies was small in size, on average.
Supplementary Information: The online version contains supplementary material available at 10.1007/s11121-021-01318-4.


Introduction
Mediation analysis is used to evaluate whether the effects of an intervention on health outcomes occur because of change in a key behavior targeted by the intervention. Most of the existing methodological research and applications of mediation analysis have focused on individual studies. However, beyond assessing the overall effectiveness of a treatment, single-study intervention trials are frequently underpowered to evaluate pathways of change (Fritz et al., 2015). A meta-analytic approach to mediation analysis that leverages data from multiple studies provides an opportunity to test pathways of change with greater statistical power. However, the literature showing how to conduct mediation analysis in a meta-analytic context has been limited to aggregate data (Cheung & Chan, 2005). This paper focuses on methods for conducting mediation analysis using individual participant data (IPD) from multiple studies.
The most widely used method for combining data from multiple studies is meta-analysis using study-level, aggregate data (e.g., means, SDs, correlations); however, standard meta-analysis methods either do not lend themselves to mediation testing or do not accommodate IPD from multiple studies. For example, meta-regression is used to examine moderators of intervention effects (i.e., study-level predictors that are associated with the size of the effect), not mediation. In contrast, newer approaches using aggregate data, such as meta-analytic structural equation modeling (MASEM), can provide a test of mediation (i.e., indirect effects) when pooling data from multiple studies (Cheung, 2014, 2015). Correlation-based MASEM is a prevailing approach to meta-analytic mediation analysis in which correlation or covariance matrices extracted from published reports or generated from the raw data (Cheung & Chan, 2005) are combined to create a pooled correlation or covariance matrix that is subsequently analyzed using structural equation modeling (SEM; e.g., Wilson et al., 2016). Effect sizes and standard errors may also be utilized to test mediated effects via marginal likelihood synthesis, sequential Bayesian methods, or parameter-based MASEM (see van Zundert & Miočević, 2020 for a comparison).
However, because prevailing approaches for meta-analytic mediation analysis typically rely on aggregate data extracted from published reports, the findings may be confounded with study-level differences that are unrelated to the mechanism of interest. For example, Riley et al. (2010) illustrated a meta-regression of ten clinical trials for hypertension where the estimated treatment effect was smaller in men compared to women, whereas a one-step IPD meta-analysis that examined participant-level information directly within studies did not support a clinically significant difference in treatment effect by sex. The apparent superiority of treatment with women was an artifact of studies with larger proportions of female participants tending towards larger effect sizes, though for reasons unrelated to sex. Specifically, when treatment effects by sex were evaluated within studies, the differences in treatment response were not clinically significant. Consequently, the study-level summaries that are commonly utilized can make these approaches more prone to ecological inference bias. An advantage of MASEM is that within-study variables (e.g., sex) can be included in the model, which can avoid introducing ecological biases when individual-level data are aggregated and analyzed as study-level data (e.g., proportion of females in the study); however, this generally requires access to raw IPD.
A limitation of correlations as the input data for a mediation analysis is the loss of scale-level information since correlation coefficients are standardized within each study to have a mean of zero and a standard deviation of one. This allows the pooling and comparison of the correlations across studies but assumes that the bivariate correlations correspond with the same range of values on the variable scales across intervention groups and levels of the outcome and mediator variables within studies. In practice, it is difficult to know whether these assumptions are reasonable without verifying them with IPD. If these assumptions are not met, then the resulting inference could be biased.
Furthermore, MASEM and existing approaches utilizing aggregate data are generally limited by the information disclosed in intervention reports, which frequently do not include all outcomes that were assessed (see Mun et al., 2021), let alone correlations among key variables of interest. Thus, MASEM and other mediation modeling approaches that rely on aggregate data may not be possible in many cases without access to IPD or unreported aggregate data. Finally, with only aggregate data, it is impossible to check and verify whether the original data were appropriately analyzed and reported (e.g., the assumptions of multivariate normal distribution, data that is missing at random).
Meta-analysis using IPD provides an opportunity to more rigorously evaluate the pathways by which treatments improve health outcomes at the individual level. Furthermore, a mediation analysis with IPD permits a longitudinal analysis that controls for baseline levels of (a) the mediator, (b) the outcome, and (c) any relevant covariates.
The current paper proposes an SEM approach using IPD that (a) accounts for the clustering of participants within studies, (b) accommodates missing data via multiple imputation, and (c) allows valid inferences about the indirect effect (i.e., mediated effect) via bootstrapped confidence intervals in an integrative data analysis (IDA) that estimates the entire model in one step, after previously establishing commensurate measures (see Hussong et al., 2013 for typical considerations for IDA). In this article, we first introduce the motivating research question and example data. Second, we outline a meta-analytic mediation modeling approach that can accommodate the clustered data structure of participants nested within studies. Third, we discuss how to estimate confidence intervals for the indirect and total effects of intervention for the purpose of statistical inference. Finally, we illustrate the meta-analytic mediation analysis using data drawn from Project INTEGRATE and discuss the implications of our method for both methodological and substantive research.
The motivating research question is whether improvements in protective behavioral strategies (PBS) mediate the effectiveness of brief motivational interventions for alcohol-related problems among college students who drink. PBS are specific cognitive-behavioral strategies that can be used prior to or during alcohol consumption to reduce alcohol-related problems (Martens et al., 2013). In the past two decades, promoting the use of PBS has become a common component of interventions for reducing alcohol-related problems among college drinkers (Ray et al., 2014). However, there has been mixed evidence on the extent to which improvements in PBS can explain the effect of brief motivational interventions on reducing alcohol use and related problems, with most evidence coming from cross-sectional data (Reid & Carey, 2015). We detail a longitudinal mediation analysis approach to evaluate whether improvements in PBS following brief motivational intervention are associated with subsequent reductions in alcohol-related problems among college students who drink.

Motivating Data: The Project INTEGRATE Study
The motivating data are drawn from Project INTEGRATE, a large-scale IPD meta-analysis project evaluating brief motivational interventions for college drinking across 24 independent intervention studies. From the Project INTEGRATE data set, we selected ten studies that were randomized controlled trials assessing PBS and alcohol-related problems at baseline and at least one post-baseline assessment. Participants in the included studies were randomized to a control group or one of three brief motivational interventions: (1) individually delivered motivational interviewing with personalized feedback (MI + PF), (2) stand-alone personalized feedback (PF), or (3) group-based motivational interviewing (GMI). Because PBS is not applicable for non-drinkers, we only included participants within each study who reported at least one drink in the past 1 or 3 months, depending on the study, at post-baseline assessment. Table 1 summarizes the intervention arms and corresponding sample sizes for the combined sample of drinkers from the ten studies that met the study inclusion criteria. Eight of the 10 studies were two-arm trials that evaluated a single brief motivational intervention, whereas studies 9 and 21 evaluated two or more intervention groups.
The mediator variable, PBS, was measured using five different scales across the original studies, which were subsequently harmonized and made commensurate by using a generalized partial credit model (Muraki, 1992), which is an extension of the hierarchical two-parameter logistic item response theory (2-PL IRT) model that we reported for alcohol-related problems (Huo et al., 2015). The measurement work to establish PBS trait scores can be found in Mun et al. (2015, 2016). With respect to the motivating data, studies 2, 8a, 8b, 8c, and 9 used the 10-item Protective Behavioral Strategies (PBS; American College Health Association, 2001) measure; studies 16, 18, and 21 used the 15-item Protective Behavioral Strategies Scale (PBSS; Martens et al., 2005); and studies 12 and 22 used the seven-item Drinking Restraining Strategies (DRS; Wood et al., 2007) measure. Study 22 incorporated an additional nine-item measure asking about Drinking Strategies. These scales shared similarly worded items, from which five collapsed items across scales provided overlap across studies when estimating item parameters.
The outcome variable, alcohol-related problems, was assessed using six different scales across the original studies. We used latent trait scale scores estimated from hierarchical, 2-PL IRT models for multiple groups to establish commensurate alcohol-related problems trait scores for all participants across studies and time (Huo et al., 2015; Mun et al., 2015). With respect to the motivating data, studies 2, 8a, 8b, 8c, 9, 16, and 21 used the Rutgers Alcohol Problem Index (RAPI; White & Labouvie, 1989); studies 8a, 8b, 8c, 9, 12, 16, and 22 used the Young Adult Alcohol Problems Screening Test (YAAPST; Hurlbut & Sher, 1992); study 12 also used the Alcohol Dependence Scale (Skinner & Allen, 1982; Skinner & Horn, 1984); study 18 used the Brief Young Adult Alcohol Consequences Questionnaire (BYAACQ; Kahler et al., 2005); and study 21 used the Alcohol Use Disorders Identification Test (AUDIT; Saunders et al., 1993). For readers interested in the technical details regarding how the measures of PBS and alcohol problems used in the motivating data were made commensurate, the harmonization work is discussed extensively in earlier reports (Huo et al., 2015; Mun et al., 2015, 2016, 2019). The sample for the present analysis included a total of 3691 students, with approximately two-thirds (63.8%) female. Most of the students identified as White (78.3%), and just over half of the participants (56.2%) were first-year or incoming college students. Table 2 provides a descriptive summary of all variables, including rates of missing data, by study and time point.

Table 1 The combined sample by intervention group and study
The follow-up (in months) is the first post-baseline assessment for which both mediation and outcome data were collected in the study. MI + PF, individually delivered motivational interviewing intervention with personalized feedback; PF, stand-alone personalized feedback intervention; GMI, group motivational interviewing intervention.

Meta-analytic Mediation Model for Pretest-Posttest Designs
Clinical trials commonly use pretest-posttest designs in which participants are assessed at baseline and one or more follow-ups. In the current motivating data, half of the studies included a single follow-up within 12 months post-intervention (see Table 1). To accommodate the broadest range of follow-up schedules, we focus on evaluating mediation using longitudinal data from two time points: (1) baseline and (2) the first post-baseline follow-up for which both mediation and outcome data were collected in each study. Figure 1 depicts a basic two-wave longitudinal mediation model (MacKinnon, 2008; Valente & MacKinnon, 2017) that controls for baseline levels of both the mediator and the study outcome. This is an extension of the classic cross-sectional mediation model outlined by Baron and Kenny (1986) that evaluates whether (a) the intervention (vs. control) is prospectively associated with post-baseline improvements in the mediator, (b) post-baseline improvement in the mediator is associated with post-baseline improvements in the study outcome, and (c) the intervention (vs. control) is associated with the study outcome after controlling for the mediator (i.e., the direct effect). This mediation model can be easily extended to include additional treatment contrasts and covariates as well as to accommodate clustered data across multiple studies, within an SEM framework. Next, we describe the application of the basic two-wave longitudinal mediation model outlined in Fig. 1 to the Project INTEGRATE data. The meta-analytic mediation model consists of (1) an "overall model" that combines IPD across all studies and (2) "study-specific sub-models" that characterize potential differences between individual studies and inform the interpretation of the overall meta-analytic results.

Overall Mediation Model
First, we detail the overall meta-analytic mediation model of the combined sample of all participants across all included studies. Let $\text{POST\_PBS}_{is}$ be the post-baseline PBS score of participant $i$ in study $s$. Equation (1) is the first equation in the mediation model, which models the average, prospective association between each intervention group (vs. control) and post-baseline levels of the mediator variable, controlling for baseline levels of the mediator variable, PBS, and the study outcome variable, alcohol-related problems. In Eq. (1), the subscript (A) identifies regression coefficients from the first of the two mediation model equations, and $e_{is(A)}$ is a participant-specific residual error term. $\text{TX\_MIPF}_{is}$, $\text{TX\_PF}_{is}$, and $\text{TX\_GMI}_{is}$ are dummy-coded variables that indicate random allocation to MI + PF, PF, or GMI, respectively (each coded 1), compared to controls (all coded 0). The regression coefficients $b_{1(A)}$, $b_{2(A)}$, and $b_{3(A)}$ quantify the covariate-adjusted average difference in post-baseline PBS between participants who received (1) MI + PF, (2) stand-alone PF, or (3) GMI, respectively, and control participants. The covariate $\text{BL\_PBS}_{is}$ adjusts for initial levels of the PBS mediator, and the covariate $\text{BL\_ALCPROB}_{is}$ adjusts for initial levels of alcohol-related problems.
Let POST_ALCPROB is be the post-baseline level of the study outcome variable, alcohol-related problems, of participant i in study s. Equation (2) is the second equation in the mediation model, which models the association between post-baseline levels of the mediator, PBS, and post-baseline levels of the study outcome, alcohol-related problems, adjusting for baseline levels of the mediator and study outcome variables: where (B) identifies regression coefficients associated with the second mediation model equation and e is(B) is a participantspecific residual error term. The regression coefficients b 1(B) , (1)

Study-Specific Mediation Sub-models
Next, we describe the study-specific mediation sub-models, which inform the interpretation of the overall mediation model by characterizing variation in the results across studies. The mediation analysis is repeated separately and sequentially for each study by using sub-models of Eqs. (1) and (2) to include the estimable terms (i.e., evaluated intervention groups and demographic covariates with variability). For example, coefficients $b_{7(A)}$ and $b_{8(B)}$ are not estimable and hence excluded in the study-specific sub-models for studies 9, 16, and 22 because they recruited only first-year students. As an illustration, Eqs. (3) and (4) are the study-specific sub-models for study 22, which evaluated MI + PF vs. control. In these equations, (A) and (B) identify regression coefficients from the reduced first and second mediation model equations, respectively, $i$ identifies the participant, and $e_{i(A)}$ and $e_{i(B)}$ are participant-specific residual error terms. For consistency, the subscripts in Eqs. (3) and (4) correspond with the same variables as those shown in the overall model Eqs. (1) and (2). As seen in Table 1, intervention groups not evaluated in a study become study-level missing data in the context of IPD meta-analysis. The parameters associated with missing treatment contrasts are excluded in the study-specific sub-model.
The coefficients $b_{2(A)}$, $b_{3(A)}$, $b_{2(B)}$, and $b_{3(B)}$ are excluded from Eqs. (3) and (4) since PF and GMI were not evaluated in study 22 by study design (i.e., $\text{TX\_PF}_{i} = 0$ and $\text{TX\_GMI}_{i} = 0$ for all participants $i$ in study 22), and $b_{7(A)}$ and $b_{8(B)}$ are excluded by study design since all participants in study 22 were first-year students. It is important to note that the interpretation of each parameter estimate depends on the other parameters included in the model (see Jiao et al., 2020). However, if we assume that Eqs. (1) and (2) represent the true model for all studies, it is reasonable to assume the omitted coefficients in the sub-models are missing at random. In addition, since baseline PBS and alcohol-related problems are adjusted for in all sub-models, any interpretational bias associated with missing demographic covariates would be minimal.

Accounting for Clustered Design Using SEM for Complex Survey Data
A key data feature of IPD combined from multiple studies is the nesting of individual participants within studies, which must be considered for accurate statistical inference (see also Mun et al., 2015, pp. 36–38). To account for the nested data structure of IPD from multiple studies in a one-stage integrative analysis, parameter estimates and corresponding standard errors can be adjusted for clustering by utilizing either (1) a model-based approach using multilevel modeling that incorporates cluster-specific parameters (e.g., Huh et al., 2015, 2019) or (2) a design-based approach in which clustering is accommodated via complex survey analysis with weights applied to participants in a single-level analysis (e.g., Clarke et al., 2013, 2016; Li et al., 2020; Ray et al., 2014). The advantage of design-based adjustment for clustering is that it can be implemented easily in an SEM framework and produces estimates that are comparable to multilevel modeling (Wu & Kwok, 2012), but with a lower computational burden. The computational efficiency of cluster-adjusted SEM makes it especially useful when combined with bootstrapping, the commonly accepted method for evaluating the statistical significance of the mediated (i.e., indirect) effect (see "Bootstrap Resampling with Multiple Imputation" later).

Fig. 1 Basic two-wave longitudinal mediation model, with nodes for Intervention vs. Control, BL mediator, Post-BL mediator, BL outcome, and Post-BL outcome, and paths labeled a through g (BL = baseline).
To evaluate the meta-analytic mediation model outlined in Eqs. (1)-(4), while accounting for the nested design of the data, we utilized SEM for complex survey data by first using the R package lavaan (Rosseel, 2012) to estimate an SEM that combines data across all studies in a single-level analysis followed by lavaan.survey (Oberski, 2014), which provides a design-based adjustment to account for clustering by study. SEM for complex survey data is analogous to the generalized estimating equation (Zeger et al., 1988) approach to analyzing multilevel data, which is also a design-based approach to accommodate clustered data. To account for widely varying sample sizes across studies, we weighted the data using the inverse of the square root of each study's sample size as explained in Mun et al. (2015) and used in research applications (Clarke et al., 2013, 2016; Ray et al., 2014).
With respect to interpretation, the regression coefficients (i.e., fixed effects) produced by SEM for complex survey data are marginal estimates, which represent the average effects across all individuals. In contrast, regression coefficients estimated using a model-based approach are cluster-specific estimates that are conditional on specific values of the random effects (e.g., the deviation of a specific individual from the group average). When the outcome is modeled as normally distributed, regression coefficients produced by multilevel models (i.e., mixed-effects models) can be interpreted like marginal estimates, although this does not hold for extensions of multilevel modeling that use a non-identity link function, such as logistic or Poisson models (Atkins et al., 2013). Thus, the inference for a model that accounts for clustering using a design-based approach is functionally equivalent to multilevel modeling in the present application.
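The weighting scheme described above, in which each participant receives a weight equal to the inverse of the square root of the study's sample size, can be sketched as follows. This is an illustrative Python sketch rather than the authors' R code; the function name `study_weights` and the study sizes are hypothetical.

```python
from collections import Counter
from math import sqrt

def study_weights(study_ids):
    """Return one weight per participant: 1 / sqrt(n_s), where n_s is the
    number of participants in that participant's study."""
    sizes = Counter(study_ids)
    return [1.0 / sqrt(sizes[s]) for s in study_ids]

# Hypothetical study labels and sample sizes, for illustration only
study_ids = ["2"] * 4 + ["8a"] * 9 + ["22"] * 16
weights = study_weights(study_ids)

# Total weight contributed by each study
total_by_study = Counter()
for s, w in zip(study_ids, weights):
    total_by_study[s] += w
```

Under this scheme, the total weight contributed by a study grows with the square root of its sample size rather than with the sample size itself, which tempers the influence of the largest studies without weighting all studies equally.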

Calculating the Indirect and Total Effect of Intervention
To calculate the indirect effect of each intervention type on the post-baseline study outcome via changes in the mediator, we calculate the product of the regression coefficients corresponding to (1) the association between intervention type and post-baseline PBS (i.e., $b_{1(A)}$, $b_{2(A)}$, and $b_{3(A)}$) and (2) the association between post-baseline PBS and changes in alcohol-related problems, $b_{6(B)}$. Equations (5)-(7) summarize the formulas used to calculate the indirect effects of MI + PF, stand-alone PF, and GMI vs. control, respectively, for the overall (Eqs. 1 and 2) and study-specific (Eqs. 3 and 4) models:

$$\text{Indirect effect}_{\text{MI+PF}} = b_{1(A)} \times b_{6(B)} \tag{5}$$

$$\text{Indirect effect}_{\text{PF}} = b_{2(A)} \times b_{6(B)} \tag{6}$$

$$\text{Indirect effect}_{\text{GMI}} = b_{3(A)} \times b_{6(B)} \tag{7}$$

To calculate the total effect of each intervention type on post-baseline alcohol-related problems, we sum (a) the direct effect of each intervention type on alcohol-related problems (i.e., $b_{1(B)}$, $b_{2(B)}$, and $b_{3(B)}$) from Eq. (2) and (b) the corresponding indirect effect of each intervention type calculated in Eqs. (5), (6), or (7). Equations (8)-(10) summarize the formulas used to calculate the total effects of MI + PF, stand-alone PF, and GMI vs. control, respectively, for the overall (Eqs. 1 and 2) and study-specific (Eqs. 3 and 4) models:

$$\text{Total effect}_{\text{MI+PF}} = b_{1(B)} + b_{1(A)} \times b_{6(B)} \tag{8}$$

$$\text{Total effect}_{\text{PF}} = b_{2(B)} + b_{2(A)} \times b_{6(B)} \tag{9}$$

$$\text{Total effect}_{\text{GMI}} = b_{3(B)} + b_{3(A)} \times b_{6(B)} \tag{10}$$
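The product-of-coefficients and summation steps described above can be written as a small helper. The sketch below is illustrative Python, not the authors' R code. The function name and dictionary layout are hypothetical. For the PF arm, the values 0.07 and -0.22 are the standardized path estimates reported in the findings later in the paper, whereas the direct-effect value is a hypothetical placeholder because it is not reported in the text.

```python
def indirect_and_total(a_paths, direct_paths, b_path):
    """Compute indirect (a * b) and total (direct + a * b) effects per arm.

    a_paths      : arm -> coefficient for intervention -> post-baseline mediator
    direct_paths : arm -> coefficient for intervention -> post-baseline outcome
    b_path       : coefficient for post-baseline mediator -> post-baseline outcome
    """
    effects = {}
    for arm, a in a_paths.items():
        indirect = a * b_path
        effects[arm] = {"indirect": indirect,
                        "total": direct_paths[arm] + indirect}
    return effects

a_paths = {"PF": 0.07}          # reported standardized path, PF -> post-baseline PBS
b_path = -0.22                  # reported standardized path, PBS -> alcohol-related problems
direct_paths = {"PF": -0.03}    # hypothetical placeholder, not a reported estimate

effects = indirect_and_total(a_paths, direct_paths, b_path)
pf_indirect = effects["PF"]["indirect"]
pf_total = effects["PF"]["total"]
```

The product of the two reported paths is roughly in line with the small mediated effect of PF reported in the findings, although the published point estimate is a mean across bootstrap replicates rather than a product of mean coefficients.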

Bootstrap Resampling with Multiple Imputation
To evaluate the magnitude and statistical significance of the estimates from the mediation model, including regression coefficients, indirect effects, total effects, and $R^2$ values, we used bootstrap resampling (Efron & Tibshirani, 1993) in which the mediation analyses are replicated across 5000 bootstrapped data sets to calculate the mean point estimate and 95% confidence interval for each parameter. Bootstrap estimation involves random sampling of observations with replacement from the original data set such that the sample is treated as if it were the population. The effect of sampling with replacement is that an observation may be represented more than once, whereas some observations may be left out in any given bootstrap sample. As a result, the bootstrap sample is equal in size to the original but is not identical.
Because missing data present in the original data set will also be reflected in the bootstrap data set, an additional consideration is needed to handle missing data when bootstrapping. In the context of an IPD meta-analysis, there can be two sources of missing data: (1) study-level missing data due to a variable not being assessed or without variation (see Jiao et al., 2020; Kim et al., 2014) and (2) participant-level missing data due to nonresponse. In the context of the Project INTEGRATE data, study-level missing data occurred because only one study evaluated all three intervention groups, whereas the rest evaluated a subset of intervention groups (i.e., one or two), and also because some studies exclusively targeted first-year students or women. These are not missing variables within the original studies; however, in the context of meta-analysis, they are missing or inestimable covariates at the study level. As described previously, we excluded the corresponding treatment contrast or demographic covariate from the corresponding study-specific mediation sub-model. Therefore, study-level missing variables were not imputed.
As seen in Table 2, there were also participant-level missing variables. Thus, to minimize bias in the results of the mediation analysis due to missing mediator, outcome, and/or covariate data, bootstrapping was combined with multiple imputation. Multiple imputation is a widely used method for accommodating missing data. Furthermore, simulation research supports combining multiple imputation with bootstrapping (Little & Rubin, 2002; Schomaker & Heumann, 2018). There are several ways to combine multiple imputation and bootstrapping, each with pros and cons (Brand et al., 2019). In the present study, we chose to bootstrap first, followed by multiple imputation, which is more computationally intensive but produces confidence intervals that more accurately reflect uncertainty due to missing data (Bartlett & Hughes, 2020).
First, a stratified bootstrap was performed in which participants, including those with missing data, were randomly sampled with replacement separately by study and intervention group then combined into a single bootstrapped data set of equal size to the original data set. The stratification by study and intervention group accounted for the clustered design (i.e., participants nested within studies and groups) and maintained consistent sample sizes in subsequent analyses, within and across studies, as well as across all intervention groups. A total of 5000 bootstrap-resampled data sets were generated. Second, for each of the bootstrap-resampled data sets, a set of ten imputed data sets were generated via multivariate normal imputation with the R package Amelia (Honaker et al., 2011). According to simulation findings by Bartlett and Hughes (2020), ten imputations per bootstrap replicate provide approximately accurate confidence intervals when multiple imputation is nested within bootstrapping.
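The stratified resampling step described above can be sketched as follows. This is an illustrative Python sketch, not the authors' R code: the record layout, field names (`study`, `arm`, `post_pbs`), and toy data are hypothetical, and the imputation step that would follow each resample is omitted.

```python
import random
from collections import Counter, defaultdict

def stratified_bootstrap(records, rng):
    """Resample participants with replacement within each (study, arm) stratum,
    keeping every stratum at its original size."""
    strata = defaultdict(list)
    for rec in records:
        strata[(rec["study"], rec["arm"])].append(rec)
    sample = []
    for members in strata.values():
        sample.extend(rng.choices(members, k=len(members)))
    return sample

# Hypothetical toy data; None marks a missing value that is carried into the
# bootstrap sample and would be imputed afterwards
records = (
    [{"study": "2", "arm": "PF", "post_pbs": 0.1 * i} for i in range(5)]
    + [{"study": "2", "arm": "control", "post_pbs": None} for _ in range(3)]
    + [{"study": "22", "arm": "MI+PF", "post_pbs": -0.2 * i} for i in range(4)]
)
boot = stratified_bootstrap(records, random.Random(2021))

original_counts = Counter((r["study"], r["arm"]) for r in records)
boot_counts = Counter((r["study"], r["arm"]) for r in boot)
```

Because sampling happens within strata, the per-study and per-arm sample sizes in `boot` match those of the original data, which is the property the text relies on for consistent sample sizes across replicates.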
The mediation analysis was repeated for each multiply imputed data set, and the results were combined across ten imputed data sets. This yielded a set of 5000 estimates for each parameter in the mediation model, one for each bootstrap replicate. The collection of bootstrap estimates approximates the sampling distribution for each parameter and accommodates non-normally distributed estimates, such as the indirect and total effects. The point estimate for each parameter was calculated as the mean across the 5000 bootstrap replications. Bias-corrected and accelerated 95% confidence intervals were calculated to assess the indirect and total effects, as recommended by MacKinnon et al. (2004).
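The pooling and summary steps described above can be sketched as follows. This illustrative Python sketch averages the estimates across imputations within each bootstrap replicate, takes the mean across replicates as the point estimate, and then forms a simple percentile interval. The percentile interval is a deliberately simpler stand-in: the paper reports bias-corrected and accelerated intervals, which require additional bias and acceleration terms not computed here. The function names and the toy input are hypothetical.

```python
from statistics import mean

def percentile(sorted_vals, q):
    """Linear-interpolation percentile of an already sorted list, 0 <= q <= 1."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def summarize_bootstrap(estimates_by_replicate, level=0.95):
    """estimates_by_replicate: one list per bootstrap replicate, holding the
    parameter estimate from each imputed data set of that replicate."""
    # Step 1: combine across imputations within each bootstrap replicate
    per_replicate = [mean(imputed) for imputed in estimates_by_replicate]
    # Step 2: point estimate is the mean across bootstrap replicates
    point = mean(per_replicate)
    # Step 3: percentile interval (simplified stand-in for a BCa interval)
    ordered = sorted(per_replicate)
    alpha = (1 - level) / 2
    return point, percentile(ordered, alpha), percentile(ordered, 1 - alpha)

# Hypothetical toy input: 101 replicates, each with 10 identical "imputed" estimates
toy = [[float(b)] * 10 for b in range(101)]
point, lower, upper = summarize_bootstrap(toy)
```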

Analysis of the Motivating Data and the Summary of Findings
Annotated computer code in R for fitting the model, along with example data, can be accessed in the online repository (https://doi.org/10.17632/t2yk5kt3bw.1; Huh et al., 2021). Figure 2 is a path diagram that summarizes the estimated associations from the overall mediation model of the combined sample. The path coefficients are standardized with respect to the outcome, which can be interpreted as the effect that a unit difference in each predictor has on the corresponding outcome variable, holding all other covariates constant. For treatment contrasts and other indicator variables, the path coefficients correspond with the difference between groups (e.g., MI + PF vs. control) in SDs of the outcome. For continuous predictors (i.e., alcohol-related problems, PBS), the standardized coefficient can be interpreted as the change in SDs of the outcome for a unit difference in the predictor. The overall mediation model explained 43% of the variance in both post-baseline PBS and post-baseline alcohol-related problems.
The paths of interest are (1) the prospective association between each intervention and post-baseline levels of the mediator (PBS) and (2) the association of the mediator and the outcome at post-baseline. Of the three interventions, only stand-alone PF had a statistically significant association with the mediator, with a .07 SD increase (95% CI = [.01, .12]) in post-baseline PBS as compared to control. A one-SD increase in post-baseline PBS, in turn, was associated with a .22 SD reduction (95% CI = [−.26, −.17]) in post-baseline alcohol-related problems. Figure 3 is a forest plot that summarizes the key mediation-related results (i.e., indirect and total effects) from (a) the ten study-specific sub-models (top portion) and (b) the overall model (bottom portion, highlighted in gray) of the combined sample. A negative coefficient can be interpreted as a prospective improvement (i.e., reduction) in alcohol-related problems at post-baseline. Stand-alone PF, compared with control, was associated with a statistically significant, albeit small, reduction in alcohol-related problems via increased use of PBS (β = −.01, 95% CI = [−.03, −.002]). Neither MI + PF nor GMI was associated with statistically significant reductions in alcohol-related problems, compared with control, through improvements in PBS.
An additional sensitivity analysis was conducted to evaluate the consistency of the findings when the mediation analysis was repeated by leaving out one study at a time, sequentially (see the Supplemental Material for a summary). The indirect and total effects of each intervention approach were consistent across the sensitivity models, suggesting that the results were robust and not driven by any single influential study.

Fig. 2 Overall mediation model evaluating change in protective behavioral strategies as a pathway by which brief motivational intervention improves alcohol-related problems for college students who drink.
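The leave-one-study-out sensitivity analysis described in this section follows a simple loop, sketched here in Python for illustration. This is not the authors' R code. The estimator passed in is a toy placeholder (a mean of a hypothetical field `y`) standing in for the full mediation analysis, and the study labels and data are hypothetical.

```python
from statistics import mean

def leave_one_study_out(records, estimate):
    """Re-run `estimate` once per study, each time dropping that study's rows.
    Returns a dict mapping the dropped study to the resulting estimate."""
    studies = sorted({rec["study"] for rec in records})
    return {
        dropped: estimate([rec for rec in records if rec["study"] != dropped])
        for dropped in studies
    }

def toy_estimate(rows):
    # Placeholder for the full mediation analysis (e.g., an indirect effect)
    return mean(rec["y"] for rec in rows)

# Hypothetical toy data
records = [
    {"study": "A", "y": 1.0}, {"study": "A", "y": 3.0},
    {"study": "B", "y": 5.0},
    {"study": "C", "y": 7.0}, {"study": "C", "y": 9.0},
]
sensitivity = leave_one_study_out(records, toy_estimate)
```

Comparing the values in `sensitivity` against the full-sample estimate shows how much any single study moves the result; estimates that stay close together suggest that no single study is driving the findings.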

Discussion
The literature evaluating mechanisms of intervention effect has relied almost exclusively on single-study intervention trials, which are frequently underpowered to evaluate mediation hypotheses (Fritz et al., 2015). This methodological illustration details a meta-analytic mediation analysis approach that leverages IPD across multiple studies to evaluate mechanisms of change longitudinally. Specifically, the approach evaluates whether the prospective change in a mediator following intervention is accompanied by a change in the outcome. Moreover, the approach can accommodate missing data commonly encountered in clinical trial data, making it a practical option for meta-analytic mediation analysis. The illustrated SEM approach combines well-established quantitative methodologies, including SEM with design-based adjustment for clustering, bootstrap estimation of mediated effects, and multiple imputation, to test mediation with accuracy and precision. We describe how to calculate the magnitude of a mediated effect within and across studies and assess its statistical significance in a way that (a) accounts for the clustering of participants within the study, (b) uses all available data, and (c) produces point estimates and confidence intervals for the indirect and total effects of an intervention that account for the non-normal distribution that arises from a product of coefficients.
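To illustrate why bootstrapping suits a product of coefficients, the following Python sketch computes a percentile bootstrap confidence interval for an indirect effect. It is a deliberately reduced illustration and not the authors' R implementation: it assumes complete data, resamples individual participants, omits covariates, multiple imputation, and the clustering adjustment, and uses simulated values. All names and effect sizes in it are hypothetical.

```python
import numpy as np

def indirect_effect(data):
    """a * b from two least-squares fits; columns are treat, mediator, outcome."""
    treat, mediator, outcome = data[:, 0], data[:, 1], data[:, 2]
    ones = np.ones(len(data))
    a = np.linalg.lstsq(np.column_stack([ones, treat]), mediator, rcond=None)[0][1]
    b = np.linalg.lstsq(np.column_stack([ones, treat, mediator]), outcome, rcond=None)[0][2]
    return a * b

def percentile_bootstrap_ci(data, n_boot=1000, alpha=0.05, seed=0):
    """Percentile CI: resample rows with replacement and re-estimate a * b each time."""
    rng = np.random.default_rng(seed)
    n = len(data)
    draws = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)
        draws[i] = indirect_effect(data[idx])
    lower, upper = np.quantile(draws, [alpha / 2, 1 - alpha / 2])
    return lower, upper

# Simulated complete data (hypothetical effect sizes, chosen only for illustration).
rng = np.random.default_rng(1)
n = 1000
treat = rng.integers(0, 2, size=n).astype(float)
mediator = 0.5 * treat + rng.normal(size=n)
outcome = -0.4 * mediator + rng.normal(size=n)
data = np.column_stack([treat, mediator, outcome])

point = indirect_effect(data)
ci_low, ci_high = percentile_bootstrap_ci(data)
print(f"indirect effect = {point:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

Because the interval is read directly from the empirical distribution of the resampled products, it does not assume that the indirect effect is normally distributed and can therefore be asymmetric around the point estimate.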
At a substantive level, it is of interest that greater use of PBS mediated the effect of stand-alone PF intervention on alcohol-related problems. Specifically, participants receiving stand-alone PF had greater improvement in PBS utilization compared with participants randomized to the control comparison. Greater PBS utilization, in turn, was associated with concurrent reductions in alcohol-related problems. Although statistically significant, it is important to note that the mediated effect of PF via a change in PBS was quite small, equivalent to a .01 SD difference in the reduction in alcohol-related problems. The small mediated effect may be because brief motivational interventions, including PF, do not increase the use of PBS substantially. However, the results from this study may suggest that stand-alone PF focusing on a few salient points, such as PBS, may be more likely to induce behavior change than formats that use multiple modalities (Ray et al., 2014).
Although the effect of brief motivational interventions on alcohol-related problems via a change in PBS appeared to be quite small in the present study, our findings are consistent with the evidence of some PBS-based interventions failing to improve outcomes (Martens et al., 2013). In addition, college students utilize PBS for different reasons, with some students engaging in PBS to get intoxicated faster while trying to prevent the most extreme harm. Therefore, the increased use of PBS can increase alcohol-related problems for some students unmotivated to change their drinking, while low-risk drinkers may use them to effectively limit harm from drinking (Li et al., 2020). The average effect that we focused on in the current study, although important, needs to be examined further for heterogeneous mediational paths, accounting for students' different motivations for drinking and PBS use.
It is important to note that most of the studies evaluated only one or two intervention groups and not all three interventions. The unbalanced nature of the intervention groups across studies is a typical challenge in a meta-analysis across heterogeneous studies, including IPD syntheses (Brincks et al., 2018; Huh et al., 2019), and can complicate the interpretation of findings. However, the motivating data featured a large, pooled sample of college students from brief motivational intervention studies, which permitted more robust mediation estimates for all the intervention types (i.e., MI + PF, stand-alone PF, and GMI) than would be possible in individual trials. Furthermore, we previously developed commensurate measures across trials for key constructs and carefully controlled for baseline levels of both the mediator and outcome variables, which bolsters confidence in the findings.
An important advantage of meta-analytic mediation analysis using IPD compared to traditional meta-analysis is the ability to evaluate the prospective association between baseline participant characteristics and change in PBS, which yielded additional insights. As seen in Fig. 2, we found that men (vs. women), first-year students (vs. non-first-year students), White students (vs. non-White students), and those with more severe alcohol-related problems at baseline showed less improvement in PBS use at follow-up. The ability to make inferences regarding participant-level change shows the benefit of this IPD-based approach for evaluating mechanisms of change in prevention research.

Limitations and Future Directions
It is important to consider the limitations of the present study. First, we could not evaluate whether change in the mediator preceded change in the study outcome, which would require data from at least three time points. Second, the approach relies on assumptions about missing data that we believe to be reasonable, including that the absence of an intervention group in a study does not bias the overall findings. However, further investigation via simulation study may be needed to identify potential areas of improvement. Third, this methodological illustration focuses on evaluating a single mediator; however, the approach we detailed can be extended to models with multiple mediators. Fourth, a minor drawback to our approach is that combining multiple imputation with bootstrapping is computationally intensive; however, the estimation times (e.g., 10–20 min per model) encountered in the present study are feasible for applied research. Finally, our motivating example focused on a relatively normally distributed mediator variable and outcome of interest. Future research might examine extensions of this approach within a generalized SEM framework to binary, count, or other outcome distributions.

Conclusions
The SEM approach detailed in this methodological illustration is a flexible approach for conducting a mediation analysis that leverages the most granular information from multiple studies and overcomes key challenges that arise when combining clinical trial data. The annotated R code and data provide additional guidance for researchers who wish to apply the method in their own research, and we hope it will motivate further development in meta-analytic mediation methodology and its applications in prevention science.

Funding
The project described was supported by the National Institute on Alcohol Abuse and Alcoholism (NIAAA) grants R01 AA019511 and K02 AA028630. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIAAA or the National Institutes of Health.

Declarations
Research Involving Human Participants
This project was approved by the North Texas Regional Institutional Review Board (IRB). The original trials that contributed to Project INTEGRATE were IRB approved in each of the respective institutions. All ethical standards for conducting research with human participants were followed in the current project as well as in the implementation of the original trials, including the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards.

Informed Consent
Informed consent was obtained from all participants included in the original studies contributing to this meta-analysis.

Conflict of Interest
The authors declare no competing interests.

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.