Abstract
Science is an inherently cumulative process, and knowledge on a specific topic is organized through synthesis of findings from related studies. Meta-analysis has been the most common statistical method for synthesizing findings from multiple studies in prevention science and other fields. In recent years, Bayesian statistics have been put forth as another way to synthesize findings and have been praised for providing a natural framework for updating existing knowledge with new data. This article presents a Bayesian method for cumulative science and describes a SAS macro, %SBDS, for synthesizing findings from multiple studies or multiple data sets from a single study using three different methods: meta-analysis using raw data, sequential Bayesian data synthesis, and a single-level analysis on pooled data. Sequential Bayesian data synthesis and Bayesian statistics in general are discussed in an accessible manner, and guidelines are provided on how researchers can use the accompanying SAS macro for synthesizing data from their own studies. Four alcohol use studies were used to demonstrate how to apply the three data synthesis methods using the SAS macro.
Notes
We hope that by providing an annotated SAS macro, prevention science researchers and graduate students at all levels will feel comfortable performing SBDS with their own data. If an academic researcher does not have access to an institutional SAS license, SAS University edition can be downloaded for free from https://www.sas.com/en_us/software/university-edition.html.
Here we use the term “significant” to mean that the credibility interval for a given effect does not contain zero. There is no concept of significance testing in the Bayesian framework; however, if 0 is not included in the credibility interval for an effect, we can conclude that the effect is different from zero. Throughout the paper we will use the term “significant” for both frequentist and Bayesian findings because of its familiar interpretation, but we caution the reader that this expression does not stem from the Bayesian theoretical framework.
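The decision rule in this note can be illustrated with a minimal Python sketch. It is not part of the %SBDS macro; the function names, the simulated posterior draws, and the 95% level are assumptions chosen only for illustration. The sketch computes an equal-tailed credibility interval from posterior draws and checks whether the interval excludes zero.

```python
import random


def equal_tailed_interval(draws, level=0.95):
    """Return an equal-tailed credibility interval from posterior draws.

    Uses a simple nearest-rank rule on the sorted draws (an illustrative
    choice; software packages may interpolate differently).
    """
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(round(tail * (n - 1)))]
    upper = ordered[int(round((1.0 - tail) * (n - 1)))]
    return lower, upper


def excludes_zero(interval):
    """True when zero lies outside the interval (the note's 'significant')."""
    lower, upper = interval
    return lower > 0.0 or upper < 0.0


# Hypothetical posterior draws, simulated here instead of sampled by MCMC.
rng = random.Random(2022)
effect_draws = [rng.gauss(0.5, 0.1) for _ in range(5000)]  # effect far from zero
null_draws = [rng.gauss(0.0, 1.0) for _ in range(5000)]    # effect centered on zero

effect_interval = equal_tailed_interval(effect_draws)
null_interval = equal_tailed_interval(null_draws)
```

Under these assumptions, `excludes_zero(effect_interval)` would flag the first effect as “significant” in the sense used here, while the second interval straddles zero.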
PROC MCMC is able to accommodate missing data via the MISSING = option, although we did not add this explicitly to the macro, as a thorough discussion of the Bayesian treatment of missing data was beyond the scope of this article. Users who wish to utilize the PROC MCMC missing data utility are referred to https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.4&docsetId=statug&docsetTarget=statug_mcmc_details61.htm&locale=en
A multilevel model with random slopes was also attempted following the SAS code from Bauer, Preacher, and Gil (2006); however, this model failed to converge. Researchers who have reason to believe that the a and/or b paths (slopes) may differ among their data sets are encouraged to follow the excellent SAS documentation accompanying Bauer et al. (2006) available at http://dbauer.web.unc.edu/publications/.
Multilevel meta-analysis is more complex when mediating variables are considered because the mediation model at its simplest contains two relations, X to M and M to Y, compared to typical multilevel meta-analysis which consists of one bivariate X to Y relation. Information for a mediational multilevel meta-analysis may consist of: (1) within-study (level 1) relations for the independent variable, mediator, and dependent variable which are all collected in the same study, and (2) between-study (level 2) information which combines information on different parts of mediational relations across studies (MacKinnon, 2008). Both within-study and between-study relations can be examined in a multilevel meta-analysis, and the difference between these effects, called a contextual effect, can be examined via a significance test (Wurpts, 2016).
Note that another way to specify the spread hyperparameter of a normal prior in SAS is as a variance, which one can compute by squaring the standard error or posterior standard deviation of the corresponding regression coefficient or intercept. Setting the standard deviation hyperparameter equal to the standard deviation of the coefficient and setting the variance hyperparameter equal to the squared standard deviation of the coefficient are just two ways of specifying the same prior distribution in SAS PROC MCMC (Miočević & MacKinnon, 2014; SAS Institute Inc., 2013).
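The equivalence of the two parameterizations can be checked numerically. The following Python sketch is an illustration only, not SAS code and not part of the macro; the prior mean of 0.30 and standard deviation of 0.05 are hypothetical values standing in for the posterior summaries of a coefficient from an earlier data set.

```python
import math


def normal_pdf_sd(x, mean, sd):
    """Normal density parameterized by its standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def normal_pdf_var(x, mean, var):
    """Normal density parameterized by its variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


# Hypothetical posterior summaries carried forward as prior hyperparameters.
prior_mean = 0.30
prior_sd = 0.05
prior_var = prior_sd ** 2  # squaring the standard deviation gives the variance

# Evaluate both parameterizations on a small grid of coefficient values.
grid = [0.10, 0.25, 0.30, 0.35, 0.50]
densities_sd = [normal_pdf_sd(x, prior_mean, prior_sd) for x in grid]
densities_var = [normal_pdf_var(x, prior_mean, prior_var) for x in grid]
```

The two lists agree up to floating-point error, which is the sense in which the standard deviation and variance specifications describe the same prior.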
References
Baron, R. M., & Kenny, D. A. (1986). The moderator–mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173
Bauer, D. J., Preacher, K. J., & Gil, K. M. (2006). Conceptualizing and testing random indirect effects and moderated mediation in multilevel models: New procedures and recommendations. Psychological Methods, 11, 142–163
Brockwell, S. E., & Gordon, I. R. (2001). A comparison of statistical methods for meta-analysis. Statistics in Medicine, 20, 825–840
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81
Cook, T. D., Cooper, H., Cordray, D. S., Hartmann, H., Hedges, L. V., & Light, R. J. (Eds.). (1992). Meta-analysis for explanation: A casebook. Russell Sage Foundation.
Cooper, H., Hedges, L. V., & Valentine, J. C. (Eds.). (2009). The handbook of research synthesis and meta-analysis. Russell Sage Foundation.
Cuijpers, P. (2002). Effective ingredients of school-based drug prevention programs: A systematic review. Addictive Behaviors, 27, 1009–1023
Curran, P. J., & Hussong, A. M. (2009). Integrative data analysis: The simultaneous analysis of multiple data sets. Psychological Methods, 14, 81–100
Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (2004). Bayesian data analysis. CRC Press.
Hartung, J., Knapp, G., & Sinha, B. K. (2008). Bayesian meta‐analysis. Statistical Meta-Analysis with Applications, 155–170.
Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. Academic Press.
Hox, J. (2002). Multilevel analysis: Techniques and applications. Lawrence Erlbaum.
Hussong, A. M., Curran, P. J., & Bauer, D. J. (2013). Integrative data analysis in clinical psychology research. Annual Review of Clinical Psychology, 9, 61–89
Ibrahim, J. G., & Chen, M. H. (2000). Power prior distributions for regression models. Statistical Science, 15, 46–60
SAS Institute Inc. (2013). SAS/STAT® 13.1 User’s Guide. Cary, NC: SAS Institute Inc.
Jones, A. P., Riley, R. D., Williamson, P. R., & Whitehead, A. (2009). Meta-analysis of individual patient data versus aggregate data from longitudinal clinical trials. Clinical Trials, 6, 16–27
Krull, J. L., & MacKinnon, D. P. (1999). Multilevel mediation modeling in group-based intervention studies. Evaluation Review, 23, 418–444
Kruschke, J. K. (2011). Bayesian assessment of null values via parameter estimation and model comparison. Perspectives on Psychological Science, 6, 299–312
Kuiper, R. M., Buskens, V., Raub, W., & Hoijtink, H. (2013). Combining statistical evidence from several studies: A method using Bayesian updating and an example from research on trust problems in social and economic exchange. Sociological Methods & Research, 42, 60–81
Lau, J., Antman, E. M., Jimenez-Silva, J., Kupelnick, B., Mosteller, F., & Chalmers, T. C. (1992). Cumulative meta-analysis of therapeutic trials for myocardial infarction. New England Journal of Medicine, 327, 248–254
de Leeuw, C., & Klugkist, I. (2012). Augmenting data with published results in Bayesian linear regression. Multivariate Behavioral Research, 47, 369–391
MacKinnon, D. P. (2008). Introduction to statistical mediation analysis. Lawrence Erlbaum Associates.
MacKinnon, D. P., Warsi, G., & Dwyer, J. H. (1995). A simulation study of mediated effect measures. Multivariate Behavioral Research, 30, 41–62
Maxwell, S. E., Lau, M. Y., & Howard, G. S. (2015). Is psychology suffering from a replication crisis? What does “failure to replicate” really mean? American Psychologist, 70, 487–498
McBride, N. (2003). A systematic review of school drug education. Health Education Research, 18, 729–742
Miočević, M., & MacKinnon, D. P. (2014). SAS® for Bayesian Mediation Analysis. In Proceedings of the 39th annual meeting of SAS Users Group International. Cary, NC: SAS Institute, Inc.
Muthén, B. O., & Satorra, A. (1995). Complex sample data in structural equation modeling. Sociological Methodology, 267–316.
Nosek, B. A., Spies, J. R., & Motyl, M. (2012). Scientific utopia II. Restructuring incentives and practices to promote truth over publishability. Perspectives on Psychological Science, 7(6), 615–631.
O’Rourke, H. P., & Vazquez, E. (2019). Mediation analysis with zero-inflated substance use outcomes: Challenges and recommendations. Addictive Behaviors, 94, 16–25
Smith, T. C., Spiegelhalter, D. J., & Thomas, A. (1995). Bayesian approaches to random-effects meta-analysis: A comparative study. Statistics in Medicine, 14, 2685–2699
Thurstone, L. L. (1931). The reliability and validity of tests.
Tobler, N. (1997). Meta analysis of adolescent drug prevention programs: Results of the 1993 meta analysis. In W. Bukoski (Ed.), Meta-analysis of Drug Abuse Prevention Programs. (pp. 5–68). NIDA.
Wechsler, H. (1993). Harvard school of public health college alcohol study, 1993. ICPSR6577-v3. Ann Arbor, MI: Inter-University Consortium for Political and Social Research [distributor], 2020-01-30. https://doi.org/10.3886/ICPSR06577.v4
Wechsler, H. (1997). Harvard school of public health college alcohol study, 1997. ICPSR3163-v3. Ann Arbor, MI: Inter-University Consortium for Political and Social Research [distributor], 2020-01-30. https://doi.org/10.3886/ICPSR03163.v4
Wechsler, H. (1999). Harvard school of public health college alcohol study, 1999. ICPSR3818-v2. Ann Arbor, MI: Inter-University Consortium for Political and Social Research [distributor], 2020-01-30. https://doi.org/10.3886/ICPSR03818.v3
Wechsler, H. (2001). Harvard school of public health college alcohol study, 2001. ICPSR04291-v2. Ann Arbor, MI: Inter-University Consortium for Political and Social Research [distributor], 2008-02-05. https://doi.org/10.3886/ICPSR04291.v2
Wurpts, I. C. (2016). Performance of contextual multilevel models for comparing between-person and within-person effects (Doctoral dissertation). Retrieved from ASU Library Digital Repository.
Funding
This work was supported by a Marie Skłodowska-Curie Individual Fellowship awarded to Milica Miočević (European Commission Horizon 2020 research and innovation program, grant number 792119) and, in part, by the National Institute on Drug Abuse (R37DA09757).
Ethics declarations
Ethics Approval
Not applicable. This study performed secondary data analysis on publicly available de-identified data which does not constitute human subjects research.
Consent to Participate
Not applicable. This study performed secondary data analysis on publicly available de-identified data which does not constitute human subjects research.
Conflict of Interest
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
At the time the majority of this research was completed, Dr. Wurpts was a graduate student at Arizona State University. She is now a Scientist at Presbyterian Healthcare Services.
Cite this article
Wurpts, I.C., Miočević, M. & MacKinnon, D.P. Sequential Bayesian Data Synthesis for Mediation and Regression Analysis. Prev Sci 23, 378–389 (2022). https://doi.org/10.1007/s11121-021-01256-1