Abstract
Interest is growing in integrative data analysis (IDA), which pools data from multiple studies to increase sample size and statistical power. However, measures of a construct are frequently not consistent across studies. This article provides a tutorial on the complex decisions involved in harmonizing measures for an IDA, including item selection, response coding, and modeling decisions. We analyzed caregivers’ self-reported data from the ADHD Teen Integrative Data Analysis Longitudinal (ADHD TIDAL) dataset; data from 621 of 854 caregivers were available. We used moderated nonlinear factor analysis (MNLFA) to harmonize items reflecting depressive symptoms. Items were drawn from the Symptom Checklist 90-Revised, the Patient Health Questionnaire–9, and the World Health Organization Quality of Life questionnaire. Conducting IDA often requires more programming skill (e.g., Mplus), more statistical knowledge (e.g., the IRT framework), and more complex decision-making than single-study analyses and meta-analyses. In this paper, we describe how we evaluated item characteristics, determined differences across studies, and created a single harmonized factor score that can be used to analyze data across all four studies. We also present our questions, challenges, and decision-making processes; for example, we explain our thought process and course of action when models did not converge. This tutorial provides a resource to help prevention scientists generate harmonized variables that account for sample and study differences.
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11121-022-01381-5/MediaObjects/11121_2022_1381_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11121-022-01381-5/MediaObjects/11121_2022_1381_Fig2_HTML.png)
Funding
This project was funded by the National Institute of Mental Health (NIMH) Grant R03 MH116397. The original studies were funded by NIMH Grants R01 MH106587 and R34 MH092466, the Institute of Education Sciences Grant R324A120169, and the Klingenstein Third Generation Foundation.
Ethics declarations
Ethics Approval
This study was approved by the Florida International University Institutional Review Board (Protocol # IRB-17-0409-CR02, valid through November 14, 2022). The study was performed in accordance with the ethical standards as laid down in the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards.
Consent to Participate
All participants provided written consent.
Conflict of Interest
MHS receives book royalties from Guilford Press for a treatment manual used in the studies. Other authors report no conflicts of interest.
About this article
Cite this article
Zhao, X., Coxe, S., Sibley, M.H. et al. Harmonizing Depression Measures Across Studies: a Tutorial for Data Harmonization. Prev Sci 24, 1569–1580 (2023). https://doi.org/10.1007/s11121-022-01381-5
DOI: https://doi.org/10.1007/s11121-022-01381-5