Depression is a complex, chronic condition that is critical to public health and the well-being of many individuals, families, and their loved ones. Major depressive disorder affects 17 million adults in the United States (US) and more than 250 million people worldwide; less severe forms of depressed affect are even more common.1 Similarly, race, ethnicity, wealth, education, and other measures of social status and function are now recognized as major determinants of health outcomes.2

Bidirectional relationships have been reported between depression and many chronic illnesses3; however, most studies have focused on specific conditions, such as diabetes, stroke, or congestive heart failure, as opposed to a multidimensional deep phenotyping approach. Likewise, recent evidence suggests a bidirectional relationship between depression and poverty,4 adding to the robust literature on links between poverty and poor health. These findings highlight the need to better understand the possible moderating effects of social determinants and the degree to which social determinants and depression are intertwined. The most common screening tool for depression is the Patient Health Questionnaire-9 (PHQ-9), whose operating characteristics are well known5 and have been validated in a variety of contexts.6,7 Despite extensive research on the clinical and behavioral implications of PHQ-9,8,9 there is limited research on the relationship between the PHQ-9 score and social determinants of health.

The Baseline Health Study (BHS)10 is a prospective cohort study of an adult population selected to represent major demographic groups in the US. In BHS, deep phenotyping of numerous demographic, clinical, laboratory, functional, and imaging findings is coupled with ongoing longitudinal follow-up. The purpose of this study was to assess the relationship between PHQ-9 score and a broad array of measurements intended to assess social determinants of health.

METHODS

The Baseline Health Study

BHS methods have been previously described,10 including entry and exclusion criteria, the institutional review board and participant consent procedures, the data collection scheme, and key components of study procedures. BHS is enrolling a large number of participants, beginning with intensive measurement of the first 2502 people (the deeply phenotyped cohort) in whom a large volume of multimodal data are collected. Four clinical BHS sites in the US have begun enrollment.

BHS participants were enrolled through a virtual online registry; selection of participants for the deep phenotyping cohort included in this report was performed using an algorithm to produce a cohort representative of US adult age, race, and ethnicity. People in good health and with medical conditions were included and the sampling method was designed to over-represent people at risk of heart disease or cancer. The PHQ-9 in this report was collected at the initial study visit in person or online.

A pre-BHS pilot study, which tested clinical assessment workflows, was conducted in 200 healthy participants prior to initiation of the primary study. BHS is funded by Verily (San Francisco, CA) and is managed in collaboration with Stanford University (Stanford, CA), Duke University (Durham, NC), and the California Health and Longevity Institute (Westlake Village, CA), with enrolling sites in Durham, NC; Kannapolis, NC; Los Angeles, CA; and Palo Alto, CA. The extended studies have governance approaches specific to the needs of each study. Herein, we report a cross-sectional analysis of PHQ-9 scores from the first BHS time point.

Statistical Methods

Distributional measures (medians with 25th and 75th percentiles for continuous variables, and counts and percentages for categorical variables) were computed and summarized across each of 5 PHQ-9 severity groups5 (0, 1–4, 5–9, 10–14, >14), divided by convention to be consistent with prior studies. The Cochran-Armitage trend test for binomial variables11,12 and the Spearman rank correlation test for continuous variables13 or categorical variables that are ordinal in nature (e.g., education and income) were used to test for linear trend across severity groups. No adjustment was made for multiple testing, given the exploratory nature of this study. Subsequent studies with pre-planned hypotheses are needed to confirm results.
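The grouping and trend test described above can be illustrated with a minimal sketch. It assumes equally spaced group scores (0 to 4) and the standard large-sample normal approximation without a continuity correction; the function names are illustrative and are not taken from the BHS analysis code.

```python
import math

def phq9_severity_group(score):
    """Map a PHQ-9 total (0-27) to the five groups: 0, 1-4, 5-9, 10-14, >14."""
    if not 0 <= score <= 27:
        raise ValueError("PHQ-9 total must be between 0 and 27")
    if score == 0:
        return 0
    if score <= 4:
        return 1
    if score <= 9:
        return 2
    if score <= 14:
        return 3
    return 4

def cochran_armitage_trend(events, totals, scores=None):
    """Two-sided Cochran-Armitage test for linear trend in proportions.

    events[i] is the count with the binary trait in ordered group i,
    totals[i] is the size of group i. Returns (z, p_value).
    """
    if scores is None:
        scores = list(range(len(totals)))  # assumption: equally spaced scores
    n = sum(totals)
    p_bar = sum(events) / n
    # Score-weighted deviation of observed events from the no-trend expectation
    t = sum(s * (e - m * p_bar) for s, e, m in zip(scores, events, totals))
    s1 = sum(m * s for s, m in zip(scores, totals))
    s2 = sum(m * s * s for s, m in zip(scores, totals))
    var = p_bar * (1 - p_bar) * (s2 - s1 * s1 / n)
    z = t / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value
```

For example, `cochran_armitage_trend([10, 15, 20, 25, 30], [100] * 5)` tests whether a binary characteristic becomes more frequent across the five severity groups; the counts here are made up for illustration.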

Leveraging the breadth of data collected in BHS, penalized regression using the least absolute shrinkage and selection operator (LASSO) was conducted to select a model of physical, mental, phenotypic, symptom, and sociodemographic factors that may be predictive of the PHQ-9 score (modeled as the logarithm of PHQ-9 + 1). Prior to modeling, each multi-valued (i.e., beyond binary) categorical predictor (e.g., race and smoking status) was converted using one-hot encoding. Five key socioeconomic variables (household income, education, employment status, marital status, and health insurance) collected via the Life Circumstances and Habits survey were defined as follows: (1) highest education completed: high school or less or otherwise (coded as 1/0); (2) household income: <25K or otherwise (coded as 1/0); (3) marital status: unmarried or otherwise (coded as 1/0); (4) employment status: not working or otherwise (coded as 1/0); and (5) health insurance: no or otherwise (coded as 1/0). These five variables were summed to create a socioeconomic status (SES) score, which was entered into the LASSO with all other variables listed in eTables 1.1–1.7 and eTable 2, excluding patient-reported outcome (PRO) scales in eTable 1.6. PROs, self-reported medical conditions, and self-reported symptoms related to mental health or depression were excluded from the model to avoid findings that would be less interpretable or informative.
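The coding steps above can be sketched as follows. This is an illustration only: the function and argument names are hypothetical, the natural logarithm is an assumption (the base is not stated), and missing values in the five indicators are not handled here.

```python
import math

def one_hot(value, levels):
    """Expand one multi-valued categorical value into 0/1 indicator columns."""
    return {level: int(value == level) for level in levels}

def ses_score(high_school_or_less, income_below_25k, unmarried,
              not_working, uninsured):
    """Sum the five 1/0 indicators (range 0-5).

    Higher totals indicate lower socioeconomic status.
    """
    flags = (high_school_or_less, income_below_25k, unmarried,
             not_working, uninsured)
    return sum(1 if flag else 0 for flag in flags)

def log_phq9(phq9_total):
    """Outcome transform log(PHQ-9 + 1); the natural log is an assumption."""
    return math.log(phq9_total + 1)
```

A participant with a high school education and household income below 25K, who is married, working, and insured, would receive `ses_score(True, True, False, False, False)`, that is, a score of 2.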

LASSO regression techniques require an input dataset with complete data. Rather than case-wise deletion, missing data were primarily addressed using iterative regression-based imputation, in which values of the missing fields are predicted using a regression model based on available data from complete cases. First, the 5 key socioeconomic variables described above were included in the models with an additional “missing” level to indicate missing observations. All other missing data fields were grouped by data type and then rank-ordered by most to least missing data. The rank of the whole group was based on the amount of missingness of the majority (≥50%) of the fields within that group. At the first imputation step, grouped fields with a small amount of missing data (i.e., <2%) were imputed with the fields that were never missing (i.e., demographic and baseline characteristics). The grouped field(s) with a larger amount of missing data (i.e., between 2 and <5%) were imputed using the newly imputed data from the first step plus the fields without missing data. The remaining imputation steps continued in an iterative process until all missing data fields were imputed.
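The ordering logic of this stepwise imputation can be sketched as below. The sketch only builds the plan (which group is imputed at each step, and from which predictors); the regression fit itself is omitted. Ranking each group by the median missing fraction of its fields is an assumption standing in for the "majority of fields" rule, and all field and group names are hypothetical.

```python
from statistics import median

def imputation_plan(missing_fraction, groups, complete_fields):
    """Return ordered steps as (group_name, fields_to_impute, predictor_fields).

    missing_fraction: dict mapping field -> fraction of missing values
    groups: dict mapping group name -> list of fields of one data type
    complete_fields: fields that are never missing
    """
    # Assumption: a group's rank is the median missingness of its fields
    ranked = sorted(
        groups,
        key=lambda g: median(missing_fraction[f] for f in groups[g]),
    )
    predictors = list(complete_fields)
    plan = []
    for name in ranked:
        fields = list(groups[name])
        plan.append((name, fields, list(predictors)))
        # Newly imputed fields become predictors for all later steps
        predictors.extend(fields)
    return plan
```

With `groups = {"labs": ["ldl", "hba1c"], "vitals": ["sbp"]}` and less missingness in `sbp` than in the laboratory fields, the vitals group is imputed first from the complete fields, and the labs group is then imputed from the complete fields plus the imputed `sbp`.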

Data were randomly split into a training set (approximately 70% of the data), which was used to build the models, and an independent test set, which was used to evaluate model performance. A 10-fold cross-validation of LASSO was carried out using only the training data to produce an optimal tuning parameter (the minimum value of lambda), and the final linear model was then trained on the full training set, retaining all predictors with coefficients not equal to zero. Since inferential statistics for LASSO are subject to bias,14 a linear regression with the retained predictors from LASSO was conducted to estimate inferential statistics.
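The workflow above (70/30 split, 10-fold cross-validation on the training set, then a fit on the full training set) can be illustrated with a small pure-Python sketch. It uses coordinate descent on the objective (1/2n)·||y − b0 − Xb||² + λ·||b||₁ and reads "the minimum value of lambda" as the λ that minimizes cross-validated error; both choices are assumptions, the data are simulated, and the follow-up ordinary least squares refit on retained predictors is not shown.

```python
import random

def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def lasso_fit(X, y, lam, n_iter=100):
    """Coordinate-descent LASSO with an unpenalized intercept."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    resid = [yi - intercept for yi in y]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0:
                continue
            # Correlation of feature j with the residual that excludes feature j
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_b = soft_threshold(rho, lam) / col_sq[j]
            delta = new_b - beta[j]
            if delta:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_b
        shift = sum(resid) / n  # re-center so the intercept absorbs the mean
        intercept += shift
        resid = [r - shift for r in resid]
    return intercept, beta

def mse(X, y, intercept, beta):
    """Mean squared prediction error."""
    preds = [intercept + sum(b * x for b, x in zip(beta, row)) for row in X]
    return sum((yi - pi) ** 2 for yi, pi in zip(y, preds)) / len(y)

def cv_select_lambda(X, y, lambdas, k=10, seed=0):
    """Return the lambda with the lowest mean k-fold validation MSE."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    best_lam, best_err = None, float("inf")
    for lam in lambdas:
        err = 0.0
        for fold in folds:
            held = set(fold)
            X_tr = [X[i] for i in idx if i not in held]
            y_tr = [y[i] for i in idx if i not in held]
            b0, b = lasso_fit(X_tr, y_tr, lam)
            err += mse([X[i] for i in fold], [y[i] for i in fold], b0, b)
        if err / k < best_err:
            best_lam, best_err = lam, err / k
    return best_lam

if __name__ == "__main__":
    rng = random.Random(42)
    X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(100)]  # simulated
    y = [3 * row[0] - 2 * row[1] + rng.gauss(0, 0.5) for row in X]
    order = list(range(len(y)))
    rng.shuffle(order)
    cut = int(0.7 * len(order))  # approximately 70% for training
    train, test = order[:cut], order[cut:]
    X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
    X_te, y_te = [X[i] for i in test], [y[i] for i in test]
    lam = cv_select_lambda(X_tr, y_tr, [0.01, 0.05, 0.1, 0.5])
    b0, beta = lasso_fit(X_tr, y_tr, lam)
    retained = [j for j, b in enumerate(beta) if b != 0.0]
    print("lambda:", lam, "retained:", retained, "test MSE:", mse(X_te, y_te, b0, beta))
```

Because the L1 penalty biases the retained coefficients toward zero, an unpenalized linear regression on the predictors listed in `retained` would be the next step, as described in the text.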

A partial correlation network was developed in order to depict the complex interrelationships among PHQ-9, SES, and other health measures. A correlation matrix containing all variables listed in eTables 1.1–1.7 and eTable 2 was generated, and only the top 10–15 variables from each predictor group (e.g., medical conditions, symptoms, labs) that were most correlated with PHQ-9 (based on the correlation coefficient) were included in the network to enhance legibility.
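Two ingredients of this step can be sketched in a few lines: ranking variables by their correlation with PHQ-9, and computing a partial correlation. The sketch makes two simplifying assumptions: variables are ranked by the absolute value of the Pearson coefficient, and the partial correlation conditions on a single third variable, whereas a full network would condition each pair on all remaining variables. Function names are illustrative.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def top_k_by_correlation(target, candidates, k):
    """Names of the k candidates most correlated with target.

    candidates: dict mapping variable name -> list of values.
    Assumption: ranking uses the absolute correlation.
    """
    ranked = sorted(
        candidates,
        key=lambda name: abs(pearson(target, candidates[name])),
        reverse=True,
    )
    return ranked[:k]

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y given z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

If two variables are each correlated 0.5 with a third and 0.25 with each other, `partial_corr(0.25, 0.5, 0.5)` is zero: their association is fully accounted for by the third variable, so no edge would connect them in the network.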

RESULTS

The relationship between the PHQ-9 score and key demographic characteristics is shown in Table 1. Female participants, younger participants, people of color, and those of Hispanic ethnicity had higher PHQ-9 scores. Table 2 displays socioeconomic status: less education, lower income, being unmarried, not currently working, and lack of insurance were all associated with higher PHQ-9 scores. Table 3 shows the relationship between PHQ-9 scores and other scales reflecting psychological and social distress. The inter-relationship of these different (but related) measures of distress is evident across the spectrum of measures. By multiple measures, lower SES was associated with higher PHQ-9 score.

Table 1 Demographics: PHQ-9 Score
Table 2 Socioeconomic Characteristics: PHQ-9
Table 3 Relationship Between PHQ-9 Scores and Other Scales Reflecting Psychological and Social Distress

In a simple unadjusted linear model, SES significantly predicted higher PHQ-9 score (p<0.001, R^2=0.08). In the LASSO with 10-fold cross-validation, model performance was similar between the training and test sets; 52 variables were retained as predictive and explained 24% of the variance in PHQ-9 score. The linear model containing all retained variables from LASSO was significant (F=4.20, p<0.001) and explained 25% of the variance in PHQ-9 score. The results of the linear model are shown in Figure 1. SES was in the top six predictors of PHQ-9 score, significantly predicting higher PHQ-9 after adjusting for demographics, behavior, medical conditions, symptoms, and physical function (p<0.001). Although people of Black race had modestly higher PHQ-9 scores (Table 1), after adjusting for all other factors, Black race was associated with lower PHQ-9 scores (p=0.040).

Fig. 1

Factors associated with PHQ-9 score in linear model. SES was in the top six predictors of PHQ-9 score, significantly predicting higher PHQ-9 after adjusting for demographics, behavior, medical conditions, symptoms, and physical function (p<0.001). LASSO, least absolute shrinkage and selection operator; PHQ-9, Patient Health Questionnaire-9; SES, socioeconomic status.

Figure 2 demonstrates the relationships among PHQ-9, SES, and other key measures in a partial correlation network. The length of the edges is inversely proportional to the magnitude of the correlation, and hence, highly related nodes appear closer together, with thicker edges indicating stronger correlations. The local clustering coefficient (a measure of a node’s connections with other nodes) of PHQ-9 is 0.52, and the eigenvector centrality (a measure of how connected a node is to other important nodes) of PHQ-9 is 0.22. Larger nodes indicate variables with a greater number of connections to other variables in the network. The largest connected subnetwork includes PHQ-9 and SES with laboratory variables on one side and disease symptoms and life satisfaction on the other. Race categories appear as unconnected nodes relatively distant from the center. There is also a locally connected node cluster of well-being variables (e.g., positive and negative affect schedule [PANAS] and subjective happiness) that is linked to both physical and mental health through the PHQ-9. Likewise, the locally connected subnetwork of pain and body symptoms is linked to PHQ-9 through female sex.
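The two node-level metrics reported for PHQ-9 can be illustrated on a toy graph. This sketch assumes an unweighted, undirected network stored as a dictionary of neighbor sets; the node names and edges are hypothetical, and the weighting scheme used for the published network is not specified here, so the values will not reproduce 0.52 or 0.22.

```python
import math

def local_clustering(adj, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
    return 2 * links / (k * (k - 1))

def eigenvector_centrality(adj, n_iter=200):
    """Power iteration on (A + I), normalized to unit Euclidean length.

    Adding the identity leaves the leading eigenvector unchanged but avoids
    oscillation on bipartite graphs.
    """
    score = {v: 1.0 for v in adj}
    for _ in range(n_iter):
        new = {v: score[v] + sum(score[u] for u in adj[v]) for v in adj}
        norm = math.sqrt(sum(s * s for s in new.values())) or 1.0
        score = {v: s / norm for v, s in new.items()}
    return score

# Hypothetical toy network: PHQ9 is linked to three nodes, two of which
# are also linked to each other.
toy = {
    "PHQ9": {"SES", "pain", "sleep"},
    "SES": {"PHQ9", "pain"},
    "pain": {"PHQ9", "SES"},
    "sleep": {"PHQ9"},
}
```

In this toy graph only one of the three possible pairs among the neighbors of `PHQ9` is connected, so `local_clustering(toy, "PHQ9")` is one third, and `PHQ9` has the highest eigenvector centrality because it is linked to every other node.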

Fig. 2

Network depicting relationship of PHQ-9 and SES with selected variables. This figure demonstrates the relationships among PHQ-9, SES, and other key measures in a partial correlation network. The length of the edges is inversely proportional to the magnitude of the correlation, and hence, highly related nodes appear closer together, with thicker edges indicating stronger correlations. PHQ-9, Patient Health Questionnaire-9; SES, socioeconomic status.

DISCUSSION

Our study confirms and emphasizes previous findings in the literature regarding the relationship between PHQ-9 and SES. While these relationships are not particularly surprising, they highlight how PHQ-9 is an entree into concerns about social determinants that demand more attention. The PHQ-9 is used in clinical practice15 as recommended by the US Preventive Services Task Force16 for screening within a health system or for public health assessment, yet when performing these assessments, contextual awareness is critically important.

As a cross-sectional study, this analysis cannot answer questions of cause and effect. The ongoing BHS longitudinal study will assess PHQ-9 and detailed serial measures of biological, clinical, behavioral, and social function. This measurement depth of demographic, clinical, biological, and behavioral issues offers an opportunity to better understand how different aspects of distress track together or differ over time.

Other studies have shown the relationship between PHQ-9 score and lower income, joblessness, less education, and lack of insurance, which raises the question of how to design interventions that address all of these concerns. McClintock et al. evaluated an intervention for hypertension that included medical and social determinants and found a greater improvement in PHQ-9.17 Huang et al. evaluated the construct of depression in the PHQ-9 for people of different racial and ethnic backgrounds.6 Income was a key correlate of PHQ-9 in Nigeria.18 A major review of depression from the UK focused on the relevance of social determinants.19 Interventions solely targeting the individual, but neglecting issues of socioeconomics and social context, will likely have limited impact on depression status.4,20,21,22,23,24 Furthermore, efforts to improve social determinants should consider specific treatment for depression, since depression may limit the ability of individuals to respond to opportunity.

People of Black race had slightly higher PHQ-9 scores, but lower PHQ-9 scores after adjustment for other factors; this finding is consistent with other studies suggesting that people of Black race may have greater resilience in the face of socioeconomic factors.25,26 This finding was unexpected, and although it was validated in our study, it should be considered preliminary and deserves follow-up. Across the spectrum of PHQ-9 scores, the higher the score, the lower the income, educational status, or level of social connection. Since our analysis is cross-sectional, we cannot discern whether these findings represent a different approach to revealing concerns or more significant distress.

This study has some limitations. First, our examination is cross-sectional in nature. Follow-up is now accruing in the study, and the trajectory of change related to the multidimensional associations will be informative. Second, BHS participants are volunteers from selected sites who express willingness to share data, and it is likely that people with significant depression are less likely to volunteer. Third, while the population is generally representative of adult age, sex, race, and ethnicity, it is not a fully representative sample of the population; in particular, volunteers for digital technology studies are different from volunteers found in non-digital studies.27 Finally, we lack detailed information on depression treatment, a potentially modifying factor.

In conclusion, PHQ-9 scores are related to multiple measures that indicate poor SES; therefore, focusing only on depression may have limited effectiveness. Similarly, the high PHQ-9 scores associated with social and economic disparities suggest that policy and economic strategies are needed to accompany individual efforts aimed toward improving depression status. When a high PHQ-9 score or other indicators of depression bring a patient to the attention of a clinician, contextual awareness is critically important to provide an effective clinical intervention. In particular, when an individual is evaluated for depression, behavioral and social factors should be included in their holistic evaluation. Similarly, efforts to improve social and economic status of individuals and populations should consider depression as a factor in reducing the ability of individuals to respond to interventions.