Identifying depression in the United States veterans using deep learning algorithms, NHANES 2005–2018

Qu, Zihan; Wang, Yashan; Guo, Dingjie; He, Guangliang; Sui, Chuanying; Duan, Yuqing; Zhang, Xin; Lan, Linwei; Meng, Hengyu; Wang, Yajing; Liu, Xin

doi:10.1186/s12888-023-05109-9

Research
Open access
Published: 23 August 2023

BMC Psychiatry, volume 23, article number 620 (2023)

Zihan Qu¹,
Yashan Wang¹,
Dingjie Guo¹,
Guangliang He¹,
Chuanying Sui¹,
Yuqing Duan¹,
Xin Zhang¹,
Linwei Lan¹,
Hengyu Meng¹,
Yajing Wang² &
Xin Liu¹

Abstract

Background

Depression is a common mental health problem among veterans and is associated with high mortality. Despite numerous investigations, the prediction of depression and the identification of its risk factors remain severely limited. This study used a deep learning algorithm to identify depression in veterans and the factors associated with its clinical manifestations.

Methods

Our data originated from the National Health and Nutrition Examination Survey (2005–2018). Depression was identified in a dataset of 2,546 veterans using deep learning and five traditional machine learning algorithms with 10-fold cross-validation. Model performance was assessed by the area under the receiver operating characteristic curve (AUC), accuracy, recall, specificity, precision, and F1 score.

Results

Deep learning had the highest AUC (0.891, 95%CI 0.869–0.914) and specificity (0.906) in identifying depression in veterans. Further analysis by age group showed that the AUC values for deep learning were 0.929 (95%CI 0.904–0.955) in the middle-aged group and 0.924 (95%CI 0.900–0.948) in the older age group. In addition to general health conditions, sleep difficulties, memory impairment, work incapacity, income, BMI, and chronic diseases, factors such as vitamins E and C and palmitic acid were also identified as important influencing factors.

Conclusions

Compared with traditional machine learning methods, deep learning algorithms achieved the best performance, making them well suited to identifying depression and its risk factors among veterans.

Introduction

As a major human epidemic, depression ranks ninth in terms of total disability and death, following conditions such as heart disease, stroke, and AIDS [1]. It is one of the leading causes of disability worldwide and increases the overall global burden of disease [2]. Depressive episodes may have a progressive or sudden onset and vary in duration [3], frequency, and mode of occurrence. The risk of recurrence increases with each episode. Furthermore, age is an important influencing factor in depression [4, 5]. The prognosis of onset and recurrence tends to worsen as the age of onset increases [6]. Depression is often underdiagnosed and undertreated because of stigma in filling out depression scales, inadequate mental health resources, and the tendency to conceal depressive symptoms, which makes the disorder difficult to identify and predict.

Among veterans, the prevalence of major depressive symptoms was 31%, which is two to five times higher than that of the general U.S. population [7]. Military personnel who participated in deployment were more than twice as likely to develop depression as those who were not deployed (OR = 2.8) [8]. A cohort study suggested that veterans with depression had a higher risk of suicide [9]. In addition to suicide and injury-related causes of death [10], depression is associated with an increased risk of death from nearly all major medical causes. The cohort study of Quinn D Kellerman et al. [11] showed a higher risk of mortality from heart disease, diabetes, hypertension, and cerebrovascular disease among veterans with depression [12].

In the medical field, machine learning has proven to be highly predictive [13]. Traditional machine learning methods have also been applied successfully to depression recognition [14]. In recent years, with continued improvement of the algorithms, deep learning (a subfield of machine learning) has shown superior identification capabilities compared with traditional machine learning models. A recent study that used deep learning algorithms to identify hazardous drinkers and the severity of alcohol-related problems confirmed the strong performance of deep learning [15]. To date, no study has used deep learning algorithms to identify depression in veterans.

Therefore, we focused on the effectiveness of deep learning algorithms in identifying depression in veterans. Using 10-fold cross-validation, we compared a deep learning (DL) model with five traditional machine learning models: eXtreme Gradient Boosting (XGBoost), Decision Tree (DT), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Random Forest (RF). The area under the receiver operating characteristic curve (AUC), accuracy, recall, specificity, precision, and F1 score were used to evaluate how effectively each model identified depression. Considering the significant impact of age on depression, we further used the algorithm to identify important variables for middle-aged and older veterans and ranked their contributions.

Methods

Dataset description

We obtained a total of 2,546 veterans as study subjects from the National Health and Nutrition Examination Survey (NHANES) database. NHANES is a long-standing and representative survey conducted by the National Center for Health Statistics (NCHS) [16]. A substantial amount of data, including personal health and nutrition information, biometric data, and laboratory test results, was collected through face-to-face interviews, physical examinations, and laboratory tests. A multi-stage sampling method was used to obtain a representative sample of individuals of different age groups, races, genders, and socioeconomic backgrounds in the United States. In addition, a cross-sectional study design was used to obtain data from a representative sample of the population at a given point in time. The surveys were conducted in cycles, each lasting two years. Approval from the Institutional Review Board was not required because NHANES data are publicly available [17].

We combined raw data from seven cycles of the NHANES database from 2005 to 2018, obtaining a total of 70,193 participants. To mitigate the effects of multicollinearity, the variables that remained consistent throughout the seven cycles were selected. Furthermore, variables indicating the same disease were merged. For instance, hypertension was defined as satisfying any one of three criteria [18]: [BPQ20] ever been told to have high blood pressure; [BPXSY] systolic blood pressure ≥ 140 mm Hg and/or [BPXDI] diastolic blood pressure ≥ 90 mm Hg; [BPQ040a] ever been told to take a prescription for hypertension. In the end, we obtained a total of 755 variables.
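The merging rule for hypertension can be illustrated with a minimal Python sketch. It is not the authors' code (the study used R); the function name, the argument names, and the choice to treat a missing answer as "criterion not met" are assumptions made for illustration.

```python
from typing import Optional


def has_hypertension(
    told_high_bp: Optional[bool],
    systolic_mmhg: Optional[float],
    diastolic_mmhg: Optional[float],
    told_take_prescription: Optional[bool],
) -> bool:
    """Return True if any one of the three criteria described in the text is met.

    None stands for a missing answer or measurement and is treated as
    "criterion not met" (an assumption, not stated in the paper).
    """
    # Criterion 1: ever been told to have high blood pressure.
    told = told_high_bp is True
    # Criterion 2: systolic >= 140 mm Hg and/or diastolic >= 90 mm Hg.
    measured = (systolic_mmhg is not None and systolic_mmhg >= 140) or (
        diastolic_mmhg is not None and diastolic_mmhg >= 90
    )
    # Criterion 3: ever been told to take a prescription for hypertension.
    prescribed = told_take_prescription is True
    return told or measured or prescribed
```

For example, a participant with no reported diagnosis or prescription but a diastolic reading of 92 mm Hg would be counted as hypertensive under this rule.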

Remove missing values

The values “7”, “77”, “777” and “7777” indicated a refused response, while “9”, “99”, “999” and “9999” indicated an unknown status; both were treated as missing values. Because missing values affect the classification performance of machine learning [19], all variables with more than 20% missing data were excluded, and missing values in the remaining variables were filled by multiple imputation.
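The recoding and exclusion steps might look like the following Python sketch. It is an illustration under stated assumptions rather than the authors' R code: the column names are hypothetical, and the sentinel codes are supplied per variable because a value such as 7 or 9 can be a legitimate answer for some variables.

```python
def recode_missing(values, missing_codes):
    """Replace refusal/unknown sentinel codes with None."""
    return [None if (v is None or v in missing_codes) else v for v in values]


def missing_fraction(values):
    """Share of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def drop_sparse_variables(columns, codes_by_column, max_missing=0.20):
    """Recode sentinels, then keep variables with at most 20% missing data."""
    kept = {}
    for name, values in columns.items():
        cleaned = recode_missing(values, codes_by_column.get(name, set()))
        if missing_fraction(cleaned) <= max_missing:
            kept[name] = cleaned
    return kept


# Hypothetical single-digit questionnaire items: 7 = refused, 9 = unknown.
columns = {
    "item_a": [1, 2, 1, 2, 1, 2, 1, 2, 1, 9],  # 10% missing -> kept
    "item_b": [7, 9, 9, 1, 1, 1, 1, 1, 1, 1],  # 30% missing -> dropped
}
codes = {"item_a": {7, 9}, "item_b": {7, 9}}
cleaned = drop_sparse_variables(columns, codes)
```

The imputation step that follows in the paper is not shown here.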

Selecting the study population

Veterans were identified as participants who answered “yes” to the demographic question on military service (2005–2006 DMQMILIT: Veteran/Military Status; 2007–2010 DMQMILIT: served in the U.S.; DMQMILIT: Served active duty in the U.S. Armed Forces; 2011–2018 DMQMILIZ: Served active duty in the U.S. Armed Forces). Participants who did not answer depression-related questions and those who were under the age of 20 years were excluded from the study. Eventually, a total of 2,546 individuals were included in the study to train the algorithms (Fig. 1).

Definition of diseases

The Patient Health Questionnaire-9 (PHQ-9) is a reliable and widely validated screening tool for depression in primary health care [20]. It comprises nine questions, each scored from 0 to 3, giving a maximum total score of 27. Participants with PHQ-9 scores ≥ 10 were considered to have clinically significant depressive symptoms; therefore, a threshold of 10 was used to define depression [21]. To compare variables between the depressed and non-depressed groups, categorical variables were tested with chi-square tests in SPSS 24.0, and a two-sided P < 0.05 was considered statistically significant.
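The outcome definition above reduces to a sum and a threshold. A minimal Python sketch follows; the function names are illustrative, and the input is assumed to be nine item scores already cleaned of missing codes.

```python
def phq9_total(item_scores):
    """Sum the nine PHQ-9 items, each scored 0-3 (total ranges from 0 to 27)."""
    if len(item_scores) != 9:
        raise ValueError("PHQ-9 requires exactly nine item scores")
    for score in item_scores:
        if score not in (0, 1, 2, 3):
            raise ValueError("each PHQ-9 item must be scored 0, 1, 2 or 3")
    return sum(item_scores)


def is_depressed(item_scores, threshold=10):
    """Apply the cut-off used in the text: total score >= 10."""
    return phq9_total(item_scores) >= threshold
```

A respondent scoring 1 on every item (total 9) falls just below the cut-off, while raising any single item to 2 (total 10) reaches it.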

Model development and validation

The algorithms used in this study were implemented in R 4.2.1. Variables were selected by backward stepwise regression based on AIC values using the “MASS” package; 48 variables were eventually retained for analysis [22] (Supplementary Table 1). All data were divided into a training set and a test set at a ratio of 7:3. The “ROSE” package was used to balance the dataset by randomly oversampling the minority class [23]. Each algorithm tuned its hyperparameters over a standardized grid of candidate models from the “caret” package, and the selected hyperparameters were then applied to the training data to fit the model parameters. Deep learning was performed using the h2o.grid function of the H2O platform. H2O deep learning is based on a multilayer feedforward artificial neural network trained with stochastic gradient descent using backpropagation. Model training involved tuning several parameters: the activation function (activation = “Tanh”, “TanhWithDropout”, “Rectifier”, “RectifierWithDropout”), the hidden layer sizes (hidden = c(20, 20), c(40, 40), c(100, 100), c(30, 30, 30)), the input dropout ratio (input_dropout_ratio = c(0, 0.05)), and the learning rate (rate = c(0.01, 0.25)). The number of epochs was left at the default of 10 when selecting the best-performing model.
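Two of the steps above, random oversampling of the minority class and the expansion of the deep learning hyperparameter grid, can be sketched in plain Python. This is a generic illustration, not the ROSE or H2O implementation: the helper names are hypothetical, oversampling to exact class parity is an assumption, and whether the authors searched the full Cartesian grid is also an assumption.

```python
import itertools
import random


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample with replacement to fill the gap to the largest class.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Candidate values as listed in the text (activation names in H2O's spelling).
GRID = {
    "activation": ["Tanh", "TanhWithDropout", "Rectifier", "RectifierWithDropout"],
    "hidden": [(20, 20), (40, 40), (100, 100), (30, 30, 30)],
    "input_dropout_ratio": [0.0, 0.05],
    "rate": [0.01, 0.25],
}


def expand_grid(grid):
    """Enumerate every combination of the candidate hyperparameter values."""
    names = list(grid)
    return [
        dict(zip(names, combo))
        for combo in itertools.product(*(grid[name] for name in names))
    ]


candidates = expand_grid(GRID)  # 4 * 4 * 2 * 2 = 64 candidate models
```

Oversampling would normally be applied to the training portion only, so that the held-out data keep the original class distribution; the paper does not spell out this detail.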

The other five traditional machine learning algorithms, XGBoost, DT, SVM, KNN, and RF, were compared with deep learning in the study. (1) XGBoost is a scalable machine learning algorithm, first officially released in 2016, that builds models iteratively to minimize a loss function [24]. (2) DT is a tree-like structure in which each internal node corresponds to an attribute, the branches represent decision rules, and the leaf nodes represent output classes [25]. (3) SVM separates the data with a hyperplane and can use one of four kernel functions: linear, polynomial, radial basis function, and sigmoid [26]. (4) KNN is a simple non-parametric method that classifies a sample from the labels of its neighboring points, as determined by a similarity measure [27]. (5) RF is an ensemble classification algorithm consisting of a large number of individual decision trees; it employs bootstrap aggregation and randomization of predictor variables to achieve a high degree of predictive accuracy [28].
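As a concrete illustration of the simplest of these methods, the following Python sketch implements a k-nearest-neighbor classifier with Euclidean distance and majority voting. It is a toy example on made-up points, not the configuration used in the study; the value k = 3 and the distance measure are assumptions.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Two well-separated toy clusters labeled 0 and 1.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = [0, 0, 0, 1, 1, 1]
```

With these toy data, a query near the origin is assigned label 0 and a query near (5.5, 5.5) is assigned label 1.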

To reduce the risk of overfitting and bias, we selected the best model and hyperparameter combination by 10-fold cross-validation (Supplementary Table 2). The evaluation was based on six metrics: AUC, accuracy, recall, specificity, precision, and F1-score [29]. AUC provides a comprehensive measure of classification performance on both balanced and imbalanced datasets. It is independent of the class distribution, insensitive to the classification threshold, and combines two important quantities: the true positive rate and the false positive rate. Consequently, we used the AUC (0.8–0.9 is considered good and above 0.9 excellent [17]) as the primary metric for evaluating model performance. Finally, the importance scores of the variables were obtained, and their contribution ranking was analyzed [30].
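The six metrics can be written out explicitly. The Python sketch below uses the standard definitions from confusion-matrix counts and computes AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one (ties count one half). Function names are illustrative, and the paper does not state which software routine produced its values.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, tn, fn):
    """Accuracy, recall, specificity, precision and F1 from confusion-matrix counts."""
    recall = tp / (tp + fn)        # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1_score(precision, recall),
    }


def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

As a consistency check, the precision (0.889) and recall (0.754) reported for deep learning in the Results give `f1_score(0.889, 0.754)` ≈ 0.816, which matches the reported F1-score.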

Results

Classification model performance

Of the 2,546 veterans included in the study from 2005 to 2018, 185 (7.27%) suffered from depression. The demographics and characteristics of the participants are summarized in Table 1. The variables used to characterize the sample included gender, age, race, education, marital status, ratio of family income to poverty, and BMI (kg/m²). The differences in age, marital status, ratio of family income to poverty, and BMI (kg/m²) were statistically significant (P < 0.05). Among all participants, 2,386 were male (93.7%) and 160 were female (6.3%). The numbers of young, middle-aged, and elderly individuals were 273 (10.7%), 913 (35.9%), and 1,360 (53.4%), respectively.

Table 1 Baseline characteristics of depression in United States veterans

DL and the traditional machine learning algorithms were trained on the data, the optimal hyperparameters were selected, and the models were evaluated with 10-fold cross-validation; the ROC curves are shown in Fig. 2. The six metrics of DL were AUC (0.891, 95%CI 0.869–0.914), accuracy (0.830), recall (0.754), specificity (0.906), precision (0.889), and F1-score (0.816). AUC was selected as the primary evaluation metric. The AUC of DL was the highest, while those of the traditional machine learning models were XGBoost (0.869, 95%CI 0.824–0.915), DT (0.818, 95%CI 0.787–0.848), SVM (0.805, 95%CI 0.748–0.863), KNN (0.724, 95%CI 0.653–0.794), and RF (0.737, 95%CI 0.669–0.804), respectively. In identifying depression in the entire veteran population, DL was the best-performing algorithm, followed by XGBoost, while KNN had the lowest performance. The difference between DL and DT, SVM, KNN, and RF was statistically significant (P < 0.05). However, the classification performance of DL was not significantly better than that of XGBoost (P = 0.389).

In the middle-aged group, DL had the highest AUC (0.929, 95%CI 0.904–0.955), followed by XGBoost (0.879, 95%CI 0.823–0.935). In the elderly group, DL also had the highest AUC (0.924, 95%CI 0.900–0.948), followed by XGBoost (0.923, 95%CI 0.878–0.967). The difference between DL and DT, SVM, KNN, and RF was statistically significant (P < 0.05), but DL was not significantly better than XGBoost (P = 0.108 for the middle-aged group, P = 0.967 for the older age group). The AUC of DL remained above 0.900 in both age groups, and DL had the highest specificity and accuracy, making it the best model (Fig. 3; Table 2).

Table 2 Six models predict outcomes of depression in middle-aged and older veterans

Feature importance

The deep learning model was used to calculate variable importance scores for the total veteran population, middle-aged veterans, and older veterans (Tables 3 and 4). The top 20 variables were retained for the total population; the top three were general health conditions (1.000), sleep difficulties (0.963), and memory confusion (0.948). The inability to work due to physical, mental, or emotional problems ranked fourth (0.834). Having an income below 130% of the federal poverty level (i.e., PIR < 1.3) ranked fifth (0.676). In addition to the need for special equipment to walk, dietary intake of vitamin E, palmitic acid, and vitamin C, the total number of people in the family, BMI, and some chronic diseases were also important variables affecting depression in veterans. Among the biochemical indicators, the neutrophil count ranked seventh (0.703).
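The reported scores appear to be scaled so that the most important variable equals 1.000. A minimal Python sketch of such scaling and ranking is shown below; dividing by the largest raw score is an assumption about how the scores were normalized, and the raw values and variable names are hypothetical placeholders rather than results from the study.

```python
def scale_and_rank(raw_importance, top_n=None):
    """Scale importances so the largest equals 1.0, then sort in descending order."""
    largest = max(raw_importance.values())
    scaled = {name: value / largest for name, value in raw_importance.items()}
    ranked = sorted(scaled.items(), key=lambda item: item[1], reverse=True)
    return ranked if top_n is None else ranked[:top_n]


# Hypothetical raw importances for three placeholder variables.
raw = {"variable_a": 2.0, "variable_b": 4.0, "variable_c": 1.0}
top_two = scale_and_rank(raw, top_n=2)
```

Retaining the top 20 (or top 15) variables then corresponds to calling the function with `top_n=20` (or `top_n=15`).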

Table 3 Identifying the top 20 important variables for overall United States veteran depression through deep learning model

Table 4 Top 15 important variables for middle-aged and older veterans

The top 15 variables in the middle-aged and older age groups were retained according to the ranking. The top three variables in the middle-aged group were difficulty sleeping (1.000), memory confusion (0.831), and general health condition (0.777). In addition, the intake of docosahexaenoic acid (0.626) was also an important variable. Meanwhile, the top three variables in the older age group were general health conditions (1.000), the requirement of special equipment in walking (0.855), and memory confusion (0.719).

Discussion

In this study, the AUC of the deep learning model on the test set was greater than 0.85 both for the overall population and after stratification by age. Deep learning consistently showed higher performance in identifying depression in veterans than traditional machine learning methods.

Deep learning is mainly applied to identify and predict clinical diseases from imaging data, but both image-based and text-based data can yield favorable predictions. Currently, a deep learning algorithm based on textual data (HCET) achieves the best performance in modelling electronic health record data to predict depression compared with traditional machine learning [31]. There are also studies that use deep learning to predict antidepressant response in major depression from clinical and genetic biomarkers, among which the MFNN model with three hidden layers (AUC = 0.806) had the best prediction performance [32]. These results highlight the efficacy of deep learning in disease prediction, even in scenarios where imaging data are unavailable.

The same is true for our study. In the overall veteran population, deep learning had the highest AUC (0.891, 95%CI 0.869–0.914) and specificity (0.906), with an accuracy of 0.830, recall of 0.754, precision of 0.889, and F1-score of 0.816. XGBoost followed, with an AUC of 0.869 (95%CI 0.824–0.915), accuracy of 0.913, recall of 0.963, specificity of 0.427, precision of 0.942, and F1-score of 0.816. DT ranked third (AUC 0.818, 95%CI 0.787–0.848). DL achieved the highest AUC in the middle-aged and elderly groups, 0.929 (95%CI 0.904–0.955) and 0.924 (95%CI 0.900–0.948), respectively, along with the highest specificity (0.962) and precision (0.953) in the middle-aged group and the highest specificity (0.960) and precision (0.950) in the older group.

We found that general health conditions, sleep difficulties, and memory confusion were the top three variables affecting depression among U.S. veterans, as ranked by the deep learning algorithm according to their contribution. This finding is consistent with previous studies: Angela M Benavides et al. found that sleep difficulties in veterans were associated with self-reported depression [33]. Veterans have been reported to exhibit six syndromes, with syndrome 1 being “cognitive impairment”, characterized by attention, memory, and reasoning problems together with insomnia, depression, daytime sleepiness, and headache [34]. In addition, job restrictions, the ratio of family income to poverty, the total number of people in the family, the need for special equipment to walk, infections, BMI, and some chronic illnesses (asthma, liver conditions, hypertension, stroke, and stomach or intestinal illnesses) were all significant variables influencing depression in veterans. Notably, we also found that depression in veterans was associated with the intake of vitamin E and vitamin C. This may reflect the beneficial effects of vitamin E on oxidative and inflammatory status, which can diminish depressive symptoms [35]. Conversely, vitamin C deficiency is associated with adverse emotional and cognitive effects, which may trigger depression [36]. Urinary leakage, arthritis, and the intake of palmitic acid and docosahexaenoic acid played a significant role in the middle-aged group, while chronic bronchitis, urinary leakage, HIV infection, and lauric acid intake figured prominently in the elderly group. Urinary leakage thus appears as an important factor in both age groups. Some studies have found that urinary leakage is related to certain monoamines, particularly serotonin [37, 38]. A study by Kristen Sueoka et al. based on the Veterans Aging Cohort found that HIV-infected patients were more likely to experience depressive symptoms (OR = 1.38, 95%CI 1.18–1.62) [39]. These findings support the rationale for using deep learning models to identify factors that influence depression in veterans.

The advantage of this study is its novelty as the first study to identify depression in veterans through deep learning. Compared with other deep learning prediction models, dietary data and biochemical indicators were incorporated to find as many important factors related to depression in veterans as possible. Some studies have shown that general practitioners identify only 40–50% of actual cases [4]. The discrepancy becomes more evident across age groups: only 47.3% of late-life depression and 39.7% of mid-life depression cases were correctly identified. The clinical identification of depression in primary care is therefore often suboptimal. Given the high morbidity [40, 41], the difficulty of identification, and the increased risk of suicide and death [42], deep learning algorithms may serve as a supportive tool for identifying depression in veterans.

This study has several limitations. Firstly, the cross-sectional survey used in our study could only identify significant variables but was unable to verify causality. Secondly, the study was limited to depression among US veterans and the results were based on a balanced dataset. Further research is necessary to validate and extend our findings in a larger and more diverse dataset to better represent the true distribution of depression among veterans. Lastly, while our research findings may contribute to an overall understanding of depression risk among the veteran population, the diversity of individual experiences and length of service is crucial and should be duly considered in individual assessments and care.

Conclusion

In this study, the deep learning algorithm performed well in identifying depression in veterans. Modeling veterans’ depression with deep learning algorithms can identify depression and its risk factors early enough to provide timely intervention and support, optimize resource allocation, and ultimately contribute to improving veterans’ mental health.

Data availability

Data supporting the findings of this study can be found on the NHANES homepage (www.cdc.gov/nchs/nhanes.htm).

References

Smith K. Mental health: a world of depression. Nature. 2014;515(7526):181.
Murray CJ, Vos T, Lozano R, Naghavi M, Flaxman AD, Michaud C, Ezzati M, Shibuya K, Salomon JA, Abdalla S, et al. Disability-adjusted life years (DALYs) for 291 diseases and injuries in 21 regions, 1990–2010: a systematic analysis for the global burden of Disease Study 2010. Lancet (London England). 2012;380(9859):2197–223.
Malhi GS, Mann JJ. Depression. Lancet (London England). 2018;392(10161):2299–312.
Mitchell AJ, Rao S, Vaze A. Do primary care physicians have particular difficulty identifying late-life depression? A meta-analysis stratified by age. Psychother Psychosom. 2010;79(5):285–94.
Alexopoulos GS. Depression in the elderly. Lancet (London England). 2005;365(9475):1961–70.
Spijker J, de Graaf R, Bijl RV, Beekman AT, Ormel J, Nolen WA. Duration of major depressive episodes in the general population: results from the Netherlands Mental Health Survey and Incidence Study (NEMESIS). Br J psychiatry: J mental Sci. 2002;181:208–13.
Hankin CS, Spiro A 3rd, Miller DR, Kazis L. Mental disorders and mental health treatment among U.S. Department of Veterans Affairs outpatients: the Veterans Health Study. Am J Psychiatry. 1999;156(12):1924–30.
Blore JD, Sim MR, Forbes AB, Creamer MC, Kelsall HL. Depression in Gulf War veterans: a systematic review and meta-analysis. Psychol Med. 2015;45(8):1565–80.
Zivin K, Kim HM, McCarthy JF, Austin KL, Hoggatt KJ, Walters H, Valenstein M. Suicide mortality among individuals receiving treatment for depression in the Veterans Affairs health system: associations with patient and treatment setting characteristics. Am J Public Health. 2007;97(12):2193–8.
VanItallie TB. Traumatic brain injury (TBI) in collision sports: possible mechanisms of transformation into chronic traumatic encephalopathy (CTE). Metab Clin Exp. 2019;100s:153943.
Zivin K, Yosef M, Miller EM, Valenstein M, Duffy S, Kales HC, Vijan S, Kim HM. Associations between depression and all-cause and cause-specific risk of death: a retrospective cohort study in the Veterans Health Administration. J Psychosom Res. 2015;78(4):324–31.
Zivin K, Ilgen MA, Pfeiffer PN, Welsh DE, McCarthy J, Valenstein M, Miller EM, Islam K, Kales HC. Early mortality and years of potential life lost among Veterans Affairs patients with depression. Psychiatric Serv (Washington DC). 2012;63(8):823–6.
Gao S, Calhoun VD, Sui J. Machine learning in major depression: from classification to treatment outcome prediction. CNS Neurosci Ther. 2018;24(11):1037–52.
Zhang C, Chen X, Wang S, Hu J, Wang C, Liu X. Using CatBoost algorithm to identify middle-aged and elderly depression, national health and nutrition examination survey 2011–2018. Psychiatry Res. 2021;306:114261.
Kim SY, Park T, Kim K, Oh J, Park Y, Kim DJ. A deep learning algorithm to Predict Hazardous Drinkers and the severity of Alcohol-Related problems using K-NHANES. Front Psychiatry. 2021;12:684406.
Paulose-Ram R, Graber JE, Woodwell D, Ahluwalia N. The National Health and Nutrition Examination Survey (NHANES), 2021–2022: Adapting Data Collection in a COVID-19 environment. Am J Public Health. 2021;111(12):2149–56.
Lee C, Kim H. Machine learning-based predictive modeling of depression in hypertensive populations. PLoS ONE. 2022;17(7):e0272330.
Li C, Shang S. Relationship between Sleep and Hypertension: Findings from the NHANES (2007–2014). Int J Environ Res Public Health. 2021;18(15).
Pridham G, Rockwood K, Rutenberg A. Strategies for handling missing data that improve Frailty Index estimation and predictive power: lessons from the NHANES dataset. GeroScience. 2022;44(2):897–923.
Levis B, Benedetti A, Thombs BD. Accuracy of Patient Health Questionnaire-9 (PHQ-9) for screening to detect major depression: individual participant data meta-analysis. BMJ (Clinical research ed). 2019;365:l1476.
Iranpour S, Sabour S. Inverse association between caffeine intake and depressive symptoms in US adults: data from National Health and Nutrition Examination Survey (NHANES) 2005–2006. Psychiatry Res. 2019;271:732–9.
Shariff JA, Cheng B, Papapanou PN. Age-Specific Predictive Models of the Upper Quintile of Periodontal attachment loss. J Dent Res. 2020;99(1):44–50.
Darabi N, Hosseinichimeh N, Noto A, Zand R, Abedi V. Machine learning-enabled 30-Day readmission model for stroke patients. Front Neurol. 2021;12:638267.
Jiang J, Pan H, Li M, Qian B, Lin X, Fan S. Predictive model for the 5-year survival status of osteosarcoma patients based on the SEER database and XGBoost algorithm. Sci Rep. 2021;11(1):5542.
Xu X, Zhang J, Yang K, Wang Q, Chen X, Xu B. Prognostic prediction of hypertensive intracerebral hemorrhage using CT radiomics and machine learning. Brain and behavior. 2021;11(5):e02085.
Noble WS. What is a support vector machine? Nat Biotechnol. 2006;24(12):1565–7.
Gil-Pita R, Yao X. Evolving edited k-nearest neighbor classifiers. Int J Neural Syst. 2008;18(6):459–67.
Speybroeck N. Classification and regression trees. Int J public health. 2012;57(1):243–6.
Munir K, Elahi H, Ayub A, Frezza F, Rizzi A. Cancer diagnosis using deep learning: a bibliographic review. Cancers. 2019;11(9).
Ali MM, Paul BK, Ahmed K, Bui FM, Quinn JMW, Moni MA. Heart disease prediction using supervised machine learning algorithms: performance analysis and comparison. Comput Biol Med. 2021;136:104672.
Meng Y, Speier W, Ong M, Arnold CW. HCET: hierarchical clinical embedding with topic modeling on Electronic Health Records for Predicting Future Depression. IEEE J biomedical health Inf. 2021;25(4):1265–72.
Lin E, Kuo PH, Liu YL, Yu YW, Yang AC, Tsai SJ. A Deep Learning Approach for Predicting antidepressant response in Major Depression using clinical and genetic biomarkers. Front Psychiatry. 2018;9:290.
Benavides AM, Finn JA, Tang X, Ropacki S, Brown RM, Smith AN, Stevens LF, Rabinowitz AR, Juengst SB, Johnson-Greene D, et al. Psychosocial and functional predictors of depression and anxiety symptoms in Veterans and Service Members with TBI: a VA TBI Model Systems Study. J Head Trauma Rehabil. 2021;36(6):397–407.
Haley RW, Kurt TL, Hom J. Is there a Gulf War Syndrome? Searching for syndromes by factor analysis of symptoms. JAMA. 1997;277(3):215–22.
Lee A, Tariq A, Lau G, Tok NWK, Tam WWS, Ho CSH. Vitamin E, Alpha-Tocopherol, and its Effects on Depression and anxiety: a systematic review and Meta-analysis. Nutrients. 2022;14(3).
Plevin D, Galletly C. The neuropsychiatric effects of vitamin C deficiency: a systematic review. BMC Psychiatry. 2020;20(1):315.
Littlejohn JO Jr, Kaplan SA. An unexpected association between urinary incontinence, depression and sexual dysfunction. Drugs of today (Barcelona Spain: 1998). 2002;38(11):777–82.
Steers WD, Lee KS. Depression and incontinence. World J Urol. 2001;19(5):351–7.
Sueoka K, Goulet JL, Fiellin DA, Rimland D, Butt AA, Gibert C, Rodriguez-Barradas MC, Bryant K, Crystal S, Justice AC. Depression symptoms and treatment among HIV infected and uninfected veterans. AIDS Behav. 2010;14(2):272–9.
Nichter B, Norman S, Haller M, Pietrzak RH. Physical health burden of PTSD, depression, and their comorbidity in the U.S. veteran population: morbidity, functioning, and disability. J Psychosom Res. 2019;124:109744.
Moring JC, Nason E, Hale WJ, Wachen JS, Dondanville KA, Straud C, Moore BA, Mintz J, Litz BT, Yarvis JS, et al. Conceptualizing comorbid PTSD and depression among treatment-seeking, active duty military service members. J Affect Disord. 2019;256:541–9.
Nichter B, Norman S, Haller M, Pietrzak RH. Psychological burden of PTSD, depression, and their comorbidity in the U.S. veteran population: suicidality, functioning, and service utilization. J Affect Disord. 2019;256:633–40.

Acknowledgements

None.

Funding

None.

Author information

Authors and Affiliations

Department of Epidemiology and Statistics, School of Public Health, Jilin University, Changchun, 130021, China
Zihan Qu, Yashan Wang, Dingjie Guo, Guangliang He, Chuanying Sui, Yuqing Duan, Xin Zhang, Linwei Lan, Hengyu Meng & Xin Liu
School of Computer Science, McGill University, Montreal, H3A 0G4, Canada
Yajing Wang

Contributions

ZQ designed the research and wrote the article. YW, DG, GH, CS, YD, LL, HM, and XZ analyzed the data and made graphs. YW helped with methodological verification and language polishing. XL interpreted the results and guided the writing of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Xin Liu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Conflict of interest

Author Zihan Qu declares that she has no conflict of interest. Author Yashan Wang declares that she has no conflict of interest. Author Dingjie Guo declares that she has no conflict of interest. Author Guangliang He declares that he has no conflict of interest. Author Chuanying Sui declares that he has no conflict of interest. Author Yuqing Duan declares that she has no conflict of interest. Author Xin Zhang declares that she has no conflict of interest. Author Linwei Lan declares that she has no conflict of interest. Author Hengyu Meng declares that he has no conflict of interest. Author Xin Liu declares that she has no conflict of interest.

Consent for publication

Not applicable.

Ethics approval and consent to participate

There is no ethical statement in our research.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1: Supplementary Table 1. Codebook

Supplementary Material 2: Supplementary Table 2. Summary of the parameter values of each model

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Qu, Z., Wang, Y., Guo, D. et al. Identifying depression in the United States veterans using deep learning algorithms, NHANES 2005–2018. BMC Psychiatry 23, 620 (2023). https://doi.org/10.1186/s12888-023-05109-9

Received: 25 December 2022
Accepted: 13 August 2023
Published: 23 August 2023
DOI: https://doi.org/10.1186/s12888-023-05109-9
