Introduction

Studies of the long-term psychiatric and neurocognitive functioning of World Trade Center (WTC) responders during the two decades since September 11, 2001 have found high rates of impairment. The most prevalent psychiatric condition is post-traumatic stress disorder (PTSD), which is characterized by re-experiencing, avoidance, negative cognitions and mood, and arousal symptoms1,2,3. Nearly 20% of responders developed PTSD, and 10% continue to suffer from the disorder1,4. The most prevalent neurocognitive condition is mild cognitive impairment (MCI), which is characterized by declines in memory, learning, concentration, and decision-making that are not yet sufficient to cause functional limitations5. Critically, systematic reviews have identified consistent associations between PTSD and both neurocognitive dysfunction6 and dementia7 in cohorts of veterans and Holocaust survivors. In our WTC cohort, we observed a 2.67-fold increase in the incidence of MCI among responders with PTSD two decades after exposure8. Given this association, this paper uses proteomics analysis to undertake an in-depth characterization of the pathophysiology of MCI, PTSD, and their co-occurrence.

Proteomics is a promising strategy for characterizing the biological signatures of disorders, and it has been facilitated by the emergence of high-throughput technologies9. Proteins execute functions within cells and mediate communication between them, and thus are potentially involved in pathological processes underpinning PTSD and MCI. Proteomics, therefore, aims to capture the dynamics of protein expression and detail protein interactions within a cell10, which is important when trying to elucidate cellular adaptation to environmental signals and cellular aspects of disease processes11. Proteomics offers a different level of understanding of these processes compared with genomics and transcriptomics because proteins undergo post-translational modifications (e.g., phosphorylation) that are essential for protein function; as a result, a single gene can encode different protein species10, which can form protein complexes that determine function11.

Existing work on proteomics has identified biomarkers that are altered both in individuals with PTSD and in those with MCI. For example, PTSD has been linked to alterations of serum proteins such as glial fibrillary acidic protein (GFAP), vascular endothelial growth factor (VEGF)12, β-amyloid13, and C-reactive protein (CRP)14. Similarly, MCI was associated with changes to VEGF, CRP, and cortistatin (CORT), among others15. Co-occurring PTSD and MCI has been examined in only one molecular study, a mouse model in which loss of the FMN2 gene was associated with both PTSD-like phenotypes (i.e., fear extinction) and age-accelerated memory impairment16. However, to our knowledge, no studies have examined the extent to which protein signatures for PTSD, MCI, and their comorbidity differ in vivo in humans. This is important because of known interspecies variability and differences in proteomics17.

This study aims to fill the gap in molecular studies of PTSD and MCI by profiling a large set of proteins (k = 276) with known involvement in related processes. By comparing patients with PTSD, MCI, and comorbid PTSD–MCI with unaffected controls, we sought to determine whether markers of neurodevelopmental processes, cellular regulation, immunological function, cardiovascular disease, inflammatory processes, and neurological diseases are linked to PTSD and MCI18,19,20. First, we hypothesized that alterations in these processes reflect a combination of proteomic profiles that are observed in PTSD, MCI, and comorbid PTSD–MCI but not in unaffected individuals. Second, we constructed multiprotein composite scores and examined their associations with PTSD and MCI symptom severity.

Methods

Participants

Participants were recruited through the Stony Brook WTC Health Program21. This study was approved by the Stony Brook University IRB. Written informed consent was obtained. The analysis focused on a subsample of male responders who completed their annual monitoring visit in 2019. We studied only male responders because <10% of the Stony Brook cohort is female, and women show notably different protein expression patterns from men22. Responders with a history of medical or neurodegenerative conditions, brain tumors, cancers, or cerebrovascular conditions were ineligible for the study.

Clinical measures and classification

Probable PTSD was measured with the Posttraumatic Stress Disorder Checklist-Specific Version (PCL-17)23, a 17-item self-report questionnaire modified to assess the severity of WTC-related DSM-IV PTSD symptoms over the past month on a scale of 1 (never bothered by) to 5 (extremely bothered by) (Cronbach α = 0.96). Probable PTSD was operationalized by a PCL total score >44. The unaffected sample was asymptomatic (PCL score <22).

MCI was measured using the Montreal Cognitive Assessment (MoCA), a widely used objective multidomain test24. A conservative cutoff of <22 was applied to reduce misclassification. Normal cognitive functioning was defined as MoCA >26 consistent with testing guidelines25. Unaffected controls (PCL <22 and MoCA >26) were subject to an additional medical record review to rule out responders with a clinical history of PTSD and related disorders.
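The thresholds above can be summarized as a small classification rule. The following is a minimal sketch, not the study's actual code. The function name `classify` and the group labels are hypothetical. The text does not state the cognitive criterion for the PTSD-only group or the PTSD criterion for the MCI-only group, so the sketch assumes that single-condition cases meet the control threshold on the other measure.

```python
from typing import Optional

PCL_PTSD = 44          # probable PTSD: PCL total score > 44
PCL_ASYMPTOMATIC = 22  # asymptomatic: PCL total score < 22
MOCA_MCI = 22          # MCI: MoCA < 22 (conservative cutoff)
MOCA_NORMAL = 26       # normal cognition: MoCA > 26


def classify(pcl: int, moca: int) -> Optional[str]:
    """Assign a study group from PCL and MoCA scores, or None if ineligible."""
    has_ptsd = pcl > PCL_PTSD
    no_ptsd = pcl < PCL_ASYMPTOMATIC
    has_mci = moca < MOCA_MCI
    normal_cognition = moca > MOCA_NORMAL

    if has_ptsd and has_mci:
        return "PTSD-MCI"
    # Assumption: PTSD-only cases have normal cognition (MoCA > 26).
    if has_ptsd and normal_cognition:
        return "PTSD only"
    # Assumption: MCI-only cases are asymptomatic for PTSD (PCL < 22).
    if has_mci and no_ptsd:
        return "MCI only"
    if no_ptsd and normal_cognition:
        return "control"
    # Intermediate scores fall into none of the four groups.
    return None
```

Under this rule, responders with intermediate scores (for example, a PCL of 30 or a MoCA of 24) belong to none of the four groups. The additional medical record review applied to controls is not modeled here.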

The final sample (N = 181) included 34 responders with comorbid PTSD–MCI, 39 with PTSD only, 27 with MCI only, and 81 unaffected controls.

Proteomics profiling

Protein expression in plasma was profiled using the Olink Proseek Multiplex Platform. The Olink multiplex immunoassay was designed to provide an ultrasensitive, reproducible, and highly multiplexed method for measuring protein expression. The measurement was based on state-of-the-art Proximity Extension Assay (PEA) technology26. More details are available online (https://www.olink.com). Three commercial Olink panels were profiled for each participant: the Neurology, Neuro Exploratory, and Cardiovascular II (CVII) panels. Thus, 276 proteins (92 proteins per panel) were targeted, covering processes related to neurological diseases, cellular regulation, immunology, cardiovascular function, inflammation, development, and metabolism.

Proteomics data preprocessing

A number of internal and external controls were added to the plasma samples for quality control to monitor protein–antibody reactions, the DNA extension step, and detection quality of the qPCR, as well as to estimate the background signal and calculate the limit of detection (LOD) for the Olink panels. Protein values below the LOD were imputed with the LOD27. Protein concentration was represented in arbitrary units on a log2 scale and termed Normalized Protein eXpression (NPX), i.e., a difference of one NPX corresponds to a doubling of protein concentration. The NPX value represents a relative quantification, so the data for a specific protein can be compared across different samples. Reference (bridging) samples run on plates from different batches were included for batch-effect correction. The adjustment factor at the protein level for each batch was calculated as the median NPX of the bridging samples and subtracted from the NPX values of each sample. Batch-corrected log-transformed NPX was used in subsequent analyses (termed normalized NPX). We assessed the reproducibility of the bridging samples using Pearson correlation. Supplementary Figure 1 shows the high reproducibility of the Olink panels across six representative sets of technical duplicates, with a mean correlation r = 0.97.
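The preprocessing steps described above (LOD imputation, median-based batch adjustment, and the log2 interpretation of NPX) can be illustrated for a single protein in a single batch. This is a hedged sketch under stated assumptions rather than the pipeline used in the study: the function names, the dictionary layout, and the sample identifiers are all hypothetical.

```python
from statistics import median
from typing import Dict, Iterable, List


def impute_lod(npx_values: Iterable[float], lod: float) -> List[float]:
    """Replace values below the limit of detection with the LOD itself."""
    return [max(value, lod) for value in npx_values]


def batch_adjust(npx_by_sample: Dict[str, float],
                 bridging_ids: Iterable[str]) -> Dict[str, float]:
    """Subtract the batch adjustment factor from every sample in the batch.

    The factor is the median NPX of the bridging samples for this protein
    in this batch, following the description in the text.
    """
    factor = median(npx_by_sample[sample_id] for sample_id in bridging_ids)
    return {sample_id: value - factor
            for sample_id, value in npx_by_sample.items()}


def fold_change(delta_npx: float) -> float:
    """Convert an NPX difference (log2 scale) into a concentration ratio."""
    return 2.0 ** delta_npx
```

For example, with hypothetical bridging samples at 3.0 and 5.0 NPX, the adjustment factor is 4.0, so a study sample measured at 6.0 NPX becomes 2.0 after correction. `fold_change(1.0)` returns 2.0, matching the statement that one NPX corresponds to a doubling of concentration.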

Differential proteomics analysis

To assess associations of PTSD and MCI with protein regulation, differential analyses were carried out using a linear model with normalized NPX as the dependent variable and case/control status as the independent variable, adjusting for age and race, on subsets of (a) 34 PTSD–MCI cases versus 81 unaffected controls, (b) 39 PTSD-only cases versus 81 unaffected controls, and (c) 27 MCI-only cases versus 81 unaffected controls. Statistically significant proteins were identified at P < 0.05, as well as at false discovery rate (FDR) < 0.1 within each panel28. To assess the consistency of the findings, a Monte-Carlo experiment was conducted by randomly partitioning the data into 50% discovery and 50% replication subsamples. We considered a protein replicated if it was significant at P < 0.10 in both the discovery and replication subsamples and had effect sizes in the same direction. The random partitioning was repeated 100 times, and the number of times each protein was replicated was recorded. The correlation between the estimated beta coefficients of all proteins for case/control status across the three subset analyses was assessed using Pearson correlation coefficients. The overlap between the top proteins identified from each subset analysis was compared via a Venn diagram. The top proteins identified from this study were compared with recent omics studies of PTSD and Alzheimer’s disease (AD).
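Two pieces of the procedure above lend themselves to a short illustration: the FDR adjustment applied within each panel and the replication criterion used in the Monte-Carlo experiment. The text does not name the FDR procedure, so the sketch below assumes Benjamini-Hochberg; the function names are hypothetical, and the P values and beta coefficients are taken as inputs rather than estimated from a linear model.

```python
from typing import List, Sequence, Tuple


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values (assumed FDR method)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def is_replicated(discovery: Tuple[float, float],
                  replication: Tuple[float, float],
                  alpha: float = 0.10) -> bool:
    """Apply the replication criterion to one random partition.

    Each argument is a (beta, p_value) pair for the same protein. The
    protein counts as replicated when both halves reach P < alpha and the
    two effect sizes share the same sign.
    """
    beta_d, p_d = discovery
    beta_r, p_r = replication
    return p_d < alpha and p_r < alpha and beta_d * beta_r > 0


def replication_rate(partitions: Sequence[Tuple[Tuple[float, float],
                                                Tuple[float, float]]]) -> float:
    """Fraction of random partitions in which the protein was replicated."""
    hits = sum(is_replicated(d, r) for d, r in partitions)
    return hits / len(partitions)
```

In the study design, `benjamini_hochberg` would be applied separately to the 92 P values of each panel, and `replication_rate` would be computed per protein over the 100 random partitions.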

Disease-burden analysis

Among the proteins identified at FDR < 0.1 from the PTSD–MCI subset analyses, three competing models were fitted to ascertain which of the following best fit the protein-regulatory pattern: H1, the protein expression of the PTSD-only subgroup was intermediary between PTSD–MCI and control (i.e., Control < PTSD only < PTSD–MCI or Control > PTSD only > PTSD–MCI); H2, the protein expression of the PTSD-only subgroup was similar to the PTSD–MCI subgroup (i.e., Control ≠ PTSD only = PTSD–MCI); or H3, the protein expression of the PTSD-only subgroup was similar to the unaffected controls (i.e., Control = PTSD only ≠ PTSD–MCI). For model H1, a linear model was fitted with the subgroup coded as an ordinal predictor (1 = control, 2 = PTSD only, 3 = PTSD–MCI). For model H2, a linear model was fitted with the subgroup coded as a binary predictor (0 = control, 1 = PTSD only or PTSD–MCI). For model H3, a linear model was fitted with the subgroup coded as a binary predictor (0 = control or PTSD only, 1 = PTSD–MCI). All models were adjusted for age and race. The Bayesian Information Criterion (BIC) score was computed, and the model with the smallest BIC score was selected as the best-fitting model. Analyses were repeated by replacing the PTSD-only subgroup with the MCI-only subgroup. Proteins for which model H1 was the best-fitting model can be regarded as candidate biomarkers for disease burden characterized by co-occurrence of PTSD–MCI.
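The model comparison above can be sketched for a single protein as follows. This is a simplified illustration, not the authors' implementation: it omits the age and race covariates, uses the Gaussian form BIC = n ln(RSS/n) + k ln(n), and the names `CODINGS`, `residual_sum_of_squares`, `bic`, and `best_model` are hypothetical. The group label "single" stands for whichever single-condition subgroup (PTSD only or MCI only) is being compared.

```python
import math
from typing import Dict, Sequence, Tuple

# Predictor codings for the three competing models, as described in the text.
CODINGS: Dict[str, Dict[str, int]] = {
    "H1": {"control": 1, "single": 2, "comorbid": 3},  # ordinal
    "H2": {"control": 0, "single": 1, "comorbid": 1},  # single = comorbid
    "H3": {"control": 0, "single": 0, "comorbid": 1},  # single = control
}


def residual_sum_of_squares(x: Sequence[float], y: Sequence[float]) -> float:
    """RSS of a one-predictor least-squares fit (covariates omitted)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def bic(rss: float, n: int, n_params: int = 3) -> float:
    """Gaussian BIC; parameters are intercept, slope, and error variance."""
    return n * math.log(rss / n) + n_params * math.log(n)


def best_model(groups: Sequence[str],
               npx: Sequence[float]) -> Tuple[str, Dict[str, float]]:
    """Return the model with the smallest BIC and all three BIC scores."""
    n = len(npx)
    scores = {}
    for name, coding in CODINGS.items():
        predictor = [coding[group] for group in groups]
        scores[name] = bic(residual_sum_of_squares(predictor, npx), n)
    return min(scores, key=scores.get), scores
```

Because the three models have the same number of parameters, the BIC ranking here reduces to a comparison of residual sums of squares. With covariates included, the same ranking logic applies to the full linear models.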

Multiprotein composite score

To evaluate the utility of proteomics in classifying cases and controls, we applied the elastic net algorithm29. For each case/control subset, the top-ranking proteins by P values from the differential expression analysis were used as candidate feature sets. Leave-one-out (LOO) cross-validation prediction was used to evaluate model performance, i.e., the model was trained on N-1 samples, and used to predict the score in the left-out test sample, and the process was cycled through N samples. Within each training set, the optimal tuning parameters were determined via a fivefold cross-validation. The area under the ROC curve (AUC) was used as a metric for performance evaluation. Pearson correlation was calculated to estimate the association between the multiprotein composite scores and PTSD and MCI symptom-severity score.
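The evaluation scheme above (leave-one-out prediction followed by AUC) can be sketched independently of the model being evaluated. The code below is an illustration under stated assumptions: `mean_difference_score` is a deliberately simple stand-in scorer and is not an elastic net, and the nested fivefold tuning step is not shown. In the study's design, the scoring callable would fit an elastic net on the training rows, with tuning parameters chosen by fivefold cross-validation inside each training set. All function names are hypothetical.

```python
from typing import Callable, List, Sequence

Row = Sequence[float]
Scorer = Callable[[List[Row], List[int], Row], float]


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a case outscores a control (ties = 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def leave_one_out_scores(X: List[Row], y: List[int],
                         fit_predict: Scorer) -> List[float]:
    """Score each sample with a model trained on the other N-1 samples."""
    scores = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        scores.append(fit_predict(train_X, train_y, X[i]))
    return scores


def mean_difference_score(train_X: List[Row], train_y: List[int],
                          test_row: Row) -> float:
    """Stand-in scorer (NOT elastic net): project the held-out sample onto
    the case-minus-control mean difference of the training data."""
    cases = [row for row, label in zip(train_X, train_y) if label == 1]
    controls = [row for row, label in zip(train_X, train_y) if label == 0]
    weights = [
        sum(r[j] for r in cases) / len(cases)
        - sum(r[j] for r in controls) / len(controls)
        for j in range(len(test_row))
    ]
    return sum(w * x for w, x in zip(weights, test_row))
```

The key design point is that the held-out sample never contributes to the model that scores it, so the resulting AUC is computed entirely from out-of-sample predictions. Candidate features chosen from the full sample, as noted later in the limitations, are outside this loop.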

Results

Participant characteristics

The overall average age was 55.1 (SD = 7.78), and the mean ages of the four groups were similar. The majority of the sample was Caucasian, and no significant racial/ethnic differences among cases and controls were observed (Table 1).

Table 1 Clinical characteristics of study samples.

Differential protein analysis associated with PTSD and MCI

Subset analysis of the comorbid PTSD–MCI case group versus controls identified 16 Olink proteins at P < 0.05, of which six attained FDR < 0.1. Eleven of the original 16 proteins were upregulated in cases. The six proteins significant at FDR < 0.1 were NCAN, BCAN, CTSS, MSR1, MDGA1, and CPA2; all six proteins were replicated in >50% of the partitions in the Monte-Carlo experiment. Subset analysis of PTSD-only cases versus controls identified 24 proteins at P < 0.05, of which two attained FDR < 0.1. In total, 22 of these 24 proteins were upregulated in cases. The two proteins significant at FDR < 0.1 were CD302 and FLRT2; both were replicated in >70% of the partitions in the Monte-Carlo experiment. Finally, subset analysis of MCI-only cases versus controls identified 20 proteins at P < 0.05, of which only one attained FDR < 0.1. Seven of these 20 proteins were upregulated in cases. The protein significant at FDR < 0.1 was PVR, which was replicated in >80% of the partitions in the Monte-Carlo experiment. Altogether, 50 unique proteins were obtained from the combined lists in subset analyses (Table 2). Several identified proteins had been previously implicated in other omics studies of PTSD and AD. Additional details on the comparison of these proteins with recent omics studies of PTSD and AD are provided in Supplementary Text and Supplementary Tables 2–4. The Venn diagram comparing the overlap between the top proteins in subset analyses (Fig. 1) showed that CTSS was the only protein identified by all three subset analyses at P < 0.05, whereas EFNA4 was in common between the PTSD–MCI and PTSD-only analyses; BCAN, MDGA1, CPA2, and EPHA10 were in common between the PTSD–MCI and MCI-only analyses; and PVR, CD200, and ATP6V1F were in common between the PTSD-only and MCI-only analyses.

Table 2 List of proteins differentially expressed at P < 0.05 from the subset analyses.
Fig. 1: Overlap between the top proteins at P < 0.05 from the three subset analyses.

PTSD only (24 proteins), MCI only (20 proteins), and PTSD–MCI (16 proteins).
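The count of 50 unique proteins follows from the three list sizes and the overlaps reported for Fig. 1. The short check below makes that arithmetic explicit; it uses only the numbers and protein names given in the text, and the function name `unique_count` is hypothetical.

```python
from typing import Dict, Set, Tuple

# Number of proteins at P < 0.05 in each subset analysis.
list_sizes: Dict[str, int] = {"PTSD-MCI": 16, "PTSD only": 24, "MCI only": 20}

# Proteins shared by exactly two analyses.
shared_by_two: Dict[Tuple[str, str], Set[str]] = {
    ("PTSD-MCI", "PTSD only"): {"EFNA4"},
    ("PTSD-MCI", "MCI only"): {"BCAN", "MDGA1", "CPA2", "EPHA10"},
    ("PTSD only", "MCI only"): {"PVR", "CD200", "ATP6V1F"},
}

# Proteins shared by all three analyses.
shared_by_three: Set[str] = {"CTSS"}


def unique_count(sizes: Dict[str, int],
                 pairs: Dict[Tuple[str, str], Set[str]],
                 triple: Set[str]) -> int:
    """Count distinct proteins across the three lists.

    A protein in exactly two lists is counted twice in the raw total, and a
    protein in all three lists is counted three times, so the duplicates
    are removed accordingly.
    """
    raw_total = sum(sizes.values())
    in_two = sum(len(names) for names in pairs.values())
    return raw_total - in_two - 2 * len(triple)


n_unique = unique_count(list_sizes, shared_by_two, shared_by_three)
```

Here the raw total is 60, eight proteins appear in exactly two lists, and one protein appears in all three, so `n_unique` evaluates to 60 - 8 - 2 = 50, in agreement with Table 2.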

Among 50 unique proteins identified above, 39/50 showed consistent sign/direction in the estimated beta coefficients across the three subset analyses. The remaining 11 proteins were not among the proteins shared by any two subset comparisons. Across all 276 proteins examined in these analyses, the estimated beta coefficients for PTSD only versus controls and MCI only versus controls were moderately correlated (r = 0.345, P < 0.05) as shown in Fig. 2, suggesting that shared biological mechanisms may be involved in the two disorders.

Fig. 2: Pairwise correlations between the estimated beta coefficients for case/control across the three subset analyses, namely PTSD–MCI vs control, PTSD-only vs control, and MCI-only vs control.

The lower triangular panel shows the scatter plots, the upper triangular panel shows the corresponding Pearson correlation coefficients, and the diagonal panel shows the distributions of the estimated beta coefficients.

PTSD–MCI-associated proteins linked to disease burden

Among the six proteins significant at FDR < 0.1 in the PTSD–MCI versus unaffected control analysis shown in Table 2, BCAN and NCAN showed monotonically decreasing protein expression patterns, whereas CTSS, MSR1, MDGA1, and CPA2 showed monotonically increasing protein expression patterns (Supplementary Fig. 2). The BIC scores are reported in Supplementary Table 5. In the comparison involving the PTSD-only subgroup, all the proteins except NCAN achieved the lowest BIC scores in the H1 model (i.e., the protein expression of the PTSD-only subgroup was intermediary between PTSD–MCI and control). For NCAN, the BIC scores of the H1 and H3 (i.e., control = PTSD only ≠ PTSD–MCI) models were comparable, indicating that both models fit NCAN equally well. Together, these findings suggest that these proteins are associated with the disease burden of co-occurring PTSD and MCI compared with PTSD only. In the comparison involving the MCI-only subgroup, on the other hand, H1 was the best model only for NCAN. For BCAN, CTSS, MDGA1, and CPA2, the BIC scores for the H2 model (i.e., control ≠ MCI only = PTSD–MCI) were the lowest, indicating that the MCI-only subgroup was similar to PTSD–MCI, whereas for MSR1, the MCI-only subgroup was similar to controls. These results suggest that the dysregulations of BCAN, CTSS, MDGA1, and CPA2 were most strongly associated with MCI.

Multiprotein composite score

The leave-one-out (LOO) cross-validation achieved AUC = 0.84 in PTSD–MCI classification (Table 3) using the top 37 proteins associated with PTSD–MCI at P < 0.1 listed in Supplementary Table 6 as candidate features. The AUC was lower (0.81) using the 16 proteins associated with PTSD–MCI at P < 0.05. Similarly, the LOO cross-validation achieved AUC = 0.83 and 0.84 in MCI-only classification using the 20 and 41 MCI-only associated proteins (P < 0.05 and P < 0.1), respectively. However, the LOO cross-validation achieved only AUC = 0.77 in PTSD-only classification (Table 3) using the 52 PTSD-only associated proteins at P < 0.1 listed in Supplementary Table 6. The AUC was lower (0.68) using the 24 PTSD-only associated proteins at P < 0.05 (Supplementary Table 7). In all three classification models, using all 276 proteins as candidate features achieved a lower AUC, suggesting that adding other protein signals may introduce noise (Supplementary Table 7). Taken together, the results from the multiprotein composite scores indicated that the panel of proteins included in this study had greater discriminative power for MCI than for PTSD.

Table 3 Leave-one-out cross-validation prediction performance on models trained on subsets of (a) PTSD–MCI, (b) PTSD only, and (c) MCI only versus controls.

Discussion

Prior studies have shown that chronic PTSD in the responders to the World Trade Center disaster is associated with systemic and neuropsychiatric conditions including MCI30,31. Furthermore, in some instances, we demonstrated not only that there was an association, but also that PTSD helps to mediate the development and chronicity of these conditions and may be linked to possible early dementia32. The current study was the largest study to evaluate the molecular link between PTSD and MCI in the same cohort. It profiled a large set of proteins involved in a number of neurobiological processes, including neurological diseases, cellular regulation, immunology, cardiovascular function, inflammation, development, and metabolism. In this study, we systematically assessed changes in the proteome of WTC responders suffering from PTSD with and without comorbid MCI nearly two decades after the traumatic event, in order to identify biomarkers that could inform us about the biologic changes in our patients as well as the nature of the relationship between these conditions. We found that both MCI and PTSD were associated with serologic proteinopathy. The results also suggested that comorbid PTSD–MCI was likely a more severe form of PTSD rather than a separate condition. Last, we found that protein dysregulation was more systematically associated with MCI. As such, the multiprotein composite score provided us with a novel method to characterize and monitor patients with both MCI and PTSD and, if confirmed in independent studies, may ultimately give us insights into potential novel therapeutic interventions.

We identified 16 proteins associated with PTSD–MCI at P < 0.05 (six at FDR < 0.1), 24 proteins associated with PTSD only (two at FDR < 0.1), and 20 proteins associated with MCI only (one at FDR < 0.1), resulting in a total of 50 unique proteins from the combined lists. It is important to note that protein expression in the blood does not represent protein production in any specific tissue, per se, but rather proteins secreted into the blood from multiple organs and tissues. This is in contrast to gene expression analysis that is derived from a specific tissue. Nonetheless, although overall comparison with recent omics studies in AD showed that most of the top genes identified in these studies did not overlap with our targeted panel of 276 proteins as described in Supplementary Text, there were some that did, as described below. Among these 50 proteins, only Cathepsin S (CTSS) was in common across the three subset analyses. Our analyses identified positive correlations between the estimated beta coefficients across the three subset analyses (r = 0.35–0.45), suggesting shared biological mechanisms across these two phenotypes. Notably, the gene encoding CTSS has been found to be upregulated in the discovery cohort of Dean Hammamieh33, and it plays an important role in antigen presentation and immune responses34. Single-nucleotide polymorphisms (SNPs) that map to the CTSS gene have been found to be associated with late-onset AD35. Other members of the Cathepsin family have also been implicated in AD (Cathepsins B and D)36,37 and schizophrenia (Cathepsin K)38. On the other hand, MAM domain-containing glycosylphosphatidylinositol anchor protein 1 (MDGA1) and ephrin type-A receptor 10 (EPHA10), which were identified in both the PTSD–MCI and MCI-only analyses, have been found to be associated with pathologic and clinical diagnoses of AD in the transcriptomes of postmortem brain39.
MDGA1 is implicated in the radial migration of cortical neurons of the neocortex40, whereas EPHA10 is involved in mobility in neuronal and epithelial cells and in memory formation41. Similarly, V-type proton ATPase subunit F (ATP6V1F) and OX-2 membrane glycoprotein (CD200), which were identified in both the PTSD-only and MCI-only analyses, have been found to be differentially expressed in the transcriptomes of peripheral blood cells of patients with PTSD33,42,43. Based on the transcriptome mega-analysis results of Breen Tylee43 (differentially expressed genes at P < 0.05 for each trauma-specific case–control cohort, as shown in Supplementary Table 2 of the Breen study), ATP6V1F and CD200 showed effect-size directions in transcriptomic regulation that were consistent with the proteomics results in our data. Specifically, ATP6V1F was downregulated in the gene expression of emergency-department trauma survivors42, consistent with the protein expression in our data. In addition, loss of function of ATP6V1F has been shown to be a potential enhancer of tau toxicity, a hallmark of AD44. CD200, in turn, was upregulated in childhood trauma and interpersonal trauma subgroups45, consistent with our proteomics data. CD200 expression was shown to be downregulated in the hippocampus and inferior temporal gyrus of AD patients46. The authors further showed that lower expression of the CD200 receptor was observed in microglia compared with blood-derived macrophages. Thus, we hypothesize that the upregulation of CD200 in the plasma samples of our study could be a consequence of cell migration to blood through the blood–brain barrier.

The top two proteins identified from analyses of PTSD–MCI versus controls, namely the neurocan (NCAN) and brevican (BCAN) core proteins, showed monotonically decreasing protein expression patterns across the PTSD-only and MCI-only subgroups, suggesting that these proteins are candidate biomarkers for disease burden characterized by co-occurrence of PTSD and MCI. Genetic variation in NCAN has been shown to be a common risk factor for bipolar disorder and schizophrenia47, as well as for MCI48. In addition, NCAN and BCAN are members of the chondroitin sulfate proteoglycan (CSPG) protein family, and CSPGs are implicated in neurodegenerative diseases49. Specifically, CSPGs have been shown to accumulate in senile plaques in the brains of patients with AD49, potentially suggesting that fewer CSPGs penetrate into the blood in AD. Together with previous epidemiologic findings that PTSD is associated with long-term cognitive decline30,50, this suggests that NCAN and BCAN may constitute novel biomarkers contributing to processes by which PTSD affects cognitive functioning.

The multiprotein composite scores based on the top PTSD–MCI and MCI-only associated proteins achieved high accuracy (AUC = 0.84) in PTSD–MCI and MCI-only classification, respectively. On the other hand, the multiprotein composite score based on the top PTSD-only associated proteins achieved AUC = 0.77 in PTSD-only classification. These results suggested that the proteins included in this study have greater discriminative power for MCI than for PTSD. We also found a robust association between the composite score and PTSD and MCI symptom severity. This suggested that the current multiprotein composite score may be further refined into a useful index that aids in classification.

Strengths and limitations

This study has several strengths, including a large-scale high-precision multiplexed proteomic analysis of a large number of neurological, inflammatory, and immune-related proteins using validated panels, and a common trauma in all participants including controls. Nonetheless, our findings must be considered in the context of several limitations. First, our study is cross-sectional, which can establish concurrent associations between protein expression, PTSD, and MCI. However, the direction of the associations cannot be determined. Longitudinal studies of linkages between change in symptom severity and change in protein expression are needed to determine the direction of the effects we observed. Second, potential confounders, such as the level of trauma exposure and comorbid medical conditions, were not considered. Third, the multiprotein composite score was constructed based on the proteins identified from the same study samples. Although we used a LOO cross-validation prediction scheme to reduce the bias in model evaluation, it is important to replicate the composite score in an independent validation cohort. Fourth, although our study covered a wide spectrum of proteins, it is a targeted proteomics study and may therefore miss changes in proteins that were unobserved in this study. In addition, the multiprotein composite score indicated that the current proteomics panel can discriminate MCI from control at high accuracy; however, the accuracy is lower in PTSD classification. It remains uncertain whether PTSD classification accuracy would be improved by surveying other proteins. Mass spectrometry is a competing platform for more comprehensive and hypothesis-free protein coverage. However, absent a targeted hypothesis, this platform requires a much larger sample size to rule out the greater numbers of false positives.

Conclusion

To conclude, the current study identified several novel protein biomarkers for PTSD, MCI, and their co-occurrence. Many of these proteins have previously been implicated in other neurological and psychiatric disorders, in particular AD and schizophrenia. We also found substantial similarities in the profiles of protein alterations in PTSD and MCI. This is consistent with the evidence of shared heritability and molecular similarities across common brain disorders51. Our study further derived a multiprotein composite score that, upon replication and pending further refinement, could support development of a practical, plasma-based assay to aid in classifying PTSD, MCI, and comorbid PTSD–MCI. Ultimately, the composite score could potentially be used to monitor patients longitudinally.