Background

Nearly one-third of veterans returning from Operation Desert Storm/Shield during the 1990–1991 Persian Gulf War reported a myriad of symptoms shortly after returning home, including cognitive dysfunction, skin rashes, musculoskeletal discomfort, and fatigue [1]. This constellation of symptoms later came to be known as Gulf War Illness (GWI). To date, the underlying mechanisms of GWI disease activity have yet to be fully elucidated, and diagnosis and treatment are based on symptomatic presentation. One of the most common symptoms of GWI, impaired cognitive function, has been linked to chronic inflammation, mitochondrial dysfunction, and elevated oxidative stress [2]. The disruption of multiple systemic functions results in a complex clinical presentation of the illness. As a result, accepted biomarkers and mechanisms for treatment are difficult to identify; therefore, afflicted veterans are commonly diagnosed based on psychological or psychiatric evaluations [3]. One confounding factor tied to the illness is the presence of comorbidities, such as post-traumatic stress disorder (PTSD), that may influence GWI pathophysiology.

PTSD is a psychological condition that arises from exposure to trauma which results in symptoms including vivid flashbacks, frightening thoughts, avoidance of certain associated places or events, and a constant state of hyperarousal [4, 5]. These symptoms of PTSD are accompanied by negative alterations in cognition and mood as defined by the Diagnostic and Statistical Manual of Mental Disorders (DSM)-5/International Classification of Diseases-10 [6]. PTSD may also contribute to negative physical health states/symptoms such as a marked increase of musculoskeletal pain and cardio-respiratory complications [7].

Many veterans with wartime exposures report symptoms of PTSD such as sudden and unexpected panic, disturbance of sleep, intrusive/unwanted memories and associations caused by the event, alterations in mood, general fatigue, loss of concentration, anhedonia, and a reluctance to engage in social behavior/relationships [8, 9]. The point prevalence of combat-related PTSD reported across studies of US combat veterans ranges from about 2% to 17%, and lifetime prevalence from about 6% to 31%, with rates among veterans of the Vietnam War ranging from 2.2% to 15.2% and among Persian Gulf War veterans from 1.9% to 13.2% [10]. However, when considering comorbidity, Weiner et al. [11] found PTSD prevalence rates in Gulf War veterans of 3.1% in the absence of GWI criteria and 34.6% in those meeting GWI criteria. A third, intermediate group endorsing some combination of GWI symptoms but not meeting full GWI criteria had a 17.8% prevalence rate of PTSD, suggesting a relationship between GWI symptoms and PTSD. Investigations into the effect of PTSD on GWI symptoms in both male and female cohorts have also found an increasing severity of GWI symptoms with increasing post-traumatic stress symptoms, while still finding GWI symptoms in the absence of any post-traumatic stress symptoms [12, 13]. Despite the apparent relationship between PTSD and increased GWI severity, research efforts evaluating the impact of PTSD on GWI pathophysiology have been limited.

Previous studies examining GWI and PTSD indicated that the symptomology of the two conditions overlaps strongly. This overlap complicates the ability to disentangle and thus accurately diagnose GWI when comorbid with PTSD. Such complications in diagnostics may affect the potential treatment approaches clinicians employ and therefore have a direct impact on patient care and recovery. While GWI and PTSD are distinct illnesses, the complex interactions arising from overlapping symptomology raise the question of whether PTSD may be modulating the nature and severity of symptoms reported by veterans with confirmed GWI. Indeed, previous research has shown that GWI veterans with a confirmed PTSD diagnosis have a significantly higher likelihood of reporting abdominal pain, musculoskeletal pain, joint pain, and back pain, as well as of showing modulated vagal and heart-rate dynamics [14,15,16,17]. It has also been shown that PTSD may suppress innate immune activity, resulting in compromised inflammatory activity [18]. Immunological dysfunction in relation to PTSD has specifically been associated with changes in circulating cytokine levels [interleukin (IL)-1β, IL-4, IL-6, IL-8, IL-10, interferon (IFN)-γ and tumor necrosis factor (TNF)-α] and altered immune cell activity [19,20,21,22,23]. Impaired immune function has also been observed in GWI, with exercise accentuating the altered immune signature [24,25,26,27,28,29]. Therefore, identifying GWI veterans with and without comorbid PTSD can produce subgroups with distinct phenotypes due to specific changes in cytokine and immune cell profiles.

Of particular relevance to the study of GWI and PTSD comorbidity is the recent work showing that genetic variants in the PON1 gene are associated with an increased rate of developing GWI after chemical exposure [30]. PON1 encodes serum paraoxonase 1 (PON-1), an enzyme that has multifunctional roles in various biochemical pathways such as protection against oxidative damage and lipid peroxidation, contribution to innate immunity, detoxification of reactive molecules, bioactivation of drugs, modulation of endoplasmic reticulum stress and regulation of cell proliferation/apoptosis. The variant PON1 found in GWI leads to a reduced enzymatic ability to hydrolyze organophosphate compounds, such as the nerve agent sarin and pesticide chlorpyrifos [30], and the pyridostigmine bromide (PB) pills [31] which have been associated with GWI. While the activities and isoforms of PON-1 do not appear to have any relationship to deployment in Gulf War era veterans [32], lowered PON-1 activities do appear to be a key component in the ongoing nitric oxide stress processes that accompany affective disorders, general anxiety and schizophrenia [33]. In line with this, compared with non-trauma-exposed controls, those with PTSD appear to have decreased paraoxonase levels [34, 35]. However, this does not appear to be the case when compared with trauma-exposed controls who did not develop PTSD [35], as not all individuals exposed to the same traumatic event develop PTSD; pre-existing genetic, physiological, psychological and/or environmental factors make some individuals more vulnerable than others. Despite this, the high prevalence rate of PTSD noted in GWI veterans [11] and the involvement of PON-1 in both GWI and PTSD suggest a comorbid interaction in their pathophysiology that needs to be addressed.

These overlaps between the symptomology and pathophysiology of GWI and PTSD highlight the importance of understanding the underlying mechanisms mediating both illnesses and the implications of PTSD for GWI presentation and progression. Specifically, research has shown that both GWI and PTSD are associated with immune dysfunction, and teasing out contributions from PTSD can provide an improved understanding of GWI pathophysiology and progression in the absence and presence of comorbidity. This may help to stratify GWI into distinctive subtypes to improve diagnosis and target treatment more effectively, leading to improved care and better quality of life for ailing veterans. This research endeavor builds on our previous research, which involves the mapping of complex inflammatory mechanisms associated with GWI. Specifically, we aimed to achieve an understanding of the unique symptoms, cytokines, and signaling pathways characterizing the subgroups of GWI as well as distinguishing any compounding dysregulation caused by PTSD.

Methods

Ethics approval and consent to participate

Participants were recruited through several studies carried out at multiple institutions. Participants recruited via the Miami Veterans Affairs Medical Center (MVAMC) signed an informed consent approved by the Institutional Review Board (IRB) of the MVAMC (Protocol numbers 4987.69 and 4987.75). Participants recruited via Boston University School of Public Health signed an informed consent approved by Boston University Medical Campus IRB. Ethics review and approval for data analysis was obtained from the IRB of Nova Southeastern University (NSU) and approved by the United States Army Medical Research and Materiel Command (USAMRMC) Human Research Protection Office (HRPO).

Cohort

Participants were recruited in three cohorts through both the MVAMC (2 cohorts) and through Boston University (1 cohort). The following describes the cohorts recruited from each site.

Miami cohorts

Two cohorts were recruited via the Miami Veterans Affairs Medical Center (MVAMC) under a Veterans Affairs Merit award [GWI: n = 27, healthy controls (HC): n = 25] and a Department of Defense (DOD) GWI Research Program award (GWI: n = 24, HC: n = 19), both of which compared male veterans meeting criteria for GWI to HC. Inclusion criteria for GWI participants were derived from Fukuda et al. [36] and consisted of identifying veterans deployed to wartime theater between August 8th, 1990, and July 31st, 1991, with one or more symptoms present for six months from at least two of the following: fatigue, mood and cognitive complaints, and musculoskeletal complaints. Participants were in good health prior to 1990 and had no current exclusionary diagnoses defined by Reeves et al. [37]. This includes exclusion of major dementias of any type and alcoholism or drug abuse, medical conditions including organ failure, rheumatologic disorders, and use of medications that affect immune function, such as steroids or immunosuppressants. Collins et al. [38] support the use of the Fukuda definition in GWI [36]. HC participants consisted of GW era veterans, both deployed and non-deployed, self-defined as healthy with no exclusionary diagnoses and sedentary (no regular exercise program, sedentary employment). All participants in the Miami cohort were subjected to a standard maximal graded exercise test to stimulate their immune response, and blood samples were collected at Timepoint 0, at rest (T0); Timepoint 1, at peak exercise (T1); and Timepoint 2, 4 h after peak exercise (T2) [25].

Boston cohort

The Boston cohort was recruited from the DOD funded Boston GWI Consortium (GWIC) [26] which is now included in the DOD funded Boston Biorepository, Recruitment, and Integrative Network for GWI (W81XWH-18-1-0549) [39]. The GWIC cohort recruited 269 total participants, of whom data from 25 male GW veterans were shared for this study. All GWIC participants served in the 1991 Gulf War. GWI cases and controls were determined using the Kansas GWI criteria [40]. The Kansas GWI case criteria required two mild or one moderate-severe chronic symptom in at least three of six symptom domains, which included fatigue/sleep, pain, neurologic/cognitive/mood, gastrointestinal, respiratory, and skin rash. This definition also excludes veterans who had one or more medical conditions that could account for their symptoms. For this study veterans were excluded if they endorsed any of the following medical conditions: diabetes, heart disease, stroke, lupus, multiple sclerosis, rheumatoid arthritis, Parkinson's disease, amyotrophic lateral sclerosis, seizure disorders, Alzheimer's disease, cancer other than skin cancer, liver disease, kidney disease, schizophrenia, or bipolar disorder. Those not meeting GWI case criteria or exclusionary criteria were considered healthy controls.

Subgrouping procedure

Miami cohort

The Davidson Trauma Scale (DTS) [41] is a 17-item self-report questionnaire of post-traumatic stress symptoms corresponding to the DSM-IV symptoms of PTSD. A total score, reflecting both frequency and severity ratings for all 17 items and separate ratings for the total frequency and total severity of all 17 items, was used to interpret PTSD probability. The three clusters (Intrusiveness, Avoidance/Numbing, and Hyperarousal) were also scored separately. Following McDonald et al. [42], we applied a simple cut score at 70 for the total DTS score as it has been shown to offer optimal diagnostic accuracy, correctly classifying 90% of cases and providing an accurate estimate of PTSD population prevalence (12–13%). GWI subjects with DTS scores of 70 and above were considered probable PTSD positive and designated GWI with high probability of PTSD symptoms (GWIH), while those below 70 were considered probable PTSD negative and designated GWI with low probability of PTSD symptoms (GWIL).
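The cut-score rule described above can be sketched in a few lines. This is only an illustration: the function name, participant identifiers, and scores are hypothetical, and the only value taken from the text is the cut score of 70 on the total DTS score.

```python
DTS_CUT_SCORE = 70  # total DTS cut score stated in the text


def assign_gwi_subgroup(dts_total):
    """Return 'GWIH' for a total DTS score of 70 or above, otherwise 'GWIL'."""
    return "GWIH" if dts_total >= DTS_CUT_SCORE else "GWIL"


# Hypothetical total scores for four GWI participants (not study data).
example_scores = {"P01": 92, "P02": 70, "P03": 69, "P04": 15}
subgroups = {pid: assign_gwi_subgroup(score) for pid, score in example_scores.items()}
print(subgroups)
```

Note that a score of exactly 70 falls in the GWIH group, matching the "70 and above" wording.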

Boston cohort

The Clinician-Administered PTSD Scale for DSM-5 (CAPS-5) is a structured clinical interview that includes 30 questions that assess current and lifetime PTSD diagnosis [43]. It covers the DSM-5 PTSD symptoms and requires identification of a traumatic event in relation to which symptoms are assessed, together with PTSD severity, dissociative symptoms, and impact on work and recreational functioning. The Structured Clinical Interview for DSM-5 (SCID-5) is a semi-structured interview for making the major DSM-5 diagnoses [44]. The SCID is broken down into separate modules corresponding to categories of diagnoses. A diagnosis of PTSD was made following the PTSD diagnostic algorithm and used to assign subjects to the GWIH and GWIL groups.

Symptom measures

All participants from both the Boston and Miami sites received a physical examination and medical history including the GWI symptom checklist as per the case definition. Symptom questionnaires included the Multidimensional Fatigue Inventory (MFI) [45], a 20-item self-report instrument designed to measure fatigue with five resulting composite scores, and the RAND Medical Outcomes Study 36-item short-form survey (SF-36) [46, 47], which assesses health-related quality of life with eight resulting composite scores. The PTSD status of the Miami cohort was evaluated with the DTS [41], a self-rating measurement of the frequency and severity of PTSD symptoms in three clusters: intrusion, avoidance, and hyperarousal. The Boston cohort received a clinical psychiatric interview that included the CAPS-5 [43] and the SCID [44] to identify exclusionary psychiatric diagnoses (bipolar disorder, schizophrenia) and to determine psychiatric diagnoses including PTSD.

Complete blood count (CBC)

Whole blood collected in K2-ethylenediaminetetraacetic acid (EDTA) blood tubes (B-D-Biosciences, San Jose, CA, USA) was measured using the Beckman Coulter LH500 (Beckman Coulter Inc, Brea, CA, USA) to obtain complete blood count with differentials.

Human cytokine analysis

Cytokine analysis was performed using Quansys chemiluminescent assays (Quansys Biosciences, Logan, UT, USA). The Quansys Imager, driven by an 8.4-megapixel Canon 20D digital SLR camera, supports 96-well plate-based chemiluminescent imaging. The Q-Plex™ Human Cytokine—Screen is a quantitative enzyme-linked immunosorbent assay (ELISA)-based test where distinct capture antibodies have been absorbed to each well of a 96-well plate in a defined array. Comparable custom 16- and 18-multiplex assays were used across the studies.

The 16-multiplex panel includes TNF-α, TNF-β, IL-1α, IL-1β, IL-6, IFN-γ, IL-12, IL-2, IL-15, IL-8, IL-5, IL-17, IL-23, IL-10, and IL-13 [48]. For the standard curves, we used the second-order (k = 2) polynomial regression model (parabolic curve), $Y = b_0 + b_1 X + b_2 X^2 + \dots + b_k X^k$, where $Y$ is the predicted outcome value for the polynomial model with regression coefficients $b_1$ to $b_k$ for each degree and y-intercept $b_0$. Quadruplicate determinations were made, i.e., each sample was run in duplicate in two separate assays.
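As a rough illustration of the polynomial standard-curve model, the sketch below fits the coefficients b0 to bk by ordinary least squares using the normal equations. It is not the assay software's implementation: the function names are hypothetical, and treating X as the known standard concentration and Y as the measured signal is an assumption, since the text does not state which is which.

```python
def fit_polynomial(xs, ys, degree=2):
    """Least-squares coefficients [b0, b1, ..., bk] of a degree-k polynomial."""
    n = degree + 1
    # Normal equations A b = r with A[i][j] = sum(x^(i+j)) and r[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    r = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        r[col], r[pivot] = r[pivot], r[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            r[row] -= factor * r[col]
    # Back substitution.
    b = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * b[j] for j in range(i + 1, n))
        b[i] = (r[i] - tail) / a[i][i]
    return b


def predict(b, x):
    """Evaluate Y = b0 + b1*X + ... + bk*X^k."""
    return sum(coef * x ** i for i, coef in enumerate(b))


# Hypothetical standards lying exactly on Y = 1 + 2X + 0.5X^2 (not assay data).
standards_x = [0.0, 1.0, 2.0, 3.0, 4.0]
standards_y = [1.0 + 2.0 * x + 0.5 * x ** 2 for x in standards_x]
coefficients = fit_polynomial(standards_x, standards_y, degree=2)
print(coefficients)
```

In practice the fitted curve would be inverted to read a sample concentration from its signal; that step is omitted here.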

The 18-multiplex panel includes TNF-α, TNF-β, IL-1α, IL-1β, IL-6, IFN-γ, IL-12, IL-2, IL-15, IL-8, IL-5, IL-17, IL-23, IL-10, IL-13, TNF-RI, and TNF-RII. Briefly, plasma samples were thawed at 4 °C overnight. Samples were plated in duplicate following the manufacturer's protocol. The plates were read at 270 s of exposure time using the Q-view Imager LS (Quansys Biosciences, Logan, UT, USA). Individual cytokine concentrations were obtained using the image analysis software Q-view (Quansys Biosciences, Logan, UT, USA). Sample concentrations were calculated from standard curves created by a five-parameter logistic regression with 1/y weighting. The average value from each duplicate was then used for subsequent analyses.
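For readers unfamiliar with the five-parameter logistic (5PL) curve, the sketch below shows one common parameterization, its closed-form inverse for reading a concentration from a signal, and a 1/y-weighted sum of squared residuals of the kind a weighted fit would minimize. The parameterization, parameter names, and example values are assumptions; the text does not specify the form used by the Q-view software, and no fitting routine is shown.

```python
def five_pl(x, a, b, c, d, g):
    """5PL response: a at zero dose, d at infinite dose, c scale, b slope, g asymmetry."""
    return d + (a - d) / (1.0 + (x / c) ** b) ** g


def inverse_five_pl(y, a, b, c, d, g):
    """Concentration x that yields response y (y must lie strictly between a and d)."""
    return c * (((a - d) / (y - d)) ** (1.0 / g) - 1.0) ** (1.0 / b)


def weighted_sse(xs, ys, params):
    """Sum of squared residuals with 1/y weights, the quantity a weighted fit minimizes."""
    return sum((y - five_pl(x, *params)) ** 2 / y for x, y in zip(xs, ys))


# Hypothetical curve parameters (a, b, c, d, g); not values from the study.
params = (0.1, 1.2, 50.0, 100.0, 0.8)
signal = five_pl(25.0, *params)
print(signal, inverse_five_pl(signal, *params))
```

The 1/y weights give low-signal standards more influence than an unweighted fit would, which is the usual motivation for weighting immunoassay curves.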

To account for 16-plex and 18-plex differences, data were separated by 16 and 18 plex and standardized by z-scoring. The two data sets were then combined and reverse z-scored using the mean and standard deviation of the 18-plex dataset to scale the 16-plex data to the 18-plex set. Following this the dataset was normalized using min–max normalization so the final data values ranged between 0 and 1.
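The harmonization steps just described can be sketched for a single cytokine as follows. This is a minimal illustration under stated assumptions: the function names and input values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, as the text does not say which estimator was used.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardize values to zero mean and unit (sample) standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def rescale_to_reference(plex16, plex18):
    """Z-score each set separately, combine, then reverse z-score with 18-plex statistics."""
    m18, s18 = mean(plex18), stdev(plex18)
    combined_z = zscore(plex16) + zscore(plex18)
    return [z * s18 + m18 for z in combined_z]


def min_max(values):
    """Scale values linearly so that they range from 0 to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical concentrations of one cytokine from each panel (not study data).
plex16_values = [12.0, 15.0, 9.0, 20.0]
plex18_values = [1.1, 0.8, 1.6, 1.3, 0.9]
rescaled = rescale_to_reference(plex16_values, plex18_values)
normalized = min_max(rescaled)
print(normalized)
```

Because the 18-plex values are z-scored and then reverse z-scored with their own mean and standard deviation, they pass through the rescaling step unchanged; only the 16-plex values are moved onto the 18-plex scale.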

Flow cytometry

Flow cytometry was performed on each patient and healthy control subject sample to determine the abundance of various immune cell populations, including T cells, lymphocyte subsets and natural killer (NK) cells, using a Beckman Coulter FC500 Flow Cytometer (Beckman Coulter Inc, Brea, CA, USA). Whole blood samples were stained in multi-color combinations with the appropriate concentrations of antibodies (Beckman Coulter Inc, Brea, CA, USA), erythrocytes were lysed, and the cells were fixed with the Optilyse C Lysing Solution (Beckman Coulter Inc, Brea, CA, USA). Lymphocyte, monocyte, and granulocyte populations were determined using light scatter and backgating on fluorescence for the markers and negative population [49, 50]. Isotype controls were used to determine the background caused by nonspecific antibody binding. A list of antibodies and fluorochrome conjugate cocktails used is provided in Additional file 1: Table S1. Spectral compensation was established daily. Values normalized to healthy controls are presented in the results. A copy of the data output showing gating and graphs for the various markers analyzed, along with isotype controls, is presented as supplementary data in Additional file 2: Flow cytometry gating strategy.

NK cell cytotoxicity

The bioassay for NK cell cytotoxicity was performed using whole blood within 8 h of collection in a chromium release assay as previously described [50]. The NK-sensitive erythroleukemic K562 cell line was used as the target cell. The assay was done in triplicate at four target-to-effector cell ratios with 4 h incubation. The % activity at each target-to-effector ratio and the number of cluster of differentiation (CD)3CD56+ (NK) cells per unit of blood were used to express the results as % cytotoxicity at a target-to-effector cell ratio of 1:1.

Hormone analysis

Serum samples were analyzed for concentrations of testosterone by immunoelectro-chemiluminescence assays on a Roche Cobas 6000 analyzer (Roche Diagnostics, Basel, Switzerland), following all manufacturer's instructions for instrument maintenance, assay calibration and test procedures, with interassay coefficients of variation (CVs) that are consistently < 4%. Salivary cortisol was determined by immunoassay using the Salimetrics high sensitivity kit (State College, PA, USA).

Statistical analysis

Following group assignment, values were compared using ANOVA as the groups were unbalanced in size with unequal variances [51,52,53,54,55]. All measures from the omnibus ANOVA test with a P-value of less than 0.05 were chosen for post-hoc analysis. A Games-Howell post-hoc analysis was then performed on significant omnibus tests to determine pairwise differences, with a P-value of less than 0.05 taken as significant [56]. The effect size of the difference between groups for each measure was also estimated using the corrected Hedges' g [57]. The corrected Hedges' g was chosen as it gives better estimates for small sample sizes and is corrected for bias as an estimator of the population effect size. Effect sizes were interpreted in the following ranges [58]: negligible, lower than 0.01; very small, 0.01–0.20; small, 0.20–0.50; medium, 0.50–0.80; large, 0.80–1.20; very large, 1.20–2.00; and huge, 2.00 or higher. While a P-value can inform the reader whether an effect exists, it does not reveal the size of the effect. In reporting and interpreting these studies, both the substantive significance (effect size) and statistical significance (P-value) are essential results to be reported [59]. As such, throughout the description of the results the small, medium, large, and huge designations refer to the calculated Hedges' g effect size values. All calculations were performed using MATLAB (version: 9.12.0, R2022a, The MathWorks Inc., Natick, MA, USA). Statistical analysis data can be found in Additional file 3: Complete Data and Analysis.
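To make the effect-size calculation concrete, the sketch below computes a bias-corrected Hedges' g and maps its magnitude onto the interpretation ranges listed above. The analysis itself was run in MATLAB; this Python version is only an illustration. The function names and data are hypothetical, and the correction factor 1 − 3/(4·df − 1) is the common approximation, assumed here because the text does not say whether the approximate or the exact gamma-function correction was applied.

```python
from math import sqrt
from statistics import mean, variance


def hedges_g(group_a, group_b):
    """Bias-corrected standardized mean difference between two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_sd = sqrt(((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df)
    cohen_d = (mean(group_a) - mean(group_b)) / pooled_sd
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)  # approximate small-sample correction
    return cohen_d * correction


def effect_label(g):
    """Interpretation bands taken from the ranges given in the text."""
    bands = [(0.01, "negligible"), (0.20, "very small"), (0.50, "small"),
             (0.80, "medium"), (1.20, "large"), (2.00, "very large")]
    size = abs(g)
    for upper_bound, label in bands:
        if size < upper_bound:
            return label
    return "huge"


# Hypothetical measurements for two groups (not study data).
group_one = [2.0, 4.0, 6.0, 8.0]
group_two = [1.0, 3.0, 5.0, 7.0]
g_value = hedges_g(group_one, group_two)
print(round(g_value, 3), effect_label(g_value))
```

The label is based on the absolute value of g, so the direction of the difference has to be read from the group means.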

Correlation graphs were constructed to express the statistical relationships among immune cell abundance, cytokine concentration, blood counts and symptom severities and to gauge their influence upon one another for each condition at each timepoint. Correlations were assessed using the Spearman correlation coefficient ρ, a measure of association that assesses the extent to which two variables (such as cytokine concentration and symptom severity) change together, revealing any monotonic relationships. To account for the multiple correlation tests, the relationships were filtered using the Storey corrected P-value, q [60]. Correlations were taken as significant for Storey's q < 0.05. The resulting graphs were then compared via the graph edit distance (GED) measure, defined as the minimum number of edit operations needed to transform one graph into another. Edits can include deletions, insertions, and substitutions of the correlations that make up the graph [61, 62]. GED measures allow similarity comparisons between graphs carrying similar kinds of information. Roughly, the measure counts how many edits must be made to the first graph to obtain the second, yielding a number that quantifies the similarity or dissimilarity of the pair of graphs. A higher number indicates that more edits were needed and thus a lower similarity, whereas a lower number indicates that fewer edits were needed and thus a higher similarity between graphs [63, 64]. Correlation analysis data can be found in Additional file 3: Complete data and analysis.
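Two of the steps above can be sketched briefly: the Spearman coefficient, computed as the Pearson correlation of average ranks, and a deliberately simplified edit distance between two correlation graphs. The simplification is an assumption, not the study's measure: it presumes both graphs share the same labeled nodes and counts only edge insertions and deletions at unit cost, ignoring substitutions and any edge weighting. The Storey q-value step is not shown, and all names and data are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values receiving the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient as the Pearson correlation of the ranked data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def edge_edit_distance(edges_a, edges_b):
    """Number of undirected edges present in exactly one of the two graphs."""
    def as_set(edges):
        return {frozenset(edge) for edge in edges}
    return len(as_set(edges_a) ^ as_set(edges_b))


# Hypothetical significant correlations for two groups (not study results).
graph_one = [("IL-15", "general fatigue"), ("NK activity", "physical role")]
graph_two = [("general fatigue", "IL-15"), ("RDW", "physical role")]
print(spearman_rho([1, 2, 3, 4], [1, 4, 9, 16]), edge_edit_distance(graph_one, graph_two))
```

Because node labels are fixed in this simplification, no search over node mappings is needed; a general GED between unlabeled graphs is far more expensive to compute.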

Results

Demographics

The total cohort used in this study comprised 120 male veterans recruited through the Miami and Boston based sites. For the entire cohort, the average age was (46.0 ± 7.0) years, with an average body mass index (BMI) of (30.29 ± 4.80) kg/m². The sample consisted of individuals of Asian (n = 2, 1.7%), Black (n = 24, 20.0%), White Hispanic (n = 49, 40.8%), White (n = 41, 34.2%), and other (n = 4, 3.3%) descent. The full demographics of the participant populations with comparison statistics between groups can be found in Additional file 1: Table S2.

The full sample included male veterans meeting the criteria for GWI (n = 76) and HC (n = 44). We further subdivided the GWI sample into two groups based on their PTSD symptoms. GWI subjects with DTS scores of 70 or above or a positive PTSD SCID evaluation were considered to have a high probability of having PTSD and related symptoms and were denoted as GWIH (n = 47), while the remainder had a low probability of having PTSD and related symptoms and were labeled as GWIL (n = 29). All HC had a low probability of having PTSD.

Statistical comparisons (Additional file 1: Table S2) were made between the total GWI and HC groups (P2), as well as between the GWIH, GWIL, and HC groups (P3), using ANOVA for continuous variables and the χ² test for categorical variables. Marginally significant differences (P < 0.10) were found in age and BMI between HC and GWI, with the GWI group being older and having a higher BMI. In addition, a statistically different breakdown in racial demographics (P = 0.036) was observed, driven primarily by a much higher percentage of White Hispanics in the control group owing to the demographics of the Miami cohort. However, after subtyping based on the DTS no statistical differences were found in age, BMI, racial representation, or average number of years in school across controls and GWI subtypes. As most of the statistical analysis was performed on the subtyped GWI groups, these differences between the GWI and HC group were deemed to not affect the overall results of the study.

Symptom measures

Comparison of the MFI, SF-36, and DTS symptom scales yielded results analogous to our previous work with a smaller male cohort [12] (Additional file 1: Table S3 and Fig. 1). GWIL and GWIH both presented significantly worse symptom measures compared to HC (P ≤ 0.001), with GWIH having significantly worse symptoms compared to GWIL in MFI general fatigue (GWIL: 60.42 ± 4.71, GWIH: 78.53 ± 2.73, P = 0.006), mental fatigue (GWIL: 58.48 ± 5.42, GWIH: 76.55 ± 2.94, P = 0.015) and reduced motivation (GWIL: 44.12 ± 4.91, GWIH: 61.45 ± 3.46, P = 0.018), and SF-36 role limitations due to physical problems (GWIL: 43.89 ± 7.28, GWIH: 22.42 ± 4.12, P = 0.037), vitality (GWIL: 33.75 ± 4.45, GWIH: 20.68 ± 2.36, P = 0.039), social functioning (GWIL: 57.59 ± 5.90, GWIH: 32.61 ± 3.76, P = 0.004), role limitations due to emotional problems (GWIL: 57.74 ± 7.79, GWIH: 27.36 ± 4.97, P = 0.007) and mental health (GWIL: 60.00 ± 3.08, GWIH: 42.47 ± 2.94, P = 0.001). GWI, with or without trauma, had a large to huge effect size (as defined in the methods) on all SF-36 and MFI scores for each subtype compared to HC (g ≥ 1.25). Compared to GWIL the PTSD symptoms of GWIH worsened all SF-36 measures by a medium amount (0.50 ≤ g < 0.80) except for social functioning (g = 0.89), role limitations due to emotional problems (g = 0.81), and mental health (g = 0.92), which were affected by a large amount. For the MFI measures, the PTSD symptoms of GWIH had a medium negative effect (0.50 ≤ g < 0.80) on all measures compared to GWIL, except for reduced activity (g = 0.48) and general fatigue (g = 0.83), which were affected by a small and large amount, respectively.

Fig. 1

Comparison of symptom scales between trauma defined GWI groups and controls from the Miami and Boston cohorts. Standard error of the mean error bars. *P < 0.05 as compared with HC, and #P < 0.05 as compared to GWIL via Games-Howell post-hoc analysis. Higher SF36 values indicate better health. GWIH Gulf War Illness with high probability of post-traumatic stress disorder (PTSD) symptoms, GWIL Gulf War Illness with low probability of PTSD symptoms, HC healthy control, MFI multidimensional fatigue inventory, SF36 36-item short form health survey

Biological measures

At rest, the combined Miami and Boston cohorts were compared between GWIH, GWIL, and HC groups. Significant differences between groups were shown in cytokine, CBC, and flow cytometry measures (Additional file 1: Table S4 and Fig. 2). The GWIL group showed a lower concentration of IL-1β (HC: 0.14 ± 0.02, GWIL: 0.08 ± 0.01, GWIH: 0.13 ± 0.03) with a small effect size (g < 0.49); however, this did not reach the level of significance (P > 0.057). The GWIL group also showed lower IL-15 compared to HC with a significant medium effect size (HC: 0.24 ± 0.03, GWIL: 0.15 ± 0.02, P = 0.026, g = 0.58), but no significant change from GWIH (GWIH: 0.22 ± 0.03, P = 0.12) despite a small effect decrease (g = 0.40). GWIH showed no difference from HC in IL-15. GWIH was found to have a lower CD3CD56+ cell number compared to HC with a medium effect size (HC: 201.48 ± 17.94, GWIH: 148.00 ± 11.44, P = 0.036, g = 0.53), with no significant difference from GWIL (GWIL: 161.24 ± 12.47, P = 0.718, g = 0.18). GWIL was shown to have a lower CD3CD56+ cell number compared to control with a small effect size, but this did not reach the level of statistical significance (P = 0.164, g = 0.39). GWI subtypes were found to have a higher red blood cell distribution width (RDW) compared to HC at a medium effect size (HC: 12.74 ± 0.12, GWIL: 13.33 ± 0.24, GWIH: 13.29 ± 0.16, g = 0.57); however, only GWIH was found to reach the level of significance (P = 0.018). There was no effect observed for RDW between GWIH and GWIL (P = 0.990, g = 0.03). Similarly, both GWI subgroups showed a medium effect with a lower percentage of CD2+ levels compared to HC (HC: 82.76 ± 0.71, GWIL: 78.05 ± 1.82, GWIH: 78.54 ± 1.10, g ≥ 0.65), but again only GWIH reached the level of significance (P = 0.006), and no difference was observed between GWI subgroups (P = 0.971, g = 0.06).

Fig. 2

Significant comparisons of hormones, cytokines, complete blood count and flow cytometry measures at rest between trauma defined GWI groups and controls from the Miami and Boston cohorts. Standard error of the mean error bars. *P < 0.05 as compared to HC, and #P < 0.05 as compared to GWIL via Games-Howell post-hoc analysis. All values scaled to HC. GWIH Gulf War Illness with high probability of post-traumatic stress disorder (PTSD) symptoms, GWIL Gulf War Illness with low probability of PTSD symptoms, HC healthy control, MFI multidimensional fatigue inventory, SF36 36-item short form health survey

As only the Miami cohort underwent a graded exercise challenge, only the Miami cohort was compared between GWIH, GWIL, and HC groups at peak exercise (Additional file 1: Table S5 and Fig. 3). Similar to the at rest condition, IL-15 levels for GWIL were significantly lower compared to HC with a medium effect size (HC: 0.30 ± 0.04, GWIL: 0.17 ± 0.01, P = 0.005, g = 0.66), and were also significantly lower with a medium effect compared to GWIH (GWIH: 0.29 ± 0.04, P = 0.038, g = 0.59), with no difference between GWIH and HC (P = 0.988, g = 0.04). Both GWI subgroups showed a significantly lower basophil (BA) % than HC, in each case with a medium effect (HC: 0.40 ± 0.07, GWIL: 0.15 ± 0.03, GWIH: 0.19 ± 0.03, P < 0.040, g ≥ 0.52). Additionally, GWIH showed a significantly lower hematocrit (HCT) than HC, at a medium effect size (HC: 47.23 ± 0.58, GWIH: 44.88 ± 0.48, P < 0.016, g = 0.64). In measures of NK % activity and all CD3- cell type measures, GWIL was found to be significantly reduced by a medium to large amount compared to HC (P ≤ 0.043, g ≥ 0.59). While a small effect decrease for GWIH compared to HC was also observed for these measures (g ≤ 0.43), none reached the level of significance (P ≥ 0.206). GWIL was shown to have significantly higher CD2+CD26+ (%) with a large effect over HC (HC: 35.94 ± 1.58, GWIL: 49.12 ± 3.05, P = 0.013, g = 1.04), and a non-significant but medium effect over GWIH (GWIH: 39.64 ± 1.98, P = 0.113, g = 0.64). While the omnibus measure of CD8+CD26+ (%) was found to be significant, no pairwise group differences were found at the level of significance despite the large increase seen in GWIL compared to HC (HC: 8.38 ± 0.70, GWIL: 15.92 ± 2.47, P = 0.075, g = 0.92).

Fig. 3

Significant comparisons of hormones, cytokines, complete blood count and flow cytometry measures at peak exercise between trauma defined GWI groups and controls from the Miami cohort. Standard error of the mean error bars. *P < 0.05 as compared to HC, and #P < 0.05 as compared to GWIL via Games-Howell post-hoc analysis. All values scaled to HC. GWIH Gulf War Illness with high probability of post-traumatic stress disorder (PTSD) symptoms, GWIL Gulf War Illness with low probability of PTSD symptoms, HC healthy control, MFI multidimensional fatigue inventory, SF36 36-item short form health survey

The Miami cohort was also compared between GWIH, GWIL, and HC groups for the period 4 h post exercise (Additional file 1: Table S6 and Fig. 4). Of all the measures only IL-15 was found to be significantly reduced in GWIL with a medium effect size compared to HC (HC: 0.22 ± 0.02, GWIL: 0.14 ± 0.01, P = 0.044, g = 0.52). While no significant difference was found between GWIH and GWIL, there was a small effect difference (GWIH: 0.22 ± 0.03, P = 0.215, g = 0.40), and no difference between GWIH and HC (P = 0.999, g = 0.01).

Fig. 4

Significant comparisons of hormones, cytokines, complete blood count and flow cytometry measures 4 h post peak exercise between trauma defined GWI groups and controls from the Miami cohort. Standard error of the mean error bars. *P < 0.05 as compared to HC via Games-Howell post-hoc analysis. All values scaled to HC. GWIH Gulf War Illness with high probability of post-traumatic stress disorder (PTSD) symptoms, GWIL Gulf War Illness with low probability of PTSD symptoms, HC healthy control, MFI multidimensional fatigue inventory, SF36 36-item short form health survey

Correlation analysis

The GED analysis across subgroups was used to understand the differences and similarities between hormone, immune, and symptom measures by determining the minimum number of modifications required to transform one correlation graph into another. Additional file 1: Figs. S1-S9 show the correlation graphs for each group at each timepoint. As only the Miami cohort underwent the exercise challenge, GED comparisons were made only between groups within each timepoint, and not across time within each group. Based on the GED measures in Additional file 1: Table S7, the most similar graphs at rest were GWIL and GWIH with a GED measure of 107.40, while the most different were HC and GWIL at 127.91. At peak exercise, the most similar graphs were HC and GWIH with a GED measure of 116.93, while the most different were GWIL and GWIH at 118.88, closely followed by HC and GWIL at 117.53. For post-exercise recovery, the GWIL and GWIH graphs were again the most similar with a GED measure of 86.07, with GWIH and HC being the most different at 100.80. Overall, this suggests differences in the regulation of symptoms, hormones, and immune function between the two GWI subgroups, and between each subgroup and HC, at all stages across an exercise challenge.
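To illustrate the idea of a graph edit distance between correlation graphs, the following is a minimal sketch under a simplifying assumption: both graphs share the same labeled nodes (the measured variables), so only edge edits need to be counted. The cost model, the function name, and the example edges are hypothetical and are not taken from the study, whose GED procedure is not restated in this section.

```python
def weighted_ged(g1, g2):
    """Simplified graph edit distance between two graphs on the same labeled nodes.

    Each graph maps an undirected edge, frozenset({u, v}), to a correlation weight.
    ASSUMED cost model (not from the study): inserting or deleting an edge costs
    |weight|; changing the weight of a shared edge costs |w1 - w2|.
    """
    cost = 0.0
    for edge in set(g1) | set(g2):
        # A missing edge is treated as weight 0, which yields the
        # insertion/deletion cost |weight| described above.
        cost += abs(g1.get(edge, 0.0) - g2.get(edge, 0.0))
    return cost


# Hypothetical correlation graphs for illustration only.
hc_graph = {
    frozenset({"IL-15", "NK activity"}): 0.5,
    frozenset({"NK activity", "fatigue"}): -0.4,
}
gwil_graph = {
    frozenset({"IL-15", "NK activity"}): 0.2,
    frozenset({"IL-15", "fatigue"}): -0.6,
}
distance = weighted_ged(hc_graph, gwil_graph)  # 0.3 + 0.4 + 0.6
```

A smaller distance means fewer or lighter edits are needed to turn one graph into the other, which is how the most similar and most different group pairs above are read.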

Looking specifically at the biological measures that showed significant differences between groups, for HC at rest NK % activity correlated negatively with SF-36 role limitations due to physical problems (Spearman correlation coefficient ρ = −0.43, P = 0.0033, Storey q = 0.0382), while GWIH showed a negative correlation between IL-15 and MFI general fatigue (ρ = −0.42, P = 0.0025, q = 0.0253). For HC at peak exercise, SF-36 role limitations due to physical problems correlated negatively with CD3−CD16+ (×10⁹/L) (ρ = −0.45, P = 0.0023, q = 0.0299), CD3−CD16+CD11a+ (%) (ρ = −0.43, P = 0.0037, q = 0.0425), and CD3−CD16+CD11a+ (×10⁹/L) (ρ = −0.46, P = 0.0015, q = 0.0224), while CD3−CD16+CD11a+ (%) also correlated negatively with SF-36 general health (ρ = −0.44, P = 0.0031, q = 0.0369). Finally, for GWIL at peak exercise, CD8+CD26+ cell percentage correlated negatively with MFI general fatigue (ρ = −0.73, P = 0.0041, q = 0.0150), reduced activity (ρ = −0.74, P = 0.0012, q = 0.0236), and reduced motivation (ρ = −0.77, P = 0.0003, q = 0.0090). No significant correlations were found at the 4 h recovery point between symptoms and the biological measures with significant group differences.
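The statistics reported above pair a Spearman rank correlation with a Storey q value for multiple-testing control. The sketch below shows one way these two quantities can be computed, assuming a fixed tuning parameter λ = 0.5 for the null-proportion estimate rather than a smoothed estimate; the function names, that choice of λ, and the fallback for a zero estimate are assumptions for illustration and are not taken from the study.

```python
import math
from statistics import mean


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def storey_q(pvalues, lam=0.5):
    """Storey q values with the null proportion pi0 estimated at a fixed lambda."""
    m = len(pvalues)
    pi0 = min(1.0, sum(p > lam for p in pvalues) / (m * (1 - lam)))
    if pi0 == 0:
        pi0 = 1.0  # assumed fallback: reduces to Benjamini-Hochberg adjustment
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so each q is the minimum over larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * m * pvalues[i] / rank)
        q[i] = running_min
    return q


# Hypothetical inputs for illustration only.
rho = spearman_rho([1, 2, 3, 4], [40, 30, 20, 10])
qvals = storey_q([0.01, 0.02, 0.03, 0.6, 0.9])
```

A correlation is then retained only if its q value falls below the chosen false discovery threshold, which is why each ρ above is reported with both P and q.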

Discussion

Research efforts to date concerning the link between GWI and PTSD have demonstrated that PTSD appears to have clinical signs different from those of GWI [65], and that PTSD comorbid with GWI appears to exacerbate GWI symptomology [12, 13]. This research aimed to gain insight into the underlying pathophysiology of GWI and how comorbid PTSD impacts both symptom and biological presentation in this illness. This was achieved by stratifying GWI into trauma-based subgroups (GWIL with low probability of PTSD, and GWIH with high probability of comorbid PTSD) to determine subgroup-specific symptom measures and biological markers such as hormone, cytokine, and immune cell measures. Our comparison of symptom scales between GWIL and GWIH in an expanded cohort recapitulated our previous findings in males with GWI: GWI in the absence of PTSD (GWIL) presents with a significant set of symptoms compared to controls, marking it as a distinct illness, while GWI in the presence of PTSD (GWIH) presents with significantly worse symptoms overall [12]. Additionally, by comparing biological profiles we found that both GWI subgroups had significantly lower BA levels at peak exercise, making this a potential biomarker for GWI in general. GWIL veterans were characterized distinctly from both GWIH and HC groups by lower levels of IL-15 in general, and by reduced NK % activity and the related NK cell surface marker antigens CD3−CD56+, CD3−CD16+ and CD3−CD16+CD11a+ at peak exercise, in addition to a higher CD2+CD26+ cell percentage at this same time. GWIH veterans, on the other hand, were distinguished from controls by higher RDW and a lower number of CD3−CD56+ and CD2+ cells at rest, and by lower HCT levels at peak exercise, but showed no significant differences from GWIL other than for IL-15 at peak exercise. The low levels of IL-15 for GWIL at all times compared to HC, and at peak exercise compared to GWIH, suggest it as a potential subgroup-discriminating marker.

Here we found that BA % levels in GWI at peak exercise (0.15% for GWIL and 0.19% for GWIH) were less than half the level observed in controls (0.40%) (Additional file 3). While the determination of low BA levels is difficult, this reduced percentage in GWI is below the typical BA % in human peripheral blood of 0.5–1.0% [66], and outside the healthy reference range of 0.3–1.5% determined previously for the instrumentation used in this study. Research suggests that BA may also regulate T cells to mediate the magnitude of the secondary immune response [67], as well as regulate gut homeostasis [68]. The finding of low BA levels in GWI is consistent with an altered immune profile in GWI [26], and more recent evidence suggests an altered gut microbiome in GWI [69, 70], to which low BA levels may be a contributing factor. As this low level only occurs at peak exercise, it may also be related to the exercise-induced increase in symptoms and immune dysregulation observed in GWI [25, 29, 71–74]. Further investigation is needed to determine whether low BA percentages at peak exercise are a consistent biomarker distinguishing GWI from healthy sedentary Veteran controls.

The IL-15 levels of GWIL veterans were found to be approximately 50–60% of HC, with GWIH being comparable to the control level. This finding of GWIL veterans having distinctly lower IL-15 levels than both GWIH and HC groups is consistent with reduced NK levels at peak exercise (approximately 70% of control): the cytokine IL-15 induces the proliferation of NK cells and has been shown to increase following acute exercise [75–78], so with lower levels of IL-15 to start, fewer NK cells would be produced. NK cells are innate lymphocytes that play important roles in the defense against microbial pathogens through recognition and lysis of virally or bacterially infected host cells, particularly the herpesvirus family [79], in locations such as the mucosal epithelia of the intestine, lung, and female reproductive tract [80]. Deficiency in these cells for the GWIL group was only found to be significant at peak exercise. This suggests that after exertion this group would be at increased susceptibility to new infection, or have decreased control of an active or latent infection. This is similar to the GWI-related illness myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS), for which herpesviruses have been implicated as possible etiological pathogens [81, 82], and for which there is evidence of reduced NK cell function [83–88] and IL-15 levels [48] correlating with increased illness severity [89–91], highlighting the similarity between these illnesses. Conversely, while a reduction in NK cell number (CD3−CD56+ (×10⁹/L), and CD2+ (%), which includes both NK and T cells) was also observed in the GWIH group at rest, it was not accompanied by any significant changes in IL-15 at any time. One explanation for this difference between the GWIH and GWIL groups may be that PTSD has been shown to present with significantly elevated mean levels of IL-15 compared to controls [22]. This has been attributed to PTSD being associated with a genetic variant in the il15 gene [92], and increased expression of the il15 gene in the condition [93]. The apparently normal level of IL-15 observed in GWIH relative to control may be due to the combined but opposite effects of GWI and PTSD on IL-15 when comorbid. Additionally, at peak exercise an approximately 15% increase in CD2+CD26+ cell percentage is observed for GWIL compared to control. CD26 is a major contributor to the regulation of T, B, myeloid and NK cells and also plays a major role in T cell-dependent antibody production and immunoglobulin isotype switching in B cells, with abnormal expression of CD26 found in autoimmune and rheumatologic diseases, HIV-related illness, and cancer [73, 94]. NK cells normally express only low amounts of CD26, which increase after IL-15 stimulation [94]. Additionally, it appears that CD26 enzymatic activity sustains NK cytotoxicity [94]. An elevated percentage of CD2+CD26+ cells would suggest high levels of IL-15 and NK % activity; however, this is not what is observed here, suggesting some dysregulation or dysfunction in this system in GWIL. Further study of IL-15 and NK cell counts and activity in GWI and PTSD is needed to determine if this is a possible subtyping biomarker with relevant biological implications.

The remaining differences between groups were found for GWIH, in RDW at rest and HCT at peak exercise, compared to control. While both GWIL and GWIH veterans showed the same medium-effect increase in RDW (g = 0.57, Additional file 1: Table S4) compared to HC, only GWIH reached the level of significance. Elevated levels of RDW indicate that there is a greater variation in the rheological properties of red blood cells such as size and volume. This is consistent with previous findings of abnormal rheological properties of red blood cells including higher shear stress, elongation, and elevation of RDW in GWI veterans [95]. An elevated RDW has been associated with fatigue in conditions such as systemic lupus erythematosus (SLE) [96] and after stroke [97]. While the relation between RDW and fatigue is commonly attributed to iron deficiency and anemia [96, 98], as in SLE we found no significant changes in hemoglobin (HGB) that would support this. In addition to RDW elevation at rest, GWIH was found to have lower HCT levels at peak exercise. HCT is a measure of the % volume of red blood cells and is typically in the range of 39.5–50.3% for the instrumentation used in this study. While all groups had HCT values within this range, the GWIH group had a statistically significantly lower level than control with a medium effect size. This lower level, however, is at odds with prior studies finding elevated HCT levels in PTSD compared to control [99–101]. While pathologically increased HCT is associated with hypercoagulability [102] and long-term risk of cardiovascular mortality [103], decreased HCT is associated with anemia. While these findings are contradictory, they highlight the importance of investigating circulating blood cell counts in GWI in the context of PTSD as potential illness markers. As nanoelectronics-blood-based diagnostics [104] are being used to detect red blood cell deformability as a biomarker in ME/CFS [105], similar methods may, in light of our findings, be applicable to GWI.

Correlational analysis indicated differences between all groups in the coordinated regulation of cytokines, hormones, blood, and immune cells and in how these relate to symptom presentation. This is consistent with previous network and graph-based analyses in GWI [25, 73, 106] showing differences in overall immune-symptom interactions. The results here extend this further: when GWI is subtyped based on PTSD symptoms, additional differences in immune-symptom interactions appear between the trauma-based subgroups of GWI, as well as between each subgroup and controls. While the significant correlations found between key biomarkers and symptoms were not found in all groups, they did indicate that dysregulation of IL-15, NK % activity and CD3−CD16+ cells (with and without CD11a+) is related to role limitations due to physical problems, general health and general fatigue. As noted above, there is an interplay between NK cells and IL-15 that is related to the symptoms observed in GWI and the related illness ME/CFS, in addition to noted genetic differences in IL-15 found in PTSD. As such, further investigation is warranted into the effect of comorbid PTSD and GWI, considering the direct influence of IL-15 and NK cell counts and activity on symptoms.

While the findings presented here support the need for subtyping of GWI based on PTSD symptoms, some limitations must be noted in both data collection and diagnostic classification for both GWI and PTSD. The main limitations stem from the secondary analysis of a combination of two separate cohorts (Miami and Boston) recruited and assessed under different studies. The inclusion criteria in the two cohorts for both GWI and PTSD have some differences, especially as they relate to the identification of subjects with PTSD (i.e., self-assessment versus psychiatric interview). As such, here we place our findings in the context of the groups having a high probability of having PTSD and PTSD symptoms, rather than a direct diagnosis. Future primary studies investigating the effects of PTSD on GWI should aim to use a clinical diagnosis of PTSD. A related limitation is the lack of comparison to a PTSD cohort without GWI. Again, this stems directly from the secondary analysis nature of this study. Comparison against a PTSD cohort would further delineate symptoms and biomarkers that are unique to GWI, PTSD and comorbidities of these conditions.

Both maximal and submaximal exercise protocols have been used to determine behavioral and physiological consequences of acute exercise challenge in illnesses such as GWI, ME/CFS and long coronavirus disease [25, 71, 107–111]. Here we used exercise challenge to explore the consequences for GWI with and without PTSD symptomatology. It must be noted that only one of the cohorts (i.e., the Miami cohort) underwent the exercise challenge. As such, the sample size used for peak exercise and post-exercise measures is reduced compared to the at-rest condition. However, while future primary study designs should aim to keep group sizes consistent across the exercise challenge to allow for direct comparisons across time, our results indicate that group differences in IL-15 and in CD3−CD56+ and CD3−CD16+ cells remain at different timepoints of the exercise challenge regardless of group size.

The final limitation of note concerns the cytokine results. Here we used cytokine values produced by the same Quansys system but using 16-plex and 18-plex panels. Due to the differences between these panels, direct comparison of values is not appropriate, and one data set must be scaled before inclusion with the other. As a result, actual cytokine concentration levels in each group are not accurately reported; however, the scaling procedure allows reliable statistical comparison of relative means between groups to identify potential biomarker differences.
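As an illustration of why scaling permits relative comparisons across panels, the sketch below expresses each value as a multiple of the control-group mean within its own panel before pooling. This is an assumed scheme for illustration only; the scaling procedure actually used in the study is not restated in this section, and the function name and all values are hypothetical.

```python
from statistics import mean


def scale_to_control(values_by_group, control="HC"):
    """Express every value as a multiple of the control-group mean for one panel."""
    ref = mean(values_by_group[control])
    return {group: [v / ref for v in vals] for group, vals in values_by_group.items()}


# Hypothetical readings from two panels with different absolute scales.
panel_16 = {"HC": [2.0, 4.0], "GWIL": [1.5, 1.5]}
panel_18 = {"HC": [20.0, 40.0], "GWIL": [18.0, 12.0]}

scaled_16 = scale_to_control(panel_16)
scaled_18 = scale_to_control(panel_18)

# After scaling, both panels share a control mean of 1, so groups can be pooled
# and compared on a relative scale, at the cost of losing absolute concentrations.
pooled_gwil = scaled_16["GWIL"] + scaled_18["GWIL"]
```

The trade-off matches the limitation described above: relative group differences are preserved, while the original concentration units are not.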

Conclusions

This study supports the view that GWI with comorbid PTSD symptoms results in poorer health outcomes compared to both HC and the GWIL subgroup. Moreover, our findings further established that post-traumatic stress is not the driving factor behind the development and clinical presentation of GWI. Stratifying GWI veterans by PTSD symptoms to create three groups (HC, GWIL, and GWIH) enabled us to identify the relationship between each subgroup’s specific biological measures, including hormone, immune cell and cytokine markers, and the symptoms experienced. Significant differences were found in basophil levels of GWI compared to HC at peak exercise regardless of PTSD symptom comorbidity, indicating their potential use as a biomarker distinguishing GWI in general from control. While the unique identification of GWI with PTSD symptoms was less clear, the GWIL subgroup was delineated from both GWIH and control on measures of IL-15 across an exercise challenge. This, together with the additional differences in NK cell numbers and function and the genetic risk factor of IL-15 variants, highlights this cytokine as a potential biomarker of GWI in the absence of PTSD symptoms. This in turn may provide necessary insight and direction for symptom management and treatment for each distinct group.