Introduction

Dementia is a neurodegenerative disease characterized by cognitive impairment that affects memory, cognitive abilities, and behavior, and significantly interferes with a person’s ability to perform daily activities [1]. The cognitive functional deterioration in dementia is beyond what is expected as the natural consequence of biological ageing [1]. Dementia has uncertain etiology, inherently complex pathophysiology, and heterogeneous manifestations [2]. Alzheimer’s disease dementia (AD) is the most frequent subtype, accounting for 60–80% of all-cause dementia (ACD) [3]. AD is driven by the brain accumulation of beta-amyloid plaques and tau protein tangles, but other mechanisms, including neuronal loss, synaptic dysfunction, neurodegeneration, and metabolic and inflammatory alterations, might play an earlier or more central role [4]. The global burden of dementia, influenced by increasing life expectancy, demographics, and risk factors, was recently projected to increase in some countries such as the USA and Germany [5]. Indeed, a deeper insight into dementia’s well-known and emerging modifiable risk factors could have a major impact on its rising burden through targeted prevention and intervention strategies.

Interestingly, there is a growing recognition that the gut microbiome may play a role in the occurrence of dementia [6,7,8]. The major putative molecular mechanisms underlying this link are the production of gut microbiome bioactive molecules [6,7,8,9] and gut microbiome-induced systemic inflammation [6, 7, 10]. These gut microbiome bioactive molecules comprising those produced solely by the gut microbiome and those resulting from the host–gut microbiome co-metabolism are primarily bacterial endotoxins such as lipopolysaccharide (LPS), short-chain fatty acids (SCFA), and indole-containing tryptophan metabolites [6, 9, 11]. Crucially, some of these molecules are detectable in the systemic circulation [6, 9, 12] reaching concentrations up to and above those achieved by a typical drug dose [12] and they have been reported to cross the blood–brain barrier (BBB) [8, 9, 13, 14].

Remarkably, epidemiological studies have reported associations of gut microbiome molecules, specifically acetic acid (AA) and propionic acid (PA) [15], isobutyric acid and isovaleric acid [16] and indoxyl sulfate (IS) [17], LPS [18], LPS-binding protein (LBP) [19], AA [20], PA [21, 22], 2-methylbutyric acid, isovaleric acid, valeric acid, indole-3-pyruvic acid [22], and 5-hydroxyindole-3-acetic acid (5OH-IAA) [23, 24] with either ACD or AD risk but others did not observe associations with AA [16, 25], PA [16], and indole [16], LPS [22], indole-3-acetic acid (IAA) [23], indoleacrylic acid (IACR) [22], and 5OH-IAA [26]. Similarly, a few studies reported association of inflammatory markers, C-reactive protein (CRP), IL-1β, IL-6, and tumor necrosis factor-alpha (TNF-α) with either ACD or AD risk [27, 28], whereas others indicated no association with CRP [29,30,31], IL-1β [31], IL-6 [19, 31], and TNF-α [31]. These mixed findings may be partly due to the generally modest sample size of the aforementioned studies. Thus, larger studies are warranted. Furthermore, the levels of some gut microbiome molecules and systemic inflammatory markers as well as dementia risk are influenced by a myriad of factors such as body composition, smoking, alcohol consumption, and chronic diseases [2, 32,33,34]. These factors were not sufficiently accounted for in previous studies. Most important, the extensively reported association of some gut microbiome molecules and inflammatory markers [32, 33, 35,36,37,38] suggests that there is an intricate pathophysiological interplay between these molecules and inflammation. However, a joint investigation of these molecules and inflammatory markers in the context of dementia occurrence, which seems crucially important, has received limited attention.

To this end, the present epidemiological investigation sought to explore the circulating levels of selected gut microbiome molecules and inflammatory markers simultaneously and examine their association with the risk of ACD and AD while accounting for these important risk factors.

Methods

Study Population

The present study is embedded within the German Study on Ageing, Cognition, and Dementia in Primary Care Patients (AgeCoDe). AgeCoDe is unique in that it is a prospective cohort of elderly (≥75 years) general practitioner (GP) patients. Other inclusion criteria were being a native German speaker or speaking German fluently, absence of severe hearing or vision impairments, and residency at home. The participants were recruited from six German cities: Bonn, Düsseldorf, Hamburg, Leipzig, Mannheim, and Munich. The study commenced between 2003 and 2004, when detailed clinical information, including standardized cognitive testing, was collected. Follow-up assessments were performed every 18 months, and every 10 months after follow-up seven. At follow-up three (henceforth, baseline), which commenced in 2007, participants provided blood samples at the GP’s office, from where the samples were transported to the central laboratory for storage. For the current study, we considered incident dementia, that is, dementia diagnosed between baseline and follow-up nine (henceforth, end of follow-up), which was completed in 2016. The ethical approval for the AgeCoDe study was obtained from the Ethics Commission of the University of Bonn 050/02, 258/07; the Ethics Commission of the Medical Faculty of the Heinrich Heine University Düsseldorf 2079/2002, 2999/2008; the Ethics Commission of the Medical Association Hamburg OB/08/02, 2817/2007; the Ethics Commission at the Medical Center of the University of Leipzig 143/2002, 309/2007. The ethical approval for the present biomarker analysis was obtained from the Ethics Commission of the University of Bonn 245/22. All participants gave written informed consent. Details of the recruitment of participants and assessment of dementia in the AgeCoDe have been reported previously [39].

Study Design

The eligible individuals were those free of dementia at baseline and with information on classical risk factors of dementia, namely age, sex, and body mass index (BMI) at baseline. Consequently, the total size of the full cohort was N = 1323, of which 281 (21%) developed incident ACD. Incident dementia was defined as cases from the full cohort occurring between baseline and the end of follow-up. Person time (time-to-event) was calculated from the baseline date to the date of diagnosis of dementia or the end of the study, whichever occurred first. We censored individuals at the end of the study on 29 November 2016.
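The person-time rule described above can be illustrated with a minimal sketch. The function name and the example dates are hypothetical and chosen only for illustration; the administrative censoring date is the one stated in the text, and the conversion of days to years by dividing by 365.25 is an assumption rather than a documented detail of the study.

```python
from datetime import date

# Administrative censoring date stated in the text.
END_OF_STUDY = date(2016, 11, 29)


def person_time_years(baseline, diagnosis=None, end_of_study=END_OF_STUDY):
    """Return (follow-up time in years, event indicator) for one participant.

    `diagnosis` is None for participants who were never diagnosed with dementia.
    Follow-up ends at the diagnosis date or the end of study, whichever is first.
    """
    if diagnosis is not None and diagnosis <= end_of_study:
        exit_date, event = diagnosis, 1
    else:
        exit_date, event = end_of_study, 0
    # Assumption: years are approximated as days / 365.25.
    return (exit_date - baseline).days / 365.25, event


# Hypothetical participants (dates invented for illustration only).
print(person_time_years(date(2007, 6, 1), date(2010, 6, 1)))  # incident case
print(person_time_years(date(2007, 6, 1)))                    # censored at end of study
```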

From these N = 1323 eligible individuals, we designed a classical case–cohort study comprising a 50% (N = 662) subcohort selected by simple random sampling without replacement and all the ACD cases outside the subcohort. The sampling fraction and method were chosen for their reported efficiency [40]. Consequently, the case–cohort sample for ACD analyses was N = 805 (N = 143 non-subcohort cases, N = 138 subcohort cases, and N = 524 subcohort non-cases). Secondarily, we selected AD cases outside the subcohort to form an AD case–cohort sample of N = 740 (N = 78 non-subcohort cases, N = 73 subcohort cases, and N = 589 subcohort non-cases).
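The sampling scheme can be sketched as follows. This is an illustration on simulated identifiers, not the study's sampling code: the function name, the seed, and the randomly assigned case labels are assumptions. Only the cohort size (1323), the subcohort size (662), and the number of ACD cases (281) come from the text.

```python
import random


def draw_case_cohort(ids, case_ids, subcohort_size, seed=0):
    """Draw a case-cohort sample.

    The subcohort is a simple random sample without replacement from the full
    cohort; every case that falls outside the subcohort is then added.
    """
    rng = random.Random(seed)
    subcohort = set(rng.sample(sorted(ids), subcohort_size))
    non_subcohort_cases = set(case_ids) - subcohort
    return subcohort, non_subcohort_cases


# Simulated cohort: 1323 participants, 281 of them labelled as cases at random.
cohort = list(range(1323))
cases = set(random.Random(1).sample(cohort, 281))

subcohort, extra_cases = draw_case_cohort(cohort, cases, subcohort_size=662)
sample = subcohort | extra_cases

# The analysis sample is the subcohort plus the cases sampled outside it.
print(len(subcohort), len(extra_cases), len(sample))

# Consistency check of the counts reported in the text.
assert 143 + 138 + 524 == 805  # ACD case-cohort
assert 78 + 73 + 589 == 740    # AD case-cohort
```

In a simulated draw the number of non-subcohort cases will differ from the 143 observed in the study, because it depends on which individuals happen to fall in the subcohort.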

Measurement of Gut Microbiome Molecules and Inflammatory Markers

All biomarkers were measured from EDTA plasma collected at baseline. All laboratory analyses were blinded to the participants’ dementia status and any characteristics.

Lipopolysaccharide and Lipopolysaccharide-Binding Protein

LPS was measured by a quantitative sandwich enzyme immunoassay technique (MBS702450; MyBiosource, San Diego, CA, USA) and LBP by a solid-phase, two-site chemiluminescent immunometric assay (IMMULITE®1000, Siemens Healthcare GmbH, Erlangen, Germany). Further processing of the samples was carried out according to the specifications from the kit instructions or according to the specifications of the laboratories. For both LPS and LBP, the intra-assay coefficient of variation (CV) was <8% and inter-assay CV was <10%.

Indole-Containing Tryptophan Metabolites

Targeted metabolomics quantification of nine indole-containing tryptophan metabolites, namely IAA, indole-3-acetic acid methyl ester (IAA ME), 5OH-IAA, indole-3-propionic acid (IPA), indole-3-butyric acid (IBA), indole-3-lactic acid (ILA), indole-3-carboxaldehyde (ICARB), indole-3-acryloylglycine (IAG), and IS, was performed by ultra-high-performance liquid chromatography-electrospray ionization-triple quadrupole-mass spectrometry, as reported previously [41]. Other metabolites, which include tryptophan, methionine, tyrosine, serotonin, and N-acetyl-tryptophan, were also quantified in this targeted assay. The intra-day CV for all metabolites in the present analysis was less than 15%.

Short-Chain Fatty Acids

The measurement of the SCFA was according to previously reported targeted metabolomics analysis [42]. We measured eight SCFA namely AA, PA, isobutyric acid, butyric acid (BA), 2-methylbutyric acid, isovaleric acid, valeric acid (VA), and hexanoic acid (HA). The intra-day CV for all SCFAs was less than 20%.

Inflammatory Markers

The multiplex analysis of IL-1β, IL-6, and TNF-α was performed using the Luminex™ 200 system (Luminex, Austin, TX, USA). The intra-assay CV for all three markers was less than 10%. The inter-assay CV was less than 15% for IL-1β and IL-6 and less than 20% for TNF-α. High-sensitivity CRP was measured at the central laboratory of the University Hospital in Bonn.

Assessment of Covariates: Sociodemographic, Anthropometry, Lifestyle Factors, and Prevalent Diseases

All covariates were assessed at baseline. Standardized questionnaires were used to obtain information on age, sex, weight, height, education, last employment status, smoking status, alcohol consumption, physical activity, social status (marital status and living alone), and the presence of prevalent disease (hypertension, type 2 diabetes, coronary heart disease, stroke, depression, hearing and visual impairment, and traumatic brain injury). BMI was calculated as weight in kilograms divided by the squared height in meters. Educational level was categorized into low (inadequately completed or elementary schooling), middle (secondary), and high (tertiary). Participants whose last employment status was manual jobs, salaried jobs, civil service jobs, or self-employed were categorized as employed, and others were categorized as unemployed. Smoking was assessed as the current smoking of cigarettes, a pipe, cigars, or other tobacco products. Individuals who were current non-smokers but had smoked for any number of years were categorized as ex-smokers. Alcohol consumption was determined from the frequency and quantity of consumption and converted to a uniform measure of grams per day. Physically active individuals were those who had more than one per week cycling, hiking or long walks, swimming, gymnastics, and other sports such as golf. Marital status was categorized into four groups: widowed, divorced, married, and single. The Mini-Mental State Examination (MMSE) score was used to assess global cognitive function at baseline. Depression was assessed based on the 15-item version of the Geriatric Depression Scale, dichotomized into <6 points (no evidence of depressive symptoms) and ≥6 points (evidence of depressive symptoms). Medication use was obtained from the questionnaire. Plasma hemoglobin A1c (HbA1c) was measured using the Roche/Hitachi-ModularSystems (Roche) according to the manufacturer’s protocols. Apolipoprotein E (APOE)-ɛ4 status was determined from leucocyte DNA.

Outcomes

Dementia diagnoses were based on a validated, structured interview for the diagnosis of dementia of the Alzheimer type, multi-infarct (or vascular) dementia, and dementias of other etiology according to the Diagnostic and Statistical Manual of Mental Disorders (DSM)-III-R, DSM-IV, and ICD-10 (Structured Interview for Diagnosis of Dementia of Alzheimer type, Multi-infarct Dementia and Dementia of other etiology according to DSM-IV and ICD-10 (SIDAM) [43]), implemented by trained research assistants. The SIDAM consists of a cognitive test battery (55 items including the Mini-Mental State Examination and covering the cognitive domains of orientation, memory, abstract reasoning, verbal ability and calculation, constructional ability, aphasia and apraxia) and a section for clinical diagnostic impression and rating of psychosocial impairment with a scale for the assessment of activities of daily living. Dementia was diagnosed according to the criteria of the DSM-IV, which comprise a diagnostic algorithm in the SIDAM including cognitive impairment on the SISCO score and impairment in ADL (score of ≥2 on the SIDAM ADL Scale). The diagnosis of dementia in AD was established according to the National Institute of Neurological and Communicative Disorders and Stroke–Alzheimer’s Disease and Related Disorders Association criteria for probable AD [44]. Vascular dementia diagnosis was guided by the National Institute of Neurological Disorders and Stroke–Association Internationale pour la Recherche et l’Enseignement en Neurosciences criteria [45] (i.e., evidence of a cerebrovascular event [Hachinski-Rosen Scale and medical history] and temporal association of the cerebrovascular event with cognitive decline). Mixed dementia was diagnosed in the absence of temporal association of the cerebrovascular event with cognitive decline. For all analyses, individuals with mixed dementia and dementia in AD were combined.
Dementia diagnosis in individuals who were not personally interviewed was based on the Global Deterioration Scale and the Blessed Dementia Scale subscales. A score of at least 4 on the Global Deterioration Scale represented a diagnosis of dementia. The diagnosis was established in these cases if the causal information provided was sufficient for judgment using the aforementioned criteria. All diagnoses were made in consensus conferences that included the interviewer and experienced geriatric psychiatrists or geriatricians. The primary and secondary outcomes for the present study were ACD and AD, respectively.

Statistical Analysis

Descriptive Analysis

Analyses were performed separately for the ACD (n = 805) and AD (n = 740) case–cohorts. Continuous and categorical variables were summarized as median (25th and 75th percentiles) and count (percentage), respectively. Differences in continuous and categorical predictors across dementia status were tested with the Kruskal–Wallis rank-sum test and either Pearson’s Chi-squared or Fisher’s exact test, respectively.

Multivariable Modeling of the Association of Gut Microbiome Molecules and Inflammatory Markers with Dementia Risk

Selection of Covariates

We identified covariates for the causal inference between gut microbiome molecules and inflammatory markers with dementia risk using a directed acyclic graph (DAG). Based on a priori knowledge and biological plausibility, we used a DAG to draw the directions of the paths between the gut microbiome molecules or inflammatory markers and covariates, between covariates and dementia, and between covariates. A minimal sufficient adjustment set of confounders was selected and used as covariates.

From the DAG, our covariates were as follows: age (years), sex (men and women: reference), BMI (kg/m2), APOE-ɛ4 (homozygous, heterozygous, absent: reference), smoking status (smokers, ex-smokers, non-smokers: reference), alcohol intake (g/day), educational level (high, middle, low: reference), employment (employed and unemployed: reference), physical activity (active and inactive: reference), marital status (widowed, divorced, married, single: reference), living alone (yes and no: reference), family history of dementia (yes and no: reference), prevalent hypertension, diabetes mellitus, coronary heart disease, stroke, depression, traumatic brain injury (TBI) (yes and no: reference), hearing impairment (significant hearing loss, mild hearing loss, and no impairment: reference), visual impairment (most severe visual impairment, considerable visual impairment, difficult vision, and no impairment: reference), medication (yes and no: reference), study center (Leipzig, Hamburg, Düsseldorf, Mannheim, Munich, Bonn: reference), MMSE score, HbA1c, and habitual diet. Because self-reported dietary intake data were unavailable in this study, and since plasma concentrations of tryptophan, methionine, and tyrosine have been reported to differentiate habitual diet groups (fish-eaters and vegetarians, meat-eaters, and vegans) [46], we used them as proxies for habitual diet. We additionally adjusted for serotonin and N-acetyl-tryptophan. The 23 primary predictors (molecules and inflammatory markers) were LPS (pg/ml), LBP (μg/ml), IAA, IAA ME, 5OH-IAA, IPA, IBA, ILA, ICARB, IAG, IS, AA, PA, isobutyric acid, BA, 2-methylbutyric acid, isovaleric acid, VA, and HA (μM), CRP (mg/ml), and IL-1β, IL-6, and TNF-α (pg/ml). We checked the bivariate association between the primary predictors and covariates using the Spearman correlation and Kruskal–Wallis rank-sum test.

Handling of Missingness

We examined the proportion of missingness across the variables. Globally, we evaluated the missing completely at random (MCAR) assumption with Little’s MCAR test. Afterwards, we tested the missing data mechanism of each variable with a regression-based approach. We imputed the missing values when the MCAR or missing at random (MAR) assumption was reasonable.

Statistical Power Analysis

There is no well-established power calculation for case–cohort studies with continuous predictors and non-rare events. Therefore, we used the standard power method for case–control studies. For this method, we used type I error rate of 0.05 and the values of three predictors of AD risk of André et al. [19]. Since the variables were standardized beforehand, we assumed them to be normally distributed with 0 mean and a standard deviation of 1. The dementia-associated adjusted odds ratio of the main predictor was 1.3 and the two covariates were 1.09 and 0.98. We independently included sample size and proportion of events for the ACD and AD. The power of the n = 805 ACD case–cohort was 91% and for the n = 740 AD case–cohort was 76%. Indeed, a sample size of n = 577 and n = 809 would produce the conventionally acceptable power (≥ 80%) for the ACD and AD case–cohorts, respectively. Considering that this standard power method has been shown to produce conservative estimates for case–cohort studies with non-rare events and binary predictors [47], it is likely that the actual power of our study samples would be higher.

Multivariable Regression Analysis

Continuous and categorical predictors were standardized and dummy coded, respectively. To estimate the association of the primary predictors (gut microbiome molecules and inflammatory markers), adjusted for covariates, with ACD and AD risk, we adopted one of the recommended multivariable statistical approaches for analyzing case–cohort data, which comprises an initial variable selection step in a binary outcome analysis with logistic regression, followed by time-to-event analysis with weighted Cox proportional hazards (Cox PH) regression [48]. In the variable selection step, we used three methods: the parametric adaptive Least Absolute Shrinkage and Selection Operator (aLASSO) logistic regression with 10-fold cross-validation, the random forest–based Boruta algorithm (RF-Boruta), and recursive feature elimination implemented using the Naïve Bayes algorithm (RFE-NB) with 10-fold cross-validation. Together, these methods help to capture the complex (linear, non-linear, and interaction) inter-relationships among the predictors, so that those that robustly discriminate dementia cases from non-cases are recovered. The selection step also reduces the risk of exaggerating the strength of association of the multiple primary predictors with dementia risk and of over-adjusting for covariates in the time-to-dementia models. Predictors with non-zero coefficients from the aLASSO logistic regression, predictors confirmed as important by RF-Boruta, and optimal features from RFE-NB were considered the true relevant predictors of dementia. These true relevant predictors were used to model time-to-ACD and time-to-AD. We fitted Self–Prentice weighted Cox PH models and checked the PH assumption. Continuous predictors that violated the PH assumption were included in their time-varying form, while categorical predictors that violated it were handled by stratification of the baseline hazard function, which yields no estimates for them. In addition, we fitted accelerated failure time (AFT) models with a weighted least-squares approach.
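The preprocessing named at the start of this subsection (standardizing continuous predictors and dummy coding categorical ones against a reference level) can be sketched as below. The study's analyses were run in R; this standard-library Python version, its function names, and the toy values are assumptions for illustration only.

```python
from statistics import mean, stdev


def standardize(values):
    """Convert a continuous predictor to z-scores (mean 0, sample SD 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def dummy_code(values, reference):
    """Return one 0/1 indicator per non-reference level of a categorical predictor."""
    levels = sorted(set(values) - {reference})
    return {level: [1 if v == level else 0 for v in values] for level in levels}


# Toy data (invented): a continuous predictor and smoking status.
print(standardize([1.0, 2.0, 3.0]))
print(dummy_code(["smoker", "non-smoker", "ex-smoker"], reference="non-smoker"))
```

With standardized predictors, each regression coefficient refers to a one-SD increase, which is how the hazard ratios in the Results are expressed.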

Bias Analysis

If any main predictor was significantly associated with time-to-dementia in the Cox PH model, we estimated the strength of association (E-value) [49] on the risk ratio scale that unmeasured confounder(s) would need to have with both the predictor and dementia risk to completely explain away the predictor–dementia risk association, conditional on the measured covariates. Furthermore, by excluding early (first year of follow-up) incident cases, we addressed reverse causality, in which the long prodromal phase and premorbid (subclinical) dementia might have directly altered the levels of these biomarkers rather than the reverse.

All tests were two-sided and the significance level was set at P < 0.05. Statistical analyses were performed using R version 4.2.1.

Results

Descriptive Analysis

Tables 1 and 2 summarize the baseline characteristics of the n = 805 ACD and n = 740 AD study populations, with 281 (35%) and 151 (20%) cases, respectively. About two-thirds were women, and the median age and BMI were 83 years and 25 kg/m2, respectively. Around 77% were physically active and about 5% were current smokers; participants were mild alcohol consumers and had normal MMSE scores. Cases for both dementia outcomes were more likely to be women, older, APOE-ɛ4 carriers, and living alone, were less physically active and less likely to be married, and had lower MMSE scores. AD cases also had lower BMI. In addition, cases had higher IBA and lower 5OH-IAA (Table 3).

Table 1 Baseline characteristics of the all-cause dementia case-cohort
Table 2 Baseline characteristics of the Alzheimer’s disease dementia case-cohort
Table 3 Description of the primary predictors

Multivariable Modeling of the Association of Gut Microbiome Molecules and Inflammatory Markers with Dementia Risk

Some primary predictors were intercorrelated, most strongly IAG with IPA, and 2-methylbutyric acid with isobutyric acid and isovaleric acid (Table A.1). Furthermore, at least one primary predictor was associated with a covariate, with the largest proportion of the primary predictors associated with tryptophan and study center (Table A.1 and Table A.2).

Handling of Missingness

The proportion of missingness was low to moderate, between 0.1 and 6.8% in the ACD case–cohort and between 0.1 and 7.2% in the AD case–cohort (Table A.3). For the ACD case–cohort, VA, PA, BA, HA, marital status, and smoking status were MAR, while other variables were MCAR. For the AD case–cohort, VA, BA, marital status, and smoking status were MAR, while other variables were MCAR (Table A.3). Although Little’s test rejected the global MCAR assumption for both datasets (P < 0.001), the fact that the predictors were a mix of MCAR and MAR suggests that the MAR assumption is the most appropriate for both datasets. Hence, missing values were imputed using non-parametric multivariate imputation by chained random forests, weighted by the number of non-missing values per observation.

Multivariable Regression Analysis

All-Cause Dementia

The 13 (eight positive and five negative) non-zero predictors from the aLASSO logistic regression were age, sex, heterozygous APOE-ɛ4 vs. none, homozygous APOE-ɛ4 vs. none, MMSE score, family history of dementia, physically active vs. inactive, prevalent visual impairment, prevalent stroke, TBI, medication use, serotonin, and study center Düsseldorf vs. Bonn (Fig. 1A). The RF-Boruta confirmed eight important predictors: 5OH-IAA, IBA, 2-methylbutyric acid, age, sex, MMSE score, heterozygous APOE-ɛ4 vs. none, and serotonin (Fig. 1B). The RFE-NB showed that a four-predictor model with age, sex, MMSE score, and 5OH-IAA optimally discriminated ACD cases from non-cases (Fig. 1C). Overall, 16 true relevant predictors discriminated ACD cases from non-cases. These were three gut microbiome molecules (5OH-IAA, IBA, 2-methylbutyric acid) and 13 covariates (age, sex, homozygous APOE-ɛ4 vs. none, heterozygous APOE-ɛ4 vs. none, MMSE score, family history of dementia, physically active vs. inactive, prevalent visual impairment, prevalent stroke, history of TBI, medication use, serotonin, and study center Düsseldorf vs. Bonn).
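The count of 16 true relevant predictors follows from pooling the predictors retained by the three selection methods. The sketch below shows that pooling as a set union. The predictor labels are shorthand chosen for this illustration, and taking the union is an interpretation of the text rather than a description of the authors' code.

```python
# Predictors retained by each selection method for all-cause dementia,
# transcribed from the paragraph above (labels are illustrative shorthand).
alasso = {
    "age", "sex", "APOE-e4 heterozygous", "APOE-e4 homozygous", "MMSE score",
    "family history of dementia", "physical activity", "visual impairment",
    "stroke", "TBI", "medication use", "serotonin", "study center Duesseldorf",
}
rf_boruta = {
    "5OH-IAA", "IBA", "2-methylbutyric acid", "age", "sex", "MMSE score",
    "APOE-e4 heterozygous", "serotonin",
}
rfe_nb = {"age", "sex", "MMSE score", "5OH-IAA"}

# A predictor is kept if at least one method selects it.
relevant = alasso | rf_boruta | rfe_nb

gut_molecules = {"5OH-IAA", "IBA", "2-methylbutyric acid"}
print(len(relevant))                  # 16
print(len(relevant & gut_molecules))  # 3
print(len(relevant - gut_molecules))  # 13
```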

Fig. 1

The true relevant predictors of all-cause dementia risk obtained from adaptive Least Absolute Shrinkage and Selection Operator logistic regression (green: positively associated predictors, red: negatively associated predictors) (A), random forest–based Boruta algorithm (Importance = median, minimum, and maximum z-scores of importance computed over multiple iterations) (B), and recursive feature elimination implemented using Naïve Bayes algorithm (Importance = area under the ROC curve (AUC) score importance) (C)

The Cox PH model with the aforementioned true relevant predictors showed that the whole model (P < 0.001), homozygous APOE-ɛ4 vs. none (P < 0.001), MMSE score (P < 0.001), and serotonin (P = 0.02) violated the PH assumption. Consequently, we fitted a Cox PH model comprising the original predictors, the time-dependent MMSE score and serotonin, and stratification of the baseline hazard function for homozygous APOE-ɛ4 vs. none. The result showed that a one-SD increase in 5OH-IAA was significantly associated with a constant 42% decrease in ACD risk over time (adjusted hazard ratio (HR) 0.58; 95% confidence interval (CI): 0.36 to 0.94, P = 0.025) (Fig. 2A). In line with the Cox PH model, the AFT model with the true relevant predictors showed that for each one-SD increase in 5OH-IAA, the logarithm of ACD survival time (in years) significantly increased by 0.21 (adjusted coefficient 0.21; 95% CI 0.02 to 0.40, P = 0.03) (Fig. 2B). Other significant predictors of time-to-ACD were age, sex, homozygous APOE-ɛ4 vs. none, heterozygous APOE-ɛ4 vs. none, MMSE score, physically active vs. inactive, TBI, and serotonin.
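As a reading aid, the two effect sizes for 5OH-IAA can be translated into more familiar quantities. The sketch below converts the hazard ratio into a percentage change in hazard and exponentiates the AFT coefficient into a time ratio, which is the usual interpretation of a coefficient on the log-time scale. The function names are illustrative, and the time ratio is not reported in the text itself.

```python
import math


def percent_change_in_hazard(hr):
    """Percentage change in hazard implied by a hazard ratio."""
    return (hr - 1.0) * 100.0


def time_ratio(aft_coefficient):
    """Multiplicative change in survival time implied by a log-time coefficient."""
    return math.exp(aft_coefficient)


# Values reported for a one-SD increase in 5OH-IAA.
print(round(percent_change_in_hazard(0.58)))  # -42, i.e. a 42% lower hazard
print(round(time_ratio(0.21), 2))             # 1.23
print(round(time_ratio(0.02), 2), round(time_ratio(0.40), 2))  # 1.02 1.49
```

Under this reading, a one-SD higher 5OH-IAA level corresponds to roughly 23% longer dementia-free time, with the interval endpoints obtained by exponentiating the reported 95% CI.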

Fig. 2

Association between true relevant predictors and time-to-all cause dementia risk. (A) Weighted Cox proportional hazard model (hazard ratio and 95% confidence interval). (B) Weighted least-squares accelerated failure time (coefficient and 95% confidence interval)

Bias Analysis

The E-values for 5OH-IAA were 2.27 and 1.27 for its adjusted HR of 0.58 (risk ratio, RR of 0.69) and upper CI of 0.94 (RR of 0.95), respectively. The observed adjusted HR of 0.58 could be explained away by an unmeasured confounder that is associated with both 5OH-IAA and time-to-ACD risk by a RR of 2.27-fold each, beyond the measured confounders, but weaker confounding could not do so. Furthermore, the upper CI could be moved to include one by an unmeasured confounder that was associated with both 5OH-IAA and time-to-ACD risk by a RR of 1.27-fold each, above and beyond the measured confounders, but weaker confounding could not do so. In addition, the association between 5OH-IAA and time-to-ACD risk was robust to the exclusion of cases within the first 1 year of follow-up, making it unlikely that the result is explained by reverse causality (Table A.4).
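The E-values above can be approximately reproduced with the standard formulas: the square-root-based conversion of a hazard ratio to a risk ratio for a common outcome, followed by E = RR + sqrt(RR × (RR − 1)) applied to the risk ratio after inverting it when it is below one. The sketch assumes these are the formulas behind the reported numbers; the text cites the E-value method [49] but does not spell out the conversion, and the function names are illustrative.

```python
import math


def hr_to_rr(hr):
    """Approximate a risk ratio from a hazard ratio when the outcome is common."""
    return (1 - 0.5 ** math.sqrt(hr)) / (1 - 0.5 ** math.sqrt(1 / hr))


def e_value(rr):
    """E-value on the risk ratio scale; protective ratios are inverted first."""
    rr = 1 / rr if rr < 1 else rr
    return rr + math.sqrt(rr * (rr - 1))


# Point estimate reported for 5OH-IAA (HR 0.58).
rr_point = hr_to_rr(0.58)
print(round(rr_point, 2), round(e_value(rr_point), 2))  # 0.69 2.27

# Upper confidence limit (HR 0.94), as rounded in the text.
rr_upper = hr_to_rr(0.94)
print(round(rr_upper, 2), round(e_value(rr_upper), 2))
```

The point estimate matches the reported risk ratio of 0.69 and E-value of 2.27. For the confidence limit, the rounded HR of 0.94 gives an E-value of about 1.26, marginally below the reported 1.27, a gap that would be consistent with the published interval having been rounded.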

Alzheimer’s Disease Dementia

The 12 (six positive and six negative) non-zero predictors from the aLASSO logistic regression were IAG, age, sex, MMSE score, homozygous APOE-ɛ4 vs. none, heterozygous APOE-ɛ4 vs. none, married vs. single, widowed vs. single, prevalent stroke vs. not, physically active vs. inactive, Düsseldorf vs. Bonn, and Munich vs. Bonn (Fig. 3A). The RF-Boruta confirmed 10 important predictors, ILA, IAA ME, isobutyric acid, 2-methylbutyric acid, age, MMSE score, homozygous APOE-ɛ4 vs. none, widowed vs. single, serotonin, and tryptophan, as discriminating AD cases from non-cases (Fig. 3B). The RFE-NB showed that a four-predictor model with age, MMSE score, married vs. single, and widowed vs. single optimally discriminated AD cases from non-cases (Fig. 3C). Overall, 18 true relevant predictors discriminated AD cases from non-cases. These included five gut microbiome molecules (IAG, ILA, IAA ME, isobutyric acid, 2-methylbutyric acid) and 13 covariates (age, sex, MMSE score, homozygous APOE-ɛ4 vs. none, heterozygous APOE-ɛ4 vs. none, married vs. single, widowed vs. single, serotonin, tryptophan, prevalent stroke vs. not, physically active vs. inactive, study center Düsseldorf vs. Bonn, and study center Munich vs. Bonn). However, none of the four inflammatory markers discriminated AD cases from non-cases.

Fig. 3

The true relevant predictors of Alzheimer’s disease dementia risk obtained from adaptive Least Absolute Shrinkage and Selection Operator logistic regression (green: positively associated predictors, red: negatively associated predictors) (A), random forest–based Boruta algorithm (Importance = median, minimum, and maximum z-scores of importance computed over multiple iterations) (B), and recursive feature elimination implemented using Naïve Bayes algorithm (Importance = area under the ROC curve (AUC) score importance) (C)

The Cox PH model with the aforementioned true relevant predictors showed that the whole model (global test, P < 0.001), IAG (P < 0.001), and homozygous APOE-ɛ4 vs. none (P < 0.001) violated the PH assumption. Consequently, we fitted a Cox PH model comprising the original predictors, time-dependent IAG, and stratification of the baseline hazard function for homozygous APOE-ɛ4 vs. none. We observed that none of the primary predictors was associated with time-to-AD in the Cox PH (Fig. 4A) or AFT models (Fig. 4B). Consequently, no bias analysis was performed for AD. The significant predictors of time-to-AD were age, homozygous APOE-ɛ4 vs. none, heterozygous APOE-ɛ4 vs. none, MMSE score, physically active vs. inactive, and study center Düsseldorf vs. Bonn.

Fig. 4

Association between true relevant predictors and time-to-Alzheimer’s disease dementia risk. (A) Weighted Cox proportional hazard model (hazard ratio and 95% confidence interval). (B) Weighted least-squares accelerated failure time (coefficient and 95% confidence interval)

Discussion

The present epidemiological study, leveraging a unique multicenter German cohort, explored the relationship between the plasma levels of 19 gut microbiome molecules and four inflammatory markers and the risk of incident dementia. There were three observations. Firstly, seven gut microbiome molecules (three ACD-related: 5OH-IAA, IBA, and 2-methylbutyric acid; and five AD-related: IAG, ILA, IAA ME, isobutyric acid, and 2-methylbutyric acid, with 2-methylbutyric acid common to both) discriminated incident dementia cases from non-cases. Secondly, a higher 5OH-IAA level was associated with lower ACD risk and longer time-to-ACD. Thirdly, there was no association between inflammatory markers and the risk of either ACD or AD.

The relationship between IBA, IAG, ILA, and IAA ME and either ACD or AD risk has not been reported previously. Hence, the current study adds to the body of literature. Our observed relationship between 2-methylbutyric acid and AD risk is consistent with that of Wu et al., in which fecal 2-methylbutyric acid was associated with AD risk in 55 individuals [22]. 2-Methylbutyric acid is a branched SCFA produced by microbial fermentation of isoleucine [50]. Another AD-related branched SCFA, isobutyric acid, is produced by microbial fermentation of valine [50]. Isobutyric acid was previously associated with ACD [16]. Bacteroides, Propionibacterium, Bacillus, Lactobacillus, Clostridium, and Escherichia coli produce isobutyric acid and 2-methylbutyric acid [50]. A potential explanation for their relationship with dementia is that SCFAs modulate microglial activation, although the exact signaling pathways are not fully understood [51]. SCFAs also stimulate the expression of aryl hydrocarbon receptor (AhR) factors [52].

The first indole-containing tryptophan metabolite, ILA, is secreted by Bifidobacterium infantis [53] and Lactilactobacillus species [54] and functions as a potent activator of human AhR signaling [52]. The second, IBA, is a precursor of IAA [55] produced by Bifidobacterium, Lactilactobacillus, Clostridium, and Bacteroides [54]. Furthermore, IAA ME is a metabolite of IAA produced by Pseudomonas amygdali [56], and IAG is synthesized from microbial tryptophan metabolism and host glycine conjugation [57]. The last, 5OH-IAA, is the most compelling finding, as it was associated with time-to-ACD independent of several covariates and was robust against bias. 5OH-IAA was previously associated with AD [23, 24]. A possible explanation for this association is that 5OH-IAA is an agonist for AhR signaling [52, 58,59,60] and prevents the formation of amyloid beta plaques [61]. Thus, its low circulating levels suggest a reduction in these dementia-preventing properties. 5OH-IAA is synthesized mainly in the kidney and liver as the final serotonin catabolite in a two-step reaction involving monoamine oxidase and aldehyde dehydrogenase [62]. Besides, it functions in the indole pathway [63]. Some gut bacteria, such as Pseudomonas [64, 65], also directly synthesize its serotonin precursor. It is therefore unsurprising that 5OH-IAA is highly associated with the gut microbiome [66] and is an emerging gut microbiome molecule [58,59,60, 67]. Interestingly, alteration in this host–gut microbiome co-molecule is implicated in neuropsychiatric conditions such as depression [67], epilepsy [68], and schizophrenia [69]. This suggests that the comorbidity of these conditions with dementia [70] may be partly underpinned by the gut microbiome and 5OH-IAA.

Surprisingly, LPS, which is synthesized de novo by the gut microbiome, and LBP, a measure of long-term LPS exposure, were not related to either ACD or AD. This is in contrast to the association of plasma LPS in 36 individuals [18] and plasma LBP in 636 individuals [19] with AD, as well as the independent association of plasma LPS with cognitive decline in 127 individuals [71]. Moreover, other previously reported ACD-related gut microbiome molecules were saliva AA and PA in 51 individuals [15], fecal isovaleric acid in 107 individuals [16], and plasma IS in 24 individuals [17]. The association reported by Zhang et al. [18] was independent of age, and those of André et al. [19] and Figueira et al. [15] were independent of a few covariates, whereas that of Saji et al. [16] was not significant after adjusting for covariates and that of Teruya et al. [17] was unadjusted for covariates. In addition, PA [21, 22] and 5OH-IAA [23, 24] have been consistently associated with AD, independent of age and sex. Yilmaz et al. observed the association in the saliva of 21 individuals [21], Wu et al. in the feces of 55 individuals [22], Whiley et al. in the urine of 556 individuals [23], and Baker et al. in the brain of 25 individuals [24]. The fact that most of the aforementioned molecules were not related to either ACD or AD in our study suggests that the discrepancy between our findings and others is likely attributable to underlying differences in the study populations and methodologies. Moreover, the absence of association of any molecule with time-to-AD is unlikely to be due to insufficient statistical power, since time-to-event analysis has greater statistical power than binary outcome analysis [72].

Unexpectedly, the inflammatory markers CRP, IL-1β, IL-6, and TNF-α did not discriminate dementia cases from non-cases. The absence of their association with dementia risk is unlikely to be due to intercorrelation of the markers, which is low, or to covariate adjustment, since none was associated with dementia in the unadjusted analysis. This finding is consistent with recent reports [19, 29,30,31]. Two studies, like ours, investigated both ACD and AD [27, 29]. While the meta-analysis showed that CRP and IL-6 are associated with ACD and CRP with AD [27], the more recent study [29] did not observe an association of CRP with either ACD or AD. The fact that most recent studies [19, 30, 31] did not observe an association with other inflammatory markers casts doubt on the independent association between the circulating levels of this set of inflammatory markers and dementia risk. Nevertheless, this does not preclude a potential role of other inflammatory markers, since systemic inflammation is the primary cause of BBB damage and often precedes dementia pathologies [73].

It is noteworthy that we confirmed previously reported determinants of dementia such as age [74], sex [16, 74], APOE-ɛ4 [16, 74, 75], MMSE [16, 22], and physical inactivity [2, 75]. In fact, we observed that age, sex, APOE-ɛ4, and MMSE were independent predictors of time-to-dementia risk. Additionally, the finding that our objectively measured habitual diet proxies were negligible predictors of dementia agrees with recent inconclusive evidence linking diet to dementia risk [76,77,78]. Furthermore, there are indications that the association between diet and dementia risk is driven by cardiovascular risk factors [2]. Since we adjusted for cardiovascular risk factors, residual confounding by diet is likely to be trivially small, despite SCFA and indole-containing tryptophan metabolites being diet-dependent. Moreover, APOE-ɛ4 is a strong determinant of dementia and interacts with other risk factors [74]. Adjusting for APOE-ɛ4 status as a three-group variable ensures that its influence and that of related factors are properly accounted for.

There are several strengths of this study. It is a multicenter study; as such, these findings are generalizable to the broader German population. It is also one of the largest studies linking this set of biomarkers to dementia risk. Our analyses are sufficiently powered to produce valid and reliable results. Standardization of all continuous predictors reduces the impact of variation in the molecules and inflammatory markers, thereby facilitating their comparison. Furthermore, our multivariable analysis ensured that we accounted for most of the inter-relationships among the molecules and inflammatory markers, covariates, and dementia risk. Our reporting of the effect estimates of primary predictors and covariates will improve the statistical power estimation of future studies. Finally, the case–cohort design affords the use of the subcohort for other health conditions since it is not case-matched.

While this study is hypothesis-driven and yielded scientifically interesting and biologically plausible findings, it is important to acknowledge its limitations. This is an observational study; thus, it does not prove a causal relationship between these molecules and dementia risk. Indeed, the potential causal relationship of 5OH-IAA with ACD risk may be explored by linking genetically predicted levels of 5OH-IAA with ACD risk in a Mendelian randomization analysis. Besides, our findings may differ in other biospecimens. However, this seems unlikely for 5OH-IAA since there is good agreement between its plasma, serum, and urine levels [60]. Moreover, we reported the impact of reverse causality and the strength of unmeasured confounders that are likely to explain away our findings. Despite these bias analyses and extensive adjustment for covariates, our results could still be biased due to measurement error and the use of two categories for some covariates. Our limited sample size precludes an intermediate step of internally validating and assessing the potential clinical utility of the true relevant predictors in a holdout dataset. Our findings should be confirmed in other studies, particularly in larger, well-characterized, and harmonized multicenter studies with broader microbiome metabolomics and a larger set of inflammatory markers. Since the levels of some of these dementia-related molecules are time-dependent [33], the relationship between their repeatedly measured levels and dementia risk should be investigated. Indeed, the AD-related molecules are interesting candidates to elucidate thoroughly in future biomarker-based research.
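
The standardization of continuous predictors noted among the strengths above can be illustrated with a minimal sketch. It assumes conventional z-scoring (subtract the sample mean, divide by the sample standard deviation); the exact procedure used in the study is not restated here, and the function name `standardize` and all numeric values are hypothetical, made up purely for illustration.

```python
from statistics import mean, stdev


def standardize(values):
    """Return z-scores: (value - sample mean) / sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample SD (n - 1 denominator)
    if s == 0:
        raise ValueError("cannot standardize a constant predictor")
    return [(v - m) / s for v in values]


# Hypothetical, made-up values on very different scales (NOT study data).
molecule_levels = [0.8, 1.1, 0.9, 1.4, 1.3]
marker_levels = [120.0, 310.0, 95.0, 480.0, 250.0]

# After standardization both predictors have mean 0 and SD 1, so a model
# coefficient for either one refers to a one-SD difference in that predictor.
z_molecule = standardize(molecule_levels)
z_marker = standardize(marker_levels)
```

Because both predictors end up on the same unitless scale, their effect estimates can be compared directly, which is the benefit the paragraph above describes.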

Conclusion

In a relatively large cohort of older Germans, we observed that circulating concentrations of seven gut microbiome molecules are related to dementia risk, of which 5OH-IAA is associated with long-term ACD risk. These molecules may underpin gut microbiome–host interactions in dementia occurrence. Modulating these molecules, for example through direct supplemental intake or probiotic consumption of their synthesizing bacteria, may be relevant to multifactorial dementia risk prevention and intervention strategies.