Background

Alzheimer’s disease (AD) is a debilitating neurodegenerative disorder, most prevalent in the elderly. Although AD is not part of the natural aging process, age is the most important non-genetic risk factor for the onset of this condition. AD places great strain on the U.S. healthcare system [1, 2] as well as on the health and financial well-being of family members and informal caregivers [1, 3,4,5,6,7,8]. At this time there is no clinically validated treatment available for AD other than palliative care; available pharmacological treatment options have uncertain efficacy [9,10,11,12] and are associated with significant monetary cost [13, 14]. Recent estimates place the prevalence of AD in U.S. older adults age 65+ at about 11,300 per 100,000 [15, 16]; although AD is more prevalent in females [17, 18], the role of biological sex in AD risk is not straightforward [19]. In the future, AD is expected to exhibit rising incidence and post-onset mortality coupled with falling survival [20], leading to a projected increase in the prevalence of AD in the U.S. [16, 21, 22].

Graves disease (GD) is an autoimmune disorder that causes overproduction of thyroid hormones, which can accelerate the metabolism, leading to weight loss, rapid heartbeat, and other symptoms; GD is the most common cause of hyperthyroidism in the U.S. [23]. Current estimates of the prevalence of GD in the U.S. range between 20 and 50 per 100,000 [24], with women (approximately 40 per 100,000) being at higher risk than men (approximately 10 per 100,000) [23, 25]. Unlike AD, GD can be treated successfully. However, short of total thyroidectomy or radioactive iodine ablation [26], there is no permanent cure, and other treatments are associated with high rates of recurrence and low rates of disease remission [23, 26, 27]. Although GD is not an aging-related disease per se, with incidence peaking between 30 and 50 years of age [23], some complications of GD are more common in the elderly [23, 28]. Increased risk of cognitive decline and the onset of AD/dementia have recently been associated with hyperthyroidism and GD [29,30,31,32,33,34,35,36], while results for hypothyroidism have been mixed [37,38,39,40]. If this association is correct, then mitigation of the risk associated with GD, through timely identification and successful treatment, becomes an actionable policy target with clear benefits both directly to individuals with hyperthyroidism and indirectly to the public by reducing the magnitude of the burden associated with AD. In this study, we explore the potential relationship between GD and the risk of clinical AD/dementia in later life through a comparison of propensity-score matched groups of U.S. older adults age 65+.

Data and methods

Data came from a nationally representative 5% sample of Medicare beneficiaries provided by the U.S. Centers for Medicare and Medicaid Services (CMS). In the U.S., the Medicare social health insurance system pays for the healthcare of over 98% of the U.S. population age 65+. The dataset spanned the 1991–2020 time-period and provided individual-level information on the dates of birth and death (if applicable), race/ethnicity, sex, and the diagnoses made (using International Classification of Diseases 9th (ICD-9) and 10th (ICD-10) revision codes) during episodes of care paid for by Medicare Parts A (facility) or B (professional) over that period. We limited our analysis to individuals living within the U.S. and enrolled in either traditional fee-for-service Medicare or a Medicare Advantage plan whose claims are processed by the CMS. Most Medicare Advantage plans do not share claims data with the CMS and therefore information on their beneficiaries is not available for research.

For the calculation of trends in age-adjusted incidence and prevalence, we required the beneficiary to be age 65+ and aggregated all individuals older than 100 into the 100+ age group (this was done both to simplify the age-adjusting process and to better comply with data reporting restrictions set forth by the CMS). The resulting samples contained over 1,600,000 individuals for each study year. Although estimates for 1991–1993 are provided, these should be interpreted with care, as many individuals in this period are still in the process of accumulating diagnoses after entry into the data. Incidence was calculated as the number of new cases of GD diagnosed before the end of a given year divided by the number of at-risk individuals present during that year. Prevalence was calculated as the number of living individuals diagnosed with GD on or before the end of a given year divided by the total number of living individuals. Age adjustment was done using the U.S. population for the year 2000 provided by the U.S. Census Bureau.
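The age adjustment described above can be illustrated with a minimal sketch of direct standardization, assuming that age-specific rates are weighted by the share of each age group in a standard population. The function name, the age groups, and all counts below are hypothetical and are not taken from the study data.

```python
def age_adjusted_rate(cases_by_age, population_by_age, standard_population, per=100_000):
    """Directly standardized rate: a weighted sum of age-specific rates.

    All three arguments are dicts keyed by age group (hypothetical labels).
    Weights are the standard population's share in each age group.
    """
    total_standard = sum(standard_population.values())
    adjusted = 0.0
    for age_group, standard_count in standard_population.items():
        at_risk = population_by_age.get(age_group, 0)
        if at_risk == 0:
            continue  # empty stratum: treated as contributing a rate of zero
        age_specific_rate = cases_by_age.get(age_group, 0) / at_risk
        adjusted += age_specific_rate * standard_count / total_standard
    return adjusted * per


# Hypothetical counts, for illustration only.
cases = {"65-74": 30, "75-84": 40, "85+": 20}
population = {"65-74": 10_000, "75-84": 8_000, "85+": 2_000}
standard = {"65-74": 500, "75-84": 300, "85+": 200}

print(age_adjusted_rate(cases, population, standard))
```

In this toy example the crude rate is 450 per 100,000, while the adjusted rate is higher because the assumed standard population gives more weight to the oldest group.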

For the survival analysis portion of the study, we additionally required the presence of at least 3 years of look-back to ascertain the presence of baseline comorbidities, and at least one year of follow-up. The baseline age was set at the time of a verified GD diagnosis for the cases and three years after data entry for the controls. The look-back period was measured both from the individual age and from the calendar time perspective, making the minimum baseline age about 67.5 and the minimum baseline time January 01, 1994. Individuals with AD on or before the baseline date were excluded. The final sample pool for the survival analysis portion of the study consisted of 3,399,925 individuals.

The presence of co-morbid conditions from the Elixhauser co-morbidity index [41], Graves Disease (ICD-9: 242.00, 242.01; ICD-10: E05.00, E05.01), and Alzheimer’s Disease (ICD-9: 331.0; ICD-10: G30), our primary outcome, was identified as follows. For AD and GD, we required at least two distinct claims no more than two years apart. The date of onset was set at the earlier of the two dates. This was motivated by the relative rarity of the conditions and the need to mitigate the potential bias associated with erroneous diagnosis. Studies on GD relying on biologic data [42] have required the presence of two distinct serum thyroid-stimulating hormone concentration test values of < 0.3 mIU/L. Although our data do not include test results (therefore we do not observe whether the < 0.3 mIU/L cut-off was reached), requiring a second episode of care with GD recorded as a diagnosis helps to approximate this process. Furthermore, the above criteria may themselves miss individuals with rapid post-treatment normalization of thyroid-stimulating hormone concentration. To address this issue, we repeated our primary analysis for individuals with only a single Graves Disease claim on record, who were excluded from the Graves disease group in the primary analysis (Supplementary Appendix A). For the Elixhauser-based co-morbidity index, a more standard requirement of 90 days between two distinct claims was used. We also included the following socio-demographic covariates: male sex; Black, Hispanic, and Other (including Asian and Native American/Pacific Islander) race/ethnicity; dual eligibility (as a proxy for poor economic status); and a yearly trend (to represent changes in technology and practice).
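The two-claim ascertainment rule for AD and GD can be sketched as follows. This is an illustrative reading of the rule, not the authors' code: it assumes that "distinct claims" means claims on distinct dates and that "two years" is 730 days. The function name and the example dates are hypothetical.

```python
from datetime import date


def onset_date(claim_dates, max_gap_days=730):
    """Return the onset date under a two-claim rule, or None if not met.

    Onset is the earlier date of the first pair of distinct claim dates
    lying no more than max_gap_days apart (730 days is an assumption).
    """
    unique_dates = sorted(set(claim_dates))
    # After sorting, checking neighbours is enough: if any two dates are
    # within the gap, the dates between them are within the gap as well.
    for earlier, later in zip(unique_dates, unique_dates[1:]):
        if (later - earlier).days <= max_gap_days:
            return earlier
    return None


# Hypothetical claim history: the 2001 claim has no confirming claim
# within two years, so onset is assigned to the 2004 claim.
claims = [date(2001, 1, 1), date(2004, 6, 1), date(2005, 3, 1)]
print(onset_date(claims))
```

Under this reading, an individual with a single GD claim returns `None` and would fall into the group examined separately in the sensitivity analysis.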

Of the 3,399,925 individuals eligible for the analysis, 19,852 were identified as GD cases. After comparing the summary statistics between these groups, we concluded that the GD group was too dissimilar from the healthy population for direct comparisons (Tables 1 and 2: Unmatched Full Sample column). Therefore, a Greedy Propensity Score Matching (PSM) algorithm [43, 44] was used to identify a comparable group of individuals from the healthy control pool. We used 1:1 matching without replacement [45] based on propensity scores generated by a logistic model designed to estimate the probability of having GD using the 31 Elixhauser co-morbidities and available demographic variables. In this way, we were able to identify 19,798 matched pairs for GD (Tables 1 and 2: Matched Full Sample column). To assess any differences in risk associated with race, ethnicity and/or sex we stratified the full sample into six race/ethnicity/sex-specific subgroups and re-ran the PSM algorithm. This resulted in the identification of 4,233 matched pairs for male, 15,534 for female, 16,488 for White, 1,958 for Black, 226 for Hispanic and 948 for individuals of Other races. All analysis in this study is repeated for each of these subgroups.
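To make the matching step concrete, the following is a minimal sketch of greedy 1:1 nearest-neighbor matching without replacement on precomputed propensity scores. It is not the SAS routine used in the study. The optional `caliper` argument, the order in which cases are processed, and all identifiers and scores are assumptions made for illustration; in practice the scores would come from the logistic model described above.

```python
def greedy_match(case_scores, control_scores, caliper=None):
    """Greedy 1:1 matching without replacement on propensity scores.

    case_scores, control_scores: dicts mapping an id to a propensity score.
    caliper: optional maximum allowed score distance (an assumption; the
    source does not state whether a caliper was used).
    Returns a list of (case_id, control_id) pairs.
    """
    available = dict(control_scores)  # controls not yet used
    pairs = []
    for case_id, score in case_scores.items():
        if not available:
            break
        # Nearest remaining control by absolute score distance.
        best = min(available, key=lambda cid: abs(available[cid] - score))
        if caliper is not None and abs(available[best] - score) > caliper:
            continue  # leave this case unmatched
        pairs.append((case_id, best))
        del available[best]  # without replacement: each control used once
    return pairs


# Hypothetical scores, for illustration only.
cases = {"c1": 0.30, "c2": 0.32}
controls = {"k1": 0.31, "k2": 0.50, "k3": 0.27}
print(greedy_match(cases, controls))
print(greedy_match(cases, controls, caliper=0.02))
```

Because the algorithm is greedy, the result depends on the order in which cases are visited, and a case can be left unmatched when no sufficiently close control remains. The linear scan over controls is kept for readability and would be too slow for a control pool of several million without an index or sorting step.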

Table 1 Summary statistics
Table 2 Propensity score matching quality

The standardized difference [46] was used to assess the inter-group differences before and after PSM. The standardized difference is not affected by differences in sample size and has the benefit of being relatable to two other measures of association, the Pearson correlation coefficient for continuous and the phi coefficient for dichotomous variables [47, 48]. We used the criterion of \(\left|\Delta_{s}\right| \le 10\%\) to judge whether the inter-group differences had been reduced to a level sufficient for further analysis. Using this criterion, we judged that the PSM algorithm was successful (Table 2), with some exceptions. The baseline ages for the GD group occurred 11.73 (Full sample), 12.81 (Female sample), 12.10 (White sample) and 10.96 (Black sample) percentage points (pp) earlier on average than in their PSMed counterparts. None of these differences were greater than 1 year in real terms, and this difference in age was explicitly addressed by the way we utilized age in our survival analysis models. The baseline dates for the GD group also occurred between 6.88pp (Black sample) and 23.02pp (Male sample) earlier than those of their PSMed counterparts. However, none of these differences were greater than 2 years, and most were under one year, in real terms. Individuals with GD were 6.88pp (Black sample) to 23.02pp (Male sample) less likely to be dual eligible than their PSMed counterparts. There were also some minor differences in the rates of chronic pulmonary diseases in the White (-11.83pp) and Male (-13.86pp) samples; peripheral vascular disorders in the Male (-10.3pp) sample; and blood loss/anemia in the Hispanic (+11.00pp) sample.
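For readers unfamiliar with the balance measure, the sketch below computes standardized differences in percent using the commonly used pooled-standard-deviation formulas for continuous and dichotomous covariates. We assume these correspond to the definition in [46]; the function names and all input values are hypothetical.

```python
import math
from statistics import mean, variance


def std_diff_continuous(treated, control):
    """Standardized difference (in percent) for a continuous covariate."""
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2)
    return 100 * (mean(treated) - mean(control)) / pooled_sd


def std_diff_binary(p_treated, p_control):
    """Standardized difference (in percent) for a dichotomous covariate,
    given the proportion with the characteristic in each group."""
    pooled_sd = math.sqrt(
        (p_treated * (1 - p_treated) + p_control * (1 - p_control)) / 2
    )
    return 100 * (p_treated - p_control) / pooled_sd


# Hypothetical example: 20% of the GD group versus 25% of controls are
# dual eligible. The absolute value exceeds the 10% balance criterion.
print(std_diff_binary(0.20, 0.25))

# Hypothetical baseline ages in two small groups.
print(std_diff_continuous([70, 72, 74], [71, 73, 75]))
```

Both functions divide by a pooled standard deviation, so they are undefined when a covariate has no variation in either group.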

Survival analysis was done using two methods: the Cox proportional hazards model and the Fine-Gray competing risk model [49] with death as the competing risk. In both cases, age, the most important non-genetic risk factor for AD, was included non-parametrically as a time-scale variable. Thus, the partial likelihood is maximized for individuals with the same value of the time scale variable. Therefore, the effects of age in the model are accounted for non-parametrically and, in a certain sense, exactly. The only covariate explicitly included in the model was membership in the GD group. The PSM matching ensures that the GD and non-GD groups were nearly identical in terms of all other covariates at baseline. All analysis was done using SAS 9.4 software (Cary, NC: SAS Institute Inc.) after obtaining permission from the Duke University IRB.
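The use of age as the time scale can be illustrated with a small sketch of the Cox partial likelihood for a single binary covariate (GD group membership). This is a simplified stand-in for the SAS analysis, not a reproduction of it: it assumes delayed entry at the baseline age, handles tied event ages in the Breslow manner, and does not implement the Fine-Gray competing risk model. The function name and the records are hypothetical.

```python
import math


def fit_log_hazard_ratio(records, iterations=25):
    """Estimate the log hazard ratio for a 0/1 covariate by Newton-Raphson.

    records: tuples (entry_age, exit_age, had_event, in_gd_group).
    Age is the time scale: at each event age, the risk set holds everyone
    who entered before that age and is still under observation at it.
    """
    beta = 0.0
    for _ in range(iterations):
        score = 0.0
        information = 0.0
        for _entry_i, exit_i, event_i, x_i in records:
            if not event_i:
                continue
            # Covariate values of individuals at risk at age exit_i.
            risk_set = [x for entry, exit_age, _event, x in records
                        if entry < exit_i <= exit_age]
            s0 = sum(math.exp(beta * x) for x in risk_set)
            s1 = sum(x * math.exp(beta * x) for x in risk_set)
            expected_x = s1 / s0
            score += x_i - expected_x
            # Variance of a 0/1 covariate within the weighted risk set.
            information += expected_x * (1 - expected_x)
        if information == 0:
            break
        beta += score / information
    return beta


# Hypothetical matched records: (entry age, exit age, AD onset, GD group).
records = [
    (67, 70, True, 1), (67, 72, True, 1), (68, 80, False, 1), (67, 78, True, 1),
    (67, 75, True, 0), (67, 71, True, 0), (68, 82, False, 0), (67, 85, True, 0),
]
beta = fit_log_hazard_ratio(records)
print(math.exp(beta))  # estimated hazard ratio for the GD group
```

Because each comparison is made among individuals of the same age, no functional form for the effect of age has to be specified, which is the sense in which age is handled non-parametrically.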

Results

The total age-adjusted prevalence of GD among individuals age 65+ grew over the study period, reaching a maximum of 495 per 100,000 in 2012 (Fig. 1A). Note that the nature of GD (e.g., need for long-term treatment; high recurrence and low remission rates), combined with the low accuracy of identifying remission from administrative health data, led to the decision to treat GD as a permanent condition. Therefore, the prevalence levels are likely to be overestimated. Under the counterfactual assumption that all instances of GD are cured over 5 years, the initial estimates fall sharply (Fig. 1A). As expected, the prevalence of GD (Fig. 1B) is significantly higher in females (maximum of 726 per 100,000 in 2019) than in males (maximum of 244 per 100,000 in 2010). Black individuals (Fig. 1C) have the highest prevalence of GD among all races/ethnicities (maximum of 705 per 100,000 in 2012), and the difference from all other groups is statistically significant from 2000 onwards. In contrast, Hispanic individuals have the lowest prevalence of GD (maximum of 450 per 100,000 in 2011). However, this prevalence is not statistically different from that of other non-Black races/ethnicities until 2014.

Fig. 1

Trends in Graves disease prevalence. Trend in the prevalence of individuals ever to be diagnosed with Graves Disease (per 100,000). A: Full sample, Graves Disease as an absorbing state (black solid line); full sample, Graves Disease in remission after 5 years (black dashed line). B: Males (black dotted line), Females (black dot-dash line). C: White (black solid line), Black (blue solid line), Hispanic (red solid line), Other (green solid line)

The total age-adjusted incidence of GD among individuals age 65+, although subject to some fluctuations, is fairly constant, with a maximum of 65 per 100,000 in 2006 (Fig. 2A). As with prevalence, GD incidence (Fig. 2B) in females (maximum of 88 per 100,000 in 2006) is significantly higher than that in males (maximum of 33 per 100,000 in 2006). However, unlike prevalence, no strong race/ethnicity-related patterns in incidence can be observed (Fig. 2C). Black individuals have the highest incidence rates (maximum of 100 per 100,000 in 2012) and Hispanic individuals the lowest. However, the race/ethnicity-specific confidence intervals overlap. Only in 2012 can we say with any statistical confidence that Black incidence rates of GD are higher than those of the Hispanic group.

Fig. 2

Trends in Graves disease incidence. Trend in the incidence of Graves Disease (per 100,000). Full sample (black solid line), Males (black dotted line), Females (black dash line), White (black solid line), Black (blue solid line), Hispanic (red solid line), Other (green solid line)

Survival analysis results are presented in Table 3. The Cox model shows that in a PSM sample, the presence of Graves disease is associated with a higher risk of AD in the full sample (Hazard Ratio [HR]: 1.19; 95% Confidence Interval [CI]: 1.13–1.26), as well as in the Male (HR: 1.23; CI: 1.03–1.47), Female (HR: 1.17; CI: 1.08–1.25), and White (HR: 1.14; CI: 1.06–1.23) groups. The estimates from the competing risk model are lower on average but highly consistent with those of the traditional Cox model. The association between GD and AD risk in Black individuals (HR: 1.23; CI: 1.02–1.49) becomes significant once the competing risk of death is accounted for. The hazard ratios obtained in the sensitivity analysis described in Supplementary Appendix A are consistent with the primary results, and the effect directions and confidence intervals of the subgroups for which a statistically significant effect could not be identified were consistent with the significant findings. No race/ethnicity/sex-related disparities in the effect of GD on AD could be observed, as the CIs for all study subgroups overlap.

Table 3 Survival analysis results

Discussion

In this paper we found that the presence of GD is associated with a significantly higher risk of clinical AD. Hyperthyroidism is a medical condition where thyroid-stimulating hormone (TSH) levels are low or even undetectable with normal free thyroxine and total or free triiodothyronine levels. It can be caused by increased endogenous production of thyroid hormone, by administration of thyroid hormone to treat malignant thyroid disease, or by excessive replacement therapy [50]. The association of TSH with dementia [51] suggests that prolonged exposure to low TSH levels could be detrimental to brain function. Individuals with goiter, hypothyroidism, thyroiditis, or hyperthyroidism faced an increased risk of AD, particularly in younger age groups, females, and those with lower comorbidity scores [52]. This, in turn, suggests that the level of estrogen is associated with thyroid function, which was confirmed by studies, including ours, showing that thyroid disorders are more prevalent in women than in men [53].

Another potential mechanism connecting GD (and related hyperthyroidism) to AD may involve shared etiological factors between the two diseases, such as viral infections, compromised immunity or autoimmunity, and neuroinflammation, which may themselves contribute to the development of both GD and AD [35, 54,55,56,57,58]. Neuroinflammation of the microglia, the brain’s resident macrophages [59], has been suggested to play a central role in the pathophysiological processes in both GD and AD [60,61,62]. In addition, hyperthyroidism was shown to aggravate cognitive deficits in AD mice and to induce Aβ deposition and neuronal loss through neuroinflammation [60].

Findings showing that both low and high thyrotropin (a pituitary hormone that stimulates the production of thyroid hormones) could be associated with increased risk of AD [32], suggest the possibility of a U-shaped relationship. This mechanism is supported by significantly higher prevalence rates of both hypothyroidism and hyperthyroidism in participants with AD found among a sample of the Korean National Health Insurance beneficiaries [52], and the lack of association between hyperthyroidism and dementia found in some studies [63, 64].

The above suggests that the role of metabolism could also be contextual [65, 66]. For example, slowdowns in metabolism might promote dementia through declining rates of information processing or by impairing the resilience of the body to adverse health events, such as infections, through delayed immune responses, slower healing, and longer recovery time [66]. On the other hand, it may also slow some aging-related changes in the body and be beneficial in the long term by, for example, reducing the rate of deterioration of the cortex and white matter in the brain [65, 66].

This study has several strengths. First, it was conducted after equalizing the GD and non-GD groups across a wide range of demographic and health-related conditions. This is vital as the GD group was shown to be statistically different from the non-GD population across many health-related conditions, including AD risk-related diseases. Second, it is based on data nationally representative of the 65+ population with follow-up periods of 27+ years. This provides the study with a relatively large group of individuals with a clinical diagnosis of GD even in smaller population strata. The study also has limitations. It is based on administrative claims data designed for billing purposes, not research. Therefore, valuable information, such as the results of laboratory analyses, is not available. However, the disease ascertainment algorithm used is designed to reduce the impact of tentative and mistaken diagnoses (essentially, we are assuming that if the laboratory results were not consistent with a diagnosis of GD, then it would not be listed as a diagnosis on the second visit), and our estimates of the extra risk of AD onset associated with GD are highly consistent with the estimates of at least one study based on biological measurements [42].

Similarly, AD, in the context of our data, represents a clinical diagnosis of possible/probable Alzheimer’s Disease dementia, and may not reflect the exact etiology of the individual’s actual condition. AD is often mistaken for other conditions and often co-exists with other types of dementia, making its diagnosis before autopsy difficult. Even though the study is nationally representative, and the sample size reflects the true situation as capturable by this dataset, additional studies with a focus on oversampling minority groups are warranted. Finally, we were not able to differentiate between the effects of treated/controlled GD and situations where disease management is proving a challenge, or to assess any additional risk associated with alternative types of GD treatment. This is an important avenue of future research, as studies have shown that for some chronic health conditions related to AD risk, aggressive management of the risk-related disease acts to reduce the associated AD risk as well [67].

Conclusion

Although the exact biological mechanisms potentially linking the two conditions are unclear and focused studies of race/ethnicity-specific subgroups as well as the replication of these findings on datasets with available biomarkers and laboratory test results are needed, our findings support the hypothesis that there may be a strong relationship between a diagnosis of GD and a diagnosis of AD in later life.