Introduction

Autoimmune thyroid disorders (AITDs) have an incidence of about 0.3–1.5 per 1,000 persons per year, with a female predominance. AITDs are influenced by genetic, environmental, and epigenetic factors [1,2,3]. The frequency of AITDs has tended to increase in different parts of the world in recent decades [4,5,6]. Rapidly changing environmental factors may be mainly responsible for this increase [3].

The two most frequent types of AITD are Hashimoto’s thyroiditis (HT) and Graves’ disease (GD). Although the underlying autoimmune mechanisms are similar in the two disorders, the targets of the autoantibodies are different, and as a result, the symptoms are also different [1, 7]. HT causes hypothyroidism in about 20–30% of patients [2]. GD is characterised by unregulated thyroid hyperfunction, which may also be accompanied by orbital and pretibial extrathyroidal manifestations [8]. Because HT and GD are longstanding diseases with fluctuating disease activity and changing symptoms, they require close medical monitoring and control to avoid severe complications [2, 6].

Living with autoimmune thyroid disease can seriously affect health-related quality of life (HRQoL) [9, 10] through disturbing physical symptoms of hypo- or hyperthyroidism, mood-related problems, sexual dysfunction, neurocognitive disturbances, and cosmetic complaints. Consequently, daily and social functioning are frequently impaired [9,10,11,12]. Since symptoms, their effects on well-being, and comorbid difficulties vary with inflammatory activity, hormone level changes, and tissue damage, it is necessary to regularly monitor the burden experienced by patients. Age and time since diagnosis can also be important determinants of symptoms and quality of life: a long-lasting autoimmune process can cause progressive changes in the thyroid gland and brain, leading to an increase in certain symptoms, and both factors can affect coping with and adaptation to the disease [9, 13].

Measuring HRQoL in thyroid patients with relatively easy-to-apply self-reported scales can bring many benefits. Disease-specific HRQoL questionnaires give physicians a quick, broad impression of the primary and most stressful symptoms, help to follow some signs of disease activity and the effects of treatment on symptoms and well-being, and provide information about the need for mental health care [10, 14,15,16,17].

Specifically, in autoimmune thyroid diseases, monitoring HRQoL can be extremely important because hormone levels (the medical parameter checked routinely) do not always provide accurate information about the effects of the illness on well-being. HRQoL seems more closely related to underlying autoimmune activity or to comorbid symptoms and disorders than to hormone levels or their changes [18,19,20]. Based on systematic review articles, the Thyroid-Related Patient-Reported Outcome (ThyPro) is recommended for assessing HRQoL in patients with different benign thyroid diseases and dysfunctions [10, 21].

Although ThyPro shows good psychometric characteristics, content, and structural and cross-cultural validity [10, 21, 22], the need for an abbreviated version has arisen [23]. The short form of the ThyPro (ThyPro-39) is based on the original questionnaire and was constructed with item response theory and validation studies. The exploration of the psychometric characteristics of the abbreviated version has just recently begun [10, 24, 25].

Our study aimed to adapt and validate the Hungarian version of the ThyPro-39 questionnaire, test its factor structure [23], and explore its construct validity (including known-group validity) in the two most frequent autoimmune thyroid diseases: HT and GD. Perceived stress is expected to be associated with the general factors and psychosocial specific factors only, and not expected to be associated with the specific symptom factors.

Materials and methods

Procedure and participants

The study procedures were approved by the Scientific and Research Ethics Committee of the Hungarian Medical Research Council (SE TUKEB 256/2021) and were completed in accordance with the Declaration of Helsinki as revised in 2013. We recruited our participants from disease-specific groups on social media sites. Informed participant consent was obtained.

The study included data from 291 participants. Of these, 82.5% (N = 240) had an HT diagnosis and 17.5% (N = 51) had a GD diagnosis. Most participants (96.2%; N = 280) reported being female. The mean age was 45.1 years (SD = 12.0; range: 22–78). The detailed sociodemographic data are shown in Table 1.

Table 1 Demographic and disease-specific characteristics of the recruited sample (N = 291)

Measures

Thyroid-Related Patient-Reported Outcome-39

HRQoL was measured with the ThyPro-39 questionnaire [24]. The participants responded on a five-point Likert scale (0 = no symptoms; 4 = severe symptoms). The questionnaire contains 12 scales and an individual item measuring the overall HRQoL impact. An additional composite summary score was also created by the original authors [23]. The composite scale is based on the 22 items from the tiredness, cognitive problems, anxiety, depressivity, emotional susceptibility, impaired social life, and impaired daily life scales plus the overall HRQoL impact item. The composite score is useful when the goal is simplicity of reporting combined with small measurement intervals and high precision. In previous research, this version showed good test–retest reliability and adequate responsiveness to clinical change [23]. Before the start of the research, the questionnaire was officially translated into Hungarian following the instructions and under the supervision of the original authors. The procedure comprised two independent Hungarian translations, a common translation based on the two Hungarian versions, back translation into English, evaluation by the original authors, a reconciled Hungarian version, cognitive debriefing, and the final Hungarian version.

Perceived Stress Scale-10

The Hungarian version of the Perceived Stress Scale-10 (PSS-10) was used to measure the stress level [26, 27]. A higher score indicates more frequent stressful situations; the total score is a global indicator of perceived stress.

Mental Health Continuum Short Form

The Mental Health Continuum Short Form (MHCS-SF) is a scale measuring emotional well-being, consisting of 14 items [28, 29]. Overall, the questionnaire describes the subjective well-being of the respondent globally (MHCS-Total), while its three scales (Hedonic — Emotional, Eudaimonic — Social, Eudaimonic — Psychological) summarise each well-being area separately.

Statistical analysis

Data were analysed with SPSS 26 (IBM, 2017) and Mplus 8.5. To test the measurement models, we used a series of confirmatory factor analyses (CFAs) with maximum likelihood estimation robust to non-normality (MLR). Model fit was evaluated with standard goodness-of-fit indices [30]: the Comparative Fit Index (CFI; ≥ 0.95 excellent, ≥ 0.90 adequate), the Tucker–Lewis Index (TLI; ≥ 0.95 excellent, ≥ 0.90 adequate), and the Root Mean Square Error of Approximation (RMSEA; ≤ 0.06 excellent, ≤ 0.08 adequate).

Whenever applicable, the scaled χ2 difference test and the sample-size adjusted Bayesian Information Criterion (ssaBIC) were used to compare alternative measurement models. The scaled χ2 difference test compares nested CFA models when the data may not meet the assumptions of normality or independence; a straightforward manual calculation for the MLR estimator is available on the Mplus website. The ssaBIC balances goodness-of-fit against model complexity by penalising models that have more parameters; among the candidate models, a lower ssaBIC value indicates a better trade-off between the two.

The practical application of the questionnaire raised the possibility of using a composite score. To test the reliability and validity of the composite score, we tested a bifactor model similar to that used in previous work [23]. The bifactor model simultaneously represents a general severity factor and the problems or symptoms represented by the specific factors. The usual specification of a bifactor model requires that the specific factors correlate neither with each other nor with the general factor [30]. The advantage of bifactor modelling is that it makes it possible to quantify the appropriateness of a composite score. We therefore quantified the proportion of common variance attributable to the general factor with the explained common variance (ECV) index [31, 32]. We also used omega and omega hierarchical indices to measure how precisely a self-reported symptom scale score assesses the combination of general and specific constructs and a certain target construct [33]. The interpretation of the ECV, omega, and omega hierarchical indices is moderated by the percentage of uncontaminated correlations (PUC). With a high PUC value (> 0.90), the indices can be interpreted directly.
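
The bifactor indices named above can be illustrated with a minimal sketch. It assumes the commonly used formulas for standardised loadings from an orthogonal bifactor solution, with each item loading on the general factor and on exactly one specific factor. The function name `bifactor_indices` and the loadings are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def bifactor_indices(items):
    """Compute ECV, omega, omega hierarchical and PUC for a bifactor solution.

    `items` is a list of (specific_factor, general_loading, specific_loading)
    tuples holding standardised loadings (an assumption of this sketch).
    """
    groups = defaultdict(list)
    for name, g, s in items:
        groups[name].append((g, s))

    g_sum = sum(g for _, g, _ in items)
    g_sq = sum(g * g for _, g, _ in items)
    s_sq = sum(s * s for _, _, s in items)
    # Residual variance of a standardised item is 1 minus its communality.
    resid = sum(1.0 - g * g - s * s for _, g, s in items)

    # Squared sum of specific loadings, accumulated factor by factor.
    s_sum_sq = sum(sum(s for _, s in m) ** 2 for m in groups.values())
    total_var = g_sum ** 2 + s_sum_sq + resid

    # Omega hierarchical for each subscale: variance in the subscale score
    # due to its specific factor after removing the general factor.
    omega_hs = {}
    for name, m in groups.items():
        gs = sum(g for g, _ in m)
        ss = sum(s for _, s in m)
        res = sum(1.0 - g * g - s * s for g, s in m)
        omega_hs[name] = ss ** 2 / (gs ** 2 + ss ** 2 + res)

    # PUC: share of item pairs that belong to different specific factors.
    n = len(items)
    within = sum(len(m) * (len(m) - 1) // 2 for m in groups.values())
    return {
        "ECV": g_sq / (g_sq + s_sq),
        "omega": (g_sum ** 2 + s_sum_sq) / total_var,
        "omega_h": g_sum ** 2 / total_var,
        "omega_hs": omega_hs,
        "PUC": 1.0 - within / (n * (n - 1) // 2),
    }


# Hypothetical loadings: two specific factors ("A", "B") with three items each.
example = [("A", 0.6, 0.4)] * 3 + [("B", 0.6, 0.4)] * 3
result = bifactor_indices(example)
print({k: v for k, v in result.items() if k != "omega_hs"})
```

In this toy example the general loadings dominate, so ECV and omega hierarchical are high relative to the subscale-level omega hierarchical values; with only two specific factors, PUC is far below the 0.90 threshold mentioned above.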

The construct validity of ThyPro-39 was tested using a CFA with covariates model. This model allows the effects of grouping variables, such as diagnosis, and of continuous variables on the latent variables to be estimated, so known-group validity can be investigated while controlling for other covariates. Perceived stress, age, and sex were included as additional covariates to provide further evidence of construct validity.

Results

Descriptive statistics

The descriptive characteristics are shown in Table 2. Most scales show acceptable or good internal consistency. Table 2 also contains the comparison of the HT and GD groups on different variables. The correlations among the main variables can be found in Table 3.

Table 2 Descriptive statistics of the scales
Table 3 Correlations among the main well-being variables

Measurement models: confirmatory factor analyses

The measurement model was tested by a series of CFAs. The model fit indices are presented in Table 4. The model with 12 correlated first-order factors (Model 1) showed a good fit. As an alternative model, we examined the fit of a bifactor model that included an overall general factor in addition to the 12 specific factors (Model 2a). Although this model also showed an acceptable fit to the data, the chi-squared difference statistic indicated a better fit for the simpler model (Model 1).
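
The manual calculation of the scaled chi-squared difference for the MLR estimator can be sketched as follows. This is a minimal illustration of the standard scaled difference formula, which uses each model's chi-squared value, degrees of freedom, and scaling correction factor. The function name and all input values are hypothetical and are not the values behind Table 4.

```python
def scaled_chi2_difference(t0, df0, c0, t1, df1, c1):
    """Scaled chi-squared difference test for two nested models.

    t0, df0, c0: scaled chi-squared, degrees of freedom and scaling
                 correction factor of the nested (more restrictive) model.
    t1, df1, c1: the same quantities for the comparison (less restrictive) model.
    Returns the scaled difference statistic and its degrees of freedom.
    """
    df_diff = df0 - df1
    if df_diff <= 0:
        raise ValueError("the nested model must have more degrees of freedom")
    # Scaling correction for the difference test.
    cd = (df0 * c0 - df1 * c1) / df_diff
    if cd <= 0:
        raise ValueError("non-positive difference scaling correction")
    # Rescale the two statistics back to the uncorrected metric, then correct
    # their difference with cd.
    trd = (t0 * c0 - t1 * c1) / cd
    return trd, df_diff


# Hypothetical fit values for a nested and a comparison model.
trd, df_diff = scaled_chi2_difference(120.0, 50, 1.2, 100.0, 45, 1.1)
print(round(trd, 3), df_diff)
```

The resulting statistic is referred to a chi-squared distribution with `df_diff` degrees of freedom. A non-positive correction `cd` can occur in practice, which is why the sketch raises an error rather than returning a misleading value.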

Table 4 Model fit coefficients of the CFA models

Noting that the factors in the measurement model can be grouped into two major categories, namely somatic symptoms and psychosocial factors, we examined the fit of a model containing two general factors (Somatic symptoms and Psychosocial symptoms) in addition to the 12 specific factors (Model 3a). The fit of this model was slightly weaker than that of Model 1 based on the chi-squared difference statistic, but it was better according to information criteria (e.g., ssaBIC).
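
To show why an information criterion and a difference test can point in different directions, the sketch below computes the BIC and the sample-size adjusted BIC from a model's log-likelihood. It assumes the usual definition in which the sample size n in the BIC penalty is replaced by (n + 2) / 24. The sample size of 291 is taken from this study, whereas the log-likelihoods and parameter counts are hypothetical.

```python
import math


def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion: -2 logL + k * ln(n)."""
    return -2.0 * loglik + n_params * math.log(n_obs)


def ssabic(loglik, n_params, n_obs):
    """Sample-size adjusted BIC: -2 logL + k * ln((n + 2) / 24)."""
    return -2.0 * loglik + n_params * math.log((n_obs + 2) / 24.0)


n = 291  # sample size reported in the study
# Hypothetical models: B fits better but uses 30 more parameters than A.
ll_a, k_a = -15000.0, 180
ll_b, k_b = -14950.0, 210

for label, ll, k in (("A", ll_a, k_a), ("B", ll_b, k_b)):
    print(label, round(bic(ll, k, n), 1), round(ssabic(ll, k, n), 1))
```

Because the adjusted criterion penalises each extra parameter less heavily than the BIC does at this sample size, a more complex model can be preferred by the ssaBIC even when stricter criteria favour the simpler one.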

Since an item in the questionnaire (item 12) is not linked to a specific factor, we also tested measurement models in which this general item only loads on its corresponding general factor (Model 2b and Model 3b). A comparison of the fit of these models with the base model (Model 1) and the previous bifactor models (Model 2a and Model 3a) is not possible because they do not have the same item set.

We also tested the two parts of the questionnaire separately. On the one hand, we tested a model with seven correlated first-order factors that represented only Psychosocial symptoms (Model 4). We also tested a bifactor version (Model 5a), which now exhibited a slightly better fit. On the other hand, we tested the model with the five primary somatic symptom factors and its bifactor version. Again, the bifactor model showed a better fit to the data. Thus, the two bifactor models are presented in further analyses.

The standardised factor loadings of the bifactor model with somatic symptoms and the main indices associated with the bifactor analysis are presented in Table 5. All items loaded significantly on the general factor. For the specific factors, all items loaded significantly except two items. The general factor explained 41% of the common variance (ECV). This value supports the presence of a general factor. The omega indices ranged from 0.70 to 0.90. The omega hierarchical can indicate the added meaning of specific factors over the general factor. These indices were high for the Goitre symptoms, Cosmetic complaints, and Eye symptoms factors and relatively low for Hyperthyroid and Hypothyroid symptoms.

Table 5 Bifactor model of somatic symptoms

The standardised factor loadings of the model measuring psychosocial symptoms and the main indices associated with the bifactor analysis are presented in Table 6. All items loaded significantly on the general factor. The items also loaded significantly on their corresponding specific factors except for one item. The general factor explained 57% of the common variance (ECV). This value supports the presence of a strong general factor. The omega reliability indices ranged from 0.46 to 0.93. High omega hierarchical values were obtained for the Cognitive problems and Emotional susceptibility factors. The Anxiety, Depressivity, and Impaired social life factors still showed acceptable values, while Tiredness and Impaired daily life did not reach our predefined criterion level.

Table 6 Bifactor model of psychosocial symptoms

Construct validity of general and specific factors: a confirmatory factor analysis with covariate analysis

To test the construct validity of the general and specific factors, we conducted a CFA with covariate analysis using age, gender, diagnosis, and perceived stress to explain the 12 primary and two general factors. The standardised coefficients are presented in Table 7. For the analysis, we applied the traditional p < .05 significance level; however, in the more detailed interpretations, we will only consider those covariates that reach the p < .01 level.

Table 7 CFA with covariates model

The analysis revealed that the variance of the general (composite) Somatic symptoms factor was significantly explained by age, diagnosis, and perceived stress, of which age and perceived stress remained significant at the more restrictive significance level. The general (composite) Psychosocial symptoms factor was significantly explained by all four predictors, but only perceived stress reached the stricter criterion of significance. However, the association of perceived stress with the general Psychosocial symptoms factor was significantly stronger than with the general Somatic symptoms factor (Wald test = 24.7, p < .001).
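
A Wald test of whether two coefficients are equal can be sketched as follows. This is a generic illustration that assumes a single degree of freedom and uses hypothetical estimates, standard errors, and covariance; it is not the model specification or the estimates used in this study.

```python
import math


def wald_equal_coefficients(b1, se1, b2, se2, cov12=0.0):
    """Wald test of H0: b1 == b2 (one degree of freedom).

    b1, b2:   the two coefficient estimates.
    se1, se2: their standard errors.
    cov12:    sampling covariance between the two estimates.
    Returns the Wald statistic and its p-value.
    """
    var_diff = se1 ** 2 + se2 ** 2 - 2.0 * cov12
    if var_diff <= 0:
        raise ValueError("variance of the difference must be positive")
    w = (b1 - b2) ** 2 / var_diff
    # Upper tail of a chi-squared distribution with 1 df, written with the
    # complementary error function: P(X > w) = erfc(sqrt(w / 2)).
    p = math.erfc(math.sqrt(w / 2.0))
    return w, p


# Hypothetical standardised effects of a covariate on two general factors.
w, p = wald_equal_coefficients(0.70, 0.05, 0.40, 0.05)
print(round(w, 2), p < 0.001)
```

Under the same one-degree-of-freedom assumption, a statistic of 24.7 gives `math.erfc(math.sqrt(24.7 / 2))`, which is well below .001 and therefore consistent with the reported p-value.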

Significant associations were also found for specific factors, even when controlling for general factors. Higher age predicted milder cognitive problems and anxiety and more severe eye symptoms. Gender differences were found in only one specific factor, and this did not reach the stricter significance level. In terms of diagnosis, those with GD reported milder cognitive problems and more severe hyperthyroid symptoms, eye symptoms, and cosmetic complaints. There was also a difference by diagnosis in the Impaired daily life factor, but its significance level did not reach the predefined criterion. Perceived stress was significantly associated with anxiety, depressivity, and emotional susceptibility, even after controlling for general factors.

Discussion

The Hungarian version of the ThyPRO-39 questionnaire is appropriate for measuring both general and specific factors of HRQoL in people with autoimmune thyroid disease. Previous research [23, 24] has mainly suggested the use of the composite score only for psychosocial symptoms, and our research supports this for somatic symptoms as well. However, the specific scores also carry information in addition to the composite scores and should be used when a more detailed analysis is required.

Our results supported the identification of two general factors, suggesting the use of two composite scale scores to evaluate the HRQoL associated with autoimmune thyroid disease. On the one hand, the general factor of somatic symptoms may reflect the severity of the disease. On the other hand, the general factor of psychosocial symptoms may reflect its effect on well-being and psychosocial functioning. The two general factors correlate with each other; however, the nature of this correlation is not clear. It may reflect the impact of somatic symptoms on psychosocial functioning, or the impact of psychosocial functioning on the reporting of disease symptoms. Longitudinal research would be needed to clarify the complex relationship between somatic symptoms and psychosocial functioning in this patient group.

Besides the general factors, most somatic symptom scales carry specific meaning. However, for the Hypothyroid and Hyperthyroid symptoms scales, the incremental explained variance is lower than for the others. These scales measure symptoms that are not clearly thyroid-specific; they can be connected to other health conditions or other characteristics, such as stress. Furthermore, many of the items of these scales can be connected to both hyper- and hypoactive thyroid functions [33, 34].

Interestingly, only two of the psychosocial-specific symptom scales — Cognitive problems and Emotional susceptibility — had remarkably high variance explained by specific meaning compared to the general factor. These results showed that valuable information could be gathered with the inclusion of these scales in the evaluation of patients’ psychosocial HRQoL.

Depending on the goals, both the composite and the specific scales can be used in the evaluation of HRQoL [23, 24]. The psychosocial scale can be helpful to measure the general impact of the disease/treatment on well-being and functioning and when comparing the burden of the different thyroid diseases. When a detailed evaluation is warranted, it is useful to add the specific well-being scales. Similarly, when we would like to quantify the specific somatic symptoms, we can use the separate, specific physical symptom scale. Still, when we aim to explore the frequency of physical symptoms or their changes in general, it is enough to use the Somatic symptoms (composite) scale.

The construct validity of the ThyPro-39 questionnaire was confirmed by the association with the level of perceived stress and general well-being. Both composite factors were significantly connected to these constructs. As we expected, the associations with the Psychosocial symptoms factor were significantly stronger than with the Somatic symptoms factor. The nature of the relationship between perceived stress and the two general factors is not yet clear. Is the higher perceived stress a consequence of the symptoms and psychosocial impact of the disease, or is it the stress that exacerbates these symptoms? We may assume that both mechanisms are involved; however, further research should clarify this.

The differences between the two autoimmune groups also confirmed the known-group validity of the ThyPro-39 questionnaire. GD patients reported more eye symptoms and cosmetic complaints. The autoimmune process in GD stimulates the TSH receptors, leading to unregulated thyroid hyperfunction in most cases [35]. GD may also be accompanied by orbital manifestations [8]; 25–50% of patients develop Graves’ orbitopathy (GO), which explains the higher frequency of eye symptoms. GO severely affects HRQoL, primarily appearance and social life [11, 36].

HT patients reported more severe hypothyroid symptoms and more cognitive problems. Hashimoto’s encephalopathy [37] may result in a decline in cognitive function, among other symptoms. Several underlying mechanisms have been described: autoantibodies damaging nerve cells, deposited immune complexes destroying neurons through inflammation of blood vessels, and microthrombus processes. Hormonal pathways may also “support” the process [13, 38]; furthermore, hypothyroidism can affect the functioning of the hypothalamic-pituitary-adrenocortical (HPA) axis, which is also associated with cognitive and neuropsychiatric symptoms [38, 39]. Although GD patients also often suffer from neuropsychiatric symptoms, cognitive disturbances are less prominent [11, 38]; instead, they appear mainly in the acute phase of Graves’ thyrotoxicosis [39].

Our result that older age predicts more somatic symptoms is plausible, as longstanding autoimmune processes and thyroid failure seem to be associated with persistent, sometimes irreversible symptoms [9, 19, 37]. Presumably, it is not age itself that matters but the time since the onset of the disease [18]. The negative association of age with cognitive and anxiety symptoms seems surprising but may originate from the low mean age of our sample. Being diagnosed at a younger age can be more stressful and more disturbing to cognitive functioning. However, the relationship between age and cognitive functions has not been sufficiently explored [13].

One of the main limitations of our study is that we used self-reported diagnoses. Therefore, we could not control for hormone levels or the activity of the underlying autoimmune processes. However, the biological variables are not necessarily crucial for exploring the factor structure and validity of the HRQoL questionnaire. Still, the use of clinical parameters would be advisable in further studies that follow HRQoL, its changes, and its predictors.

Conclusion

In conclusion, the Hungarian version of ThyPRO-39 is a valid and informative instrument that can be used for different purposes in clinical practice and research, such as monitoring HRQoL among thyroid patients [12, 18] during treatment and developing individualised care [10, 15, 23].