Introduction

In recent decades life expectancy has increased and, consequently, so has the proportion of older people in the population. Persons older than age 60 comprise 20 per cent of the population in the more developed regions, and from 5 to 8 per cent in the less developed regions. The oldest old, persons aged 80 years or older, are the fastest growing segment of the older population, and by 2050 this group is projected to be five times as large as at present [1].

Several aspects of the aging process affect the endocrine system and stimulate the use of screening programs for the detection of hormonal changes and drug interventions with hormone replacement therapies to provide better quality of life for the elderly. Evaluation of thyroid function in normal elderly is difficult, since the prevalence of non-thyroid disease and the use of medications that interfere with thyroid function is greater than in young people. As a result, questions about the meaning of functional changes observed in the elderly are relatively common [2].

The interpretation of thyroid function data in the elderly has changed over the past decades. In a study conducted in 1995 in a non-selected population, the authors considered that subjects of any age with some degree of TSH elevation had some grade of thyroid gland failure [3]. However, in 2002, the NHANES III study re-examined this parameter in a population that excluded those with evidence of thyroid disease and, in this more uniform population, TSH still showed a progressive increase with age [4].

The Committee on Reference Intervals and Decision Limits (CRIDL) of the International Federation of Clinical Chemistry and Laboratory Medicine (IFCC) developed the theory of reference values [5]. In 1995 the Clinical and Laboratory Standards Institute (CLSI) first published with IFCC the joint guideline “Defining, Establishing, and Verifying Reference Intervals in the Clinical Laboratory”, revised in 2008 [6]. This document recommends applying prospective questionnaires and, if necessary, physical evaluation to candidate subjects for a control group. It also discourages the indirect approach, in which database results are used to establish ranges by retroactively identifying acceptable reference populations. This has been a challenge since then, and many clinical laboratories do not perform these procedures in accordance with the recommendations, because they require time, additional costs, knowledge, and effort to inform physicians and patients.

Recent studies have shown conflicting results on whether reference intervals (RI) of TSH specific to the elderly should be used [7, 8]. In studies based on review of medical records, it is difficult to apply strict inclusion and exclusion criteria when selecting subjects to represent a control group. Several biases may occur due to registration errors and the omission of information that was not actively obtained from the patient. As a result, a significant number of subjects with potential thyroid or general disease, or using interfering medications, may be included in the control group and alter the final data obtained. Thus, the theme is still far from exhausted, and it is relevant to avoiding overdiagnosis of subclinical hypothyroidism in elderly subjects whose TSH may be higher but still appropriate for their age.

Likewise, as FT4 is the test of choice for the confirmation of thyroid disease, it is also relevant to assess whether there is a need for RI specific for subjects above 60 years and 80 years or older.

The aim of this study was to test the hypothesis that it is necessary to use specific RI for TSH and FT4 for subjects over 60 years old, and over 80 years old, compare the values with those of young subjects, and analyze the impact of using specific TSH RI for these age groups in the screening of thyroid dysfunction.

Materials and methods

Between March and December 2012, 1200 subjects of both sexes were evaluated prospectively, distributed as follows: 120 males and 120 females in each of the following age ranges: 20–49, 50–59, 60–69, 70–79 and above 80 years. All had attended a clinical laboratory for routine tests for which evaluation of thyroid function and/or thyroid autoimmunity had not been requested. They were invited to participate and to come to the laboratory on pre-determined weekdays, according to the availability of the researchers. A questionnaire was administered by one of the researchers through direct interviews with the candidates. It covered: identification data, address and phone number of the participants, as well as the name and telephone number of their physician for communication in case of abnormal results; self-declared color or race; whether they considered themselves healthy; personal or family history of thyroid disease, present or past; current medications (including medications containing iodine in the last six months); smoking; hospitalization due to illness or accident in the last six months (and what the disease was); and pregnancy (for females who had not reached menopause). Subjects were from the metropolitan area of Rio de Janeiro and belonged to the middle and upper social classes according to declared income; 77% were white, 18% mulatto and 5% black. Inclusion criteria were negative thyroid peroxidase antibodies (TPOAb) and thyroglobulin antibodies (TGAb), normal lipid profile, ultrasensitive C-reactive protein (CRP) level, blood count and renal function, and absence of goiter on palpation. Urinary iodine was not measured, since salt iodization in Brazil is determined by federal law [9] and the authors have shown in a previous study that iodine intake in the Rio de Janeiro population is sufficient [10].
In addition, the National Health Surveillance Agency (ANVISA), the regulatory agency under the Ministry of Health, reviewed salt iodization in samples used in Rio de Janeiro in the year before the study began and confirmed that it was appropriate [11].

Exclusion criteria were: past or present history of thyroid disease; previous thyroid surgery; family history of thyroid disease; TSH <0.1 mU/L or >10.0 mU/L, as these results indicate a high probability of thyroid dysfunction [12]; palpable goiter; smoking; use in the past three months of medicines known to interfere analytically or physiologically with the measurement of TSH or FT4 (six months for medicines and contrast agents containing iodine), as given in the list of medications and other drugs that may interfere with TSH and/or FT4 measurements [13–24]; use of antidepressants; hospitalization in the past six months; and pregnancy. As changes in thyroid morphology are significantly more frequent among the elderly, we assessed whether this factor generated different results in the selection of subjects for the study. Thyroid ultrasonography (US) was performed on 687 subjects, and 24.1% were excluded because they presented some thyroid change. More than 120 subjects still remained in each group. There was no statistically significant difference in TSH and FT4 values between subjects who had a normal US and those who did not undergo US (data not shown). Therefore, as this exam was not essential for selecting subjects as controls, all of the initially selected subjects could be enrolled in the study, except those who were initially excluded, who were replaced by the same number of other subjects of the same age and sex in order to maintain the initial number to be studied.

List of medications and other drugs that may interfere with TSH and/or FT4 measurements

2,3-dimercaptopropanol, 2,4-dinitrophenol, 5-fluorouracil, 5-hydroxytryptophan

Acetazolamide, acetylsalicylic acid, alpha adrenergic blockers, aminoglutethimide, aminotriazole, amiodarone, androgens and other anabolic steroids, amphenone, amphetamines, antipyrine

Benserazide, beta adrenergic blockers, bexarotene, bromine, brompheniramine

Cadmium, carbamazepine, chromate, chromium picolinate, cimetidine, clofibrate, clomiphene, clomipramine, cobalt, complex anions, corticosteroids, cytostatics

Danazol, diphenylhydantoin, dinitrophenol, dobutamine, domperidone, dopamine and its agonists, other dopaminergic agents

Erythrosine, estrogens, ethionamide

Fenclofenac, fenoldopam, flunarizine, fluor, furosemide, fusaric acid

Growth hormone (GH), GH-Releasing hormone

Halofenate, heroin, heparin

Interleukins, iopanoic acid, other radiological contrasts, and other iodine-containing substances and drugs (potassium iodide and others), insulin-like Growth Factor-1, interferon

Ketoconazole

L-asparaginase, L-dopa inhibitors, levothyroxine, lithium, lovastatin

Mefenamic acid, melatonin, metformin, methadone, methimazole, metoclopramide, mitotane

Nevirapine, niacin, nicotinic acid, nifedipine, NSAIDs, nitrate

o,p’-DDD, orphenadrine, opioids, oxcarbazepine

Para-aminobenzoic acid, perchlorate, perphenazine, phenidone, phenylbutazone, phenobarbital, pimozide, prazosin, primidone, propylthiouracil, pyridoxine

Quetiapine

Raloxifene, resorcinol, rifampicin, ritonavir, rubidium

Salsalate, serotonergic antagonists, somatostatin and its analogues, spironolactone, St. John’s Wort, stavudine, sulfonamides, sulfonylureas, sulpiride, steroid hormones

Tamoxifen, thiocyanate, thyroid hormones and their analogs, troleandomycin, tyrosine kinase inhibitors

Valproic acid

Drugs are listed in alphabetical order, not in order of their possible importance as interferents in TSH or thyroid hormone values.

The study was approved by the Research Ethics Committee of the Hospital Clementino Fraga Filho, Universidade Federal do Rio de Janeiro (HUCFF/UFRJ) and individuals agreed to participate by signing the informed consent form.

Collection of data and collection of samples

Blood was collected in the morning in standard sampling tubes of the same type, with separator gel. The measurements were done on the same day in the primary tubes, after centrifugation at 3200 RPM for 15 min.

Biochemical data

Serum TSH, FT4, TPOAb and TGAb were measured by electrochemiluminescence immunoassays on the Roche Modular Analytics® E170 (Roche Diagnostics Australia Pty Ltd, Castle Hill, NSW, Australia).

Serum TSH concentrations were measured by an immunometric method, with an intra-assay percentage coefficient of variation (% CV) of 3.0% at concentrations of 0.040 ± 0.001 mU/L, 2.7% at 0.092 ± 0.002 mU/L and 1.1% at 9.4 ± 0.1 mU/L. The reference interval provided by the manufacturer is 0.3–4.2 mU/L. Serum FT4, TPOAb and TGAb concentrations were measured by competitive assays. For FT4, the % CV was 1.4% at FT4 concentrations of 0.7 ± 0.01 ng/dL, 1.8% at 1.3 ± 0.02 ng/dL and 2.0% at 2.7 ± 0.1 ng/dL. The reference interval provided by the manufacturer is 0.9–1.7 ng/dL. For TPOAb, the intra-assay % CV was 6.3% at TPOAb concentrations of 21.3 ± 1.34 IU/mL, 5.1% at 51.2 ± 2.6 IU/mL and 2.7% at 473 ± 12.7 IU/mL. According to the manufacturer, individuals without thyroid disease have values lower than 34 IU/mL. For TGAb, the intra-assay % CV was 4.9% at TGAb concentrations of 47.2 ± 2.3 IU/mL, 1.3% at 588 ± 7.4 IU/mL and 1.3% at 3289 ± 42.0 IU/mL. According to the manufacturer, individuals without thyroid disease have values lower than 115 IU/mL.

Statistical analysis

Data were analyzed using GraphPad Prism®, version 6.0 (GraphPad Software, Inc, California). Kolmogorov-Smirnov tests were performed to assess whether each data series (TSH and FT4) followed a Normal distribution. A base-10 logarithmic transformation was used for analysis. TSH and FT4 statistics were calculated for each subgroup and gender: from 20 to 49 years, then by 10-year age ranges up to 80 years; all those aged older than 80 were grouped together. Serum TSH was described by medians and 25th and 75th percentiles because it was not normally distributed. Two-tailed Mann–Whitney and Kruskal-Wallis tests were used to compare the nonparametric TSH distributions in different subpopulations. Comparisons between two subgroups were made with Dunn's multiple comparisons test. Means and standard deviations were calculated for FT4, since it showed a normal distribution. To compare FT4 between groups, two-way ANOVA was used when more than two data series were analyzed, and Student's t test in the case of two data series.

Outlying observations were identified using the test proposed by Dixon. In all cases, the level of significance used was 0.05 [25]. For the RI of both hormones, the 2.5th and 97.5th percentiles were taken as limits. We used the method of Harris and Boyd [26] to decide whether it was necessary to separate the reference values by gender. This method calculates the standard normal deviate (z) to assess the statistical significance of the difference between the means of the groups; a z value less than 3 means there is no need for reference values separated by gender.
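The two calculations described above can be illustrated with a minimal Python sketch. It is not the GraphPad Prism procedure used in the study. The function names are hypothetical, the percentile routine assumes linear interpolation between ranks (other conventions give slightly different limits), and the critical value is written in the form in which the Harris and Boyd criterion is commonly stated, 3·sqrt((n1 + n2)/240), which equals 3 for two subgroups of 120 subjects.

```python
import math
from statistics import mean, stdev


def harris_boyd_z(group_a, group_b):
    """Standard normal deviate (z) for the difference between two subgroup means."""
    n_a, n_b = len(group_a), len(group_b)
    # Standard error of the difference between the two means.
    se = math.sqrt(stdev(group_a) ** 2 / n_a + stdev(group_b) ** 2 / n_b)
    return abs(mean(group_a) - mean(group_b)) / se


def critical_z(n_a, n_b):
    """Critical z as commonly stated for this method (assumption, see lead-in)."""
    return 3.0 * math.sqrt((n_a + n_b) / 240.0)


def reference_interval(values, lower=2.5, upper=97.5):
    """Nonparametric RI: lower and upper percentiles, linear interpolation assumed."""
    data = sorted(values)

    def percentile(p):
        pos = (len(data) - 1) * p / 100.0
        lo = math.floor(pos)
        hi = min(lo + 1, len(data) - 1)
        return data[lo] + (data[hi] - data[lo]) * (pos - lo)

    return percentile(lower), percentile(upper)


def needs_separate_ri(group_a, group_b):
    """True when z reaches the critical value, i.e. partitioning is suggested."""
    return harris_boyd_z(group_a, group_b) >= critical_z(len(group_a), len(group_b))
```

With 120 males and 120 females per age range, `critical_z(120, 120)` returns 3.0, which matches the cutoff used in the text.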

To calculate correlations between TSH and FT4 levels in all age groups, TSH data were log10-transformed. As the transformed data were normally distributed, the two-tailed Pearson test was used; a p-value less than 0.05 was considered statistically significant. The Pearson test was also used to analyze, individually, the correlations between TSH and age and between FT4 and age.
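As an illustration of this step, the sketch below computes the Pearson coefficient between log10-transformed TSH and FT4, together with R squared. The function names are hypothetical. The confidence interval uses the Fisher z-transformation, which is an assumption on our part: the text does not state how the reported intervals were derived.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def log_tsh_ft4_correlation(tsh, ft4):
    """Return (r, r squared) for log10(TSH) versus FT4."""
    log_tsh = [math.log10(value) for value in tsh]
    r = pearson_r(log_tsh, ft4)
    return r, r * r


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for r via the Fisher z-transformation (assumed method)."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)
```

For example, `fisher_ci(r, 240)` gives an interval for one age group of 240 subjects; the values passed in would be the measured TSH (mU/L) and FT4 (ng/dL) of that group.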

Results

TSH data

The reference group comprised 50% females and 50% males. The mean age by gender in each age subgroup is given in Table 1. TSH data analyzed by age or gender exhibited a non-Gaussian distribution. In each age group, there was no significant difference in median serum TSH between males and females (Table 1). Therefore, all individuals in each group were eligible for the establishment of reference intervals. Statistical parameters of TSH measurements are shown in Figure 1 and listed in Table 1. Analysis of median TSH across the whole group showed a significant increase of this hormone with age (p < 0.001), but comparison of individual subgroups, 20–49 versus 50–59 years old (p > 0.05) and 60–69 versus 70–79 years old (p > 0.05), showed no statistically significant difference. These data indicate that different RI should be used for three major age groups: 20–59 years, 60–79 years and 80 years or more. RI calculated for each subgroup are shown in Table 1.

Table 1 Statistical data of TSH measurements
Figure 1 Distribution of TSH among different age groups. The transverse line marks median values.

FT4 data

FT4 exhibited a Gaussian distribution. Data of FT4 measurements are shown in Figure 2, and statistical parameters are listed in Table 2. Analysis of FT4 mean ± standard deviation (SD) across the whole group shows a significant reduction of the hormone with age (p < 0.0001). However, despite a tendency for FT4 to fall with increasing age, the comparison test between individual subgroups showed no statistically significant difference among those over 60 years old. Reference intervals calculated for FT4 are shown in Table 2.

Figure 2 Distribution of FT4 among different age groups. The transverse line marks median values.

Table 2 Statistical data of FT4 measurements

The correlation between TSH and FT4 was highly significant in all age subgroups when analyzed independently (p < 0.0001). 20–49 years old: Pearson r = -0.4641, 95% confidence interval (CI) = -0.4961 to -0.2926, R squared = 0.1652; 50–59 years old: Pearson r = -0.3862, 95% CI = -0.4881 to -0.2739, R squared = 0.1492; 60–69 years old: Pearson r = -0.4653, 95% CI = -0.5583 to -0.3607, R squared = 0.2165; 70–79 years old: Pearson r = -0.4946, 95% CI = -0.5839 to -0.3934, R squared = 0.2446; 80 years old and over: Pearson r = -0.3951, 95% CI = -0.4961 to -0.2835, R squared = 0.1561.

To assess whether the RI obtained in this study had a clinical impact on screening for thyroid disease compared with the RI defined by the manufacturer, we compared the percentage of subjects who had TSH below or above the age-specific RI obtained in this study with the percentage obtained using the values without segmentation by age. The results are shown in Table 3.

Table 3 Comparison of data of TSH (mU/L) obtained regarding RI defined by the manufacturer (without segmentation by age), with the results of this study that redefined RI specific for age range

Discussion

Usually, for the interpretation of a laboratory test, clinical laboratories use RI provided by the manufacturer of lab kits. The result of a patient’s test is then compared to this RI in order to diagnose whether it is normal or not. TSH concentration is the most sensitive test to reliably detect thyroid function abnormalities and is used as the screening test for studying thyroid function because of the inverse log-linear relationship between circulating TSH and FT4 concentrations [27, 28].

This study, conducted prospectively with a reference population, showed that TSH increases progressively and significantly with age. The value of z less than 3.0 in all age groups indicates that the same RI should be adopted for both genders. On the other hand, FT4 tends to decrease with age. In the age groups above 60 years, FT4 values do not differ between males and females. In subjects younger than 60, the value of z indicated that FT4 is lower in males than in females. A study by Kratzsch et al. also reports lower FT4 in males than in females [29]. However, as the z result in our study (3.0249) is very close to the cutoff point, we consider that in clinical routine it is appropriate to use the same RI for both genders, as in older subjects.

The lower TSH limit defined in this study, 0.4 mU/L, is the same for all age groups over 60 years old and does not differ from the lower limit of younger people. This value is in accordance with a previous study using third generation immunometric TSH methodology, which places the lower TSH reference limit at approximately 0.3 to 0.4 mU/L, irrespective of the population studied or the method used [30]. The value found in this study, very slightly higher than that reported by the manufacturer, has no impact on the reclassification of how many people have low TSH, as the difference between the two is under 2%.

The association between TSH and age is highly significant. The median is 1.5 mU/L in people under 60 years old, and increases to 1.7 mU/L from 60 to 79 years old and to 2.0 mU/L for those aged 80 or more. The upper TSH limit increases from 4.3 mU/L under 60 years old to 5.8 mU/L in the range of 60 to 79 years old and to 6.7 mU/L in the very old subjects, over 80. So, although the RI provided by the manufacturer of the TSH kit is suitable for subjects less than 60 years of age, the same is not true for those aged 60 years or more, in whom the limits are significantly higher.
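The practical effect of these limits can be sketched as a simple classification rule. The code below is illustrative only: the function names are hypothetical, the limits are those quoted in the text (manufacturer RI 0.3–4.2 mU/L; age-specific RI 0.4–4.3, 0.4–5.8 and 0.4–6.7 mU/L), and ages under 20 years, which were not studied, are simply treated as the youngest group.

```python
# Manufacturer TSH reference interval quoted in the text (mU/L).
MANUFACTURER_RI = (0.3, 4.2)


def age_specific_ri(age):
    """TSH RI (mU/L) for the three age groups reported in this study."""
    if age < 60:
        return (0.4, 4.3)  # ages below 20 were not studied (assumption: same RI)
    if age < 80:
        return (0.4, 5.8)
    return (0.4, 6.7)


def classify(tsh, ri):
    """Label a TSH value as 'low', 'within' or 'high' relative to an RI."""
    low, high = ri
    if tsh < low:
        return "low"
    if tsh > high:
        return "high"
    return "within"


def reclassified_to_normal(subjects):
    """Count subjects 'high' by the manufacturer RI but 'within' the age-specific RI.

    subjects: iterable of (age in years, TSH in mU/L) pairs.
    """
    return sum(
        1
        for age, tsh in subjects
        if classify(tsh, MANUFACTURER_RI) == "high"
        and classify(tsh, age_specific_ri(age)) == "within"
    )
```

Under this rule a TSH of 5.0 mU/L is classified as high at any age by the manufacturer RI, high at age 45 by the age-specific RI, and within the interval at age 72 or 85.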

Within the age ranges of 60–79 years and 80 years or over, a significant percentage of subjects are reclassified as not having elevated TSH if age-specific RI are adopted. The same does not occur in young subjects. According to the RI for each age group obtained in this study, 6.5% of subjects between 60 and 70 years and 12.5% of those aged 80 years or more are no longer misdiagnosed with elevated TSH, leading to 19% reclassification from hypothyroidism to normal using these criteria. With regard to the effects of age on serum TSH levels, other epidemiological studies indicate that the population’s mean TSH levels increase with age [4, 31, 32]. NHANES III was the best designed in selecting a control population, and its data were similar to those of the current study, indicating that the increase in TSH is probably a physiological event in the elderly [33, 34] or that it may be due to the presence of TSH isoforms with low bioactivity [35]. In either scenario, adopting specific RI can avoid excessive diagnosis of subclinical hypothyroidism in the elderly. To clarify the second hypothesis, technologies that quantify only TSH isoforms with normal bioactivity would need to be developed and brought into clinical use.

The failure of other studies to find these differences in RI between the young and the elderly may be due to their retrospective design, or to the choice of reference populations for which strict selection criteria were not applied through specific questionnaires and appropriate physical examination to exclude factors such as thyroid dysfunction in the subject or in the family, interfering medications, other illnesses, recent hospitalizations, smoking habits and goiter.

Like TSH, FT4 has the same minimum reference value for all age groups. This value of 0.7 ng/dL, however, is lower than that defined by the manufacturer (0.9 ng/dL) in all groups. As for the maximum reference value, the level of 1.7 ng/dL is suitable for subjects 60 years old and over. However, although this is outside the scope of this study, our results suggest that a higher upper reference limit (1.9 ng/dL) should be used for young individuals.

A previous study demonstrated that FT4 remained relatively unchanged with age [36]. These data can be difficult to interpret because evaluation of thyroid function in the elderly is often complicated by the increased prevalence of chronic illness and the use of medication. In the present study, subjects with one or more of these factors were excluded. There is evidence that a low activity of thyroid hormone might be beneficial in the elderly. Low levels of FT4 have been associated with better survival in elderly subjects [8, 33, 37, 38]. On the other hand, even thyroid hormone levels within the normal range might be associated with thyroid hormone-related endpoints. As an example, in euthyroid subjects, especially the elderly, FT4, regardless of TSH levels, is associated with atrial fibrillation and lower physical performance [39, 40]. One hypothesis is that these lower levels of thyroid hormone could serve as an adaptive mechanism to prevent catabolism in the elderly [41].

We consider the data obtained in this age-related prospective study relevant, since it is very important to distinguish between normal and mildly elevated serum TSH concentrations in elderly subjects. Elevations of TSH in young individuals, even when mild and without a decrease in FT4, characterize subclinical hypothyroidism or minimal thyroid dysfunction and are related to comorbidities such as dyslipidemia, adverse obstetric events, impact on cognition, quality of life, cardiovascular events, and evolution to clinical hypothyroidism [42]. There is no evidence of corresponding effects in the elderly [43, 44]. There is a consensus that subjects with TSH concentrations above 10.0 mU/L should be treated. However, according to Garber et al., in the Clinical Practice Guidelines for Hypothyroidism, very mild TSH elevations in older individuals, under this level, may not reflect subclinical thyroid dysfunction, but rather be a normal manifestation of aging. While the normal TSH reference range may need to be narrowed for some subpopulations [27, 45], the normal RI may widen with aging [24]. This confirms that not all patients who have mild TSH elevations are hypothyroid and that such patients would therefore not require thyroid hormone therapy. These data are also relevant for the monitoring of subjects with hypothyroidism, since the target for TSH in levothyroxine-treated subjects should be higher in elderly people. Pitfalls in the interpretation of TSH were carefully excluded in this study. Abnormal levels are observed in various non-thyroidal diseases and other conditions, but the effect of such changes was prevented by excluding subjects who had used medication containing iodine during the previous 6 months, had been hospitalized, or were smokers, and by including only those not using any medication that might alter TSH or thyroid hormones, even minimally.

In conclusion, these data show that the prevalence of subclinical hypothyroidism in the elderly is overestimated, in almost 20% of subjects, unless age-specific RI are used. Their use might improve diagnostic accuracy and reduce the need for unnecessary confirmatory tests.