Introduction

Experimental mild heat shock is widely known as an intervention that results in extended longevity (Cypser and Johnson 2001). Brief exposure to elevated heat resulted in a 15 % increase in the mean life span of Caenorhabditis elegans, compared to non heat-shocked controls (Cypser and Johnson 2001; Cypser and Johnson 2002; Lithgow et al. 1995). Similar effects have also been seen in Drosophila melanogaster (Hercus et al. 2003; Le Bourg et al. 2001), in yeast (Shama et al. 1998), and in cultured human cells (Rattan 1998). In the early 1960s, a group of proteins, now known as heat shock proteins (HSPs) were discovered, which were highly upregulated immediately after a heat shock (Ritossa 1962, 1996). Whether HSPs are responsible for longevity is still under debate, as their levels are only elevated for a short period of time after a heat shock (Link et al. 1999). However, the elevation in HSP levels during the heat shock response was shown to inhibit stress-mediated cell death, and recent experiments indicate a highly versatile role for these proteins as inhibitors of programmed cell death (Garrido et al. 2006).

HSPs can be subdivided in several smaller families, including HSP90, HSP70, HSP60, HSP40, small HSP (sHSP), and HSP10 (Kampinga et al. 2009). From these families, HSP70 and sHSPs show an association with longevity. In C. elegans, extra copies of a homolog of HSPA9 (member of HSP70), otherwise known as mortalin, extended life span up to 45 % (Yokoyama et al. 2002). In humans, decreased serum levels of HSP70 have been associated with exceptional longevity (95+) (Terry et al. 2006). However, the same study evaluated two single nucleotide polymorphisms (SNPs) in HSPA1A and HSPA1B which were not found to be associated to exceptional longevity (Terry et al. 2006).

The over-expression of members of the sHSP family has been shown to extend life of C. elegans and D. melanogaster by up to 32 % (Morrow et al. 2004b; Walker et al. 2001). Conversely, the absence of expression of a sHSP member decreases lifespan of D. melanogaster by 40 % (Morrow et al. 2004a).

HSP expression is regulated by a group of transcription factors known as heat shock factors (HSFs), of which HSF1 is considered to be the master-switch of HSP expression (Akerfelt et al. 2010). Strong evidence exists for a highly important role for HSF1 in longevity. Reduced activity of HSF1 in C. elegans leads to a rapid aging phenotype with a markedly reduced lifespan of 60 % (Garigan et al. 2002). Conversely, animals with an additional HSF1 gene copy lived approximately 40 % longer than normal (Hsu et al. 2003). A strong relationship was found between HSF1 and DAF-16, which functions in the C. elegans insulin/IGF-1 signaling pathway (Hsu et al. 2003). Both genes were shown to function, at least in part, by increasing sHSP gene expression (Hsu et al. 2003).

We have tested 31 genes encoding all members of the HSP70, sHSP, and HSF families and assessed their association with all-cause mortality. To our knowledge, this is the first large-scale candidate gene study of these HSPs and their association to all-cause mortality to be performed.

Methods

Discovery study

Our discovery cohort was the Rotterdam study (RS1). RS1 is a population-based cohort study that investigates the occurrence and determinants of diseases in the elderly (Hofman et al. 2011). Baseline examinations, including a detailed questionnaire, physical examination, and blood collection, were conducted between 1990 and 1993. The Medical Ethics Committee at Erasmus Medical Center approved the study protocol. All of the participants were followed for incident diseases through linkage to the general practitioner data base and record review by trained medical investigators. General practitioners' hospital records as well as death certificates were used for identification of deaths (all-cause mortality) through January 1, 2009.

Genomic DNA was extracted from whole blood samples using standard methods (Miller et al. 1988). Genome-wide SNP genotyping was performed using Infinium II assay on the HumanHap550 Genotyping BeadChips (Illumina Inc, San Diego, USA). Approximately two million SNPs were imputed using release 22 HapMap CEU population as reference. The imputations were performed using MACH software (http://www.sph.umich.edu/csg/abecasis/MACH/). The quality of imputations were checked by contrasting imputed and actual genotypes at 78,844 SNPs not present on Illumina 550K for 437 individuals for whom these SNPs were directly typed using Affymetrix 500K. Using the “best guess” genotype for imputed SNPs the concordance rate was 99 % for SNPs with R 2 (ratio of the variance of imputed genotypes to the binomial variance) quality measure greater than 0.9; concordance was still over 90 % (94 %) when R 2 was between 0.5 and 0.9. The GWAS of all-cause mortality has been analyzed and is published elsewhere (Walter et al. 2011).

For the study of HSPs presented here, a total of 4,430 SNPs in 31 genes were initially selected for the association test on the basis of the following criteria: (1) position within the genes of interest with a margin of 100 kb on each side of the genes according to NCBI build 36.3; (2) p value for Hardy–Weinberg equilibrium test ≥0.0001; and (3) call rate ≥95 %. For further analysis and selection of SNPs for replication analysis, only SNPs with an R 2 higher than 0.8 and a minor allele frequency higher than 0.05 were selected.

Analyses

We performed single SNP analyses using ProbABEL(Aulchenko et al. 2010). We used survival analysis (semi-parametric Cox proportional hazard model), adjusted for age at DNA blood collection and for sex, to model continuous time to death in individuals that were older than 55 years at baseline. To calculate empirical significance for SNPs, permutations were performed per region of interest (ROI) (Churchill and Doerge 1994; Fisher 1935). Briefly, the empirical distribution of the region-wide maximum of the test statistic under the null was obtained in 10,000 replications. To estimate empirical significance, each observed test statistic was compared with null statistics obtained empirically and the p value was estimated as the proportion of replicas generating the test statistics greater than or equal to the observed statistic. The permutation analysis keeps the original genotypes for each individual, but randomly allocates the phenotypes for each consecutive permutation. Therefore, the linkage disequilibrium structure of genes is not broken up. We did not perform any additional correction for the number of ROIs after permutations as we used the permutation analysis to significantly reduce the number of SNPs to be selected for validation of the effect of the SNP. For each suggestively associated ROI (p permuted < 0.10), we next selected the “truly associated” SNPs in a backward stepwise survival analysis until only nominally significant SNPs remained in the model. These independently associated SNPs were then followed up in eight independent cohorts. As effect estimates are expected to be small, a strict replication based on the p value and hazard ratio (HR) observed in individual replication studies, might be difficult to achieve due to power issues. Therefore, we evaluated whether a SNP was showing a HR in the same direction in all cohorts, i.e., whether findings were consistent over the various cohorts. Similar to genome-wide association studies, we performed a joint meta-analysis of the discovery and the replication samples and tested whether the joint p value was significant using a Bonferroni correction for the number of SNPs validated in the replication phase.

We used four studies that are part of the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium (Psaty et al. 2009) plus an additional four associated cohorts to evaluate whether findings were consistent. The studies included in this replication are: Cardiovascular Health Study (CHS); Framingham Heart Study (FHS); Atherosclerosis Risk Communities Study (ARIC); Age, Gene/Environment Susceptibility–Reykjavik Study (AGES); Health, Aging and Body Composition (HABC); Baltimore Longitudinal Study of Ageing (BLSA); InCHIANTI (ICH); and Study of Health in Pomerania (SHIP). All studies are longitudinal population-based studies periodically assessing the health and vital status of their participants. All participants included in this analysis were at least 55 years of age at the time of blood draw for DNA and provided written informed consent. Detailed information on the replication studies can be found in the supplement.

Results

General characteristics of the discovery cohort as well as the eight replication cohorts are shown in Table 1. In the discovery cohort, there were 3,174 deaths with a mean age at death of 83.2 and mean follow-up of 12.5 years. In RS1 and replications combined, there were 8,444 deaths with a mean age at death of 81.1. The mean follow-up time ranged from 5.2 to 15.7 years. A summary of all SNPs tested in the HSP genes in RS1 can be found in Supplementary Table 1. Three HSP70 genes are located head to head on chromosome 6 and are taken as one ROI with 148 SNPs (HSPA1). Similarly, two sHSP genes on chromosome 11 with 66 SNPs were located nearby each other and taken as one ROI (ROIchr11). Also of note is that the HSPs are currently undergoing a change in nomenclature (Kampinga et al. 2009). Supplementary Table 1 provides both the official gene names and the gene names according to the new nomenclature. Here, we will use the aliases from the new nomenclature.

Table 1 General characteristics

Figure 1 a–d illustrate the observed versus the expected chi2 of the SNPs tested in the different HSP gene families in RS1. The plots show an excess of low p values for tested SNPs in sHSP (1b) and HSF (1c) but not for HSP70 (1a). Supplementary Fig. 1 shows an overview of the p values in all HSP70, sHSP, and HSF genes tested in RS1. After adjusting for multiple testing by permutation analysis, three genes showed evidence for association to mortality in RS1. These genes encode HSP70 member HSPA8, sHSP member HSPB1, and HSF member HSF2. Table 2 shows the three SNPs that were selected for replication. In the replication phase, only one SNP reached a Bonferroni adjusted p value for significance (0.05/3 = 0.017). Figure 2a shows the Forest plot of the meta-analysis of HSF2 rs1416733. The direction of effect was consistent across populations with only one study (HABC) showing an opposite HR. The summary HR for this SNP was 0.95 (95 % CI 0.92–0.98; p = 0.003) for the A allele. When excluding the discovery cohort from the meta-analysis, the HR remained virtually unchanged (0.97: CI 0.93–1.00). In Fig. 2b, a regional plot of all SNPs tested in HSF2 from the original RS1 cohort can be found. In the plot, we can see that rs1416733 is located 11.7 Kb from the 5′ region of the HSF2 gene, with no other genes nearby.

Fig. 1
figure 1

Observed versus expected p value plot for the SNPs tested in the Rotterdam study (chi2 are given at the x-axis and y-axis). a HSP70; b sHSP; c HSF. Black line depicts the expected findings; red line depicts the observed ones

Table 2 Results of single SNP replication
Fig. 2
figure 2

The meta-analysis of the HSF2 rs1416733 SNP in the nine population cohorts. a Forrest plot. b Regional plot in RS1

Discussion

In our study, we found a significant association between HSF2 and all-cause mortality. The top associated variant, rs1416733, is located 11.7 Kb from the 5′ region of the HSF2 gene on chromosome 6 and is a known cis-eQTL for HSF2 (Zeller et al. 2010). Each additional copy of the A allele for SNP rs1416733 increased lifespan with a HR of 0.95 in a meta-analysis of nine independent cohort studies. This effect was consistent in direction in eight out of nine cohorts.

The HSF family, like most HSP families, is a highly conserved family across species, indicating a vital role for the survival of the respective organism (Liu et al. 1997). Unlike many other HSP families, there is virtually no redundancy in the HSF family, with HSF1 as the most crucial family member (Akerfelt et al. 2010). The importance of HSF1 is advocated strongly by the large effects of genetic mutations in this gene on longevity in C. elegans (Garigan et al. 2002; Hsu et al. 2003). These large effects are not anticipated in human research, which could explain why we find no evidence for association of SNPs in HSF1 with all-cause mortality (Walter et al. 2011). HSF2 is less known in longevity research. It has previously been mainly described for its role as a development- and differentiation-specific factor (Kallio et al. 2002; Wang et al. 2003). The role of HSF2 in later life has remained unknown for a long time (Wu 1995). Recently, evidence is emerging to suggest that HSF2 modulates HSF1 activity (Akerfelt et al. 2010). It has been shown that HSF2 activation leads to activation of HSF1, revealing a functional interdependency (Sandqvist et al. 2009). It has been proposed that heterotrimerization of HSF1 and HSF2 integrates transcriptional activation in response to distinct stress and developmental stimuli (Sandqvist et al. 2009). Additionally, though HSF2 was never considered to be heat-inducible, a recent study shows that a mild heat-shock in the physiological range does activate HSF2 and has a significant impact on the proteostasis of the cell (Shinkawa et al. 2011).

We found no evidence for association of HSP70 with all-cause mortality. HSP70 is the only HSP family for which SNPs in a couple of genes have been studied for longevity in humans previously. However, these studies only studied two or three SNPs and only in the heat-inducible members of HSP70, namely HSPA1A, HSPA1B, and HSPA1L, all located on chromosome 6 (Altomare et al. 2003; Ross et al. 2003; Terry et al. 2006). In our study, no association was found for these three genes with longevity. We add a more thorough investigation of these genes and the other HSP70 genes by extensively covering common variance.

The sHSPs have been named most often in animal studies for their relationship with longevity (Morrow et al. 2004a; Morrow et al. 2004b; Walker et al. 2001). However, we find no significant associations for single SNPs in these genes. Off course, what we find in animal studies does not always translate well to humans. For example, though genetic variations in IGF-1 signaling have been found in humans, they do not have nearly as strong effects on longevity as in animal models (Kuningas et al. 2008). Another study has found that mutations in HSF1 lead to upregulated sHSP expression in C. elegans (Hsu et al. 2003). We cannot exclude that genetic variation in HSF2 activates a similar mechanism in humans through its effect on HSF1.

Our study has a major advantage since the discovery cohort—the Rotterdam Study—is a large, population-based study. Further, the eight replication cohorts are also relatively large, established population-based epidemiological cohorts. A limitation in the interpretation of our data is that our findings were significant in the meta-analysis but did not reach significance in the individual cohorts except for the RS1 cohort; however, a total of eight out of nine studies showed an effect in the same direction for HSF2. This could be explained by the markedly smaller percentage of deaths in most replication cohorts (except CHS) compared to the discovery cohort. The number of deceased is critical for the statistical power of the study. Even though some of these studies had a longer mean follow-up time (ARIC and BLSA), the population in these studies were on average younger at baseline compared to RS1, which explains the fewer number of deaths and leaves the individual replication studies underpowered to identify rs1416733.

Until now, no large-scale studies have been performed investigating the role of HSP70, sHSP, or HSF genes in all-cause mortality in humans. In our candidate gene study in nine population-based cohorts, we found significant evidence suggesting that genetic variants in HSF2 are associated with all-cause mortality. Combining these data with those of earlier functional studies, in particular in C. elegans, makes it likely that HSF2 plays a role in human all-cause mortality.