Background

It is widely recognized that sleep plays an important role in human mental and physical health [1, 2]. Experimental studies have indicated that sleep deprivation and excessive sleep duration can adversely affect hormones, metabolism and immune function [3,4,5]. From an epidemiological perspective, although dozens of studies have reported that inappropriate sleep duration and poor sleep quality are associated with a high risk of some common diseases, including diabetes [6], cardiovascular diseases [7] and cancer [8], as well as with increased all-cause and cause-specific mortality rates [9], these associations are not often reproducible.

Over the past decades, many prospective studies have reported a U-shaped relationship between sleep duration and all-cause mortality, with the nadir at 7–8 h of sleep per night [10,11,12,13,14,15,16,17]. In 2016, da Silva and colleagues pooled the results of 27 cohort studies in a meta-analysis and found a significant association of both long and short sleep duration with increased all-cause mortality risk in older people, with the association being more evident for long sleep duration [18]. However, other studies have failed to provide supportive data on sleep duration and mortality in older people [19,20,21]. The reasons for these inconsistent reports are multifactorial, possibly relating to inadequate statistical power of individual studies, different backgrounds and characteristics of study groups, and lack of adjustment for confounding factors. Given the data that have accumulated since then, there is a need to reexamine this association in a more comprehensive manner.

To yield more information for future studies, we synthesized the results of prospective cohort studies in older people, aiming to evaluate the association between sleep duration and all-cause mortality. We also aimed to explore possible causes of between-study heterogeneity.

Methods

This meta-analysis was conducted according to the guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) statement [22], and the PRISMA checklist is presented in Supplementary Table 1.

Search strategy

We searched the PubMed, EMBASE and Web of Science databases for articles published up to November 30, 2019. The following medical subject terms were used: (sleep OR sleep disorders OR sleep duration OR drowse OR napping OR naps OR nap OR Siesta OR drowsiness OR drowse OR insomnia OR actigraphy sleep OR self-reported sleep [Title/Abstract], AND mortality OR death OR deaths OR premature death OR all-cause mortality [Title/Abstract]), AND (aged OR geriatrics OR older people OR older age OR older adult OR older adult OR older persons OR older people OR older men OR older women OR aging OR aging women OR aging men OR the older people OR aging individuals [Title/Abstract]). We also scanned the reference lists of retrieved articles and systematic reviews to avoid missing potentially relevant studies.

Two investigators (M.H. and X.D.) independently reviewed all retrieved articles and assessed preliminary eligibility based on titles and abstracts, consulting full texts where necessary.

Inclusion/exclusion criteria

Our analyses were restricted to articles that met the following criteria: (1) participants aged ≥60 years; (2) all-cause mortality as the outcome; (3) prospective cohort design; (4) clear classification of sleep duration; (5) a follow-up rate of at least 70%. Studies reporting subgroup analyses of sleep duration and all-cause mortality in older people were also included in this meta-analysis. Articles were excluded if they focused on cause-specific mortality or involved participants with serious diseases, or if they were case reports/series, editorials, or narrative comments.

Data extraction

Two investigators (M.H. and X.D.) independently extracted data from each qualified article and entered them into a standardized Excel spreadsheet, including name of the first author, year of publication, country where the study was conducted, race, sample size, sex, baseline age, follow-up period, ascertainment of sleep duration, death certificate, adjusted confounders, sleep duration, effect estimation, and other traditional risk factors, if available. Disagreements were resolved through joint reevaluation of the original articles and, if necessary, by a third author (W.N.).

Statistical analysis

We used Stata software version 14.1 for Windows (Stata Corp, College Station, TX) to manage and analyze data. The random-effects model was employed irrespective of the magnitude of between-study heterogeneity. Effect-size estimates are expressed as hazard ratios (HRs) with 95% confidence intervals (CIs), and the difference between two estimates was tested by the Z-test as reported by Altman and Bland [23]. The dose-response association was examined by the generalized least squares regression proposed by Greenland and Longnecker [24] for trend estimation of summarized dose-response data. Additionally, restricted cubic splines of the exposure distribution with 3 knots (25th, 50th, and 75th percentiles) were used to test for nonlinearity in the association between sleep duration and all-cause mortality.
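As an illustration of the two-sample Z-test described above, the following minimal Python sketch compares two hazard ratios on the log scale, recovering each standard error from its 95% CI. It is not the Stata code used in this study; the function names and the example values are hypothetical, and a two-sided P value is assumed.

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Standard error of log(HR), recovered from the bounds of a 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def compare_hazard_ratios(hr_a, ci_a, hr_b, ci_b):
    """Z-test for the difference between two independent log hazard ratios.

    Returns (z, two-sided P value). Assumes the two estimates come from
    independent groups, as the test requires.
    """
    diff = math.log(hr_a) - math.log(hr_b)
    se_diff = math.sqrt(se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2)
    z = diff / se_diff
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Hypothetical subgroup estimates, for illustration only.
z, p = compare_hazard_ratios(1.40, (1.10, 1.78), 1.10, (0.95, 1.27))
print(f"z = {z:.2f}, two-sided P = {p:.3f}")
```

Working on the log scale keeps the confidence limits symmetric around the point estimate, which is why the standard error can be read off the CI width.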

The inconsistency index (I2) was used to assess heterogeneity between studies; it represents the percentage of variability observed between studies that is attributable to true heterogeneity rather than to chance. An I2 value greater than 50% was taken to indicate significant heterogeneity, with higher values indicating a greater degree of heterogeneity. Because heterogeneity can arise from diverse clinical and methodological sources, a large number of prespecified subgroups were analyzed according to baseline age, sex, region, race, follow-up, short sleep duration and long sleep duration.
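For readers who wish to see how the I2 statistic relates to random-effects pooling, the sketch below computes Cochran's Q, I2 and a pooled HR. It assumes the DerSimonian-Laird estimator of the between-study variance, which this article does not specify; the function name and input values are hypothetical.

```python
import math


def random_effects_pool(log_hrs, std_errors):
    """Pool log hazard ratios with a DerSimonian-Laird random-effects model.

    Returns a dict with the pooled HR, its 95% CI, Cochran's Q,
    I2 (in percent) and the between-study variance tau2.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_hrs)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_hrs))
    df = len(log_hrs) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Method-of-moments estimate of the between-study variance.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(w * y for w, y in zip(re_weights, log_hrs)) / sum(re_weights)
    se_pooled = math.sqrt(1.0 / sum(re_weights))
    return {
        "hr": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se_pooled),
               math.exp(pooled + 1.96 * se_pooled)),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical study-level estimates, for illustration only.
result = random_effects_pool([0.10, 0.25, 0.40, 0.05], [0.08, 0.10, 0.15, 0.06])
print(result)
```

When Q does not exceed its degrees of freedom, both I2 and tau2 are truncated at zero and the random-effects result coincides with the fixed-effect one.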

The probability of publication bias was evaluated by both Begg’s funnel plots and Egger regression asymmetry tests at a significance level of 10%. The trim-and-fill method was used to estimate the number of theoretically missing studies.
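The Egger regression asymmetry test can be sketched as an ordinary least squares regression of the standardized effect (log HR divided by its standard error) on precision (the reciprocal of the standard error), with an intercept far from zero suggesting funnel plot asymmetry. The Python sketch below is illustrative only: the function name and data are hypothetical, and it returns the t statistic and degrees of freedom rather than a P value, so that it relies on the standard library alone.

```python
import math


def egger_regression(log_effects, std_errors):
    """Egger's test: regress standardized effects on precision.

    Returns a dict with the intercept, its standard error, the t statistic
    and the residual degrees of freedom (n - 2).
    """
    n = len(log_effects)
    if n < 3:
        raise ValueError("at least three studies are required")
    precision = [1.0 / se for se in std_errors]
    snd = [y / se for y, se in zip(log_effects, std_errors)]

    mean_x = sum(precision) / n
    mean_y = sum(snd) / n
    sxx = sum((x - mean_x) ** 2 for x in precision)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(precision, snd))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(precision, snd))
    resid_var = rss / (n - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / n + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return {"intercept": intercept, "se": se_intercept, "t": t_stat, "df": n - 2}


# Hypothetical data in which smaller studies report larger effects.
print(egger_regression([0.1, 0.2, 0.4, 0.7], [0.05, 0.1, 0.2, 0.4]))
```

The t statistic would be compared against a t distribution with n - 2 degrees of freedom, here at the 10% significance level stated above.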

Results

Eligible studies

A search of the prespecified public databases using predefined medical subject terms initially identified 2098 articles, of which 28 with data on sleep duration and all-cause mortality were eligible for inclusion [10, 14, 17, 19, 21, 25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47], covering 95,259 older persons in the final analysis. The detailed selection process, including specific reasons for exclusion, is shown in Fig. 1. Since most articles provided data stratified by baseline age group or follow-up period, these strata were analyzed separately in subgroup analyses.

Fig. 1 Flow chart of records retrieved, screened and included in this meta-analysis

Study characteristics

Table 1 and Table 2 show the baseline characteristics of all cohort studies involved in this meta-analysis. Of the 28 eligible articles, 2 were conducted in older women [17, 19], and 6 specifically reported the numbers of men and women and the numbers of deaths in each sex [27, 30, 35, 38,39,40]. Five articles provided data on the association between sleep duration and all-cause mortality by gender [30, 35, 36, 38, 42]. Of all eligible articles, 9 investigated total sleep duration over 24 h in older people [19, 26, 31, 33, 38, 39, 41,42,43], and the others focused on nighttime sleep. One article used actigraphy to measure sleep time [43], and 2 articles used both actigraphy and questionnaires [17, 47]. Based on geographic location, all eligible articles were classified into America [14, 17, 19, 26, 32, 33, 40, 42], Europe [10, 21, 34, 37, 42,43,44,45], and Asia [27,28,29,30,31, 35, 36, 38, 39, 41].

Table 1 The baseline characteristics of all cohort studies involved in this meta-analysis
Table 2 The baseline characteristics of all cohort studies involved in this meta-analysis

Quality assessment

Table 3 shows the quality assessment results by using the Newcastle-Ottawa Scale (NOS) tool for cohort studies, with the total scores (mean: 7.46, standard deviation: 0.74) ranging from 6 to 9 in this meta-analysis.

Table 3 The Newcastle-Ottawa Scale (NOS) for assessing the quality of all cohort studies involved in this meta-analysis

Overall analyses

After pooling the results of all qualified prospective cohorts (Table 4), unadjusted effect-size estimates for the association of long (HR = 1.43; 95% CI: 1.30–1.58; P < .001; I2 = 88.6%) and short (HR = 1.15; 95% CI: 1.06–1.25; P < .001; I2 = 71.5%) sleep duration with all-cause mortality in older people were highly significant. After adjusting for potential confounders, long sleep duration was significantly associated with an increased risk of all-cause mortality in older people (HR = 1.24; 95% CI: 1.16–1.33; P < .001), whereas only marginal significance was observed for short sleep duration (HR = 1.04; 95% CI: 1.00–1.09; P = .033) (Table 4). In view of the striking differences before and after adjustment, the following analyses are based on adjusted effect-size estimates for the sake of relative accuracy.

Table 4 Overall and subgroup analyses of short and long sleep duration with all-cause mortality in the older people

Publication bias

Figure 2 shows the Begg’s funnel plots used to assess publication bias for the association of sleep duration with all-cause mortality; only the plot for short sleep duration appeared symmetrical. Egger’s test showed no evidence of publication bias for short sleep duration (P = .392), yet strong evidence of publication bias for long sleep duration (P = .020). The filled funnel plots indicated that 9 potentially missing studies would be needed to make the plot for long sleep duration symmetrical. After adjusting for these potentially missing studies, the effect-size estimate for the association of long sleep duration with all-cause mortality remained statistically significant (HR = 1.15; 95% CI: 1.07–1.23, P < .001).

Fig. 2 The Begg’s and filled funnel plots for the association of both short and long sleep duration with all-cause mortality

Subgroup analyses

A series of prespecified subgroup analyses was conducted to explore possible causes of between-study heterogeneity for both short and long sleep duration in older people (Table 4).

By gender, the association of long sleep duration with all-cause mortality was statistically significant in both women (HR = 1.48; 95% CI: 1.18–1.86; P = .001) and men (HR = 1.31; 95% CI: 1.10–1.58; P = .003) (Two-sample Z test P = .205). By contrast, with regard to short sleep duration, statistical significance was observed in men (HR = 1.13; 95% CI: 1.04–1.24; P = .007), but not in women (HR = 1.00; 95% CI: 0.85–1.18; P = .999) (Two-sample Z test P = .099).

By geographic locations, the association of long sleep duration with all-cause mortality was stronger in Asia (HR = 1.41; 95% CI: 1.26–1.57; P < .001) than in Europe (HR = 1.01; 95% CI: 0.93–1.09; P = .823) (Two-sample Z test P < .001) and America (HR = 1.19; 95% CI: 1.07–1.31; P = .001) (Two-sample Z test P = .013). There was no observable difference for short sleep duration between Asia (HR = 1.04; 95% CI: 0.96–1.12; P = .384) and Europe (HR = 1.03; 95% CI: 0.93–1.14; P = .627).

By total sleep time, significance was only observed for the association of long sleep duration with all-cause mortality, and there was no material difference between the nighttime (HR = 1.25; 95% CI: 1.13–1.38; P < .001) and the 24 h sleep duration (HR = 1.25; 95% CI: 1.14–1.36; P < .001).

By ascertainment of sleep, for long sleep duration, the association was more evident for questionnaire-based assessment (HR = 1.26; 95% CI: 1.17–1.35; P < .001) than for actigraphy-based assessment (HR = 0.83; 95% CI: 0.61–1.13; P = .233) (Two-sample Z test P = .004). By contrast, for short sleep duration, there was no detectable significance.

By the median value (7.5 years) of follow-up intervals, the association of long sleep duration with all-cause mortality was significant in both long (≥7.5 years) (HR = 1.24; 95% CI: 1.14–1.34; P < .001) and short (< 7.5 years) (HR = 1.27; 95% CI: 1.12–1.45; P < .001) follow-up. As for short sleep duration, the association was only significant in studies with long follow-up intervals (HR = 1.07; 95% CI: 1.02–1.12; P = .006).

By the median value (65 years) of baseline age, long sleep duration was significantly associated with all-cause mortality in both subgroups (≥65 years: HR = 1.20; 95% CI: 1.11–1.30; P < .001, and <65 years: HR = 1.38; 95% CI: 1.19–1.60; P < .001), and for short sleep duration, only marginal significance was observed for studies with median age <65 years (HR = 1.12; 95% CI: 1.02–1.23; P = .018).

Dose-response analyses

In the dose-response analysis of short sleep duration, the risk of all-cause mortality increased as sleep time decreased (≤5 h: HR = 1.06; 95% CI: 1.01–1.11; P = .014, ≤6 h: HR = 1.05; 95% CI: 1.01–1.10; P = .031, and ≤7 h: HR = 1.04; 95% CI: 1.00–1.09; P = .033) (Two-sample Z test P = .379 for ≤5 h vs. ≤6 h, and P = .379 for ≤6 h vs. ≤7 h) (Table 4). For long sleep duration, the trend was more evident (≥8 h: HR = 1.24; 95% CI: 1.16–1.33; P < .001, ≥9 h: HR = 1.31; 95% CI: 1.21–1.41; P < .001, and ≥10 h: HR = 1.45; 95% CI: 1.24–1.70; P < .001) (Two-sample Z test P = .147 for ≥8 h vs. ≥9 h, and P = .128 for ≥9 h vs. ≥10 h) (Table 4 and Fig. 3A).

Fig. 3 The trend plots of effect-size estimates with the increase of sleep duration in all older persons (A) and by genders (B and C). Abbreviations: HR, hazard ratio; 95% CI, 95% confidence interval

In men, the risk of all-cause mortality was significantly increased with both shorter and longer sleep duration, and the increasing trend was more obvious for long sleep duration (Fig. 3B). In women, the risk of all-cause mortality was nonsignificant for short sleep duration, yet it increased significantly with longer sleep duration in a graded manner, and more steeply than in men (Fig. 3C).

In both genders combined, dose-response regression analyses, using log (effect-size estimates) as the dependent variable and categorized sleep duration as the independent variable, revealed that the trend was more obvious for long sleep duration (regression coefficient: 0.13; P < .001) than for short sleep duration (regression coefficient: 0.02; P = .046) (Fig. 4). In men, the regression coefficient for trend estimation was 0.05 (P = .022) and 0.15 (P < .001) for short and long sleep duration, respectively; the corresponding coefficients in women were 0.04 (P = .449) and 0.20 (P < .001).
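To make the form of this trend regression concrete, the sketch below fits an inverse-variance weighted linear regression of log HR on sleep-duration category. It is a simplification and not the Greenland and Longnecker procedure used in this study, which additionally accounts for the correlation among estimates that share a reference group. The function name and the input values are hypothetical and will not reproduce the coefficients reported above.

```python
import math


def weighted_log_linear_trend(doses, hrs, cis, z_crit=1.96):
    """Inverse-variance weighted regression of log(HR) on dose category.

    Returns (slope, standard error of the slope, two-sided P value).
    Treats the category estimates as independent, which is an assumption.
    """
    y = [math.log(hr) for hr in hrs]
    ses = [(math.log(hi) - math.log(lo)) / (2 * z_crit) for lo, hi in cis]
    w = [1.0 / se ** 2 for se in ses]

    sum_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, doses)) / sum_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, doses))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, doses, y))

    slope = sxy / sxx
    # With known variances, the variance of the weighted slope is 1 / sxx.
    se_slope = math.sqrt(1.0 / sxx)
    p_value = math.erfc(abs(slope / se_slope) / math.sqrt(2))
    return slope, se_slope, p_value


# Hypothetical category-specific estimates (hours of sleep, HR, 95% CI).
slope, se_slope, p = weighted_log_linear_trend(
    [8, 9, 10], [1.10, 1.25, 1.45],
    [(1.00, 1.21), (1.10, 1.42), (1.20, 1.75)],
)
print(f"slope per hour = {slope:.3f} (SE {se_slope:.3f}), P = {p:.4f}")
```

The slope is the change in log HR per one-hour increase in the sleep-duration category, so exponentiating it gives the multiplicative change in HR per hour.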

Fig. 4 The dose-response relationship plot for the association of sleep duration with all-cause mortality. Lines with long dashes represent the pointwise 95% confidence intervals for the fitted nonlinear trend (solid line). Lines with short dashes represent the linear trend. The red horizontal line represents the reference line (hazard ratio: 1)

Discussion

To the best of our knowledge, this is thus far the most comprehensive meta-analysis to explore the dose-response relationship between sleep duration and all-cause mortality in older people. It is worth noting that long sleep duration was associated with a significantly increased risk of all-cause mortality, especially in women, and the mortality risk associated with short sleep duration was only significant in men. Moreover, besides gender, geographic region, sleep survey method, baseline age and follow-up interval were identified as possible causes of between-study heterogeneity. Our findings highlight the importance of closely monitoring the sleep status of older people who have long sleep duration, as well as older men with insufficient sleep, to help reduce the risk of all-cause mortality.

In the previous meta-analysis of 27 cohort studies by da Silva and colleagues, both long and short sleep duration were found to be associated with a significantly increased risk of all-cause mortality in older people [18]. Differing from that meta-analysis [18], we restricted our analysis to prospective cohort studies that reported HRs and 95% CIs to quantify the association between sleep duration and all-cause mortality in older people. After synthesizing the adjusted effect-size estimates from 28 articles including 95,259 older persons, we observed only marginal significance for short sleep duration in the overall analyses; extending the findings of da Silva and colleagues [18], our subsidiary analyses showed that the mortality risk associated with short sleep duration was significant in men only. In agreement with da Silva and colleagues [18], our results support a significant contribution of long sleep duration to all-cause mortality. The reasons behind the inconsistent observations for short sleep duration are manifold. First, the most likely reason is unaccounted confounding, as our analysis based on unadjusted effect-size estimates indicated that short sleep duration was a significant predictor of all-cause mortality, yet only marginal significance remained after adjustment.

Another possible reason is the synthesis of different types of effect-size estimates. To minimize this statistical noise, we restricted our analysis to HRs that were calculated after adjusting for confounding factors, although the sets of adjustment factors varied across the included studies. The third reason is the significant heterogeneity across individual studies. To account for this, we conducted both subgroup and meta-regression analyses, and found that gender, geographic region, sleep survey method, baseline age and follow-up interval were possible causes of between-study heterogeneity. Future large-scale, well-designed cohort studies are warranted to derive a more reliable estimate.

Although the mechanisms underlying the association between long sleep duration and all-cause mortality are not completely understood, one possible explanation is that sleep affects the human body through inflammatory processes. When sleep duration is too long, concentrations of inflammatory markers, such as interleukin-6 and C-reactive protein, can increase [48, 49]. In addition, unstable sleep duration has been reported to be associated with some common diseases, such as hypertension [50, 51], diabetes [52], and coronary heart disease [53, 54]. It is hence reasonable to speculate that long-term irregular sleep duration may disrupt the balance of the immune system through chronic inflammatory processes, and further increase all-cause mortality risk. There is also evidence showing that sleep has a crucial impact on the autonomic nervous system, system dynamics, cardiac function, endothelial function and coagulation [55]. Excessive sleep duration may thus accelerate the occurrence or progression of chronic diseases and, in turn, precipitate all-cause mortality.

It is worth noting that we identified strong evidence of between-study heterogeneity for the association of long sleep duration with all-cause mortality, irrespective of adjustment. By contrast, for short sleep duration, heterogeneity dwindled from strong in the unadjusted model to low in the adjusted model. It is hence reasonable to expect that, besides methodological heterogeneity (such as study design), clinical heterogeneity such as differing baseline characteristics (for example age, sex ratio and dietary habits) of the study populations in this meta-analysis may explain the discrepancy. In particular, residual confounding by incompletely measured or unmeasured clinical covariates might exist in our results. As such, translating our findings into clinical practice should be done with caution.

Finally, some limitations of the present meta-analysis should be acknowledged. First, only sleep duration was considered in this study; other sleep-related indexes, such as sleep quality, merit exploration once sufficient eligible studies are available. Second, although adjusted effect-size estimates were synthesized in this meta-analysis, some important confounding factors, such as physical activity and other lifestyle factors, were not taken into account by all involved studies. For example, in a long-term follow-up of older adults in the UK, physical activity and prefrailty were observed to be significant modifiers of the association between long sleep duration and all-cause mortality [40]. Third, although there was a high probability of publication bias for long sleep duration as reflected by Begg’s funnel plot and Egger’s test, we adopted the trim-and-fill method to impute theoretically missing studies and recalculated our pooled effect-size estimate, which was still statistically significant. Fourth, although a large panel of subgroup and meta-regression analyses was undertaken to account for possible causes of heterogeneity, significant heterogeneity persisted in some subgroups, limiting the interpretation of pooled effect-size estimates. Last but not least, the majority of studies involved in this meta-analysis recorded sleep duration based on nighttime sleep, and data on naps are sparse.

Conclusions

Taken together, our findings indicate a significantly increased risk of all-cause mortality associated with long sleep duration, especially in women, as well as with short sleep duration in men only. The findings of this meta-analysis call on researchers, clinicians, and policy makers to attach importance to monitoring the sleep status of older people, especially those with long sleep duration. Further investigations of the molecular mechanisms linking sleep duration and all-cause mortality are also warranted.