Validation of the National Health And Nutritional Survey (NHANES) single-item self-reported sleep duration against wrist-worn accelerometer

Purpose and methods This study aimed to validate the single-item sleep duration question used in the National Health and Nutrition Examination Survey (NHANES), “How much sleep do you usually get at night on weekdays or workdays (hours)?”, against a wrist-worn accelerometer (ActiGraph GT3X+) in waves 2011–2012 and 2013–2014 among an adult population aged 20 or above (n = 8,438, mean age 49.7, 48% male).

Results The accelerometer-measured and self-reported sleep durations were 6.01 (SD 1.48) and 6.88 (SD 1.40) h/day, respectively, representing an over-reporting of 0.87 h/day (SD 1.90, p < 0.001). This over-reporting was observed in all subgroups, ranging from 0.72 h/day (those aged 41–50) to 1.13 h/day (those aged 71 or above). The correlation between accelerometer-measured and self-reported sleep duration was low (ρ = 0.14, p < 0.001).

Conclusions The associations between sleep duration and other health outcomes identified using NHANES data should be further tested using more accurate and valid measures of sleep duration.

Supplementary Information The online version contains supplementary material available at 10.1007/s11325-021-02542-6.


Introduction
Measuring habitual sleep duration is essential in observational studies, as it is correlated with many other health outcomes [1][2][3]. Three methods can be used to measure habitual sleep: a sleep questionnaire (single-point self-reporting of habitual sleep duration); a sleep diary that reports time in bed and wake-up time for a representative period of time (usually 7 consecutive days); and accelerometry, in which an electronic device measures the movement pattern of the subject under investigation and the wake-sleep pattern is determined from prolonged non-movement. All of these methods have been validated against polysomnography, the gold standard of sleep measurement under laboratory conditions [4][5][6], although discrepancies have been observed between these sleep measures and polysomnography, and the validity varies among subjects with sleep disorders [7]. Among these three methods, accelerometry is the only objective measurement, and it has become more popular in sleep research owing to its decreasing cost and its ability to collect additional data such as physical activity and exposure to both electric light and outdoor sunlight [8]. Accelerometry is therefore also regarded as a standard of sleep measurement in free-living conditions [9,10], albeit with an overestimation of sleep duration and an underestimation of wake after sleep onset and sleep onset latency [11].
Given its low cost and ease of administration, self-reported sleep duration remains a popular choice in observational studies. A sleep diary has better validity than a sleep questionnaire because it is completed on a day-by-day basis and can capture the variation in respondents' sleep. Sleep duration measured by sleep diary has been shown to differ from accelerometry by less than 15 min, although the two are only moderately correlated (r = 0.4-0.6) [12,13]. However, a sleep diary places a burden on respondents, and systematic bias arises from the unavoidable difference between the actual sleep onset or wake-up time and the time at which the diary is completed. Self-reported questionnaires are therefore still widely used despite their questionable validity. For instance, the National Health and Nutrition Examination Survey (NHANES), which surveys a US-representative sample of around 5,000 individuals each year, used the single-item question "How much sleep do you usually get at night on weekdays or workdays (hours)?" from wave 2005-2006 to wave 2017-2018 to measure sleep duration. Data collected using this question have been widely used as a measure of habitual sleep duration, and a number of studies have correlated it with other health outcomes in NHANES [1,2]. The single-item question "On average, how many hours of sleep do you get in a 24-h period?" used in the Behavioral Risk Factor Surveillance System (BRFSS) has been validated in a subsample of 300 participants [14], but the NHANES question has not been validated. As a quality assurance procedure, this study aimed to validate the single-item total sleep duration question used in NHANES against a wrist-worn accelerometer (ActiGraph GT3X+) in waves 2011-2012 and 2013-2014 among an adult population aged 20 or above. The results may help evaluate the methodological quality of existing studies using NHANES data.

Participants
The complete details of the NHANES recruitment procedure can be found on the NHANES official website, https://wwwn.cdc.gov/nchs/nhanes/continuousnhanes/overview.aspx?BeginYear=2011. A total of 11,329 participants aged 20+ were recruited in NHANES 2011-2012 and 2013-2014, and only those who provided valid data on self-reported and accelerometer-measured sleep duration (defined in the "Measurement" section) were included in the present analysis.

Measurement
Self-reported sleep duration
Participants were asked, "How much sleep do you usually get at night on weekdays or workdays (hours)?" Participants responded with an integer value between 2 and 11, and responses of 12 h or more were coded as 12. Self-reported sleep durations of 2 h/day were regarded as outliers and removed from the analysis (n = 31).

Accelerometer-measured sleep duration
The complete details of the accelerometer procedure can be found on the NHANES official website (https://wwwn.cdc.gov/nchs/data/nhanes/2011-2012/manuals/2012-Physicial-Activity-Monitor-Procedures-Manual-508.pdf). In short, participants were invited to wear an accelerometer (ActiGraph GT3X+, https://actigraphcorp.com/) on their non-dominant wrist on the day of the examination, continue to wear it 24 h a day for 7 consecutive days, and remove it on the morning of the 9th measurement day. The accelerometer measured acceleration at a frequency of 80 Hz, and the epoch length was set at 1 min. Each measured minute was classified as wake, sleep, non-wear, or unknown according to the signal power, variance of the orientation, and change of the orientation using a machine learning algorithm [15]. For the current analysis, sleep onset was defined as a consecutive sleeping period of at least 15 min, and a sleep period ended when a consecutive waking period of at least 15 min was recorded. Minutes with non-wear or unknown status were not used in the current analysis. Sleep duration was calculated as the difference between sleep onset and sleep offset. A sensitivity analysis was conducted to test the robustness of this parameter by computing the total sleep duration using 5-min, 10-min, and 20-min criteria. To align with the self-reported sleep duration, accelerometer data at weekends (i.e., Friday-Saturday and Saturday-Sunday nights) were removed from the analysis.
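The onset and offset rule described above can be illustrated with a minimal Python sketch. This is not the author's R syntax (which is provided as supplementary material); the function name `find_sleep_episodes`, the label strings, and the toy minute sequence are all hypothetical. As a simplifying assumption, minutes labelled non-wear or unknown merely interrupt a run of sleep or wake minutes and neither start nor end an episode, and an episode still open at the end of the record is closed at the last minute.

```python
from typing import List, Sequence, Tuple


def find_sleep_episodes(labels: Sequence[str], min_run: int = 15) -> List[Tuple[int, int]]:
    """Return (onset, offset) minute indices of sleep episodes.

    onset  = first minute of a run of at least `min_run` consecutive "sleep" minutes
    offset = first minute of the run of at least `min_run` consecutive "wake"
             minutes that ends the episode (or len(labels) if the record ends first)
    """
    episodes = []
    onset = None       # start index of the episode in progress, if any
    run_label = None   # label of the current run of identical minutes
    run_start = 0      # index at which the current run began
    for i, label in enumerate(labels):
        if label != run_label:          # any change of label starts a new run
            run_label, run_start = label, i
        run_len = i - run_start + 1
        if onset is None:
            if label == "sleep" and run_len >= min_run:
                onset = run_start       # sleep onset: start of the qualifying run
        elif label == "wake" and run_len >= min_run:
            episodes.append((onset, run_start))  # offset: start of the wake run
            onset = None
    if onset is not None:               # record ended during an episode (assumption)
        episodes.append((onset, len(labels)))
    return episodes


# Toy minute-level record (made-up data, 1-min epochs).
labels = (["wake"] * 20 + ["sleep"] * 200 + ["wake"] * 5
          + ["sleep"] * 100 + ["wake"] * 30)

episodes = find_sleep_episodes(labels, min_run=15)
print(episodes)  # -> [(20, 325)]: the 5-min awakening does not end the episode

# Duration in hours, keeping only 3-12 h episodes as in the Data analysis section.
hours = [(off - on) / 60 for on, off in episodes]
valid_hours = [h for h in hours if 3 <= h <= 12]
print(valid_hours)
```

Calling the same function with `min_run=5` or `min_run=20` mimics the kind of sensitivity analysis described above; with the 5-min criterion the short awakening in the toy record splits the night into two episodes.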

Data analysis
All accelerometer-measured sleep durations of < 3 or > 12 h/day were regarded as outliers and removed from the analysis. A paired-sample t-test and Pearson correlation were used to examine the difference and the correlation, respectively, between self-reported and accelerometer-measured sleep duration. The mean and SD of accelerometer-measured sleep duration were reported across all levels of self-reported sleep duration. The self-reported sleep duration was classified as underestimation, accurate estimation, or overestimation if its difference from the corresponding accelerometer-measured sleep duration was smaller than −0.5 h/day, between −0.5 and 0.5 h/day, or larger than 0.5 h/day, respectively. A Bland-Altman plot was used to evaluate the agreement between self-reported and accelerometer-measured sleep duration. All statistical analyses were conducted using R 4.0. The R syntax for accelerometer data processing is available as supplementary material.
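As a rough illustration of these analysis steps, the sketch below computes the paired difference, the paired-sample t statistic, the Pearson correlation, the Bland-Altman 95% limits of agreement, and the three-way classification. It is written in Python using only the standard library rather than the R 4.0 used in the study, the data are made up, and the function names are hypothetical. The difference is taken as self-report minus accelerometer, which is an assumption consistent with underestimation corresponding to a negative difference.

```python
import math
import statistics


def agreement_summary(self_report, accel):
    """Paired-difference statistics for two equally long lists of hours."""
    diffs = [s - a for s, a in zip(self_report, accel)]  # self-report minus accelerometer
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)                    # sample SD of the differences
    t_stat = mean_diff / (sd_diff / math.sqrt(n))        # paired t statistic, df = n - 1

    # Pearson correlation computed from its definition.
    ms, ma = statistics.mean(self_report), statistics.mean(accel)
    cov = sum((s - ms) * (a - ma) for s, a in zip(self_report, accel))
    ss_s = sum((s - ms) ** 2 for s in self_report)
    ss_a = sum((a - ma) ** 2 for a in accel)
    r = cov / math.sqrt(ss_s * ss_a)

    # Bland-Altman 95% limits of agreement: mean difference +/- 1.96 SD.
    loa = (mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff)
    return {"mean_diff": mean_diff, "sd_diff": sd_diff,
            "t": t_stat, "df": n - 1, "r": r, "loa": loa}


def classify(self_report_h, accel_h, tol=0.5):
    """Classify one self-report against the accelerometer value (hours)."""
    d = self_report_h - accel_h
    if d < -tol:
        return "underestimation"
    if d > tol:
        return "overestimation"
    return "accurate estimation"


# Made-up example values (h/day), not NHANES data.
self_report = [7, 8, 6, 7, 5]
accel = [6.0, 6.5, 6.2, 5.0, 5.4]

summary = agreement_summary(self_report, accel)
classes = [classify(s, a) for s, a in zip(self_report, accel)]
print(summary)
print(classes)
```

The p-value of the paired t-test is omitted because the standard library has no t distribution; in practice it would come from a statistics package.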

Results
A total of 8,438 participants (mean age 49.7, SD 17.6) were included in the present analysis. On average, participants provided 2.8 (SD 1.2) valid accelerometer-measured sleeping episodes (i.e., sleep duration between 3 and 12 h), and the intra-class correlation coefficient was 49.5%. Table 1 shows the demographic characteristics and sleep duration of the participants. The sample was evenly distributed across age and gender, and most of them were Non-Hispanic Whites (40.8%) and Blacks (23.4%). More than half of them had at least some college or an AA degree (56.0%) and were married (50.8%). A large over-reporting was observed in the average daily sleep duration. The accelerometer-measured and self-reported sleep durations were 6.01 (SD 1.48) and 6.88 (SD 1.40) h/day, respectively, representing an over-reporting of 0.87 h/day (SD 1.90, p < 0.001). This over-reporting was observed in all subgroups, ranging from 0.72 h/day (those aged 41-50) to 1.13 h/day (those aged 71 or above). The correlation between accelerometer-measured and self-reported sleep duration was small but positive and statistically significant (ρ = 0.14, p < 0.001). A similar pattern was observed among all subgroups, with correlations ranging from 0.02 (separated) to 0.19 (those who graduated from college or above). Table 2 shows the distribution of the self-reported sleep duration, as well as the accelerometer-measured sleep duration across all levels of self-reported sleep duration (3-12 h/day). While there was a positive association between the sleep durations measured by accelerometer and by self-report, the association was weak, and the mean accelerometer-measured sleep duration (h/day) across the groups differed by less than 2 h.
For self-reported sleep duration of 5 h or less, the self-reported sleep duration overestimated the objectively measured sleep duration, while the self-report underestimated those who had an objectively measured sleep duration of 6 h or more. Figure 1 shows the Bland-Altman plot for these two measurements, where their large discrepancy was revealed by the wide range of the 95% limits of agreement (−4.59, 2.84). Table 1 shows the results of the sensitivity analysis. The accelerometer-measured sleep durations using the 5-min, 10-min, and 20-min criteria were 5.07 (SD 1.39), 5.68 (SD 1.41), and 5.49 (SD 1.59), respectively. They were mildly correlated with one another (ρ = 0.37-0.79), and they all had small but significant correlations with self-reported sleep duration (ρ = 0.07-0.12). Similar patterns were found using different definitions of accelerometer-measured sleep onset and awakening (5-min, 15-min, and 20-min definitions). The results of this sensitivity analysis indicated that the conclusions of the main analysis were robust to the criterion used to define sleep onset and awakening.
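The reported limits of agreement can be approximately reproduced from the rounded summary values given above (mean over-reporting 0.87 h/day, SD 1.90), assuming the usual Bland-Altman definition of mean difference ± 1.96 SD. The short Python sketch below is only an arithmetic check; the variable and function names are hypothetical, and small discrepancies from the published limits are expected because the inputs are rounded. The published limits (−4.59, 2.84) appear to be expressed as accelerometer minus self-report, which is the reverse of the sign used for the over-reporting.

```python
mean_diff_h = 0.87  # self-report minus accelerometer (h/day), from the Results
sd_diff_h = 1.90    # SD of the difference (h/day), from the Results

half_width = 1.96 * sd_diff_h
loa_self_minus_accel = (mean_diff_h - half_width, mean_diff_h + half_width)
# Same interval with the sign reversed (accelerometer minus self-report).
loa_accel_minus_self = (-loa_self_minus_accel[1], -loa_self_minus_accel[0])


def to_h_min(hours):
    """Format a duration in hours as 'H h M min', rounded to the nearest minute."""
    sign = "-" if hours < 0 else ""
    total_min = round(abs(hours) * 60)
    return f"{sign}{total_min // 60} h {total_min % 60} min"


print(loa_accel_minus_self)   # roughly (-4.59, 2.85)
print(to_h_min(mean_diff_h))  # -> 0 h 52 min
print(to_h_min(4.59), to_h_min(2.84))  # published limits expressed in h and min
```

Converting the published limits in this way gives 4 h 35 min and 2 h 50 min, the values quoted in the Discussion.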

Discussion
This study shows that the self-reported single-item total sleep duration was only weakly associated with the sleep duration measured by a wrist-worn accelerometer (ActiGraph GT3X+), and participants over-reported their sleep duration by approximately 52 min per day, with wide 95% limits of agreement (−2 h 50 min, 4 h 35 min). Therefore, the validity of this single-item sleep duration measurement and the validity of studies examining the associations between sleep duration and other health outcomes using NHANES data are questionable. Results obtained from research assessing sleep duration using this single-item question should be further tested using more accurate and valid measures of sleep duration. For analyses of sleep using NHANES data, the accelerometer data should be used instead, given their validity and ability to measure other sleep parameters, including sleep efficiency and wake after sleep onset.
Table footnotes: # All differences significant at the 0.1% level. 1 Sleep onset is defined as a consecutive sleeping period of at least 5 min, and a sleep period ends if a consecutive waking period of at least 5 min is recorded. 2 Sleep onset is defined as a consecutive sleeping period of at least 15 min, and a sleep period ends if a consecutive waking period of at least 15 min is recorded. 3 Sleep onset is defined as a consecutive sleeping period of at least 20 min, and a sleep period ends if a consecutive waking period of at least 20 min is recorded.
The results of this study are not without limitations. The reference measure of sleep duration in the current study, actigraphy, has its own limitations. Accelerometers have been found to overestimate sleep duration by about 5-15 min in adult populations [5,10,11], indicating that the over-reporting of sleep duration might be more than the 52 min found in the current study. Note that the machine learning algorithm used to detect sleep in the current study has not been validated in free-living conditions among a general population. However, visual inspection of the accelerometer data from several participants by the authors supported the validity of this algorithm. There were no data on the time lag between self-reported and objectively measured sleep duration; thus, its effect on the validity of self-reported sleep duration could not be evaluated. Furthermore, as no data were available on the participants' working pattern, a Monday to Friday pattern was assumed, and the accelerometer-measured sleep duration extracted may not have been representative for participants who worked on weekends.
Sleep duration questionnaires are of two types: the first type is a single-item questionnaire that asks the total sleep duration per night (e.g., NHANES and BRFSS [14]), and the second type is a two-item questionnaire that asks the sleep onset time and wake-up time, with the sleep duration determined by their difference (e.g., Pittsburgh Sleep Quality Index, PSQI [16]). In NHANES 2019-2020, instead of asking the total sleep duration, the second type of questionnaire was used. However, no concurrent objective measurement of sleep duration was available for 2015-2016 onwards, and the validity of the new sleep questions could not be evaluated.
In BRFSS [14], a US community sample comparable to NHANES, both the self-reported and the accelerometer-measured sleep durations were 7 h/day. In the current sample, the self-reported sleep duration was 7 h/day and the accelerometer-measured sleep duration was 6 h/day. Assuming that BRFSS and NHANES had a similar target population, it is possible that the accelerometer algorithm was biased and underestimated the actual sleep duration. However, no other measurements of sleep duration were available for the NHANES sample, and this postulation could not be examined. Also, the BRFSS subsample analyzed in the aforementioned study may not be comparable to NHANES, as it was a small (n = 300) and geographically limited (Upstate New York region) study.
Because NHANES surveys a US-representative sample of n ~ 5,000 each year, much population-level research can be conducted with it, for example, on sleep patterns in subgroups and longitudinal changes in sleep patterns. A simple analysis was performed here on the average sleep duration across different age groups, genders, ethnic groups, and education levels (Table 1). The current research serves as a starting point for these possibilities by assessing the validity of the single-item sleep duration question.

Author contribution
Paul H. Lee: conceptualization, methodology, formal analysis, and writing-original draft.

Data availability
The datasets supporting the conclusions of this article are available on the NHANES website (https://wwwn.cdc.gov/nchs/nhanes/).

Code availability
The R syntax is available as supplementary material.

Declarations
Ethics approval
This study was approved by the NCHS Research Ethics Review Board (ERB) (Protocol #2011-17).

Consent to participate
All participants consented to participate.

Conflict of interest
The authors declare no competing interests.

Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.