Organizing and Analyzing the Activity Data in NHANES

Statistics in Biosciences

Abstract

The National Health and Nutrition Examination Survey (NHANES) contains objectively measured physical activity data collected using hip-worn accelerometers from multiple cohorts. However, using the accelerometry data has proven daunting because (1) there are currently no agreed-upon standard protocols for data storage and analysis; (2) the data exhibit heterogeneous patterns of missingness due to varying degrees of adherence to wear-time protocols; (3) sampling weights need to be carefully adjusted and accounted for in individual analyses; (4) there is a lack of reproducible software that transforms the data from its published format into analytic form; and (5) the high-dimensional nature of accelerometry data complicates analyses. Here, we provide a framework for processing, storing, and analyzing the NHANES accelerometry data for the 2003–2004 and 2005–2006 surveys. We also provide an NHANES data package in R to help disseminate high-quality, processed activity data combined with mortality and demographic information. Thus, we provide the tools to transition from “available data online” to “easily accessible and usable data”, which substantially reduces the large upfront costs of initiating studies of association between physical activity and human health outcomes using NHANES. We apply these tools in an analysis showing that accelerometry features have the potential to predict 5-year all-cause mortality better than known risk factors such as age, cigarette smoking, and various comorbidities.

Acknowledgements

We would like to thank the CDC, specifically the National Center for Health Statistics, for collecting, organizing, and making public this unique data resource. We would also like to thank them for permission to repost the publicly available NHANES and NDI data in analytic format. Finally, we would like to thank the thousands of anonymous participants in NHANES, whose data led to the exciting findings in this paper.

Funding

This research was supported by the National Heart, Lung, and Blood Institute (R01 HL123407), the National Institute of Neurological Disorders and Stroke (R01 NS060910), and a National Institute on Aging Training Grant (T32 AG000247).

Author information

Corresponding author

Correspondence to Andrew Leroux.

About this article

Cite this article

Leroux, A., Di, J., Smirnova, E. et al. Organizing and Analyzing the Activity Data in NHANES. Stat Biosci 11, 262–287 (2019). https://doi.org/10.1007/s12561-018-09229-9

  • DOI: https://doi.org/10.1007/s12561-018-09229-9
