Increasing numbers of older individuals are continuing to seek employment largely out of financial need. Continuing employment can also offer older people benefits beyond financial rewards, such as feeling fulfilled (Maestas et al. 2019), maintaining physical functioning (Choi et al. 2016) and cognitive health (Adam et al. 2013; Bonsang et al. 2012), reducing the risk of dementia (Grotz et al. 2016), and increasing longevity (Wu et al. 2016), all of which are well-recognized issues of aging. Furthermore, the theoretical perspective on productive aging recognizes employment as an activity that can help older people remain productive members of society; this in turn is interrelated with avoidance of disease and maintenance of high cognitive and physical function in later life (Hinterlong 2008; Matz-Costa et al. 2014; Rowe and Kahn 1997).

It is, however, questionable whether these benefits extend to all older workers, particularly those with physically demanding jobs—who typically have relatively low wages and limited education. Despite the decrease in the proportion of physically demanding jobs in the U.S. due to an overall economic shift, a national survey found that many common occupations among older workers are physically taxing (e.g., truck drivers and building cleaners for men; personal aides for women) (Johnson and Wang 2017). Evidence indicates that workers with physically taxing jobs show a higher tendency towards physical health ailments, such as cardiovascular disorders (Krause et al. 2015; Petersen et al. 2012), musculoskeletal disorders (da Costa and Vieira 2010), disabilities (Mc Carthy et al. 2013), occupational diseases (Chau et al. 2009), and even mortality (Coenen et al. 2018). Despite poor physical health, however, older workers may be compelled to work out of financial necessity until reaching the age of full eligibility for Social Security (Rho 2010).

Studies suggest that long-term exposure to work tasks that are physically taxing, relatively simple, and/or limited in intellectual challenge tends to result in a cumulative decline in intellectual functioning in old age (Fisher et al. 2014; Marengoni et al. 2011; Schooler et al. 1999; Smyth et al. 2004). The association remained significant even after controlling for education (Qiu et al. 2003; Smyth et al. 2004), age, and intelligence (Potter et al. 2008), which are key risk factors for the development of Alzheimer’s disease (AD) in all racial and ethnic groups. This negative health impact, though initially relatively modest, may accumulate over time, harming older workers in such jobs more severely than those with more intellectually stimulating jobs (Ilmarinen and Rantanen 1999).

The precise relationship between physical demands and cognition, however, remains to be established. Some studies found that complexity of work is not necessarily related to advancing AD pathology (Gow et al. 2014; Helmer et al. 2001; Stern et al. 1995) because strenuous activity might actually be protective against cognitive declines in normal aging. These conflicting findings may have stemmed from the relatively small samples from limited geographical regions in many of the earlier studies, limiting the generalizability of published findings. Therefore, to fill this gap in knowledge, it is imperative that larger, population-based data be used to better establish significant associations.

The onset of cognitive impairment is an important and growing individual and public health concern because, in the long run, it often leads to more serious consequences, including functional impairment (Griffith et al. 2010), increasing rates of hospitalization and institutionalization (Wang et al. 2013), and even mortality (Dewey and Saz 2001; Johnson et al. 2007). All of these can be costly not only to the individuals and their families, but also to society as a whole, which collectively pays for the nation’s healthcare. A continued gap in knowledge in this area may place older workers with physically challenging jobs at significant risk for cognitive decline both in the short run and cumulatively over the long run. Thus, there is a critical need to define potential detrimental effects of physically arduous jobs on older workers’ cognitive function over time.

The current study is designed to establish the scientific premise for the association between physically demanding jobs and cognition, generating hypotheses for a future longitudinal study. In this study, we articulate the following specific aims: (1) to identify whether the perceived level of physical demands placed on older workers aged 55 or older is significantly associated with their cognitive function; and (2) to examine whether the association differs by age cohort. Based upon relatively limited published findings, we hypothesize that there is a significant negative association between the level of occupational physical demands and cognition among older workers. We also hypothesize that the association differs across age cohorts because different cohorts have experienced different historical, social, cultural, and political events (Glenn 2005). Because some domains of cognition begin to deteriorate in mid-life (Singh-Manoux et al. 2012), this study’s focus on workers aged 55 or over is expected to target those at greater risk of cognitive impairment in the near future.

Methods

Data

This study used the Health and Retirement Study (HRS), a publicly available secondary dataset that has biennially surveyed a nationally representative sample of more than 26,000 Americans aged 51 or over since 1992. HRS participants were recruited by a multistage area probability sample of households, using the 84 strata National Sample frame, available at the Survey Research Center. Funded by the National Institute on Aging and the Social Security Administration, the HRS provides diverse variables on health and labor market participation. We employed the 2010 wave, which was the first wave to include new measures of fluid cognitive ability (including the number series) and the latest wave for which the HRS imputed cognitive functioning variables were available at the time of our data analysis. Proper treatment of missing information in cognition measures is important because those data may not be missing completely at random (MCAR) or missing at random (MAR); rather, they may be not missing at random (NMAR), as missingness may relate to respondents’ level of cognitive functioning (e.g., respondents refused to answer a question due to their fear of answering incorrectly or not knowing the answer). The HRS provides imputed cognition data using sophisticated specifications, with information on demographics, health, financial status, and prior and current cognitive status (see Fisher et al. 2015 for full information on the imputation process and specifications). The inclusion criteria for this study were (1) employed full-time or part-time, or self-employed, (2) aged 55 or older, and (3) either White or Black. The total sample size for the present study was 4506.

Measures

Cognitive Functioning

Two cognitive domains of verbal episodic memory and reasoning were measured using total word recall summary and number series scores, respectively. As for the memory dimension, total word recall summary sums the immediate word recall and delayed word recall test scores (Fisher et al. 2015; McArdle et al. 2007). For the single-trial immediate word recall test, the interviewer selects one out of four possible sets and reads a list of 10 nouns in the set (e.g., lake, car, army, etc.) to the respondents and asks them to recall from the list as many words as possible in any order. In the delayed word recall test, the interviewer asks the respondents to recall the words presented in the immediate word recall test after approximately five minutes of asking other survey questions (e.g., questions on depressive symptoms). The total word recall score ranges from 0 to 20, reflecting the number of words correctly recalled. Higher scores indicate higher levels of cognitive functioning.

In terms of reasoning, the number series test measures reasoning ability that involves mathematical concepts. Respondents are given a series of numbers with a number missing from the series (e.g., 5, 8, 11, and a blank) and are asked to determine the numerical pattern and then to tell the missing number to the interviewer. The number series test is made up of two lists of 15 items. Each list is grouped into five sets of three items, which are further grouped by level of difficulty. All respondents initially get the same three items consisting of easier, moderately difficult, and more difficult items. The remaining four sets are given based on the respondents’ number of correct answers using adaptive testing methodology (Fisher et al. 2013). Total scores are standardized as a W-score, equivalent to the W-scores in the Woodcock-Johnson III (WJIII) test battery, from which this test originated (Jaffe 2009). Higher scores reflect better performance on the number series task.

The Level of Physical Demands

The variable of interest, perceived level of physical demands placed on older workers, was measured by the variable, “My job is physically demanding,” ranging from 1 (strongly disagree) to 4 (strongly agree). Therefore, higher values reflect greater physical demands.

Risk Factors for Age-Related Cognitive Deterioration

The presence of diabetes was captured by a binary variable based on the respondents’ self-report of a diabetes diagnosis (i.e., “Has a doctor ever told you that you have diabetes or high blood sugar?”). Level of depression was measured by the Center for Epidemiologic Studies Depression (CES-D) scale (Radloff 1977). The CES-D score sums eight indicators measured by yes/no responses about respondents’ feelings over the week prior to the interview. The total score ranges from 0 to 8, with higher scores indicating more depressive symptoms in the previous week. Alcohol consumption was calculated by the well-established quantity-frequency (QF) approach (Room 1990; Sobell and Sobell 2003): the average number of days per week of drinking within the previous three months (frequency) multiplied by the number of drinks consumed in the previous three months on the days when drinks were consumed (quantity). Cigarette smoking was captured by a binary variable indicating whether the respondents currently smoked cigarettes. Level of obesity was measured by body mass index (BMI), calculated as the respondents’ weight in kilograms divided by the square of their height in meters.
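The two computed risk-factor measures above reduce to simple arithmetic. The following minimal sketch illustrates the quantity-frequency index and BMI as described in the text; the function names and the example respondent are hypothetical and are not taken from the HRS files.

```python
def alcohol_qf(drinking_days_per_week, drinks_per_drinking_day):
    """Quantity-frequency index: frequency multiplied by quantity."""
    if drinking_days_per_week < 0 or drinks_per_drinking_day < 0:
        raise ValueError("inputs must be non-negative")
    return drinking_days_per_week * drinks_per_drinking_day


def body_mass_index(weight_kg, height_m):
    """BMI: weight in kilograms divided by squared height in meters."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical respondent (illustrative values, not HRS data).
qf_example = alcohol_qf(3, 2)             # 3 drinking days per week x 2 drinks
bmi_example = body_mass_index(85, 1.75)   # 85 kg at 1.75 m
```

For this hypothetical respondent the QF index is 6 and the BMI is roughly 27.8.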

Cognitively Stimulating Leisure Activities

Participation in cognitively stimulating leisure activities was included in the analytic model because such activities have been shown to protect against dementia or AD (Karp et al. 2006; Fritsch et al. 2005; Wang et al. 2002; Wilson et al. 2002; Verghese et al. 2003). To measure respondents’ number and frequency of participation in leisure activities requiring cognitive effort, we employed variables from the Psychosocial and Lifestyle Questionnaire, a subset of the HRS that has longitudinally collected information on lifestyle and subjective wellbeing from a randomly selected 50% of the HRS participants every four years since 2006 (Smith et al. 2013). This measure offers a way to capture current educational pursuits as opposed to completed years of school. We summed frequency ratings across eight types of cognitively challenging leisure activities: (a) attending an educational course, (b) going to a social club, (c) attending non-religious organizations, such as political or community groups, (d) reading books, magazines, or newspapers, (e) doing games such as crossword puzzles, (f) playing cards or chess, (g) writing letters or stories, and (h) using the computer. Each activity was coded as 1 (never), 2 (not in the last month), 3 (at least once a month), 4 (several times a month), 5 (once a week), 6 (several times a week), or 7 (daily). The analyses used the sum of these ratings, ranging from 8 to 56, with higher scores corresponding to more frequent participation.
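As an illustration of the scoring rule just described, the sketch below sums eight 1 to 7 frequency ratings into a single score. The activity keys are hypothetical labels chosen for readability and are not HRS variable names.

```python
# Hypothetical keys for the eight activity types (a) through (h).
ACTIVITIES = (
    "educational_course", "social_club", "non_religious_organization",
    "reading", "games", "cards_or_chess", "writing", "computer_use",
)


def leisure_score(ratings):
    """Sum the eight frequency ratings (1 = never ... 7 = daily)."""
    missing = [name for name in ACTIVITIES if name not in ratings]
    if missing:
        raise ValueError(f"missing ratings: {missing}")
    for name in ACTIVITIES:
        if ratings[name] not in range(1, 8):
            raise ValueError(f"{name} must be an integer from 1 to 7")
    return sum(ratings[name] for name in ACTIVITIES)


# Boundary cases: all "never" gives the minimum, all "daily" the maximum.
lowest = leisure_score({name: 1 for name in ACTIVITIES})
highest = leisure_score({name: 7 for name in ACTIVITIES})
```

The two boundary cases reproduce the stated range of the summed score, 8 to 56.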

Demographic Variables

Six demographic variables were included in the analysis. Age represented the respondents’ current age in years in 2010. Gender (male and female) and race (White/Caucasian and Black/African American) were included in the analysis as binary variables. Education was a continuous variable representing the respondent’s years of education, ranging from 0 to 17 years. Total household income was a continuous variable summing various sources, including job earnings, compensation, Social Security benefits, pension income, and capital income of family members living in the household. Current marital status was captured by a binary variable (0 = not married, 1 = married). The original HRS variable for marital status used eight categories: we recoded as 1 for those who were married, married but spouse absent, and separated; and as 2 for those who were partnered, divorced, separated/divorced, and widowed. Overall health was measured by self-rated health (1 = poor, 2 = fair, 3 = good, 4 = very good, 5 = excellent). Despite the controversy over its validity due to perceived subjectivity, self-rated health has been reported as a good indicator of a person’s medically determined health (Dwyer and Mitchell 1999; McGarry 2004; Munnell et al. 2008) and the best measure for determining the effects of health on work (Munnell et al. 2008).

Analysis

Main Analysis

We used generalized additive modeling (GAM), a method that is advantageous in modeling nonlinear relationships among selected variables. GAM is characterized as “a generalized linear model with a linear predictor involving a sum of smooth functions of covariates” (Wood 2017). GAM places no specific restrictions on the number of smoothed covariates or on the shape or functional form of their relationships with the outcome. In mathematical form, the model is expressed as Eq. (1):

$$ y_i = f(x_i) + \varepsilon_i, \qquad f(x_i) = \sum_{j=1}^{q} b_j(x_i)\,\beta_j \tag{1} $$

where q is the number of basis functions b_j needed to represent or approximate f(x_i), and the β_j are the coefficients to be estimated. As Eq. (1) shows, there is no restriction on the functional form of the b_j.

GAM provides a wide range of modeling options in terms of smooth functions and the degree of smoothness. We made an effort to find the most appropriate functional form and degree of smoothness, finding that what best fit our data was a cubic smoothing spline, generally known as an “optimal, or at least very good” smoothing method (Boor 1978), in which each piece is expressed as y(t) = at³ + bt² + ct + d. A key step in specifying the spline is choosing the points at which the cubic pieces join, usually called “knots” (Wood 2017). We also determined the best parameter for the degree of smoothness using an ordinary cross-validation method.
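To make the basis expansion in Eq. (1) and the idea of ordinary (leave-one-out) cross-validation concrete, the following is a minimal sketch on simulated data. It is not the estimation routine used in this study: as simplifying assumptions, it uses a truncated-power cubic basis with three fixed knots and a ridge-type penalty on the knot coefficients in place of a true cubic smoothing spline penalty, and the data, knot locations, and candidate smoothing parameters are all hypothetical.

```python
import math
import random

KNOTS = (0.25, 0.5, 0.75)  # assumed knot locations for the toy example


def basis(x, knots=KNOTS):
    """Return [b_1(x), ..., b_q(x)]: cubic polynomial plus one truncated cubic per knot."""
    return [1.0, x, x ** 2, x ** 3] + [max(x - k, 0.0) ** 3 for k in knots]


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][c] * solution[c] for c in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def fit(xs, ys, lam):
    """Penalized least squares: residual sum of squares + lam * squared knot coefficients."""
    rows = [basis(x) for x in xs]
    q = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(q)] for i in range(q)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(q)]
    for j in range(4, q):  # leave the plain cubic polynomial unpenalized
        xtx[j][j] += lam
    return solve(xtx, xty)


def predict(x, beta):
    """Evaluate f(x) as the sum over j of b_j(x) * beta_j, as in Eq. (1)."""
    return sum(b * coef for b, coef in zip(basis(x), beta))


def ocv(xs, ys, lam):
    """Ordinary cross-validation: mean squared error when each point is held out in turn."""
    errors = []
    for i in range(len(xs)):
        beta = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], lam)
        errors.append((ys[i] - predict(xs[i], beta)) ** 2)
    return sum(errors) / len(errors)


# Simulated data: a smooth curve plus noise (not HRS data).
rng = random.Random(0)
xs = [i / 39 for i in range(40)]
ys = [math.sin(2 * math.pi * x) + rng.gauss(0, 0.2) for x in xs]

candidates = [0.001, 0.1, 10.0, 1000.0]  # hypothetical smoothing parameters
scores = {lam: ocv(xs, ys, lam) for lam in candidates}
best_lam = min(scores, key=scores.get)
```

The smoothing parameter with the lowest held-out error is retained. In practice, GAM software handles the basis construction, penalty, and smoothness selection internally, so this sketch only illustrates the logic.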

Selecting Nonlinear Terms

Although fluid cognitive function, including memory and reasoning, tends to decline as people age (Park 2000; Salthouse 2012; Salthouse et al. 2003), its relationships with crucial demographic factors (e.g., education and income) may not be strictly linear. Violating the linearity assumption could contribute to the large portion of unexplained variability in dependent variables reported in social scientific research. Using a scatterplot matrix, we performed a series of analyses to check whether using GAM is appropriate for the study data and to detect nonlinear relationships between the outcomes and study predictors. We found that three variables showed significant and nonlinear relationships with the outcomes: age, education, and income. Given the series of diagnostic analyses indicating that the normality assumption was violated for the relationships between the outcomes and the three demographic variables, we confirmed that the use of GAM is suitable for our data and then modeled the three variables as nonlinear terms in the analytic models.

Cohort Analyses

To examine whether the association between physical demands and cognitive function differs by age cohort (Aim 2), we stratified the data into three age groups and ran GAM analyses on the groups separately. Based on the participants’ birth years, we created three age cohorts (ages given as of 2010): (a) the Baby Boomers (born 1948–1959; ages 62–51), (b) the War Babies (born 1942–1947; ages 68–63), and (c) a combined group of the HRS cohort (born 1931–1941; ages 79–69) and the Children of Depression (born 1924–1930; ages 86–80) (Health and Retirement Study 2010).
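The cohort stratification above can be expressed as a simple lookup on birth year. The sketch below uses the birth-year ranges stated in the text; the function names are hypothetical, and age is approximated as 2010 minus birth year, ignoring birth month.

```python
# Birth-year ranges as stated above; the third group combines two HRS cohorts.
COHORTS = (
    ("Baby Boomers", 1948, 1959),
    ("War Babies", 1942, 1947),
    ("Children of Depression & HRS cohort", 1924, 1941),
)


def cohort_for(birth_year):
    """Return the analysis cohort label, or None if outside all ranges."""
    for label, first_year, last_year in COHORTS:
        if first_year <= birth_year <= last_year:
            return label
    return None


def age_in_2010(birth_year):
    """Approximate age at the 2010 wave."""
    return 2010 - birth_year
```

For example, a respondent born in 1945 falls in the War Babies group and was about 65 in 2010.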

Supplemental Subgroup Analyses

We added supplemental subgroup analyses to test whether the association between physical demands and cognition was independent of the influence of participating in cognitively challenging leisure activities. A variable of cognitively stimulating leisure activities was added to the initial GAM models (n = 1898 for the total word recall; n = 1850 for the number series scores).

For all analyses, we used the HRS weighting variables (the strata, cluster, and individual weighting) to account for the multistage probability survey design (Heeringa and Connor 1995), individual nonresponses, and oversampled racial/ethnic minorities. This process adjusts the data to yield nationally representative estimates. For the supplemental subgroup analyses, we used weighting variables created especially for the HRS Psychosocial and Lifestyle Questionnaire. Since only a randomly chosen half of the HRS participants were asked to complete this questionnaire, the HRS offers separate respondent-level survey weights to adjust for the sample selection (Smith et al. 2013). All data were analyzed in R.

Results

Descriptive Analysis Results

Table 1 shows the descriptive results of the data. In terms of demographics, the average age of the participants was 59.09 (SD = 5.76), with a minimum of 55 and a maximum of 88. Females and Blacks made up approximately 52% and 23%, respectively, of the total sample of 4506. It is noteworthy that among those who identified themselves as White (or Caucasian), 11.3% also considered themselves Hispanic/Latino. Among those who defined their race as Black (or African American), 1.7% also considered themselves Hispanic/Latino. Thus, in total, about 10% of the sample identified as Hispanic/Latino. The majority of the participants were married (65%). The mean education level was 13.34 years (SD = 3.16), and the mean annual household income was around $95,000 (SD = $114,000).

Table 1 Descriptive Statistics (N = 4506)

As for the risk factors for age-related cognitive deterioration, participants’ mean overall health score was 3.49 out of five (SD = 0.98). The mean depression level was 1.12 out of 8. Around 16% of the participants had been diagnosed with diabetes, and 15% were currently cigarette smokers. The average level of alcohol consumption was 3.05 (SD = 6.26). The mean BMI was 28.92 (SD = 5.72), indicating that participants on average were overweight (Center for Disease Control and Prevention 2020, August 7). The mean score for perceived physical demands was 2.26 out of four (SD = 1.13). The mean of the total word recall was 10.82 (SD = 3.05), with a minimum of 0 and a maximum of 20. The mean of the number series was 506.10 (SD = 41.60), ranging from 390 to 579.

Aim 1: GAM Analysis Results

Total Word Recall Summary

Table 2 shows the results of the GAM analysis. The overall model fit (R²) was 0.18, which is relatively high considering that we used general population survey data. Also, four diagnostic plots for model fit examination, including a normal Q-Q plot and a histogram of residuals, show that the model fit was reasonable (figures not included). The results showed that women had better scores in the total word recall than men (B = 1.26, p < .001), and the coefficient for the race variable indicates that Blacks, compared to Whites, had worse cognitive function by more than one point (B = −1.12, p < .001). Married individuals showed higher scores on the total word recall summary than those not married (B = −1.12, p < .001). Overall health was significantly and positively related to the total word recall (B = −0.31, p < .001). While smoking and diabetes were not significant, drinking was marginally and negatively related to cognitive function (B = −.01, p = .06).

Table 2 GAM Analysis Results

All three variables chosen to be modeled nonlinearly (age, education, and income) were found to be significant, indicating that our approach using GAM successfully improved the model fit. Supplementary Fig. 1 depicts the relationships between the nonlinear term variables and the total word recall summary. The gray area in this figure represents the 95% confidence interval for the prediction.

When ruling out the effects of risk factors for cognitive decline, physical demands from work were significantly associated with individuals’ cognitive function (B = −.10, p = .01); those with more physically demanding jobs tended to have poorer cognitive function.

Number Series Scores

Table 2 shows that the model explained more variance in number series scores than in total word recall: R² increased to 0.31. The results for the nonlinear variables also justified our choice of GAM in analyzing the study model. Contrary to the results for total word recall, women had poorer scores in the number series than men (B = −6.55, p < .001). Blacks had poorer cognitive function compared to Whites in both outcomes, but it is notable that the racial difference was much larger for the number series (B = −25.01, p < .001). Also, while depression was not a significant predictor for the word recall summary, it was for the number series scores (B = −0.84, p = .01). Finally, marital status, overall health, and drinking were not significantly associated with the number series scores, unlike for the total word recall summary.

The nonlinear terms (age, education, and income) were significantly associated with the reasoning side of cognition, measured by the number series. Supplementary Fig. 2 illustrates the relationships between the nonlinear term variables and the number series. While the number series was linearly and negatively associated with age, its relationship with income was nonlinear, presenting an n-shape relationship. Its relationship with education was complicated; overall, there was a positive association, albeit presenting a nonlinear pattern among individuals with less than five years of education.

When the demographics and risk factors were ruled out, physical demands placed on older workers were still significantly related to the number series in a negative direction (B = −2.49, p < .001).

Supplemental Subgroup Analyses Results

We ran supplemental subgroup analyses to test whether the association between physical demands and cognition was independent of the influence of participating in cognitively stimulating leisure activities. First, we checked for a correlation between physical demands and cognitively stimulating activities and found a significant association (r = −.24, p < .001). Second, we ran the initial GAM analyses in a subgroup of individuals who answered questions about cognitively demanding leisure activities.

The results of the subgroup analyses remained the same as the main GAM findings. Results showed that cognitively challenging leisure activities were significantly linked with both outcomes (B = .07, p < .001 and B = .96, p < .001, respectively). Even after the influence of those leisure activities was ruled out, physical demands placed on older workers were still significantly associated with total word recall summary (B = −.15, p < .001) and number series scores (B = −2.03, p < .001). Thus, the subgroup analyses further supported the main findings from the GAM results.

Aim 2: GAM Results by Age Cohorts

We first checked descriptive statistics to examine whether perceived level of physical demands differs by age cohorts. Results showed that the means of physical demands were 2.11 for the Children of Depression & HRS cohort, 2.13 for War Babies, and 2.36 for Baby Boomers.

The one-way ANOVA results demonstrated statistically significant differences among the groups, suggesting that the Baby Boomers perceived a higher level of physical demands than the other two cohorts (F(2, 4760) = 20.88, MSE = 1.28, p < .001). Next, we ran the same GAM analyses conducted for Aim 1, separately by cohort. Table 3 shows that for the total word recall summary, when all covariates were controlled for, the association with physical demands was significant only for the War Babies (B = −.31, p < .001) and the Baby Boomers (B = −.09, p = .03), but not for the Children of Depression & HRS cohort (B = −.13, p = .13). For number series scores (Table 4), the association with physical demands was significant only for the Baby Boomer cohort (B = −3.15, p < .001), but not for the War Babies (B = −1.22, p = .33) or the Children of Depression & HRS cohort (B = −1.50, p = .21).
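For readers less familiar with the F statistic reported above, the following minimal sketch computes an unweighted one-way ANOVA by hand. The three small groups of 1 to 4 ratings are made up for illustration; the sketch does not reproduce the survey-weighted analysis or the values reported in this study.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within, MSE) for a list of numeric groups."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    mse = ss_within / df_within  # mean squared error (within-group variance)
    f_stat = (ss_between / df_between) / mse
    return f_stat, df_between, df_within, mse


# Made-up physical-demand ratings for three hypothetical cohorts.
toy_groups = [[1, 2, 3], [1, 2, 3], [2, 3, 4]]
f_stat, df_between, df_within, mse = one_way_anova(toy_groups)
```

With these toy ratings the between-group and within-group mean squares are both 1, so F(2, 6) = 1. A large F relative to its degrees of freedom, as in the result reported above, indicates that group means differ more than within-group variation would suggest.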

Table 3 Age Cohort Analyses for Total Word Recall Summary
Table 4 Age Cohort Analyses for Number Series Scores

Discussion

This study aimed to examine whether the level of physical demands placed on older workers aged 55 or older is significantly associated with their cognitive functioning, especially regarding two domains of cognition: verbal episodic memory and reasoning. When the risk factors for age-related cognitive deterioration and demographic variables were controlled for, this study found that the level of physical demands as perceived by older workers was still significantly and negatively linked with both the memory and reasoning dimensions of cognition. Perhaps this association can be explained by different levels of job control and intellectual engagement: more physically demanding jobs usually have low job control (e.g., a low level of independence, less freedom to make decisions, and a lower frequency of decision-making), which has been found to be detrimental to cognition in old age (Andel et al. 2015; Seidler et al. 2004; Wang et al. 2009). Similarly, less physically demanding jobs usually involve more intellectual engagement, which promotes cognitive function (Andel et al. 2011; Stern 2012). Given these findings, the health benefits of working in old age may not extend equally to all older workers; rather than describing these benefits as “one-size-fits-all,” research should delineate how different job characteristics may differently affect workers’ cognition.

The association between physical demands and cognition among older workers was also independent of the influence of participating in cognitively stimulating leisure activities, which have previously been shown to protect against dementia or AD (Karp et al. 2006; Fritsch et al. 2005). This finding implies that occupational experience does relate to cognitive decline above and beyond related leisure experiences, congruent with several studies (e.g., Andel et al. 2005; Karp et al. 2009; Schooler et al. 1999; Smyth et al. 2004; Qiu et al. 2003; Seidler et al. 2004; Potter et al. 2008). The results therefore support the cognitive reserve hypothesis; i.e., certain aspects of life experience provide either protective or detrimental influences on brain pathology (Mortimer et al. 2005; Scarmeas and Stern 2004). In addition, of the two dimensions of cognition used in this study, the reasoning domain appeared to be more strongly associated with the level of physical demands than the verbal episodic memory domain. This could be because reasoning is more sensitive to aspects of physically demanding jobs (e.g., a lack of cognitive stimulation) that influence cognitive function.

Cohort effects played a role in the relationship between physical demands and cognitive function. We found that among the three age cohorts, the relationship was significant only for relatively younger age cohorts. In terms of the memory domain of cognition, the association was significant only for the Baby Boomers (born 1948–1959; ages 62–51 in 2010) and War Baby cohorts (born 1942–1947; ages 68–63), and as for the reasoning dimension of cognition, only for the Baby Boomer cohort. This is perhaps because of selective survival or a healthy worker survivor effect; i.e., those who remain in the labor force are more likely to be healthier than those who have left (Arrighi and Hertz-Picciotto 1994). Given the ages of the eldest old age cohorts (69–86 years old), many of them would have already withdrawn from the strenuous industry either voluntarily (e.g., due to retirement) or involuntarily (e.g., due to health impairments). Research also demonstrates that blue-collar workers usually retire earlier than white-collar workers due to the physical toll that manual labor can take (Belbase et al. 2015). Another explanation for the observed cohort effects is that occupational characteristics may be less tied to late-life cognitive ability for the Children of Depression cohort, who faced especially challenging conditions related to war and the Great Depression (Glenn 2005). A third explanation is that sample size differences might have driven the different pattern of results across the birth cohort groups.

The findings presented in this study should be viewed in light of certain limitations. First, the cross-sectional design of this study limits inference about a causal relationship between the level of physical demand and cognitive function. Second, the findings may reflect reverse causation, that is, the tendency for people with lower IQ to have more physically demanding jobs. In other words, it is possible that the level of cognitive functioning influenced the magnitude of physical demands, rather than vice versa. Similarly, a potential healthy worker survivor effect may have limited the power to detect an association in this study, a concern in many studies on health and aging (Parker et al. 2013). Third, the generalizability of the findings may be limited because this study’s data included only Whites and Blacks (regardless of ethnicity). Fourth, the measure of the level of physical demands used in this study is potentially limited due to its single-item, self-reported nature. Finally, the results of this study may not reflect social, economic, and political changes that occurred after the time of data collection and that may have influenced the study outcomes and predictors (e.g., sharp decreases in life expectancy in the U.S. between 2014 and 2017, a public health emergency due to the opioid crisis in 2017, and the global pandemic in 2020).

Given the cross-sectional results from this study, future research needs to further examine the relationship between physical demands placed on older workers and their cognitive outcomes by taking a longitudinal approach. There is still a great deal to be learned about what specific types of jobs are beneficial and harmful to older workers’ cognitive function, which will advance theory and intervention in the field of aging.