“Mental load, also sometimes called emotional labor, is having lots of things on your mind. It’s having to remember to pick up eggs, to label your kid’s PE kit, to plan the Christmas shopping, to buy and make dinners for the week, to read the communications from school—the list goes on. Even if you ask someone else to buy eggs, it’s you then checking that the eggs were in fact bought. It’s essentially project management. And when it’s at work, that’s what we call it. Project management. Or just management. It’s a whole job. Yet when it’s at home, we call it, well, we don’t really have a word for it.”—Forbes.com (Carrell, 2019).

In the late 1980s, thanks in part to Hochschild’s (1989) influential work on women’s “second shift,” scholars began to explore the division of labor to understand why working women continue to do most of the family work. Much of this scholarly work identified the physical tasks associated with the household and child rearing (e.g., cooking, paying bills, etc.) (Mederer, 1993; Thompson, 1991). Though studying concrete, observable tasks is valuable, a May 2017 cartoon published in The Guardian depicted the wife as the household “project manager” who thinks about, knows, plans, organizes, and tells others what needs to be done and when. This generated popular interest in the invisible load required for household management and suggested that women bear a heavier load than men—and that doing so has significant costs for health and well-being (Desmond, 2017).

Academic attention to this topic has been scarce and fragmented. Hochschild (1989) defined “management” of domestic life as “remembering, planning, and scheduling domestic chores and events” (p. 276). In one of the first scientific investigations, Walzer (1996) qualitatively studied parents of newborns and identified a lot of “invisible, mental labor,” particularly for mothers, noting that such unseen responsibilities contributed to marital tension. Yet, as an indication of the historical (lack of) scholarly interest in this topic, this foundational paper has been cited only 204 times in the past 27 years. An academic literature search reveals only a handful of studies, most of which are qualitative studies in sociology, gender, and/or marriage/family disciplines. The ongoing popular discussion appears to have reinvigorated scholarly attention to the mental demands of family work given the recent uptick in publications (Ciciolla & Luthar, 2019; Daminger, 2019; Robertson et al., 2019).

The topic of invisible family load is ripe for investigation and is relevant for occupational health and management scholars. Not only is invisible family load likely to negatively affect health and well-being, thereby levying undue costs on organizations, but spillover effects (e.g., Kossek & Ozeki, 1998) suggest that the invisible load from one’s family may negatively affect job attitudes. Yet, as the opening quote suggests, the concept remains poorly understood. There is a lot to resolve, such as: “What should we call it? How do we define it? How do we measure it? Do men and women differ in their experience of it? What are its effects on health and well-being and at work?” As explained by Stone-Romero (1994), “tremendous amounts of time, effort, and other resources have been expended on research that has poor conceptual and methodological underpinnings” (p. 175). Given the resurgence of interest, this is an ideal time to set a conceptual and measurement foundation for work on the topic.

In Table 1, we summarize the construct labels, measures, methods, and results from prior studies, which highlights the variation across them in the labeling and meaning of this construct. For example, recent studies use different terms, such as “mental labor” (Robertson et al., 2019), the “cognitive dimension of household labor” (Daminger, 2019), and “invisible household…mental and emotional labor” (Ciciolla & Luthar, 2019), while the term “mental load” is more common in the popular press. There is also variation in the intended meaning of this construct. Although all extant definitions are multidimensional, they vary in the number and type of dimensions. The content of the dimensions includes worrying, information processing, managing, management of routines, child adjustment, finances, planning, anticipating, knowing, self-regulating, and meta-parenting (Ciciolla & Luthar, 2019; Daminger, 2019; Meier et al., 2006; Walzer, 1996). In short, the fragmented descriptions have resulted in confusing terminology and insufficient construct clarity (Robertson et al., 2019).

Table 1 Summary of prior studies and measures of invisible family load

The present research aims to define, conceptualize, and operationalize the phenomenon. In doing so, we make several contributions to the literature. First, we specify a common label and provide a comprehensive definition. Having a thorough understanding of a phenomenon—including what it does and does not include—is a prerequisite for measuring and studying a construct effectively (Cronbach, 1990). In Study 1, we build on definitions in the academic and popular literatures by undertaking a qualitative study in which individuals provide personal examples of the construct and its associated consequences. Study 1 addresses the research question, “What should we call this construct? And how should we define it?”.

Second, given that the research has been primarily qualitative, there is not a well-validated, widely agreed upon measure, particularly for use in the organizational literature (see Table 1 for a review of prior measures). Although there are a few scales (e.g., Ciciolla & Luthar, 2019; Lee & Waite, 2005; Mederer, 1993; Meier et al., 2006), they have limitations. One limitation is their task-specific nature. For example, the Mederer (1993) scale uses eight specific tasks such as making a grocery list, planning dinner, and making medical appointments. Similarly, the Lee and Waite (2005) scale assesses time spent thinking about eight household tasks such as washing dishes. The task-specific nature does not capture other types of planning, scheduling, or thinking people may do, and as such, may underestimate the true invisible family load. Rather than focusing on the particular context, we contend it is preferable to focus on the nature of the mental act itself for several reasons. First, it ensures items are broadly applicable across a range of family contexts. For example, planning birthday parties is more relevant for parents of younger children than of adolescents, whereas organizing transportation may be more relevant for the latter. Yet, both types of families must engage in planning and organizing, and it is likely that the load experienced by individuals is derived from the nature or frequency of the activity itself (e.g., planning or remembering) rather than the particular content remembered (e.g., soccer sign-ups or labeling clothes).

Aside from their narrow applicability, the task-specific nature of prior scales has resulted in lengthy measures. For example, the Meier et al. (2006) measure has 21 items for household/childcare management and 25 items for physical task completion, with 46 additional items for how much participants worry about each of these management and physical tasks. The scale length may contribute to participant fatigue and deter its use. Another limitation of some scales is their limited utility for some samples or settings. In the most recently developed measure, Ciciolla and Luthar (2019) used 13 total items focused on three dimensions: management of household routines, child adjustment, and financial decisions. Respondents are asked to rate on a three-point scale who does each task: (1) “Mostly me,” (2) “Mostly my spouse/partner,” or (3) “Both equally,” thus inherently assuming partnered participants. This limits the applicability for meaningful subsets of respondents, such as single parents. Whereas the relative distribution of invisible family load may be relevant to relationship outcomes, the absolute value may be most relevant to work, health, and well-being outcomes.

In addition, none of the existing measures have been subjected to rigorous scale development procedures, and none have been examined within the work context. Great care is needed to ensure that items capture the content universe (i.e., content validity), are reliable, and relate to other constructs as expected (convergent and discriminant validity, construct validity; Stone-Romero, 1994). Without such efforts, it is possible that extant measures do not capture the intended construct. To overcome these conceptual and practical limitations, we use best practice procedures to develop a psychometrically sound scale (Studies 2–4), ensuring that it is broadly applicable as well as reasonable in length for use in organizational research. In these studies, we address the research question: “How do we measure invisible family load?”.

The earliest work on division of labor suggests that it is gendered, with women being socialized to and generally taking on the bulk of family caregiving (Hochschild, 1989). Even as women’s participation in the labor force has increased, they still take on the heavier load in terms of time spent on family and household labor (Bureau of Labor Statistics, BLS, 2018). The popular literature prominently features “Mom as manager,” suggesting women take on more invisible family load than do men. This evidence is in line with social role theory (Eagly & Wood, 2016), which holds that women are socialized as caregivers whereas men are socialized as providers. As such, we empirically examine gender differences (Study 5) to address the research question “Do men and women differ in their experiences of invisible family load?”.

Finally, we address the question of whether invisible family load has implications for employee health and well-being, as well as family-to-work spillover (also in Study 5). Notably, prior academic work and the popular press have painted invisible family load as having uniformly negative outcomes. In light of our qualitative findings (Study 1) and theory on challenge and hindrance stressors (Cavanaugh et al., 2000) and work-family enrichment (e.g., Wayne et al., 2007), we introduce the possibility that it has costs and benefits. Through this comprehensive examination (refer to Table 2 for an overview of the purpose and sample used in each study), we establish a strong foundation for empirical inquiry.

Table 2 Overview of study progression, purpose, and samples

Labeling the Focal Construct

The existing discourse around this construct has not coalesced around a common label. We consolidate the scattered literature to offer a precise term that is well grounded in the literature: invisible family load. The term “invisible” reflects that these are unseen rather than observable physical tasks. Within existing research, most studies have used the term “mental labor” (see labels in Table 1). Despite its popularity, we did not use the term “mental” because this implies the construct is entirely cognitive, but as can be seen in Table 1 and elaborated shortly, other elements, such as planning and worrying, are behavioral and/or emotional rather than cognitive. The term family specifies the domain in which these activities occur and differentiates them from activities occurring at work, and is broader than “household,” reflecting that these activities sometimes pertain to adult children, elderly parents, or other family members living outside one’s household. Finally, we use the term load rather than labor. Dictionary definitions indicate that “labor” denotes expenditure of great effort and/or services performed for wages, whereas “load” offers a more inclusive terminology, encompassing whatever is put on a person, something that weighs down the mind or spirit, and is a burdensome responsibility but not for wages (https://www.merriam-webster.com/). Also, as seen in the opening quote, the popular press has equated “mental load” with “emotional labor” and some relevant academic literature also refers to “emotion work” or “emotional labor,” generally construed as worrying about family activities, events, schedules, and/or needs (Ciciolla & Luthar, 2019; Meier et al., 2006; Offer, 2014; Walzer, 1996). However, within the management literature, “emotional labor” more commonly refers to self-regulating such that one’s outward display of emotion is consistent with display rules (Grandey, 2000). Thus, another reason we use the term “load” is that it better differentiates the construct from other management constructs (i.e., emotional labor). For these reasons, we adopt the invisible family load terminology for the overarching focal construct.

Study 1: Defining the Focal Construct Through Triangulation of Sources

Our next step was to provide a comprehensive definition. Our initial goal was to cast a wide net to define the content universe and avoid construct deficiency (Hughes, 2018) by triangulating across academic work, popular press discussions, and a qualitative data collection. Several key findings are apparent from our review of the academic literature. Per Table 1, this construct has been defined or measured as including worrying, information processing, and management of division of labor (Walzer, 1996); management and worry of household and childcare (Meier et al., 2006); planning, organizing and allocating work, setting standards, and providing quality control (Warren, 2011); planning, organizing, and management of tasks as well as feelings and concerns that accompany them (Offer, 2014); management of household routine, child adjustment, and finances (Ciciolla & Luthar, 2019); organizing, anticipating needs, knowing, managerial thinking, self-regulating, and meta-parenting (Robertson et al., 2019); and anticipating, monitoring, and decision-making (Daminger, 2019). Though there are several recurring themes, there is no consensus regarding the key elements of this construct.

A review of popular press discussions revealed a sizeable literature; a Google search yielded 204,000,000 hits. The vast number made it unfeasible to conduct an exhaustive review, so we reviewed a sample (e.g., Oppenheimer, 2017; Owens, 2018; Stewart, 2017; Wade, 2016). Predominant themes are (i) coordinating/organizing/planning/scheduling, (ii) remembering/thinking about/keeping track/mental list, (iii) directing/delegating, and (iv) caring/worrying about family needs and responsibilities.

Though prior studies have qualitatively examined experiences of invisible family load, sample sizes have been small (fewer than 50 participants), with many focusing on narrow samples (e.g., working mothers of infants). We offer a broader account of laypeople’s experiences through a more inclusive data collection.

Method

We recruited 159 employed parents via social media and offered one of five $20 Amazon gift cards as incentives. The sample was mostly female (83%), Caucasian (93%), with an average age of 41 years, and was employed in a range of occupations. Participants reported working their paid job for an average of 39.44 h per week, and spending 29.87 weekly hours on childcare, and 14.92 weekly hours on household responsibilities. Participants were instructed to think about a typical week and share examples of invisible activities they did to care for their family. To generate potential correlates for future studies, we asked them to describe how these activities positively or negatively affected them personally, in their families, and at work.

Results

Of the 159 participants, 145 responded to the question describing their invisible family activities. Our overarching goal was to identify the nature and extent of activities to ensure they were reflected in the construct definition as well as for use in item development. The first author read participant responses and coded the raw data (key words) to ensure the full range of activities was captured in the content domain. When similar words were used (e.g., planning, organizing, and scheduling), they were listed separately so that coding was broad and comprehensive. This approach allowed key words to be placed together into larger categories later, if needed. Then, we conducted electronic searches of the dataset to capture the number of times each key word was used (see Table 3) to understand how characteristic each is in laypeople’s thinking about invisible family load. The activities that people reported most frequently are planning and scheduling. The next most frequently used words were ensuring, worrying, managing, coordinating, and thinking.
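The electronic key-word search described above can be illustrated with a minimal Python sketch. It assumes the responses are available as plain strings and that a key word is matched by its stem at the start of a word; the function name `count_key_words`, the stems, and the two example responses are hypothetical and are not Study 1 data.

```python
import re
from collections import Counter

def count_key_words(responses, stems):
    """Count how many times each stem begins a word across all responses."""
    counts = Counter()
    for stem in stems:
        # \b anchors the stem at a word start; \w* absorbs endings such as -ing or -ed.
        pattern = re.compile(r"\b" + re.escape(stem) + r"\w*", re.IGNORECASE)
        counts[stem] = sum(len(pattern.findall(text)) for text in responses)
    return counts

# Hypothetical responses, invented for illustration only.
responses = [
    "Planning meals and scheduling doctor appointments.",
    "I worry about homework; I plan the week and schedule rides.",
]
counts = count_key_words(responses, ["plan", "schedul", "worr"])
for stem, n in counts.most_common():
    print(stem, n)
```

Stem matching is one possible design choice; it groups variants such as "plan" and "planning" under one key word, whereas an exact-word search would count them separately.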

Table 3 Frequency of key words in Study 1 participant examples of invisible family load

Discussion

As can be seen in Tables 1 and 3, the way academics and laypeople define invisible family load encompasses many activities. The author team examined commonalities across sources, identified themes, and discussed and resolved discrepancies, resulting in identification of three conceptual factors. This process indicated that primarily, invisible family load refers to being responsible for “managing” the home and lives of one’s family members. The notion of management is captured repeatedly in the academic literature (e.g., Ciciolla & Luthar, 2019; Hochschild, 1989; Walzer, 1996; Table 1). Consistent with historical definitions of management (to plan, coordinate, organize, control, and oversee the activities of others and an organization’s many operations to achieve its goals; Griffin & van Fleet, 2013; Kaehler & Grundei, 2019), key terms in popular press and laypeople’s examples include “planning, organizing, coordinating, scheduling, supervising.” Thus, a primary theme is managerial family load.

Other tasks are cognitive such as thinking, remembering, and paying attention, which emerged in popular press as well as laypeople’s examples, and are represented in the “knowing” dimension identified by Robertson et al. (2019). Another theme that pertains to cognitive tasks is that of problem solving, gathering and processing information, and making decisions, present in Walzer’s (1996) information processing dimension. It is also represented in Daminger’s (2019) focus on “cognitive load,” which includes anticipating, gathering information, deciding, and monitoring. Finally, thinking, learning, remembering, attending to, and processing information are considered central cognitive capacities (American Psychological Association, 2019). As such, we identified a second primary theme as cognitive family load.

Finally, an emotional component pertains to worrying about one’s family’s needs, responsibilities, activities, and/or well-being. Walzer (1996) initially identified this as a significant component of invisible family load in newborn care, and it was later adopted by Meier et al. (2006). The popular press also prominently characterizes worrying and caring about “everything” as a key ingredient, and “worry/ing” was the fourth most frequently used word from lay respondents. Thus, the third primary theme was emotional family load.

Based on our triangulation process, we contend that invisible family load involves the managerial, cognitive, and emotional load activities to address family’s needs, goals, activities, responsibilities, and/or well-being.

  • Managerial family load includes managerial activities such as planning, organizing, directing, supervising, and delegating work.

  • Cognitive family load includes cognitive activities ranging from attention, memory, and thinking to anticipating, processing information, making decisions, and solving problems.

  • Emotional family load includes one’s worry or concern about meeting family needs, goals, activities, responsibilities, and/or well-being.

Study 2: Item Development and Content Validation

The resulting three-part conceptual definition was used by the authors to deductively generate initial items (Hinkin, 1998). We aimed to develop items that captured the broad activity (e.g., remembering, planning) rather than focusing on a particular context. Because an initial item pool should contain at least twice as many items as desired in the final scale (Hinkin, 1998) and we aimed for 3–5 final items per dimension, we developed 15 items for managerial, 18 for cognitive, and 11 for emotional load (see Table S1 in Supplemental Materials).

Per scale development guidelines (Hinkin, 1995; Schriesheim et al., 1993), we examined the content adequacy of the initial 44 items. We recruited 26 subject matter experts (SMEs), PhD candidates and faculty members in Organizational Behavior, all of whom were familiar with scale development and work-family content. In an online survey, SMEs were provided the above definitions, as well as a list of randomly ordered items. They were told to indicate which definition best corresponded to each item. A “not sure/none” response was given to avoid forced-choice categorization. We asked them to provide feedback on any item that could be improved, as well as on any content that was absent that should be included.

We used frequencies (see Table S1 in Supplemental Materials) to identify items that assessed the intended construct with an a priori agreement index of 75% (Hinkin, 1998). Based on this standard, we retained 13 of the 15 managerial, 12 of the 18 cognitive, and all 11 of the emotional items. Whereas SMEs clearly judged items pertaining to remembering, thinking, and paying attention as cognitive, they were divided in their judgments about information processing and decision-making items which they saw as relevant to cognitive and managerial dimensions. Information processing and decision-making activities were separated from the “managerial” dimension in the work of Walzer (1996) and were the exclusive focus of Daminger’s (2019) work on the “cognitive dimension” of household labor. Yet, our data indicate that the distinction between information processing/decision-making as cognitive vs. managerial may not be as clear as prior research has suggested. In retrospect, this is not surprising given that decision-making is a managerial activity. Though it was not clear which dimension these items most effectively tapped, there is sufficient evidence that information processing/decision-making is relevant to the content universe, so we retained seven items pertaining to decisions (C13-18, M14 in Table S1) for further empirical study, resulting in 43 items.
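The agreement index used for content validation can be sketched as the proportion of SMEs who assigned an item to its intended dimension. This is an illustrative sketch only: the function name `agreement_index` and the ratings are hypothetical, and treating the 75% standard as "at least 75%" is an assumption, since the text does not state whether the cutoff is inclusive.

```python
def agreement_index(sme_choices, intended):
    """Proportion of SME ratings that match the intended dimension."""
    if not sme_choices:
        raise ValueError("no ratings supplied")
    return sum(choice == intended for choice in sme_choices) / len(sme_choices)

THRESHOLD = 0.75  # a priori agreement index reported in the text

# Hypothetical ratings from 26 SMEs for one item intended as "cognitive".
ratings = ["cognitive"] * 21 + ["managerial"] * 3 + ["not sure/none"] * 2
index = agreement_index(ratings, "cognitive")
print(round(index, 3), index >= THRESHOLD)  # inclusive cutoff is an assumption
```

Under this rule, an item chosen as intended by 19 of 26 SMEs (about 73%) would fall below the standard.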

Study 3: Exploratory Factor Analyses

Next, we used exploratory factor analysis (EFA) to refine our scale (Hinkin, 1998). We recruited participants via Amazon’s Mechanical Turk (MTurk). We followed best practice recommendations to ensure high quality responding (Aguinis et al., 2020). To deter participant misrepresentation, we used screener tools (e.g., employed > 30 h/week) and presented participants with a variety of demographic questions so that the qualifications were not identifiable. As further data quality controls, only adult, US participants with a 95% approval rating who had successfully completed at least 1000 tasks were invited. To ensure effortful responding, we interspersed three attention check items (e.g., “Please leave this item blank”); anyone who missed one was omitted from analyses. Participants were paid $1.75 for the survey, which included other scale development items. We ensured this was a fair price for the time required in order to motivate their attention. Finally, all scale points were labeled to improve attention. After applying these criteria, the sample comprised 209 participants (39% female, 58% Caucasian, 73% married, and 67% with at least one child at home) with an average age of 35 years.

Using the 43 items from Study 2, we instructed participants to think about the past month and indicate the extent to which they agreed they had done each on a 5-point scale (1 = strongly disagree to 5 = strongly agree). Principal axis factor analysis with promax rotation was used. Relying on the solution obtained by specifying eigenvalues greater than 1, the scree plot, and a comparison of the actual eigenvalues with the random data eigenvalues (i.e., parallel analysis), we determined a 4-factor structure, which explained 61.23% of the variance. For each dimension, the 7–8 items with the highest factor loadings, greater than 0.40, with minimal cross-loadings were identified (Hinkin, 1998). As reported in Table S2 in Supplemental Materials, items loading on the first factor captured managerial family load such as scheduling, coordinating, and planning. The second factor captured cognitive family load pertaining to attending to, processing, thinking about and keeping mental “to do” lists. The third factor contained emotional family load items. The fourth factor contained four items about directing, telling, or delegating “to others.” Given there were only four items and their content did not clearly map onto a unique, previously conceptualized factor, these items were not retained. Of note, six of the decision items (C13-18, Table S2) cross-loaded on several factors. Because they did not meet our criterion for content validation in Study 2 or have clear factor loadings in Study 3, they were dropped. Based on our criteria, we retained 7 managerial, 8 cognitive, and 7 emotional items, for a total of 22 items for the next phase.
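The loading-based screening step can be sketched as a simple filter over a factor loading table. The sketch below is not the authors' procedure: it covers only the threshold step (not the selection of the 7–8 highest-loading items), the name `retain_items` and the loadings are hypothetical, and the cross-loading cutoff of 0.30 is an assumption because the text says only that cross-loadings were minimal.

```python
def retain_items(loadings, primary_min=0.40, cross_max=0.30):
    """Return {item: factor} for items with one clear primary loading.

    loadings maps item -> {factor: loading}. primary_min follows the 0.40
    criterion in the text; cross_max is an assumed cutoff for illustration.
    """
    kept = {}
    for item, by_factor in loadings.items():
        ranked = sorted(by_factor.items(), key=lambda kv: abs(kv[1]), reverse=True)
        (top_factor, top), others = ranked[0], ranked[1:]
        if abs(top) > primary_min and all(abs(v) < cross_max for _, v in others):
            kept[item] = top_factor
    return kept

# Hypothetical loadings, invented for illustration only.
loadings = {
    "item_a": {"managerial": 0.72, "cognitive": 0.10, "emotional": 0.05},
    "item_b": {"managerial": 0.45, "cognitive": 0.41, "emotional": 0.02},
    "item_c": {"managerial": 0.08, "cognitive": 0.35, "emotional": 0.12},
}
kept = retain_items(loadings)
print(kept)
```

Here only the first item survives: the second cross-loads on two factors and the third has no loading above 0.40.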

We reran the EFA with only these 22 items, which revealed a clean, three-factor structure (see Table 4). Descriptive statistics and internal consistency reliabilities for the retained factors are: managerial load (M = 3.86, SD = 0.67, α = 0.83), cognitive load (M = 3.95, SD = 0.68, α = 0.87), and emotional load (M = 3.18, SD = 1.15, α = 0.95). Of note, for two of the three factors, the means were above the midpoint (nearing 4 out of 5), and standard deviations were small (less than 1). As such, we considered whether changing from a 5-point to a 7-point scale might be more psychometrically appropriate in terms of increasing variance and curbing potential ceiling effects. Further, a frequency-based scale better fit the construct in terms of how often people engage in aspects of invisible family load. Based on emerging work examining underlying psychometric aspects of scale development (e.g., Tourangeau et al., 2020), next, we examined how features of the response scale affected scale properties.
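For readers less familiar with the internal consistency coefficient reported above, Cronbach's α for k items is k/(k − 1) × (1 − Σ item variances / variance of the total score). A minimal sketch follows; the function name `cronbach_alpha` and the five hypothetical respondents are illustrative only and are unrelated to the study data.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's summed score.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical 5-point responses from five respondents to three items.
rows = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]
alpha = cronbach_alpha(rows)
print(round(alpha, 2))
```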

Table 4 Study 3 exploratory factor analysis factor structure matrix based on the selected 22 items

Study 4: Response Scale Examination

Using the 22 items identified in Study 3, we empirically examined how seven response formats affected scale properties including descriptive statistics, internal consistency reliability, EFA results, and relations to other theoretically relevant variables. As elaborated further in Study 5, drawing from the challenge-hindrance model of stress (Cavanaugh et al., 2000), we expected that emotional and cognitive family load act as hindrance stressors, and as such, relate to potential outcomes such as negative spillover and poorer health and well-being. In contrast, we expected that managerial family load would act as a challenge stressor, and as such, offer a sense of fulfillment, fostering positive spillover, health and well-being. Our goal was to determine the format that yielded strongest reliability and validity evidence (proposed three-factor EFA, greatest number of predicted relations) as well as demonstrated sufficient variance (greater than 1) without ceiling effects (means within 1.5 points of the midpoint, or in the 2.5–5.5 range out of 7). After identifying the preferred format, we used EFA to refine to three items per dimension.
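The variance and ceiling criteria stated above can be expressed as a simple check on a set of subscale scores. This is a sketch under stated assumptions: the function name `meets_format_criteria` and the scores are hypothetical, the sample variance is used, and the 2.5–5.5 bounds are treated as inclusive, none of which is specified in the text.

```python
from statistics import mean, variance

def meets_format_criteria(scores, low=2.5, high=5.5, min_variance=1.0):
    """Check 7-point subscale scores against the variance and ceiling criteria."""
    m = mean(scores)
    v = variance(scores)  # sample variance; an assumption for illustration
    return (low <= m <= high) and (v > min_variance)

# Hypothetical subscale scores, invented for illustration only.
spread_out = [2, 3, 4, 5, 6, 4, 3, 5]   # mean 4.0, variance about 1.71
near_ceiling = [6, 7, 7, 6, 7, 6, 7, 7]  # mean 6.625
print(meets_format_criteria(spread_out), meets_format_criteria(near_ceiling))
```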

Method

All participants were told to consider the past month when responding, and all items used a 7-point response scale. Participants were randomly assigned to one of seven format conditions. As a baseline comparison, we used a typical agreement scale like that used in Study 3 (1-strongly disagree to 7-strongly agree; version 1, agreement). The remaining scales used frequency responses. In version 2, participants were asked: “In the past month, how often did you do each of the following to manage your family’s needs, goals, activities, responsibilities, and/or well-being?” with a traditional frequency scale from 1-never to 7-always (version 2, word frequency). Version 3 asked participants to report how many days they had engaged in each item (1: 0–2 days; 2: 3–7 days; 3: 8–12 days; 4: 13–18 days; 5: 19–23 days; 6: 24–28 days; 7: 29–31 days) (version 3, days).

Version 4 asked people to rate how much of their total time they had spent doing each activity, with responses based on percentage of total time with word frequency (1: Never/0–9%; 2: Rarely/10–24%; 3: Occasionally/25–39%; 4: Sometimes/40–59%; 5: Frequently/60–74%; 6: Usually/75–89%; 7: Always/90–100%) (version 4, total time, percent and word). Version 7 asked people to rate how much of their total time they had spent doing each, with responses based on percentage of total time (1: 0–9%; 2: 10–24%; 3: 25–39%; 4: 40–59%; 5: 60–74%; 6: 75–89%; 7: 90–100%) (version 7, total time, percent).

In versions 5 and 6, people were instructed: “Thinking about the past month, of the time you spent attending to family needs and responsibilities, how much of your family time did you spend doing each of the following?” In version 5, response options were based on percentage of family time and word frequency (1: Never/0–9%; 2: Rarely/10–24%; 3: Occasionally/25–39%; 4: Sometimes/40–59%; 5: Frequently/60–74%; 6: Usually/75–89%; 7: Always/90–100%) (version 5, family time, percent, and word). In version 6, response options were based on percent of family time (1: 0–9%; 2: 10–24%; 3: 25–39%; 4: 40–59%; 5: 60–74%; 6: 75–89%; 7: 90–100%) (version 6, family time, percent). In all conditions using percentages, participants were told that when considered as a whole, their responses did not need to total 100% and/or could exceed 100%.
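To make the banded response formats concrete, the day bands of version 3 can be written as a lookup from a number of days to the 1–7 response code. The sketch is illustrative only: participants chose a band directly rather than reporting a day count, and the names `DAY_BANDS` and `days_to_response` are hypothetical.

```python
# Day bands for the version 3 (days) format, as listed in the text.
DAY_BANDS = [(0, 2), (3, 7), (8, 12), (13, 18), (19, 23), (24, 28), (29, 31)]

def days_to_response(days):
    """Map a number of days in the past month (0-31) to the 1-7 response code."""
    for code, (lo, hi) in enumerate(DAY_BANDS, start=1):
        if lo <= days <= hi:
            return code
    raise ValueError("days must be an integer between 0 and 31")

print([days_to_response(d) for d in (0, 7, 13, 31)])
```

The bands are contiguous, so every whole number of days from 0 to 31 falls in exactly one category; note that the bands are not of equal width.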

Participants

Participants who were working at least 31 h per week and were married or living with a partner were recruited on Prolific using the same quality control measures previously described. Our final sample consisted of 201 participants in Version 1, 204 in Version 2, 200 in Version 3, 196 in Version 4, 198 in Version 5, 189 in Version 6, and 203 in Version 7. Across all versions (N = 1391), 49.5% of respondents were female and 76.2% were Caucasian; the average age was 37.36 years, and participants worked an average of 42.64 h per week.

Measures

Response anchors differed for the invisible family load items, as above. All other constructs were rated on a 7-point agreement scale.

Invisible Family Load

Based on Study 3 findings and as reported in Table 4, we used 7 managerial, 8 cognitive, and 7 emotional load items, for a total of 22 items.

Interrole Spillover

Work-to-family conflict (WFC; α = 0.96 across all versions) and family-to-work conflict (FWC; α = 0.93 across all versions) were each measured with the 5 items by Netemeyer et al. (1996). Family-work enrichment was measured with the 3 items from Kacmar et al.’s (2014) shortened version of Carlson et al.’s (2006) enrichment scale, such as “My involvement in my family puts me in a good mood and this helps me be a better worker” (α = 0.91 across all versions).

Global Work-Nonwork Balance

We used four items from the Wayne et al. (2021) scale, such as “Overall, my work and nonwork roles fit together” (α = 0.94 across all versions).

Health. Fatigue

Physical fatigue was assessed with a single item adapted from Frone and Tidwell (2015): “Physical fatigue at the end of the day; that is extreme physical tiredness and/or an inability to engage in physical activity.” Mental fatigue was assessed with the single item: “Mental fatigue at the end of the day; that is extreme mental tiredness and/or an inability to think or concentrate.” Emotional fatigue was assessed with the single item: “Emotional fatigue at the end of the day; that is extreme emotional tiredness and/or an inability to feel or show emotions.”

Well-being

Family/personal satisfaction (α = 0.94 across all versions) was measured with three items adapted from Cammann et al.’s (1983) scale such as “All in all, I am satisfied with my family/personal life.” Life satisfaction (α = 0.93 across all versions) was measured with Diener’s (1994) five items such as, “In most ways, my life is close to my ideal.”
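The internal consistency values (α) reported for these scales are Cronbach's alpha coefficients. As a minimal sketch of how such a coefficient is computed, the example below uses a small set of hypothetical responses (not study data); the function name cronbach_alpha and the demo values are assumptions introduced for illustration only.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k items.

    items: list of k lists, each holding one item's scores for the
    same respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Sum of the individual item variances (sample variance, n - 1 denominator).
    item_var_sum = sum(variance(scores) for scores in items)
    # Variance of each respondent's total score across the k items.
    totals = [sum(resp) for resp in zip(*items)]
    total_var = variance(totals)
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical responses from five people to three 7-point items.
demo_items = [
    [6, 5, 7, 3, 4],
    [6, 4, 7, 2, 5],
    [5, 5, 6, 3, 4],
]
alpha = cronbach_alpha(demo_items)
print(round(alpha, 2))
```

Alpha approaches 1 as the items covary more strongly relative to their individual variances, which is why short scales with highly similar items can still reach values above 0.90.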

Results

Descriptive Statistics by Response Format

Table S3 in the Supplemental materials presents means and standard deviations by response format. Version 1 (the agreement scale) shows the highest mean values while versions 6 and 7 (percentage of family and total time) show the lowest mean values for all three factors. Version 1 appears to capture the least variance while versions 3, 6, and 7 capture the greatest variance in all factors. Based on our retention criteria, consistent with Study 3, version 1 neither provides sufficient variance nor has means near the midpoint of the scale and, as such, is not the preferred format.

Exploratory Factor Analysis by Response Format

We conducted EFAs to identify the number of factors extracted using each format. In line with recommendations (e.g., Hinkin, 1998), we conducted a principal axis EFA with promax rotation on all items for each format (see results in Tables S4–S10 in Supplemental Materials). Commonly used criteria were adopted to evaluate the optimal factor structure: the scree test and eigenvalues > 1 (Hayton et al., 2004; Velicer et al., 2000; Yong & Pearce, 2013). Six versions (1–5, 7) had a three-factor structure with at least 70% of the variance explained, consistent with our conceptualization. The two-factor structure suggested by version 6 does not align with our theoretical understanding; hence, we do not recommend the format used in version 6.

Criterion-Related Validity by Response Format

We next examined whether each response format evidenced criterion-related validity, that is, whether each invisible family load factor correlated with theoretically relevant variables in the expected direction. We calculated the managerial, cognitive, and emotional load factors from the items identified as pertaining to those factors in Study 3. As shown in Table S11, our hypothesized negative relations of emotional load with potential outcomes were largely supported across all formats. In almost all versions, emotional family load was positively related to WFC, FWC, and fatigue (physical, mental, and emotional) and negatively related to WFE, WFB, life satisfaction, and family satisfaction. As expected, managerial load was positively associated with enrichment, balance, and family and life satisfaction across most formats, albeit to varying degrees. Notably, though not in the expected direction, cognitive load generally had significant, positive relations to these potential outcomes.

Two formats (versions 4 and 7, the “total time” formats) exhibited a pattern that differed both from expectations and from the other five formats: cognitive and managerial load were positively associated with fatigue and, contrary to expectations, were not associated with enrichment, balance, and well-being. These findings indicate that this response format yields qualitatively different relations and that people responded to the “total time” ratings in a distinctly different way than to the other five formats. Based on this, we do not recommend use of these “total time” response formats (versions 4 and 7). Of the remaining versions (2, 3, and 5), those that most consistently exhibited significant relationships were versions 2 (17 significant relationships) and 3 (15 significant relationships), whereas version 5 had fewer (10) significant relationships.

Summary Evaluation and Recommendation

Based on descriptive statistics, the three-factor structure revealed by EFA results, and criterion-related validity, evidence supports the numerical frequency scale (version 2) or the number of days (version 3) as preferred formats. Version 2 (never to always) is a more widely used scale format and is therefore likely to be more amenable to researchers than the days-per-month format used in version 3. Therefore, we selected the frequency response scale of 1-never to 7-always. We then used the EFA results for version 2 (see Table S5 in Supplemental Materials) to identify, for each factor, the three items with the highest factor loadings that did not cross-load on any other factor. We then conducted EFA with only the final 9 items and results are displayed in Table 5.
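The item-reduction rule described above (retain, for each factor, the three highest-loading items that do not cross-load) can be sketched as follows. The loadings shown are hypothetical, the function and item names are invented for illustration, and the 0.30 cross-loading cutoff is an assumption; the source does not report the threshold that was applied.

```python
def select_items(loadings, per_factor=3, cross_cutoff=0.30):
    """Pick the top-loading items per factor that show no cross-loading.

    loadings: dict mapping item name -> list of loadings (one per factor).
    cross_cutoff is an assumed threshold; the source does not report one.
    """
    n_factors = len(next(iter(loadings.values())))
    selected = {}
    for f in range(n_factors):
        candidates = []
        for item, row in loadings.items():
            # Largest absolute loading on any factor other than f.
            cross = max(abs(v) for j, v in enumerate(row) if j != f)
            # Keep items whose strongest loading is on f and that stay clean elsewhere.
            if abs(row[f]) > cross and cross < cross_cutoff:
                candidates.append((abs(row[f]), item))
        candidates.sort(reverse=True)
        selected[f] = [item for _, item in candidates[:per_factor]]
    return selected


# Hypothetical pattern matrix (columns: managerial, cognitive, emotional).
demo_loadings = {
    "man_1": [0.85, 0.05, 0.02],
    "man_2": [0.80, 0.10, 0.01],
    "man_3": [0.78, 0.35, 0.03],  # cross-loads on the cognitive factor
    "man_4": [0.70, 0.08, 0.04],
    "cog_1": [0.10, 0.82, 0.03],
    "cog_2": [0.04, 0.76, 0.12],
    "cog_3": [0.06, 0.71, 0.02],
    "emo_1": [0.01, 0.03, 0.90],
    "emo_2": [0.05, 0.02, 0.88],
    "emo_3": [0.02, 0.09, 0.79],
}
final_items = select_items(demo_loadings)
print(final_items)
```

In this illustration the third managerial item is skipped despite its high primary loading, because its secondary loading exceeds the assumed cutoff, and the next cleanest item takes its place.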

Table 5 Exploratory factor analysis results based on the final 9 items selected from version 2 in Study 4

Study 5: Construct Validation and Hypothesis Testing

Following best practices (e.g., Hinkin, 1998), our final study aimed to (i) confirm the three-factor structure obtained in Study 4 in an independent sample, (ii) examine whether the factor structure is invariant by gender, (iii) demonstrate convergent validity with other invisible family load measures, and (iv) examine discriminant validity from related constructs (i.e., personality, coping). Our fifth aim was to demonstrate construct validity by ensuring that our measure aligns with theoretical predictions regarding gender differences and how dimensions relate to potential correlates such as interrole spillover, health, well-being and performance (Cronbach & Meehl, 1955). Finally, we also wanted to ensure that our scale brings unique empirical value. To do this, we examined its predictive, incremental validity in relation to potential outcomes, above and beyond related constructs (i.e., personality) and also beyond existing measures of invisible family load (i.e., Ciciolla & Luthar, 2019; Meier et al., 2006).

Proposed Gender Differences

Ample popular press articles suggest that the invisible family load “typically falls on women’s shoulders” (e.g., Gonsalves, 2022). This is consistent with early work which described the division of household labor as gendered (Hochschild, 1989) and with social role theory (Eagly & Wood, 2016) which suggests that men and women are distributed into different roles in society. Social roles are organized, and women are socialized, such that they are more likely than men to engage in caregiving roles in the family and in the workforce. Men are socialized and expected to be providers and engage in physical or strength tasks at home and/or leadership roles at work. Through socialization and gender stereotypes, men and women engage in behaviors that support and sustain the gendered division of labor. This implies that women take on the bulk of the household work, whether it be visible or invisible. Consistent with this theoretical explanation, evidence indicates that women spend more time than men on family and household labor (BLS, 2018). Social role theory, statistical evidence, and popular literature suggest that:

  • Hypothesis 1: Women experience greater invisible family load than do men.

Convergent and Discriminant Validity

Though prior invisible family load measures have their limitations and we expect that our newly developed measure has unique value, we do expect our measure to converge with extant measures of the same construct. A scale by Meier et al. (2006) assesses invisible family load pertaining to “household and childcare management” with items that include cognitive (“think about…”) and managerial (“planning…”) tasks in each dimension. Then, they use these same items to ask people to report how much they worried about each of those same tasks, somewhat analogous to our emotional load factor. The most recently developed scale by Ciciolla and Luthar (2019) comprises three factors: “management” (e.g., organizing schedules), “responsibility for childhood adjustment” (e.g., knowing teachers), and “management of financial affairs,” with the first two reflecting some elements of our managerial and cognitive dimensions. Because no existing measure captures the construct in its entirety as we define it, we believe these extant measures are likely moderately related to, but distinct from, our measure.

  • Hypothesis 2: Managerial, cognitive, and emotional family load factors exhibit moderately strong correlations with corresponding parts of prior measures.

Managerial and emotional family load, as activities that pertain to planning and worrying, are likely related to personality characteristics and coping strategies. In particular, conscientiousness reflects one’s tendency to be planful, organized, and responsible (Goldberg, 1990). Coping by planning is a problem-focused coping strategy that involves “coming up with action strategies, thinking about what steps to take and how best to handle the problem” (Carver et al., 1989, p. 286). Given the conceptual overlap between personality and coping with our constructs, which involve planning, organizing and being responsible for family tasks, it is important to demonstrate that our measures are not merely attributable to personality and/or coping. Likewise, neuroticism captures one’s tendency to be tense, anxious, and worried (Goldberg, 1990); as these are elements related to emotional family load, we aim to demonstrate that emotional load is related, but not merely attributable, to one’s neuroticism.

  • Hypothesis 3: (a) Managerial and cognitive load positively correlate, but do not demonstrate redundancy, with conscientiousness and coping by planning, and (b) Emotional load positively correlates, but does not demonstrate redundancy, with neuroticism.

Theorized Relations to Health, Performance, and Well-being

Popular press (e.g., Gonsalves, 2022) and empirical work (e.g., Ciciolla & Luthar, 2019) describe the “worry work” and “mental burden” of being perpetually aware of family needs as an exhausting type of labor that saps time and energy, and due to its invisible nature, is often not acknowledged. This negative view is consistent with the stressor-strain perspective, suggesting that role demands or stressors, such as doing cognitive, emotional, and managerial family load activities, are uniformly negative, resulting in negative outcomes (Lazarus & Folkman, 1984). Yet, our qualitative findings present a more nuanced perspective in that people described positive and negative consequences. This possibility is consistent with the challenge and hindrance model of stress (Cavanaugh et al., 2000) and the scarcity (Goode, 1960) and enrichment perspectives of multiple roles (Wayne et al., 2007), which we use to guide our hypotheses.

As discussed by Cavanaugh et al. (2000), hindrance stressors are those that are overwhelming and create unpleasant feelings of discomfort and hinder achievement of valued goals, causing distress, and ultimately, negative outcomes. Examples include role ambiguity, role conflict, and job insecurity. In our qualitative data collection (Study 1), comments indicated that having to worry, think about, or remember everything (i.e., emotional and cognitive load) was particularly stressful and depleting. These data tentatively suggest that emotional and cognitive family load are hindrance stressors, creating strain which is harmful to health, well-being, and performance. Similarly, the scarcity view of multiple roles suggests that given that emotional and cognitive load originate in the family domain, they are hindrance stressors that may take time, energy, and attention away from one’s work, contributing to greater family-to-work conflict (FWC, Greenhaus & Beutell, 1985).

  • Hypothesis 4: Emotional and cognitive family load are associated with greater family-to-work conflict and poorer health (i.e., exhaustion, sleep problems, and alcohol use), well-being (family and life satisfaction), and performance.

In contrast, certain role demands, though stressful, are viewed as rewarding and fulfilling. Challenge stressors are those responsible for growth and mastery, create feelings of fulfillment, and lead to goal attainment, creating eustress; this positive motivating force yields positive outcomes (Cavanaugh et al., 2000). Examples include overload, time pressures, and high levels of responsibility. Some Study 1 participants noted positive consequences, such as feeling valuable or important (e.g., “I like being the leader in my family”) or finding meaning and purpose through taking on managerial activities in the family. Thus, we contend that managerial load, with high levels of responsibility for the family, functions as a challenge stressor and is a positive motivating force, generating positive outcomes including better health, well-being, and performance. Further, the enrichment perspective suggests that managerial load as a challenge stressor originating in the family can create a sense of fulfillment that spills over to and benefits work (i.e., family-to-work enrichment; FWE, Wayne et al., 2007).

  • Hypothesis 5: Managerial load is associated with greater family-to-work enrichment and better health (i.e., less exhaustion, sleep problems, and alcohol use), well-being (family and life satisfaction), and performance.

Incremental Validity Relative to Personality and Prior Invisible Family Load Measures

Though we aimed to ensure that our invisible family load overlapped, but was not redundant, with extant invisible family load measures and personality, we also aimed to show that our scale offers unique empirical value to potential outcomes (interrole spillover, health, well-being, performance), above and beyond these measures. That is, we expect that our measure of invisible family load is capturing something distinct from, and therefore, will explain variance in potential outcomes not explained by the personality traits of conscientiousness and neuroticism. Further, because our scale encompasses a broader construct domain (managerial, cognitive, and emotional), we contend that our measure will account for unique variance in the prediction of potential outcomes not explained by prior measures.

  • Hypothesis 6: Our measure of invisible family load explains unique variance to the prediction of interrole spillover, health, well-being, and performance, above and beyond (a) traits of conscientiousness and neuroticism and (b) existing measures of invisible family load.

Method

Participants

The Prolific platform was again used. Respondents were recruited if they were working at least 31 h per week, were 18 to 65 years of age, and had an approval rating of 100% with a minimum of 20 submissions. Respondents in Study 4 were excluded from Study 5 through pre-existing filters. Among the 451 eligible participants, 3 were omitted because they missed more than one of three attention check items. Among the final sample of 448 respondents, 41.3% were female, 76.3% were Caucasian, and 54.7% had children; their average age was 37.68 years, and they worked an average of 41.00 h per week. A variety of industries were represented, with the most common being education, training, and library occupations (10.3%), computer and mathematical occupations (9.4%), and business and financial operations (7.8%).

Measures

Invisible Family Load

We used the nine invisible family load items, reported in Table 5, measured on a scale ranging from 1 = never to 7 = always, based on the past month.

Other Measures of Invisible Family Load

Meier et al. (2006) measured household management (11 items; e.g., “plans meals; makes medical appointments”, α = 0.85) and childcare management (10 items; e.g., “managing child’s social activities; thinking about solving problems with childcare provider”, α = 0.96). They also included two scales measuring physical household tasks and physical childcare tasks, and they asked respondents to what extent they worried about all 46 tasks in the original scale. To avoid fatigue, we used four items to rate their worry about each dimension: “How often did you WORRY about managing the household (e.g., making grocery lists, planning meals, assigning chores, making money decisions, planning family events)? How often did you WORRY about managing childcare (e.g., deciding medical care and scheduling appointments, social activities, preparing things, child development, solving childcare problems)? How often did you WORRY about physical household tasks (e.g., laundry, grocery shopping, cleaning, yard work, pet care, etc.)? How often did you WORRY about physical childcare tasks (e.g., dressing, bathing, transportation, discipline, shopping)?” We averaged these items to create a “total worry” scale (α = 0.87). We used two factors from the Ciciolla and Luthar (2019) scale: “household routine management” (α = 0.89), comprising four items such as “organizing schedules for the family; being captain of the ship”, and “responsibility for childhood adjustment” (α = 0.90), comprising four items such as “knowing the children’s school teachers or administrators”.

Personality and Coping

Personality was measured with the adjective-based approach (Goldberg, 1990). Participants rated the extent to which each word was characteristic of them (1 = not at all to 5 = extremely characteristic of me). Conscientiousness was measured with five items such as “organized” and “disciplined” (α = 0.79). Neuroticism was measured with five items such as “stressed” and “anxious” (α = 0.90). Coping by planning (α = 0.81) was measured with four items from Carver et al. (1989), including “When coping with stress, I make a plan of action.”

Theorized Correlates

All responses were rated on a 7-point scale (1 = strongly disagree to 7 = strongly agree), unless otherwise noted.

Interrole Spillover

Family-work conflict (α = 0.93) was measured with Netemeyer et al.’s (1996) scale. Family-work enrichment (α = 0.84) was measured with the 3 items by Kacmar et al. (2014) as in Study 4.

Well-being

Family satisfaction (α = 0.93) was measured with items adapted from Cammann et al.’s (1983) scale consisting of three items such as “All in all, I am satisfied with my family.” Life satisfaction (α = 0.94) was measured with Diener’s (1994) assessment of five items such as, “In most ways, my life is close to my ideal.”

Job performance was measured with four items from Williams and Anderson (1991) such as “I perform well in the job tasks that are expected of me” (α = 0.93).

Health

Job exhaustion was measured with 3 items from the emotional exhaustion subscale of burnout (α = 0.92), such as “I feel emotionally drained from my work” (Schaufeli et al., 1996). We adapted these items to measure family exhaustion (α = 0.92). Sleep problems (α = 0.91) were measured with four items assessing the extent to which participants had difficulty falling asleep, staying asleep, waking up often, or waking up tired. Responses were rated on a 7-point scale assessing how frequently they experienced each in the past month (1 = never to 7 = always). Higher scores represent greater sleep problems. Alcohol use was assessed with the item “Thinking about the past month, on average, how many days per week did you typically drink any alcohol?” on a scale from 0 to 7.

Results

Factor Structure

We first performed confirmatory factor analysis (CFA) on three different measurement models with Mplus (Version 8.3; Muthén & Muthén, 1998–2017). In our hypothesized three-factor model (i.e., Model 1), managerial, cognitive, and emotional load factors each represented a first-order factor. Because our content validation suggested some overlap between the managerial and cognitive factors, we tested an alternative model in which items of managerial and cognitive factors were combined and loaded onto a first-order factor, while emotional load items loaded on another first-order factor (the two-factor model, Model 2). In the other alternative model, the single-factor model (i.e., Model 3), items of all three factors loaded onto a single first-order factor. Confirming the three-factor structure, per Table 6, the two-factor and the single-factor models showed significantly worse fit than the hypothesized three-factor model.

Table 6 Results of confirmatory factor analysis; Study 5

Measurement Invariance by Gender

To test whether the measure was invariant between men and women, we conducted a multiple-group CFA. We first estimated a constrained measurement model in which the factor structure was assumed to be invariant (i.e., factor loadings were constrained to be equal across gender groups). Men (n = 263) were set as group 1; women (n = 188) as group 2. Results suggest a good fit, \(\chi^{2}\)(63) = 101.919, CFI = 0.987, RMSEA = 0.053, SRMR = 0.065. Second, we estimated an unconstrained model in which there was no assumption about gender invariance and the factor structure was allowed to be freely estimated in each gender group. Although this unconstrained model showed an acceptable fit, \(\chi^{2}\)(54) = 93.017, CFI = 0.987, RMSEA = 0.057, SRMR = 0.035, the chi-square difference between the two models was not significant, \(\Delta\chi^{2}\)(9) = 8.902. Because the unconstrained model does not fit significantly better than the constrained model, the factor structure appears to be invariant by gender.
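The nested-model comparison above can be reproduced from the two reported fit statistics. The sketch below subtracts the chi-square values and degrees of freedom and evaluates the difference against a chi-square distribution. The helper chi2_sf is a hypothetical stand-in written with the standard library (a series expansion of the regularized incomplete gamma function); it is not how Mplus reports the test, and it assumes the simple, unscaled difference test is appropriate for the estimator used.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series for the lower incomplete gamma: sum over n of z^n / (a (a+1) ... (a+n)).
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return max(0.0, 1.0 - lower)


# Fit statistics as reported in the text.
chi2_constrained, df_constrained = 101.919, 63
chi2_free, df_free = 93.017, 54

delta_chi2 = chi2_constrained - chi2_free
delta_df = df_constrained - df_free
p_value = chi2_sf(delta_chi2, delta_df)

print(f"delta chi2({delta_df}) = {delta_chi2:.3f}, p = {p_value:.3f}")
```

A p value above the conventional 0.05 threshold means that freeing the loadings across groups does not significantly improve fit, which is the basis for retaining the constrained (invariant) model.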

Mean Difference by Gender

To test Hypothesis 1, independent samples t-tests were conducted to examine mean differences in each invisible family load factor by gender. As expected, women reported greater managerial family load (M = 5.66, SD = 1.19) than did men (M = 5.11, SD = 1.30), t(445) = −4.53, p < 0.001. Women reported greater cognitive family load (M = 6.08, SD = 0.98) than did men (M = 5.70, SD = 1.05), t(446) = −3.88, p < 0.001. Finally, women reported greater emotional family load (M = 3.52, SD = 1.64) than did men (M = 3.15, SD = 1.46), t(446) = −2.56, p = 0.011. Hypothesis 1 was supported.
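To gauge the size of these mean differences, a standardized effect size (Cohen's d) can be approximated from the reported means and standard deviations. The sketch below is illustrative only: it assumes the group sizes given in the measurement invariance analysis (188 women, 263 men), whereas the reported degrees of freedom indicate that the t-tests used slightly smaller analytic samples, so the values are approximate.

```python
import math


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d using the pooled standard deviation of two independent groups."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)


# Assumed group sizes, taken from the invariance analysis (approximation).
N_WOMEN, N_MEN = 188, 263

# Means and SDs as reported in the text: (women M, women SD, men M, men SD).
reported = {
    "managerial": (5.66, 1.19, 5.11, 1.30),
    "cognitive": (6.08, 0.98, 5.70, 1.05),
    "emotional": (3.52, 1.64, 3.15, 1.46),
}

effect_sizes = {
    factor: cohens_d(mw, sdw, N_WOMEN, mm, sdm, N_MEN)
    for factor, (mw, sdw, mm, sdm) in reported.items()
}

for factor, d in effect_sizes.items():
    print(f"{factor}: d = {d:.2f}")
```

Positive values indicate higher scores for women; the resulting values can be read against Cohen's (1988) benchmarks of roughly 0.2 (small), 0.5 (medium), and 0.8 (large).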

Nomological Validity

Convergent Validity

As shown in Table 7, the correlations between invisible family load and existing measures by Meier et al. (2006) and Ciciolla and Luthar (2019) were consistent with Hypothesis 2. Our managerial factor correlated significantly with household management (r = 0.59, p < 0.01), childcare management (r = 0.42, p < 0.01), household routine management (r = 0.64, p < 0.01), and responsibility for child adjustment (r = 0.43, p < 0.01). Our cognitive factor correlated significantly with household management (r = 0.42, p < 0.01), childcare management (r = 0.29, p < 0.01), household routine management (r = 0.54, p < 0.01), and responsibility for child adjustment (r = 0.38, p < 0.01). The “worry” scale correlated significantly with our emotional load factor (r = 0.46, p < 0.01). Results support the convergent validity of our measure.

Table 7 Descriptive statistics and correlations for construct validity; Study 5

Discriminant Validity

Per Hypothesis 3, bivariate correlations support the distinction between invisible family load factors and personality and coping. The correlations of conscientiousness with managerial (r = 0.39, p < 0.01) and cognitive family load (r = 0.34, p < 0.01) suggest they are not attributable entirely to this trait. Coping by planning was modestly positively correlated with managerial (r = 0.31, p < 0.01) and cognitive (r = 0.26, p < 0.01) load and negatively correlated with emotional load (r = −0.19, p < 0.01), suggesting these facets are distinct. Also, neuroticism was sufficiently distinct from, but much more strongly correlated with, emotional load (r = 0.54, p < 0.01) than it was with the other two factors (i.e., r = 0.11, r = 0.15), as would be expected.

Criterion-Related Validity

Next, we examined whether the invisible family load factors significantly predicted potential outcomes of theoretical and practical interest. Consistent with Hypothesis 4 and as can be seen in Table 7, emotional family load related to greater FWC, poorer health (job and family exhaustion, sleep problems), well-being (life and family satisfaction) and job performance. In mixed support of Hypothesis 4, cognitive family load was positively associated with FWE, well-being (life and family satisfaction) and job performance but also more sleep problems. Consistent with Hypothesis 5, managerial family load was positively associated with greater FWE, well-being (life and family satisfaction) and job performance but contrary to Hypothesis 5, also more sleep problems. Thus, there was mixed support for Hypotheses 4 and 5.

Incremental Predictive Validity

Finally, we examined whether invisible family load factors predicted potential outcomes after controlling for personality traits (i.e., neuroticism and conscientiousness) associated with work attitudes and behaviors (Costa & McCrae, 1990). Results of hierarchical regressions are presented in Tables 8 and 9. In partial support of Hypothesis 6a, managerial family load significantly related to FWE but did not have significant positive associations with life and family satisfaction, beyond the personality variables. Cognitive family load significantly and positively predicted job performance and family satisfaction beyond personality. Consistent with Hypothesis 6a, after accounting for conscientiousness and neuroticism, emotional family load predicted greater FWC, poorer health (more sleep problems and greater family and job exhaustion), and poorer well-being (lower life and family satisfaction). Overall, there was mixed support for Hypothesis 6a.

Table 8 Hierarchical regression analysis results with personality variables—part 1, Study 5
Table 9 Hierarchical regression analysis results with personality variables—part 2, Study 5
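The logic of a hierarchical regression test of incremental validity can be sketched as follows: fit a first model with the control variables, fit a second model that adds the focal predictor, and compare the variance explained. The example uses simulated data with arbitrary, assumed effect sizes (none of the numbers come from the study), and the pure-Python least-squares helper r_squared is a hypothetical illustration rather than the software used for the reported analyses.

```python
import random


def r_squared(X, y):
    """R^2 from ordinary least squares with an intercept (normal equations, Gauss-Jordan)."""
    n = len(y)
    rows = [[1.0] + list(x) for x in X]
    p = len(rows[0])
    # Augmented normal-equation system [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    for col in range(p):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    y_hat = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Simulated data; all coefficients below are arbitrary assumptions.
random.seed(0)
n = 300
consc = [random.gauss(0, 1) for _ in range(n)]
neuro = [random.gauss(0, 1) for _ in range(n)]
emo_load = [0.5 * nu + random.gauss(0, 1) for nu in neuro]
fwc = [
    0.3 * nu - 0.1 * c + 0.4 * e + random.gauss(0, 1)
    for nu, c, e in zip(neuro, consc, emo_load)
]

# Step 1: personality controls only. Step 2: add emotional family load.
r2_step1 = r_squared(list(zip(consc, neuro)), fwc)
r2_step2 = r_squared(list(zip(consc, neuro, emo_load)), fwc)
delta_r2 = r2_step2 - r2_step1

print(f"R2 step 1 = {r2_step1:.3f}, R2 step 2 = {r2_step2:.3f}, delta R2 = {delta_r2:.3f}")
```

The change in R-squared between steps is the share of outcome variance explained by the added predictor beyond the controls; in practice it is accompanied by an F test for the change.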

Finally, we examined the incremental predictive validity of our invisible load factors to ensure they add empirical utility beyond extant measures. Per Tables 10 and 11, after controlling for Ciciolla and Luthar’s (2019) two subscales (i.e., household routine management, responsibility for childhood adjustment), emotional family load incrementally predicted FWC, FWE, sleep problems, family satisfaction, family exhaustion, job performance, and job exhaustion. Cognitive family load incrementally predicted FWC, family satisfaction, and job performance. Per Tables 12 and 13, after controlling for the five subscales (i.e., household management, childcare management, physical household tasks, physical childcare tasks, and worry) of Meier et al.’s (2006) measure, cognitive family load significantly incrementally predicted FWC, FWE, family satisfaction, and job performance; managerial family load significantly incrementally predicted FWE; and emotional family load still incrementally predicted FWC, FWE, sleep problems, life satisfaction, family satisfaction, family exhaustion, job exhaustion, and job performance. Thus, there was strong support for Hypothesis 6b.

Table 10 Hierarchical regression analysis results with Ciciolla and Luthar’s (2019) measures—part 1; Study 5
Table 11 Hierarchical regression analysis results with Ciciolla and Luthar’s (2019) measures—part 2; Study 5
Table 12 Hierarchical regression analysis results with Meier et al. (2006) measures—part 1; Study 5
Table 13 Hierarchical regression analysis results with Meier et al. (2006) measures—part 2; Study 5

General Discussion

Results from our research offer new insights into the concept of invisible family load which, while commonly and increasingly referenced in lay discourse (e.g., Carrell, 2019; Murray, 2020), has received fragmented scholarly attention. Grounding our research in the extant literature, the broader lay discussion, as well as our own qualitative data, we provide a comprehensive, multidimensional definition of the construct, which we label invisible family load. Critically important, we provide a psychometrically sound, 9-item scale to measure its component parts – managerial, cognitive, and emotional family load. Beyond this construct and scale development, two primary questions our research addresses are (i) Do men and women differ in their experience of invisible family load? (ii) What are its implications for employee health, well-being, and family-to-work spillover? This research sets the stage for scholars to forge a path forward to enhance understanding of this phenomenon and its implications.

To that end, consistent with social role theory (Eagly & Wood, 2016), not only do women do more of the physical household and childcare tasks than do men, as shown in prior work, but our research indicates that women also carry more of the invisible family load. Specifically, women report more often doing the emotional (worrying), managerial (organizing), and cognitive (thinking about) load in their families. In terms of effect sizes, the gender differences implied by the reported means and standard deviations are small to moderate by Cohen’s (1988) benchmarks. These findings are consistent with theory as well as the popular press notion that women tend to be the “project managers” of their families.

Also, our findings give rise to the interesting—and at times, counterintuitive—ways in which aspects of invisible family load may differentially impact potential outcomes of import. For instance, the popular discussion of invisible family load indicates that its consequences are entirely negative. Yet, our research offers a more nuanced view. While our results do substantiate significant negative consequences of emotional family load, findings from our qualitative and quantitative studies suggest that people find some meaning and purpose in taking on some aspects of the invisible load in their family. In addition to the positive consequences discussed by participants in our qualitative Study 1, our quantitative findings (Studies 4 and 5) indicate that cognitive family load, including thinking about and remembering family needs, was associated with greater FWE, family satisfaction, and job performance. Notably, these relationships exist above and beyond conscientiousness and neuroticism, suggesting that they are not attributable to these personality traits. These findings are contrary to the expectation that the mental work of thinking about family tasks would interfere with one’s work and harm family satisfaction and performance. Rather, consistent with challenge stressors (Cavanaugh et al., 2000), being more cognizant of one’s family responsibilities is related to better performance at work and increased satisfaction with one’s family. Of note, the bivariate relationships in Studies 4 and 5 indicated that managerial family load was positively associated with enrichment (FWE), well-being (family and life satisfaction), and job performance. However, when personality traits of conscientiousness and neuroticism are considered (Study 5), managerial family load is positively associated with FWE, but does not account for significant additional variance in well-being or performance, suggesting that personality may underlie these observed relationships. 
In sum, these findings paint a more nuanced picture than the exclusively negative view of invisible family load.

Nevertheless, despite the potential for some positive consequences of managerial and/or cognitive family load, the emotional element of invisible family load demonstrates numerous negative consequences. Several findings across our studies highlight the important role of emotional load as an aspect of invisible family load. First, as noted in our literature review, there has previously been no clear consensus regarding the meaning of invisible family load and its components. Central to this was the fact that emotional load was included in some definitions (e.g., Meier et al., 2006; Walzer, 1996) but not others (e.g., Robertson et al., 2019). Our qualitative research (Study 1) indicates that when spontaneously generated, participants identified worry as a common element of the invisible aspects of caring for their families. Further, content validation showed people readily identified these as examples of emotional load. Moreover, our empirical findings showed that worrying about one’s family as an emotional element of invisible family load is not merely due to the fact that one has a predisposition to worry in general (i.e., neuroticism), suggesting that this is a distinct and important element of invisible family load. Finally, carrying the emotional load showed the most consistent relation to undesirable experiences in the form of greater FWC, exhaustion in one’s job and family, sleep problems, and poorer job performance. Even when other measures of invisible family load were used as predictors or correlates (i.e., Meier et al. household management, childcare management, physical household tasks and physical childcare tasks), the total worry scale and our emotional family load factors emerged as the primary predictors. Thus, although there may be some potential enhancement to well-being due to one’s managerial and/or cognitive family load, the emotional load is clearly a hindrance stressor that comes at great costs to multiple aspects of health and well-being.

Future Theoretical and Empirical Directions

Collectively, our definition and measure of invisible family load offer new avenues for exploring this topic, its antecedents, and its impact on employees, organizations, and families. Consistent with hindrance stressors (Cavanaugh et al., 2000), bearing the emotional family load creates strain that is depleting, interfering with one’s work (FWC) and harming one’s health and well-being. Future theoretical development and research should consider the pathways by which this occurs, such as whether FWC mediates the relation between emotional family load and potential consequences such as life satisfaction and sleep problems. The development of interventions targeted at reducing emotional family load would likely have beneficial consequences across multiple domains. Such research could focus on strategies shown to reduce anxious thoughts (e.g., mindfulness, self-compassion, cognitive reframing) and apply them specifically to family roles.

More broadly, managerial and cognitive family load may operate more from a challenge stressor and enrichment perspective (Cavanaugh et al., 2000; Wayne et al., 2007) than a hindrance stressor and/or scarcity perspective (Greenhaus & Beutell, 1985). Unlike emotional family load, cognitive and managerial family load are based in the thoughts and behaviors involved in caring for one’s family, which may offer more opportunities for challenge and/or enrichment. Researchers should theorize as to why this might be true, drawing from different sources of enrichment. It may be that cognitive family load fosters learning of new knowledge, skills, or ways of thinking across roles (e.g., developmental enrichment) whereas managerial family load may give a stronger sense of fulfillment (e.g., affective enrichment). Different types of enrichment could be considered as explanatory mechanisms linking cognitive and managerial family load with role attitudes. It may also be that cognitive and managerial family load operate as challenge stressors such that they require effort in terms of thinking and behaving but foster individual (and/or family) goal achievement, generating positive consequences for the individual, such as role or life satisfaction, while also creating some indicators of exhaustion, such as sleep problems (Cavanaugh et al., 2000).

These results for managerial and cognitive family load can also be understood through the lens of agency, and the many established benefits of feeling as though one has a meaningful level of control over aspects of one’s life, and investing cognitive energy in service of such control (e.g., Steckermeier, 2021). Somewhat parallel research findings show that employees’ proactive attention to issues of career adaptability, including their senses of concern and control over their adaptability, is positively associated with positive affect and job satisfaction, despite also requiring more effort on behalf of the employee (Fiori et al., 2015). Indeed, Fiori et al. suggested that “adapting behaviors, such as…choosing and planning” (p. 114) are likely related to adaptation outcomes, including satisfaction. Similarly, it may be that the very act of carrying the invisible family load, and investing effort into it managerially and/or cognitively (akin to Fiori et al.’s “planning” and “choosing,” respectively), may increase individuals’ satisfaction through an increased sense of agency regarding their environment. Such perceived agency is likely to be an important explanatory variable that we encourage scholars to empirically test in future work on invisible family load.

Another theoretical perspective that might be helpful is self-determination theory (Deci & Ryan, 2000). It may be that invisible family load, particularly feeling in charge of directing or managing one’s family (cognitive and managerial family load), meets one’s needs for competence and relatedness, thereby generating greater life satisfaction. It may also be that higher cognitive and managerial family load improve satisfaction with one’s life (Steckermeier, 2021) as a result of having had more autonomy over constructing the details of that life. Given the measurement foundation provided here, we encourage researchers to examine multiple theoretical explanations to further understand invisible family load.

Another important avenue for future research is theoretical explanation and empirical examination of gender differences in invisible family load. Our bivariate results indicated that aspects of invisible family load were associated with personality, including conscientiousness and neuroticism. Research shows that women score higher than men on agreeableness and neuroticism (Weisberg et al., 2011), raising the question of whether the gender differences found herein may be at least partly due to gender differences on agreeableness and neuroticism. Alternatively, it may be that these gender differences are at least partly due to socialized expectations of mothers, and as such, might be partially explained by intensive mothering (or parenting) ideologies. These and other explanations should be considered to better understand the extent and nature of gender differences in invisible family load. Future research should also examine gender as a potential moderator of the relationships proposed herein.

Finally, we encourage scholars to examine theoretically based antecedents of invisible family load. Though options are plentiful, examples might include gender egalitarianism, marital status (i.e., single parent vs. married), type of family (e.g., same-sex vs. opposite-sex or single-earner vs. dual-earner couples), type of work arrangement (remote, hybrid, in-person), and caregiving demands, such as the number and ages of children, eldercare, and children’s chronic mental and physical health conditions, learning disabilities, etc., any of which may intensify invisible family load. The ground is fertile for theoretical explanation and empirical examination of antecedents of invisible family load.

Limitations and Implications

Our series of studies offers a psychometrically sound, multifactorial measure of invisible family load that future research can use to better execute methodologically rigorous and theoretically grounded research. The final 9-item measure is parsimonious enough to be practical for survey administration, while comprehensive enough to guard against construct deficiency and to differentiate effectively between factors, allowing for examination of differential functioning.

That said, a noteworthy limitation of our research is the reliance on WEIRD (White, Educated, Industrialized, Rich, Democratic) samples. Future research would do well to examine invisible family load across countries and cultures, and even within countries across particular demographics. For example, various events circa 2020 brought recognition of the potential disparate invisible family load borne by individuals as a result of demographic characteristics—in particular, race and gender. While we considered gender, more attention is needed. For example, the recent COVID-19 pandemic has shone a bright light on the excessive and disproportionate home and childcare load borne by mothers in particular, an inequity highlighted by lay media (e.g., Bennett, 2020; Charlton, 2020; Cohen & Hsu, 2020) and scholarly work (e.g., Mills et al., in press; Shockley et al., 2021) alike. Yet, more scholarly work is needed to tease apart the effects of the increased and disproportionate physical load as compared to the increased and disproportionate invisible family load, particularly at a factor level.

Another limitation is our reliance on cross-sectional designs using retrospective self-report measures. Given the extent to which invisible family load is, in fact, invisible, future research should examine the construct and its potential gender discrepancies using less retrospective and subjective methodologies, such as diary studies or experience sampling approaches, and should compare results to perceived invisible family load as assessed by the measure developed herein. Future research should also use longitudinal designs to test directions of implied causality such that emotional family load contributes to poorer health, well-being, and performance rather than the other way around. Finally, Study 4 results indicated that changing the response scale (words vs. numerical frequency) significantly changed descriptive statistics, factor structures, and relations to theoretically relevant variables. From a measurement perspective, this suggests that seemingly subtle changes in measurement and design can yield unintended consequences. Various methods are also likely to reveal potential gender differences to varying degrees, and as such, should be investigated further in that regard. While comparatively little research explicitly addresses this, some research (e.g., Mills & Grotto, 2017) as well as lay accounts (e.g., Miller, 2020) have suggested that men may tend toward overestimating their family demands. Researchers could be more mindful of how the measures, time frames, and methods they use may impact such conclusions.

Also, we did not address the potential for disparities across racial lines. Some lay media have raised the issue of disproportionate invisible family load for minorities (e.g., Murray, 2020), but scholarly research has yet to examine this crucial issue. Researchers should build upon existing work regarding race-related stressors (e.g., Williams, 2018) to explore whether invisible family load is exacerbated for minorities in terms of emotional load (given the enhanced emotionality involved with being a historically marginalized minority) as well as managerial and cognitive load (e.g., planning for the safety of one’s children).

Conclusion

Although much is known of the concrete and observable physical tasks associated with household management and child rearing, there is scant understanding of the less visible tasks that are just as critical. Our research provides clarification on the invisible family load construct and offers a rigorously developed, psychometrically sound measurement instrument for use in future research. We offer initial evidence surrounding the construct’s nomological net, including negative consequences as well as the potential for less intuitive positive consequences for well-being. We found such consequences to function differentially by factor, and also found gender differences in dimensions of invisible family load. Overall, our scale and nuanced findings offer substantial fodder for future research and practice alike.