1 Introduction

Work–family conflict has been a major concern for modern families as the number of dual-earning couples has risen. Telecommuting or working from home (WFH) has been regarded as a promising means of improving workplace flexibility, and previous research (Kelly et al., 2014; Sherman, 2020) has shown that WFH can reduce work–family conflict for women.

While the earlier studies mainly discuss women’s work–family balance, some scholars suggest that WFH should also increase men’s engagement with their families. Under the social distancing policies implemented in response to the COVID-19 pandemic, the practice of WFH has become common for many workers,Footnote 1 although the feasibility of WFH varies greatly across and within industries and occupations.Footnote 2 Alon et al. (2020) claim that because many women work in health care and other businesses considered critical, such as grocery stores and pharmacies, their husbands who can work from home inevitably become the main providers of childcare.Footnote 3 They further argue that the reallocation of household duties during the pandemic is likely to have persistent effects on men’s future participation in childcare, as indicated by the literature on paternity leave policy reforms.Footnote 4 However, as far as we are aware, there is a lack of causal evidence in the literature that WFH increases husbands’ household work or engagement with their family more generally.

The objective of this paper is to estimate the causal effects of WFH on male workers’ engagement with their families using Japanese data. While gender gaps in unpaid domestic work exist in many OECD countries, Japan is among the countries that exhibit the largest inequality (Figs. 2 and 3). Therefore, it is especially relevant to examine how the prevalence of WFH affects men’s participation in domestic work and attitudes toward their families in a society with such entrenched traditional gender roles.

Our data are taken from the Survey on Changes in Attitudes and Behavior Under the Influence of the Novel Coronavirus, conducted in December 2020 by the Cabinet Office of Japan. The survey asks questions on relative changes that have occurred since December 2019 (before the pandemic) in the number of days per week that men work from home and how much they engage with their family. These questions allow us to use the first-difference estimator to avoid an omitted variable bias from time-invariant unobserved individual characteristics.

Nevertheless, concerns may arise about an endogeneity bias caused by a possible correlation between the growth in the frequency of WFH and changes in unobserved factors. For example, if workers chose to work from home because their fear of COVID-19 led them to become more family oriented, the change in their attitude toward the family is likely to increase their WFH days and participation in housework simultaneously. To address this concern, we use the feasibility of WFH as of December 2019 (i.e., before the pandemic) reported by each respondent to instrument the changes in the number of WFH days. Our WFH feasibility index captures individual-level differences in working conditions, and has an advantage over the common use of industry or occupation codes in some studies (e.g., Alipour et al., 2020; Boeri et al., 2020; Dingel & Neiman, 2020). We take this first-difference instrumental variable (IV) estimator as our preferred specification.

We find that an additional WFH day increases male workers’ engagement with their families. Specifically, an extra day of WFH per week leads to a 6.2% increase in time spent on housework and a 9.3% increase in the fraction of couples in which the husbands’ share of housework rises. An additional day of WFH also increases time spent with the family by 5.6%, and raises the share of male workers reporting that they became more life oriented rather than work oriented by 11.6%.

A potential drawback of WFH is its adverse effects on work-related outcomes such as productivity. Our estimates indicate that WFH effectively reduces time spent on commuting but has no significant effect on working hours and workers’ self-perceived productivity. Hence, we conclude that the practice of WFH encourages male workers to engage in their family life without sacrificing productivity.

To address some concerns about the validity of the exclusion restriction of our instrumental variable, we demonstrate that our main results are robust to alternative specifications. First, if the WFH feasibility correlates with the outcome through pathways other than WFH, the exclusion restriction of our instrumental variable does not hold. Considering the regional, industry, and/or spouse’s job characteristics as possible pathways, we show that the results are essentially unchanged from the main results when controlling for the region and industry fixed effects and whether the spouse works from home. Second, the workers who have experienced WFH may update their perception of WFH feasibility, and thus, changes in the error term may conversely affect the instrument. We cannot exclude this possibility because WFH feasibility is self-reported, and hence, more or less subjective. To address this concern, we construct two alternative instrumental variables. One is based on occupation classification (in a similar way to Dingel & Neiman, 2020) and the other is based on the proportion of work that cannot be done from home regardless of productivity. These alternatives are less likely to lead to the above-mentioned bias because occupational classification and whether each task can be done from home (as opposed to the degree) are arguably more objective. We find that the alternative IVs produce results comparable to those from our preferred specification.

Another potential concern is whether misreporting of housework contribution by husbands who have experienced WFH might bias the estimates. It is worth noting that even if husbands tend to overreport their contribution to housework, it does not bias our estimates unless this measurement error (i.e., overreporting) is correlated with the IV. We show by using an additional dataset from Japan that the overreporting of housework contribution is unlikely to be common and to correlate with WFH feasibility, suggesting that the self-reported nature of our outcome is not driving our results.

Finally, we examine the heterogeneity of the treatment effects. Our estimates indicate that the effects are stronger for male workers under 45 years of age and those who have preschool children, suggesting that there is a greater increase in time spent on childcare compared with time spent on other household chores when male workers increase their WFH days. In addition, our estimates suggest that the estimated effects are largely driven by university-educated male workers. Overall, our estimates indicate that WFH increases the time that men spend on domestic work and makes them more family oriented without losing productivity or reducing work hours, which will eventually promote greater gender equality within the family. This result suggests that policymakers may wish to promote WFH even after the pandemic ends.

The rest of the paper is structured as follows. Section 2 reviews the related literature. Section 3 describes the data set and defines the variables. Section 4 explains our identification strategy and lays out the first-difference IV model. In Section 5, we present the results, including robustness checks and the heterogeneity analysis. In Section 6, we discuss the implications of our results in the context of the literature. We conclude in Section 7.

2 Literature review

Our paper contributes to the literature on the causal impacts of WFH. Reflecting the difficulty of avoiding self-selection into WFH, the literature has faced a challenge in establishing causality. Exceptions include Dutcher (2012) and Bloom et al. (2015). Dutcher (2012) conducts a laboratory experiment and shows that the productivity of telecommuting may depend on how creative the tasks are. Bloom et al. (2015) provide evidence from a field experiment that WFH increases the performance of call center employees by 13%.

As these papers mainly examine the effect of WFH on productivity, our research is more closely related to studies that estimate the effects of management practices on work–life balance (Kelly et al., 2014; Sherman, 2020). Kelly et al. (2014) conduct a randomized training intervention designed to improve supervisors’ support and employees’ schedule control, and show that the intervention leads to improvements in employees’ work–family balance and family time adequacy. Note, however, that the intervention aims to improve employees’ control over when and where to work, and the support provided by supervisors. Hence, it is not clear to what extent the improved work–life balance can be attributed to remote working.Footnote 5 Sherman (2020) focuses on the discretionary uptake of remote working and finds significant effects on family-to-work conflict for mothers but not for fathers.

The above two studies consider WFH as an option that improves workplace flexibility for those suffering from work–family conflicts, most of whom are presumably working mothers, and examine the effect of allowing them to work from home. However, they do not examine how WFH affects workers with a lower level of work–family conflicts in the baseline. The pandemic is a situation of compulsion in which workers who would not ordinarily prefer to work from home are strongly encouraged or required to do so. Taking advantage of the pandemic and pre-existing variations in the feasibility of WFH as an IV, we estimate a causal and independent effect of WFH, which complements the evidence from the previous studies.

Our research also contributes to the recent emerging literature on the impacts of COVID-19 on within-household gender inequality. Some studies report increased participation of males in childcare during the pandemic.Footnote 6 However, very few studies have attempted to establish causal evidence of the effects of the increased WFH on the allocation of housework. Champeaux and Marchetta (2021) assess the effect of the lockdown policy in France on the distribution of housework and intrahousehold conflict. They find that the husband’s share of housework increased only when the husband stayed at home and the wife worked away from the home. In contrast, our estimates suggest that WFH positively affects men’s engagement with their family regardless of whether their spouse works from home. Moreover, unlike the studies examining the total impact of the lockdown, we attempt to isolate the effects of WFH by simulating the estimated model.

3 Data

3.1 Overview

Our main data are taken from the 2nd Survey on Changes in Attitudes and Behaviors in Daily Life under the Influence of the Novel Coronavirus Infection,Footnote 7 conducted in December 2020 by the Cabinet Office (2020) of Japan. The survey asks about the frequency of WFH, work-related outcomes such as hours of work and commuting time, the share of housework and childcare within the household, views on work–life balance, and other questions, such as why a respondent has changed his/her number of WFH days. Notably, the survey mainly asks respondents about changes since December 2019, prior to the COVID-19 pandemic. For example, one question asks, “How has the time you spend with your family changed compared with December 2019?” The format of such questions makes them suitable for our first-difference specification, as explained in Section 4. Approximately 10,000 individuals participated in the survey. They were randomly selected from a pool of registered monitors so that the same number of individuals are included for each gender and five-year age group. The region of residence was selected according to the population composition, ensuring that the sample is geographically representative.

We note that the survey is retrospective, that is, respondents working in December 2020 answered the questions; therefore, the sample is conditioned on working after the outbreak of COVID-19. This survey structure raises concerns because working status after the outbreak may be affected by the COVID-19 outbreak. To address this issue, we restrict our sample to married male workers with children under the age of 18 years.Footnote 8 As shown in Fig. 1, reproduced from Fukai et al. (2023), the employment rate of this specific demographic group is extremely stable even during the COVID-19 pandemic.Footnote 9 Importantly, their employment rate after the COVID-19 outbreak did not decrease significantly from the pre-pandemic period. Therefore, we consider that any biases arising from conditioning on working after the COVID-19 outbreak are negligible.

Fig. 1 The Predicted and Observed Employment Rates for Married Men with Children. Notes: The solid line represents the employment rate of married men with children from 2015 to 2020. The dashed line represents the predicted employment rate calculated from Equation 1 in Fukai et al. (2023) from 2015 to 2020. The vertical line in the graph represents the outbreak of the COVID-19 pandemic in Japan, beginning in March 2020. The estimation sample in Fukai et al. (2023) is restricted to men aged 25 to 54 years who are married with children and for whom there is information on education and working status in the previous year. Source: Figure 15 in Fukai et al. (2023)

For additional information on the Japanese pandemic context, refer to Appendix A. Overall, Japan’s stay-at-home restrictions were much less stringent than those adopted in other countries. There was a nationwide school closure from March 2, 2020, until the end of May 2020, which constituted a particularly severe anti-COVID-19 measure in Japan. However, the data collection was conducted long after this closure period.

3.2 Variable definitions

3.2.1 Working from home

In the survey, respondents were asked what percentage of their total work was conducted from home in December 2019 and December 2020, selecting their response from five possible answers: 100%, more than 50%, less than 50%, usually work outside of home but work from home irregularly, and none. Taking the middle points of the intervals, we treat “more than 50%” and “less than 50%” as 75% and 25%, respectively. If respondents answered that they usually worked outside of home but worked from home irregularly, we assume WFH accounts for 10% of their work. Hence, the share of WFH in total work takes a value of 100%, 75%, 25%, 10%, or 0%.Footnote 10 To facilitate interpretation, we multiply this variable by the number of days worked per week.Footnote 11 The constructed variable is interpreted as the number of WFH days per week.Footnote 12
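The construction of the WFH-days variable can be sketched as follows. This is a minimal illustration, not the authors' code: the category labels are hypothetical shorthand for the five survey answers, and the sketch assumes that the number of days worked per week is the same in both periods.

```python
# Hypothetical labels for the five answer categories; the survey's exact wording may differ.
WFH_SHARE = {
    "100%": 1.00,
    "more than 50%": 0.75,  # midpoint of the 50-100% interval
    "less than 50%": 0.25,  # midpoint of the 0-50% interval
    "irregular": 0.10,      # share assumed in the text for irregular WFH
    "none": 0.00,
}


def wfh_days_per_week(category: str, days_worked_per_week: float) -> float:
    """Convert a categorical WFH share into WFH days per week."""
    return WFH_SHARE[category] * days_worked_per_week


def change_in_wfh_days(cat_2019: str, cat_2020: str, days_worked: float) -> float:
    """First difference of WFH days between December 2019 and December 2020.

    Assumption for this sketch: days worked per week is unchanged across periods.
    """
    return wfh_days_per_week(cat_2020, days_worked) - wfh_days_per_week(cat_2019, days_worked)


print(wfh_days_per_week("more than 50%", 5))           # WFH days for a five-day worker
print(change_in_wfh_days("none", "less than 50%", 5))  # change from no WFH to "less than 50%"
```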

The survey also asks about the feasibility of WFH before the pandemic, as follows: “How much of your work falls into each of the following four categories?: 1. work that you can do from home without any problems, 2. work that you can do from home although productivity would be slightly lower, 3. work that can be done from home if the work procedure is appropriately altered, and 4. work that you cannot do from home. Provide your answers to each category as a percentage of your total workload. Make sure that the sum is 100%.” We define the share of “work that can be done from home without any problems” as our index of WFH feasibility.Footnote 13

3.2.2 Engagement with family

The survey asks several questions on how engagement with family has changed since December 2019, which are our main outcome variables. First, respondents provide answers on the percentage change in time spent on housework compared with the level in December 2019. According to Fig. 3, in 2016, the average time Japanese men spent on unpaid work per day was 44.6 min. Thus, a 10% change in time spent on housework roughly corresponds to 4.5 min per day on average.

Second, respondents report the percentage change in time spent with family in interval terms, with possible answers including −51% or lower, −50% to −21%, −20% to −6%, −5% to 5%, 6% to 20%, 21% to 50%, and 51% or higher. We construct a variable of the change in time spent with family by taking the middle point of each interval in the original question. If respondents answered that they increased (decreased) time spent with family by 51% or more, we calculate the variable as 51 × 1.25% (−51 × 1.25%).Footnote 14
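The interval coding described above can be written out as a short sketch. The data structure and function name are illustrative assumptions; only the interval bounds and the 1.25 scaling of the open-ended categories come from the text.

```python
# Interval bounds in percent; None marks the open side of the top and bottom categories.
INTERVALS = [
    (None, -51),
    (-50, -21),
    (-20, -6),
    (-5, 5),
    (6, 20),
    (21, 50),
    (51, None),
]

OPEN_END_FACTOR = 1.25  # scaling applied to the +/-51 bound, as stated in the text


def interval_value(lower, upper):
    """Return the coded value for one answer category."""
    if lower is None:  # "-51% or lower"
        return upper * OPEN_END_FACTOR
    if upper is None:  # "51% or higher"
        return lower * OPEN_END_FACTOR
    return (lower + upper) / 2  # midpoint of a closed interval


coded = [interval_value(lo, hi) for lo, hi in INTERVALS]
print(coded)
```

The same coding applies to the commuting-time question, which uses the same answer intervals.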

Third, we have a dummy variable that takes a value of one if respondents answer that they became more life rather than work oriented, and zero otherwise.

Fourth, the survey asks whether there has been a change in the division of roles between spouses regarding housework and childcare compared to December 2019 and how it changed. We construct a dummy variable that takes a value of one if a respondent answers that there has been a persistent change and only the husband’s own housework role increased, and zero otherwise. We interpret this dummy variable as indicating whether the husband’s share of housework and childcare increased. Note that, if a respondent answers that “the division of roles changed but has now returned to normal” or “the role of both the husband and the wife increased (decreased),” the variable takes a value of zero. Hence, it may take a value of zero even when the husband’s role increased more than his wife’s, which actually leads to an increase in the husband’s share of housework. Given this definition, we consider the variable to be a conservative measure of the change in the shares of housework.

Because all of these measures are self-reported, one possible issue is misreporting. Specifically, husbands might overreport their time spent on housework and childcare due, for example, to social desirability bias. However, because our main identification strategy is the IV regression discussed in Section 4.2, overreporting does not generate biased estimates as long as our instrument is uncorrelated with the tendency of overreporting. See Section 5.5 for further discussion.

3.2.3 Work-related outcomes

The survey asks about the change in commuting time, working hours, and self-perceived productivity relative to December 2019. As for the change in commuting time, respondents answer by choosing an interval, with the same selection of responses as for the question concerning time spent with family. The method for construction of the continuous variable is also the same. Turning to working hours and self-perceived productivity, as with the question on time spent on housework, respondents provide answers on the percentage change compared with the level in December 2019.

3.3 Descriptive statistics

Table 1 reports the descriptive statistics. Our sample consists of 984 married male workers with children under the age of 18 years.Footnote 15 The average household size is 3.9. The proportion of workers who have a preschool child is 53.9%.

Table 1 Descriptive statistics

On average, WFH days increased by 0.5 days per week from December 2019 to December 2020. In December 2019, 12.5% of respondents worked from home at least once. The proportion rose to 28.5% in December 2020. On average, it was possible to do 22.2% of work from home in December 2019.

Turning to family-related outcomes, the time spent with family increased by 9.6%, while that spent on housework increased by 1.5%. In 14.9% of the sample households, the husband increased his share of housework. The respondents’ family values changed as well, with 40.6% of respondents reporting that the importance of personal life relative to work increased.

Commuting time and working hours decreased by 7.6% and 2.5%, respectively, from December 2019 to December 2020. Respondents also reported that their productivity declined by 4.2% on average. Because the sample includes both those who did and did not work from home, the figure does not necessarily reflect the effect of WFH.

4 Econometric model

This section details the econometric model used to estimate the causal effect of WFH on the outcomes. In subsection 4.1, we set up the first-difference specification as a baseline model, which examines the correlation between the change of the outcomes and the change of the WFH days. In subsection 4.2, we introduce an IV regression as our preferred specification.Footnote 16 We instrument the change in the WFH days by the feasibility of WFH in December 2019.

4.1 Baseline model

Because our data are for two periods, December 2019 and December 2020, we begin with the following first-difference regression to estimate the effect of WFH:

$$\Delta Y_{i} = \beta_{0} + \beta_{1} \Delta D_{i} + \mathbf{X}_{i}^{\prime} \beta_{2} + \Delta \epsilon_{i},$$
(1)

where Y is an outcome variable; D is the number of days of WFH per week; X is a vector of individual characteristics that consists of education, age, the number of household members, and the youngest child’s educational stage as a proxy of age; and Δ is the first-difference operator, which takes the difference of each variable between December 2019 and December 2020. The parameter of interest is β1, which captures the effect of one extra day of WFH per week on the outcome.

If we regress the level of the outcome on the level of the WFH days, the estimates are likely to be biased because unobserved individual characteristics may affect both simultaneously. Table 2 reports the results from the regression of the number of WFH days in 2019 and 2020 on individual characteristics: education, age, the region of residence, the number of household members, and the child’s educational stage. The coefficients on the indicators of living in Tokyo and the Kanto region (a region consisting of six prefectures near Tokyo) are significantly different from zero both for 2019 and 2020. If this correlation is due to unobserved differences between workers in the different regions, simply regressing outcome variables on the number of WFH days will produce biased estimates.

Table 2 Correlation between WFH variables and individual characteristics

However, such time-invariant unobserved individual characteristics are removed by first differencing. Our identifying assumption of the first-difference estimator is that changes in days of WFH are orthogonal to changes in the error term, conditional on observed individual characteristics Xi. Note that we still include Xi as control variables because ΔDi and Xi might be correlated.
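The logic of first differencing can be illustrated with a small numerical sketch. All numbers below are made up for illustration (they are not the survey data), and the true effect `BETA` is an assumed value. A time-invariant trait `alpha` raises both WFH days and the outcome, so the levels regression is biased, while the first-difference regression recovers `BETA`.

```python
def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


BETA = 0.5  # assumed true effect of one WFH day in this toy example

# alpha: time-invariant individual trait, positively related to WFH days.
alpha = [0.0, 1.0, 2.0, 3.0]
d_2019 = [0.0, 1.0, 2.0, 3.0]
d_2020 = [1.0, 3.0, 2.0, 5.0]

# Outcome in levels: Y_it = alpha_i + BETA * D_it (no noise, to keep the arithmetic exact).
y_2019 = [a + BETA * d for a, d in zip(alpha, d_2019)]
y_2020 = [a + BETA * d for a, d in zip(alpha, d_2020)]

# Levels regression: contaminated by alpha.
levels_slope = ols_slope(d_2020, y_2020)

# First-difference regression: alpha cancels out of Delta Y.
delta_d = [b - a for a, b in zip(d_2019, d_2020)]
delta_y = [b - a for a, b in zip(y_2019, y_2020)]
fd_slope = ols_slope(delta_d, delta_y)

print(levels_slope, fd_slope)
```

The sketch omits the controls in X and any time-varying error, which is exactly the remaining concern taken up in Section 4.2.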

4.2 Identification with the instrumental variable

Although a correlation between the level of the WFH days and that of the error term does not bias our estimates from the first-difference model, we are concerned that changes of the WFH days may be correlated with those of the error term. For example, the fear of COVID-19 may affect both the WFH days and an individual’s family orientedness. If this is the case, unobserved changes in the fear of COVID-19 bias our estimates from Eq. (1).

Although the pandemic meant that people were urged to stay at home more strongly than ever before, there are reasons to believe that workers and firms had some discretion about whether to adopt WFH. As documented in Appendix A, Japan’s stay-at-home restrictions were substantially less stringent than those adopted in other countries. As of December 14, 2020, during our data collection period, the Government Response Stringency Index, a composite measure of nine response indicators provided by The Oxford COVID-19 Government Response Tracker,Footnote 17 was 48.15 for Japan. This is much lower than the values for France (75.00), the United States (71.76), the United Kingdom (73.15), and Canada (67.13). In fact, in the survey, some respondents report that they reduced WFH days by December 2020 because their preferences for WFH changed. If workers had discretion on whether to work from home, it may be the case that those who became more family-oriented than before chose to work from home, while others did not.

To address the potential endogeneity bias, we use the feasibility of WFH in December 2019 as an IV denoted by zi.Footnote 18 We expect that the feasibility of WFH is likely to affect the actual change in the WFH days apart from the workers’ preference for WFH.

This feasibility index for WFH can be considered to reflect the nature of the respondent’s job tasks. For example, workers in the IT industry may be able to work from home because they can perform most of their tasks anywhere with a computer and an Internet connection. In contrast, WFH is infeasible for supermarket clerks because face-to-face service is necessary.

Moreover, previous studies have documented non-negligible heterogeneity in the WFH feasibility across individuals even within the same occupation (Adams-Prassl et al., 2020; Kawaguchi & Motegi, 2021). Our measure, based on the individual responses, allows us to capture the individual-level differences in working conditions and has an advantage over the industry or occupation codes (for example, Alipour et al., 2020; Boeri et al., 2020; Dingel & Neiman, 2020) that are most commonly used.

The first-stage regression equation is:

$$\Delta D_{i} = \pi_{0} + \pi_{1} z_{i} + \mathbf{X}_{i}^{\prime} \pi_{2} + u_{i},$$
(2)

where ΔD is the change in the number of days of WFH per week; z is the feasibility of WFH; and X is a vector of individual characteristics that consists of education, age, the number of household members, and the youngest child’s educational stage as a proxy of age. In Section 5, we confirm that the feasibility of WFH is strongly correlated with changes in the WFH days.

Our instrument must satisfy the following exclusion restriction:

$$E[\Delta \epsilon_{i} \mid z_{i}, \mathbf{X}_{i}] = 0.$$
(3)

The exclusion restriction requires that, after controlling for individual characteristics Xi, the feasibility of WFH in December 2019 is not correlated with the changes in the error term. In other words, the feasibility of WFH affects outcomes only through changes in the WFH days. Note that we allow for the correlation between the instrument and the level of the error term. For example, even if workers in the IT industry tend to contribute more to housework than other workers, the exclusion restriction is not violated because the IV is correlated only with the levels of the outcome. In contrast, if they tended to change the amount of time spent on housework between December 2019 and December 2020, then that would invalidate the exclusion restriction. In Section 5.3, we discuss the potential threat to the exclusion restriction and examine the validity of our instrument.
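A stylized numerical sketch may help fix ideas about how the instrument removes the bias. The data are invented for illustration, `BETA` is an assumed true effect, and the controls in X are omitted, so this is not the estimation reported in the paper. An unobserved shock (for instance, a change in family orientation) raises both ΔD and ΔY but is uncorrelated with z by construction. In this just-identified case, the IV estimate equals the reduced-form slope divided by the first-stage slope.

```python
def cov(x, y):
    """Sample covariance (dividing by n, which cancels in the ratios below)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / n


BETA = 0.5  # assumed true effect of one extra WFH day in this toy example

z = [0.0, 0.0, 1.0, 1.0]        # instrument: pre-pandemic WFH feasibility
shock = [1.0, -1.0, 1.0, -1.0]  # unobserved change, uncorrelated with z by construction

# First stage: feasibility and the shock both move WFH days.
delta_d = [2.0 * zi + si for zi, si in zip(z, shock)]
# Outcome: the shock also affects Delta Y directly, so Delta D is endogenous.
delta_y = [BETA * di + 1.0 * si for di, si in zip(delta_d, shock)]

fd_estimate = cov(delta_d, delta_y) / cov(delta_d, delta_d)  # biased by the shock
first_stage = cov(z, delta_d) / cov(z, z)                    # slope of Delta D on z
reduced_form = cov(z, delta_y) / cov(z, z)                   # slope of Delta Y on z
iv_estimate = reduced_form / first_stage                     # recovers BETA

print(fd_estimate, first_stage, reduced_form, iv_estimate)
```

If the shock were correlated with z, the ratio would no longer equal `BETA`, which is the sense in which the exclusion restriction is essential.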

One might be concerned that the potential overreporting of the change in family engagement might bias our estimates. Note again, however, that even if husbands tend to overreport their contribution to housework, it does not bias our estimates for the causal effects of WFH unless overreporting correlates with our instrumental variable. We discuss this issue further in Section 5.5.

5 Results

5.1 Family-related outcomes

Table 4 presents the estimates for the outcomes related to engagement with the family.Footnote 19 Columns 1, 4, and 7 report estimates from the first-difference specification defined by Equation (1). Overall, an increase in the WFH days improves all three outcomes. An additional day of WFH increases time spent on housework by 5.5%. In addition, it increases time spent with family by 5.2% and the fraction of male workers who became more life oriented than before by 6.1%.

As discussed in Section 4, however, the first-difference specification may be subject to the endogeneity bias caused by time-varying unobserved variables. We address this problem by employing the IV specification. The estimation result for the first stage (Eq. (2)) is reported in Table 3. The coefficient of WFH feasibility is significantly positive, indicating that a 10% increase in WFH feasibility is associated with an increase of 0.24 WFH days per week. Other characteristics of workers are not predictive of the growth in WFH days. The F-value is 165.0, confirming that the feasibility of WFH serves as a strong instrument for the growth of the WFH days. Figure 5 further illustrates the relationship between the feasibility of WFH and the change in WFH days.
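As a rough reading aid, and assuming that the reported F-value is the test of the single excluded instrument in Eq. (2), the F-statistic equals the squared t-statistic on the instrument's coefficient under the same variance estimator:

$$F = \left(\frac{\hat{\pi}_{1}}{\widehat{\mathrm{se}}(\hat{\pi}_{1})}\right)^{2}, \qquad |t| = \sqrt{165.0} \approx 12.8.$$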

Table 3 First stage regression

In Table 4, Columns 2, 5, and 8 report the reduced-form estimates, and Columns 3, 6, and 9 report the IV estimates using the feasibility of WFH as an instrument. All the estimates from the reduced form and the IV regressions are significantly positive. Focusing on the IV results, our preferred specification, an additional WFH day increases time spent on housework by 6.2%. It also increases time spent with family by 5.6% and the fraction of men who became more life oriented than before by 11.6%.

Table 4 The effect of working from home on involvement with Family

The IV estimates are greater than the first-difference estimates for all the outcomes; however, using the Hausman test, we can reject the hypothesis that the two estimates are the same only for the life oriented indicator. We note that the discrepancy in the estimates could be explained by the fact that the IV estimator identifies the effects of WFH on a different subpopulation from the one for which the first-difference estimator identifies the effects. Whereas the first-difference estimates reflect the change in outcomes for all treated workers, the IV estimator identifies local average treatment effects for workers induced to work from home because of their high feasibility of WFH.
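For reference, one common textbook form of the Hausman statistic for a single coefficient is given below. The exact variant used for the test reported here is not stated in this section, so the expression is an assumption for illustration. It relies on the first-difference estimator being efficient under the null hypothesis that ΔD is exogenous:

$$H = \frac{\left(\hat{\beta}_{1}^{IV} - \hat{\beta}_{1}^{FD}\right)^{2}}{\widehat{\mathrm{Var}}\left(\hat{\beta}_{1}^{IV}\right) - \widehat{\mathrm{Var}}\left(\hat{\beta}_{1}^{FD}\right)} \sim \chi^{2}_{1} \quad \text{under the null}.$$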

Note that we construct the variable for time spent with family by taking the middle point of the interval in the original question. To deal with a potential bias from the variable construction, we employ an ordered probit model with the same instrument to estimate the effect of WFH on the change in time spent with family. Table 10 shows that the estimates are similar to the main results.

5.2 Work-related outcomes

Although WFH increases workers’ engagement with their families, a concern is that it may potentially have adverse effects on work performance. To examine whether WFH lowers work productivity, we conduct the same estimation exercise as in Section 5.1 for work-related outcomes. The results are reported in Table 5.

Table 5 The effect of working from home on work-related outcomes

We find that an additional day of WFH reduces commuting time by 12.4%. Because most workers work five days a week, one additional WFH day would be expected to reduce commuting time by around 20%, so the estimate appears too small. Nevertheless, this difference is likely to arise from a rounding error introduced by the interval coding. As discussed in Section 3, we take the middle point of each interval to construct the variable. Accordingly, for respondents whose commuting time decreased by 20%, our variable takes a value of −13%, almost the same as our estimate.
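The arithmetic behind this comparison, using the five-day working week and the −20% to −6% answer interval from Section 3, is:

$$\frac{1\ \text{day}}{5\ \text{days}} = 20\%, \qquad \frac{(-20) + (-6)}{2} = -13.$$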

While WFH effectively reduces commuting time, we find no significant effect of WFH on working hours or productivity in any specification. Thus, taking this together with the results in Section 5.1, we conclude that WFH promoted greater family engagement by male workers without sacrificing their productivity at work.

5.3 Validity of the exclusion restriction

As discussed in Section 4, the exclusion restriction is a crucial assumption to identify the causal effect of WFH. Although we argue that the exclusion restriction holds (that is, the percentage of work that workers can do from home is not correlated with changes in the error term), concerns may remain about its validity. This section considers possible pathways other than WFH through which the IV affects the outcome, i.e., the possible threats to our identification strategy. Then, we examine whether controlling for such variables changes the results.

5.3.1 Regional characteristics

If white-collar occupations are concentrated in the metropolitan area, WFH is likely to be more feasible in such areas. We confirm this prediction in our data; the average feasibility of WFH in Tokyo is 35%, whereas it is 21% in other regions. In general, WFH is more feasible in larger cities.

On the other hand, the numbers of COVID-19 cases and deaths vary substantially by prefecture, and Tokyo has the largest number of cases per capita among the 47 prefectures in Japan in almost every period that we study.Footnote 20 In general, large cities tend to have more COVID-19 cases. More COVID-19 cases and more deaths may make people more family oriented out of fear, leading them to spend more time with their family and, hence, contribute more to housework.

If this is the case, workers in large cities are more likely to have a job with high WFH feasibility and to become more family oriented because of the more intense COVID-19 situation compared with other cities, which implies a correlation between the instrument and the changes of the error term.

We include prefecture fixed effects in Eq. (1) and estimate the model with the IV to address this concern. By including the prefecture fixed effects, our identification relies on the variation of the WFH feasibility within the prefecture rather than across prefectures. The estimates are reported in Columns 1, 4, and 7 in Tables 6 and 7. The results are essentially the same as the main results in Tables 4 and 5. Thus, we consider that regional differences in the spread of COVID-19 do not invalidate our exclusion restriction.
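The role of the prefecture fixed effects can be illustrated with a small simulation. The following is a minimal sketch, not the estimation code used in this paper: the data are simulated, all variable names and parameter values are hypothetical, and the two-stage procedure is a textbook 2SLS that returns point estimates only (standard errors from a manual second stage would not be valid). It assumes a prefecture-level shock that raises both WFH feasibility and the outcome, which is the threat described above.

```python
import numpy as np


def two_stage_least_squares(delta_y, delta_d, instrument, controls):
    """Point estimate of the effect of delta_d on delta_y by 2SLS.

    `controls` must contain a constant or a full set of dummies.
    """
    # First stage: project the endogenous regressor on instrument + controls.
    first_stage_x = np.column_stack([instrument, controls])
    gamma, *_ = np.linalg.lstsq(first_stage_x, delta_d, rcond=None)
    fitted_d = first_stage_x @ gamma
    # Second stage: regress the outcome on the fitted regressor + controls.
    second_stage_x = np.column_stack([fitted_d, controls])
    beta, *_ = np.linalg.lstsq(second_stage_x, delta_y, rcond=None)
    return beta[0]


# Simulated data (all values are hypothetical, chosen only for illustration).
rng = np.random.default_rng(0)
n_obs, n_pref = 20_000, 10
TRUE_BETA = 0.3

prefecture = rng.integers(0, n_pref, size=n_obs)
# Prefecture-level shock (e.g., local severity of the pandemic).
pref_shock = np.linspace(-1.5, 1.5, n_pref)[prefecture]
# WFH feasibility is higher in prefectures with a larger shock.
feasibility = 0.5 * pref_shock + rng.normal(size=n_obs)
# Unobserved individual factor that makes delta_d endogenous.
unobserved = rng.normal(size=n_obs)
delta_d = 0.8 * feasibility + 0.5 * unobserved + rng.normal(size=n_obs)
# The shock also affects the outcome directly, violating exclusion
# unless prefecture is controlled for.
delta_y = (TRUE_BETA * delta_d + 1.0 * pref_shock
           + 0.5 * unobserved + rng.normal(size=n_obs))

constant = np.ones((n_obs, 1))
pref_dummies = (prefecture[:, None] == np.arange(n_pref)).astype(float)

beta_no_fe = two_stage_least_squares(delta_y, delta_d, feasibility, constant)
beta_fe = two_stage_least_squares(delta_y, delta_d, feasibility, pref_dummies)
print(f"IV without prefecture FE: {beta_no_fe:.3f}")
print(f"IV with prefecture FE:    {beta_fe:.3f}")
```

In this simulated setting, the estimate without fixed effects is pushed away from the true coefficient, whereas the estimate with prefecture dummies uses only within-prefecture variation in feasibility and should lie close to it. If the two estimates are similar in real data, as in Tables 6 and 7, this kind of regional confounding is unlikely to be important.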

Table 6 The effect of working from home on involvement with family with additional controls

5.3.2 Industry characteristics

Another potential threat to the validity of the exclusion restriction lies in industry characteristics. For example, under the COVID-19 pandemic, the IT industry has been increasing profits, whereas the food service industry has experienced a significant drop in sales. Such differences in business performance by industry may affect workers’ perceptions regarding work–life balance and change their roles in the household. For example, workers in the food service industry may increase their contribution to housework to compensate for the reduction in their salary or to make use of the reduction in their working hours. Because the feasibility of WFH varies by industry, our instrument may be correlated with changes in the error term in Equation (1) through industry characteristics, which would bias our estimates.

To avoid this endogeneity bias, we additionally control for the industry (for example, manufacturing, retail business, and transportation), the job category (for example, sales, accounting, and human resources), and the number of employees of the firm in Equation (1) and estimate the model with the IV. Columns 2, 5, and 8 in Tables 6 and 7 report the estimates from these regressions. Again, the estimates are similar to the main results and do not change our conclusion. The results suggest that industry and other job characteristics are not pathways through which our IV is correlated with changes in the error terms.

Table 7 The effect of working from home on work-related outcomes with additional controls

5.3.3 Spouse’s WFH status

A final concern is that the feasibility of WFH is associated with the spouse’s feasibility of WFH, and that this correlation may lead to the violation of the exclusion restriction. According to Malkov (2020), in the US, teleworkability-based occupational sorting occurs; in about 60% of couples, both spouses work in either teleworkable or non-teleworkable occupations. If the wife’s feasibility of WFH is positively correlated with her time spent on housework, this can reduce the time that the husband spends on housework. Moreover, this is more likely to occur in couples where the husband’s feasibility of WFH is high, which is a potential source of bias in our case.

Because our data do not contain information on spouses’ feasibility of WFH, we directly control for whether the spouse works from home, assuming that the husband’s WFH days do not affect the wife’s WFH status. Columns 3, 6, and 9 in Tables 6 and 7 report the results of the IV regressions. All estimates are comparable with those from the main specification, which reassures us about the validity of our IV.

However, some may argue that the husband’s WFH status directly influences whether his spouse works from home. If that is the case, we should not directly control for the spouse’s WFH status in Equation (1) because it is affected by our treatment variable, the husband’s WFH days. As we mentioned in Section 4, workers in Japan have their own discretion regarding whether to work from home. Thus, a husband and a wife may jointly decide on their WFH days. Nonetheless, because the husband’s feasibility of WFH tends to be positively correlated with the spouse’s feasibility of WFH, and because the wife’s feasibility of WFH is negatively correlated with the outcomes concerning the husband’s involvement with the family, our estimates from the specification that does not control for the spouse’s WFH status (Tables 4 and 5) can be regarded as a lower bound of the effects of WFH on the outcome. Therefore, even if our estimates are biased by omitting variables related to the spouse’s feasibility of WFH, our conclusion does not change or would be even stronger.

5.4 Alternative definitions of the instrumental variable

As we explained in Section 3, the definition of our IV, the feasibility of WFH, is the share of “work that can be done from home without any problem.” The exclusion restriction implies that this IV affects outcomes only through changes in the WFH days. One potential concern is that the actual WFH experience during the pandemic might have affected workers’ perception of how much of their work can be done from home without any hassles. If this change in perception is correlated with the changes in the error term, it invalidates the exclusion restriction. To deal with this concern, we apply alternative definitions for our IV for robustness checks.

5.4.1 Occupation-based definition

The first alternative IV is the occupation-based feasibility index of WFH, following Dingel and Neiman (2020), who classify the feasibility of WFH for each occupation using O*NET. Because this occupation-based index is free from workers’ perception of how much of their work can be done from home, it is appropriate to check the robustness of our main IV. Referring to Dingel and Neiman (2020), Kotera (2020) constructs the occupation-based WFH index using the Japanese version of O*NET by the Japan Institute of Labour Policy and Training. Ishii et al. (2020) aggregate this measure to construct the WFH feasibility index for twenty major occupation categories in the Japanese National Census in 2015. Because our data have only twelve occupation categories, we follow the same strategy as Ishii et al. (2020) by matching occupation categories in our data to those in the Census major occupations to create the occupation-based WFH feasibility.

Tables 11 and 12 replicate the main regression results in Tables 4 and 5 but use the occupation-based IV. The point estimates of the IV regression in Tables 11 and 12 are similar to those in Tables 4 and 5, which suggests that the main results estimated with our preferred IV, the feasibility of WFH, are unlikely to be biased. We note that, compared with our main results, Tables 11 and 12 report larger standard errors, which make most estimates statistically insignificant, and a smaller F-value for the first-stage regression. These results suggest that our original WFH feasibility index has an advantage over the occupation-based feasibility index because our original IV captures individual-level differences in working conditions, which can be confirmed by the strong first stage (the F value is greater than 160).

5.4.2 Self-perceived WFH feasibility

The potential concern about the feasibility of WFH variable is that WFH experience might change workers’ perception of how much of their work can be done from home without any hassles. To mitigate this concern, we construct the second alternative IV defined as the sum of the shares of “work that can be done without any problem,” “work that can be done from home although productivity would be slightly lower,” and “work that can be done from home if the work procedure is appropriately altered.” This alternative IV is equivalent to the proportion of work that is not categorized as “work that you cannot do from home.” Because workers’ perception of whether or not their work can be done from home is expected to be less affected by their WFH experience than that of how much of their work can be done from home without any problems, the second alternative IV is more robust to workers’ perception of WFH feasibility.

Tables 13 and 14 replicate Tables 4 and 5 with this alternative IV. The estimates are comparable with those from the main specification, suggesting that our main results obtained with the original IV are valid.

5.5 Self-reported housework participation

One might be concerned about the use of the self-reported measures of change in family engagement because husbands may believe they are doing more household chores than they are doing in reality. Note, however, that even if husbands tend to overreport their contribution to housework, it does not bias our estimates unless overreporting correlates with our instrumental variable.

To examine the extent of correlation between husbands’ overreporting family engagement and the instrumental variable, we conduct additional analyses using another survey that took place in almost the same period as our main dataset. The data include responses from 2,024 couples living with children regarding the division of housework and childcare. We construct an indicator variable for the husband’s overreporting by assuming that the wife’s reported share of housework (childcare) reflects the actual division and regress the indicator on WFH-related variables.

The sample averages imply that the majority of husbands tend to underreport, rather than overreport, their contribution to housework and childcare. The correlation between the change in WFH days per week and overreporting is not found for housework and is negative for childcare. Most importantly, respondents who report “it is difficult for my work to be done from home,” which relates most closely to the definition of our IV, are no less likely to overreport their housework and childcare contribution. Therefore, we conclude that it is unlikely for the self-reported nature of the data to drive our main results. See Appendix B for a detailed explanation.
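The construction of the overreporting indicator described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure in Appendix B: the field names, the four example couples, and the strict-inequality rule (the husband overreports if his self-reported share exceeds the share his wife attributes to him) are all hypothetical. With a single binary regressor, the difference in group means computed here equals the slope from regressing the indicator on that regressor.

```python
def overreports(husband_reported_share, wife_reported_share):
    """True if the husband reports a larger own share than his wife reports.

    The wife's report is treated as the actual division (an assumption).
    """
    return husband_reported_share > wife_reported_share


def overreport_rate(couples):
    """Fraction of couples in which the husband overreports."""
    flags = [overreports(c["husband_says"], c["wife_says"]) for c in couples]
    return sum(flags) / len(flags)


# Hypothetical couples: shares are the husband's share of housework in
# percent, as reported by each spouse.
couples = [
    {"husband_says": 30, "wife_says": 20, "wfh_difficult": True},
    {"husband_says": 10, "wife_says": 20, "wfh_difficult": True},
    {"husband_says": 40, "wife_says": 30, "wfh_difficult": False},
    {"husband_says": 20, "wife_says": 30, "wfh_difficult": False},
]

rate_difficult = overreport_rate([c for c in couples if c["wfh_difficult"]])
rate_feasible = overreport_rate([c for c in couples if not c["wfh_difficult"]])
# A gap near zero means overreporting is unrelated to the WFH-difficulty
# response, which is the condition needed for the IV estimates to be unbiased.
gap = rate_difficult - rate_feasible
print(rate_difficult, rate_feasible, gap)
```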

5.6 Heterogeneous effects of working from home

In this subsection, we explore the heterogeneity of the effect of WFH. Tables 15–22 examine the heterogeneous effects of WFH by education, age, child’s educational stage, and household size.

The results reported in Tables 15 and 16 show that the estimates for university graduates are similar to those obtained for the whole sample (Tables 4 and 5), suggesting that university graduates largely drive the results for the whole sample, which is consistent with the recent finding by Cowan (2023). For workers with lower education levels, all the estimates except the change in life orientation are insignificant. Note, however, that none of the differences between the two groups is statistically significant.

Turning to other workers’ characteristics, the estimates in Tables 17–22 show interesting patterns: the effects of WFH on housework tend to be greater for those who are younger, whose child is younger, and whose household size is larger. A possible explanation for the difference is that fathers of young children increase their time spent on caring for children at home rather than time spent on other household chores, as suggested by Champeaux and Marchetta (2021).

Another issue is whether the extent to which male workers increase their participation in household chores varies by whether their wives can work from home. Alon et al. (2020) expect the largest effects for families in which the father is able or forced to work from home while the mother is not. Champeaux and Marchetta (2021) show that under the lockdown in France, fathers effectively increased their contribution to housework only when the mother was the sole household member working outside the home.

As discussed in Section 5.3, although wives’ feasibility of WFH is not available from our data, their actual WFH status is available, as men report whether their spouse worked from home, did not work from home, or did not have a paid job. We understand that the estimation controlling for the actual WFH status may not be valid because the spouses’ WFH status may be endogenous. That being said, it is informative to estimate the IV regressions by splitting the sample by whether the wife worked outside the home or did not (i.e., in the latter case, she worked from home or did not have a paid job). The results are reported in Tables 23 and 24 in the Appendix. An additional WFH day has consistently positive effects on the family-related outcomes regardless of whether the wives stay at home. The differences in the effects between the two groups are insignificant for all the outcomes. This suggests that WFH encourages the reallocation of housework for couples who both stay at home as effectively as for couples in which only the husband stays at home.

6 Discussion

In this paper, we have examined the effects of WFH on men’s family engagement using data from Japan during COVID-19. In this setting, it is natural to ask whether WFH contributes to reducing the gender gap in the burden of housework, and how important WFH was for changes in workers’ attitudes and behavior under COVID-19, when changes in many dimensions other than work style occurred. In Sections 6.1 and 6.2, we address these questions. Section 6.3 relates our results to the previous literature and discusses the policy implications.

6.1 Implication for gender inequality in housework

Although we have shown that WFH has a positive impact on the husband’s time spent on housework, if his wife also increases the time spent on housework, then WFH might not contribute to closing the gender gap in housework burden. To examine the implication for the gender gap in housework, we estimate the impact of WFH on the husband’s housework share using the dummy variable defined in Section 3, which indicates that only the husband’s own housework role increased. Although this variable is imperfect, as stated in Section 3, in the sense that it takes a value of zero if both the husband and the wife increase (or decrease) their housework roles, it still sheds light on the shares of housework.Footnote 21

Table 25 presents the estimates. Focusing on the IV result, an extra day of WFH increases the fraction of men who increased their share of housework in the family by 9.3%. This result suggests that WFH not only increases time spent on housework but also contributes to reducing the gender gap in housework by increasing the husband’s share of housework.

6.2 How much does WFH contribute to the overall changes?

We have confirmed that WFH has causal effects on outcomes related to engagement with family and commuting time, but to what extent does WFH account for the changes in attitude toward family under the COVID-19 pandemic? That is, because the pandemic has dramatically impacted our perceptions and behavior, WFH may play only a small role relative to the pandemic itself. We examine how much WFH contributed to the overall change in the outcomes between December 2019 and December 2020.

Using the estimates obtained from Equation (1) with the IV, \({\hat{\beta }}_{0},{\hat{\beta }}_{1},\) and \({\hat{\beta }}_{2}\), the sample mean of our dependent variable, \(\overline{{{\Delta }}Y}\), can be written as follows:

$$\begin{array}{r}\overline{{{\Delta }}Y}={\hat{\beta }}_{0}+{\hat{\beta }}_{1}\overline{{{\Delta }}D}+{\overline{{{{\bf{X}}}}}}^{{\prime} }{\hat{\beta }}_{2},\end{array}$$
(4)

where \(\overline{{{\Delta }}D}\) and \(\overline{{{{\bf{X}}}}}\) are the sample averages of ΔD and X, respectively. To quantify the contribution of WFH, we define the counterfactual mean of the change in outcome, \({\overline{{{\Delta }}Y}}_{CF}\), as the value when no respondents change the number of WFH days, or by setting \(\overline{{{\Delta }}D}=0\):

$$\begin{array}{r}{\overline{{{\Delta }}Y}}_{CF}={\hat{\beta }}_{0}+{\overline{{{{\bf{X}}}}}}^{{\prime} }{\hat{\beta }}_{2}.\end{array}$$
(5)

Then, we define the percentage contribution of WFH as

$$\begin{array}{r}\text{WFH contribution}=\frac{\overline{{{\Delta }}Y}-{\overline{{{\Delta }}Y}}_{CF}}{\overline{{{\Delta }}Y}}\times 100.\end{array}$$
(6)

For example, \(\overline{{{\Delta }}Y}=1\) and \({\overline{{{\Delta }}Y}}_{CF}=0.6\) indicate that 40% of the overall change is attributable to WFH.
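Equations (4) to (6) can be sketched in a few lines of code. This is a minimal illustration, not the computation behind Table 8: the function name and the coefficient values are hypothetical, chosen only so that the actual mean is 1 and the counterfactual mean is 0.6, as in the example above.

```python
def wfh_contribution(beta0, beta1, mean_delta_d, mean_x, beta2):
    """Return (actual mean, counterfactual mean, percentage contribution).

    mean_x and beta2 are equal-length sequences holding the sample means of
    the controls and their estimated coefficients.
    """
    controls_part = sum(x * b for x, b in zip(mean_x, beta2))
    # Equation (4): sample mean of the change in the outcome.
    mean_delta_y = beta0 + beta1 * mean_delta_d + controls_part
    # Equation (5): counterfactual mean with no change in WFH days.
    mean_delta_y_cf = beta0 + controls_part
    # Equation (6): share of the overall change attributed to WFH.
    contribution = (mean_delta_y - mean_delta_y_cf) / mean_delta_y * 100
    return mean_delta_y, mean_delta_y_cf, contribution


# Hypothetical inputs that reproduce the example in the text.
actual, counterfactual, contribution = wfh_contribution(
    beta0=0.5, beta1=0.4, mean_delta_d=1.0, mean_x=[0.2], beta2=[0.5]
)
print(actual, counterfactual, contribution)
```

Because the counterfactual differs from the actual mean only by the term \({\hat{\beta }}_{1}\overline{{{\Delta }}D}\), the contribution is simply that term expressed as a percentage of \(\overline{{{\Delta }}Y}\).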

Table 8 reports the actual sample mean, counterfactual mean, and WFH contribution. For the outcome variables related to involvement with family, WFH accounts for 14% to 33% of the change in these outcomes from December 2019 to December 2020, with the exception of time spent on housework. Because our estimates predict that the average married male worker who does not change the number of WFH days will decrease his time spent on housework, the contribution calculated for WFH exceeds 100%. Overall, the contribution of WFH to the change in engagement with family is large even compared with other effects, including the pandemic itself.
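To see why the contribution can exceed 100%, consider a purely illustrative case (the numbers are hypothetical and are not taken from Table 8). Suppose the actual mean change is \(\overline{{{\Delta }}Y}=0.2\) while the counterfactual mean is negative, \({\overline{{{\Delta }}Y}}_{CF}=-0.1\). Equation (6) then gives

$$\begin{array}{r}\frac{0.2-(-0.1)}{0.2}\times 100=150,\end{array}$$

so WFH would account for 150% of the observed change. More generally, the contribution exceeds 100% whenever the counterfactual mean has the opposite sign to the actual mean, because WFH must then offset the counterfactual decline in addition to generating the observed increase.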

Table 8 Contribution of working from home

As for work-related outcomes, it is worth noting that the contribution of WFH to the change in commuting time is 87%, and it is not statistically significantly different from 100%. This estimate implies that WFH is the only major path through which commuting time decreases between December 2019 and December 2020, which we find plausible.

6.3 Relation to previous literature

Our results indicate that WFH promotes men’s participation in household chores without reducing work productivity. This provides empirical support for Alon et al. (2020) and Hupkau and Petrongolo (2020), who argue that the increased work flexibility for men during the COVID-19 outbreak may encourage them to contribute more to housework and childcare. In contrast to many comparative studies investigating the consequences of COVID-19 confinement policies on families, our study establishes causal evidence for the impacts of WFH on families during the COVID-19 pandemic.

Further, our subsample analysis suggests that WFH leads to the redistribution of housework regardless of whether the spouse works from home. This is in contrast to the previous arguments in the literature. Alon et al. (2020) argue that the increased participation of men in housework is likely to be driven by telecommuters whose spouses work outside the home. Champeaux and Marchetta (2021) show that the redistribution of housework induced by the lockdown in France is effective only for families in which the mother works outside the home while the father works from home. The difference in the results may arise from the difference in the pre-existing gender disparity in domestic work. Japanese fathers may have a lower baseline and more room to increase their contribution to domestic work when working from home than do French fathers.

Regarding the estimates for work productivity, our results appear to contradict Morikawa (2020) and Kitagawa et al. (2021), who report negative effects of WFH on productivity using a survey conducted in Japan. This discrepancy can be attributed to the fact that they used different estimators and different study periods compared with our study.

Morikawa (2020) asks survey participants who have adopted WFH about their productivity in WFH relative to working at the usual workplace. Kitagawa et al. (2021) estimate an average effect on employees who have experienced WFH by using a first-difference model similar to Eq. (1) in our paper. It is important to note that both Morikawa (2020) and Kitagawa et al. (2021) estimate the effect of WFH from April to June in 2020. During that period, many workers were strongly urged to work from home even if they knew their productivity would decrease as a result of WFH.

In contrast, the survey we use asks about productivity in December 2020. As discussed in Section 4, firms and workers had considerable discretion at this point in deciding whether to work from home. Workers who would not suffer a productivity decline are likely to be selected into WFH.Footnote 22 Our IV estimator identifies the local average treatment effect on workers induced to work from home because of their high WFH feasibility; therefore, the estimate indicates a null effect on productivity. In addition, given that we focus on December 2020, it is important to note that over time, as the pandemic continued, firms invested in IT equipment to improve the effectiveness of WFH, and many workers became more accustomed to WFH by December 2020, meaning that there was no longer a negative impact on their productivity because of WFH.Footnote 23

Our findings have important implications for considering a new working style. Some studies suggest that even after the COVID-19 pandemic, a large fraction of workers prefer to continue WFH (Kitagawa et al., 2021) and, hence, the practice of WFH will continue (Barrero et al., 2021; Bick et al., 2020). By showing that WFH helps promote gender equality within households without sacrificing productivity at work, our results provide another reason to argue that policymakers should promote WFH options even after the pandemic.

7 Conclusion

In this paper, we study the impacts of WFH on male workers’ participation in household chores and attitude toward their families. Our estimates indicate that WFH leads men to spend more time on housework and with their family, and makes it more likely that they will take a larger share of housework and value their personal life relative to work. Regarding work-related outcomes, we find no significant effect on the workers’ self-perceived productivity and hours worked. Therefore, our estimates indicate that WFH encourages male workers to contribute more to household chores without sacrificing their performance at work.

This paper contributes to the literature on WFH by showing that the practice of WFH improves men’s work–life balance. Although several studies have established that WFH reduces women’s work–family conflict, scant attention has been paid to the impact of WFH on men. This lack of evidence for men may be attributed to the difficulty of avoiding self-selection into WFH in “normal” nonpandemic times or in an experiment that allows workers’ discretion about whether to work from home. Exploiting the preexisting variation in the feasibility of WFH as an instrument and the pandemic as a situation in which many male workers are strongly motivated to work from home, we show that WFH increases fathers’ engagement with the family.

This research is subject to at least two limitations. First, because we employ the IV estimator, our results show effects only for a subgroup of the population, that is, working fathers whose jobs can be readily performed from home. Our results may not be immediately extrapolated to other groups. Second, whether the effect of WFH on within-household gender equality persists is outside the scope of the current study, although we note that Alon et al. (2020) expect that it does persist. Further studies should address whether increased WFH would have longer-term effects by analyzing post-pandemic data.