Commuting time and sickness absence of US workers

This paper analyzes the relationship between commuting time and days of sickness absence of US workers. Using data from the Panel Study of Income Dynamics for the years 2011 to 2017, we find that a 1% increase in the daily commute of workers is associated with an increase of 0.018 and 0.027% in the days of sickness absence per year of male and female workers, respectively. These results are robust for women when sample selection, missing variables, and health status are explored. Further exploration of this relationship shows that the positive relationship between commuting and sickness absence is concentrated in urban areas only, and is present in the intensive margin (hours) for men and the extensive margin (participation) for women. By uncovering how commuting time is related to sickness absenteeism, we contribute to the literature on the negative correlation between commuting and workers’ health and well-being.


Introduction
In this paper, we analyze the relationship between commuting time and sickness absence of workers in the US. The analysis of commuting has gained relevance in the literature in recent decades (see Ma and Banister (2006) for a chronological review), as a result of the increase in the time/distance workers in developed countries devote to commuting to/from work (Molina 2014, 2016; Goerke and Lorenz 2017; Kirby and LeSage 2009). Commuting is a complex social phenomenon and has been related to negative health-related outcomes for workers (Hansson et al. 2011; Künn-Nelen 2016), which include lower subjective/psychological well-being (Dickerson et al. 2014; Roberts et al. 2011) and increased stress (Frey and Stutzer 2008; Gottholmseder et al. 2009; Novaco and Gonzalez 2009; Wener et al. 2003). These negative health-related outcomes of commuting on workers are important not only at the worker level, but also in terms of public health in general.
The negative consequences of commuting have also been linked to increased labor costs (Allen 1983; Goodman et al. 2012), losses of productivity (Grinza and Rycx 2018), and increased sickness absence of workers (van Ommeren and Gutierrez-i-Puigarnau 2011). Zhang et al. (2011) conclude that illness and absenteeism are a substantial source of productivity loss. Sickness absence is inversely related to worker productivity and is a major reason for absence from work in the US. Data from the US Bureau of Labor Statistics on worker absence rates for 2018 show that around 70% of total absence at work is due to illness and injury. 1 Nevertheless, only two empirical studies have formally tested the relationship between commuting and sickness absence (Goerke and Lorenz 2017; van Ommeren and Gutierrez-i-Puigarnau 2011), their results refer only to Germany, and their conclusions differ.
On theoretical grounds, one question that arises from prior research is whether workers are already being compensated in terms of wages for any mental and physical burden associated with commuting. If commuting generates health costs for workers, then a worker will only decide to commute longer distances if he/she is fully compensated via a higher salary or lower rents in the place of residence. But the evidence may indicate that workers do not consider both pecuniary and non-pecuniary commuting costs correctly. In fact, prior research suggests that individuals have difficulty assessing non-pecuniary costs, particularly health costs (van den Berg and Ferrer-i-Carbonell 2007). If this is the case, then workers may be underestimating the wage premium required to compensate them for longer commutes.
Within this framework, the goal of the paper is to provide empirical evidence of the relationship between worker commuting and sickness absence in the United States. To that end, we use data from the Panel Study of Income Dynamics of the United States, for the years 2011, 2013, 2015, and 2017, to analyze the relationship by estimating Fixed Effects panel data models that account for worker time-invariant unobserved heterogeneity, which allows us to relate variations in commuting time to variations in sickness absence. We find that workers who have longer commutes are more likely to be absent from work due to sickness. In particular, we find that the elasticity between daily commuting time and the annual days of sickness absence is estimated to be 0.018 and 0.027 for male and female workers, respectively, indicating that a 10% increase in commuting time is related to 0.18 and 0.27% increases in absenteeism, respectively. A 10% increase in commuting time represents an increase of around 4 min per day. These results are robust when sample selection, missing variables, and health status issues are explored. Furthermore, the relationship is present in urban areas only.
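The arithmetic behind these magnitudes can be sketched as follows. This is only an illustration: the elasticities are those reported above, the mean commuting times are the sample averages reported in the data section, and the function and variable names are hypothetical.

```python
# Elasticities reported in the text (log-log Fixed Effects coefficients).
ELASTICITY = {"men": 0.018, "women": 0.027}
# Mean round-trip commute in minutes per day, as reported in the data section.
MEAN_COMMUTE_MIN = {"men": 43.22, "women": 37.60}


def pct_change_in_absence(elasticity, pct_change_in_commute):
    """Approximate % change in sick days for a given % change in commuting."""
    return elasticity * pct_change_in_commute


for group in ("men", "women"):
    # A 10% longer commute, expressed in minutes at the group mean.
    extra_minutes = 0.10 * MEAN_COMMUTE_MIN[group]
    effect = pct_change_in_absence(ELASTICITY[group], 10.0)
    print(f"{group}: +{extra_minutes:.1f} min/day -> +{effect:.2f}% sick days")
```

Evaluated at the sample means, a 10% increase corresponds to roughly 4 minutes per day for both groups, which matches the figure quoted in the text.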
We contribute to the literature by complementing prior analyses by van Ommeren and Gutierrez-i-Puigarnau (2011), and Goerke and Lorenz (2017). While the former find a positive relationship between commuting distance and sickness absence in Germany, the latter find no evidence of a relationship in the same country, except that employees who commute long distances are absent about 20% more than employees with no commutes. Thus, the existing evidence for this relationship is mixed, and limited to Germany. This paper represents the first estimate of the relationship for the case of the United States, and adds to the mixed evidence on commuting and sickness absence. While we find that the elasticity of sickness absence with respect to commuting time is between 0.01 and 0.03, van Ommeren and Gutierrez-i-Puigarnau (2011) find it is around 0.10, and Goerke and Lorenz (2017) find no statistically significant correlation between them. Such differences may be due to the variable used to measure commuting, since we use commuting time and they use commuting distance, as a proxy for commuting time as well as for monetary costs.
The rest of the paper is organized as follows. Section 2 presents the conceptual framework in which the analysis is based, and Sect. 3 shows data and the variables used in the empirical analysis. Section 4 describes the econometric strategy, and Sect. 5 shows the main results. Finally, Sect. 6 sets out our main conclusions.

Conceptual framework
Several authors have established a link between longer commuting times, on the one hand, and decreased health outcomes, lower subjective and objective health, well-being, psychological problems, and increased stress and fatigue, on the other. See Clark et al. (2020), Künn-Nelen (2016), Rüger et al. (2017), and Tajalli and Hajbabaie (2017) for recent literature reviews. Despite that, and even when the relationship between workers' health and sickness absenteeism may seem straightforward, only a few authors have analyzed whether longer commutes are associated with increased absenteeism due to worker sickness. Van Ommeren and Gutierrez-i-Puigarnau (2011) first studied how the distance commuted by workers in Germany affected their sickness absenteeism. Goerke and Lorenz (2017) then found that only employees who commute very long distances are absent more than their counterparts who do not commute to/from work. Considering that commuting is a complex phenomenon (Cropper and Gordon 1991; Guell et al. 2012; Jessoe et al. 2018; Krüger and Neugart 2018; Manning 2003; Rodríguez 2004; Ross and Zenou 2008; van Acker and Witlox 2011), and that urban forms may vary across countries (Gobillon et al. 2007; Gimenez-Nadal et al. 2021), we aim to study the following hypothesis: commuting time is positively correlated with worker sickness absenteeism in the US.
At least two distinct (and potentially simultaneous) mechanisms may explain this relationship. First, and given the well-known impact of commuting on worker health, stress, and fatigue, workers with longer commutes are more likely to fall ill, because longer commuting decreases health outcomes. If this is the channel through which commuting affects absenteeism, it would represent a biological consequence of longer, and thus more stressful and exhausting, commuting. Second, worker utility may be negatively affected by commuting duration, since longer commutes reduce workers' potential leisure time. Thus, workers who spend more time commuting to/from work would have increased incentives to call in sick (even though they are not necessarily ill) compared to workers who spend less time commuting. This channel, then, would represent strategic behaviors of workers, and not a biological consequence of commuting time.
Distinguishing between the two channels presented above may not be straightforward, but if the impact of commuting time on sickness absenteeism is driven by decreased health outcomes, then controlling for workers' health would partially net out that channel, and we would be able to determine whether workers respond strategically to longer commutes by calling in sick. If the correlation between commuting time and sickness absenteeism is positive and statistically significant without controlling for workers' health, either (or both) of the presented channels could drive the relationship. Then, if the same correlation is no longer statistically significant once worker health is controlled for, that would suggest that decreased health drives the correlation between commuting and sickness absenteeism, therefore rejecting the strategic behavior channel. If the correlation remains positive, quantitatively robust, and statistically significant, that would indicate that the impact of commuting on sickness absenteeism is not driven by decreased health, but by worker strategic behaviors linked to shirking (Ross and Zenou 2008). 2 However, both potential channels could operate simultaneously, and in such a scenario one would expect that the correlation (once health controls are considered) remains positive and significant, although quantitatively smaller than when such controls are not considered.

Data and variables
We use data from the Panel Study of Income Dynamics (PSID), a longitudinal household survey conducted every two years (since 1986) by the University of Michigan. 3 The PSID consists of a representative sample of more than 5000 US households per wave, and contains information about household and personal characteristics, including socio-demographics, employment, and wealth. The PSID includes information at household level, and for every member of the interviewed households. Waves of the PSID before 2011 cannot be used in the analysis, as information about commuting time was first included in the 2011 PSID questionnaire. Thus, we use data from the 2011, 2013, 2015 and 2017 waves of the PSID interviews.
From each household, we select individuals who are defined as the head of the household and the spouse (if any), and restrict the sample to employed individuals who report positive labor supply; thus, students, retired workers, and disabled workers are omitted from the analysis. Individuals who appear in the sample for just one year are also omitted, as we aim to take advantage of the panel structure of the data to net out this relationship from time-invariant unobserved heterogeneity of individuals. 4 Self-employed workers are also excluded from the analysis, as their commuting trips may be of a different nature than commutes of employees (van Ommeren and van der Straaten 2008). 5 The final sample is composed of an unbalanced panel of 18,559 observations, corresponding to 5902 individuals (the average individual appears in the sample 3.145 times), of whom 3218 are men and 2684 are women.
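The sample restrictions described above can be sketched as a simple filter over person-wave records. This is a minimal illustration, not the authors' code: the field names (`pid`, `role`, `hours`, `self_employed`) and the toy records are hypothetical and do not correspond to actual PSID variable names.

```python
from collections import Counter


def restrict_sample(rows):
    """Apply the sample restrictions described in the text to person-wave rows."""
    kept = [
        r for r in rows
        if r["role"] in ("head", "spouse")   # heads of household and spouses only
        and r["hours"] > 0                   # employed, with positive labor supply
        and not r["self_employed"]           # employees only
    ]
    waves = Counter(r["pid"] for r in kept)  # number of waves observed per person
    # Drop individuals who appear in only one wave.
    return [r for r in kept if waves[r["pid"]] >= 2]


# Toy person-wave records (made-up values, for illustration only).
rows = [
    {"pid": 1, "wave": 2011, "role": "head", "hours": 40, "self_employed": False},
    {"pid": 1, "wave": 2013, "role": "head", "hours": 38, "self_employed": False},
    {"pid": 2, "wave": 2011, "role": "spouse", "hours": 20, "self_employed": False},
    {"pid": 3, "wave": 2011, "role": "head", "hours": 45, "self_employed": True},
    {"pid": 3, "wave": 2013, "role": "head", "hours": 45, "self_employed": True},
]
sample = restrict_sample(rows)
print(sorted({r["pid"] for r in sample}))

# Consistency check on the reported sample: observations per individual.
print(round(18559 / 5902, 3))
```

In the toy data only individual 1 survives: individual 2 is observed once, and individual 3 is self-employed. The last line reproduces the average number of appearances per individual implied by the reported sample sizes.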
The PSID contains information about sources of job absence during the previous year, including (own) sickness absence, absence because another in the household was sick, strikes, and vacations and time off, measured as the number of work days individuals missed. Strikes, vacations, and time off can hardly be defined as a form of sickness absence, and thus they are not considered in the analysis. Although evidence has shown that spouses' commutes are related (Carta and De Philippis 2018; Hong et al. 2018), the potential effects of a worker's commute on the spouse's sickness absence are beyond the scope of our analysis, and thus we define sickness absence only as days of job absence due to own sickness. Table 1 shows summary statistics of the days of sickness absence, for men and women. We observe that men are absent from work, on average, 0.74 days per year because of sickness, vs. 0.73 days in the case of women. The gender difference in sickness absence is not statistically significant at the 95% level of confidence, according to a t-type test.
It is important to note that sickness absenteeism in the PSID lies below the official average statistics provided by the US Bureau of Labor Statistics, where absence rates due to illness or injury are estimated to be about 2.10%. This is also the case with other analyses of absenteeism using the PSID, such as Du and Leigh (2018), who report annual rates of absenteeism of about 1.5%, measured in weeks with absences over total weeks worked. 6 Differences between absenteeism as measured in the PSID and the official statistics could come from different channels. For instance, a potential source of measurement error may affect absenteeism in the PSID, as is also the case for work hours and earnings (Bound et al. 1994). Future research should focus on this topic, which is beyond the scope of this study. On the other hand, the current legal framework concerning sickness absenteeism in the US is heterogeneous and varies across states, though there is no national law requiring employers to pay sick leave to employees (Martel et al. 2015). Thus, US workers' incentives to be absent from work due to illness are reduced, compared to other countries, such as Germany, where employees receive sick pay amounting to 100% of their salary for a maximum of six weeks. This could also explain the low rates of absenteeism reported in the sample.
4 We also identify and omit outliers, using the Blocked Adaptive Computationally Efficient Outlier Nominators (BACON) algorithm (Billor et al. 2000). We find only one outlier, which corresponds to an individual who reported 208 days of sickness absence. This observation is not considered in the analysis.
5 Travel related to work and commuting forms part of the profit function, and thus the relationship between commuting and sickness absence may differ from the relationship for employees (Gimenez-Nadal and Velilla, 2021).
6 https://www.bls.gov/cps/cpsaat47.htm.
Our main explanatory variable is the time devoted to commuting to/from work. This information was first collected in the PSID in 2011, and is measured in minutes per day, via the following question: "On a typical day, how many minutes is (was) your round trip commute to and from work?". Table 1 shows that male workers spend, on average, 43.22 min per day in commuting, against the 37.60 average minutes spent by women. 7 This gender difference in commuting time is statistically significant at standard levels, in line with prior studies of the gender gap in commuting (e.g., Molina 2014, 2016).
The PSID provides information about individual, family, and labor characteristics that may be correlated with sickness absence. We consider that age may affect the number of days of sickness absence, given that as people grow older their health status may worsen, leading to an increase in the number of days of sickness absence. Education is also considered, since prior research has highlighted the negative educational gradient in sickness absence (Hämmig and Bauer 2013; Kaikkonen et al. 2015; Piha et al. 2009, 2012). Education is collected in the PSID as the highest grade or year of schooling completed, measured in completed years of education. We also control for the number of hours worked per week (i.e., in a "typical week"), and total family income, defined as all the income received by the household, measured in dollars per year, which allows us to control for the socio-economic position of the household. 8 Better socio-economic positions have been linked to lower sickness absence (Barmby et al. 1995; Löve et al. 2013; Markussen et al. 2011; Piha et al. 2009). The PSID also includes information about the health status of workers, which is defined in five levels of self-reported health: excellent, very good, good, fair, and poor, via the question "Would you say health in general is excellent, very good, good, fair, or poor?" From this information, five dummy variables are created (see Table 1 for the distribution of these variables).
Household characteristics such as the presence/age of children, and marital status have been linked to sickness absence (Bratberg and Naz 2009;Mastekaasa 2000;Simonsen and Skipper 2012). For instance, Mastekaasa (2000) argues that women are more often absent from work due to sickness because they are exposed to the 'double burden' of combining paid work with family obligations, particularly for married women. Thus, the presence of children and the marital status of the couple may condition the number of days of sickness absence. For the number of children, we follow Campaña et al. (2016) and consider the number of children in two age-groups, children under 7 years, and children between 7 and 17 years (inclusive). For the two age groups, we create dummy variables that take value "1" if there are one or more children in the household at this age range, and value "0" otherwise.
Regarding marital status, we create a dummy variable that takes value "1" if the corresponding individual lives with a (married or unmarried) partner, and "0" otherwise.
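The construction of the dummy variables described above (five health-status indicators, two child age-group indicators, and a partner indicator) can be sketched as follows. The function name, argument names, and output keys are hypothetical and chosen only for illustration.

```python
HEALTH_LEVELS = ("excellent", "very good", "good", "fair", "poor")


def build_controls(health, child_ages, has_partner):
    """Return the 0/1 control variables for one individual in one wave."""
    # One dummy per self-reported health level (exactly one equals 1).
    controls = {
        f"health_{level.replace(' ', '_')}": int(health == level)
        for level in HEALTH_LEVELS
    }
    # Presence of at least one child in each age group.
    controls["child_under_7"] = int(any(age < 7 for age in child_ages))
    controls["child_7_17"] = int(any(7 <= age <= 17 for age in child_ages))
    # Lives with a married or unmarried partner.
    controls["partner"] = int(has_partner)
    return controls


example = build_controls("very good", child_ages=[3, 10], has_partner=True)
print(example)
```

Note that, in a regression, one of the five health dummies would have to be omitted as the reference category; which one is used is not stated in the text.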

Econometric analysis
We exploit the panel structure of the data, linking changes in commuting time to changes in sickness absence of workers, controlling for a set of socio-demographic characteristics. We estimate a linear model with the Fixed Effects estimator, which allows us to examine the relationship between commuting and sickness absence, net of unobserved individual heterogeneity. There may be unobserved factors at the individual level related to both commuting time and sickness absence, introducing a source of endogeneity in the relationship. The Fixed Effects estimator allows us to control for the time-invariant unobserved factors that affect both variables, but we cannot control for the time-variant unobserved heterogeneity of individuals (i.e., factors related to both sickness absence and commuting time that vary over time).
We estimate the following linear Fixed Effects (FE) model:
$$Y_{it} = \alpha_i + \beta_C C_{it} + X_{it}'\beta_X + u_{it} \qquad (1)$$
where $\alpha_i$ represents the unobserved time-invariant effect of individual "i" (time-invariant unobserved heterogeneity), $Y_{it}$ represents the days of sickness absence for individual "i" in wave "t" (t = 2011, 2013, 2015, 2017), $C_{it}$ is the daily minutes of commuting of individual "i" in wave "t", $X_{it}$ represents a vector of the socio-demographic characteristics of individual "i" in wave "t", and $u_{it}$ is the error term. Further, as men and women tend to show different behaviors in their time-allocation decisions, we estimate Eq. (1) separately for male and female workers (Sevilla 2011, 2012).
The coefficient of interest is $\beta_C$, which represents the estimated relationship between commuting time and sickness absence. We expect $\beta_C$ to be positive: more commuting increases sickness absenteeism for workers. We transform commuting time and days of sickness absence to their log form so that $\beta_C$ can be interpreted in terms of elasticity, that is, the percent change in y (the dependent variable), when x (the independent variable) increases by 1%. Figures 1 and 2 show k-density functions of the log of commuting time, and the log of sickness absenteeism, respectively. We observe that the distribution of both commuting and absenteeism is very similar between men and women, with a peak at zero (non-commuters and non-absenters, respectively). Furthermore, the distribution of commuting time shows an inverted U-shaped distribution, that resembles the shape of a normal distribution, whereas the distribution of sickness absenteeism seems to be more concentrated at low values.
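To illustrate how a Fixed Effects estimator nets out time-invariant individual heterogeneity, the following sketch computes the within (demeaned) slope for a single regressor on a toy panel. It is a simplified illustration under stated assumptions: there are no additional controls, the data are simulated without noise, and the names `within_slope`, `TRUE_BETA`, `alphas`, and `commutes` are hypothetical.

```python
import math
from collections import defaultdict


def within_slope(panel):
    """Within (Fixed Effects) slope for one regressor.

    panel: iterable of (individual_id, x, y) tuples.
    """
    groups = defaultdict(list)
    for pid, x, y in panel:
        groups[pid].append((x, y))
    sxy = sxx = 0.0
    for obs in groups.values():
        # Subtract each individual's own means, which removes alpha_i.
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    return sxy / sxx


# Toy panel: log(sick days) = alpha_i + TRUE_BETA * log(1 + commute minutes).
TRUE_BETA = 0.02
alphas = {1: 0.1, 2: 0.9}                       # individual fixed effects
commutes = {1: [20, 30, 45], 2: [60, 80, 90]}   # minutes per day, by wave
panel = []
for pid, minutes in commutes.items():
    for m in minutes:
        x = math.log(1 + m)
        panel.append((pid, x, alphas[pid] + TRUE_BETA * x))

print(round(within_slope(panel), 6))
```

Because the individual effects are removed by demeaning, the within slope recovers the common coefficient even though the two individuals differ in both their average commute and their average absence.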
The self-reported health status of workers is also included as a control in Eq. (1), as it potentially isolates the effect of possible health shocks on sickness absenteeism from the effects of commuting (Leigh 1991; van Ommeren and Gutierrez-i-Puigarnau 2011). However, commuting time is also related to health outcomes (Dolan et al.; Roberts et al. 2011), so including self-reported health could take away some of the relationship between commuting time and sickness absence (Angrist and Pischke 2014). We include self-reported health in our models, although its exclusion does not change our main conclusions.
We also include in Eq. (1) a year trend, in order to control for changes in sickness absence over time, and occupation fixed effects, as there may be cross-individual differences in the nature of jobs, and thus differences in sickness absence rates. By controlling for occupation, we partially isolate the effect of commuting from the effect of differences in the nature of the job. We finally include state fixed effects in order to control for changes in sickness absence due to cross-state differences in, for example, weather conditions, that may affect both sickness absence and commuting behavior. However, we cannot include worker characteristics that are time-invariant, such as the ethnic status of workers (Baker and Pocock 1982; Leigh 1991), since the potential correlation between these attributes and sickness absenteeism is already captured by the individual fixed effect term.

Results
Columns (1) and (2) of Table 2 show the results of estimating Eq. (1) for the (log of) days of sickness absence for men and women, respectively, when health controls, occupation, and state fixed effects are not included in the model. We observe a positive and statistically significant association between commuting time and the annual days of sickness absence for both male and female workers, with the coefficients being statistically significant at the 99% level of confidence. Specifically, a 1% increase in the daily commuting of workers is associated with an increase of 0.018 and 0.027% in the days of sickness absence per year of male and female workers, respectively.
Columns (3) and (4) show similar estimates when health controls are included in the model. Health controls may take away part of the correlation between commuting time and sickness absenteeism, while one empirical concern related to estimates in Columns (1) and (2) is the endogeneity of commuting times, due to missing variables. Several authors have highlighted the importance of controlling for health status, in order to separate the effect of potential health shocks on sick-day absence from the effects of commuting (Leigh 1991; van Ommeren and Gutierrez-i-Puigarnau 2011). 9 This inclusion does not change the coefficients of interest, either qualitatively or quantitatively. Thus, the correlation between commuting time on the one hand, and worker sickness absenteeism on the other hand, seems not to depend on worker health status. This suggests that, according to the conceptual framework, the correlation between commuting time and sickness absenteeism is partially driven by strategic behaviors, and not by decreased health alone. Columns (5) and (6) show similar estimates when occupation and state fixed effects are included in the estimating equation. The results remain unchanged, suggesting that estimates are robust to these worker characteristics.
Empirica (2022) 49:691-719
Table 2 Fixed Effects estimates. Robust standard errors clustered at the household level in parentheses. The sample (PSID 2011-2017) is restricted to workers who report positive hours of market work. Self-employed workers are excluded. The dependent variable is the log of sick-day absences. Commuting time is measured in log-of-minutes per day. *** Significance at the 1% level, ** significance at the 5% level, * significance at the 10% level
Regarding the remaining explanatory variables, in general, very few sociodemographic characteristics are related to the days of sickness absence at statistically significant levels. This could be because estimates include individual fixed effects and thus are net of personal time-invariant unobserved heterogeneity. The hours worked per week display an inverted U-shaped relationship with the days of sickness absence for males, but a positive linear correlation for females, both being significant at the 95% level. Living as a couple is also correlated with decreased absenteeism at the 95% level, but only for males, while males (but not females) in wealthier households display increased absenteeism. The squared log of household income is negative in the males' equation, but significant only at the 90% level. Furthermore, males who have changed their job, relative to the previous wave, show fewer days of sickness absenteeism than those workers who have not changed their job. The corresponding coefficient for females is not statistically significant at standard levels.
Another empirical concern in this analysis is the endogeneity of the commuting variable due to simultaneity. In comparison with healthy workers, those with poor health may have a comparatively longer commute as they need to be close to a hospital, and at the same time they are more likely to be absent from work, which may explain the positive relationship between commuting time and sickness absence. To partially deal with this issue, we now select workers who report having a fair, good, very good, and excellent health status, and exclude from the analysis those individuals who report having poor health.
Columns (1) and (2) of Table 3 show the results of estimating Eq. (1) when we exclude individuals with poor health in the first wave (i.e., 2011), for males and females respectively. We can probably assume that the choice of residence location for these workers is not motivated by health problems but by other reasons, such as available facilities in the area of residence, the income level of the area, or the need to bargain over partner's commuting times, and so any variation in commuting time during the period is not motivated by changes in individual health status. In comparison to the results shown in Table 2, we observe that these results are robust, as the coefficients remain unchanged. When workers are chosen by considering health status during the three previous waves of the sample, and we exclude workers reporting poor health in any of those three previous waves (2005, 2007 or 2009), results (shown in Columns (3) and (4) of Table 3) are also in line with our main estimates.
Table 4 shows a battery of results aimed at testing whether our main conclusions are robust to the choice of the econometric model and sample selection issues. Columns (1) and (2) show estimates of Eq. (1) where we omit the log-transformation of the variables of interest and, instead, compute the inverse hyperbolic sine of sickness absence and commuting time (which is defined at zero, so no rescaling is needed for zero absence and zero commuting). Columns (3) and (4) show the results when we replace the log of the variables of interest plus the unity, by the log of the variables plus 0.1, to analyze whether the ad-hoc addition of unity for the logarithms to be correctly defined is affecting the main estimates. We then estimate Eq. (1) on a reduced sample by eliminating those individuals with more than 15 days of sickness absence (0.24% of the sample), and those with more than 120 min of commuting (2.30% of the sample), to minimize the effect of atypical workers (Columns (5) and (6)). Finally, we estimate the model including self-employed workers (Columns (7) and (8)). All these results are consistent with the results shown in Tables 2 and 3, indicating that our results are robust to sample selection, atypical workers, and missing variables issues.
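The three transformations compared in Table 4 can be illustrated numerically. The sketch below is only an illustration with hypothetical function names: it evaluates log(y + 1), log(y + 0.1), and the inverse hyperbolic sine at a few values of sick days, and also computes the derivative 1/(y + c) of log(y + c), which indicates how sensitive each log transformation is near the sample mean of about 0.7 days.

```python
import math


def log_shift(value, shift):
    """log(value + shift); the main estimates use shift = 1, the check uses 0.1."""
    return math.log(value + shift)


def local_slope(value, shift):
    """Derivative of log(value + shift) with respect to value."""
    return 1.0 / (value + shift)


# Compare the transformations at a few values of annual sick days.
for days in (0, 1, 5, 15):
    print(
        days,
        round(log_shift(days, 1.0), 3),
        round(log_shift(days, 0.1), 3),
        round(math.asinh(days), 3),
    )

# Sensitivity of each log transformation near the sample mean (about 0.7 days).
mean_days = 0.7
print(round(local_slope(mean_days, 1.0), 3), round(local_slope(mean_days, 0.1), 3))
```

Both log(y + 1) and the inverse hyperbolic sine map zero to zero, whereas log(y + 0.1) maps zero to a negative number and is steeper at low values of y, so point estimates obtained under the two log transformations need not coincide in magnitude.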

Other robustness checks
As an additional robustness test, we restrict the sample to workers who do not change their job or residential location (same job and residence), following van Ommeren and Gutierrez-i-Puigarnau (2011). This strategy partially eliminates a potential source of endogenous variation in workers' commuting time that is due to changes in job or location. We estimate the correlation between the days of sickness absence (not in logs) and the log of commuting time, using a count data model. Specifically, we estimate a Poisson Fixed Effects regression with robust covariance matrix (Wooldridge 1999), a count model applicable when dependent variables are count variables, as is the case of the annual days of sickness absence. The Poisson Fixed Effects model includes the dependent variable, not measured in its log form. An alternative to this model would be the use of a negative binomial model. Conditional and unconditional fixed-effect negative binomial estimates are used in van Ommeren and Gutierrez-i-Puigarnau (2011) to determine that conditional models show minimal downward biases. However, conditional fixed-effects negative binomial models have been criticized, and unconditional negative binomial models tend to underestimate the error terms (Allison and Waterman 2002; Greene 2007; van Ommeren and Gutierrez-i-Puigarnau 2011). Results for this set of estimates are robust to results shown in Tables 2 and 3, and are shown in Table A1 in the Appendix.
An important point here is that, although the robustness checks provide qualitatively robust estimates, some quantitative differences emerge. Specifically, when we replace the log of the variables of interest plus the unity, with the log of the variables plus 0.1, the point estimates of the slopes of interest (i.e., between the log of commuting time and the log of the days of absenteeism) change. A potential explanation for this quantitative difference is that the days of sickness absenteeism in the sample are low, with an average of about 0.7 days. Thus, by adding 1 we always obtain values greater than or equal to 1, whereas by adding 0.1 we still obtain values lower than 1 on average. Since the slope of the logarithm function is steeper between 0 and 1 (when the log takes values from minus infinity to 0), that explains the quantitative difference that emerges between the two different log transformations proposed. On the other hand, point estimates of the Poisson fixed effects regression are also quantitatively larger, relative to the main estimates. This is likely due to the different model estimated, as Poisson fixed effects regressions include the dependent variable not in logs, and the interpretation of point estimates is different. Indeed, Poisson fixed effects regressions provide estimates of the relationship between commuting time and sickness absenteeism, closer to van Ommeren and Gutierrez-i-Puigarnau (2011), even though those authors do not analyze commuting time but commuting distance, and studied Germany, not the US.
Table 3 Estimates for individuals with good health. Robust standard errors clustered at the household level in parentheses. The sample (PSID 2011-2017) is restricted to workers who report positive hours of market work. Self-employed workers are excluded. Individuals who report health status "poor" are excluded, and individuals who report health status "fair", "good", "very good" or "excellent" are retained. The dependent variable is the log of sick-day absences. Commuting time is measured in log-of-minutes per day. Additional coefficients are available upon request. *** Significance at the 1% level, ** significance at the 5% level, * significance at the 10% level
Table 4 Robustness checks. Robust standard errors clustered at the household level in parentheses. The sample (PSID 2011-2017) is restricted to workers who report positive hours of market work. Self-employed workers are excluded. Columns (5) and (6) are restricted to individuals who report less than 15 sick-day absences, and less than 120 min of commuting per day. Columns (7) and (8) include self-employed workers. The dependent variable is the log of sick-day absence. Additional coefficients are available upon request. *** Significance at the 1% level, ** significance at the 5% level, * significance at the 10% level
Table 5 Results for gender differences. Robust standard errors clustered at the household level in parentheses. The sample (PSID 2011-2017) is restricted to workers who report positive hours of market work. Self-employed workers are excluded. Columns (3) and (4) are restricted to workers who report positive commuting time. The dependent variable is the log of sick-day absences. Commuting time is measured in log-of-minutes per day. Being a commuter takes value 1 if the individual reports positive commuting, 0 if the individual reports zero commuting. Additional coefficients are available upon request. *** Significance at the 1% level, ** significance at the 5% level, * significance at the 10% level
Tables 2 and 3 show possible quantitative gender differences in the magnitude of the relationship between commuting and sickness absence, although not at standard levels of statistical significance (p = 0.203). In a sensitivity analysis, van Ommeren and Gutierrez-i-Puigarnau (2011) find the relationship between commuting distance and sickness absence is stronger for men, but still significant for women. Thus, we now explore whether there are gender differences in the relationship between commuting and sickness absence.

Differences by gender
To that end, we explore the extensive (participation) and intensive (amount) margins of commuting as predictors of sickness absence. We first focus on the extensive margin of commuting. Columns (1) and (2) of Table 5 show the estimation of a Fixed Effects linear probability model, where the explanatory variable is a dichotomous variable that measures whether workers report positive commuting time (1) or not (0), for men and women, respectively. Results show that, while the coefficient for participation in commuting is significant at the 95% level for female workers, it is not statistically significant for men. Thus, being a commuter is associated with an increase in the days of sickness absenteeism of 12.2% for female workers.
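As a rough illustration of how a fixed effects estimate of the extensive margin is identified, the sketch below applies the within transformation to a toy panel with a single regressor. The data, the variable names, and the function `within_slope` are hypothetical, and the sketch omits the control variables and clustered standard errors used in the paper; it is not the authors' code.

```python
from collections import defaultdict

def within_slope(panel):
    """Fixed effects (within) slope for one regressor.

    `panel` is a list of (person_id, x, y) tuples. Each variable is
    demeaned by person before the OLS slope is computed, so only
    changes over time within a person identify the estimate.
    """
    by_person = defaultdict(list)
    for person_id, x, y in panel:
        by_person[person_id].append((x, y))

    numerator = 0.0
    denominator = 0.0
    for rows in by_person.values():
        x_mean = sum(x for x, _ in rows) / len(rows)
        y_mean = sum(y for _, y in rows) / len(rows)
        for x, y in rows:
            numerator += (x - x_mean) * (y - y_mean)
            denominator += (x - x_mean) ** 2
    if denominator == 0.0:
        raise ValueError("regressor does not vary within any person")
    return numerator / denominator

# Hypothetical toy panel: (person, commuter dummy, log of sick days plus one).
toy_panel = [
    ("a", 0, 0.00), ("a", 1, 0.10),   # starts commuting
    ("b", 1, 0.50), ("b", 1, 0.50),   # always commutes: no within variation
    ("c", 1, 0.30), ("c", 0, 0.20),   # stops commuting
]
slope = within_slope(toy_panel)
```

In this simplified setting, person b, whose commuting status never changes, contributes nothing to the estimate: the extensive margin is identified only from workers who switch between zero and positive commuting.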
To analyze the intensive margin, we now estimate Eq. (1) excluding zero-commuters, so that only those workers who devote positive time to commuting are analyzed. 10 Restricting the analysis to these workers allows us to analyze the intensive margin. Results for men and women are shown in Columns (3) and (4) of Table 5. For female workers, the relationship between the duration of the commute and absenteeism is not statistically significant. However, it is statistically significant for men at the 99% level: a 10% increase in commuting time is associated with a 0.31% increase in sickness absence for male commuters.

10 Table A4 in the Appendix shows the rate of workers, by occupation, reporting zero commuting time. The largest percentages are found among men in sales occupations (13.8%) and in construction occupations (11.3%), along with women in installation occupations (17.4%). Nonetheless, since only 21 women in the sample work in installation occupations, this percentage may well be unrepresentative. On the other hand, the occupations with the lowest percentages of zero-commuters are catering and production, where between 1% and 1.5% of workers report zero commuting time. Zero-commuters can be interpreted as teleworkers, who do not have to travel to work. This is important in the current context, given prior evidence showing the importance of teleworkers in the US (Gimenez-Nadal et al. 2020).
This evidence indicates that the extensive margin of commuting is more important for women than for men in relation to absenteeism, while the intensive margin is comparatively more important for men than for women.
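The percentage interpretations above follow from the log-log form of the model. As a hedged sketch, the snippet below compares the usual linear approximation with the exact implied change, using an elasticity of 0.031. That value is an assumption: it is not quoted directly in this section, but it is implied by the reported 0.31% response to a 10% increase under the linear approximation. The function names are illustrative, and the sketch ignores the constant added before taking logs.

```python
def approx_pct_change(elasticity, pct_increase):
    """Linear approximation: percent change in y is elasticity times percent change in x."""
    return elasticity * pct_increase

def exact_pct_change(elasticity, pct_increase):
    """Exact change implied by log(y) = a + elasticity * log(x)."""
    return ((1.0 + pct_increase / 100.0) ** elasticity - 1.0) * 100.0

# Assumed elasticity implied by the text for male commuters (intensive margin).
elasticity_men = 0.031
approx = approx_pct_change(elasticity_men, 10.0)
exact = exact_pct_change(elasticity_men, 10.0)
print(f"approximate: {approx:.3f}%  exact: {exact:.3f}%")
```

For small elasticities such as this one, the exact change is only slightly below the linear approximation, so the two readings lead to the same qualitative conclusion.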

Differences by urban/rural residence
In all previous analyses, we have not taken into account whether individuals reside in rural or urban areas. However, the probability that commuting time varies for workers may differ, depending on where they live, as they may have more difficulty adapting to changes in commuting determinants (e.g., maintenance work on roads and highways, or the development of new public transport systems) or they may be more affected by daily commuting shocks. 11 For example, if the worker lives in a rural area and the road he/she uses to go to work is under maintenance, this worker will not necessarily have an alternative mode of transport (e.g., public transport) and thus his/her commuting time will be affected. On the other hand, if a comparable worker lives in an urban area, he/she may find alternative ways to get to work. Alternatively, those workers living in urban areas may be more affected by unexpected commuting shocks than workers living in rural areas, as traffic congestion or accidents may be more present in urban areas. Thus, whether sickness absence of workers living in urban areas is more or less affected by their commuting behavior, in comparison to those living in rural areas, is unknown a priori. We now analyze the relationship between commuting and sickness absence in terms of the differences between workers in urban or rural areas.

Table 6 Estimates by urban status. Robust standard errors clustered at the household level in parentheses. The sample (PSID 2011-2017) is restricted to workers who report positive hours of market work. Self-employed workers are excluded. The dependent variable is the log of sick-day absences. Commuting time is measured in log-of-minutes per day. Additional coefficients are available upon request. *** Significance at the 1% level, ** significance at the 5% level, * significance at the 10% level
Columns (1) and (2) of Table 6 show the results for male and female workers in urban areas, and Columns (3) and (4) show the results for male and female workers in rural areas. We observe a positive and statistically significant association between commuting time and the annual days of sickness absence for both male and female workers in urban areas, with the coefficients being statistically significant at the 95% level. Specifically, a 10% increase in the daily commute is associated with an increase of 0.16% and 0.26% in the days of sickness absence per year of male and female workers, respectively. In rural areas, the association between commuting time and the annual days of sickness absence is not statistically significant at standard levels for any workers. Thus, the positive relationship between commuting time and annual days of sickness absence seems to be concentrated in urban areas.

Additional results
A different concern regarding the baseline set of results is that we do not control for earnings. As a consequence, the results may suffer from omitted variable bias, since those workers with relatively higher earnings may be less likely to be absent from work (i.e., they face a higher opportunity cost of work time). Those with longer commutes may also have relatively higher earnings, given that they must be compensated for their longer journeys. Thus, we include the annual salary of workers, and its square, to see how our results change. To avoid multicollinearity issues, we transform household income to exclude respondents' annual salary. The annual salary of workers is defined as the labor income of respondents, excluding farm and unincorporated business income, including "wages and salaries, bonuses, overtime, tips, commissions, professional practice or trade, additional job income, and miscellaneous labor income", and is measured in dollars per year. Table A2 in the Appendix shows the main coefficients for the set of estimates under this specification; the main conclusions remain unchanged. Finally, to test the sensitivity of the estimates to the inclusion of health controls, we repeat in Table A3 in the Appendix the set of estimates excluding health controls. The results are again consistent with those of the main specification, which includes worker health controls.

Discussion
Certain limitations of the data may produce results that are downward biased (i.e., the relationship is greater than estimates show). First, the question used to measure commuting clearly characterizes a day without unexpected events. This is important, since commuting stress, which may lead to sickness absence later on, may be caused by unexpected commuting events (when commuting is no longer controllable for the individual). Thus, our explanatory variable, and its changes over time, measures normal or usual commuting, and the true relationship between changes in commuting time and changes in sickness absence may be stronger than estimated. Second, this bias may be larger than expected, as the PSID does not include information on commuting modes. Some commuting modes, such as driving or public transport, may be more subject to unexpected shocks in comparison to other modes, such as walking/cycling or riding a scooter, and thus the relationship between commuting and sickness absence may be concentrated in users of the former commuting modes. 12 It is true that in the US, 92% of commuting is done by private car (91% in rural areas, and 95% in urban areas, according to the American Time Use Survey; Gimenez-Nadal and Molina 2019b) and so our results mostly refer to commuting by car, and are representative of the bulk of the worker population in both urban and rural areas.
Regarding the gender difference in the relationship between commuting time and sickness absence, several explanations can be found. For instance, differences may be due to the type of jobs men and women occupy. Some occupations may be more likely to have a higher proportion of teleworkers, and thus those workers may be less affected by commuting. To the extent that there is occupational sorting by gender (Goldin 2015), this could explain gender differences. Alternatively, household responsibilities may affect how commuting time is related to sickness absence. Women have historically been responsible for the functioning of the household, as they devote more time to household chores, even when they also participate in the labor market. Such differences in the amount of time devoted to household chores have led researchers to formulate the Household Responsibilities Hypothesis (HRH), according to which household responsibilities lead women to choose jobs that are comparatively closer to their residence than men, in order to facilitate the fulfillment of their household responsibilities, especially the care of children (see Gimenez-Nadal and Molina (2016) for a review of the HRH literature). Furthermore, female workers are more likely to use public transport services, rather than driving a car.
The fact that female workers have shorter commutes, and use public transport more often, in comparison to male workers, may make females less subject to unexpected commuting shocks, and thus they have less commuting stress and ultimately less sickness absence. While commuting hours are more significant for male workers, as they do more driving, commuting by female workers may be less affected by unexpected commuting events. The PSID does not include enough data to test this, and so it must be left for future research. The case of urban/rural differences can also be based on unpredictability of commuting shocks, as those living in urban areas may be more subject to such shocks, so the relationship between commuting time and sickness absence is stronger, in comparison to rural workers. Again, this needs to be analyzed with appropriate data.

Conclusions
This paper provides empirical evidence of the relationship between worker commuting time and sickness absence. Using the PSID, we show a positive and significant correlation between commuting and sickness absenteeism, for both men and women, which indicates that workers with longer commutes are more likely to be absent from work due to sickness. Furthermore, the positive relationship between commuting time and the annual days of sickness absence is concentrated in workers in urban areas, and we find gender differences in this relationship. These results are robust when sample selection, missing variables, and health controls are explored.
The results are of interest for several reasons. Sickness absence is costly for firms as it directly affects the labor costs. Thus, firms should consider to what extent reducing the commuting of their workers results in decreases in their costs as a consequence of lower sickness absenteeism, and here telework may be a direct and immediate solution. Firms should promote policies that improve workers' health and general well-being, focusing on targeted support for older workers with more social and family ties in the place of residence. Firms and companies may select those areas where transportation infrastructures and services are better and more developed, so that their workers have shorter commutes and are less affected by commuting shocks. From this point of view, local and regional governments could implement public policies aimed at decreasing commuting times and commuting shocks, via the development of better transport infrastructures, improved public transport services, and more control over traffic congestion (e.g., road pricing at peak hours).

Appendix A
See Tables 7, 8, 9 and 10.

Notes to the appendix tables: Robust standard errors clustered at the household level in parentheses. The sample (PSID 2011-2017) is restricted to workers who report positive hours of market work. Self-employed workers are excluded. The dependent variable is the log of sick-day absences. Commuting time is measured in log-of-minutes per day. Additional coefficients are available upon request. *** Significance at the 1% level, ** significance at the 5% level, * significance at the 10% level