Time Use and Subjective Well-Being in France and the U.S.
- Cite this article as:
- Krueger, A.B., Kahneman, D., Fischler, C. et al. Soc Indic Res (2009) 93: 7. doi:10.1007/s11205-008-9415-4
Social scientists and policymakers have long been interested in comparing the subjective well-being (SWB) of populations over time and across countries, although SWB is hard to define and measure. Nevertheless, attempts have been made to rank countries based on SWB (e.g., Veenhoven 1996; OECD 2005). Cross-country data have also been used to study the effect on SWB of public policies, economic conditions and institutions (e.g., Alesina et al. 2002; Frey and Stutzer 2002; Blanchflower 2007). The most common measure of SWB in these studies is based on a question that asks respondents about their overall level of life satisfaction or happiness. Other measures of SWB include ecological momentary assessment (EMA; Stone et al. 1999) and the day reconstruction method (DRM; Kahneman et al. 2004). These measures collect individuals’ time use and affective experience over time, either using real-time data collection or diary recall methods. An advantage of such time-based SWB data is that they connect individuals’ reported SWB to actual events that occurred in their lives, but these measures have not been used previously in cross-country studies.
A parallel literature compares individuals’ allocation of time. In these studies, researchers either use external judgment to classify certain activities as enjoyable leisure time (e.g., Aguiar et al. 2007) or assign enjoyment scores to activities based on subjects’ average ratings of the activities in general (e.g., Juster 1985; Robinson and Godbey 1999). One limitation of the former is that researchers’ judgments are used to determine which activities constitute leisure and which constitute home production. Should gardening, for example, be classified as enjoyable leisure or irritating home production? A limitation of the latter approach is that recalled enjoyment ratings of activities in general may diverge from the feelings actually experienced during the activities.
Krueger et al. (2008) propose combining data on time use with affective ratings to produce National Time Accounts (NTA). They define NTA as a set of methods for measuring, comparing and analyzing how people spend and experience their time, whether across countries, over historical time, or between groups of people within a country at a given time. Critically, they collect information on individuals’ self-evaluations of their emotional experiences during various uses of their time, so-called “evaluated time use”. (Gershuny and Halpin (1996) and Robinson and Godbey (1999), who analyzed the extent of enjoyment and time use collected together in a time diary, are forerunners to this approach.) Krueger et al. (2008) emphasize a measure of well-being known as the U-index, or the proportion of time that individuals spend in an unpleasant emotional state, to facilitate interpersonal comparisons of SWB. An unpleasant emotional state is defined as a period of time in which the strongest feeling is a negative one.
In this article, we first apply NTA to two cities in France and the United States and ask whether the standard measure of life satisfaction and NTA yield the same conclusion concerning relative subjective well-being. Specifically, we designed a survey to compare overall life satisfaction, time use, and recalled affective experience during episodes of the day for random samples of women in Rennes (Brittany) in France and in Columbus (Ohio) in the United States. These cities were selected to represent “middle America” and “middle France”, although there are obvious limitations when it comes to drawing inferences from two cities. We also present results using time allocation derived from national samples in the United States and France to extend our analysis beyond two cities.
Based on the standard life satisfaction question, we find that Americans report higher levels of life satisfaction than the French in our samples. Yet based on the DRM, we find that the French spend their days in a more positive mood, on average, and spend more of their time in activities that are more enjoyable. Consistent with the pattern of time use in our two cities, the national time-use data also indicate that the French spend relatively more of their time engaged in activities that tend to yield more pleasure than do the Americans, using either the average American woman’s rating of activities or the average French woman’s rating of activities. While our data are not representative of the entire countries that we study, our results illustrate the feasibility of NTA as a methodology for comparing time use and SWB across countries. The observed discrepancies between global reports of well-being and NTA suggest that considerable caution is required in comparing standard life satisfaction data across populations with different cultures. In particular, the Americans seem to be more emphatic when reporting their well-being. This tendency leads Americans to be more likely than the French to report that they are very satisfied with their lives, as well as more likely to report that they are not at all satisfied with their lives. The U-index helps to circumvent this inclination.
Lastly, we illustrate how NTA could be used to make comparisons of well-being within a country over time. Because micro data on well-being and time use are not available from DRM-like instruments over time, we are forced to use historical data on time use across various activities in the U.S. from available surveys. We combine the historical time-use data with ratings on affective experience during various activities collected from a telephone survey version of the DRM that was designed to collect nationally representative data on affective experience and time use in the U.S. The results suggest that American women have gradually shifted their time into activities that are less pleasant emotionally over the last 40 years, while for men there has been remarkable stability in the average emotional experience associated with the pattern of time allocation across activities.
We view NTA as a complement to the National Income Accounts, not a substitute. Like the National Income Accounts, NTA is incomplete, providing only a partial measure of society’s well-being. National time accounting misses people’s general sense of satisfaction or fulfillment with their lives as a whole, apart from moment-to-moment feelings. Still, we argue that evaluated time use provides a valuable indicator of society’s well-being, and the fact that our measure of well-being is connected to time allocation has analytical and policy advantages that are not available from other measures of subjective well-being, such as overall life satisfaction.
1 Method and Data: French-American Comparison
The sample consists of 810 women in Columbus and 820 women in Rennes. Subjects were invited to participate after being contacted by random-digit telephone dialing (RDD) in the spring of 2005, and they were paid approximately $75 for their participation in both countries. The age range spanned 18 to 68, and all participants spoke the country’s dominant language at home. The Columbus sample was older (median age of 44 vs. 39), more likely to be employed (75% vs. 67%) and better educated (an average of 15.2 years of schooling vs. 14.0) than the Rennes sample. In addition, the Rennes sample was more likely to be currently enrolled in school (16% vs. 10%). These differences in demographic characteristics partly reflect different circumstances in the countries (e.g., the employment rate is eight percentage points higher in the U.S. than in France, and average education is 0.9 years higher in the U.S.), and partly reflect idiosyncrasies of our two cities and samples. Because we compare SWB measured with different methods for the same samples, our results should reflect differences in the methods, not demographic differences between the samples.
The DRM protocol described in Kahneman et al. (2004) was followed. Groups of participants were invited for a weekday evening to a central location, where they completed a series of questionnaires contained in separate packets. The first packet included general satisfaction and demographic questions. The life satisfaction question was nearly identical to that in the World Values Survey. The second packet asked respondents to construct a diary of the previous day as a series of episodes, noting the content and the beginning and ending time of each. (About 300 participants in each country were recruited for Mondays to describe a weekend day. Half of them were instructed to describe the preceding Saturday and half the preceding Sunday. Data were not collected pertaining to Fridays.) The average number of episodes described was 13.2 in Columbus and 14.5 in Rennes. Respondents were told that they could keep their time diary so they could feel free to write down their private thoughts.
In the third packet, respondents completed a form for each of the episodes they had previously listed. The form included a list of 22 activities and eight interaction partners, with an instruction to mark all that apply. Respondents who checked multiple activities were requested to indicate the one that “seemed the most important to you at the time” (we call it focal). All of the analyses below refer to focal activities. The form also requested ratings of 10 emotions that were experienced at the time on a scale from 0 (Not at all) to 6 (Very Strongly). We concentrated on the following emotions because, based on multiple translations, it was felt that they represented similar concepts in French and English, and because they capture the dimensions of the emotions circumplex (Russell 1980): ‘Happy’, ‘Tense/stressed’, ‘Depressed/blue’, and ‘Irritated/angry’. The questionnaires were back translated between French and English to guard against different interpretations.
The data were re-weighted by day of week to be representative of a random day. Weekdays together received 5/7 of the weight, and Saturday and Sunday each received 1/7, in the weighted samples. Additional details of the surveys are available online at http://management.ucsd.edu/faculty/directory/schkade/fa-study/. Kahneman et al. (2004) provide some evidence that the DRM yields information on emotional experience that is similar to what is obtained from real-time data capture methods.
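The day-of-week re-weighting can be sketched as follows. This is a minimal illustration, not the study's actual weighting code: the function name `day_weights` and the ten-diary sample are hypothetical, and only the 5/7, 1/7, 1/7 target shares come from the text.

```python
from collections import Counter

# Target share of each day type in a "random day" (shares taken from the text).
TARGET_SHARE = {"weekday": 5 / 7, "saturday": 1 / 7, "sunday": 1 / 7}

def day_weights(day_types):
    """Return one weight per diary so that weighted day-type shares match TARGET_SHARE."""
    counts = Counter(day_types)
    n = len(day_types)
    # Weight = target share divided by the share actually observed in the sample.
    return [TARGET_SHARE[d] / (counts[d] / n) for d in day_types]

# Hypothetical sample: 6 weekday diaries, 2 Saturday diaries, 2 Sunday diaries.
sample = ["weekday"] * 6 + ["saturday"] * 2 + ["sunday"] * 2
weights = day_weights(sample)

# Weighted share of weekday diaries after re-weighting.
weekday_share = sum(w for w, d in zip(weights, sample) if d == "weekday") / sum(weights)
print(round(weekday_share, 4))
```

With these weights, weekend diaries that were over-sampled relative to a random day count for less, and weekday diaries count for more.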
2 Life Satisfaction
Table 1 Distribution of life satisfaction in Columbus and Rennes (response categories run from “Not at all satisfied” and “Not very satisfied” up to “Very satisfied”)
On further inspection, however, Table 1 provides less clear-cut evidence that the Americans’ responses exhibit higher life satisfaction. American respondents are over-represented at both extremes, in both the “very satisfied” and the “not at all satisfied” categories. If the top two categories on the satisfaction scale (very satisfied and satisfied) are combined, the French indicate higher life satisfaction than the Americans: 83% vs. 77%. Thus, it is unclear from these data whether the French are less satisfied or less prone to use the extreme ends of the scales. The propensity to express oneself in extremes can be influenced by culture and social expectations. Cultural and social norms may discourage French women from reporting themselves as very satisfied compared with Americans.
Data collected from the DRM can be used to compute the average U-index for France and the U.S. The U-index measures the percent of time that someone spends in an unpleasant state (Kahneman et al. 2006; Krueger et al. 2008). An unpleasant state is defined as an episode in which the most intense emotion is negative; that is, U equals 1 for an episode if max(negative emotions) > max(positive emotions), and 0 otherwise. The duration-weighted average U-index can then be calculated for people or activities. The U-index has the advantage of depending on within-subject ordinal rankings of emotions during each episode. The U-index helps to overcome situations in which some individuals have a tendency to be more or less emphatic than others, as long as their tendency is consistently applied to positive and negative emotions.
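The definition above can be written out as a short sketch. The episode records and field names (`minutes`, `positive`, `negative`) are hypothetical; only the rule U = 1 when max(negative) > max(positive) and the duration weighting come from the text.

```python
def episode_is_unpleasant(positive, negative):
    """Return 1 if the strongest negative emotion exceeds the strongest positive one."""
    return 1 if max(negative) > max(positive) else 0

def u_index(episodes):
    """Duration-weighted share of time spent in unpleasant episodes."""
    total_minutes = sum(e["minutes"] for e in episodes)
    unpleasant_minutes = sum(
        e["minutes"] * episode_is_unpleasant(e["positive"], e["negative"])
        for e in episodes
    )
    return unpleasant_minutes / total_minutes

# Hypothetical diary for one respondent (ratings on the 0-6 scale).
episodes = [
    {"minutes": 60, "positive": [4], "negative": [1, 0, 2]},
    {"minutes": 30, "positive": [2], "negative": [5, 1, 0]},
    {"minutes": 90, "positive": [3], "negative": [3, 0, 0]},  # tie: not counted as unpleasant
]

print(u_index(episodes))  # 30 of 180 minutes are unpleasant
```

Note that a tie between the strongest positive and the strongest negative emotion yields U = 0 under the strict inequality in the definition.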
If the French tend to rank the intensity of positive and negative emotions in a consistent way but nonetheless refrain from using the upper reaches of the scales in reporting their emotions, then the U-index will be unaffected by a tendency for the Americans to be more emphatic than the French. To take an extreme example, suppose the French only use the 1–5 part of the 0–6 scale, while the Americans utilize the full scale. Provided that the French use the 1–5 range consistently for reporting positive and negative emotions—i.e., an emotion reported as a 5 is always experienced more intensively than an emotion reported as a 4—then the U-index is unaffected by this differential use of scales. (One nontrivial caveat to this conclusion, however, is that using a scale with a limited number of integers may compress fine distinctions and increase the number of episodes in which the maximum of the positive emotions equals the maximum of the negative emotions.) As commonly applied, the standard life satisfaction measure is not robust to such reporting differences across people.
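Both the invariance claim and the caveat about ties can be illustrated with a small sketch. The episodes and the `rescale` mapping are hypothetical: `rescale` is one strictly increasing map of the 0 to 6 scale into the 1 to 5 range, and rounding it to integers stands in for a respondent who can only report whole numbers.

```python
def u_index(episodes):
    """Duration-weighted U-index for (minutes, positive ratings, negative ratings) tuples."""
    total = sum(m for m, _, _ in episodes)
    bad = sum(m for m, pos, neg in episodes if max(neg) > max(pos))
    return bad / total

def rescale(x):
    """Strictly increasing map from the 0-6 scale into the 1-5 range (non-integers allowed)."""
    return 1 + 4 * x / 6

def transform(f, episodes):
    """Apply the reporting function f to every emotion rating."""
    return [(m, [f(p) for p in pos], [f(n) for n in neg]) for m, pos, neg in episodes]

# Hypothetical episodes rated on the full 0-6 scale.
episodes = [(60, [4], [1, 0, 2]), (30, [2], [5, 1, 0]), (45, [4], [5, 2, 1])]

full = u_index(episodes)
compressed = u_index(transform(rescale, episodes))                     # order preserved
rounded = u_index(transform(lambda x: round(rescale(x)), episodes))    # integer-only reports

print(full, compressed, rounded)
```

In this example the strictly increasing rescaling leaves the U-index unchanged, while rounding to integers turns the third episode into a tie (4 vs. 4) and so lowers the measured U-index, which is the caveat noted in the parenthetical above.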
4 Comparing SWB with the DRM
Table 2 U-index for various groups in Columbus and Rennes (breakdowns include day of week)
We explored whether the lower U-index for the French is a result of any single negative emotion, or combinations of them. The lower U-index for the French appears to be a fairly robust result. If we required that at least two negative feelings were rated more strongly than happy, for example, the U-index was still about three points lower in France than in the U.S. (7.4% vs. 10.1%). And if we dropped any one of the negative emotions and compared the remaining two to happy, the U-index was lower in France than in the U.S. in each case. These results suggest that the lower U-index in France is not due to the rating of any particular negative emotion.
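The alternative definitions used in these robustness checks can be sketched as variations on the episode-level rule. The emotion keys and the three episodes are hypothetical; the rules themselves (at least two negative feelings rated above happy, or dropping one negative emotion) follow the description above.

```python
NEGATIVE = ("tense", "depressed", "irritated")

def u_baseline(ep):
    """Unpleasant if the strongest negative emotion exceeds happy."""
    return int(max(ep[n] for n in NEGATIVE) > ep["happy"])

def u_two_negatives(ep):
    """Unpleasant only if at least two negative emotions exceed happy."""
    return int(sum(ep[n] > ep["happy"] for n in NEGATIVE) >= 2)

def u_drop_one(ep, dropped):
    """Unpleasant if the strongest of the remaining two negative emotions exceeds happy."""
    return int(max(ep[n] for n in NEGATIVE if n != dropped) > ep["happy"])

def weighted_mean(episodes, rule):
    """Duration-weighted average of an episode-level rule."""
    total = sum(ep["minutes"] for ep in episodes)
    return sum(ep["minutes"] * rule(ep) for ep in episodes) / total

# Hypothetical episodes.
episodes = [
    {"minutes": 60, "happy": 4, "tense": 1, "depressed": 0, "irritated": 2},
    {"minutes": 30, "happy": 2, "tense": 5, "depressed": 1, "irritated": 3},
    {"minutes": 30, "happy": 3, "tense": 4, "depressed": 0, "irritated": 1},
]

baseline = weighted_mean(episodes, u_baseline)
two_neg = weighted_mean(episodes, u_two_negatives)
no_tense = weighted_mean(episodes, lambda ep: u_drop_one(ep, "tense"))
print(baseline, two_neg, no_tense)
```

Comparing such variants across the two samples is one way to check whether a gap is driven by a single negative emotion.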
Table 2 also provides breakdowns of the U-index for various subpopulations. The general pattern is plausible. The U-index in both countries is considerably lower on weekends than on weekdays. The French-American gap is largest for non-students, employed people, low-income people and during the week. Interestingly, in both countries—but especially in the U.S.—the U-index of the unemployed is much higher during the week than it is during weekends. This pattern suggests that observing or reflecting on others going to work during the week worsens the mood of the unemployed during weekdays.
Another issue concerns vacations. In our sample, the French report taking 21 more vacation days a year than the Americans. We were not able to interview people if they were away from home, so we did not sample most vacation days. Accounting for vacations would almost certainly further lower the U-index in France relative to the U.S., as vacation days are likely to have a lower U-index than non-vacation days. The following back-of-the-envelope calculation suggests, however, that this is not a large bias. The 21-day difference in vacations amounts to only 5.8% of the year. If the U-index is 10 points lower on vacation days than on non-vacation days, which is almost double the difference between weekdays and weekends, then the French U-index would be an additional 0.58 percentage points lower than the American U-index.
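The back-of-the-envelope calculation can be reproduced directly. The variable names are illustrative; the 21-day difference and the assumed 10-point gap come from the text, and a 365-day year is assumed.

```python
extra_vacation_days = 21          # French minus American vacation days (from the text)
days_per_year = 365               # assumption: non-leap year
assumed_gap_points = 10.0         # assumed U-index reduction on vacation days, in percentage points

# Share of the year covered by the additional vacation days.
share_of_year = extra_vacation_days / days_per_year

# Implied additional reduction in the French U-index, in percentage points.
adjustment_points = share_of_year * assumed_gap_points

print(round(share_of_year * 100, 1), round(adjustment_points, 2))
```

The adjustment scales linearly with the assumed gap, so even a gap twice as large would move the French U-index by only a little over one percentage point.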
5 Counterfactual Cross-Country Comparisons
Table 3 The U-index and allocation of time across activities (columns: U-index per activity; percent of time)
Both the pattern of time allocation and the U-index by activities are similar in the two countries, with correlations of 0.93 and 0.85 across activities, respectively. The most notable exceptions to this pattern are that the Americans report childcare episodes as substantially more unpleasant than do the French, and the French spend less time engaged in childcare and more time eating. The latter is explained mainly by the fact that Americans are much less likely to indicate eating as their main activity when they engage in multiple activities that include eating. It is also worth noting that the French women in our sample are slightly less likely to have children living at home (56% vs. 60%).
Synthetic U-index based on country’s aggregate time allocation and country’s U-index by activity (based on data in Table 3)
The results indicate that if the French and American women’s allocation of time is weighted by either the average American woman’s rating of activities or the average French woman’s rating of activities, the average French woman is predicted to have a lower synthetic U-index than the average American woman. These findings suggest that French women allocate their time in a way that may produce relatively fewer unpleasant moments, regardless of which country’s average activity ratings are used. The cross-country differences are not statistically significant, however.
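A minimal sketch of the counterfactual calculation follows, assuming the synthetic U-index is the time-share-weighted average of activity-level U-indexes. All numbers and the three activity labels are hypothetical placeholders, not values from Table 3.

```python
def synthetic_u(time_shares, u_by_activity):
    """Weight each activity's U-index by the share of time spent in that activity."""
    if abs(sum(time_shares.values()) - 1.0) > 1e-9:
        raise ValueError("time shares must sum to 1")
    return sum(share * u_by_activity[a] for a, share in time_shares.items())

# Hypothetical time shares and activity-level U-indexes for each country.
time_alloc = {
    "US": {"work": 0.5, "leisure": 0.4, "eating": 0.1},
    "FR": {"work": 0.4, "leisure": 0.4, "eating": 0.2},
}
ratings = {
    "US": {"work": 0.30, "leisure": 0.10, "eating": 0.05},
    "FR": {"work": 0.25, "leisure": 0.10, "eating": 0.05},
}

# Cross every time allocation with every set of ratings: (time country, ratings country).
table = {
    (t, r): synthetic_u(time_alloc[t], ratings[r])
    for t in time_alloc
    for r in ratings
}

for key, value in sorted(table.items()):
    print(key, round(value, 3))
```

Holding the ratings fixed and switching only the time allocation isolates the contribution of how time is spent; holding the allocation fixed and switching the ratings isolates how activities are experienced.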
6 Combining National Time Use Data with DRM-Based Activity Ratings
One advantage of the synthetic U-index is that it can be computed with national time-use data. This provides a check on whether our results for Rennes and Columbus can be extended to the countries as a whole, and it provides larger samples. Specifically, we analyzed national time use data from the 2003 to 2004 American Time Use Surveys (ATUS) and from the 1998–1999 French “Enquête emploi du temps” [Time-Use Survey] from INSEE. Although the French data are from an earlier period, they are the most recent national data publicly available, and time allocation does not change very rapidly over time. We limited the samples to women from age 18 to 60. Because the activity categories in the data sets are not harmonized, we collapsed the activities in these surveys into six broad categories: work/commuting; compulsory activities; active leisure; passive leisure; eating; and other. The U-index for these categories was computed from the DRM for Rennes and Columbus for the same activities.
National time-use data for the U.S. and France and synthetic U-indexes (columns: fraction of awake time spent in each activity; average U-index per activity)
7 Changes in SWB over Time within the U.S.A.
Ideally, one would like to have nationally representative data collected with the DRM over historical time to compare changes in SWB over time. If such data were available, the U-index and other measures of evaluated time use could be computed each year. Changes over time could be decomposed into changes in activities, changes in affective ratings for the same set of activities, changes in the environmental characteristics of activities (e.g., do they involve more or less social interactions), changes in demographics, and so on, using the types of statistical techniques that we have used for the French-American comparison with the DRM data. Unfortunately, such micro data are not currently available over time.
Synthetic U-index based on country’s aggregate time allocation from national time-use data and country’s U-index by activity from DRM
Historical time-use data were drawn from the Yale University Program on Non-Market Accounts, known as the American Heritage Time Use Studies (AHTUS). The AHTUS consists of five time-use surveys conducted from 1965–1966 through 2003, and 2005 data from the ATUS were added as well. (The disparate activity codes were harmonized to a common set of 72 main activities, and the Princeton Affect and Time Survey, or PATS, was harmonized to the same activities.) The PATS yields information on time use and ratings of affective experience (happy, sad, angry, pain, etc.). In one approach, cluster analysis was applied to the PATS to group activities into six categories that are associated with similar affective experiences (see Krueger 2007). Time use in these six categories was then tracked over time. This exercise reveals that women have shifted toward affectively neutral activities, such as watching television, and away from engaging leisure and social activities and mundane household chores over the last 40 years. For men, the share of the day devoted to work-like activities has declined, while it has increased for women.
Time use and affective experience can profitably be combined to produce indicators of subjective well-being. We illustrated this approach with the U-index using new survey data from two cities and with historical data for the U.S. Our results suggest that caution is required in comparing standard life satisfaction data across countries. Cultural differences appear to raise the likelihood that Americans report their feelings using the extremes of response scales, skewing comparisons of SWB. Data on positive and negative emotional experiences, together with time allocation, offer a means to overcome some reporting differences in SWB.
The time-use and affect data collected in an instrument like the DRM can be used to compute other measures of experienced well-being. Krueger and Stone (2008), for example, use the PATS data to study the occurrence of pain across activities and demographic groups. We focus on the U-index because policymakers may be particularly concerned about the incidence of unpleasant feelings, just as policymakers are concerned about the incidence of poverty.
Interpreting the U-index (or other measures of affective experience) during specific activities is potentially problematic for three reasons. First, the U-index could change if people allocate more or less time to an activity. Second, it is plausible that those who participate in an activity find it more rewarding than those who choose not to participate in that activity. Third, it is possible that individuals with certain personality traits (e.g., cheerfulness) participate in different activities than those with different personality traits (e.g., a tendency toward depression). All of these considerations pose a challenge for the counterfactual calculations that were performed, and these calculations are best regarded as suggestive. In principle, however, these problems are not insurmountable.
For example, individual fixed effects (encompassing personality differences) could be removed from the activity-level U-indexes. Furthermore, exogenous changes in time allocation can be used to measure changes in affective experience with an instrument like the DRM.
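One simple way to remove individual fixed effects is within-person demeaning, sketched below. This is an illustrative assumption about how such an adjustment could be done, not a procedure reported in the article; the function name and the two-person data set are hypothetical.

```python
from collections import defaultdict

def demeaned_activity_u(episodes):
    """Activity-level U after subtracting each person's own duration-weighted mean U.

    episodes: list of (person, activity, minutes, u) tuples with u in {0, 1}.
    """
    minutes = defaultdict(float)
    unpleasant = defaultdict(float)
    for person, _, m, u in episodes:
        minutes[person] += m
        unpleasant[person] += m * u
    person_mean = {p: unpleasant[p] / minutes[p] for p in minutes}

    act_minutes = defaultdict(float)
    act_deviation = defaultdict(float)
    for person, activity, m, u in episodes:
        act_minutes[activity] += m
        act_deviation[activity] += m * (u - person_mean[person])
    return {a: act_deviation[a] / act_minutes[a] for a in act_minutes}

# Hypothetical data: person A is never in an unpleasant state, person B always is,
# and B spends more time working than A does.
episodes = [
    ("A", "work", 30, 0), ("A", "leisure", 90, 0),
    ("B", "work", 90, 1), ("B", "leisure", 30, 1),
]

adjusted = demeaned_activity_u(episodes)
print(adjusted)
```

In this constructed example the raw U-index is 0.75 for work and 0.25 for leisure, yet the adjusted values are both zero: the raw gap reflects who does each activity, not how the activities feel to a given person.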
Lastly, we should note that EMA and the DRM are costly to implement in representative national samples. The PATS provides a telephone version of the DRM. Future work can apply the telephone survey method in other countries and over historical time, possibly by adding a module to countries’ official time-use surveys. We are also developing a web version of DRM, which can be used for large-scale surveys as well.
Until consistent historical data that combine time use and affective experience are available, researchers will be limited to analyzing trends in evaluated time use at the activity level. Still, data on evaluated time use at a point in time can help to classify activities or to summarize the emotional quality of activities, which facilitates historical time-series comparisons without resorting to external researcher judgments concerning the enjoyment of various activities, as was illustrated here.