Efficiency versus gender roles and stereotypes: an experiment in domestic production

Empirical studies cast doubt on the efficiency assumption made in standard economic models of household behavior. In couples, the allocation of time between activities remains highly differentiated by gender. In this paper we examine whether couples deviate from efficiency in household production, using an experimental design. We compare the allocation of gendered vs. gender-neutral domestic tasks. Our results show that women in the household overspecialize in “feminine tasks” and men in “masculine tasks” compared to what their comparative advantage would require, hence revealing the influence of gender roles and stereotypes on the couples’ behavior.



    Except in some agricultural households in developing countries.

    In a recent paper, Auspurg et al. (2017) use a vignette-based experimental design to investigate whether men and women have different preferences about the allocation of housework within couples. They find remarkably little evidence of any systematic gender differences in preferences, and a general inclination towards an equal distribution of housework. They conclude that the reasons for the gendered division of housework do not derive from gender differences in preferences.

    Household models which include social norms have been developed by Lundberg and Pollak (1993), Carter and Katz (1997) and Cudeville and Recoules (2015).

    If stereotyped but false beliefs about partners’ relative productivity in tasks play a role in explaining inefficiencies in couples’ allocation choices, then providing couples with the true information about productivity should reduce these inefficiencies.

    Few other papers in experimental economics have used real-effort work tasks; examples include Fahr and Irlenbusch (2000) and Falk and Ichino (2006), and, in the context of the study of couples’ consumption behavior, Dasgupta and Mani (2015).

    If time spent at home and time spent on paid work can be freely substituted for one another, the market wage rate measures the opportunity cost of an hour at home.

    The experimental design was calibrated to meet these conditions.

    Other measures of distance, such as the Euclidean distance, give similar results.
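As a minimal sketch of the two distance measures mentioned in this note, the following compares the Manhattan and Euclidean distances between an observed allocation and an efficient one. The vectors and function names are hypothetical illustrations; the index DI itself is defined in the main text and is not reproduced here.

```python
import math

def manhattan(actual, efficient):
    """Sum of absolute coordinate-wise gaps between two allocations."""
    return sum(abs(a - e) for a, e in zip(actual, efficient))

def euclidean(actual, efficient):
    """Square root of the sum of squared coordinate-wise gaps."""
    return math.sqrt(sum((a - e) ** 2 for a, e in zip(actual, efficient)))

# Hypothetical example: share of each of two tasks performed by one partner.
actual = [0.75, 0.25]     # observed allocation (assumed values)
efficient = [1.0, 0.0]    # allocation under full specialization (assumed values)

print(manhattan(actual, efficient))
print(euclidean(actual, efficient))
```

Both measures are zero only when the observed allocation coincides with the efficient one, which is why they would be expected to rank couples similarly.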

    In our case, using established couples was necessary since we were interested in observing the impact of social gender norms and stereotypes in the specific context of the family, which is presumably characterized by a high level of communication and cooperation. If gender bias and stereotypes come into play in the intimacy of the family unit, we may expect them to affect all other kinds of relationships in which information sharing is weaker.

    This condition was checked by comparing partners’ answers to individual questionnaires which included questions such as the names of in-laws.

    The web site was maintained by the Parisian Experimental Economics Laboratory (LEEP). A phone number was available to provide assistance if needed.

    Of the 24 experimental sessions, 15 involved four couples, 4 involved three, 4 involved two and only 1 involved one. We planned for each session to include four couples and one experimenter per couple. However, sometimes only two or three couples signed up, and a few who signed up did not show up. The number of experimenters present was then reduced accordingly in order to keep the number of experimenters per couple constant.
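The session breakdown in this note can be checked with a short tally. This is only arithmetic on the counts reported above; the number of couples retained in the analysis may differ (for instance after exclusions), and the variable names are illustrative.

```python
# Sessions by number of couples present, as reported in the note above:
# {couples per session: number of such sessions}
sessions_by_size = {4: 15, 3: 4, 2: 4, 1: 1}

total_sessions = sum(sessions_by_size.values())
total_couples = sum(size * count for size, count in sessions_by_size.items())

print(total_sessions)  # 24
print(total_couples)   # 81
```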

    Questionnaires are available in Appendix I in the ESM.

    Instructions for the experiment are reported in Appendix H in the ESM.

    We further checked in the multivariate analysis (Sect. 4) whether the experimenter’s gender had an impact upon the results.

    The treatment was applied between subjects. A within-subject design was excluded as too time-consuming, since it would have forced couples to perform four different tasks.

    In Europe, for example, Building and Winqvist (2004) compared 10 countries and showed that women’s share of total household time devoted to laundry ranges from 80% in Sweden to 100% in Slovenia. Conversely, “construction and repairs” is a masculine activity: on average in Europe, men account for around 85% of the time devoted to it and women for 15%.

    Indeed, “household management tasks” are among the tasks most equally shared between genders. Among French couples, household management is carried out on average 54% by men and 46% by women (Time-Use Survey 2009–2010, authors’ calculations).

    Of course, since the stopwatch was placed in front of each participant during phase 1, he/she could have kept in mind his/her performance in the tasks and chosen to share this information with his/her partner in phase 2. But in the non-informed condition, they could keep this information private.

    The participants were not allowed to empty the basket onto the table to reconstitute the pairs of socks; they had to work in the basket and could not take the matched socks out of the basket before having folded them into a ball.

    Spouses were unable to identify which share of the money payoff earned was attributable to their individual output or work effort. This greatly limited the issue of “ex post undoing” stressed in Munro (2018), which corresponds to the fact that the spouse who contributed most to generating earnings might have a greater say about how to spend the money.

    A cubicle consisted of three tables configured in a U-shape with space for two subjects to work back to back. This configuration allowed one experimenter to measure the performance of two subjects at a time.

    Imposing a time limit for accomplishing the tasks follows from the theoretical model described above: domestic production here represents a fixed cost. It is only after domestic tasks have been performed that the remaining time can be used for market work and hence paid.

    The second productivity measure performed at phase 3 of experimental sessions was precisely intended to check the stability of individual productivity parameters.

    See condition (2) in Sect. 2.1. No subject in the sample failed to accomplish the tasks in the allotted time.

    An illustrative example is given in Appendix C in the ESM.

    The wage vectors were computed automatically at the end of phase 1, once the individual productivity measures had been collected from the experimenters, using a pre-programmed Excel file. The formulas for this computation, based on the theoretical model, are given in Appendix D in the ESM.

    The rules of joint production for a couple were carefully explained to the participants with illustrative examples. In order to avoid complementarity between spouses’ working times, the participants were not allowed to help each other in the production process of one good (the partners could not work together to produce one pair of matched socks, one “T”, one envelope, or one form). Thus, although allowed to work in parallel on the same task, partners had to work independently and keep their output in front of them.

    There was no reference to market work, domestic work or leisure time in the instructions; only the term “free time” was used.

    Experimenters were provided with a pre-prepared form on which they simply filled in the time, once the “free time” signal was given. The output of each partner was counted and noted after they had both left their seats and at the same time as the material was changed for the next round.

    Differences in the first and second measures of performance are significant at the 5% level, except for the phone task, as shown by the t-test results reported in the penultimate column of Table E2 in Appendix E, ESM.

    While one may wish to measure the distance to efficiency in terms of payoffs (i.e. by the difference between actual and efficient payoffs) instead of the Manhattan distance DI used here, payoffs in fact depend strongly not only on the degree of couple efficiency, but also on both the absolute and the relative productivity of partners. Hence, as a payoff-based measure incorporates many more elements than just efficiency, it is not a satisfactory indicator.

    Note, however, that as participants were informed of the three wage settings from the beginning of phase 2, they also knew that they would have the opportunity to reverse the roles in another session of joint work: this design was used precisely to avoid equity problems.

    This last observation could result from the fact that all the tasks were perceived as “domestic” by participants, hence partners could think it inappropriate that men perform them all.

    Individual preferences for tasks were elicited from questionnaire Q1 (see Fig. 1): regarding gender-neutral tasks, men preferred the phone task and women the envelope task.

    OLS is not well suited to fractional dependent variables. The drawback of linear models in this case is that the predicted values from an OLS regression are not guaranteed to lie within the unit interval.
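The point made in this note can be illustrated with a small sketch on hypothetical data. A linear fit to a share variable produces a fitted value above 1 (about 1.02 at the last observation), whereas a fit through a logit link stays inside the unit interval. The log-odds transform used here is a simple stand-in chosen for brevity, not the estimator used in the paper, and it only applies when shares lie strictly between 0 and 1.

```python
import math

def ols_fit(x, y):
    """Closed-form OLS with one regressor: returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def logit(p):
    """Log-odds of a share strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def inv_logit(z):
    """Inverse of logit; always returns a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical data: a regressor and a share variable bounded in [0, 1].
x = [0.0, 1.0, 2.0, 3.0, 4.0]
share = [0.05, 0.30, 0.60, 0.85, 0.95]

# Linear model fitted directly on the share.
a, b = ols_fit(x, share)
fitted_linear = [a + b * v for v in x]

# Linear model fitted on the log-odds, then mapped back to a share.
c, d = ols_fit(x, [logit(s) for s in share])
fitted_logit = [inv_logit(c + d * v) for v in x]

print(max(fitted_linear) > 1.0)                    # True
print(all(0.0 < p < 1.0 for p in fitted_logit))    # True
```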

    As a robustness check, we added as controls the other socio-economic characteristics “years in couple” and “childless”. They are not significant and we found similar results.

    Answers were respectively coded from 1 (very unpleasant) to 4 (very pleasant).

    DifChoice is defined from the answers to question 1. It takes the value 1 when the woman prefers the task for which, on average, women have a comparative advantage (Socks or Envelopes) while her partner prefers the other task (Brackets or Phones); \(-1\) in the opposite case; and 0 if both partners prefer the same task.
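The coding rule for DifChoice can be sketched as follows. The function name, the string labels for tasks and the set constant are hypothetical; the sketch assumes each partner names exactly one of the two tasks of the session as preferred.

```python
# Tasks for which, on average, women have a comparative advantage (per the note above).
WOMEN_ADVANTAGE_TASKS = {"Socks", "Envelopes"}

def dif_choice(woman_pref, man_pref):
    """Return 1, -1 or 0 following the DifChoice definition.

    Assumes both preferences refer to the two tasks of the same session,
    so differing preferences always involve one task from each group.
    """
    if woman_pref == man_pref:
        return 0
    if woman_pref in WOMEN_ADVANTAGE_TASKS:
        return 1
    return -1

print(dif_choice("Socks", "Brackets"))   # 1
print(dif_choice("Phones", "Envelopes")) # -1
print(dif_choice("Phones", "Phones"))    # 0
```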

    DifMWTi for \(i=(1,2)\) is defined as the difference between male and female answers to question 2, which are coded for each task from 1 (very unpleasant) to 4 (very pleasant). DifMWTi therefore ranges from \(-\,3\) to 3 and is positive if the man likes task i more than the woman does, and negative if he likes it less. The higher the DifMWTi, the more the man enjoyed performing Task i as compared to his partner.
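A minimal sketch of this variable, taking the definition as the male answer minus the female answer on the 1 to 4 scale. The function name and the validation step are illustrative assumptions.

```python
def dif_mwt(man_answer, woman_answer):
    """Male minus female rating of a task, each coded 1 (very unpleasant) to 4 (very pleasant)."""
    for answer in (man_answer, woman_answer):
        if answer not in (1, 2, 3, 4):
            raise ValueError("answers must be integers from 1 to 4")
    return man_answer - woman_answer

print(dif_mwt(4, 1))  # 3: the man likes the task much more than the woman
print(dif_mwt(2, 2))  # 0: both rate the task equally
print(dif_mwt(1, 4))  # -3: the man likes the task much less than the woman
```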

    Note that, when differences in preferences are controlled for, the spouses’ average years of study significantly increase the relative frequency of efficient choices. This result supports the idea that full Pareto efficiency may fail to be realized due to bounded rationality.

    Violating the prescriptions evokes anxiety and discomfort in oneself and in others.

    Note that, generally, men tended to have a better opinion of their own performance in tasks than that expressed by their partner, except for the bracket task, where women strongly overestimated men’s productivity. Moreover, women rightly tended to have a better opinion of their own performance than of that of their partner, again except for brackets. Finally, men overestimated their own performance more systematically and more substantially than women did (except in the sock task). This is consistent with the experimental literature on the “overconfidence” of men (see, among others, Niederle and Vesterlund 2007). Interestingly, women tended to adhere more strongly than their companions to gendered stereotypes.

    The variable “False belief” equals 1 if one and/or the other partner has beliefs about their relative performances on tasks which are inconsistent with their true comparative advantage.
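One way to read this definition is sketched below. The coding is hypothetical: productivities are taken as positive outputs per unit of time, and each partner's belief is reduced to a boolean stating whether he or she thinks the woman holds the comparative advantage in task 1. The paper's actual construction from questionnaire answers may differ.

```python
def woman_has_ca_in_task1(w1, w2, m1, m2):
    """True if the woman's productivity in task 1 relative to task 2 exceeds the man's.

    w1, w2: woman's productivity in tasks 1 and 2 (assumed positive).
    m1, m2: man's productivity in tasks 1 and 2 (assumed positive).
    """
    return w1 / w2 > m1 / m2

def false_belief(true_productivities, woman_belief, man_belief):
    """Return 1 if at least one partner's belief contradicts the true comparative advantage."""
    truth = woman_has_ca_in_task1(*true_productivities)
    return int(woman_belief != truth or man_belief != truth)

# Hypothetical couple: the woman is relatively better at task 1 (10/5 > 6/6).
productivities = (10.0, 5.0, 6.0, 6.0)
print(false_belief(productivities, True, True))    # 0: both beliefs are correct
print(false_belief(productivities, True, False))   # 1: the man's belief is wrong
```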

    We clearly noticed during the experimental sessions that partners exchanged more and computed more during the free discussion time in neutral sessions than in gendered sessions, during which the bargaining time was much shorter.

    Note that in the experiment, couples were isolated from each other so that no social pressure was exerted on their choices. In this context they modify their decisions following a revision of their beliefs, but this might not be the case in a context where their decisions are publicly observable. This issue is beyond the scope of this paper but deserves further research.


We thank P. Apps, M. Boltz, P-A Chiappori, the participants of the June 2014 Workshop “Economics of Gender” in Nice, as well as the anonymous referees and the editor, who kindly helped us to greatly improve the present paper. We also thank all the experimenters who helped us in running this experiment. Financial support from the French National Research Agency (ANR “Ginhdila”) is gratefully acknowledged. All errors remain ours.

