Treatments and groups
Three treatments were applied during the experiment: Treatment Single, Treatment Multi, and Treatment Choice (subjects were randomly allocated to treatments within each session and were unaware of these labels). In Treatment Single, subjects had to work on two tasks consecutively, for 12 minutes each. In Treatment Multi, subjects were forced to switch between the two tasks approximately every four minutes,Footnote 9 resulting in the same total time constraint per task as before. Subjects did not know how many switches would occur and the time intervals between switches varied, making anticipation unlikely. In Treatment Choice, subjects could alternate between the two tasks by pressing a ‘Switch’ button, subject to the same time constraint per task as before (12 minutes each). A timer informed subjects about the remaining time for each task. When the 12 minutes for one task expired, the screen changed automatically to the other task and the Switch button could not be used anymore.
Importantly, this design ensures that the same amount of time is spent on each task in all three treatments. If we tried to mimic simultaneity, for example by splitting the screen, we could not determine how much time subjects spend on each task. We would then not know whether performance differs between treatments because of differences in the amount of time allocated to the two tasks or because of differences in the schedules.
As shown in Table 1, subjects were assigned to three groups. Every subject played two rounds, the first of which was Treatment Single. In the second round, subjects in Group 1 played Treatment Single again, subjects in Group 2 played Treatment Multi, and subjects in Group 3 played Treatment Choice. The subjects knew from the start that there would be two rounds and that they would work on one Sudoku puzzle and one Word Search puzzle in each. The puzzles given in Round 2 were different from the puzzles in Round 1 (but they were the same for all subjects within rounds).
This design allows us to answer all four research questions. Because Group 1 plays Treatment Single twice, we can use a difference-in-differences approach, which corrects for learning effects and for performance drops due to exhaustion or boredom. To examine the effect of forced multitasking on productivity, we can compare the performance difference between Round 1 and Round 2 of Group 2 to the performance difference of Group 1. To examine the effect of a self-chosen work schedule, we can compare the performance difference of Group 3 to the performance differences of the other two groups. If subjects choose the optimal work schedule, we should see that the performance difference of Group 3 is at least as high as the performance difference of the other two groups.Footnote 10
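The comparison described above can be sketched in a few lines of Python. This is only an illustration of the difference-in-differences logic, not the estimation code used for the paper; the function names and all scores are made up for the example.

```python
from statistics import mean


def mean_change(round1, round2):
    """Average within-subject change in score from Round 1 to Round 2."""
    return mean(r2 - r1 for r1, r2 in zip(round1, round2))


def did(treated_r1, treated_r2, control_r1, control_r2):
    """Change in the treated group minus change in the control group."""
    return mean_change(treated_r1, treated_r2) - mean_change(control_r1, control_r2)


# Hypothetical scores (points), one entry per subject; not data from the experiment.
group1_r1 = [30, 40, 50]  # Group 1, Round 1 (Treatment Single)
group1_r2 = [36, 44, 58]  # Group 1, Round 2 (Treatment Single again)
group2_r1 = [32, 41, 47]  # Group 2, Round 1 (Treatment Single)
group2_r2 = [34, 42, 50]  # Group 2, Round 2 (Treatment Multi)

# Group 1's change captures learning and fatigue; subtracting it isolates
# the part of Group 2's change that is attributable to forced multitasking.
effect_multi = did(group2_r1, group2_r2, group1_r1, group1_r2)
print(effect_multi)
```

A negative value in this sketch would mean that the treated group improved less between rounds than the group that played Treatment Single twice.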
To examine gender differences in the effects of multitasking on productivity, we follow a difference-in-difference-in-differences approach. Note that any gender difference in performance cannot be driven by differences in task proficiency, since we compare performance in Round 2 to a subject’s own performance in Round 1. In addition, Group 1 captures any gender differences in learning or exhaustion. For Group 2, any gender difference in performance can therefore only come from differences in the reaction to multitasking. For Group 3, both the reaction to multitasking and the self-chosen degree of multitasking determine the performance difference.
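As a rough illustration of the triple-difference logic, the sketch below computes the treatment effect separately for women and men and then takes the gap between the two. The data layout, the function names, and all numbers are assumptions made for this example; the actual analysis may rely on a regression specification instead.

```python
from statistics import mean


def mean_change(pairs):
    """Average Round 2 minus Round 1 score over (round1, round2) pairs."""
    return mean(r2 - r1 for r1, r2 in pairs)


def triple_difference(data, treated, control):
    """Return (female DiD minus male DiD, dict of DiD by gender).

    data[group][gender] is a list of (round1, round2) score pairs.
    """
    did_by_gender = {}
    for gender in ("female", "male"):
        did_by_gender[gender] = (
            mean_change(data[treated][gender]) - mean_change(data[control][gender])
        )
    return did_by_gender["female"] - did_by_gender["male"], did_by_gender


# Hypothetical scores, not data from the experiment.
scores = {
    "group1": {  # Treatment Single in both rounds
        "female": [(30, 36), (40, 46)],
        "male": [(35, 40), (45, 50)],
    },
    "group2": {  # Treatment Single, then Treatment Multi
        "female": [(30, 33), (42, 45)],
        "male": [(33, 34), (44, 45)],
    },
}

ddd, did_by_gender = triple_difference(scores, treated="group2", control="group1")
print(did_by_gender, ddd)
```

In this made-up example both genders lose from forced multitasking, and the positive triple difference indicates that the loss is smaller for women than for men.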
Finally, to examine whether there is any gender difference in the propensity to multitask, we check whether there is a gender difference in the number of switches in Treatment Choice. The propensity to multitask might vary with proficiency: subjects who perform well might find switching easier or more beneficial. Alternatively, subjects who get stuck more often may want to switch more often. To avoid attributing such effects to gender differences in multitasking, we control for performance in Round 1.
Our design requires tasks that are not gender-specific and for which multitasking is natural and possibly beneficial. For these reasons, we have chosen Sudoku and Word Search as tasks.Footnote 11 Sudoku is played over a 9×9 grid, divided into 3×3 sub-grids called “regions”. The left panel of Fig. 1 illustrates that a Sudoku puzzle begins with some of the grid cells already filled with numbers. The objective of Sudoku is to fill the other empty cells with integers from 1 to 9, such that each number appears exactly once in each row, exactly once in each column, and exactly once in each region. The numbers given at the beginning ensure that the Sudoku puzzle has a unique solution. For example, the unique solution to the Sudoku in Fig. 1 is illustrated in the right panel. We measure performance in the Sudoku task by the number of correctly filled cells.
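The Sudoku rules and the performance measure can be made concrete with a short Python sketch. The puzzle below is a widely circulated example grid, not one of the puzzles used in the experiment, and the function names and the representation of a subject's entries are hypothetical.

```python
# Widely used example puzzle ('.' marks an empty cell) and its solution.
PUZZLE = [
    "53..7....",
    "6..195...",
    ".98....6.",
    "8...6...3",
    "4..8.3..1",
    "7...2...6",
    ".6....28.",
    "...419..5",
    "....8..79",
]
SOLUTION = [
    "534678912",
    "672195348",
    "198342567",
    "859761423",
    "426853791",
    "713924856",
    "961537284",
    "287419635",
    "345286179",
]


def is_valid_solution(grid):
    """Check that every row, column, and 3x3 region holds 1-9 exactly once."""
    digits = set("123456789")
    rows = [set(row) for row in grid]
    cols = [set(grid[r][c] for r in range(9)) for c in range(9)]
    regions = [
        set(grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3))
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    return all(unit == digits for unit in rows + cols + regions)


def score_entries(puzzle, solution, entries):
    """Count correctly and wrongly filled cells among a subject's entries.

    entries maps (row, col) to the digit the subject entered (0-indexed).
    Cells that were pre-filled in the puzzle are ignored.
    """
    correct = wrong = 0
    for (r, c), digit in entries.items():
        if puzzle[r][c] != ".":
            continue
        if solution[r][c] == str(digit):
            correct += 1
        else:
            wrong += 1
    return correct, wrong


# Hypothetical subject input: three correct cells and one wrong cell.
entries = {(0, 2): 4, (0, 3): 6, (0, 5): 9, (1, 1): 7}
print(is_valid_solution(SOLUTION), score_entries(PUZZLE, SOLUTION, entries))
```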
When solving a Sudoku puzzle, solutions often come in waves. Multitasking can be appealing when one is stuck: one can work on the other task and hope to see the problem from a different angle when switching back.
The other task was to find as many words as possible in a Word Search puzzle. An example of a Word Search puzzle is presented in the left panel of Fig. 2, and its solution is presented in the right panel. Participants had to look for the English names of European and American countries in a 17×17 letter grid. Words could be in all directions, including diagonal and backwards. Subjects’ performance is measured by the number of correct words found.Footnote 12
As in the case of Sudoku, it is reasonable to expect subjects to switch when unable to find new words for a while. The situation is similar to polishing a paper, when reading the same lines over and over becomes counterproductive after a while—one changes to another task simply because a ‘fresh eye’ is needed to recognize meaning behind the letters.
Procedures, payments, timeline
One pilot and ten regular sessions were run in the computer lab of CREED (Center for Research in Experimental Economics and Political Decision-Making) at the University of Amsterdam. Participants were university students from various fields of study. The application procedure ensured that the two genders were represented approximately equally in every session, but left subjects unaware that the experiment examines gender-related issues. The experiment was conducted in English, so both international and Dutch students could participate. All instructions and tasks were computerized,Footnote 13 and subjects were not allowed to use any paper or take notes during the experiment.
The experiment started with an introduction that explained the rules of the two tasks and gave the participants the opportunity to practice. Subjects learned that there would be two rounds and that they would have to play a Sudoku and a Word Search in both rounds. In each round, subjects earned 6 points for each correctly filled Sudoku cell and, to discourage random guessing, lost 6 points for each cell filled with a wrong number. Subjects were not penalized for cells filled with multiple numbers.Footnote 14 They received 9 points for each word found in Word Search. In Word Search, only entire words could be marked, so there was no need to penalize random clicking. Subjects’ total points for each round were determined as the sum of their points in Sudoku and their points in Word Search. Negative total points were set to 0. One of the two rounds was randomly selected for payment at the end, and the conversion rate was 1 euro per 11 points. In addition, there was a fixed show-up fee of 7 euros. The performance payments and the conversion rate were chosen based on the results of a pilot, such that subjects could earn approximately equal amounts on the two tasks and the average payment was around 23 euros. The sessions lasted approximately 1 hour and 45 minutes.
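The payment rule can be written out as a small Python sketch. The constants come from the description above; the function names are hypothetical, any rounding of the final amount is not specified in the text and is therefore left out, and cells filled with multiple numbers are not modeled.

```python
SUDOKU_POINTS = 6      # gained per correct cell, lost per wrong cell
WORD_POINTS = 9        # gained per word found in Word Search
POINTS_PER_EURO = 11   # conversion rate
SHOW_UP_FEE = 7        # euros


def round_points(correct_cells, wrong_cells, words_found):
    """Total points for one round; negative totals are set to 0."""
    total = SUDOKU_POINTS * (correct_cells - wrong_cells) + WORD_POINTS * words_found
    return max(total, 0)


def payment_euros(points_in_paid_round):
    """Show-up fee plus converted points of the randomly selected round."""
    return SHOW_UP_FEE + points_in_paid_round / POINTS_PER_EURO


# Hypothetical subject: 15 correct cells, 1 wrong cell, 9 words found.
points = round_points(correct_cells=15, wrong_cells=1, words_found=9)
print(points, payment_euros(points))
```

With these made-up inputs the subject earns 84 points in Sudoku and 81 points in Word Search, which gives 165 points and a payment of 22 euros if that round is selected.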
The order of the tasks within each round was randomized, and the assignment of subjects to the three treatments in Round 2 was random as well, so that each group consisted of approximately one third of the subjects in every session. The rules of each treatment were explained immediately before the start of the treatment. Subjects were not aware that other participants might be playing a different treatment.
After both rounds were over, but before subjects were informed about their payment, we elicited background information such as gender, age, field of study, and nationality through a questionnaire. Those who participated in Treatment Choice were also asked about their reasons for (not) switching.
Our sample consists of 218 subjects from the ten regular sessions.Footnote 15 They are 22 years old on average, and the majority (73 percent) are Dutch. Approximately half of the sample consists of economics students (53 percent). The sample contains 11 censored observations from subjects who solved the entire Sudoku puzzle in the second round but not in the first.Footnote 16 As explained in Sect. 3.1, subjects were randomly assigned to three groups. Table 2 shows the number of observations per group and gender.Footnote 17 As we can see, there are between 30 and 43 subjects per cell.