Face masks increase compliance with physical distancing recommendations during the COVID-19 pandemic

Governments across the world have implemented restrictive policies to slow the spread of COVID-19. Recommending face mask use has been controversial, in part because of potential adverse effects on physical distancing. Using a randomized field experiment (N = 300), we show that individuals kept a significantly larger distance from someone wearing a face mask than from an unmasked person during the early days of the pandemic. According to an additional survey experiment (N = 456) conducted at the time, masked individuals were not perceived as being more infectious than unmasked ones, but they were believed to prefer more distancing. This result suggests that wearing a mask served as a social signal that led others to increase the distance they kept. Our findings provide evidence against the claim that mask use creates a false sense of security that would negatively affect physical distancing. Furthermore, our results suggest that behavior has informational content that may be affected by policies.
Supplementary Information: The online version contains supplementary material available at 10.1007/s40881-021-00108-6.

The first unit of the supplementary material describes the field experiment in greater detail and includes technical details, supplementary descriptive statistics, and a randomization check. The second unit contains details of the online survey experiment with similar content. The third unit is the protocol for the field experiment. The last unit is an English translation of the text of the survey experiment as well as the complete original survey in German, including both treatment conditions with one of the experimenters. 16 The IRB approval is attached to the end of this file.

Appendix S1 Study 1 - Field Experiment

S1.1 Experimental procedures
Throughout data collection, the use of face masks was recommended by the Berlin state government but not mandated. 17 Businesses typically regulated how many customers were allowed to enter their premises at the same time to ensure compliance with the physical distancing mandate. At the time, people in Berlin were required by a state directive to keep a 150 cm distance from non-household members in public spaces. 18 During the period of data collection, the regulatory circumstances did not change.
During data collection, experimenters followed a predefined dress-code and an experimental protocol (see Section S3 for details). Each experimenter collected data in public lines of people waiting to enter a store, supermarket, or post office. Data was collected in daylight to ensure good visibility and on flat surfaces to allow for precise measurements. At the beginning of each data collection, the experimenter determined via a coin toss whether to start with Mask or NoMask. They would switch to the other treatment after a predetermined number of observations and collect an equal number of observations in both treatments.
In the treatment condition, Mask, only FFP2-type face masks were used. 19 We measured and recorded the distance between the next arriving person and the experimenter (see Section S3 for details on the procedure). 20 We estimate the effect of masks on distancing as the difference between the average recorded distances in the Mask and NoMask treatments.
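The estimator described above is a plain difference in means. A minimal sketch follows, assuming the recorded distances are available as two lists of centimeter values; the function name `mask_effect` and the example numbers are hypothetical and are not the study's data.

```python
from statistics import mean


def mask_effect(distances_mask, distances_nomask):
    """Return the Mask-minus-NoMask difference in average distance (cm)."""
    return mean(distances_mask) - mean(distances_nomask)


# Hypothetical distances in cm, for illustration only (not the study's data).
example_mask = [160, 155, 170, 150]
example_nomask = [150, 145, 155, 140]
effect_cm = mask_effect(example_mask, example_nomask)
print(f"Estimated effect: {effect_cm:.2f} cm")
```

A positive value means that subjects stood farther from the experimenter in the Mask treatment.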
To start data collection, the experimenters took a position at the end of the line, ensuring a distance of 150 cm to the person in front of them, assuming a sideways position in the line. When the next person arrived (the subject), the experimenters recorded the distance between their own and the subject's feet. 21 The experimenter proceeded to the next observation by returning to the end of the line until the predetermined number of observations was reached.
17 Mandatory use of masks was first introduced in some public spaces in multiple steps starting from April 27, 2020 (Berlin Senate, 2020). Note: The announcement was made after the end of the data collection for the field experiment.
18 In Germany, most policies were within the discretion of the individual states, but the federal government and talks between state governments led to largely uniform rules. In Berlin, the policies to limit the spread of COVID-19, including physical distancing, were regulated through the SARS-CoV-2 Containment Measures Ordinance (SARS-CoV-2-EindmaßnV) of March 22, 2020; the ordinance has been changed several times since, but not in a respect relevant to the experiment (Berlin Senate, 2020).
19 An FFP2 face mask or filtering facepiece respirator is a half-face mask that filters the air inhaled by the wearer. Details are specified in the EN 149 standard, an equivalent of the N95 US standard. At the time of data collection, the device was the most commonly available protective mask. It is identical to the one used in the online survey.
20 FFP2 respirators must meet high filter standards and are among the most effective commercially available face coverings. However, at the time of the field experiment, no government policy was in place encouraging the community use of masks, nor any information campaign informing the public about mask types. Moreover, there was an acute shortage of masks, and surgical masks were in general unavailable. Hence, we do not believe that the mask choice played a role in the measurements at the time.
21 The measurement was recorded by an augmented reality application on a mobile device that is able to measure a distance between two points on a flat surface in 1-centimeter increments. To comply with privacy laws, no visual recording was taken.
A distance was not recorded if the target subject changed position during the measurement or when the camera view was obstructed by, for example, a signpost. When a group approached the end of the line, the distance was measured to the person standing closest to the experimenter. If the closest person was an infant in a stroller or a person in a wheelchair, the point used for measurement was where the front wheel touched the ground. 22 All data was collected in Berlin, Germany, between April 18 and April 24, 2020, by five experimenters, aged 31 to 35, two women and three men, who acquired 60 observations each, balanced across the two treatments. 23

S1.2 Descriptive statistics and randomization checks
Our sample consists of independent observations from 300 subjects, 48.7% of whom were male. The majority of subjects were estimated to be between 25 and 45 years old (58.3%). The percentage of subjects entering the line alone was 80.4%, whereas 12.6% were accompanied by at least one adult and 7% were with at least one child. At the time of measurement, 17% of the subjects were wearing a face mask.

S1.3 Kernel density estimates
Using non-parametric kernel density functions, we estimate the distribution of the distance values separately in the two treatments (Fig. S1). A positive shift in distancing can be statistically confirmed (D = 0.1933, P < 0.01, two-sided Kolmogorov-Smirnov test), and it demonstrates that the presence of a mask induces individuals to keep a greater distance.
22 Dogs were not included in the study as SARS-CoV-2 has been shown to replicate poorly in canines (Shi et al, 2020).
23 All experimenters participated in data collection voluntarily and are credited as co-authors of this article. None of the authors were in an employee-employer relationship, mitigating ethical concerns that might arise because time spent in public for data collection during the pandemic may pose a certain health hazard.
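The density comparison and the Kolmogorov-Smirnov statistic in Section S1.3 can be illustrated with a short standard-library sketch. The source does not state the kernel or bandwidth used, so the Gaussian kernel and the `bandwidth` argument here are assumptions, the function names are hypothetical, and the example distances are invented. The sketch computes only the statistic D, not its P-value.

```python
import bisect
import math


def gaussian_kde(sample, bandwidth):
    """Return a density function for `sample` using a Gaussian kernel (assumed)."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample
        )

    return density


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov D: largest gap between empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)

    def ecdf(sorted_sample, x):
        # Share of observations less than or equal to x.
        return bisect.bisect_right(sorted_sample, x) / len(sorted_sample)

    return max(
        abs(ecdf(a_sorted, x) - ecdf(b_sorted, x)) for x in a_sorted + b_sorted
    )


# Hypothetical distances in cm, for illustration only.
mask_cm = [150, 160, 170, 180]
nomask_cm = [140, 150, 155, 165]
d_stat = ks_statistic(mask_cm, nomask_cm)
density_mask = gaussian_kde(mask_cm, bandwidth=10.0)
print(d_stat, density_mask(160))
```

Evaluating the two density functions over a grid of distances would give curves comparable in spirit to Fig. S1.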

S2.1 Survey design and procedures
The survey was conducted via www.prolific.co. The subject pool was restricted to adult individuals who live in Germany (see Table S4 for the geographical distribution). The survey language was German; a translation of the questions can be found in Section S4. In total, the sample consisted of 463 observations; 7 observations were excluded from the analysis because the respondents failed attention checks (about the gender, pose, mask, and hair color of the pictured person), leading to a final sample of 456 used for the analysis. The survey lasted 8.5 minutes on average.
The survey participants were paid 2.15 EUR for their participation. An additional bonus was paid for some questions. On average, the bonus amounted to 0.18 EUR. All payments were made via the website of the subject pool provider www.prolific.co.
A key feature of our framework is that respondents were asked not only for their own opinion about the possible behavior but also to predict the most popular answers of other respondents to the same questions. For each correct prediction, the respondents received a bonus of 0.20 EUR. Table S3 reports descriptive statistics by treatment. As can be seen, the respondents' characteristics are balanced between treatments, with a small exception regarding respondents' native language: the share of German native speakers is 6.1 percentage points lower in the NoMask condition. This difference is significant at the 10% level (two-sided t-test) but not significant when tested with a Mann-Whitney U test (z = -1, P = 0.3173). Another difference between treatments is that respondents in the NoMask condition report having taken part in a slightly larger number of studies about masks than those in the Mask treatment (t = 1.65, P = 0.0995). The corresponding Mann-Whitney U test is not significant (z = 1.552, P = 0.1207).

S2.2 Descriptive statistics and randomization checks
The average age of respondents in the sample is 28.1 (SD = 8.2) years. Of the respondents, 58.77% are male, 8.77% of respondents identified themselves as belonging to the risk group for COVID-19, and a further 2.4% answered they were not sure. Virtually all respondents live in Germany. Respondents' distribution by German federal states is reported in Table S4 and largely corresponds to the distribution of the German population.
The average household size of the respondents is 2.6 (SD = 1.82) persons. The income distribution for the subsample of respondents who provided an answer to the question about their household income is given in Table S5.
Respondents also reported their past compliance with recommended prevention measures. Average compliance on a 6-point Likert scale ranging from 1 'never' to 6 'always' was for hand-washing 4.7 (SD = 1.08), for wearing a face mask indoors 2.2 (SD = 1.38), for wearing a mask outdoors 2.1 (SD = 1.42), and for keeping a 150 cm distance to people they do not share a household with 5.0 (SD = 0.94).
The survey further elicited attitudes toward possible mask mandates using a 5-point Likert scale ranging from -2 for 'very negative' to 2 for 'very positive.' A mandate for wearing a mask in supermarkets and public transport was evaluated positively (M = 1.21, SD = 0.94 and M = 1.17, SD = 0.94, both P = 0, 2-sided Wilcoxon signed-rank test). However, a possible mandate to wear a mask while walking outside was perceived negatively (M = -0.49, SD = 1.21, P = 0). On average, the respondents indicated that they perceived face masks as being relatively effective in preventing the spread of the coronavirus (M = 0.78, SD = 0.92, P = 0).
Notes: Column 1 and Column 2 report mean answers in NoMask and Mask conditions respectively. Standard deviations in parentheses. Column 3 reports the difference between the treatments. The significance levels of two-sided t-test are reported on superscripts. * p < 0.10, ** p < 0.05, *** p < 0.01.
Table S6 reports the mean and standard deviations for our key outcome variables. Column 3 reports the differences between treatments and the p-values for the multiple hypothesis testing as described by List et al (2019).
Notes: Column 1 and Column 2 report mean answers by treatment and standard deviations in parentheses. Column 3 reports the difference between the treatments. P-values for multiple hypotheses testing (List et al, 2019) are reported in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.01.
Notes: Ordinary least squares estimates. Standard errors in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.01. This table shows detailed estimation results obtained from a linear regression of survey respondents' estimated average distance kept by the participants of the field experiment on their first- and second-order beliefs about the experimenter in Mask and NoMask.
In both panels, we consider three types of beliefs: beliefs about the pictured person's preferred distance, beliefs about the likelihood of the person being sick, and beliefs about the likelihood of the person being infectious. In panel A, the independent variables are all three first-order beliefs. In panel B, we instead use second-order beliefs, which are beliefs about the average or modal answer of other respondents about the preferred distance, the sickness and the infectiousness. The first-order beliefs are not incentivized, but the second-order beliefs are incentivized. In models (3) and (4), the control variables are levels of compliance with lockdown measures in the past week, beliefs about the effectiveness of masks, and demographic information consisting of age, gender, income, household size, political view, and risk attitude.

S2.3 Additional results
Excluding the beliefs about the sickness and infectiousness of the pictured person does not change the results.

Introduction
The instructions for recording the data follow. Please read the whole document and follow all points very carefully.

Experimenter Appearance
As an experimenter, you will need an FFP2 respiratory protection mask for this experiment. Each time before you go to an experiment location, you will take two full-body (self-)portrait photos of yourself: one with and one without a mask. The primary purpose of the photos is to record variables describing your appearance if this is requested by the reviewers. To decrease the noise due to experimenter appearance, you are expected to wear a pair of blue jeans and a dark-colored (black, dark gray or navy blue) top without any visible text or logo. 25

Location
You may choose a location that satisfies the following list of conditions.
• The establishment is an open supermarket, a drug store (except pharmacy) or a post office.
• There must be a waiting line outside with people waiting to enter the store. The waiting line must stand on a flat surface with no obstructing objects. Make sure that the waiting line is clearly visible and it is clear for the arriving subject that you are the last person in the line and approximately where they should stand.
• You can record the data anytime until April 24 between 8am and 8pm during daylight with good visibility. In order to secure good visibility conditions, do not record data when it is raining.
• You should avoid stores that have heavy traffic that would make measurement difficult. For instance, if there is another store or a subway exit next door, people in the waiting line might change their position frequently, making recording data problematic.
• The time gap between people who are let into the store must be sufficiently long. The measurement may take a couple of seconds, and you may be asked to move forward if the waiting line moves; the subject can also move before you can record the distance between you. The speed is usually lower at post offices than at supermarkets.
24 The Robert Koch Institute (RKI) is the government's key scientific institution in the field of biomedicine. It is one of the central bodies for the safeguarding of public health in Germany. It identifies risk factors that increase the chance of a serious illness; we confirm that none of the experimenters fall into these categories. See https://www.rki.de/.
25 Please consult us if you do not own these items.

Data Recording Method
You will need a smartphone with an installed augmented-reality tape-measure app that is capable of measuring small distances in centimeters with small measurement errors. The error is measured individually on the same device you use on location. Place two flat objects on the ground at any location with a clear surface exactly 100 cm from each other. Similarly to the protocol on location, measure this distance with the application. Do the same measurement five times with different positions of the objects. You may proceed with this hardware and application if the error is within a 3% margin every time.
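The device check described in the data recording method can be expressed as a short validation routine. This is a sketch under the stated rule (five readings of a 100 cm reference, each within a 3% margin); the function name `calibration_passes` and the example readings are hypothetical.

```python
def calibration_passes(readings_cm, true_cm=100.0, tolerance=0.03, required=5):
    """Return True if exactly `required` readings all fall within the margin."""
    if len(readings_cm) != required:
        return False
    return all(abs(r - true_cm) / true_cm <= tolerance for r in readings_cm)


# Hypothetical app readings in cm for the 100 cm reference distance.
print(calibration_passes([100, 101, 99, 102.5, 97.5]))   # every reading within 3%
print(calibration_passes([100, 101, 99, 104, 100]))      # one reading is 4% off
```

A single reading outside the margin fails the whole check, matching the "every time" condition in the protocol.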

Preparation for Data Recording
In total, you are expected to perform 60 independent observations. Before each session, set an even target number of observations you plan to record. Perform half of them with your mask on and the other half without it. Decide the order randomly using a fair coin or any random number generator. Example: You set the number to 20. After tossing the coin, you start with 10 observations with your mask on. After finishing these, you remove the mask and perform another 10 without it. Finally, you leave the location.
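The session plan in this preparation step (an even target, a random starting condition, and a single switch) can be sketched as follows. The function name `plan_session` is hypothetical, and Python's `random` module stands in for the fair coin.

```python
import random


def plan_session(target, rng=random):
    """Return the ordered list of treatments for one session.

    `target` must be a positive even number; the starting treatment is drawn
    at random and the treatment changes exactly once, at the halfway point.
    """
    if target <= 0 or target % 2 != 0:
        raise ValueError("target must be a positive even number")
    first = rng.choice(["Mask", "NoMask"])
    second = "NoMask" if first == "Mask" else "Mask"
    half = target // 2
    return [first] * half + [second] * half


# Example session with a target of 20 observations and a seeded generator.
plan = plan_session(20, random.Random(0))
print(plan[0], plan[-1])
```

Passing a seeded `random.Random` instance makes a plan reproducible; on location, a physical coin serves the same purpose.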
The purpose of changing your appearance only once is to limit the number of times you may accidentally touch your face. You can safely avoid this if you remove the mask by only touching the strings. You should proceed the same way if you start your work without your mask on. To learn about the safe way to wear a mask, please consult the website of the Robert Koch Institute.

Data Recording Procedure
Due to lockdown measures in place, you will work alone and record the data individually. After choosing the location, go to the end of the waiting line outside and carefully follow this protocol.
1. Go to the waiting line and stand 150 centimeters (1.5 meters) away from the last person. 26 Measure the distance using the same application.
2. Turn sideways, facing neither the waiting line nor the subject arriving after you. Make sure that you can see both.
3. If necessary, calibrate your application such that it is ready for measurement. Do not open other applications at this point.
4. If someone approaches, turn your back to the waiting line and face the subject before they arrive. Make sure that your face is visible, but look at your device the whole time. Keep a neutral facial expression and do not make eye contact.
5. The app measures distance by pinning two points on the ground. These two points are the closest points of your shoes and the subject's shoes. You pin the tip of their shoe first when they arrive, and the tip of your shoe second.
6. Record the length and exit the waiting line.
7. After this, record all remaining variables, starting with the number of people in the waiting line who were standing before you outside at the point of measurement. After this, go back to the end of the waiting line until you reach your target number of observations.

Further Points to Consider
If there is a group, the subject is the person closest to you, irrespective of age. Exceptions: If the closest person is an infant in a stroller or a person in a wheelchair, the closest point is where the front wheel touches the ground. If this reference point belongs to a stroller, the person you record is the one handling the stroller.
Do not record an observation if you are unable to pinpoint the position of the subject accurately (e.g., the subject might keep jogging in place, or move back or forward before you can finish pinning) or if the subject engages in an activity that would trigger distancing according to local social norms (e.g., smoking, talking on the phone, eating).
There are three time slots per day: morning 8am-12 noon, afternoon 12 noon-4pm, and late afternoon/early evening 4pm-8pm. Do not record more than 50% of the observations in one period of time (e.g., morning), even if they are recorded on different days.
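The time-slot rule can be checked mechanically once the hour of each observation is known. In the sketch below the slot names, the treatment of boundary hours (12 noon counts as afternoon, 4pm as evening), and the function names are assumptions made for illustration.

```python
from collections import Counter

# (name, first hour included, first hour excluded), on a 24-hour clock.
SLOTS = (("morning", 8, 12), ("afternoon", 12, 16), ("evening", 16, 20))


def slot_for_hour(hour):
    """Map an hour of the day to its time slot; boundaries are an assumption."""
    for name, start, end in SLOTS:
        if start <= hour < end:
            return name
    raise ValueError(f"hour {hour} is outside the 8am-8pm recording window")


def slot_shares_ok(hours, max_share=0.5):
    """Return True if no slot holds more than `max_share` of all observations."""
    counts = Counter(slot_for_hour(h) for h in hours)
    return all(count / len(hours) <= max_share for count in counts.values())


# Hypothetical recording hours pooled over several days.
print(slot_shares_ok([9, 10, 13, 17]))   # morning share is exactly 50%
print(slot_shares_ok([9, 10, 11, 13]))   # morning share is 75%
```

Because the rule applies across days, the check should be run on an experimenter's pooled observations rather than per session.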
Do not attempt to make any media recording of the subject or any other individual near you as without consent this may be unwelcome. If you meet with a hostile or unfriendly reaction or you are questioned by someone, you can reveal your identity and that you are conducting a publicly funded scientific study. If this hinders or influences recording data, or puts you in an uncomfortable situation, leave the location.

Data and Variables
In this part, you can find the list of variables with the corresponding codes. Your task is to complete the spreadsheet for each observation. You will receive the spreadsheet by email. Once you have finished recording, send the file to gyula.seres@hu-berlin.de.

Distance
Distance to the subject. Measured in centimeters (cm).

TotalNumofPeople
The total number of people outside in front of you in the waiting line at the moment of measurement. Do not include people inside.
1 "strongly agree" 2 "moderately agree" 3 "agree a little" 4 "neither agree nor disagree" 5 "disagree a little" 6 "moderately disagree" 7 "strongly disagree"
• Have you seen this person before? Yes / No / Maybe

Opinion about the preferences and the health condition of the person (not) wearing a mask, the effectiveness of masks for distancing
Imagine the following situation: The person you saw in the photograph at the beginning of the survey is standing in a waiting line outside of a post office. Now another person (who is interested in getting into the post office) approaches the end of the waiting line.
• In your opinion, at which distance will the person approaching come to stand behind the person in the photograph? Please indicate the distance in centimeters below (100 cm = 1 m).
• What do you think is the minimum distance the person in the photograph would like the person approaching the waiting line to keep from her/him while waiting in line outside a post office? Please indicate the distance in centimeters below (100 cm = 1 m).
• In your opinion, how likely is it that the person in the photograph is infectious for other people in the waiting line? Please choose one answer from 1 to 7. 1 "definitely not infectious" 2 "very unlikely to be infectious" 3 "somewhat unlikely to be infectious" 4 "I don't know" 5 "somewhat likely to be infectious" 6 "very likely to be infectious" 7 "definitely infectious."
• In your opinion, how likely is it that the person pictured is sick with the coronavirus, the flu, or another virus-related respiratory disease? Please choose one answer from 1 to 7. 1 "definitely not sick" 2 "very unlikely to be sick" 3 "somewhat unlikely to be sick" 4 "I don't know" 5 "somewhat likely to be sick" 6 "very likely to be sick" 7 "definitely sick."

Introduction of the bonus rules
In the upcoming part of the survey you will be able to earn some additional bonus payment. You will be asked to estimate the average or most frequent answers of other survey participants. For each correct guess, you will receive an additional payment of 0.20 EUR (20 cents). More details about the rules for bonus payment will be given below.
Please enter your Participant ID here if you would like to receive the payment. It will be used for payment purposes only. After the payment has been made, it will be deleted from the data set.

Incentivized beliefs / Descriptive social norm elicitation
Other survey participants were shown the same photograph as you at the beginning of the experiment and were asked the same questions as you.
All participants saw the following situation description: "Imagine the following situation: The person you have seen in the photograph at the beginning of the survey is standing in a waiting line outside of a post office. Now another person (who is interested in going into the post office) approaches the end of the waiting line." Please estimate the average answers to the following two questions by 50 randomly selected individuals. Think about your answer thoroughly, because for each guess that does not deviate from the actual average answer of 50 other participants by more than 5 cm, you will receive an additional bonus of 0.20 EUR.
• What is the average answer of 50 other randomly selected participants to the following question: "At which distance will the arrived person come to stand behind the person in the photograph?" Please guess the average answer to this question:
• What is the average answer of 50 other randomly selected participants to the following question: "What is the minimum distance this person would like the next person in the waiting line to keep from him/her while waiting in line outside a post office?" Please guess the average answer to this question:
Now, we would like you to estimate the most frequent answer among 50 randomly selected participants of this survey. Think about your answer thoroughly, because for each correct guess you will receive a bonus of 0.20 EUR.
• What is the most common answer among 50 randomly selected survey participants to the following question: "How likely is it that the person in the photograph is infectious for other people in the waiting line? (From 1 to 7)" Please guess the most common answer to this question: 1 "definitely not infectious"; 2 "very unlikely to be infectious"; 3 "somewhat unlikely to be infectious"; 4 "I don't know"; 5 "somewhat likely to be infectious"; 6 "very likely to be infectious"; 7 "definitely infectious."
• What is the most common answer among 50 randomly selected survey participants to the following question: "How likely is it that the pictured person is sick with the coronavirus, the flu, or another virus-related respiratory disease? (From 1 to 7)" Please guess the most common answer to this question: 1 "definitely not sick"; 2 "very unlikely to be sick"; 3 "somewhat unlikely to be sick"; 4 "I don't know"; 5 "somewhat likely to be sick"; 6 "very likely to be sick"; 7 "definitely sick."
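The bonus rules for the incentivized beliefs in this section can be summarized in a small sketch: a guess of an average earns 0.20 EUR if it lies within 5 cm of the realized average, and a guess of the most common answer earns 0.20 EUR if it matches the mode. The function names are hypothetical, and paying the bonus for any of several tied modes is an assumption, since the survey text does not say how ties were handled.

```python
from statistics import mean, multimode

BONUS_EUR = 0.20


def average_guess_bonus(guess_cm, answers_cm, tolerance_cm=5.0):
    """Pay the bonus if the guess is within `tolerance_cm` of the realized mean."""
    return BONUS_EUR if abs(guess_cm - mean(answers_cm)) <= tolerance_cm else 0.0


def modal_guess_bonus(guess, answers):
    """Pay the bonus if the guess equals a most frequent answer (ties assumed to pay)."""
    return BONUS_EUR if guess in multimode(answers) else 0.0


# Hypothetical answers from the randomly selected participants.
distance_answers = [150, 160, 140, 150]
likert_answers = [4, 4, 3, 5]
total = average_guess_bonus(152, distance_answers) + modal_guess_bonus(4, likert_answers)
print(f"{total:.2f} EUR")
```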

Estimation of the experimental results
Last week we ran a study in which we measured the distance that individuals keep at the end of a waiting line from another person. The study was done in Berlin in a line for the post office. The last person in the waiting line was an experimenter, who you saw in the picture at the beginning of the survey.
Please guess the average distance 30 individuals kept from this person.
Think about your answer thoroughly, because you can earn an additional bonus based on the correctness of your guess. If your guess does not deviate from the actual average distance from our study by more than 5 cm, you will receive an additional bonus of 0.20 EUR.
• Please guess the average distance kept away from the experimenter by 30 individuals approaching him/her at the end of the waiting line:

Attitude towards masks and mask-wearing behavior
• How do you evaluate the introduction of the compulsory wearing of face masks in public transport in Germany? 1 "very positive"; 2 "rather positive"; 3 "undecided"; 4 "rather negative"; 5 "very negative."
• How do you evaluate the introduction of compulsory wearing of face masks in supermarkets? 1 "very positive"; 2 "rather positive"; 3 "undecided"; 4 "rather negative"; 5 "very negative."
• How do you evaluate a possible introduction of compulsory wearing of face masks while walking outside? 1 "very positive"; 2 "rather positive"; 3 "undecided"; 4 "rather negative"; 5 "very negative."
• In your opinion, to what extent are face masks effective for preventing the spread of coronavirus? 1 "very effective"; 2 "somewhat effective"; 3 "I don't know"; 4 "not very effective"; 5 "not effective at all."
• In the last week, how often did you (1 "never" to 6 "always"):
wash hands with soap for at least 20 seconds;
wear a face mask in indoor areas;
wear a face mask in outdoor spaces;
keep a distance of at least 150 cm to people who are not living in your household.
• There are some groups of people who are at particular risk of developing a serious disease due to infection with the coronavirus. These groups include people who are over 65 years of age, have a weakened immune system, or have a relevant underlying medical condition (e.g., chronic diseases of the respiratory system, diabetes, cardiovascular diseases, cancer). Do you belong to a coronavirus risk group? Yes/No/Maybe.

Past experience with coronavirus-related survey
• How many times have you participated in surveys about COVID-19 / coronavirus in the last 4 weeks? Scale 0 to "10 or more."