To distinguish situations in which people make informed, misinformed, and uninformed decisions, we need to compare their understanding of Wi-Fi network properties and visual cues with the choices they make. We therefore conduct a study in which we ask participants: first, to read the description of a scenario, which sets a context and a specific task to perform; second, to choose among different Wi-Fi networks to accomplish the task; third, to answer questions about the meaning of the visual cues they encountered; and finally, to answer questions about their knowledge of Wi-Fi networks.
We investigate whether the choice of a Wi-Fi network depends on the properties of the network itself and on the specific task to be undertaken. More precisely, the dependent variable is the participants’ Wi-Fi choice, a dichotomous (i.e., 0/1, wrong/right) variable. As main independent variables we choose the presence or absence of the padlock sign, which is supposed to indicate secure communication (technically, the presence of encryption), and the presence of one of the two signal strength signs, which are supposed to indicate quality of connectivity (technically, the strength of the received Wi-Fi signal). These are the properties of Wi-Fi networks typically communicated to the user. In our study we thus display one of the four possible combinations of padlock sign (present or absent) and signal strength sign (one of the two). In the remainder of this document, for the sake of conciseness, we use the terms “Encryption” for secure communication and “QoS” for good connectivity.
“Encryption” (i.e., secure communication) and “QoS” (i.e., good connectivity) also represent the two meaning dimensions that we assess in relation to how participants understand the cues. We measure how much the participants think a cue means “Encryption” or “QoS”. Since the need for these properties is driven by the task a user is involved in, we consider four tasks designed to evoke, through their context description, a need for “Encryption” and “QoS”.
Additional independent variables that we consider important to control for are: the order of the Wi-Fi network names; the speed of appearance, i.e., how quickly or slowly the network is listed by the network manager; and the participant’s social and personal background, i.e., tech-savvy vs non-tech-savvy users. Moreover, to ensure that participants do not avoid encrypted networks merely because they do not have a password, we provide a password to a randomly chosen half of the sample.
To investigate those factors while maximizing internal validity, we chose a between-subjects study design. Each participant was presented with only one scenario, to avoid security priming of one scenario on the others. The study was conducted on-line. Its flow comprised a socio-demographic questionnaire; the description of a scenario with instructions to select a Wi-Fi network from a given list; several rounds of network selections; an assessment of the meaning participants attribute to the given cues; and a follow-up questionnaire to assess further attitudes and beliefs about ICT security (e.g., misconceptions and beliefs regarding Wi-Fi networks). In each scenario, we describe a character that the participant implicitly inhabits and ask which network he or she would select given the context and the task to be accomplished. Participants were assigned to 1 of 4 possible scenarios; thus the probability of assignment to each scenario was 0.25. The scenarios differed in the requirements the Wi-Fi network should meet to complete the task (i.e., the combination of “Encryption” and “QoS”). Participants had five rounds of choices; each round presented a list of 4 Wi-Fi networks, ordered randomly, each displaying a randomly generated name and a signal strength indicator, with or without a padlock sign. Figure 1 shows the Wi-Fi networks for the four rounds. To test for consistency we added a fifth round, not shown in the figure: it repeats one of the 4 presented rounds, randomly chosen. Due to space limitations, in this manuscript we describe only the results associated with the third round of network choices. For the same reason, we do not present or discuss how the delay and/or timing of the listing of network names affects the Wi-Fi network choices, nor how the sequential order of the Wi-Fi networks makes a difference. This is left as future work.
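The assignment and round-construction procedure described above can be sketched as follows. This is a minimal illustration in Python and not the software used in the study: the scenario labels, the `"low"`/`"high"` signal names, the name generator, and the choice to repeat the fifth round with identical names and order are all assumptions made for the example. The cue composition of each round is passed in as input because it is defined by Figure 1 and not reproduced here.

```python
import random
import string

# Hypothetical scenario labels; the study uses 4 scenarios with equal probability.
SCENARIOS = ("scenario_1", "scenario_2", "scenario_3", "scenario_4")


def random_name(rng, length=8):
    """Generate a random network name (the naming scheme is an assumption)."""
    alphabet = string.ascii_uppercase + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def build_round(cues, rng):
    """Build one round from a list of (padlock, signal) pairs, one per network.

    The networks get random names and are listed in random order.
    """
    networks = [
        {"name": random_name(rng), "padlock": padlock, "signal": signal}
        for padlock, signal in cues
    ]
    rng.shuffle(networks)
    return networks


def build_session(round_cues, rng):
    """Assign one scenario (probability 0.25 each) and build the rounds.

    A final consistency round repeats one of the earlier rounds, chosen at random.
    Repeating it with the same names and order is an assumption of this sketch.
    """
    scenario = rng.choice(SCENARIOS)
    rounds = [build_round(cues, rng) for cues in round_cues]
    repeated = rng.randrange(len(rounds))
    rounds.append(list(rounds[repeated]))
    return {"scenario": scenario, "rounds": rounds, "repeated_round": repeated}


# Example input: four rounds, each listing the four cue combinations once
# (an assumption; the actual composition is given in Figure 1).
example_cues = [[(False, "low"), (False, "high"), (True, "low"), (True, "high")]] * 4
session = build_session(example_cues, random.Random(2024))
```

Passing an explicit `random.Random` instance keeps the assignment reproducible, which helps when auditing which participant saw which list.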
To assess whether users associate the intended meaning with the cues (“Encryption” for the padlock, and “QoS” for the signal strength bar), we ask the participants to rate, on a 4-point Likert scale (Not at all, Partially, Mostly, Completely), the extent to which each of the 2 visual cues corresponds in meaning to 4 words related to “Encryption” (confidential, protected, encrypted, and private) and 4 words related to “QoS” (good signal strength, high-bandwidth, high-speed, and fast).
As mentioned above, we complement the study with additional attitude and belief questions regarding the participants’ use of Wi-Fi networks. For instance, we ask whether they think the padlock sign means “locked out”, and whether they tend to make choices out of convenience. Our convenience variable is a composite of three questions (Cronbach’s \(\alpha =0.76\)) and is used as such in our analyses. Additional questions are used to measure ICT skills: these are split into 2 separate variables, stated ICT skills (s.ICT), reflecting the skills the participants report having, and measured ICT skills (m.ICT), reflecting how well the participants answered the technical questions. We collected a host of other variables thought to be associated with the Wi-Fi network choice; for the sake of space we omit these results as well.
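For readers unfamiliar with the reliability coefficient reported for the composite convenience variable, Cronbach’s \(\alpha\) for \(k\) items is \(\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i \sigma^2_i}{\sigma^2_T}\right)\), where \(\sigma^2_i\) is the variance of item \(i\) and \(\sigma^2_T\) the variance of the summed score. The following is a minimal Python sketch of that formula; the function name and the data layout (one list of scores per question) are assumptions for illustration and say nothing about how the authors computed their value.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of k items.

    `items` holds one list per question, each with one score per participant.
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if any(len(scores) != n for scores in items):
        raise ValueError("all items need one score per participant")

    # Summed score of each participant across the k items.
    totals = [sum(scores[i] for scores in items) for i in range(n)]
    sum_item_variances = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))
```

The coefficient is undefined when all participants have the same summed score, since the total variance is then zero.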
Choosing the Tool for our On-line Survey. We aimed for a large number of participants, drawn from a broader population than the one we could reach by running our experiment within our University quarters. Therefore, we opted for Amazon Mechanical Turk (mturk), a marketplace for on-line work that offers readily available and substantially large samples of participants. The use of mturk as a tool for social experiments is debated; we are aware of this debate and of mturk’s potential limitations, which can harm internal validity. For this reason we took several countermeasures to maximise the quality of the collected data. We implemented a number of quality checks to detect participants who provide answers simply by clicking randomly. Namely, we implemented attention checks, for instance by adding choices like “I answer randomly and I should not be paid: Yes or No”; we repeated questions several times with different wording; we measured the time participants took to answer each question to detect unusually fast answering, which can indicate low-quality data; and we prevented participants from taking part more than once.
On the positive side, mturk allows us to recruit participants worldwide, and in the specific case of the US (we admitted only participants from this country, see later in this paragraph) its workers are thought to be more representative of the general population than samples commonly recruited in university settings. Moreover, evidence suggests that self-reported behaviours gathered with mturk are comparable to observed behaviours in laboratory studies. To make the analysis and interpretation of our results easier, we chose to recruit only participants located in the US, where the majority of mturk workers do not use the tool as their primary source of income. We ran the study in batches of 100 participants at different times of the day, on workdays and weekends. Following the guide edited by a community of mturk workers, we took great care to guarantee workers’ rights to information and privacy, and we paid USD 0.90 for an average of 5 min of participation. We collect the participants’ age, gender, occupation, and how comfortable they feel with ICT. Occupation categories are organized following the major groups of the US Bureau of Labor Statistics’ classification. Optionally, participants can communicate ethnicity-related information, following the guidelines of the US Census interviewing manual.
The Pilot Study. Another issue, unrelated to mturk but one that could challenge the reliability of the data and the internal validity of the study, is whether the participants correctly understand what they are presented with. In particular, because in theory there is an infinite number of scenarios we could have used to convey and elicit a need for certain Wi-Fi network properties, we took special care to pilot test several possible scenarios to identify the ones we ultimately used in our study. For instance, to evoke a task that needs neither secure communications nor good connectivity, we can ask the participants to picture themselves waiting at a bus stop (no time pressure) searching for a Wi-Fi network to browse the Internet (no need for security), but this scenario could be understood differently by men and women. To ensure that the scenarios are understood unambiguously, we ran a pilot study, using the same tools and settings as the main study, that aimed at finding the most intelligible and least biased scenarios. We built 3 different “vignettes”, or candidates, for each scenario, and asked 156 participants to rate how much the task mentioned in the vignettes should comply with several properties. There were 6 properties related to “secure communications” (confidential, protected, encrypted, secret, masked, and private), and 6 related to “good connectivity” (good signal strength, high-bandwidth, high-speed, first-class, responsive, and fast). We analysed the results of the pilot study with the R statistical software and performed Wilcoxon rank tests to identify the vignettes with the best psychometric discrimination, while checking for the effects of gender, age, and other social background variables. Table 1 shows for each scenario: the technical property that it intends to convey (“Encryption” or “QoS”), the selected vignette, and the limitations we need to be aware of when using it.
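To illustrate the kind of test used to screen vignettes, here is a minimal Python sketch of a two-sided Wilcoxon rank-sum (Mann-Whitney) test with mid-ranks and a tie-corrected normal approximation. Several points are assumptions: the text does not say which Wilcoxon variant (rank-sum or signed-rank) was applied, the analysis itself was done in R and not in Python, and the function name and the grouping (for example, ratings of one vignette by men versus women) are hypothetical. No continuity correction is applied, so p-values can differ slightly from those of statistical packages.

```python
import math
from collections import Counter


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test for two independent samples.

    Returns (U, z, p), where U is the Mann-Whitney statistic of sample x.
    Ties receive mid-ranks and the variance is corrected for ties, which
    matters for Likert-type ratings.
    """
    pooled = sorted(list(x) + list(y))
    n1, n2 = len(x), len(y)
    n = n1 + n2

    # Mid-rank of each distinct value: average of the 1-based positions it occupies.
    midrank = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        midrank[pooled[i]] = (i + 1 + j) / 2
        i = j

    rank_sum_x = sum(midrank[v] for v in x)
    u = rank_sum_x - n1 * (n1 + 1) / 2

    mean_u = n1 * n2 / 2
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 0.0, 1.0  # all observations identical: no evidence of a difference

    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return u, z, p
```

A vignette whose ratings do not differ significantly between such groups would, under this reading, be a better candidate than one that does.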
In summary, we model the dichotomous outcome (dependent variable) using Logistic Regression: we estimate the conditional probability of choosing the target response option “clicking on the network with a [icon]”, net of important independent variables. Our statistical modelling approach is relatively straightforward. First, we investigate the effect of the password, because we expect it to be an important and significant control; we do find evidence of this and thus include it in all subsequent models. Second, we investigate whether participants make an informed decision relative to each scenario, and then whether the participants’ choices are consistent with the meaning they attribute to the cues. Finally, we investigate whether the respondents’ choices vary significantly by several basic socio-demographic variables.
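As an illustration of the first modelling step, the following Python sketch fits a logistic regression with an intercept and a single binary predictor (for example, whether a password was provided) by Newton-Raphson. It is a sketch under stated assumptions and not the authors’ analysis: the function name, the restriction to one predictor, and the fixed number of iterations are choices made for the example, and the reported models include further variables. With one binary predictor, the fitted slope equals the log odds ratio of the corresponding 2x2 table, which gives a simple way to check the result.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by Newton-Raphson.

    `x` and `y` are equal-length sequences; `y` is coded 0/1.
    Returns the maximum-likelihood estimates (b0, b1).
    """
    b0 = b1 = 0.0
    for _ in range(iterations):
        # Gradient (g) of the log-likelihood and the matrix H = X'WX.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g, with the 2x2 inverse written out.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Made-up data for illustration only: x = password provided (0/1),
# y = chose the target network (0/1).
x = [0] * 10 + [1] * 10
y = [1] * 3 + [0] * 7 + [1] * 6 + [0] * 4
b0, b1 = fit_logistic(x, y)
odds_ratio = math.exp(b1)  # multiplicative change in the odds when x goes from 0 to 1
```

The sketch does not guard against perfectly separated data, where the estimates diverge; statistical packages report this case explicitly.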