To underline the importance of privacy for end-users, this section summarizes the literature on users’ privacy perceptions and an economic perspective on privacy. It describes how current digital assistants incorporate privacy features to address the need for data transparency. Prior work on usable privacy interfaces is presented, focusing on playful interactions, i.e. gamification and serious games. The section concludes with insights into user privacy preferences and defines the research gap.
Users’ privacy perceptions and privacy assistance
Privacy has different meanings in different disciplines, cultures and individuals (Smith et al. 2011; Rho et al. 2018). Information privacy is a subset of privacy that “concerns access to individually identifiable personal information” (Smith et al. 2011, p. 990). Accompanied by developments in information technologies, information privacy has become a main focus of information system (IS) research (Smith et al. 2011). However, no comprehensive definition has been established so far (Smith et al. 2011). In this paper, we build upon the notion of informational self-determination as defined by the German Federal Constitutional Court (1983) and enshrined in the European General Data Protection Regulation (GDPR).
Missing trust is an antecedent of privacy concerns (Smith et al. 2011). Together with prior experience in information systems, trust can “have a significant influence on the intention to disclose information” (Kumar et al. 2018, p. 631). With advances in information technology, privacy concerns have been raised in different domains (Bélanger and Crossler 2011; Goldfarb and Tucker 2012). For example, the impact of trust in DAs was examined by Mihale-Wilson et al. (2017). These concerns can be external and internal: external, because digital assistants are software-based and hackers can exploit security vulnerabilities to access personal information (Biswas and Mukhopadhyay 2018); internal, because DAs need a substantial amount of personal data for meaningful personalization. Users thus face a conflict between maintaining their privacy and benefiting from personalization, which is called the “personalization-privacy paradox” (Sutanto et al. 2013). In general, users feel the need to have control over their privacy (GlobalWebIndex 2019; Bélanger and Crossler 2011), which often includes a risk assessment on their part (Kehr et al. 2015). However, in the field of IoT, many users are unaware that their devices are able to intrude upon their privacy (Lopez et al. 2017; Manikonda et al. 2018).
As digital assistants rely heavily on personal data, there is always a threat to the user’s privacy (Maedche et al. 2019; Saffarizadeh et al. 2017), and there are different attacker models. Strangers with physical access to a smart speaker could ask questions that reveal sensitive details about the owner of the device (Hoy 2018). For example, a study of Amazon’s Alexa revealed that attackers can access usage patterns, the user’s interests, shopping lists, schedules and driving routes (Chung and Lee 2018). With the omnipresence of connected devices, ever more data could be disclosed (Liu et al. 2017). It is becoming increasingly difficult for end-users to manage their privacy, for example because DAs feature a large number and variety of configurable privacy controls, which makes it difficult for users to align the settings with their preferences (Liu et al. 2016). Further, users do not understand the underlying algorithms (Fischer and Petersen 2018).
Privacy assistants are meant to help users manage their privacy (Das et al. 2018). These assistants can control and visualize the data streams from different devices and are capable of learning users’ privacy preferences (Liu et al. 2016). Several authors have developed privacy assistants that actively support users. For example, Andrade and Zorzo (2019) present “Privacy Everywhere”, a mechanism that processes IoT data before it is sent to the cloud, in order to allow users to decide whether or not to disclose it. In the same domain, Das et al. (2018) develop an assistant that predicts user privacy preferences. Other domains include smartphones (e.g. Liu et al. 2016) or smart buildings (e.g. Pappachan et al. 2017). However, none of these studies focuses on the user interface to control or understand privacy settings.
Users’ privacy preferences and their economic value
Understanding privacy preferences is important for business (as it can be a competitive advantage), for legal scholars (as privacy is becoming increasingly prominent), and for policy makers (to identify desirable goals) (Acquisti et al. 2013). Accordingly, much research has been done on preferences in different domains. In the field of home assistants, Fruchter and Liccardi (2018, p. 1) identify “data collection and scope, ‘creepy’ device behavior, and violations of personal privacy thresholds” as major user concerns. Other work focuses on preferences related to mobile devices. Emami-Naeini et al. (2017) are able to predict privacy preferences with an accuracy of 86%, while “observing individual decisions in just three data-collection scenarios”. Similarly, Liu et al. (2016) use a smartphone app to identify user preferences and create privacy profiles over time. Lin et al. (2012, p. 501) use “crowdsourcing to capture users’ expectations of what sensitive resources mobile apps use”. Das et al. (2018) apply machine learning to predict user preferences. However, privacy preferences are subjective and difficult to assess (Acquisti et al. 2013), depend on data type and retention time (Emami-Naeini et al. 2017), and change over time (Goldfarb and Tucker 2012). These preferences influence system adoption (McLean and Osei-Frimpong 2019).
On the one hand, users value “privacy as a good in itself” (Acquisti et al. 2016, p. 447) and are confronted with trade-offs between privacy and functionality when using a DA. On the other hand, users are willing to disclose information for monetary incentives (Hui et al. 2007). Whereas privacy is often seen as an imperfect information asymmetry between consumers and companies (Acquisti et al. 2016), Jann and Schottmüller (2020) disprove this view by identifying situations in which privacy can even be Pareto-optimal.
A recent study by Cisco (2020, p. 3) found that “more than 40% [of the surveyed companies] are seeing benefits at least twice that of their privacy spend”. Nonetheless, privacy “is one of the most pressing issues” (Jann and Schottmüller 2020, p. 93) for companies. To help them anticipate financial gains, several researchers use conjoint analyses to estimate users’ willingness-to-pay (WTP) for different privacy features (e.g. Mihale-Wilson et al. 2019; Roßnagel et al. 2014). In the field of online storage, Naous and Legner (2019, p. 363) perform a conjoint analysis and find that “users are unwilling to pay for additional security and privacy protection features”. Derikx et al. (2016) find that users of connected cars set aside their privacy concerns towards the insurance company in exchange for minor financial compensation. In the field of DAs, Zibuschka et al. (2016) conduct a user study and investigate three key attributes, namely data access control, transparency, and processing location. Using a choice-based conjoint (CBC) analysis, they find that users are willing to pay 19.97 €/month for an assistant offering intelligent data access control to the servers of the manufacturer while providing intelligent transparency notices. A similar approach, but with different attributes, is taken by Mihale-Wilson et al. (2017), who find that users prefer a DA with full encryption and distributed storage certified by a third-party auditor, and have a WTP of 25.85 €/month.
Usable privacy interfaces
Several authors conduct studies on privacy and usability in prototypical settings (e.g. Adjerid et al. 2013; Cranor et al. 2006; Wu et al. 2020). Acquisti et al. (2017, p. 11) emphasize the importance “to overcome decision complexity through the design of interfaces that offer users […] easy-to-understand options”. Bosua et al. (2017, p. 91) stress the importance of nudges “as a warning or intervention that can support users in making decisions to disclose relevant or more or less information”. Most personal assistants today use voice commands to interact with the user. However, Easwara Moorthy and Vu (2015) found that users prefer not to deal with personal information via voice-controlled assistants. Instead, privacy interfaces could be facilitated using gamification, i.e. the usage of “game-like elements in a non-game context” (Deterding et al. 2011, p. 9). Examples of such elements include scoring systems or badges that can be obtained by the user of an information system and that foster motivation (Blohm and Leimeister 2013). Other interfaces, so-called serious games, are fully-fledged games using game mechanics in a non-entertainment environment (Blohm and Leimeister 2013). For example, until 2017 Zynga visualized its privacy policy in “PrivacyVille” and let users walk around to explore how their data is processed (Ogg 2011).
Various domains now use gamification and serious games. For example, health and fitness apps encourage users to do their workout (Huotari and Hamari 2017). In e-learning systems, users earn points for answering questions correctly (Aparicio et al. 2019). In user authentication research, users are presented with a chess-like playing field and playing tokens. They progress through the playing field and choose a path which is then converted into a login path that represents a secure password (Ebbers and Brune 2016). However, so far, very little research has been conducted on gamification as a means for controlling and understanding privacy settings.
Privacy information provisioning
Besides playful elements, the proper amount of information is an important element of user interfaces. Whereas too much information can confuse users, too little information is “inefficient and may tax a person’s memory” (Galitz 2007, p. 181). In the field of personal data, privacy policies are commonly used to inform users how their information is treated. As stated in Article 12 GDPR, the vendor must provide this information in a “concise, transparent, intelligible and easily accessible form, using clear and plain language” (European Parliament and Council 2018). However, many privacy policies are too long and difficult to read. For example, users would have to spend 244 h each year just to read the privacy policies of every website they visit (Gluck et al. 2016). To lower users’ effort and raise awareness, some researchers attempt to visualize privacy policies (Soumelidou and Tsohou 2020). For example, “The Usable Privacy Policy Project” coordinated by Prof. Sadeh and Prof. Acquisti has been tackling this problem since 2013 (Carnegie Mellon University 2017). Moreover, Cranor et al. (2006, p. 135) develop user interfaces based on a “standard machine-readable format for website privacy policies”.
Additionally, a prospective threat analysis of possible privacy-related functions could be useful for individuals (Friedewald and Pohoryles 2016). We consider this a privacy impact assessment by end-users. There is much research on the possible threats of information sharing, e.g. location data being used for location-based advertising (Crossler and Bélanger 2019) or even to identify a user’s home address (Drakonakis et al. 2019), or shopping behavior revealing a pregnancy to marketing companies before even family members know (Hill 2012). A DA could show such examples to the user, e.g. as soon as an IoT device asks to be granted location access.
Explainable decisions of digital assistants
Decision-making is often a process of mutual cooperation between humans and DAs (Maedche et al. 2019). Digital assistants can use artificial intelligence (AI) to make recommendations or even autonomous decisions based on the stored data. However, this decision-making process is invisible to users, who therefore often mistrust or do not accept AI (Ribeiro et al. 2016). Accordingly, the European Commission (2019) calls for understandable and “trustworthy AI”. To provide greater visibility, decisions could be explained in different ways, for instance based on the underlying algorithm (Gunaratne et al. 2018) or based on the data involved (Kuehl et al. 2020; Miller et al. 2019). While Förster et al. (2020) conduct a study to identify user preferences for different types of explanation, they do not ask what the explanation should be based on. Whereas data-based explanations could be especially interesting when interacting with personal data, studies on the acceptance of the COVID-19 contact tracing app in Germany show that some users are also interested in a deeper understanding of the underlying algorithm and processes (Buder et al. 2020). We provide a comprehensible example of these attributes in the “Concept Selection” section.
Research gap
The body of literature on DAs is rich. However, research on customers’ attitudes towards privacy features, in particular the user interface, and on their willingness to pay for such features is scarce. Whereas several researchers identify user preferences concerning technical and organizational privacy features of digital assistants, preferences regarding the design of the user interface have not been studied. Possible interface features, such as explainable AI, the amount of presented information and the degree of gamification, have so far only been studied separately and outside a privacy context. We close this gap by combining these elements and identifying the feature combinations that prospective users prefer.
Research method
This study aims to identify user preferences for the design of the DA’s interface for controlling and understanding privacy settings. More specifically, we investigate preferences concerning the amount of background information about personal data that is shown to the user, the explainability of the assistant’s decisions, and the degree of gamification of the DA’s interface.
Figure 1 shows our research design. It is divided into three stages, as proposed by Chapman et al. (2008). Prior to the first stage, we performed a literature review of privacy features for assistants. Then, in “concept selection”, we relied on a panel of qualified experts to choose the attributes. Next, we tested the validity and understandability of these features in a first questionnaire, which was presented to a sample of non-experts who are nonetheless potential users of such a DA. In the second stage, “study refinement” (originally called “product refinement”), we conducted a representative user study in which we presented users with choice sets of different privacy assistants at different prices (Gensler et al. 2012). In the final stage, we analyzed the CBC results.
Literature search
We conducted an iterative literature search in journals of the AIS Basket of Eight using the search engine Scopus, limited to English and German publications from 2000 or later. As our research question concerns user interaction, we also included the journals from the SIG HCI. Further, as privacy is an interdisciplinary concept, we searched in Google Scholar while prioritizing papers with the highest number of citations. We came across the “Personalized Privacy Assistant Project”, reviewed their publications and conducted a forward and backward search. The keywords and precise search process can be found in online Appendix 2. We started by searching for general information privacy literature, then for technical features and characteristics of current DAs, and lastly combined both search topics to find features that have privacy implications. In total, we identified 16 attributes with three attribute levels each. One example is the way prospective users could communicate with the DA: text-based, image-based or voice-based (Knote et al. 2019). (The full list can be found in online Appendix 1.)
“Concept selection”: Identifying three key attributes for the user interface’s interaction with the digital assistant
Following Chapman et al. (2008), in this stage we held informal talks with five experts (three with a doctoral degree, one professor, and one with significant industry experience in the domain of privacy and assistance systems). We chose this method because it answers “questions that commonly arise in early-phase user research for new technology products” (Chapman et al. 2008, p. 37), which suits our study because the assistant’s features can be regarded as new technology products. We limited ourselves to three key attributes and the corresponding attribute levels to avoid “biased or noisy results” (Johnson and Orme 1996, p. 1) and to keep the CBC manageable for respondents. We informed the experts of the user-centric perspective as a decision criterion. Further, the experts selected attributes based on practical relevance, as well as their feasibility and applicability for a multinational technology company active in the IoT field, among others. In addition, we gave the experts the opportunity to suggest alternative wordings. After several rounds of coordination, the experts agreed on the key attributes and their corresponding attribute levels (see Table 1).
Table 1 Final set of key attributes and attribute levels for the design of privacy features with explanation
For the “explainability of the assistant’s decisions” attribute, the assistant could explain decisions on three distinct levels of detail: (1) No explanation means that the assistant just presents its decision. For example, while en route to a destination, the assistant guides the user to a gas station if the vehicle needs refueling. (2) The assistant explains its decision by presenting the data used as a basis. For example, the assistant guides the user to a specific gas station and displays the symbol of a loyalty card. The user concludes that the DA chose this gas station because she owns the respective loyalty card and can earn loyalty points. (3) In addition to the data basis, the assistant explains the underlying algorithm of its decision. For example, it shows the current tank level and current gas consumption and calculates the maximum travel distance still possible. Then the assistant shows a circle on a map with the diameter of the travel distance and highlights gas stations within it. In addition, it shows the calculation of expected loyalty points depending on the fueled liters.
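To make level (3) more tangible, the following minimal Python sketch illustrates the kind of range calculation the assistant could disclose; all values are invented for illustration and are not taken from the study.

```python
# Illustrative calculation an assistant could expose when explaining its
# refueling recommendation (all values assumed, not taken from the study):
tank_level_l = 30                 # remaining fuel in litres
consumption_l_per_100km = 6.0     # current average consumption
max_range_km = tank_level_l / consumption_l_per_100km * 100   # -> 500 km
# Gas stations within this remaining travel distance around the current
# position would then be highlighted, together with expected loyalty points.
print(max_range_km)
```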
The attribute “amount of presented information” covers the scope and effect of data processing and features the following levels. (1) Users are presented with text directly from the privacy policy. For example, the assistant could be granted general permission to access the user’s gallery while using the systems of a third party. (2) Besides the privacy policy, there is a visual explanation of data usage. For example, the assistant shows if and how often the gallery is accessed. (3) Users are offered a prospective analysis about the possible impacts of data disclosure. For example, the assistant informs the user that it is possible to access location data stored in the images.
The attribute “degree of gamification of the UI” can offer (1) no playful elements, i.e. a standard interface. (2) Gamified elements can award users points for blocking devices’ privacy-violating permissions. (3) A serious game can represent user devices and interactions in an immersive game world. For example, users can block a washing machine’s network access with virtual scissors as an avatar. Fig. 2 shows these different interface options.
Price is our fourth attribute. It helps us to elicit the WTP of users, which is defined as the “price at which a consumer is indifferent between purchasing and not purchasing” (Gensler et al. 2012, p. 368). According to Schlereth and Skiera (2009), prices should be chosen so that respondents are not subject to extreme choice behavior, which would result if they always or never chose the no-purchase option. To avoid this issue, we followed the approach of Gensler et al. (2012), who argue that a good study design should offer both low and high price levels to enforce choices and no-choices. We suggested different price levels to the expert panel based on the price levels used in related studies (Krasnova et al. 2009; Mihale-Wilson et al. 2017; Mihale-Wilson et al. 2019; Roßnagel et al. 2014; Zibuschka et al. 2016). Based on a joint discussion with regard to technical feasibility and potential development costs, the experts suggested four subscription prices: 5, 10, 30 and 59 €/month. This is a common approach, but there is no guarantee that it covers the full range of possible WTP (Gensler et al. 2012).
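To make the resulting design space concrete, the following Python sketch encodes the three key attributes from Table 1 together with the four price levels and enumerates the full profiles a CBC design can draw on. The dictionary keys and level labels are illustrative shorthand, not the exact wording shown to participants.

```python
from itertools import product

# Illustrative encoding of the three key attributes (cf. Table 1) and the price levels.
# Labels are shorthand; the survey used full descriptions and examples for each level.
ATTRIBUTES = {
    "explainability": ["no explanation", "data basis", "data basis + algorithm"],
    "information_amount": ["privacy policy text", "visual data-usage explanation",
                           "prospective impact analysis"],
    "gamification": ["standard interface", "gamified elements", "serious game"],
}
PRICE_LEVELS_EUR_PER_MONTH = [5, 10, 30, 59]

# Enumerate all full profiles (3 x 3 x 3 x 4 = 108 combinations).
profiles = [
    dict(zip(list(ATTRIBUTES) + ["price"], combo))
    for combo in product(*ATTRIBUTES.values(), PRICE_LEVELS_EUR_PER_MONTH)
]
print(len(profiles))   # 108
print(profiles[0])     # e.g. {'explainability': 'no explanation', ..., 'price': 5}
```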
Next, we conducted a moderated survey (n = 20) to verify the understandability of the three attributes and attribute levels. Further, we aimed at descriptive results for possible usage domains and reasons for using privacy features. We employed a paper questionnaire and asked participants to think aloud while completing it. The insights served as the basis for the final survey.
“Study refinement”: Representative user study
In the “study refinement” phase (Chapman et al. 2008), we carried out a CBC survey. We chose this method because it is “a useful methodology for representing the structure of consumer preferences” (Green and Srinivasan 1978, p. 120), can present easy-to-imagine products to prospective users (Desarbo et al. 1995), and is widely used in information systems and market research (Miller et al. 2011). Further, it allows us to evaluate prospective users’ valuation of certain product features using statistical measures (Gensler et al. 2012). In our CBC, participants worked through 15 choice sets, each presenting different versions of the design of a DA’s privacy features at different prices, plus a no-purchase option. We used the Sawtooth Software to obtain an optimal design. We also ran a pre-test to ensure that users understood the different attribute levels before offering the survey to the full sample. The CBC survey comprised four steps:
1. Introductory video clip: We showed a five-minute video clip in English (with German subtitles) to introduce digital assistants and to demonstrate their possible use in the normal workday of a fictitious employee called Henry. This video was originally created for the marketing purposes of a multinational firm and did not relate directly to our three identified attributes. Instead, its aim was to help users see what data a DA uses and perhaps make them start to think about possible privacy threats. Users were not able to skip the video clip, as the “next” button appeared only after the video had ended.
The video showed the following use cases:
- The assistant wakes Henry earlier than scheduled to inform him of a traffic jam forecast on his way to the office. Thus, he decides to work from home and asks the assistant to inform his colleagues in the office.
- As the day goes on, the assistant informs Henry that his grandmother is unwell and advises him to buy medicine.
- When leaving the house, the DA reminds Henry to take his key.
- When the postal worker rings at the door, Henry allows the assistant to disable the alarm system and open the door for sixty seconds.
- The DA warns that the dryer might have a malfunction and suggests sending diagnostic data to the maintenance service.
2. Descriptive text: After the video clip, descriptive text informed the user that the DA needs personal data to provide such functionalities.
3. Introduction of different privacy features: We introduced several versions of the DA that differ with regard to the design of privacy features. In the next step, we explained these design features and the three identified key attributes. We provided a description and examples for each attribute level.
4. Users start the choice-based conjoint survey: Table 2 shows an example of a choice set.
Table 2 Example of a CBC choice set with user’s marker on product 3
“Customer definition”: Analyzing demographic and psychographic information
In the “customer definition” stage (Chapman et al. 2008), we enlisted the help of a market research company to investigate a sample of the German population that is representative with regard to demographics, specifically gender and age distribution. We predefined a sample size of 300 participants, as this is a common number for representative user studies in this domain (e.g. Gensler et al. 2012).
We elicited demographic (e.g. age and education) and psychographic (e.g. risk tolerance, technology affinity and playfulness) information. This is needed to better understand users’ decisions and to identify consumer groups as a basis for further market segmentation. To do so, we applied established scales to evaluate our psychographic variables, such as Steenkamp and Gielens (2003) for the users’ attitude towards innovation, Meuter et al. (2005) for measuring the participants’ technology affinity, and Taylor (2003) and Westin (1967) for examining privacy concerns. The user interaction scale, originally from Serenko and Turel (2007), was adapted to the usage of DAs and showed similar reliability scores.
Evaluating the participants’ willingness to pay
The WTP can be interpreted as the threshold at which a user is indifferent between purchasing and not purchasing (Gensler et al. 2012). In a CBC, consumers repeatedly pick their preferred variation of a product. The underlying assumption is that a user chooses the product that yields the highest personal utility. The no-purchase option helps us to determine unpopular product variants. This information can be used to calculate the WTP. Equation (1) describes the probability that individual h chooses a specific feature version i from a choice set a (Gensler et al. 2012).
$$ P_{h,i,a}=\frac{\exp \left(u_{h,i}\right)}{\exp \left(u_{h,NP}\right)+\sum_{i^{\prime}\in I_a}\exp \left(u_{h,i^{\prime}}\right)} $$
(1)
where
- $P_{h,i,a}$: probability that consumer h chooses product i from choice set a.
- $u_{h,i}$: consumer h’s utility level of product i.
- $u_{h,NP}$: consumer h’s utility level of the no-purchase option.
- $u_{h,i^{\prime}}$: consumer h’s utility level of all presented products.
- $H$: index set of consumers.
- $A$: index set of choice sets.
- $I$: index set of products.
- $I_a$: index set of products in choice set a, excluding the no-purchase option.
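For illustration, the following Python sketch evaluates (1) for one consumer and one choice set. The utility values are invented; in the actual study such utilities follow from the estimated individual parameters described next.

```python
import numpy as np

def choice_probabilities(u_products, u_no_purchase):
    """Multinomial logit choice probabilities following Eq. (1).

    u_products:    utilities u_{h,i'} of the products in choice set a
    u_no_purchase: utility u_{h,NP} of the no-purchase option
    Returns one probability per product plus, as the last entry,
    the probability of choosing the no-purchase option.
    """
    u_all = np.append(np.asarray(u_products, dtype=float), u_no_purchase)
    exp_u = np.exp(u_all - u_all.max())   # subtract max for numerical stability
    return exp_u / exp_u.sum()

# Illustrative utilities for three DA versions and the no-purchase option.
print(choice_probabilities(u_products=[1.2, 0.4, 2.0], u_no_purchase=0.0))
```

Hierarchical Bayes or latent-class estimation, as described next, searches for the individual parameters that make the observed choices most probable under exactly this model.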
The parameters in (1) can vary because the WTP differs across consumers. Thus, we used hierarchical Bayes to derive individual parameters (Karniouchina et al. 2009; Andrews et al. 2002). Alternatively, we can use information about prior segment membership probabilities to derive individual parameters based on estimated segment-specific parameters by employing a latent class multinomial logit (MNL) model, which maximizes the likelihood function (Wedel and Kamakura 2000). We can gain information about the user’s WTP by deriving individual parameters for the product attributes ($\beta_{h,j,m}$), price ($\beta_{h,price}$), and the no-purchase option ($\beta_{h,NP}$) (Moorthy et al. 1997).
$$ \sum_{j\in J}\sum_{m\in M_j}\beta_{h,j,m}\cdot x_{i,j,m}-\beta_{h,price}\cdot WTP_{h,i}=\beta_{h,NP}\qquad \forall h\in H,\ i\in I $$
(2)
Rearranging (2) yields:
$$ WTP_{h,i}=\frac{1}{\beta_{h,price}}\cdot \left(\sum_{j\in J}\sum_{m\in M_j}\beta_{h,j,m}\cdot x_{i,j,m}-\beta_{h,NP}\right)\qquad \forall h\in H,\ i\in I $$
(3)
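A purely illustrative numerical example (all parameter values assumed): if consumer h’s attribute part-worths for product i sum to 2.5, the price parameter is $\beta_{h,price}=0.1$ per €/month, and the no-purchase utility is $\beta_{h,NP}=0.5$, then (3) yields

$$ WTP_{h,i}=\frac{1}{0.1}\cdot \left(2.5-0.5\right)=20\ \text{€/month}. $$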
Individual h’s utility for product i is the sum of the partial utilities contributed by the product attributes and the price (4):
$$ u_{h,i}=\sum_{j\in J}\sum_{m\in M_j}\beta_{h,j,m}\cdot x_{i,j,m}+\beta_{h,price}\cdot p_i $$
(4)
with
- $u_{h,i}$: consumer h’s utility level of product i.
- $J$: index set of product attributes excluding price.
- $M_j$: index set of alternatives for attribute j.
- $\beta_{h,j,m}$: consumer h’s parameter for alternative m of attribute j.
- $x_{i,j,m}$: binary variable indicating whether product i features level m of attribute j.
- $\beta_{h,price}$: consumer h’s price parameter.
- $p_i$: price of product i.
We can define the WTP as the price point at which a user is indifferent between purchasing and not purchasing a version (5). Then we can deduce the WTP as follows (6):
$$ \sum_{j\in J}\sum_{m\in M_j}\beta_{h,j,m}\cdot x_{i,j,m}+\beta_{h,price}\cdot WTP_{h,i}=\beta_{h,NP} $$
(5)
$$ WTP_{h,i}=\frac{1}{\beta_{h,price}}\left(\beta_{h,NP}-\sum_{j\in J}\sum_{m\in M_j}\beta_{h,j,m}\cdot x_{i,j,m}\right) $$
(6)
where $\beta_{h,NP}$ is consumer h’s utility from choosing the no-purchase option.
We can estimate individual parameters for the product attributes, price, and the no-purchase option (i.e. $\beta_{h,j,m}$, $\beta_{h,price}$, $\beta_{h,NP}$) based on the CBC output and formulas (1) and (4). Accordingly, we can calculate the overall WTP (6) for each of the presented DA versions (Moorthy et al. 1997).
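As a final illustration, the following Python sketch plugs made-up individual-level parameters into (4) and (6) for one attribute combination. It is not the estimation code itself; the parameter values, level labels, and the convention of a negative price coefficient are assumptions for demonstration only.

```python
# Illustrative individual-level parameters for one consumer h (all values assumed).
beta_attributes = {                       # part-worths beta_{h,j,m} of product i's levels
    "explainability: data basis + algorithm": 1.1,
    "information amount: prospective impact analysis": 0.9,
    "gamification: gamified elements": 0.5,
}
beta_price = -0.1                         # utility per €/month (negative: higher price, lower utility)
beta_no_purchase = 0.5                    # utility of the no-purchase option

def utility(price_eur_per_month):
    """Eq. (4): utility of product i at a given monthly price."""
    return sum(beta_attributes.values()) + beta_price * price_eur_per_month

# Eq. (6): willingness to pay for this attribute combination.
wtp = (beta_no_purchase - sum(beta_attributes.values())) / beta_price
print(round(wtp, 2))                      # 20.0 €/month with these illustrative values

# Sanity check per Eq. (5): at the WTP, the product's utility equals the no-purchase utility.
assert abs(utility(wtp) - beta_no_purchase) < 1e-9
```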