1 Introduction

John Perry Barlow states: “The internet is the most liberating tool for humanity ever invented, and also the best for surveillance. It’s not one or the other. It’s both” [1]. One of the reasons for surveilling users is a rising economic interest in the internet [2]. However, users who have privacy concerns and feel a strong need to protect their privacy are not helpless: they can make use of privacy-enhancing technologies (PETs). PETs allow users to improve their privacy by eliminating or minimizing personal data disclosure to prevent unnecessary or unwanted processing of personal data [3]. Examples of PETs include services which allow anonymous communication, such as Tor [4] or JonDonym [5]. There has been a lot of research on Tor and JonDonym [6, 7], but the large majority of it is of a technical nature and does not consider the user. However, the number of users is crucial for this kind of service. Besides the economic point of view, which suggests that more users allow a more cost-efficient way to run those services, the quality of the offered service depends on the number of users, since an increasing number of (active) users also increases the anonymity set. The anonymity set is the set of all possible subjects who might cause an action [8]; thus, a larger anonymity set may make it more difficult for an attacker to identify the sender or receiver of a message.
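The effect of the anonymity set size can be illustrated with a minimal sketch. It assumes, purely for illustration, an attacker without side information who considers every member of the set equally likely; the function names are hypothetical and not taken from the cited literature.

```python
import math


def uniform_guess_probability(anonymity_set_size: int) -> float:
    """Chance of naming the true sender with one uniform random guess.

    Assumption: all subjects in the anonymity set are equally likely.
    """
    if anonymity_set_size < 1:
        raise ValueError("the anonymity set must contain at least one subject")
    return 1.0 / anonymity_set_size


def anonymity_bits(anonymity_set_size: int) -> float:
    """Size of the anonymity set expressed in bits (log2 of the set size)."""
    if anonymity_set_size < 1:
        raise ValueError("the anonymity set must contain at least one subject")
    return math.log2(anonymity_set_size)


if __name__ == "__main__":
    for size in (10, 1_000, 2_000_000):
        print(size, uniform_guess_probability(size), round(anonymity_bits(size), 2))
```

Under this simplifying assumption, every doubling of the number of active users halves the attacker's chance of a correct guess. Real attackers may use traffic patterns or other side information, so the sketch is an optimistic bound rather than a measure of actual anonymity.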

In the end, the sustainability of a service depends not only on the number of active users but also on a company or organization with the intention of running the service. One such intention certainly is a well-working business model. As a consequence, it is crucial not only to learn about the users’ intention to use a PET, but also to understand the users’ willingness to pay (WTP) for a service. Determining the factors that explain the users’ WTP, along with a suitable tariff structure, is the key step towards establishing economically sustainable services for privacy. The current market for PET providers is rather small, and some say the market even fails [9]. We argue that the lack of WTP for privacy is one of the most important reasons why no large players engage in offering a PET. Earlier research on WTP often works with hypothetical scenarios (e.g. with conjoint analyses) and concludes that users are not willing to pay for their privacy [10, 11]. We tackle the issue based on actual user experiences and behaviors and extend past research by analyzing two existing PETs with active users, some of whom already pay or donate for the service. Tor and JonDonym are comparable with respect to their functionality and partially with respect to the users’ perceptions about them. However, they differ in their business model and organizational structure. Therefore, we investigate the following two research questions:

RQ1: Which factors influence the willingness to pay for PETs?

RQ2: What are preferred tariff options of active users of a commercial PET?

The remainder of the paper is structured as follows: Sect. 2 briefly introduces the anonymization services Tor and JonDonym and lists related work on PETs and users’ willingness to pay. In Sect. 3, we present the research hypotheses and describe the questionnaire and the data collection process. We present the results of our empirical research in Sect. 4 and discuss the results and conclude the paper in Sect. 5.

2 Theoretical Background and Related Work

Privacy-Enhancing Technologies (PETs) is an umbrella term for different privacy protecting technologies. Borking and Raab define PETs as “a coherent system of ICT measures that protects privacy […] by eliminating or reducing personal data or by preventing unnecessary and/or undesired processing of personal data; all without losing the functionality of the data system” [12]. In the following sections, we describe Tor and JonDonym as well as related work with respect to WTP for privacy.

2.1 Tor and JonDonym

Tor and JonDonym are low-latency anonymity services which redirect packets in order to hide metadata (the sender’s or receiver’s Internet Protocol (IP) address) from passive network observers. Low-latency anonymity services can be used for interactive services such as messengers. Network overhead nevertheless increases latency; Fabian et al. [13] evaluated this and found associated usability issues when using Tor. Technically, Tor (the onion router) is an overlay network where the users’ traffic is encrypted and directed over several different servers (relays). The chosen traffic routes should be difficult for an adversary to observe, which means that unpredictable routes through the Tor network are chosen. The relays where the traffic leaves the Tor network are called “exit nodes”, and for an external service the traffic seems to originate from those. JonDonym is based on user-selectable mix cascades, with two or three mix servers in one cascade. For mix networks, route unpredictability is not important, so within one cascade the same sequence of mix servers is always used. Thus, for an external service the traffic seems to originate from the last mix server in the cascade. As a consequence, other usability issues may arise when websites face abusive traffic from the anonymity services [14] and decide to restrict users from the same origin. Restrictions range from outright rejection to limiting the users’ access to a subset of the service’s functionality or imposing hurdles such as CAPTCHA-solving [15]. To the user, it appears that the website does not function properly. Tor offers an adapted browser including the Tor client for using the Tor network, the “Tor Browser”. Similarly, the “JonDoBrowser” includes the JonDo client for using the JonDonym network. Although technically different, JonDonym and Tor are highly comparable with respect to the general technical structure and the use cases.
However, the entities that operate the PETs are different. Tor is operated by a non-profit organization with thousands of voluntarily operated servers (relays) over which the encrypted traffic is directed. Tor is free to use, with the option that users can donate to the Tor project. The number of active users is estimated at approximately 2,000,000 [4]. JonDonym is run by a commercial company. The mix servers used to build different mix cascades are operated by independent and non-interrelated organizations or private individuals who all publish their identity. The service is available for free with several limitations, such as a maximum download speed. In addition, there are different premium rates without these limitations that differ with regard to duration and included data volume. Thus, JonDonym offers several different tariffs and is not based on donations. The actual number of users is not known since the service does not keep track of it.
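The layered forwarding that both services rely on can be sketched in a few lines. This is a toy illustration only: the XOR keystream below is not secure and is not the cryptography used by Tor or JonDonym, and the names `keystream_xor`, `wrap` and `route` are hypothetical. It merely shows that the client adds one layer per relay or mix and that each hop removes exactly one layer, so no single hop before the last one sees the plaintext.

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a SHA-256-derived keystream (NOT secure)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(4, "big")).digest())
        counter += 1
    # zip() stops at len(data), so surplus keystream bytes are ignored.
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap(message: bytes, hop_keys: list[bytes]) -> bytes:
    """Client side: add one layer per hop, innermost layer for the last hop."""
    for key in reversed(hop_keys):
        message = keystream_xor(key, message)
    return message


def route(packet: bytes, hop_keys: list[bytes]) -> bytes:
    """Network side: every hop on the path peels off its own layer in order."""
    for key in hop_keys:
        packet = keystream_xor(key, packet)
    return packet


if __name__ == "__main__":
    keys = [b"hop-1-key", b"hop-2-key", b"hop-3-key"]  # hypothetical session keys
    packet = wrap(b"GET / HTTP/1.1", keys)
    print(route(packet, keys))
```

In a mix cascade the list of hops is fixed for the whole cascade, whereas a Tor client would pick a fresh, hard-to-predict list of relays; in this sketch that difference only affects how `hop_keys` is chosen.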

From a research perspective, there are some papers about JonDonym, e.g. a user study on user characteristics of privacy services [16]. Yet, the majority of work is about Tor. Most of the work is technical [6], e.g. on improvements such as relieved network congestion, improved router selection, enhanced scalability or reduced communication/computational cost of circuit construction [17]. There is also a lot of work on security and anonymity properties [18, 19] and on traffic correlation [20].

2.2 Related Work

Previous non-technical work on PETs mainly considers usability studies and does not primarily focus on WTP. For example, Lee et al. [21] assess the usability of the Tor Launcher and propose recommendations to overcome the usability issues they found. Further research suggests zero-effort privacy [22, 23] by improving the usability of the service. In quantitative studies, we already investigated privacy concerns and trust in JonDonym [24] and Tor [25, 26] based on Internet users’ information privacy concerns (IUIPC) [27] and could extend the causal model by “trust in the service”, which plays a crucial role for the two PETs. Some experiments suggest that users are not willing to pay for their privacy [10, 11]. In contrast to these experiments, we surveyed actual users, some of whom already pay or donate for the service. Grossklags finds contradicting behavior of users when it comes to WTP to protect information and “willingness to accept” compensation for revealing information [28]. Further work covers selling personal data [29, 30], e.g. on data markets [31], or experiments on the value of privacy [32]. Some work tries to explain the privacy paradox with economic models [33] or discusses the right of the users to know the value of their data [34]. However, all of these focus on the value of certain data or privacy and not on the users’ WTP for privacy. Cranor et al. investigate how actual users use their privacy preferences tool [35]. Spiekermann investigates the traits and views of actual users of the predecessor of JonDonym, AN.ON/JAP, a free anonymity service [16]. However, since the tools were free, none of them investigated the users’ WTP. Following a more high-level view, some research addresses the markets for PETs. Federrath claims that there is a market for PETs but that they have to consider law enforcement functionality [36]. Rossnagel analyzes PET markets for anonymity services based on diffusion of innovations theory [9] and concludes that the market fails.
Schomakers et al. perform a cluster analysis of users, find three groups with different attitudes towards privacy and argue that each of the groups needs distinct tools [37]. Along the same lines, further research concludes that one should focus on specific subgroups for the adoption of Tor [38]. Following a market perspective, Boehme et al. analyze the conditions under which it is profitable for sellers in e-commerce environments to support PETs, assuming that without PETs they could increase their profit with price discrimination [39].

3 Methodology

In this section we present the research hypotheses, the questionnaire and the data collection process. The demographic questions were not mandatory. This was done on purpose since we assumed that most of the participants are highly sensitive with respect to their personal data and could potentially react to mandatory demographic questions by terminating the survey. Consequently, the demographics are incomplete to a large extent. Therefore, we had to refrain from discussing the demographics in our research context.

The statistical analysis of the research data is conducted with the open-source software R. First, we focus solely on JonDonym and compare the average preferences for alternative tariff schemes. We differentiate between participants who state that they use JonDonym free of charge and those who state that they use one of the available premium tariffs. Due to the non-normality of the data, we use the non-parametric Wilcoxon rank sum test to determine whether preferences for newly designed tariffs differ between the two types of users. We designed these new tariffs in collaboration with the chief executive of the JonDos GmbH in order to provide realistic pricing schemes which are economically viable and sustainable for the company. We used the paired Wilcoxon test to determine whether users’ preferences for one tariff are statistically significantly different from the other tariffs. The Wilcoxon rank sum test is also called the Mann-Whitney U test. It is a nonparametric test of the null hypothesis that two independent samples come from the same distribution, i.e. that a randomly selected value from one sample is equally likely to be smaller or larger than a randomly selected value from the other sample. The paired Wilcoxon test is also called the Wilcoxon signed-rank test, a similar nonparametric test used for dependent samples [40, 41]. In order to illustrate the difference in preferences between the two types of users, i.e. free users and premium users, we use boxplots to visualize the descriptive statistics of the two samples [42]. A boxplot is a method for graphically depicting groups of numerical data through their quartiles. Boxplots are non-parametric: they display variation in samples of a statistical population without making any assumptions about the underlying statistical distribution. The lower line of the box is the first quartile, the band inside the box is the second quartile (the median) and the upper line of the box is the third quartile.
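The two building blocks of this analysis, the rank sum (Mann-Whitney U) statistic and the quartiles drawn in a boxplot, can be sketched as follows. The study itself was run in R; this is a minimal Python illustration with made-up 7-point ratings, and the function names are hypothetical. It computes only the U statistic, not the p-value.

```python
from statistics import quantiles


def mann_whitney_u(sample_x, sample_y):
    """U statistic of sample_x: number of (x, y) pairs with x > y, ties count 0.5."""
    u = 0.0
    for x in sample_x:
        for y in sample_y:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def box_quartiles(sample):
    """First quartile, median and third quartile, i.e. the three box lines."""
    q1, median, q3 = quantiles(sample, n=4, method="inclusive")
    return q1, median, q3


if __name__ == "__main__":
    # Made-up preference ratings on a 7-point scale (4 = neutral).
    free_users = [4, 4, 3, 5, 4, 2, 4]
    premium_users = [6, 5, 7, 4, 6, 5]
    u_free = mann_whitney_u(free_users, premium_users)
    u_premium = mann_whitney_u(premium_users, free_users)
    # The two U values always add up to the number of pairs.
    print(u_free, u_premium, len(free_users) * len(premium_users))
    print(box_quartiles(free_users), box_quartiles(premium_users))
```

A U value close to half the number of pairs indicates that neither group tends to rate higher, while a value near zero or near the number of pairs indicates a clear shift between the groups.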

3.1 Research Model and Hypotheses for the Logistic Regression Model

As a last step, we conduct a logistic regression to find out which factors influence users’ willingness to pay for privacy (in our case willingness to pay for JonDonym and willingness to donate to Tor). We use logistic regression because our dependent variable is binary. A linear regression is not an appropriate model here because it assumes that the dependent variable (WTP) is continuous with normally distributed errors [43]. The probit regression, which assumes a normally distributed latent error term, is an alternative for binary outcomes; we opted for the logistic model. Willingness to pay for JonDonym is defined as the binary classification of JonDonym users’ actual behavior.

$$ \text{willingness to pay} = \begin{cases} 0, & \text{if the respondent uses a free tariff} \\ 1, & \text{if the respondent uses a premium tariff} \end{cases} \tag{1} $$

Accordingly, willingness to donate is defined as the binary classification of Tor users’ actual behavior.

$$ \text{willingness to donate} = \begin{cases} 0, & \text{if the respondent has never donated} \\ 1, & \text{if the respondent has donated} \end{cases} \tag{2} $$

The independent variables are risk propensity (RP), frequency of improper invasion of privacy (VIC), trusting beliefs in online companies (TRUST), trusting beliefs in the PET, i.e. JonDonym or Tor (TRUSTPET), and whether the respondent knows the respective other service, Tor or JonDonym (TOR/JD). Thus, our research model is as follows:

$$ WTP/WTD_{i} = \beta_{0} + \beta_{1} RP_{i} + \beta_{2} VIC_{i} + \beta_{3} TRUST_{i} + \beta_{4} TRUST_{PET,i} + \beta_{5} TOR/JD_{i} + \varepsilon_{i} \tag{3} $$
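Since the dependent variable is binary, a logistic regression maps the linear combination of the predictors to a probability through the logistic function. The following sketch shows that mapping. The coefficient values are invented for illustration and are not the estimates of this study; `predicted_probability` and the key `TOR_JD` are hypothetical names.

```python
import math

# Predictor names follow the research model; TOR_JD stands for TOR/JD.
PREDICTORS = ("RP", "VIC", "TRUST", "TRUSTPET", "TOR_JD")


def logistic(z: float) -> float:
    """Logistic function that maps a linear predictor to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predicted_probability(beta: dict, respondent: dict) -> float:
    """Probability of paying or donating implied by the coefficients beta."""
    z = beta["intercept"] + sum(beta[name] * respondent[name] for name in PREDICTORS)
    return logistic(z)


if __name__ == "__main__":
    # Invented coefficients, NOT the values reported in the paper.
    beta = {"intercept": -3.0, "RP": -0.2, "VIC": 0.1,
            "TRUST": -0.1, "TRUSTPET": 0.6, "TOR_JD": -0.3}
    respondent = {"RP": 4.0, "VIC": 3.0, "TRUST": 2.0, "TRUSTPET": 6.0, "TOR_JD": 1.0}
    print(round(predicted_probability(beta, respondent), 3))
```

A positive coefficient raises the predicted probability of paying or donating as the predictor increases, and a negative coefficient lowers it, which is how the signs of the hypotheses are read off the fitted model.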

Risk propensity measures the risk aversion of the individual, i.e. the higher the measure, the more risk-averse the individual [44]. The literature finds that risk aversion can act as a driver to protect an individual’s privacy [45]. Thus, we hypothesize:

H1: Risk propensity (RP) has a positive effect on the likelihood of paying or donating for PETs.

Privacy victim (VIC) measures how often individuals experienced a perceived improper invasion of their privacy [27]. Results of past research dealing with perceived bad experiences with privacy indicate that such experiences can cause individuals to protect their privacy to a larger extent [46]. Thus, we hypothesize:

H2: The more frequently users felt that they were a victim of an improper breach of their privacy, the more likely they are to pay or donate for PETs.

The construct trust in online companies assesses individuals’ trust in online companies with respect to handling their personal data [27]. Results in the literature suggest that a higher trust in online companies has a positive effect on the willingness to disclose personal information. Following this finding, we argue that users who have a higher level of trust in online companies are less likely to spend money on protecting their privacy. Therefore, we hypothesize:

H3: The more users trust online companies with handling their personal data, the less likely they are to pay or donate for PETs.

Trust in JonDonym/Tor is adapted from Pavlou [47]. Trust can refer to the technology itself (in our case the PETs Tor and JonDonym) as well as to the service provider. Since the non-profit organization of Tor evolved around the service itself [4], it is rather difficult for users to distinguish which label refers to the technology and which refers to the organization. The same holds for JonDonym, since JonDonym is the only main service offered by the commercial company JonDos. Thus, we decided to ask for trust in the PET (Tor and JonDonym, respectively), assuming that the difference from asking for trust in the organization/company is negligible. The literature shows that trust in services enables positive attitudes towards interacting with these services [24, 25, 26, 47]. In line with these results, we argue that a higher level of trust in the PET increases the likelihood of spending money on it. Thus, we hypothesize:

H4: The more users trust the PET, the more likely they are to pay or donate for it.

Lastly, we included a question about whether users of Tor/JonDonym know JonDonym/Tor. We included this question due to previous findings about a substitution effect of Tor with regard to the WTP for JonDonym [48]. Some users of JonDonym stated that they would only spend money on a premium tariff if Tor did not exist. Thus, we include this factor as a control variable in our analysis and hypothesize:

H5: The likelihood of JonDonym users to pay for a premium tariff decreases, if they are aware of Tor (we do not expect a similar effect for Tor users).

3.2 Data Collection

We conducted the studies with German and English-speaking users of Tor and JonDonym. For each service, we administered two questionnaires. Some items for the German questionnaire had to be translated since some constructs are adapted from the English literature. To ensure content validity of the translation, we followed a rigorous translation process. First, we translated the English questionnaire into German with the help of a certified translator (translators are certified according to the DIN EN 15038 norm). Second, the German version was given to a second independent certified translator who retranslated the questionnaire to English. This step was done to ensure the equivalence of the translation. Third, a group of five academic colleagues checked the two English versions with regard to this equivalence. All items were found to be equivalent [49]. The items for all analyses can be found in the appendix.

We installed the surveys on a university server and managed them with the survey software LimeSurvey (version 2.72.6) [50]. For Tor, we distributed the links to the English and German versions over multiple channels on the internet. An overview of every distribution channel can be found in an earlier paper based on the same dataset [26]. In sum, 314 participants started the questionnaire (245 English version, 40 English version posted in hidden service forums, 29 German version). Of those 314 participants, 135 (105 English version, 13 English version posted in hidden service forums, 17 German version) filled out the questionnaires completely. After deleting all participants who answered a test question in the middle of the survey incorrectly, 124 usable data sets remained for the following analysis. For JonDonym, we distributed the links to the English and German versions with the beta version of the JonDonym browser and published them on the official JonDonym homepage. In sum, 416 participants started the questionnaire (173 English version, 243 German version). Of those 416 participants, 141 (53 English version, 88 German version) remained after deleting unfinished sets and all participants who answered a test question incorrectly.

4 Results

We present the results of our empirical analyses in this section. In the first part, we discuss the analysis of the current tariff structures (JonDonym) and donation statistics (Tor). Furthermore, we assess preferences of JonDonym users regarding new alternative tariff schemes. In the second part, we show the results of the logistic regression model with the factors influencing the willingness to pay (JonDonym)/to donate (Tor).

4.1 Tariff Analysis for JonDonym

Among the 141 JonDonym users of our survey, 85 use a free tariff and 56 use JonDonym with a paid tariff. Among the 124 Tor users of our survey, 93 have never donated to Tor. Among donating users, the donation amounts are arbitrary. The payment structure of JonDonym and descriptive statistics for the donations to Tor are shown in Table 1. It can be seen that roughly one third of the participants spend money on JonDonym (39.72%) and Tor (25%). To analyze potential tariff optimizations for JonDonym, we asked about users’ preferences for three general tariff structures, namely a high-data-volume tariff (TP1), a low-price tariff (TP2) and a low-anonymity tariff (TP3). In addition, we designed five new tariffs. TRN4 is the tariff with the lowest data volume per month and TRN5 is the tariff with the highest data volume per month. The specific wording of the tariff options can be found in the appendix.
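The shares of paying and donating users follow directly from the counts given above. A short sketch of the arithmetic, using only the sample sizes stated in the text (the variable names are illustrative):

```python
def share(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`."""
    return 100.0 * part / whole


# Counts reported in the text.
jondonym_total, jondonym_free = 141, 85
tor_total, tor_never_donated = 124, 93

jondonym_paying = jondonym_total - jondonym_free  # users with a paid tariff
tor_donors = tor_total - tor_never_donated        # users who donated at least once

jondonym_share = share(jondonym_paying, jondonym_total)
tor_share = share(tor_donors, tor_total)

if __name__ == "__main__":
    print(f"JonDonym paying users: {jondonym_paying} ({jondonym_share:.2f}%)")
    print(f"Tor donors: {tor_donors} ({tor_share:.2f}%)")
```

The counts imply that 56 of 141 JonDonym users (39.72%) pay for a tariff and that 31 of 124 Tor users (25%) have donated.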

Table 1. Tariff and donation statistics of JonDonym and Tor users

Figure 1 shows the boxplots for the preferences for the five new tariff options (TRN), differentiated between free and premium users, as well as for the three alternative tariff structures (TP). The median preferences of free users for the five tariffs are neutral (preference = 4). However, the mean preference of free users for TRN4 is slightly higher compared to the other options. In comparison, premium users have a higher preference for TRN1 and TRN4. In a next step, we analyze whether the differences between options illustrated with the boxplots are statistically significant for the different groups (full sample, premium users, free users) (Table 2). Our results indicate that the whole sample of users shows the highest preference for TRN4 and the second highest preference for TRN1. The remaining tariffs, i.e. TRN2, TRN3 and TRN5, are favored least of all. However, this contradicts the finding that the full sample shows the highest preference for TP1. Thus, it makes sense to split the sample and look at free and premium users separately. Premium users show the highest preference for TRN1 and TRN4, the second highest preference for TRN2 and TRN3, and the lowest preference for TRN5. Thus, they show a higher preference for 100 GB tariffs. This is in line with the finding that premium users have the highest preference for TP1. Free users show a neutral preference for all five tariffs except for TRN4 (slightly higher).

Fig. 1. Users’ preference for alternative tariff structures (left side) and users’ preferences for tariff structures (right side), free users = 85, premium users = 56

Table 2. Paired Wilcoxon tests for the five new tariffs and three tariff structures

Table 2 also presents the results for the differences in preferences for the tariff structures (TP). The results indicate that the 141 users have a higher preference for a high-data-volume tariff compared to a low-price tariff (TP1 vs. TP2). The results are similar for the sub-group of premium users. They have the same preference order as the whole sample of users. However, free users have the same preference for TP1 and TP2.
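The paired comparisons rest on the Wilcoxon signed-rank statistic, which ranks the absolute within-person differences between two ratings and sums the ranks separately for positive and negative differences. A minimal sketch with made-up ratings follows; the function name is hypothetical, zero differences are dropped, tied absolute differences receive average ranks, and the p-value computation is omitted.

```python
def signed_rank_sums(ratings_a, ratings_b):
    """Return (W_plus, W_minus) for paired ratings of the same respondents."""
    diffs = [a - b for a, b in zip(ratings_a, ratings_b) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next absolute difference is tied.
        while end + 1 < len(order) and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]]):
            end += 1
        average_rank = (pos + end) / 2 + 1  # mean of the 1-based ranks pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus


if __name__ == "__main__":
    # Made-up ratings of the same six respondents for two tariff structures.
    tp1 = [6, 5, 7, 4, 6, 5]
    tp2 = [4, 5, 5, 5, 3, 4]
    print(signed_rank_sums(tp1, tp2))
```

If respondents rate both options alike, the two rank sums are of similar size; a large imbalance, as in the made-up example, points to a systematic preference for one option.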

4.2 Factors Influencing Willingness to Pay for Privacy

Before analyzing the results in detail, we have to assess whether the independent variables correlate with each other (multicollinearity), since this would negatively impact the validity of our model. We test for multicollinearity by calculating the variance inflation factor (VIF) for all independent variables. None of the variables has a VIF larger than 1.7, indicating that multicollinearity is not an issue for our sample.

The results of the logistic regression model can be seen in Table 3. We highlighted statistically significant results in bold face. For JonDonym, RP and TRUSTPET are the only statistically significant independent variables in the model. Surprisingly, RP has a negative coefficient, indicating that more risk-averse users are less likely to choose a premium tariff for JonDonym. This empirical result is in contrast to hypothesis 1; thus, we cannot confirm this hypothesis derived from results of the literature and the associated rationale. Reasons for this contradictory result can be manifold. For example, there might be unobserved variables not included in the model which impact the relationship between RP and WTP. Hypotheses 2, 3 and 5 cannot be confirmed either due to insignificant coefficients. In contrast to this, hypothesis 4 can be confirmed. Given the average marginal effect (avg. marg. effect), our result indicates that a one-unit increase in trust in JonDonym increases the likelihood of choosing a premium tariff by 12.17%. This result is statistically significant at the 0.1% level. Hypothesis 4 can also be confirmed for the logistic regression model for Tor users with a slightly larger average marginal effect size of 12.45%. The variable VIC is statistically significant at the 1% level with a marginal effect of 5.33%. This indicates that bad experiences with privacy breaches lead to a higher probability of donating money to Tor, and thereby of supporting the Tor project financially. No other hypotheses can be confirmed for Tor.
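An average marginal effect translates a logit coefficient into a change in probability. One common definition, assumed here because the text does not spell out the computation, averages the derivative β_k · p_i · (1 − p_i) over all respondents i, where p_i is the predicted probability. The sketch uses invented coefficients and data, and the function names are hypothetical.

```python
import math


def logistic(z: float) -> float:
    """Logistic function that maps a linear predictor to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def average_marginal_effect(beta, rows, k):
    """Average of beta[k] * p * (1 - p) over all rows.

    beta[0] is the intercept; beta[j] belongs to column j - 1 of each row.
    Assumption: derivative-based marginal effect of a continuous predictor.
    """
    total = 0.0
    for x in rows:
        p = logistic(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))
        total += beta[k] * p * (1.0 - p)
    return total / len(rows)


if __name__ == "__main__":
    # Invented coefficients (intercept, RP, TRUSTPET) and respondents.
    beta = [-2.0, -0.3, 0.6]
    rows = [[4.0, 5.0], [2.0, 6.0], [5.0, 3.0], [3.0, 4.0]]
    print(round(average_marginal_effect(beta, rows, 2), 4))
```

Because p · (1 − p) never exceeds 0.25, the average marginal effect is always smaller in magnitude than a quarter of the coefficient, which is why the effects are reported on the probability scale rather than as raw coefficients.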

Table 3. Results of the logistic regression model

5 Discussion and Conclusion

With respect to research question 1, our results show that PET providers should focus on building a strong reputation, since trust in the PET is the strongest factor influencing the probability of spending money on privacy for both JonDonym and Tor. In addition, we can observe that Tor users are more likely to donate for the service if they were a victim of a privacy breach or violation in the past.

Our second research question is about an optimized design of tariff options for users of commercial PETs based on the case of JonDonym. Here, we can see that the results differ when looking at different groups of users, which is in line with former research [37]. Users who use JonDonym with the free option are indifferent with respect to the newly introduced tariffs as well as the general tariff structures (high volume vs. low price vs. low anonymity). However, some of them tend to prefer the tariff option with the lowest price and an included high-speed volume of 40 GB the most. Thus, free users would prefer the cheapest tariff if they were to decide to pay at all. Practically, this implies that commercial PET providers should try to offer options with a relatively low monetary barrier to convert as many free users as possible into paying ones. The already paying users prefer high-volume tariffs over the other options.

Limitations of this study are the following. First, our sample only includes a relatively small number of active users of both PETs. This sample size is sufficient for our statistical analyses. However, the results about the current payment and donation numbers provide only a rough idea about the actual distribution. In addition, it is very difficult to gather data on actual users of PETs since the population that we could survey is comparably small. It is also relevant to mention that we did not offer any financial rewards for participation. A second limitation concerns possible self-report biases (e.g. social desirability). We addressed this issue by gathering the data fully anonymized. Third, mixing results of the German and English questionnaires could be a source of errors. On the one hand, this procedure was necessary to achieve the minimum sample size. On the other hand, we followed a very thorough translation procedure to ensure the highest possible level of equivalence. Thus, we argue that this limitation did not affect the results to a large extent. However, we cannot rule out that there are unobserved effects on the results due to running the survey in more than one country. Lastly, demographic questions were not mandatory due to our assumption that individuals who use Tor or JonDonym are highly cautious with respect to their privacy. Thus, we decided to go for a larger sample size, considering that we might have lost participants otherwise (if demographic questions had been mandatory). However, we must acknowledge that demographic variables might be relevant confounders in the regression model explaining the WTP of PET users.

Future work should aim to determine the relation between paying users and the groups Schomakers et al. [37] identified. In addition, researchers can build on our results by implementing such tariff options for commercial PET services in practice and investigating whether users are more prone to spend money on their privacy protection. Furthermore, it is relevant for commercial PET providers to differentiate themselves from free competitors such as Tor in our example. This can be done by providing a higher level of usability in terms of ease of use, performance and compatibility with other applications [25, 48]. If commercial PET providers cannot create a unique selling point (USP) compared to free services, it is very unlikely that they will establish a successful monetization strategy in the market. Therefore, it is necessary to investigate what a USP for a commercial PET provider could look like and to assess it in the field with active users of existing PETs as well as non-users.