Flow-based reputation with uncertainty: Evidence-Based Subjective Logic

The concept of reputation is widely used as a measure of trustworthiness based on ratings from members in a community. The adoption of reputation systems, however, relies on their ability to capture the actual trustworthiness of a target. Several reputation models for aggregating trust information have been proposed in the literature. The choice of model has an impact on the reliability of the aggregated trust information as well as on the procedure used to compute reputations. Two prominent models are flow-based reputation (e.g., EigenTrust, PageRank) and Subjective Logic based reputation. Flow-based models provide an automated method to aggregate trust information, but they are not able to express the level of uncertainty in the information. In contrast, Subjective Logic extends probabilistic models with an explicit notion of uncertainty, but the calculation of reputation depends on the structure of the trust network and often requires information to be discarded. These are severe drawbacks. In this work, we observe that the 'opinion discounting' operation in Subjective Logic has a number of basic problems. We resolve these problems by providing a new discounting operator that describes the flow of evidence from one party to another. The adoption of our discounting rule results in a consistent Subjective Logic algebra that is entirely based on the handling of evidence. We show that the new algebra enables the construction of an automated reputation assessment procedure for arbitrary trust networks, where the calculation no longer depends on the structure of the network, and does not need to throw away any information. Thus, we obtain the best of both worlds: flow-based reputation and consistent handling of uncertainties.


Introduction
Advances in ICT and the increasing use of the Internet have changed the way people do everyday things and interact with each other. Much of what people do now happens online, which has significantly increased the number of business transactions carried out daily over the Internet. Often, users have to decide whether to interact with services or users with whom they have never interacted before. Uncertainty about services and users' behavior is often perceived as a risk [2] and, thus, it can restrain a user from engaging in a transaction with unknown parties. Therefore, to fully exploit the potential of online services, platforms and ultimately online communities, it is necessary to establish and manage trust amongst the parties involved in a transaction [11,40,43].
Reputation is widely adopted to build trust among users in online communities where users do not know each other beforehand. The basic idea underlying reputation is that a user's past experience, as well as the experience of other users, influences his decision whether to repeat an interaction in the future. Thus, reputation provides an indication of services' and users' trustworthiness based on their past behavior [36]. When a user has to decide whether to interact with another party, he can consider its reputation and start the transaction only if it is trustworthy. Therefore, a reputation system, which helps manage reputations (e.g., by collecting, distributing and aggregating feedback about services and users' behavior), becomes a fundamental component of the trust and security architecture of any online service or platform [42].
The application and adoption of reputation systems, however, relies on their ability to capture the actual trustworthiness of the parties involved in a transaction [41]. The quality of a reputation value depends on the amount of information used for its computation [10,17,27]. A reputation system should use "sufficient" information. However, it is difficult to establish the minimum amount of information required to compute reputation; also, different users may have a different perception based on their risk attitude [2]. For instance, some users may accept to interact with a party which has a high reputation based on very few past transactions, while other users might require more evidence of good behavior. Therefore, a reputation system should provide a level of confidence in the computed reputation, for instance based on the amount of information used in the computation [15,32]. This additional information will provide deeper insights to users, helping them decide whether to engage in a transaction or not. In addition, reputation systems should provide an effective and preferably automated method to aggregate the available trust information and compute reputations from it.
Reputation systems usually rely on a mathematical model to aggregate trust information and compute reputation [13]. Several mathematical models for reputation have been proposed in the literature. These models can be classified with respect to the used mathematical foundations, e.g. summation and averaging [14], probabilistic models [15,21,28], flow-based [6,23,25,35], fuzzy metrics [3,38]. As pointed out in [42], the choice of the type of model has an impact on the type and amount of trust information as well as on the procedure used to compute reputation.
Among the others, two prominent reputation models are the flow-based model and Subjective Logic (SL) [15]. Flow-based reputation models use Markov chains as the mathematical foundation. Flow-based models provide an automated method to aggregate all available trust information. However, they are not able to express the level of confidence in the obtained reputation values. On the other hand, SL is rooted in the well-known Dempster-Shafer theory [34]. SL provides a mathematical foundation to deal with opinions and has the natural ability to express uncertainty explicitly. Intuitively, uncertainty incorporates a margin of error into reputation calculation due to the (limited) amount of available trust information. SL uses a consensus operator '⊕' to fuse independent opinions and a discounting operator '⊗' to compute trust transitivity. This makes SL a suitable mathematical framework for handling trust relations and reputation, especially when limited evidence is available. However, the consensus operator is rooted in the theory of evidence, while the discounting operator is based on a probabilistic interpretation of opinions. The different nature of these operators leads to a lack of "cooperation" between them. As a consequence, the calculation of reputation depends on the shape of the trust network, the graph of interactions, in which nodes represent the entities in the system and edges are labeled with opinions. Depending on the structure of the trust network, some trust information may have to be discarded to enable SL-based computations.
Our desideratum is to have a reputation system which has the advantages of both flow-based reputation models and SL. In particular, the goal of this work is to devise the mathematical foundation for a flow-based reputation model with uncertainty. We make the following contributions towards this goal:
- We observe that the discounting rule '⊗' in SL does not have a natural interpretation in terms of evidence handling. We give examples of counterintuitive behavior of the ⊗ operation.
- We present a brief inventory of the problems that occur when one tries to combine SL with flow-based reputation metrics.
- We present a simplified justification of the mapping between evidence and opinions in SL.
- We introduce a new scalar multiplication operation in SL which corresponds to a multiplication of evidence. Our scalar multiplication is consistent with the consensus operation (which amounts to addition of evidence), and hence satisfies a distribution law, namely α · (x ⊕ y) = (α · x) ⊕ (α · y).
- We introduce a new discounting rule ⊠. It represents the flow of evidence from one party to another. During this flow, lack of trust in the party from whom evidence is received is translated into a reduction of the amount of evidence; this reduction is implemented using our new scalar multiplication rule. Our new discounting rule satisfies x ⊠ (y ⊕ z) = (x ⊠ y) ⊕ (x ⊠ z). This right-distribution property resolves one of the problems of SL.
Our new discounting rule multiplies evidence instead of opinions and is thus fully based on the handling of evidence. In contrast to the old discounting rule, the new one is not associative. This, however, does not pose a problem since the flow of evidence has a well-defined direction. The adoption of our discounting rule results in an opinion algebra that is entirely centered on the handling of evidence. We refer to it as Evidence-Based Subjective Logic (EBSL). We show that EBSL provides a solid foundation for the development of reputation models able to express the level of confidence in computed reputations. Moreover, having an opinion algebra rooted in a single foundation makes it possible to define an automated procedure to compute reputation for arbitrary trust networks. We believe that having an automated procedure is a critical and necessary condition for the implementation and adoption of reputation systems in online services. To demonstrate the applicability of EBSL to the development of reputation systems, we make the following contributions:
- We show that replacing SL's discounting operation ⊗ by the new ⊠ solves all the problems that usually occur when one tries to combine flow-based reputation with SL, in particular the problem of evidence double-counting and the ensuing necessity to make computations graph-dependent and to discard information.
- Using EBSL, we construct a simple iterative algorithm that computes reputation for arbitrary trust networks without discarding any information. Thus, we achieve our desideratum of automated computation of flow-based reputation with uncertainty.
We stress that this is only one out of many possible reputation models that can be constructed on top of EBSL. EBSL can be used as a foundation for other existing reputation models, e.g. reputation models based on random walks [9]. The investigation of these models is an interesting direction for future work. The remainder of the paper is organized as follows. The next section presents an overview of flow-based reputation models and SL. Section 3 discusses the limitations of SL, and Section 4 illustrates these limitations when combining flow-based reputation and SL. Section 5 revisits SL and introduces the new EBSL scalar multiplication and discounting operators. Section 6 presents our flow-based reputation model with uncertainty along with an iterative algorithm that computes reputation for arbitrary trust networks. Section 7 presents an evaluation of our approach using both synthetic and real data. Finally, Section 8 discusses related work, and Section 9 concludes the paper providing directions for future work.

Preliminaries
In this section we present an overview of flow-based reputation models (based on [35]) and of Subjective Logic (based on [15]). We also introduce the notation used in the remainder of the paper.

Flow-based reputation
Flow-based reputation systems [6,23,25,35] are based on the notion of transitive trust. Intuitively, if an entity i trusts an entity j, it would also have some trust in the entities trusted by j.
Typically, each time user i has a transaction with another user j, she may rate the transaction as positive, neutral, or negative. In a flow-based reputation model, these ratings are aggregated in order to obtain a Markov chain. The reputation vector (i.e., the vector containing all reputation values) is computed as the steady state vector of the Markov chain; one starts with a vector of initial reputation values and then repeatedly applies the Markov step until a stable state has been reached. This corresponds to taking more and more indirect evidence into account.
Below, we present the metric proposed in [35] (with slightly modified notation) as an example of a flow-based reputation system.

Example 1 Let A be a matrix containing aggregated ratings for n users. It has zero diagonal (i.e., $A_{ii} = 0$). The matrix element $A_{ij}$ (with $i \neq j$) represents the aggregated ratings given to j by i. Let $s \in [0, 1]^n$, with $s \neq 0$, be a 'starting vector' containing starting values assigned to all users. Let α ∈ [0, 1] be a weight parameter for the importance of indirect vs. direct evidence. The reputation vector $r \in [0, 1]^n$ is defined as a function of α, s and A by the following equation:

$$ r_x = (1 - \alpha)\, s_x + \alpha \sum_y \frac{r_y}{\ell}\, A_{yx}, \qquad \ell = \sum_z r_z. \tag{1} $$

Eq. (1) can be read as follows. To determine the reputation of user x we first take into account the direct information about x. From this we can compute $s_x$, the reputation initially assigned to x if no further information is available. However, additional information may be available, namely the aggregated ratings in A. The weight of direct versus indirect information is accounted for by the parameter α. If no direct information about x is available, the reputation of x can be computed as $r_x = \sum_y (r_y/\ell) A_{yx}$, i.e. a weighted average of the reputation values $A_{yx}$ with weights equal to the normalized reputations. Adding the two contributions, with weights 1 − α and α respectively, yields (1): a weighted average over all available information.
The equation for solving the unknown r contains r. A solution is obtained by repeatedly substituting (1) into itself until a stable state has been reached. This is the steady state of the Markov chain. It was shown in [35] that Eq. (1) always has a solution and that the solution is unique.
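The repeated substitution described above can be sketched as a fixed-point iteration. This is a minimal illustration, not the implementation of [35]: it assumes that Eq. (1) has the form r_x = (1 − α) s_x + α Σ_y (r_y / Σ_z r_z) A_yx, and the function name `flow_reputation`, the tolerance, and the three-user matrix are hypothetical.

```python
def flow_reputation(A, s, alpha, tol=1e-12, max_iter=10_000):
    """Iterate r <- (1 - alpha) * s + alpha * (A^T r) / sum(r) until stable.

    Assumptions: A[y][x] is the aggregated rating given to x by y (entries in
    [0, 1], zero diagonal); alpha weights the indirect term; s is non-zero.
    """
    n = len(s)
    r = list(s)  # start from the initial reputation values
    for _ in range(max_iter):
        total = sum(r)  # normalisation of the weights r_y
        r_next = [
            (1 - alpha) * s[x]
            + alpha * sum(r[y] * A[y][x] for y in range(n)) / total
            for x in range(n)
        ]
        if max(abs(a - b) for a, b in zip(r_next, r)) < tol:
            return r_next
        r = r_next
    return r


# Hypothetical ratings for three users (row = rater, column = rated user).
A = [
    [0.0, 0.9, 0.2],
    [0.8, 0.0, 0.1],
    [0.7, 0.6, 0.0],
]
s = [0.5, 0.5, 0.5]
r = flow_reputation(A, s, alpha=0.7)
print([round(v, 4) for v in r])
```

Each pass of the loop folds in one more level of indirect evidence; the returned vector is the state at which a further pass no longer changes the values within the chosen tolerance.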
Intuitively, A can be seen as an adjacency matrix of a trust network where nodes represent entities and edges represent the direct trust that entities have in other entities based on direct experience. Based on the results presented in [35], Eq. (1) can be applied to assess reputation for arbitrary trust networks.

Subjective Logic
Subjective Logic (SL) is a trust algebra based on Bayesian theory and Boolean logic that explicitly models uncertainty and belief ownership. In the remainder of this section, we provide an overview of SL based on [15].
The central concept in SL is the three-component opinion.
Definition 1 (Opinion and opinion space) [15] An opinion x about some proposition P is a tuple $x = (x_b, x_d, x_u)$, where $x_b$ represents the belief that P is provable (belief), $x_d$ the belief that P is disprovable (disbelief), and $x_u$ the belief that P is neither provable nor disprovable (uncertainty). The components of x satisfy $x_b + x_d + x_u = 1$. The space of opinions is denoted as Ω and is defined as

$$ \Omega = \{ (b, d, u) \in [0, 1]^3 \mid b + d + u = 1 \}. $$

An opinion x with $x_b + x_d < 1$ can be seen as an incomplete probability distribution. In order to enable the computation of expectation values, SL extends the three-component opinion with a fourth parameter 'a' called relative atomicity, with a ∈ [0, 1]. The probability expectation is $E(x) = x_b + x_u a$. In this paper, we will omit the relative atomicity from our notation, because in our context (trust networks) it is not modified by any of the computations on opinions. In more complicated situations, however, the relative atomicity is modified in nontrivial ways.
Opinions are based on evidence. Evidence can be represented as a pair of nonnegative finite numbers (p, n), where p is the amount of evidence supporting the proposition, and n the amount that contradicts the proposition. The notation e = p + n is used to denote the total amount of evidence about the proposition. There is a one-to-one mapping between an opinion x ∈ Ω and the evidence (p, n) on which it is based,

$$ (x_b, x_d, x_u) = \frac{(p, n, 2)}{p + n + 2} \quad \Longleftrightarrow \quad (p, n) = \frac{2\,(x_b, x_d)}{x_u}. \tag{2} $$

The bijection (2) holds for any value of the atomicity a. It has its origin in an analysis of the a posteriori probability distribution (Beta function distribution) of the biases which underlie the generation of evidence [15]. This distribution is given by

$$ \rho(t \mid p, n) = \frac{t^{\,p + 2a - 1}\,(1 - t)^{\,n + 2(1 - a) - 1}}{B\big(p + 2a,\; n + 2(1 - a)\big)}, \tag{3} $$

where t is the probability that proposition P is true, and B(·, ·) is the Beta function. The opinion $x = (p, n, 2)/(p + n + 2)$ is based on the vague knowledge (3) about t. The left part of (2) holds because the thus constructed opinion x has the same expectation as (3), namely $(p + 2a)/(p + n + 2) = x_b + a x_u$.
Intuitively, the amount of positive and negative evidence about a proposition determines the belief and the disbelief in the proposition, respectively. Increasing the total amount of evidence (e) reduces the uncertainty. Note that there is a fundamental difference between an opinion where a proposition is equally provable and disprovable and one where we have complete uncertainty about the proposition. For instance, opinion (0, 0, 1) indicates that there is no evidence either supporting or contradicting the proposition, i.e. n = p = 0, whereas opinion (0.5, 0.5, 0) indicates that n = p = ∞.
We use the notation $p(x) := 2 x_b / x_u$ to denote the amount of supporting evidence underlying opinion x, and likewise $n(x) := 2 x_d / x_u$ for the amount of 'negative' evidence. Moreover, we use the notation e(x) = p(x) + n(x) to represent the total amount of evidence underlying opinion x.
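The mapping between evidence and opinions can be illustrated with a short sketch. It only restates the relations x = (p, n, 2)/(p + n + 2), p(x) = 2 x_b / x_u and n(x) = 2 x_d / x_u given above; the function names `opinion_from_evidence` and `evidence_from_opinion` are hypothetical.

```python
def opinion_from_evidence(p, n):
    """Map evidence (p, n) to the opinion (b, d, u) = (p, n, 2) / (p + n + 2)."""
    total = p + n + 2.0
    return (p / total, n / total, 2.0 / total)


def evidence_from_opinion(x):
    """Inverse map: (p, n) = 2 * (b, d) / u; undefined when u = 0."""
    b, d, u = x
    if u == 0:
        raise ValueError("u = 0 corresponds to an infinite amount of evidence")
    return (2.0 * b / u, 2.0 * d / u)


# Round trip for 3 supporting and 1 contradicting observation.
x = opinion_from_evidence(3, 1)
p, n = evidence_from_opinion(x)
print(x, (p, n))

# No evidence at all gives full uncertainty.
print(opinion_from_evidence(0, 0))
```

Adding evidence shrinks the uncertainty component 2/(p + n + 2), which is the behaviour described in the preceding paragraph.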
SL provides a number of operators to combine opinions. One of the fundamental operations is the combination of evidence from multiple sources. Consider the following scenario. Alice has evidence $(p_1, n_1)$ about some proposition. She forms an opinion $x_1 = (p_1, n_1, 2)/(p_1 + n_1 + 2)$. Later she collects additional evidence $(p_2, n_2)$ independent of the first evidence. Based on the second body of evidence alone, she would arrive at opinion $x_2 = (p_2, n_2, 2)/(p_2 + n_2 + 2)$. If she combines all the evidence, she obtains opinion

$$ x = \frac{(p_1 + p_2,\; n_1 + n_2,\; 2)}{p_1 + p_2 + n_1 + n_2 + 2}. \tag{4} $$

The combined opinion x is expressed as a function of $x_1$ and $x_2$ via the so-called 'consensus' rule; this is denoted as $x = x_1 \oplus x_2$.

Definition 2 (Consensus) [15] Let x, y ∈ Ω. The consensus x ⊕ y ∈ Ω is defined as

$$ x \oplus y = \frac{(x_u y_b + y_u x_b,\; x_u y_d + y_u x_d,\; x_u y_u)}{x_u + y_u - x_u y_u}. \tag{5} $$

Eq. (5) precisely corresponds to (4). Note that x ⊕ y = y ⊕ x. Furthermore, the consensus operation is associative, i.e. x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z. These properties are exactly what one intuitively expects from an operation that combines evidence. It is worth noting that the evidence has to be independent for the ⊕ rule to apply. Combining dependent evidence would lead to the problem of double counting evidence. We formalize and discuss this problem in Section 3.3.

The second important operation in SL is the transfer of opinions from one party to another. Consider the following scenario. Alice has opinion x about Bob's trustworthiness. Bob has opinion y about some proposition P. He informs Alice of his opinion. Alice now has to form an opinion about P. The standard solution to this problem is that Alice applies an x-dependent weight to Bob's opinion y [4,23,26,31,35]. This is the so-called 'discounting'. The following formula is usually applied.
Definition 3 (Discounting) [15] Let x, y ∈ Ω. The discounting of opinion y using opinion x is denoted as x ⊗ y ∈ Ω, and is defined as

$$ x \otimes y = (x_b y_b,\; x_b y_d,\; x_d + x_u + x_b y_u). \tag{6} $$

It holds that x ⊗ y ≠ y ⊗ x in general and that x ⊗ (y ⊗ z) = (x ⊗ y) ⊗ z. The discounting rule (6) is …

In SL, a trust network can be modeled with a combination of consensus and discounting operators. The consensus operator is used to aggregate trust information from different sources, while the discounting operator is used to implement trust transitivity. Note that, in a trust network, SL distinguishes two types of trust relationship: functional trust, which represents the opinion about an entity's ability to provide a specific function, and referral trust, which represents the opinion about an entity's ability to provide recommendations about other entities. Referral trust is assumed to be transitive, and a trust chain is said to be valid if the last edge of the chain represents functional trust and all previous edges represent referral trust.
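The two SL operators can be sketched in a few lines to check their algebraic properties numerically. This is an illustration under the standard SL formulas, x ⊕ y = (x_u y_b + y_u x_b, x_u y_d + y_u x_d, x_u y_u)/(x_u + y_u − x_u y_u) and x ⊗ y = (x_b y_b, x_b y_d, x_d + x_u + x_b y_u); the function names and the sample opinions are hypothetical, and opinions with u = 0 are not handled.

```python
def consensus(x, y):
    """Consensus of two opinions (b, d, u) with u > 0: adds the evidence."""
    xb, xd, xu = x
    yb, yd, yu = y
    k = xu + yu - xu * yu
    return ((xu * yb + yu * xb) / k, (xu * yd + yu * xd) / k, (xu * yu) / k)


def discount(x, y):
    """Discounting of opinion y through opinion x (SL rule)."""
    xb, xd, xu = x
    yb, yd, yu = y
    return (xb * yb, xb * yd, xd + xu + xb * yu)


def close(x, y, tol=1e-12):
    """Component-wise comparison up to floating-point tolerance."""
    return all(abs(a - b) < tol for a, b in zip(x, y))


x = (0.6, 0.1, 0.3)
y = (0.3, 0.6, 0.1)
z = (0.5, 0.2, 0.3)

consensus_commutes = close(consensus(x, y), consensus(y, x))
consensus_associates = close(
    consensus(x, consensus(y, z)), consensus(consensus(x, y), z)
)
discount_commutes = close(discount(x, y), discount(y, x))
discount_associates = close(
    discount(x, discount(y, z)), discount(discount(x, y), z)
)
print(consensus_commutes, consensus_associates,
      discount_commutes, discount_associates)
```

For these sample opinions the consensus is commutative and associative, while discounting is associative but not commutative, in line with the properties stated above.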

Limitations of Subjective Logic
Our desideratum is a novel reputation metric that has the advantages of both SL and flow-based models. On one hand, we aim at an automated procedure for computing reputation as in flow-based approaches. On the other hand, we aim to determine the confidence in reputation values by making uncertainty explicit as in SL. In this section, we discuss the limitations of SL. Then, in Section 4 we show how these limitations affect a naïve approach that combines flow-based reputation and SL.

Dogmatic Opinions
Definition 4 The extreme points corresponding to full belief (B), full disbelief (D) and full uncertainty (U) are defined as

$$ B = (1, 0, 0), \qquad D = (0, 1, 0), \qquad U = (0, 0, 1). $$

The special points B, D, U behave as follows regarding the consensus operation (for x with $x_u > 0$):

$$ B \oplus x = B, \qquad D \oplus x = D, \qquad U \oplus x = x. $$

The full uncertainty U behaves like an additive zero. With respect to the discounting rule, the special points B, D, U behave as

$$ B \otimes x = x, \qquad D \otimes x = U, \qquad U \otimes x = U. $$

Opinions that have u = 0 (i.e., lying on the line between B and D) are called 'dogmatic' opinions. They have to be treated with caution, since they have e = ∞ and therefore overwhelm other opinions when the consensus ⊕ is applied. We will come back to this issue in Section 5.1.

Counter-intuitive behaviour of the ⊗ operation
We observe that the discounting rule ⊗ does not have a natural interpretation in terms of evidence handling. For instance, if we compute the positive evidence contained in x ⊗ y we get, using the p, n notation introduced in Section 2.2,

$$ p(x \otimes y) = \frac{2 x_b y_b}{x_d + x_u + x_b y_u} = \frac{x_b\, p(y)}{x_b + (1 - x_b)/y_u} = \frac{p(x)\, p(y)}{p(x) + [n(x) + 2]\,[1 + \tfrac{1}{2} p(y) + \tfrac{1}{2} n(y)]}, \tag{8} $$

where in the final step we have used $1/y_u = 1 + \tfrac{1}{2} p(y) + \tfrac{1}{2} n(y)$ and $x_b = p(x)/[p(x) + n(x) + 2]$. Similarly, we get

$$ n(x \otimes y) = \frac{p(x)\, n(y)}{p(x) + [n(x) + 2]\,[1 + \tfrac{1}{2} p(y) + \tfrac{1}{2} n(y)]}. \tag{9} $$

Eqs. (8) and (9) are complicated functions of the amounts p(x), n(x), p(y), n(y). We show that they do not have a clear and well-defined interpretation in terms of evidence handling. For instance, Eq. (8) does not allow us to determine whether the evidence underlying x ⊗ y originates from x or y; one can argue that p(x ⊗ y) is either a "fraction" of p(y), a "fraction" of p(x), or contains evidence from both y and x. Even if we try to interpret the discounting in the standard way (see Section 2.2), we can observe that the factor multiplying p(y) in (8) and n(y) in (9) depends on p(y) and n(y). Hence, the evidence underlying x ⊗ y is not simply an x-dependent multiple of the evidence underlying y. In fact, the examples below show that in some cases the contribution from y completely disappears from the equation.
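A small numerical sketch makes the last point concrete. It assumes the standard SL discounting rule x ⊗ y = (x_b y_b, x_b y_d, x_d + x_u + x_b y_u) and the evidence map p(x) = 2 x_b / x_u; the helper names and the evidence amounts are hypothetical. With x held fixed, the ratio p(x ⊗ y)/p(y) changes as y gains evidence, so the result is not an x-dependent multiple of p(y).

```python
def to_opinion(p, n):
    """Opinion (b, d, u) = (p, n, 2) / (p + n + 2) for evidence (p, n)."""
    total = p + n + 2.0
    return (p / total, n / total, 2.0 / total)


def positive_evidence(x):
    """p(x) = 2 * b / u."""
    b, _, u = x
    return 2.0 * b / u


def discount(x, y):
    """SL discounting of y through x."""
    xb, xd, xu = x
    yb, yd, yu = y
    return (xb * yb, xb * yd, xd + xu + xb * yu)


x = to_opinion(8, 0)  # Alice's trust in Bob: p(x) = 8, n(x) = 0
results = []
for p_y in (1, 10, 100, 10_000):
    y = to_opinion(p_y, 0)  # Bob's opinion, purely positive evidence
    p_out = positive_evidence(discount(x, y))
    results.append((p_y, p_out, p_out / p_y))

for p_y, p_out, ratio in results:
    print(p_y, round(p_out, 3), round(ratio, 5))
```

In this setting the surviving positive evidence never exceeds 2 p(x)/(n(x) + 2) = 8, no matter how much evidence Bob has collected, and the fraction of Bob's evidence that survives keeps shrinking.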
In both these examples, y is based on a lot of evidence; but even if x contains a lot of belief, none of y's evidence survives in x ⊗ y. We conclude that the discounting operation ⊗ gives counter-intuitive results. The ⊗ rule is inspired by a probabilistic interpretation of opinions. The probabilistic interpretation might suggest that it is natural to multiply probabilities, i.e. that the expressions $x_b y_b$ and $x_b y_d$ in Def. 3 are natural. However, we argue that this is not at all self-evident. When discounting y through x, the uncertainties in x induce an x-dependent probability distribution on y. This can be thought of as an additional layer of uncertainty about beta distributions. Let (3) describe opinion y; then the discounting through x introduces uncertainty about the parameters p and n in the equation (a probability distribution on p and n). It is not at all self-evident that the resulting opinion is x ⊗ y as prescribed by Def. 3. It would make equal sense to replace the discounting factor $x_b$ by e.g. the expectation $x_b + a x_u$. In this paper we do not pursue such an approach based on distributions, but we mention it in order to point out that the ⊗ rule is not necessarily well-founded.
The fact that SL employs on the one hand a consensus rule based on adding evidence and on the other hand a discounting rule based on multiplying opinions leads to a lack of 'cooperation' between the ⊕ and ⊗ operations. Most importantly, the ⊗ is not distributive with respect to ⊕,

$$ x \otimes (y \oplus z) \neq (x \otimes y) \oplus (x \otimes z). \tag{10} $$

Consider the following scenario.
Example 4 Alice has opinion x about Bob's trustworthiness in providing recommendations. Bob experiments with chocolate on Monday and forms an opinion y about its medicinal qualities. On Tuesday he does some more of the same kind of experiments and forms an independent opinion z. He informs Alice of y, z and his final opinion y ⊕ z. What should Alice think about the medicinal qualities of chocolate? One approach is to say that Alice should appraise opinions y and z separately, yielding (x ⊗ y) ⊕ (x ⊗ z) (Note that the two occurrences of x represent the very same opinion, i.e. the evidence underlying the two occurrences is the same). Another approach is to weight Bob's combined opinion, yielding x ⊗ (y ⊕ z). Intuitively the two approaches should yield exactly the same opinion, yet the SL rules give (10).
We now present a numerical example that illustrates the issue discussed above.

Example 5 Figure 1 shows the trust network representing the scenario in Example 4. To highlight that opinions y and z are independent, in the figure we abuse the network notation and duplicate the node representing Bob: $B_1$ represents Bob on Monday and $B_2$ represents Bob on Tuesday. The edge between Alice (A) and Bob (dashed rectangle) represents Alice's opinion x about Bob's recommendations. This opinion concerns Bob's recommendations regardless of when they are formed (e.g., on Monday or on Tuesday). Suppose that Alice's opinion about Bob's trustworthiness is x = (0.6, 0.1, 0.3), and Bob's opinions about the proposition P are y = (0.3, 0.6, 0.1) and z = (0.5, 0.2, 0.3). We are interested in Alice's opinion w about P based on Bob's recommendations. As discussed in the previous example, we have two approaches to compute such an opinion:

$$ w = (x \otimes y) \oplus (x \otimes z) \approx (0.314,\, 0.341,\, 0.345), \qquad w = x \otimes (y \oplus z) \approx (0.227,\, 0.324,\, 0.449). $$

Clearly, the two approaches yield different opinions, contradicting the intuitive expectation.
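The two computations of Example 5 can be reproduced with a short script. It is a sketch based on the standard SL consensus and discounting formulas (Defs. 2 and 3); the function names are hypothetical, while the opinions x, y and z are the ones given in the example.

```python
def consensus(x, y):
    """SL consensus of two opinions (b, d, u) with u > 0."""
    xb, xd, xu = x
    yb, yd, yu = y
    k = xu + yu - xu * yu
    return ((xu * yb + yu * xb) / k, (xu * yd + yu * xd) / k, (xu * yu) / k)


def discount(x, y):
    """SL discounting of y through x."""
    xb, xd, xu = x
    yb, yd, yu = y
    return (xb * yb, xb * yd, xd + xu + xb * yu)


x = (0.6, 0.1, 0.3)  # Alice's opinion about Bob's recommendations
y = (0.3, 0.6, 0.1)  # Bob's opinion on Monday
z = (0.5, 0.2, 0.3)  # Bob's opinion on Tuesday

# Approach 1: discount each of Bob's opinions, then combine.
w_separate = consensus(discount(x, y), discount(x, z))
# Approach 2: combine Bob's opinions, then discount once.
w_combined = discount(x, consensus(y, z))

print([round(v, 3) for v in w_separate])  # [0.314, 0.341, 0.345]
print([round(v, 3) for v in w_combined])  # [0.227, 0.324, 0.449]
```

The first approach ends up with a noticeably lower uncertainty component than the second, even though both start from exactly the same three opinions.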

Double counting of evidence in Trust Networks
The ⊕ rule imposes constraints on the evidence that can be aggregated: it requires evidence to be independent [15]. In the literature, however, there is no well-defined notion of evidence independence. Some researchers [33,45] assume that pieces of evidence are independent if they are obtained from independent sources, where two sources are said to be independent if they measure completely unrelated features. This definition, however, is too restrictive. For instance, the evidence collected by a sensor at different points in time can also be independent.
Evidence is usually extracted from "observations" of a system. In this work we adopt a notion of evidence independence based on the independence of observations, which we define in the same way as independence of random variables. Intuitively, observation independence requires that the probability that $o_i$ happens, assuming that $o_j$ happened, is the same as the probability that $o_i$ happens regardless of $o_j$. The definition above can be extended to opinions:

Definition 6 (Independent Opinions) Let x, y ∈ Ω be opinions. We say that x and y are independent if and only if the evidence underlying x and y is independent.
Combining dependent evidence leads to the problem of double counting evidence.
Definition 7 (Double Counting) Let x, y ∈ Ω. In an expression of the form x ⊕ y we say that there is double counting if there is dependence between the evidence underlying x and the evidence underlying y.
Intuitively, dependent evidence shares "part" of the evidence. Therefore, aggregating dependent evidence leads to counting some part of evidence more than once.
where both occurrences of x are obtained from the same observation. The evidence underlying x is contained on the left side as well as the right side of the '⊕'. This is a clear case of double counting.
Example 7 Consider the expression (x ⊗ y) ⊕ (x ⊗ z), again with both instances 'x' coming from the same observation. The evidence underlying x is contained on the left side as well as the right side of the '⊕', but less evidently than in Example 6, because now x is used for discounting. In Section 3.2 we showed that the ⊗ rule causes evidence from x to end up in x ⊗ y in a complicated way. Hence the opinions x ⊗ y and x ⊗ z are not independent, which causes double counting of x in the expression (x ⊗ y) ⊕ (x ⊗ z).

It is worth noting that double counting of x in the expression (x ⊗ y) ⊕ (x ⊗ z) can also be observed in Example 5. Indeed, the uncertainty in (x ⊗ y) ⊕ (x ⊗ z) is lower than the uncertainty in x ⊗ (y ⊕ z), indicating that the result contains more evidence when the trust network is represented using the first expression.
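Double counting can be made visible by converting opinions back to amounts of evidence. The sketch below assumes the standard SL formulas and the evidence map e(x) = 2 (x_b + x_d)/x_u; the helper names are hypothetical and the opinions are those of Example 5. It first applies ⊕ to two copies of the same opinion, then compares the total evidence in the two expressions discussed above.

```python
def consensus(x, y):
    """SL consensus of two opinions (b, d, u) with u > 0."""
    xb, xd, xu = x
    yb, yd, yu = y
    k = xu + yu - xu * yu
    return ((xu * yb + yu * xb) / k, (xu * yd + yu * xd) / k, (xu * yu) / k)


def discount(x, y):
    """SL discounting of y through x."""
    xb, xd, xu = x
    yb, yd, yu = y
    return (xb * yb, xb * yd, xd + xu + xb * yu)


def total_evidence(x):
    """e(x) = p(x) + n(x) = 2 * (b + d) / u."""
    b, d, u = x
    return 2.0 * (b + d) / u


x = (0.6, 0.1, 0.3)
y = (0.3, 0.6, 0.1)
z = (0.5, 0.2, 0.3)

# Combining an opinion with itself counts the same evidence twice.
e_single = total_evidence(x)
e_doubled = total_evidence(consensus(x, x))

# Total evidence in the two expressions for the network of Example 5.
e_separate = total_evidence(consensus(discount(x, y), discount(x, z)))
e_combined = total_evidence(discount(x, consensus(y, z)))

print(round(e_single, 3), round(e_doubled, 3))
print(round(e_separate, 3), round(e_combined, 3))
```

The expression in which x appears twice reports more evidence than the canonical one, although no additional observations were made.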
To avoid the problem of double counting, SL requires that the trust network is expressed in a canonical form [20,22], where all trust paths are independent. Intuitively, a trust network expression is in canonical form if every edge appears only once in the expression.
Example 8 Consider the two trust network expressions representing the trust network in Figure 1 given in Example 4: (x ⊗ y) ⊕ (x ⊗ z) and x ⊗ (y ⊕ z). The first expression is not in canonical form as opinion x occurs twice in the expression; the second expression is in canonical form as every edge appears only once in the expression. Thus, x ⊗ (y ⊕ z) is the proper representation of the trust network in Figure 1.
In the next section we show that it is not always possible to express a trust network in canonical form. As suggested in [20,22], this issue can be addressed by removing some edges from the network. This, however, means discarding part of the trust information, thus reducing the quality of reputation values.

Combining flow-based reputation and Subjective Logic
This section presents a naïve approach that combines flow-based reputation and SL. We illustrate the limitations of such a naïve approach. We first introduce some notation and definitions. Flow-based reputation models usually assume that users who are honest during transactions are also honest in reporting their ratings [23]. This assumption, however, does not hold in many real-life situations [1]. Thus, as it is done in SL, we distinguish between referral trust and functional trust (see Section 2). We use A to represent direct referral trust and T to represent direct functional trust. R denotes the final referral trust and F the final functional trust.
Example of a trust network that is problematic for Subjective Logic.
Definition 8 For n users, the direct referral trust matrix A is an n × n matrix, where $A_{xy} \in \Omega$ (with $x \neq y$) is the direct referral trust that user x has in user y, and $A_{xx} = (0, 0, 1)$ for all x.
Note that we impose the condition $A_{xx} = (0, 0, 1)$ in order to prevent artifacts caused by self-rating [35]. Let $T_{jP}$ be the opinion of user j about some proposition P, and let $R_{ij}$ be i's (possibly indirect) opinion about the trustworthiness of user j. The opinion of user i about P based on direct and indirect evidence can be computed using the following equation:

$$ F_{iP} = T_{iP} \oplus \bigoplus_{j \neq i} \left( R_{ij} \otimes T_{jP} \right). \tag{11} $$

Eq. (11) computes the final functional trust $F_{iP}$ by combining user i's direct opinion $T_{iP}$ with all the available opinions of other users, $\{T_{jP}\}_{j \neq i}$. The opinion received from j is weighted with the 'reputation' $R_{ij}$. To find $R_{ij}$, we could try a recursive approach inspired by Eq. (1):

$$ R_{ij} = A_{ij} \oplus \bigoplus_{k \neq i} \left( R_{ik} \otimes A_{kj} \right). \tag{12} $$

To demonstrate the problems that occur in this naïve combination of flow-based reputation and SL, we consider the trust networks shown in Figures 2 and 3. Referral trust is drawn as a full line, functional trust as a double full line. Figure 2 is a variant of a network discussed in [20]. We are interested in determining the opinion $F_{1P}$ of user 1 about some proposition P. User 1 does not have any direct evidence. The only functional trust about P comes from user 7. Thus, we have to determine $R_{17}$, user 1's referral trust in user 7. This is done by taking $A_{67}$ with the proper weight, namely $R_{16}$. Continuing this recursive approach using (12) gives (13). This, however, is a problematic result. Recall that SL requires trust networks to be expressed in a canonical form. If this restriction is not satisfied, we face the problem of 'double counting' opinions, i.e. applying the ⊕ operation to opinions that are not independent (Def. 7). In Figure 2, consider the case $A_{45} = U$. The canonical solution for this case is (14), whereas (13) yields (15). In (15) the opinions $A_{12}$ and $A_{23}$ are double-counted. We conclude that the naïve recursive equation (12) does not properly reproduce the canonical solution.
There is a further problem, unrelated to the naïve recursive approach. As was shown in [20], it is not even possible to transform the trust network in Figure 2 into a canonical form in the general case $A_{45} \neq U$.
The problems become even worse when the trust network contains loops, e.g. a loop as shown in Figure 3. Here too, there is no canonical form. Applying the recursive approach to Figure 3 gives Eq. (16), an expression for $R_{12}$ that itself contains $R_{12}$. Repeatedly substituting this expression into itself, we observe that taking opinion $A_{32}$ into account causes excessive double-counting of $A_{12}$.
If the loop is somehow discarded, then the information contained in $A_{32}$ is destroyed.
In conclusion, (i) generic trust networks with several connections $A_{ij} \neq U$ cannot be handled with SL because there is no canonical form for them that avoids double-counting; (ii) even when there is a canonical result, this result cannot be reproduced by a straightforward recursive approach.

Subjective Logic revisited
This section presents a new, fully evidence-based approach to SL. We refer to the resulting opinion algebra as Evidence-Based Subjective Logic or EBSL.

Excluding dogmatic opinions
As mentioned in Section 3.1, dogmatic opinions are problematic when the ⊕ operation is applied to them. Furthermore, a dogmatic opinion corresponds to an infinite amount of evidence, which in our context is not realistic. In the remainder of this paper, we will exclude dogmatic opinions. We will work with a reduced opinion space defined as follows.

Definition 9
The opinion space excluding dogmatic opinions is denoted as $\Omega'$ and is defined as

$$\Omega' = \{ (b, d, u) \in [0,1) \times [0,1) \times (0,1] \;|\; b + d + u = 1 \}.$$

We are by no means the first to do this; in fact, the exclusion of dogmatic opinions was proposed as an option in the very early literature on SL [22].

The relation between evidence and opinions: a simplified justification
We make a short observation about the mapping between evidence and opinions. As was mentioned in Section 2.2, there is a one-to-one mapping (2) based on the analysis of probability distributions (Beta distributions). Here we show that there is a shortcut: the same mapping can also be obtained in a much simpler way, based on constraints.
Theorem 1 Let $p \geq 0$ be the amount of evidence that supports 'belief'; let $n \geq 0$ be the amount of evidence that supports 'disbelief'. Let $x = (b, d, u) \in \Omega'$ be the opinion based on the evidence. If we demand the following four properties, then the relation between $x$ and $(p, n)$ has to be

$$x = (b, d, u) = \frac{(p, n, c)}{p + n + c}, \qquad (17)$$

where $c > 0$ is a constant.
Proof: Theorem 1 shows that we can derive a formula similar to (2), based on minimal requirements which make intuitive sense. Only the constant c is not fixed by the imposed constraints; it has to be determined from the context. One can interpret c as a kind of soft threshold on the amount of evidence: beyond this threshold one starts gaining enough confidence from the evidence to form an opinion.
We observe that (17) with its generic constant c is already sufficient to derive the consensus rule ⊕, i.e. the consensus rule does not require c = 2.

Lemma 1
The mapping (17) with arbitrary c implies the consensus rule ⊕ as specified in Def. 2.
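Proof: Consider $x = (b_1, d_1, u_1) = (p_1, n_1, c)/(p_1 + n_1 + c)$ and $y = (b_2, d_2, u_2) = (p_2, n_2, c)/(p_2 + n_2 + c)$. An opinion formed from the combined evidence $(p_1 + p_2, n_1 + n_2)$ according to (17) is

$$\frac{(p_1 + p_2,\; n_1 + n_2,\; c)}{p_1 + p_2 + n_1 + n_2 + c} = \frac{(u_2 b_1 + u_1 b_2,\; u_2 d_1 + u_1 d_2,\; u_1 u_2)}{u_1 + u_2 - u_1 u_2},$$

where we substituted $p_i = c\, b_i / u_i$ and $n_i = c\, d_i / u_i$ and multiplied numerator and denominator by $u_1 u_2 / c$. The constant $c$ drops out, and the right-hand side is exactly $x \oplus y$.

Furthermore, in (2) and (3) we can replace '2' by $c$, and the expectation of $t$, obtained by integrating the Beta distribution times $t$, is still $x_b + a x_u$. Therefore, in the remainder of the paper, we will work with a re-defined version of the $p(x)$ and $n(x)$ functions (Section 2.2). The new version has a general value $c > 0$ instead of $c = 2$.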
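Lemma 1 can also be checked numerically. The following is a minimal sketch, assuming the mapping (17) and the component form of the consensus rule; the function names and the test values are ours, chosen only for illustration.

```python
import math

def opinion_from_evidence(p, n, c):
    """Eq. (17): map evidence (p, n) to an opinion (b, d, u) for a given c > 0."""
    total = p + n + c
    return (p / total, n / total, c / total)

def consensus(x, y):
    """Consensus rule (Def. 2), written on opinion components."""
    xb, xd, xu = x
    yb, yd, yu = y
    k = xu + yu - xu * yu
    return ((xu * yb + yu * xb) / k, (xu * yd + yu * xd) / k, xu * yu / k)

def check_lemma1(p1, n1, p2, n2, c):
    """True if fusing two opinions equals mapping the pooled evidence."""
    fused = consensus(opinion_from_evidence(p1, n1, c),
                      opinion_from_evidence(p2, n2, c))
    pooled = opinion_from_evidence(p1 + p2, n1 + n2, c)
    return all(math.isclose(f, q, rel_tol=1e-12, abs_tol=1e-12)
               for f, q in zip(fused, pooled))

# The check does not depend on the particular value of c.
for c in (0.5, 2.0, 7.0):
    print(c, check_lemma1(3.0, 1.0, 4.0, 6.0, c))
```

The same consensus code is used for every value of c, which illustrates that the rule itself carries no dependence on the constant.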
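Definition 10 Let $x = (x_b, x_d, x_u) \in \Omega'$. We define the notation $p(x)$, $n(x)$ and $e(x)$ as

$$p(x) = c\, \frac{x_b}{x_u}, \qquad n(x) = c\, \frac{x_d}{x_u}, \qquad e(x) = p(x) + n(x).$$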

Scalar multiplication
Our next contribution has more impact. We define an operation on opinions that is equivalent to a scalar multiplication on the total amount of evidence.
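Definition 11 We define the product $\alpha \cdot x$, for $x \in \Omega'$ and $\alpha \geq 0$, as

$$\alpha \cdot x = \frac{(\alpha x_b,\; \alpha x_d,\; x_u)}{\alpha (x_b + x_d) + x_u}.$$

Lemma 2 Let $x \in \Omega'$ and $\alpha \geq 0$. The scalar multiplication as specified in Def. 11 has the following properties:
5. The evidence underlying $\alpha \cdot x$ is $\alpha$ times the evidence underlying $x$, i.e. $p(\alpha \cdot x) = \alpha\, p(x)$ and $n(\alpha \cdot x) = \alpha\, n(x)$.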
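The evidence-scaling property can be illustrated with a short sketch. It assumes the component form $\alpha \cdot x = (\alpha x_b, \alpha x_d, x_u) / (\alpha (x_b + x_d) + x_u)$ and the inverse mapping $p(x) = c\, x_b / x_u$, $n(x) = c\, x_d / x_u$; the value c = 2 and the example opinion are illustrative choices.

```python
import math

C = 2.0  # evidence constant c; an assumed value for this illustration

def evidence(x, c=C):
    """Return (p(x), n(x)) for an opinion x = (b, d, u) with u > 0."""
    b, d, u = x
    return (c * b / u, c * d / u)

def scale(alpha, x):
    """Scalar multiplication alpha . x on opinions (Def. 11)."""
    b, d, u = x
    k = alpha * (b + d) + u
    return (alpha * b / k, alpha * d / k, u / k)

x = (0.6, 0.3, 0.1)
y = scale(0.25, x)
print(evidence(x), evidence(y))  # the second pair is a quarter of the first
```

Scaling by 0 returns full uncertainty (0, 0, 1) and scaling by 1 returns x itself, which matches the reading of the operation as a transfer of a fraction of the evidence.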

New discounting rule
We propose a new approach to discounting: instead of multiplying (part of) the opinions we multiply the evidence. The multiplication is done using our scalar multiplication rule (Def. 11). We return to the example where Alice has an opinion $x \in \Omega'$ about the trustworthiness of Bob, and Bob has an opinion $y \in \Omega'$ about some proposition $P$. We propose a discounting of the form $g(x) \cdot y$, where $g(x) \geq 0$ is a scalar that indicates which fraction of Bob's evidence is accepted by Alice. One can visualize the discounting as a physical transfer of evidence from Bob to Alice, during which only a fraction $g(x)$ survives, due to Alice's mistrust and/or uncertainty.

It is desirable to set $g(x)$ in the range $[0, 1]$: allowing $g(x) < 0$ would lead to negative amounts of evidence (not to be confused with the term 'negative evidence', which is used for evidence that contradicts the proposition $P$); allowing $g(x) > 1$ would "amplify" evidence, i.e., create new evidence out of nothing, which is clearly unrealistic. It makes intuitive sense to set $\lim_{x \to B} g(x) = 1$, $\lim_{x \to D} g(x) = 0$ and $g(U) = 0$, or even to set $g(x) = \tilde{g}(x_b)$, i.e. a function of $x_b$ only, with $\tilde{g}(0) = 0$ and $\tilde{g}(1) = 1$. For instance, we could set $g(x) = x_b$.$^2$ On the other hand, it could also make sense to set $g(U) > 0$, which would represent the "benefit of the doubt". An intuitive choice would then be $g(x) = x_b + a x_u$, i.e. the expectation value corresponding to $x$. We postpone the precise details of how the function $g$ can/should be chosen, and introduce a very broad definition.
Definition 12 (New generic discounting rule $\boxtimes$) Let $x, y \in \Omega'$. Let $g : \Omega' \to [0, 1]$ be a function. We define the operation $\boxtimes$ as

$$x \boxtimes y = g(x) \cdot y, \qquad (21)$$

with the $\cdot$ operation as specified in Def. 11.
Differently from $\otimes$, the operator $\boxtimes$ has a well-defined interpretation in terms of evidence handling. The following theorem states that the evidence underlying $x \boxtimes y$ is a fraction of the evidence underlying $y$ defined by a scalar weight depending on $x$.
Proof: Property 1 follows from $x \boxtimes y = g(x) \cdot y$ and the first property in Lemma 2. Property 2: We compute $p(x \boxtimes y) = c\, (x \boxtimes y)_b / (x \boxtimes y)_u$ using the definition (21), which yields $c\, g(x)\, y_b / y_u = g(x)\, p(y)$. For $n(x \boxtimes y)$ the derivation is analogous. Property 3: Follows directly by dividing the belief and disbelief part of (21). Property 4: Follows by setting $y_b = 0$, $y_d = 0$ and $y_u = 1$ in (21). Property 5: We have $(x \boxtimes y)_u = y_u / [(y_b + y_d)\, g(x) + y_u]$. Since $y_b + y_d + y_u = 1$ and $g(x) \in [0, 1]$, the denominator of the fraction lies in the range $[y_u, 1]$, so that $(x \boxtimes y)_u \geq y_u$.
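The properties used in the proof can be observed on a concrete example. The sketch below assumes $x \boxtimes y = g(x) \cdot y$ with two candidate weight functions, $g(x) = x_b$ and $g(x) = \sqrt{x_b}$; the helper names and the opinions of Alice and Bob are hypothetical.

```python
import math

def scale(alpha, x):
    """Scalar multiplication alpha . x on opinions (Def. 11)."""
    b, d, u = x
    k = alpha * (b + d) + u
    return (alpha * b / k, alpha * d / k, u / k)

def g_belief(x):
    """Candidate weight g(x) = x_b."""
    return x[0]

def g_sqrt(x):
    """Candidate weight g(x) = sqrt(x_b)."""
    return math.sqrt(x[0])

def discount(x, y, g=g_belief):
    """Generic evidence-based discounting: g(x) . y."""
    return scale(g(x), y)

U = (0.0, 0.0, 1.0)              # full uncertainty
alice_on_bob = (0.5, 0.2, 0.3)   # hypothetical referral opinion x
bob_on_P = (0.6, 0.3, 0.1)       # hypothetical functional opinion y

for g in (g_belief, g_sqrt):
    z = discount(alice_on_bob, bob_on_P, g)
    # belief/disbelief ratio is preserved, uncertainty does not decrease
    print(g.__name__, z, z[0] / z[1], z[2] >= bob_on_P[2])
```

With either choice of g the discounted opinion keeps Bob's belief-to-disbelief ratio and only becomes more uncertain, and discounting full uncertainty returns full uncertainty.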
Corollary 1 Let $x, y \in \Omega'$ with $g(x) > 0$. Let $\lim_{y \to B} g(y) = 1$, $\lim_{y \to D} g(y) = 0$ and $g(U) = 0$. Then the extreme points $B$, $D$, $U$ have the following behavior with respect to the new discounting rule $\boxtimes$.

We stress again that the whole 'dogmatic' line between $B$ and $D$ is not part of the opinion space $\Omega'$, so that we avoid having to deal with infinite amounts of evidence.
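Theorem 3 There is no function $g : \Omega' \to [0, 1]$ such that $x \boxtimes y = x \otimes y$ for all $x, y \in \Omega'$.
Proof: On the one hand, we have $x \otimes y = (x_b y_b,\; x_b y_d,\; x_d + x_u + x_b y_u)$. On the other hand, $x \boxtimes y = (g(x) y_b,\; g(x) y_d,\; y_u) / [g(x)(y_b + y_d) + y_u]$. Demanding that they are equal yields, after some rewriting,

$$g(x) = \frac{x_b\, y_u}{x_d + x_u + x_b\, y_u}.$$

This requires $g(x)$, which is a function of $x$ only, to be a function of $y_u$ as well.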
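The argument of Theorem 3 can be made concrete numerically. This sketch assumes the standard SL discounting rule $x \otimes y = (x_b y_b, x_b y_d, x_d + x_u + x_b y_u)$ and computes the weight that scalar multiplication would need in order to reproduce it; `required_g` and the sample opinions are our own illustrative names and values.

```python
import math

def scale(alpha, x):
    """Scalar multiplication alpha . x on opinions (Def. 11)."""
    b, d, u = x
    k = alpha * (b + d) + u
    return (alpha * b / k, alpha * d / k, u / k)

def sl_discount(x, y):
    """Standard SL discounting (assumed form of the old rule)."""
    xb, xd, xu = x
    yb, yd, yu = y
    return (xb * yb, xb * yd, xd + xu + xb * yu)

def required_g(x, y):
    """Weight alpha for which alpha . y equals sl_discount(x, y)."""
    xb, xd, xu = x
    yu = y[2]
    return xb * yu / (xd + xu + xb * yu)

x = (0.5, 0.2, 0.3)
y1 = (0.6, 0.3, 0.1)
y2 = (0.3, 0.2, 0.5)
g1, g2 = required_g(x, y1), required_g(x, y2)
print(g1, g2)  # same x, different weights
```

For a fixed referral opinion x the required weight changes with the uncertainty of y, so it cannot be written as a function of x alone.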
Being based on the scalar multiplication rule (and hence ultimately on the $\oplus$ rule), our operation $\boxtimes$ has several properties that $\otimes$ lacks: (i) right-distribution; (ii) permutation symmetry of parties that transfer evidence. This is demonstrated below.
Lemma 5 Let $x, y, z \in \Omega'$. Then

$$x \boxtimes (y \oplus z) = (x \boxtimes y) \oplus (x \boxtimes z). \qquad (23)$$

Proof: It follows trivially from $x \boxtimes (y \oplus z) = g(x) \cdot (y \oplus z)$ and Lemma 3. This distributive property resolves the issue discussed in Section 3.2: using the $\boxtimes$ operator, it does not matter if $y$ and $z$ are combined before or after the discounting. This solves the inconsistency caused by the $\otimes$ operation.
Notice also that the left-hand side of (23) obviously is not double-counting $x$; hence also the expression on the right-hand side does not double-count $x$. In contrast, the right-hand side expression with $\otimes$ instead of $\boxtimes$ would be double-counting. We come back to this point in Section 5.6.
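Right-distributivity is easy to confirm on sample opinions. The sketch assumes $g(x) = x_b$ as the weight function and the component forms of consensus and scalar multiplication; all names and values are illustrative.

```python
import math

def consensus(x, y):
    """Consensus rule (Def. 2), written on opinion components."""
    xb, xd, xu = x
    yb, yd, yu = y
    k = xu + yu - xu * yu
    return ((xu * yb + yu * xb) / k, (xu * yd + yu * xd) / k, xu * yu / k)

def scale(alpha, x):
    """Scalar multiplication alpha . x on opinions (Def. 11)."""
    b, d, u = x
    k = alpha * (b + d) + u
    return (alpha * b / k, alpha * d / k, u / k)

def discount(x, y):
    """Evidence-based discounting with the assumed choice g(x) = x_b."""
    return scale(x[0], y)

x, y, z = (0.5, 0.2, 0.3), (0.6, 0.3, 0.1), (0.1, 0.4, 0.5)
lhs = discount(x, consensus(y, z))                 # combine, then discount
rhs = consensus(discount(x, y), discount(x, z))    # discount, then combine
print(lhs)
print(rhs)
```

Both orders give the same opinion up to floating-point rounding.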
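Lemma 6 Let $y, x_1, x_2 \in \Omega'$. Then

$$x_1 \boxtimes (x_2 \boxtimes y) = x_2 \boxtimes (x_1 \boxtimes y).$$

Proof: $x_1 \boxtimes (x_2 \boxtimes y) = g(x_1) \cdot (g(x_2) \cdot y)$. Using Lemma 4 this reduces to $(g(x_1) g(x_2)) \cdot y$. Exactly the same reduction applies to $x_2 \boxtimes (x_1 \boxtimes y)$.

Lemma 6 generalizes to chains of discounting:

$$x_1 \boxtimes (x_2 \boxtimes (\cdots (x_N \boxtimes y))) = \Big( \prod_{i=1}^{N} g(x_i) \Big) \cdot y. \qquad (25)$$

Expression (25) is invariant under permutation of the opinions $x_1, \ldots, x_N$. Note that one property of the $\otimes$ rule is not generically present in the $\boxtimes$ rule: associativity. Whereas the old rule has $x \otimes (y \otimes z) = (x \otimes y) \otimes z$, the new rule has

$$x \boxtimes (y \boxtimes z) = (g(x)\, g(y)) \cdot z, \qquad (x \boxtimes y) \boxtimes z = g(g(x) \cdot y) \cdot z,$$

which differ in general. However, it is important to realize that the lack of associativity is not a problem. The transfer of evidence along a chain has a very clear ordering, which determines the order in which the operations have to be performed. (See Section 6.) Also note that $\boxtimes$ does not have a left-distribution property for arbitrarily chosen $g$. It takes some effort to define a reasonable function $g$ that yields left-distributivity.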
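Both statements, permutation symmetry and the lack of associativity, can be seen on a small example. The sketch assumes $g(x) = x_b$; the opinions are arbitrary illustrative values.

```python
import math

def scale(alpha, x):
    """Scalar multiplication alpha . x on opinions (Def. 11)."""
    b, d, u = x
    k = alpha * (b + d) + u
    return (alpha * b / k, alpha * d / k, u / k)

def discount(x, y):
    """Evidence-based discounting with the assumed choice g(x) = x_b."""
    return scale(x[0], y)

def close(a, b):
    return all(math.isclose(p, q, rel_tol=1e-9) for p, q in zip(a, b))

x1, x2, y = (0.5, 0.2, 0.3), (0.8, 0.1, 0.1), (0.6, 0.3, 0.1)
z = (0.1, 0.4, 0.5)

# Permutation symmetry: swapping the two transporting parties changes nothing.
symmetric = close(discount(x1, discount(x2, y)), discount(x2, discount(x1, y)))

# Associativity fails in general: bracket placement matters.
right_nested = discount(x1, discount(y, z))   # (g(x1) g(y)) . z
left_nested = discount(discount(x1, y), z)    # g(g(x1) . y) . z
associative = close(right_nested, left_nested)

print(symmetric, associative)
```

In the right-nested form only z contributes evidence, while in the left-nested form the evidence of y influences the weight applied to z, which is why the two results differ.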
Lemma 7 There is no function $g : \Omega' \to [0, 1]$ that satisfies $\lim_{s \to B} g(s) = 1$ and gives $(x \oplus y) \boxtimes z = (x \boxtimes z) \oplus (y \boxtimes z)$ for all $x, y, z \in \Omega'$. Proof: We consider the limit $x \to B$, $y \to B$. On the one hand, $(x \oplus y) \boxtimes z \to B \boxtimes z = z$. On the other hand, $(x \boxtimes z) \oplus (y \boxtimes z) \to z \oplus z$, which carries twice the evidence of $z$ and hence differs from $z$ whenever $z \neq U$.
It may look surprising that we cannot achieve left-distributivity with a function $g$ chosen from a very large function space with only a single constraint. (And a very reasonable-looking constraint at that.) But left-distributivity requires $g(x \oplus y) = g(x) + g(y)$, which conflicts with the constraint $\lim_{s \to B} g(s) = 1$.
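The additivity requirement can be traced in two lines. This is a sketch that uses only the definition $x \boxtimes y = g(x) \cdot y$ and the evidence interpretation of scalar multiplication, under which $\alpha \cdot z \oplus \beta \cdot z$ pools $(\alpha + \beta)$ times the evidence of $z$:

$$(x \oplus y) \boxtimes z = g(x \oplus y) \cdot z, \qquad (x \boxtimes z) \oplus (y \boxtimes z) = (g(x) \cdot z) \oplus (g(y) \cdot z) = (g(x) + g(y)) \cdot z.$$

Equality for all $z$ therefore forces $g(x \oplus y) = g(x) + g(y)$. Taking $x \to B$ and $y \to B$, the left-hand side of this condition tends to 1 while the right-hand side tends to 2, which is the conflict with $\lim_{s \to B} g(s) = 1$.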

New specific discounting rule
One way to satisfy g(x ⊕ y) = g(x) + g(y) is by setting g(x) ∝ p(x). This approach, however, causes some complications. Suppose we define g(x) = p(x)/θ, where θ is some constant. If the amount of positive evidence ever exceeds θ, then the discounting factor becomes larger than 1, i.e. amplification instead of reduction, which is an undesirable property. If we redefine g such that factors larger than 1 are mapped back to 1, then we lose the distribution property. We conclude that the "g proportional to evidence" approach can only work if the maximum achievable amount of positive evidence in a given trust network can be upper-bounded by θ.
Definition 13 (New specific discounting rule $\odot$) Let $x, y \in \Omega'$. Let $\theta$ be a threshold larger than the maximum amount of positive evidence in any opinion that is used for discounting. We define the operation $\odot$ as

$$x \odot y = \frac{p(x)}{\theta} \cdot y,$$

with the $\cdot$ operation as specified in Def. 11.
We stress again that θ depends on the interactions between entities within the system, i.e. on the structure of the trust network and the maximum amount of positive evidence in the network.
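The algebraic properties gained by choosing $g(x) = p(x)/\theta$ can be checked numerically. The sketch below assumes c = 2 and θ = 100, with evidence amounts well below θ; these values and the helper names are illustrative only.

```python
import math

C = 2.0        # evidence constant c (assumed)
THETA = 100.0  # threshold theta (assumed, larger than any p used below)

def opinion(p, n):
    total = p + n + C
    return (p / total, n / total, C / total)

def p_of(x):
    """Positive evidence p(x) = c * x_b / x_u."""
    return C * x[0] / x[2]

def consensus(x, y):
    xb, xd, xu = x
    yb, yd, yu = y
    k = xu + yu - xu * yu
    return ((xu * yb + yu * xb) / k, (xu * yd + yu * xd) / k, xu * yu / k)

def scale(alpha, x):
    b, d, u = x
    k = alpha * (b + d) + u
    return (alpha * b / k, alpha * d / k, u / k)

def odot(x, y):
    """Specific discounting: (p(x) / theta) . y."""
    return scale(p_of(x) / THETA, y)

def close(a, b):
    return all(math.isclose(s, t, rel_tol=1e-9) for s, t in zip(a, b))

x, y, z = opinion(5, 1), opinion(3, 2), opinion(4, 4)
left_distributive = close(odot(consensus(x, y), z),
                          consensus(odot(x, z), odot(y, z)))
associative = close(odot(x, odot(y, z)), odot(odot(x, y), z))
print(left_distributive, associative)
```

Note that the negative evidence of the discounting opinion never enters `odot`, which is the drawback of this rule discussed in the text.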
We are not claiming that $\odot$ is the proper discounting operation to use. It has the unpleasant property that the negative evidence underlying $x$ is completely ignored in the computation of $x \odot y$. A quick fix of the form $g(x) \propto p(x) - n(x)$ does not work since it can cause $g(x) < 0$ and therefore $g(x) \cdot y \notin \Omega'$. We note that there is no alternative $g$-function to the ones discussed above if linearity of $g$ is required. This is formalized in the following lemma.
The proof is given in the Appendix. Note that, for sufficiently smooth g, it is possible to prove that the property g(x ⊕ y) = g(x) + g(y) implies g(g(x) · y) = g(x)g(y), i.e. associativity.

The new discounting rule avoids double-counting
In Section 2.2 we saw that the consensus operation $\oplus$ should be applied only to opinions that are derived from independent evidence. If this restriction is not obeyed then we speak of double-counting. Example 7 showed that in the SL expression $(x \otimes y) \oplus (x \otimes z)$, the evidence in $x$ is double-counted. If we look at the equivalent expression in which $\otimes$ is replaced with $\boxtimes$, we get

$$(x \boxtimes y) \oplus (x \boxtimes z) = x \boxtimes (y \oplus z) = g(x) \cdot (y \oplus z).$$

Here there is obviously no double-counting. Next we look at more complicated EBSL expressions.
Lemma 11 Let $x, y \in \Omega'$ be independent opinions. Let $Q, W \in \Omega'$ be opinions independent of $x$ and $y$, but with mutual dependence. Then, the expression

$$(Q \boxtimes x) \oplus (W \boxtimes y) \qquad (31)$$

does not double-count any of the evidence underlying $Q$ and $W$.
Proof: The evidence underlying $Q \boxtimes x$ is the evidence from $x$, scalar-multiplied by $g(Q)$. Likewise, the evidence underlying $W \boxtimes y$ is a scalar multiple of $(p(y), n(y))$. Since $x$ and $y$ are independent, the evidence on the left and right side of the '$\oplus$' in (31) is independent.
Based on the result above we can conclude that:
Corollary 2 Transporting different pieces of evidence over the same link $x$ with the $\boxtimes$ operation is not double-counting $x$.
Thus, many expressions that are problematic in SL become perfectly acceptable in EBSL, simply because $\boxtimes$ is just an (attenuating) evidence transport operation, whereas SL's $\otimes$ is a very complicated thing that mixes evidence from its left and right operand (Eqs. 8 and 9).

Flow-based reputation with uncertainty
In this section we will use the discounting rule $\boxtimes$ without specifying the function $g$. We show that EBSL can be applied to arbitrarily connected trust networks and that the simple recursive approach (12), with $\otimes$ replaced by $\boxtimes$, yields consistent results that avoid the double-counting problem.

Recursive solutions using EBSL
We show that the trust networks discussed in Section 3, which are problematic in SL, can be handled in EBSL. We take the EBSL equivalent of the recursive approach in SL (11), (12), namely

$$R_{ij} = A_{ij} \oplus \bigoplus_{k:\, k \neq i} \left( R_{ik} \boxtimes A_{kj} \right) \;\; (i \neq j), \qquad F_{iP} = T_{iP} \oplus \bigoplus_{j:\, j \neq i} \left( R_{ij} \boxtimes T_{jP} \right), \qquad (32)$$

and demonstrate that these equations yield acceptable results in the case of the trust networks depicted in Figures 2 and 3. For Figure 2 we obtain the EBSL equivalent of (13) by using (32) recursively, which gives the chain of expressions in Eq. (33). By substituting these expressions from bottom to top we get the end result for $F_{1P}$. The result is very similar to (13), but now we have lots of brackets because $\boxtimes$ is not associative. Inspecting the resulting expression for $R_{16}$, we observe that the links $A_{12}$ and $A_{23}$ occur three times. The link $A_{34}$ occurs twice. However, the computation of $R_{16}$ only relies on the evidence in $A_{46}$ and $A_{56}$; all the other opinions $A_{ij}$ serve as 'transport', i.e. merely providing weights multiplying the evidence in $A_{46}$ and $A_{56}$. Therefore, there is no double counting of evidence.
In the case of Figure 3, i.e. with a loop, recursive use of (32) gives the direct EBSL equivalent of (16). We have

$$R_{12} = A_{12} \oplus \left( (R_{12} \boxtimes A_{23}) \boxtimes A_{32} \right). \qquad (35)$$

Eq. (35) gives an expression for the unknown $R_{12}$ that contains $R_{12}$. It can be solved in two ways. The first is to repeatedly substitute (35) into itself. We define a mapping $f(x) = A_{12} \oplus ((x \boxtimes A_{23}) \boxtimes A_{32})$. From (35) we see that $R_{12}$ is a fixed point of $f$. The fixed point can be found approximately by setting $x = A_{12}$ and repeatedly applying $f$ until the output does not change any more.
Convergence will be fast if $A_{23}$ and $A_{32}$ contain a lot of uncertainty. In Eq. (37) we observe that
(i) The evidence contributing to $R_{12}$ comes from the first $A_{12}$ and from the final $A_{32}$.
(ii) All the other occurrences of opinions $A_{ij}$ are only used to compute the weights for the scalar multiplication.

The second method is to treat (35) as two independent equations in two unknowns (the two independent components of $R_{12} \in \Omega'$) and to solve them algebraically. This can be quite difficult if the function $g$ is complicated, since $g$ has to be applied twice. The solution is simple in the following special cases:
- If $g(A_{12}) = 0$ and $g(U) = 0$ then $R_{12} = A_{12}$.
- … can be overwhelmed by the indirect $A_{32}$, even if user 1 has little trust in user 2. This demonstrates the danger of allowing opinions close to full belief.
- If $A_{23} \to B$ and $g(B) = 1$ then $R_{12} \to A_{12} \oplus A_{32}$.
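The first method, repeated substitution, can be sketched in a few lines. The sketch assumes that the loop of Figure 3 consists of the links $A_{12}$, $A_{23}$ and $A_{32}$, so that the map to iterate is $f(x) = A_{12} \oplus ((x \boxtimes A_{23}) \boxtimes A_{32})$, and it assumes $g(x) = x_b$ and c = 2. The evidence values are made up for illustration.

```python
C = 2.0  # evidence constant c (assumed)

def opinion(p, n):
    total = p + n + C
    return (p / total, n / total, C / total)

def consensus(x, y):
    xb, xd, xu = x
    yb, yd, yu = y
    k = xu + yu - xu * yu
    return ((xu * yb + yu * xb) / k, (xu * yd + yu * xd) / k, xu * yu / k)

def scale(alpha, x):
    b, d, u = x
    k = alpha * (b + d) + u
    return (alpha * b / k, alpha * d / k, u / k)

def discount(x, y):
    """Evidence-based discounting with the assumed choice g(x) = x_b."""
    return scale(x[0], y)

# Hypothetical direct opinions on the three links of the loop.
A12, A23, A32 = opinion(4, 1), opinion(6, 2), opinion(3, 1)

def f(x):
    """One substitution step of the loop equation."""
    return consensus(A12, discount(discount(x, A23), A32))

x = A12
converged = False
for steps in range(1, 1001):
    nxt = f(x)
    converged = max(abs(a - b) for a, b in zip(nxt, x)) < 1e-12
    x = nxt
    if converged:
        break
R12 = x
print(steps, R12)
```

In this example the positive evidence of the fixed point lies between that of $A_{12}$ alone and that of $A_{12} \oplus A_{32}$: the evidence itself comes only from $A_{12}$ and $A_{32}$, and everything else only sets the weight.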

Recursive solution in matrix form
The recursive equation (32) for obtaining the $R_{ij}$ solutions can be rewritten in matrix notation. We choose $g$ such that $g(U) = 0$. We are looking for the off-diagonal components of a matrix $R$ or, equivalently, for a complete matrix $R$ which has the uncertainty '$U$' on its diagonal. Let $X$ be an $n \times n$ matrix containing opinions; it is allowed to have a non-empty diagonal, e.g. $X_{ii} \neq U$. We define a function $f$ as

$$f(X) = A \oplus \left( (\mathrm{offdiag}\, X) \boxtimes A \right), \qquad (40)$$

where 'offdiag $X$' is defined as $X$ with its diagonal replaced by $U$ entries, and an expression of the form $X \boxtimes A$ is a matrix defined as $(X \boxtimes A)_{ij} = \bigoplus_k X_{ik} \boxtimes A_{kj}$. Note that $f(X)$ is a matrix that can have a non-empty diagonal. Solving (32) is equivalent$^3$ to the following procedure:
1. Find the fixed point $X^*$ satisfying $f(X^*) = X^*$.
2. Take $R = \mathrm{offdiag}\, X^*$.
One approach to determine the fixed point $X^*$ is to pick a suitable starting matrix $X_0$ and then keep applying $f$ until the output does not change any more, $X^* \approx f^N(X_0)$. Another approach is to treat the formula $f(X^*) = X^*$ as a set of algebraic equations, whose complexity is affected by the choice of the $g$ function. At this point, two important questions have to be answered: (i) whether the recursive approach for (40) converges, i.e. whether the fixed point exists, and (ii) when it converges, whether the fixed point solution is unique. When there are no loops in the network, then trivially we have convergence and the fixed point is unique. Intuitively, the repeated applications of $f$ after the trust network has been completely explored do not propagate additional evidence.
In the case of general networks the situation is more complicated. We can prove that there is no divergence. In every iteration of the mapping $X \mapsto f(X)$, the new value of $X$ is of the form $X_{ij} = A_{ij} \oplus \bigoplus_k g((\mathrm{offdiag}\, X)_{ik}) \cdot A_{kj}$. We see that the evidence in each $A_{kj}$ gets multiplied by a scalar no larger than 1. Hence, no matter how many iterations are done, the amount of evidence about user $j$ that is contained in $X$ can never exceed the amount of evidence about $j$ present in $A$. This puts a hard upper bound on the amount of evidence in offdiag $X$, which prevents the solution from 'running off' towards full Belief. Hence, the evidence underlying $R_{ij}$ cannot be greater than the total amount of evidence underlying the opinions in $A$ about user $j$.
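The matrix procedure can be sketched as follows. The sketch assumes $f(X) = A \oplus ((\mathrm{offdiag}\, X) \boxtimes A)$ with $g(x) = x_b$ (so that $g(U) = 0$), starts from $X_0 = A$, and stops when a summed per-entry difference falls below a tolerance. The difference measure used here (largest absolute component difference) and the small three-user network are our own assumptions for illustration.

```python
U = (0.0, 0.0, 1.0)  # full uncertainty

def consensus(x, y):
    xb, xd, xu = x
    yb, yd, yu = y
    k = xu + yu - xu * yu
    return ((xu * yb + yu * xb) / k, (xu * yd + yu * xd) / k, xu * yu / k)

def scale(alpha, x):
    b, d, u = x
    k = alpha * (b + d) + u
    return (alpha * b / k, alpha * d / k, u / k)

def g(x):
    """Assumed weight function g(x) = x_b, which gives g(U) = 0."""
    return x[0]

def step(X, A):
    """One application of f: X_ij <- A_ij (+) sum_k g(offdiag(X)_ik) . A_kj."""
    n = len(A)
    out = []
    for i in range(n):
        row = []
        for j in range(n):
            acc = A[i][j]
            for k in range(n):
                x_ik = U if k == i else X[i][k]  # offdiag X
                acc = consensus(acc, scale(g(x_ik), A[k][j]))
            row.append(acc)
        out.append(row)
    return out

def solve(A, tol=1e-12, max_iter=1000):
    """Iterate f from X0 = A and return R = offdiag X*."""
    n = len(A)
    X = A
    for _ in range(max_iter):
        nxt = step(X, A)
        delta = sum(max(abs(a - b) for a, b in zip(nxt[i][j], X[i][j]))
                    for i in range(n) for j in range(n))
        X = nxt
        if delta < tol:
            break
    return [[U if i == j else X[i][j] for j in range(n)] for i in range(n)]

# Hypothetical network with a loop: user 0 -> 1, 1 -> 2, 2 -> 1.
A = [[U, (4 / 7, 1 / 7, 2 / 7), U],
     [U, U, (0.6, 0.2, 0.2)],
     [U, (0.5, 1 / 6, 1 / 3), U]]
R = solve(A)
print(R[0][1])
```

Every iteration recomputes $X$ from the fixed matrix $A$, so evidence is never accumulated across iterations; only the weights change from one iteration to the next.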
It can be observed that, being flow-based, our fixed point equation for the matrix $R$ has the same form as the fixed point equation for a Markov chain. The main difference is that in an ordinary Markov chain there is a real-valued transition matrix whereas we have opinions $A_{ij} \in \Omega'$, and in our case multiplication of reals is replaced by $\boxtimes$ and addition by $\oplus$. In spite of these differences, we observe in our experiments that every type of behavior of Markov chain flow also occurs for $R$. Indeed, experiments on real data show that we have convergence (see Section 7). Moreover, for some fine-tuned instances of the $A$-matrix, which are exceedingly unlikely to occur naturally, oscillations can exist just like in Markov chains; after a number of iterations the powers of $A$ jump back and forth between two states. Just as in flow-based reputation (1), adding the direct opinion matrix $A$ in each iteration dampens the oscillations and causes convergence.

Recursive solutions using the $\odot$ discounting rule
We investigate what happens when we replace the generic EBSL discounting operation $\boxtimes$ by the special choice $\odot$ as specified in Def. 13. First we consider the case of Figure 2. The generic Eq. (33) reduces to Eq. (41). Notice that we do not have to put brackets around chains of $\odot$ operations because they are associative (Lemma 9). Also notice that in (41) the common 'factor' $A_{12} \odot A_{23}$ has been pulled outside the brackets. We are allowed to add and remove brackets at will because of the associativity and full distributivity of $\odot$. Next we consider Figure 3, the loop case. Eq. (39) reduces to Eq. (42), which is easily solved to give Eq. (43). We observe that the system parameter $\theta$ has to be chosen with great caution. If values $p(A_{ij})$ can get too close to $\theta$ then the fraction in (43) may explode and may result in $p(R_{12}) > \theta$, which is problematic. Let us define $p_{\max} = \max_{ij} p(A_{ij})$. Then it is necessary to set $\theta \geq p_{\max} \cdot (1 + \sqrt{5})/2$. (This bound is obtained by setting $p(A_{12}) = p(A_{23}) = p(A_{32}) = p_{\max}$ in (43) and demanding that $p(R_{12}) \leq \theta$.)
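The origin of the golden-ratio bound can be sketched at the level of positive evidence. We assume here that the loop equation has the form $R_{12} = A_{12} \oplus ((R_{12} \odot A_{23}) \odot A_{32})$ with $x \odot y = (p(x)/\theta) \cdot y$; the exact form of (43) may be written differently, but this reading reproduces the stated bound. Since $\oplus$ adds evidence and $\odot$ scales it,

$$p(R_{12}) = p(A_{12}) + \frac{p(R_{12})\, p(A_{23})}{\theta^2}\, p(A_{32}) \quad \Longrightarrow \quad p(R_{12}) = \frac{p(A_{12})}{1 - p(A_{23})\, p(A_{32}) / \theta^2}.$$

Setting $p(A_{12}) = p(A_{23}) = p(A_{32}) = p_{\max}$ and demanding $p(R_{12}) \leq \theta$ gives

$$\frac{p_{\max}\, \theta^2}{\theta^2 - p_{\max}^2} \leq \theta \quad \Longleftrightarrow \quad \theta^2 - p_{\max}\, \theta - p_{\max}^2 \geq 0 \quad \Longleftrightarrow \quad \theta \geq p_{\max}\, \frac{1 + \sqrt{5}}{2}.$$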

Evaluation
We have implemented our flow-based reputation model with uncertainty and performed a number of experiments to evaluate the practical applicability and "accuracy" of the model using both synthetic and real-life data. Note that, while it is possible to define some limiting situations (synthetic data) in which a certain result is expected, in general numerical experiments cannot 'prove' which approach is right because there is no 'ground truth' solution to the reputation problem that we could compare against. The only thing that can be verified by numerics is: (i) do the results make sense? (ii) is the method practical? Thus, we have used synthetic data to compare the accuracy of the opinions computed using different reputation models. On the other hand, we used real-life data to study the practical applicability.
Experiments for assessing the robustness of the reputation model against attacks like slandering, self-promotion and Sybil attacks [8,13,35] have not been considered in this work and are left for future work, as our goal here is the definition of the mathematical foundation for the development of reputation systems. A study of the robustness against attacks requires considering many other aspects that are orthogonal to this work.
In the remainder of this section, first we briefly present the implementation; then we report and discuss the results of the experiments.

Implementation
We have developed a tool in Python. It implements the procedure for computing the fixed point described in Section 6.2. All SL and EBSL computation rules presented in this paper have been implemented in a Python library.
The tool takes as input a log containing recorded user interactions. Based on the evidence contained in the log, the tool extracts the direct referral trust matrix $A$. The tool repeatedly iterates the recursive equation until it converges, that is, until the difference between the new matrix $R^{(k+1)}$ and the previous one $R^{(k)}$ is less than a certain threshold. In particular, the termination condition is expressed in terms of a difference measure $\delta(x, y)$ between opinions, summed over all matrix entries.

Table 1: Evidence, opinions, and aggregate ratings for case C1.

Synthetic Data
We have conducted a number of experiments using synthetic data to analyze and compare the different approaches for trust computation. The goal of these experiments is to analyze the behavior of the reputation models in a number of limiting situations for which it is known a priori how the result should behave.

Experiment Settings
The experiments are based on the trust network in Figure 2. We considered six approaches: (i) the flow-based method without uncertainty in Eq. (1); (ii) the flow-based SL approach presented in Section 3; (iii) SL in which the specification of the trust network is transformed to canonical form by removing the edge from 4 to 5 (i.e., $A_{45}$ is set to $U$ in Eq. (14)); (iv) EBSL with $g(x) = x_b$; (v) EBSL with $g(x) = \sqrt{x_b}$; and (vi) EBSL using the operator $\odot$.
To make the results comparable, we specify the amount of positive and negative evidence $(p, n)$ for each edge in the trust network and use such evidence to compute the opinions used for EBSL and SL as well as the aggregate ratings ($A_{xy}$) used for flow-based reputation (without uncertainty). The mapping between evidence and opinion is computed using (2). For the mapping between evidence and aggregated ratings, we use an approach similar to the one presented in [35], where $A_{xy} = 1$ means fully trusted, $A_{xy} = 0$ fully distrusted, and $A_{xy} = 1/2$ neutral. For the analysis we consider three cases. Table 1 presents the evidence along with the derived opinions and aggregated ratings for our first case (C1). In this case, we use $\theta = 1000$ for EBSL using operator $\odot$. In the second case (C2), we consider the same evidence except for the edge from 7 to $P$, which is now $(10, 900)$. The corresponding opinion and aggregate rating are $T_{7P} = (0.011, 0.987, 0.002)$ and $A_{7P} = 0.011$ respectively. In the last case (C3), we consider the evidence for all edges to be $(10000, 0)$ and we set $\theta = 20000$. In this case, all opinions are equal to $(0.9998, 0.0000, 0.0002)$ and aggregated ratings are equal to 1.
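The opinions quoted for cases C2 and C3 follow directly from the evidence-to-opinion mapping with constant 2. A minimal sketch (the function name is ours):

```python
def opinion_from_evidence(p, n, c=2.0):
    """Map evidence (p, n) to an opinion (b, d, u) = (p, n, c) / (p + n + c)."""
    total = p + n + c
    return (p / total, n / total, c / total)

# Case C2, edge from 7 to P: evidence (10, 900)
t_7p = tuple(round(v, 3) for v in opinion_from_evidence(10, 900))
print(t_7p)      # (0.011, 0.987, 0.002)

# Case C3, every edge: evidence (10000, 0)
c3_edge = tuple(round(v, 4) for v in opinion_from_evidence(10000, 0))
print(c3_edge)   # (0.9998, 0.0, 0.0002)
```

The aggregate ratings used by the flow-based method without uncertainty are not reproduced here, since they rely on a separate mapping.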

Results
The results of the trust computation are presented in Table 2 in terms of opinions (trust value for flow-based method), and in Table 3 in terms of amount of evidence. Note that in Table 3 we have not included the amount of evidence for the flow-based approach of (1) as it is not possible to reconstruct it from trust values.
The results confirm our expectation about the impact of the trust network representation on the trust computation when SL is used. As expected, the uncertainty component of opinion $F_{1P}$ computed using SL is larger when the trust network in Figure 2 …

The results show that the SL and EBSL approaches preserve the ratio between belief and disbelief components (Table 2) and consequently the ratio between positive and negative evidence (Table 3). This ratio is close to the one between the positive and negative evidence underlying the functional trust $T_{7P}$. If the amount of evidence increases (C2), one would expect that the amount of evidence underlying opinion $F_{1P}$ increases proportionally to the increase of the amount of evidence underlying $T_{7P}$ (Theorem 2). Accordingly, the amount of positive evidence underlying $F_{1P}$ should be the same in C1 and C2, and the amount of negative evidence underlying $F_{1P}$ in C2 should be ten times the amount of negative evidence in C1. We can observe in Table 3 that this is true for EBSL but not for SL. This is explained by the fact that $x \otimes y$ is not an $x$-dependent multiple of the evidence underlying $y$, as was shown in Eqs. (8) and (9).
Finally, in the last case (C3) we have considered a limiting case where every trust relation in the network of Figure 2 is characterized by a large amount of positive evidence. Here, one would expect that the opinion $F_{1P}$ is close to $(1, 0, 0)$ and the trust value $r_{1P}$ close to 1. From Table 2 we can observe that SL and all EBSL approaches meet this expectation. However, if we look closely at the evidence underlying such an opinion (Table 3), we can observe that when the SL discounting operator $\otimes$ and the EBSL operator $\odot$ are used, a large amount of evidence is "lost" on the way. In contrast, we expect the amount of evidence underlying $F_{1P}$ to be close to that of $T_{7P}$ (Theorem 2). Table 3 shows that EBSL, both for $g(x) = x_b$ and $g(x) = \sqrt{x_b}$, preserves the amount of evidence when referral trust relations are close to full belief.
Moreover, Table 2 shows that the value of $r_{1P}$ is close to neutral trust rather than to full trust. This can be explained by Eq. (1) and the impossibility to express uncertainty. On the one hand, at each iteration Eq. (1) computes a weighted average of aggregated ratings where weights are equal to the trust a user places in the users providing recommendations. On the other hand, the flow-based approach does not distinguish between neutral trust (equal amount of positive and negative evidence) and full uncertainty (zero evidence). In particular, the lack of evidence between two users is represented in the matrix of aggregated ratings $A$ as neutral trust (see Eq. 46). In sparse trust networks (i.e., networks with only a few edges) like the one in Figure 2, the neutral trust used to express uncertainty has a significant impact on the weighted average used to compute trust values.$^4$ These results demonstrate that the ability to express uncertainty is fundamental to capture the actual trustworthiness of a target, which is one of the main motivations for this work.

Real-life Data
We performed a number of experiments using real-life data to assess the practical applicability of EBSL and flow-based reputation models built on top of EBSL. In particular, we study the impact of various discounting operators on the propagation of evidence and the convergence speed of the iterative procedure.
Experiment Settings
For the experiments we used a dataset derived from a BitTorrent-based client called Tribler [29]. The dataset consists of information about 10,364 nodes and 44,796 interactions between the nodes. Each interaction describes the amount of transferred data in bytes from a node to another node. The amount of transferred data can be either negative, indicating an upload from the first node to the second node, or positive, indicating a download.
To provide an incentive for sharing information, some BitTorrent systems require users to have at least a certain ratio of uploaded vs. downloaded data. Along this line, we treat the amount of data uploaded by a user as positive evidence, and the downloaded amount as negative evidence. Intuitively, positive evidence indicates a user's inclination to share data and thus to contribute to the community.
It is worth noting that Tribler has a high population turnover and, thus, the dataset contains very few long living and active nodes alongside many loosely connected nodes of low activity [9]. This results in a direct referral trust matrix that is very sparse (i.e., most opinions are full uncertainty). In this sparse form it is inefficient to do large matrix multiplications. To this end, we have grouped the nodes into 200 clusters, each of which contains about 50 nodes. Intuitively, a cluster may be seen as the set of nodes under the control of a single user.
For the experiments with real data, we considered four reputation models: (i) the flow-based SL approach presented in Section 3; (ii) EBSL with $g(x) = x_b$; (iii) EBSL with $g(x) = \sqrt{x_b}$; and (iv) EBSL using the operator $\odot$. Note that, due to the large number of interconnected loops in the trust network, we did not consider SL in which the trust network is transformed into a canonical form.
In all four models we computed the final referral trust matrix R. The amount of evidence in the matrix A is visualized in Figure 4. Figure 4a presents the amount of positive evidence, and Figure 4b the total amount of evidence (sum of positive and negative). We can observe the presence of a few active users who had interactions with a lot of other users (visible as dark lines). A horizontal dark line in Figure 4a indicates a user who downloaded data from many other users. The vertical dark lines in Figure 4b represent negative evidence: many users uploading to the same few users. Note that Figure 4b is not symmetric, since an interaction never results in user feedback from both sides. It is also interesting to note the clusters of strongly connected users who often interact with each other (for instance the top-left corner).

Results
The amount of evidence in R is presented in Figure 5 (only positive evidence) and in Figure 6 (sum of positive and negative evidence). For most users, the amount of evidence in R has increased compared to the initial situation (Figures 4a and 4b). The plots are characterized by uniform vertical stripes, indicating that (most) users have approximately the same amount of (positive) evidence about a given user. The amount of evidence, however, remains close to 0 for those users who had very few interactions with other users (horizontal white lines in Figures 5 and 6). It is also worth noting that the diagonal of R is clearly recognizable as a white line. This is due to the fact we impose the diagonal to be full uncertainty, i.e. users cannot have an opinion about themselves, to reduce the effect of self-promoting.
The choice of the discounting operator, which defines how evidence is propagated, has a significant impact on the amount of evidence in $R$. Ideally, users should be able to use the available trust information to decide whether to engage in an interaction with another user [42]. Therefore, a reputation system should allow users to gather as many recommendations (i.e., evidence) as possible from trusted users. However, the use of the $\odot$ operator causes most of the evidence to be lost along the way. This can be clearly understood by observing that the initial situation in Figure 4a (Figure 4b respectively) and the final referral trust matrix $R$ in Figure 5a (Figure 6a respectively) are almost the same. Figures 5b and 6b show that the $\otimes$ operator propagates more evidence than $\odot$. We remind the reader that $\otimes$ causes double-counting as well as discarding of evidence, as shown in Examples 2 and 3. The $\boxtimes$ operator with both $g(x) = x_b$ and $g(x) = \sqrt{x_b}$ results in the propagation of more evidence compared to the $\odot$ and $\otimes$ operators, as shown in Figures 5c and 5d (positive evidence) and in Figures 6c and 6d (total evidence). These findings confirm the results obtained with the synthetic data (Table 3). Therefore, we conclude that the $\boxtimes$ operator is preferable to the other operators.
Fig. 5: Positive evidence in the final referral trust matrix R for the Tribler data. Panels: (a) R obtained using the linear EBSL discounting operator; (b) R obtained using ⊗; (c) R obtained using the generic EBSL discounting operator with $g(x) = x_b$; (d) R obtained using the generic EBSL discounting operator with $g(x) = \sqrt{x_b}$. For each pair (i, j) the amount of positive evidence underlying the opinion of i about j is shown as a shade of gray, using a logarithmic gray scale. White corresponds to zero, black to $8.7 \cdot 10^6$, which is the maximum amount of evidence occurring in a single matrix entry in all the experiments.
Convergence. We have analyzed the convergence of the iterative approach using the Tribler dataset. The experiments show that the reputation models built on top of EBSL converge. In particular, EBSL with $g(x) = x_b$ converges after 47 iterations, EBSL with $g(x) = \sqrt{x_b}$ after 24 iterations, and EBSL with the linear discounting operator after 9 iterations. In all cases, convergence is fast. Accordingly, we believe that the proposed reputation model can handle real scenarios. In the experiments we also analyzed the convergence of the naïve approach that combines flow-based reputation and SL, as presented in Section 3. Here convergence is not reached in a reasonable amount of time: after 1000 iterations we still have Σ_{i,j} δ(R …

Fig. 6: The sum of positive and negative evidence in the final referral trust matrix R for the Tribler data. Panels: (a) R obtained using the linear EBSL discounting operator; (b) R obtained using ⊗; (c) R obtained using the generic EBSL discounting operator with $g(x) = x_b$; (d) R obtained using the generic EBSL discounting operator with $g(x) = \sqrt{x_b}$. For each pair (i, j) the total amount of evidence underlying the opinion of i about j is shown as a shade of gray, using a logarithmic gray scale. White corresponds to zero, black to $8.7 \cdot 10^6$, which is the maximum amount of evidence occurring in a single matrix entry in the experiments.
To study the link between our approach and Markov chains, we performed additional experiments (not reported here) on a number of limiting situations, i.e., synthetic data unlikely to occur in real life. In particular, we studied the EBSL case where the powers of A oscillate, i.e., $A^{k+2} = A^k$ with $A^{k+1} \neq A^k$. Here $A^k$ stands for the k-fold left-nested product ((· · · A) A) A, with the discounting operator between successive factors. Such oscillations can occur in Markov chains too. In flow-based reputation (1) the added term (1 − α)sx dampens the oscillations and thus improves convergence. Similarly, in our EBSL experiments the term A added in each iteration (40) gives a convergent result in spite of the oscillatory nature of A. This strengthens our conviction that EBSL correctly captures the idea of reputation flow.
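The damping effect in the Markov-chain setting can be illustrated with a minimal sketch. It assumes a PageRank-style update r ← αAr + (1 − α)s as a stand-in for the flow-based iteration (1); the two-state matrix, the start vector s, the value α = 0.85 and all function names are illustrative choices, not taken from the experiments.

```python
def mat_vec(A, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def iterate(A, r0, steps, alpha=1.0, s=None):
    """Return all iterates of r <- alpha * A r + (1 - alpha) * s (assumed update rule)."""
    s = s if s is not None else [0.0] * len(r0)
    r, history = list(r0), []
    for _ in range(steps):
        Ar = mat_vec(A, r)
        r = [alpha * ar + (1 - alpha) * si for ar, si in zip(Ar, s)]
        history.append(r)
    return history

def last_step_size(history):
    """Largest componentwise change between the last two iterates."""
    return max(abs(a - b) for a, b in zip(history[-1], history[-2]))

# Period-2 chain: A swaps the two states, so A^(k+2) = A^k while A^(k+1) != A^k.
A = [[0.0, 1.0],
     [1.0, 0.0]]
r0 = [1.0, 0.0]
s = [0.5, 0.5]  # illustrative start vector

undamped = iterate(A, r0, steps=50)                    # alpha = 1: pure powers of A
damped = iterate(A, r0, steps=200, alpha=0.85, s=s)    # with the damping term

print("undamped, last step size:", last_step_size(undamped))
print("damped,   last step size:", last_step_size(damped))
print("damped,   final iterate: ", damped[-1])
```

Without damping the iterate keeps flipping between the two states; with α < 1 each step contracts the distance to the fixed point by a factor α, so the sequence settles.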

Related Work
The notion of uncertainty is becoming an important concept in reputation systems and, more generally, in data fusion [5,24]. Uncertainty has been proposed as a quantitative measure of the accuracy of predicted beliefs, and it is used to represent the level of confidence in the fusion result. Several approaches have extended reputation systems with the notion of uncertainty [15,30,32,39]. For instance, Reis et al. [32] associate a parameter with opinions to indicate the degree of certainty to which the average rating is assumed to be representative of the future. Teacy et al. [39] account for uncertainty by assessing the reputation of information sources based on the perceived accuracy of past opinions. Unlike these approaches, Subjective Logic [15] treats uncertainty as a dimension orthogonal to belief and disbelief, determined by the amount of available evidence.
One of the main challenges in reputation systems is how to aggregate opinions, especially in the presence of uncertainty. SL provides two main operators for aggregating opinions: consensus and discounting (see Section 2.2). Many studies have analyzed strategies for combining conflicting beliefs [16,19,37] and have proposed new combining strategies and operators [7,44]. In Section 5.2, we reconfirm that the standard consensus operator used in SL is well founded in the theory of evidence.
In contrast, less effort has been devoted to studying the discounting operator. Bhuiyan and Jøsang [4] propose two alternative discounting operators: an operator based on opposite belief favouring, for which the combination of two disbeliefs results in belief, and a base rate sensitive transitivity operator, in which the trust in the recommender is a function of the base rate. Like the traditional discounting operator of SL, these operators are founded on probability theory. As shown in Section 3, employing operators founded on different theories has the disadvantage that these operators may not "cooperate". In the case of SL, this lack of cooperation results in the inability to apply SL to arbitrary trust networks. In particular, trust networks have to be expressed in a canonical form in which edges are not repeated. A possible strategy for reducing a trust network to canonical form is to remove the weakest edges (i.e., the least certain paths) until the network can be expressed in canonical form [20]. This, however, has the disadvantage that some, possibly a great deal of, trust information is discarded. An alternative canonicalization method called edge splitting was presented in [18]. The basic idea of this method is to split a dependent edge into a number of different edges equal to the number of different instances of the edge in the network expression. Nonetheless, the method requires the trust network to be acyclic; if a loop occurs in the trust network, some edges have to be removed in order to eliminate the loop, thus discarding trust information. In contrast, we have constructed a discounting operator founded on the theory of evidence. This operator, together with the consensus operator, allows the computation of reputation for arbitrary trust networks, which can include loops, without the need to discard any information.
Cerutti et al. [7] define three requirements for discounting based on the intuitive understanding of a few scenarios. Let A be x's opinion about y's trustworthiness, C the level of certainty that y has about a proposition P, and F = A • C the (indirect) opinion that x has about P. (i) If C is pure belief, then F = A; (ii) if C is complete uncertainty, then F = C; (iii) the belief part of F is always less than or equal to the belief part of A. Based on these requirements, they propose a family of graphical discounting operators which, given two opinions, project one opinion into the admissible space of opinions given by the other opinion. These operators are founded on geometric properties of the opinion space, which makes it difficult to determine whether the resulting theory is consistent with the theory of evidence or with probability theory. Our discounting operator satisfies requirement (ii) above, but not requirements (i) and (iii); indeed, for g(A) > 0, discounting B through A yields B again (where B represents full belief). It is worth noting that the requirements proposed in [7] are not well founded in the theory of evidence: B means that there is an infinite amount of positive evidence, and discounting an infinite amount of evidence still gives an infinite amount of evidence. In Theorem 1, we provided a number of desirable properties founded on the theory of evidence. In particular, if p + n → ∞ then u → 0. Accordingly, if C = B the uncertainty component of F should be equal to 0, regardless of the precise (nonzero) value of the uncertainty component of A.

                      ⊗      generic EBSL   linear EBSL
Associativity         yes    no             yes
Left-distribution     no     no             yes
Right-distribution    no     yes            yes
Recursive solutions   no     yes            yes

Table 4: Comparison of the SL discounting operator ⊗, the generic EBSL discounting operator, and the linear EBSL discounting operator.
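The behaviour with respect to requirements (i)-(iii) can be checked numerically with a minimal sketch. It assumes that discounting y through x scales the evidence of y by the factor g(x), that opinions follow (b, d, u) = (p, n, c)/(p + n + c) for non-negative evidence, and that c = 2; the constant, the example evidence values and all function names are illustrative assumptions.

```python
C_CONST = 2.0  # assumed value of the constant c > 0 in the evidence-opinion mapping

def opinion(p, n, c=C_CONST):
    """Map non-negative evidence (p, n) to an opinion (b, d, u)."""
    total = p + n + c
    return (p / total, n / total, c / total)

def discount(x, y, g):
    """Assumed evidence-flow discounting: scale the evidence of y by g(x)."""
    factor = g(x)
    return (factor * y[0], factor * y[1])

def g_belief(x):
    """Discounting function g(x) = x_b, the belief component of x."""
    return opinion(*x)[0]

A = (3.0, 1.0)            # evidence behind the opinion about the recommender
b_A = opinion(*A)[0]      # belief part of A

# Requirement (ii): if C is complete uncertainty (no evidence), so is F.
vacuous = opinion(*discount(A, (0.0, 0.0), g_belief))
print("C vacuous ->", vacuous)

# C approaching full belief: ever larger amounts of positive evidence.
uncertainties = []
for p in (1e1, 1e3, 1e6, 1e9):
    b, d, u = opinion(*discount(A, (p, 0.0), g_belief))
    uncertainties.append(u)
    print(f"p = {p:.0e}: belief of F = {b:.6f}, uncertainty of F = {u:.2e}")
final_belief = b
```

In this sketch the uncertainty of F shrinks towards 0 as the evidence behind C grows, and the belief of F eventually exceeds the belief of A, which is why requirements (i) and (iii) cannot hold for an evidence-based operator.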
To our knowledge, our proposal is the first work that integrates uncertainty into flow-based reputation.

Conclusion
In this paper, we have presented a flow-based reputation model with uncertainty that allows the construction of an automated reputation assessment procedure for arbitrary trust networks. We illustrated and discussed the limitations of a naïve approach to combining flow-based reputation and SL. An analysis of SL shows that the problem is rooted in the lack of "cooperation" between the SL consensus and discounting rules, which is due to the different nature of these two operators. To solve this problem, we have revised SL by introducing a scalar multiplication operator and a new discounting rule based on the flow of evidence. We refer to the new opinion algebra as Evidence-Based Subjective Logic (EBSL).
The generic definition of discounting lacks the associative property satisfied by the SL operator ⊗. This, however, is not problematic since the flow of evidence has a well-defined direction. Furthermore, the generic operator is right-distributive, a property that one would intuitively expect of opinion discounting. One can choose a specific discounting function g(x) proportional to the amount of positive evidence in x. As shown in Table 4, the resulting discounting operator is completely linear (associative as well as left- and right-distributive). However, it has potentially undesirable behavior, since it ignores negative evidence and requires a carefully chosen system parameter related to the maximum amount of positive evidence in the system.
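The algebraic properties listed in Table 4 can be spot-checked on a single triple of opinions. The sketch below is a hedged illustration: it assumes that discounting y through x scales the evidence of y by g(x), that consensus adds evidence componentwise, and that c = 2; `THETA` is a hypothetical name for the system parameter of the linear operator. Reading "right-distribution" as distribution over a sum in the right-hand argument is also an assumption. A failed check is a genuine counterexample, whereas a passed check only shows that the property holds for this triple.

```python
import math

C_CONST = 2.0   # assumed constant c in b = p / (p + n + c)
THETA = 10.0    # hypothetical system parameter of the linear operator

def add(x, y):
    """Consensus as componentwise addition of evidence (assumed)."""
    return (x[0] + y[0], x[1] + y[1])

def scale(a, x):
    """Scalar multiplication of evidence."""
    return (a * x[0], a * x[1])

def g_belief(x):
    """Generic choice g(x) = x_b."""
    return x[0] / (x[0] + x[1] + C_CONST)

def g_linear(x):
    """Linear choice: g(x) proportional to the positive evidence in x."""
    return x[0] / THETA

def discount(x, y, g):
    """Discount y through x by scaling the evidence of y with g(x)."""
    return scale(g(x), y)

def close(x, y):
    return all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12) for a, b in zip(x, y))

def properties(g):
    x, y, z = (4.0, 1.0), (2.0, 3.0), (5.0, 0.0)
    return {
        "associativity": close(discount(discount(x, y, g), z, g),
                               discount(x, discount(y, z, g), g)),
        "left-distribution": close(discount(add(x, y), z, g),
                                   add(discount(x, z, g), discount(y, z, g))),
        "right-distribution": close(discount(x, add(y, z), g),
                                    add(discount(x, y, g), discount(x, z, g))),
    }

generic = properties(g_belief)
linear = properties(g_linear)
print("generic:", generic)
print("linear: ", linear)
```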
The adoption of the new discounting operator results in a system that is centered entirely on the handling of evidence. We have shown that the new EBSL algebra makes it possible to define an iterative algorithm to compute reputation for arbitrary trust networks. Thus, EBSL lays the foundation for the development of novel reputation systems. In particular, our opinion algebra guarantees that trust information does not have to be discarded, thus preserving the quality of the aggregated evidence. Moreover, making the uncertainty of the computed reputation explicit helps users decide how much to rely on it, based on their risk attitude. In our opinion, this will facilitate the adoption and acceptance of reputation systems, since users are more aware of the risks of engaging in a transaction and can thus make more informed decisions.
The work presented in this paper opens several directions for future work. We have shown how EBSL can be used to build a flow-based reputation model with uncertainty. However, several reputation models have been proposed in the literature to compute reputation over a trust network. An interesting direction is to study the applicability of EBSL as a mathematical foundation for these models. This will also make it possible to study the impact of uncertainty on the robustness of reputation systems against attacks such as self-promotion, slandering and Sybil attacks.

… that allows for negative belief and/or disbelief components. We have to slightly modify the relation between evidence and opinions, so that negative amounts of evidence can be represented:

$$(b, d, u) = \frac{(p, n, c)}{|p| + |n| + c}; \qquad (p, n) = c\,\frac{(b, d)}{u},$$

with c > 0. Here p and n can be negative. This relation automatically leads to a slightly modified definition of evidence addition (⊕) and scalar multiplication. For x, y ∈ Ω and α ≥ 0 all this reduces to the algebra of Sections 2.2 and 5.3; for x, y ∈ Ω* and α ∈ ℝ all the nice linear properties still hold. The space Ω* with the ⊕ and · operations is a vector space. (The underlying space of (p, n) evidence pairs has also been turned into a vector space by allowing negative amounts of evidence.) We introduce an inner product on this vector space as follows:

$$\langle x, y \rangle \stackrel{\mathrm{def}}{=} p(x)\,p(y) + n(x)\,n(y).$$
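It is easily verified from the definitions that this expression satisfies all the requirements for being an inner product, namely $\langle x, y \rangle = \langle y, x \rangle$; $\langle x, y \oplus z \rangle = \langle x, y \rangle + \langle x, z \rangle$; $\langle x, x \rangle \geq 0$; and $\langle \alpha \cdot x, y \rangle = \alpha \langle x, y \rangle$.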
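The modified evidence-opinion relation and the inner-product properties can be checked numerically. The following is a minimal sketch: it represents elements of Ω* by their evidence pairs (p, n), and it assumes that ⊕ adds evidence componentwise and that scalar multiplication scales both components, since the modified definitions themselves are not reproduced here. The value c = 2 and the sample points are arbitrary illustrative choices.

```python
import math

def to_opinion(p, n, c):
    """(b, d, u) = (p, n, c) / (|p| + |n| + c); p and n may be negative."""
    total = abs(p) + abs(n) + c
    return (p / total, n / total, c / total)

def to_evidence(b, d, u, c):
    """Inverse relation (p, n) = c * (b, d) / u."""
    return (c * b / u, c * d / u)

def add(x, y):
    """Evidence addition, assumed to be componentwise."""
    return (x[0] + y[0], x[1] + y[1])

def scale(alpha, x):
    """Scalar multiplication, assumed to scale both evidence components."""
    return (alpha * x[0], alpha * x[1])

def inner(x, y):
    """<x, y> = p(x) p(y) + n(x) n(y)."""
    return x[0] * y[0] + x[1] * y[1]

c = 2.0
x, y, z = (-3.0, 1.5), (4.0, -0.5), (0.25, 2.0)
alpha = -1.7

b, d, u = to_opinion(*x, c)
round_trip = to_evidence(b, d, u, c)

checks = {
    "normalised": math.isclose(abs(b) + abs(d) + u, 1.0),
    "round trip": all(math.isclose(a, e) for a, e in zip(round_trip, x)),
    "symmetry": inner(x, y) == inner(y, x),
    "additivity": math.isclose(inner(x, add(y, z)), inner(x, y) + inner(x, z)),
    "positivity": inner(x, x) >= 0,
    "homogeneity": math.isclose(inner(scale(alpha, x), y), alpha * inner(x, y)),
}
print(checks)
```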
With all this structure in place we can now invoke the Riesz-Fréchet theorem [12]: if Ω* is a real Hilbert space and g a linear functional, then there exists a unique vector v ∈ Ω* such that $g(x) = \langle v, x \rangle$ for all x ∈ Ω*.
Here 'linear functional' means that the linear property g(x ⊕ y) = g(x) + g(y) holds. Hence, the only way to achieve this property is to set $g(x) = \langle v, x \rangle$, i.e., a linear combination of p(x) and n(x).
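Written out in evidence coordinates, the statement above reads as follows. This is a short worked expansion that assumes evidence addition acts componentwise, i.e., p(x ⊕ y) = p(x) + p(y) and n(x ⊕ y) = n(x) + n(y):

```latex
\begin{align*}
g(x) &= \langle v, x \rangle = p(v)\,p(x) + n(v)\,n(x), \\
g(x \oplus y) &= p(v)\,\bigl[p(x) + p(y)\bigr] + n(v)\,\bigl[n(x) + n(y)\bigr]
               = g(x) + g(y).
\end{align*}
```

Choosing n(v) = 0 gives a discounting function proportional to the positive evidence in x. A function such as $g(x) = x_b$ is not of this form, because $x_b$ depends nonlinearly on p(x) and n(x).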