1 Introduction

As electronic commerce becomes increasingly popular and involves more and more participants, trust has become one of the main challenges in its development. In multi-agent based electronic commerce, self-interested agents may be deceptive, collusive or strategic. Unfair rating attacks (such as collusive unfair rating, Sybil, Camouflage, Whitewashing, and discrimination attacks [7, 23]) launched by dishonest reviewers render reputation systems vulnerable and mislead buyers into transacting with dishonest sellers. Dishonest reviewers may also employ sophisticated attacking strategies (such as a combination of several unfair rating strategies) to avoid being detected. Many trust models have been proposed to cope with unfair/false ratings. However, these models are not completely robust against various strategic attacks, i.e., they have limitations in defending against some kinds of strategic attacks. To address such problems, we design a new robust algorithm called WBCEA that improves intelligent agents' capabilities in accurately estimating the trustworthiness of sellers under various types of attacks and further reduces the risk of purchase.

The main steps of WBCEA are as follows. First, based on historical ratings, the defending buyer agent evaluates the trustworthiness of the reviewers who rated the recommended sellers, according to its own experience or the experience of trustable buyers (the trustable buyers evolve from the buyer agent's whitelist and blacklist). Second, the defending buyer generates a list of the most trustworthy reviewers as advisors (ranked by their trustworthiness) for each recommended seller agent. Third, it evaluates each seller's trustworthiness according to its own experience and the advisors' ratings, and selects the most trustworthy seller as trading partner. Finally, the defending buyer agent updates its whitelist and blacklist.

In contrast to existing strategies, the novel features of WBCEA are as follows. First, considering that each buyer agent has both a trustworthy facet and an untrustworthy facet, WBCEA models both facets of each reviewer. Moreover, each buyer agent maintains two lists (i.e., a whitelist and a blacklist) to keep track of the most trustworthy and the most untrustworthy advisors (i.e., reviewers), and these lists evolve according to its own experience. Second, based on the whitelist and blacklist, a customized optimal advisor list is generated for evaluating each recommended seller; this is similar to PEALGA but different from MET (which adopts one advisor list for evaluating all sellers). However, the agents in each optimal advisor list of PEALGA are selected from all the buyers in the system, while the agents in each optimal advisor list of WBCEA are selected from the buyers who have traded with the recommended seller; therefore, the latter lists are more targeted. Third, WBCEA considers both trust and distrust information in the social network of buyers, which differs from PEALGA, which only considers trust information.

The rest of this paper is organized as follows. Section 2 reviews related literature. Section 3 gives a framework for a multi-agent based electronic commerce platform. Section 4 illustrates the WBCEA strategy in detail. Section 5 verifies the performance of our approach through experiments. Section 6 concludes the paper with directions for future work.

2 Literature review

The aim of this section is to review models designed for detecting and defending against malicious agents. Since our work takes a local (buyer's) view rather than a global (electronic commerce platform) view, we only review models designed from buyers' viewpoints. In general, defending models can be divided into two categories, i.e., trust-based approaches and trust-and-distrust-based approaches. The following paragraphs illustrate these two kinds of models in detail.

It is widely agreed that trust means the confidence that one or more entities will behave as expected [16]. Based on trust information, many defending models such as BRS [19], iCLUB [11, 12], TRAVOS [18], ReferralChain [20, 21], Personalized [22], MET [6], and PEALGA [5] have been designed. These models can effectively defend against some kinds of attacks. However, BRS becomes inefficient and iCLUB becomes unstable when the majority of buyers are dishonest, because both of them employ the "majority rule". When dishonest advisors adopt shifty attacks, TRAVOS does not work well because it assumes that each advisor's rating behavior is consistent. ReferralChain sets the initial trust of each new buyer (advisor) to 1, which gives dishonest advisors a chance to abuse the initial trust (i.e., Whitewashing). The Personalized model is vulnerable when buyers have insufficient experience with advisors and the majority of advisors are dishonest (i.e., a combination of Sybil and Whitewashing). MET evolves a single advisor list, which is not necessarily suitable for estimating the trustworthiness of all sellers. PEALGA pre-evolves a customized advisor list for evaluating each candidate seller; however, it still only considers trust information between buyers.

Distrust is recognized to play an equally important role as trust [3]. Though some attack-defending models such as GBR [13] and Multi-faceted [4] have been designed, the investigation of utilizing distrust is still in its infancy [14, 15]. GBR was proposed to combat web spam by employing the "majority rule". Therefore, it has a strong bias towards seed pages when a small seed set is used, yet building a large seed set manually is time-consuming, so researchers face a difficult trade-off between the number of seed pages and time complexity. Multi-faceted considers both interpersonal and impersonal aspects, which may bring in redundant and even noisy information. Moreover, it is not very effective against most kinds of attacks, especially the Sybil and Whitewashing attacks. Both GBR and Multi-faceted generate only one advisor list for evaluating all sellers, which leads to problems such as lack of pertinence or inaccurate evaluation.

As links among users in a social network consist of trust as well as distrust connections, both kinds of information can be used in the design of defending strategies. However, existing models have shortcomings. For example, the performance of GBR depends on training data (i.e., the number of seed pages). Like GBR, our strategy WBCEA also relies on data, as its performance only stabilizes once enough transaction experience has been accumulated. However, we can use incrementally updated transaction and rating data as the input of our strategy, and such data are easier to acquire than the seed pages required by GBR.

3 A framework for the multi-agent-based electronic commerce platform with whitelist and blacklist mechanism

We propose a framework for a multi-agent-based electronic commerce platform (see Fig. 1). In this framework, there are three kinds of agents in the electronic market, i.e., buyer agents, seller agents, and search agents. Each buyer agent maintains a whitelist (which stores the buyers it considers trustworthy), a blacklist (which stores the buyers it considers untrustworthy), and a list of historical transaction records (which stores the sellers who have traded with it). The function of search agents is to help buyers select sellers who can satisfy their purchase demand, and simultaneously to list each seller's reviewers (see the central box of Fig. 1).

Fig. 1 A framework for electronic commerce platform with advisor mechanism

Once it gets the recommended sellers returned by the search agent, an honest buyer who defends against various attacks will select a suitable seller as trading partner based on the sellers' trustworthiness. So, a rational defending buyer should evaluate each seller's trustworthiness according to its own experience and the reviewers' advice. As some reviewers are selfish and dishonest and give unfair ratings, the defending buyer should evaluate each reviewer's trustworthiness before referring to their ratings. For each reviewer, if the honest buyer and the reviewer once traded with the same sellers, the buyer can evaluate the reviewer's trustworthiness directly according to its own experience. Otherwise, the honest buyer will seek help from the reviewers it trusts. These trusted reviewers can be found through the social network that is constructed based on the honest buyer's whitelist and blacklist. Once the reviewers' trustworthiness is obtained, the honest buyer generates a customized optimal advisor list from the reviewers for evaluating each recommended seller. After selecting a trading seller and receiving the goods or service, the honest buyer rates the trading seller according to its subjective perception. Simultaneously, the defending buyer's whitelist and blacklist are updated according to this trading experience. The symbols of this framework and their meanings are summarized in Table 1.

Table 1 Symbols used in the framework and their meanings
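To make the framework concrete, the following minimal sketch shows the state each kind of agent maintains. It is written in Python; the class and field names are illustrative choices of ours, not identifiers from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class BuyerAgent:
    """State a defending buyer keeps in the framework (field names are illustrative)."""
    name: str
    whitelist: set = field(default_factory=set)   # buyers currently judged trustworthy
    blacklist: set = field(default_factory=set)   # buyers currently judged untrustworthy
    history: dict = field(default_factory=dict)   # seller -> list of own ratings in [0, 1]

@dataclass
class SellerAgent:
    name: str

class SearchAgent:
    """Finds candidate sellers for a demand and lists each seller's reviewers."""
    def search(self, demand, sellers, reviewers_of):
        candidates = [s for s in sellers if demand(s)]
        return candidates, {s.name: reviewers_of(s) for s in candidates}
```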

In order to explicitly define the research scope and background of this paper, we assume that the defending agents in the electronic commerce platform with whitelist and blacklist mechanism satisfy the following assumptions.

Assumption 1

Buyer agents pay more attention to ratings given by advisors who have opinions similar to their own. Inexperienced buyer agents often need to consider other buyer agents' advice. Similar to human experience, buyer agents are more inclined to refer to advisors who hold similar opinions, and believe that agents who often give the same positive/negative ratings belong to the same category.

Assumption 2

Buyer agents' acceptance of other reviewers' ratings decreases over time. The more recent the rating, the more accurately it reflects the current trustworthiness of the seller and the more significant it is for predicting the seller's reputation.

Assumption 3

We assume that the buyer agents are not competitive in general and are willing to share their whitelists and blacklists with others. This is quite common in major e-marketplaces and travel agent portals.

Assumption 4

We rule out the influence of price differences in the selection of the trading seller in this study, as we concentrate on the effect of trustworthiness computation in an environment where the prices of the provided products or services are similar.

4 The whitelist and blacklist co-evolutionary strategy

Each time the search agent returns the recommended sellers \(S^{candidate}\) and each seller's recent reviewers \(B_{s_{j}^{candidate}}^{H}\) to the honest buyer \(b_i\), the buyer adopts the whitelist and blacklist co-evolutionary defending strategy (abbr. WBCEA) and implements the following steps.

(1) Based on its whitelist and blacklist, the honest buyer \(b_i\) constructs its social network \(TN_{b_{i}}\). Then, through the propagation of trust and distrust in \(TN_{b_{i}}\), buyer \(b_i\) tries to find the reviewers (i.e., buyers) \(B_{b_{i}}^{T}\) that it can trust (see Algorithm 1).

(2) For each reviewer who rates the candidate seller \(s_{j}^{candidate}\) in \(B_{s_{j}^{candidate}}^{H}\), if the reviewer also traded with some sellers who traded with buyer \(b_i\), then \(b_i\) can evaluate the reviewer's trustworthiness directly according to its own experience. Otherwise, \(b_i\) seeks trustworthy buyers in \(B_{b_{i}}^{T}\) (obtained from Algorithm 1) for reference. Based on the synthetized trustworthiness of each reviewer, the defending buyer \(b_i\) further generates an optimal advisor list \(A_{b_{i}}^{s_{j}^{candidate}}\) for evaluating each recommended seller \(s_{j}^{candidate} \in S^{candidate}\) (see Algorithm 2).

(3) The honest buyer \(b_i\) evaluates each candidate seller \(s_{j}^{candidate}\)'s trustworthiness according to its own experience and the optimal advisors' advice, based on the seller's reputation calculation algorithm SRCA (Algorithm 3).

(4) The honest buyer \(b_i\) rates the selected seller after the transaction, and updates its whitelist and blacklist according to this trading experience (see Algorithm 4).

The following sub-sections illustrate these steps in detail.
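A compact sketch of this decision loop is given below. The Python method names are placeholders standing for Algorithms 1-4, which the following sub-sections detail; they are not identifiers from the paper.

```python
def wbcea_round(b_i, candidates, reviewers_of):
    """One WBCEA round for the defending buyer b_i (illustrative sketch).
    candidates: recommended sellers; reviewers_of: seller -> recent reviewers."""
    trusted = b_i.build_trust_network()                      # step (1), Algorithm 1
    advisors = {s: b_i.generate_advisor_list(reviewers_of(s), trusted)
                for s in candidates}                         # step (2), Algorithm 2
    reputation = {s: b_i.evaluate_seller(s, advisors[s])
                  for s in candidates}                       # step (3), Algorithm 3 (SRCA)
    seller = max(candidates, key=reputation.get)             # trade with the best seller
    b_i.rate(seller)                                         # rate after the transaction
    b_i.update_lists(reviewers_of(seller))                   # step (4), Algorithm 4
    return seller
```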

4.1 The trust network construction algorithm

An agent's trust network is constructed based on its social network. In this subsection, we first explain the concepts of social network, distance, and layer; we then illustrate the principle of finding trustable buyers intuitively through an example; finally, we give the trust network construction algorithm.

Definition 1

The social network of a buyer is a network constructed based on its whitelist and blacklist. It is composed of vertexes and directed edges. Each vertex represents a buyer agent, and each directed edge represents a trust or distrust relationship between the two connected agents.

The trust relationship is represented by a solid arrow labeled "trust" (see Fig. 2a), and the distrust relationship is represented by a dotted arrow labeled "distrust" (see Fig. 2b). The agent at the arrowhead is named the "trustee" (who receives the trust statement), and the one at the tail is called the "trustor" (who makes the trust statement). In constructing a social network for an honest buyer (e.g., \(b_i\)), all the members in its whitelist and blacklist are added to its social network via solid and dotted arrows respectively. We assume in this paper that buyer agents are willing to share their whitelists and blacklists with other buyer agents (Assumption 3). We note that the members in \(b_i\)'s whitelist and blacklist also have their own respective whitelists and blacklists. Hence, we can say that an agent trusts another agent if the latter is in the whitelist of the former, and that an agent distrusts another agent if the latter is in the blacklist of the former. Meanwhile, the six degrees of separation theory asserts that "any two persons in the world can be connected through at most six persons". Based on this theory, we construct a social network for an agent \(b_i\) by including all other agents that are connected to \(b_i\) by at most 6 trust or distrust relations. From the honest buyer's view, the trustworthiness or untrustworthiness of a buyer in its social network is determined by the distance from the honest buyer to that buyer along the trust/distrust chains. Example 1 explains the concepts of distance and layer.

Fig. 2 Representation of trust and distrust relationship between agents

Example 1

Suppose there is a sample social network with trust and distrust labels (see Fig. 3a). In this figure, \(b_i\) is an honest buyer. \(b_i\)'s whitelist \(WL_{b_{i}}\) is {\(b_a\), \(b_b\), \(b_c\)}, \(b_a\)'s whitelist \(WL_{b_{a}}\) is {\(b_d\), \(b_e\)}, \(b_f\) is in \(b_b\)'s whitelist and in \(b_e\)'s blacklist simultaneously, \(b_g\) is in \(b_d\)'s whitelist and in \(b_e\)'s blacklist simultaneously, and \(b_h\) is in \(b_f\)'s whitelist and \(b_c\)'s blacklist simultaneously. The whitelist or blacklist of every buyer not listed above is empty. Obviously, we can find a chain \(<b_i, b_a, b_e, b_g>\) in Fig. 3a. According to the basics of graph theory, the distance from \(b_i\) to itself is zero, and the distance from \(b_i\) to \(b_a\) is 1. Therefore, the distance from \(b_i\) to \(b_g\) in chain \(<b_i, b_a, b_e, b_g>\) is 3. In the social network of \(b_i\), the agents who have equal distance from \(b_i\) are located in the same layer. In Fig. 3a, if \(b_i\) is located in the first layer, then \(b_a\), \(b_b\) and \(b_c\) are located in the second layer. Moreover, an agent may belong to multiple chains with the same or different distances (or layers). For example, \(b_f\) is located in the third layer of chain \(<b_i, b_b, b_f>\) and in the fourth layer of chain \(<b_i, b_a, b_e, b_f>\) simultaneously. Similarly, \(b_g\) is located in the fourth layer of chains \(<b_i, b_a, b_d, b_g>\) and \(<b_i, b_a, b_e, b_g>\) respectively.

Fig. 3 An example of social network and trust network

Propagation of trust and distrust along social network chains is similar to the "word of mouth" propagation of information among humans [1]. Josang et al. [8] concluded that "trust will be weakening in the chain of propagation" and that "in the case where an agent receives conflicting recommended trust, e.g. both trust and distrust, it needs some method for combining these conflicting recommendations". Based on these conclusions, we define the following rules to determine the trustworthiness or untrustworthiness of an agent in the social network. These rules reflect the fact that "trust will be weakening in the chain of propagation": the shorter the propagation chain, the smaller the weakening effect. Therefore, the "layer" (i.e., distance to the honest agent) of agents is considered in these rules. Example 2 explains the two rules intuitively.

Rule 1: If an agent is trusted and distrusted simultaneously by (i.e., in the whitelist and the blacklist of) different buyers who are located in the same layer, this agent's trustworthiness is considered to be uncertain.

Rule 2: If an agent is trusted and distrusted by (i.e., in the whitelist and the blacklist of) different agents who are located in different layers, the agent's trustworthiness is judged by the trust/distrust of its upper-layer agents. If the upper agent with the smallest layer trusts this agent, the agent is considered trustworthy; otherwise, it is considered untrustworthy.

Example 2

In Fig. 3a, \(b_g\) is trusted by \(b_d\) and distrusted by \(b_e\) simultaneously, and \(b_d\) and \(b_e\) are located in the same layer (i.e., the third). Therefore, \(b_g\) is located in the same layer in chains \(<b_i, b_a, b_d, b_g>\) and \(<b_i, b_a, b_e, b_g>\). Since \(b_g\) is trusted by \(b_d\) and distrusted by \(b_e\) simultaneously, according to Rule 1, \(b_i\) cannot judge with certainty whether \(b_g\) is trustworthy or not. For another example, in Fig. 3a, \(b_f\) is located in the third layer and the fourth layer of chains \(<b_i, b_b, b_f>\) and \(<b_i, b_a, b_e, b_f>\) respectively. Moreover, \(b_b\) trusts \(b_f\), while \(b_e\) distrusts \(b_f\). As \(b_f\)'s upper-layer agents, \(b_b\) and \(b_e\) are in layers 2 and 3 respectively. Since \(b_b\)'s layer is the smallest among \(b_f\)'s upper-layer agents, according to Rule 2, \(b_f\) is considered trustworthy based on \(b_b\)'s trust.

Algorithm 1 implements the above ideas about how to construct a trust network using two auxiliary queues, \(Q_t\) and \(Q_d\). \(Q_t\) temporarily stores agents who are in whitelists, and \(Q_d\) temporarily stores agents who are in blacklists. Moreover, we define a variable named depthLimit to represent the upper bound of chain length (i.e., 6) in the resulting trust network. The main steps of this algorithm are as follows. First, the queues \(Q_t\) and \(Q_d\) are initialized (see step (1) in Algorithm 1). Second, we find the trustworthy buyers \(B_{b_{i}}^{T}\) for \(b_i\) (see steps (3-5) in Algorithm 1). Third, we find the buyers that are not trustworthy (denoted \(B_{b_{i}}^{D}\)) for buyer \(b_i\) (see steps (6-8) in Algorithm 1). Finally, the queues \(Q_t\) and \(Q_d\) are updated (see steps (9-11) in Algorithm 1).

Algorithm 1 The trust network construction algorithm
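Since the pseudocode of Algorithm 1 is presented as a figure, the following Python sketch reconstructs its behavior from the description above and from Rules 1-2. It assumes agent objects expose .whitelist and .blacklist sets; the breadth-first traversal and the tie-breaking are our reading of the algorithm, not the authors' exact pseudocode.

```python
from collections import deque

def build_trust_network(b_i, depth_limit=6):
    """Sketch of Algorithm 1. Returns (trusted, distrusted); agents with a
    Rule-1 conflict stay 'uncertain' and end up in neither set."""
    label = {b_i: (0, "trust")}          # agent -> (distance/layer, judgement)
    queue = deque([b_i])                 # plays the role of the queues Q_t / Q_d
    while queue:
        b = queue.popleft()
        layer, judgement = label[b]
        if layer >= depth_limit or judgement != "trust":
            continue                     # only trusted agents' statements propagate
        for nxt, lbl in [(a, "trust") for a in b.whitelist] + \
                        [(a, "distrust") for a in b.blacklist]:
            if nxt not in label:
                label[nxt] = (layer + 1, lbl)
                queue.append(nxt)
            elif label[nxt][0] == layer + 1 and label[nxt][1] != lbl:
                label[nxt] = (layer + 1, "uncertain")  # Rule 1: same-layer conflict
            # else: the earlier, smaller-layer judgement stands (Rule 2)
    trusted = {a for a, (d, j) in label.items() if j == "trust" and d > 0}
    distrusted = {a for a, (_, j) in label.items() if j == "distrust"}
    return trusted, distrusted
```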

If we execute Algorithm 1 over agent \(b_i\)'s social network shown in Fig. 3a, we get the resulting trust network shown in Fig. 3b. From this figure, we can see that \(b_a\), \(b_b\), \(b_c\), \(b_d\), \(b_e\), and \(b_f\) are the agents trusted by \(b_i\). In comparison, according to the trust network construction methods given in previous work [5, 6, 20, 21] that do not consider "distrust" labels, all of the agents \(b_a\), \(b_b\), \(b_c\), \(b_d\), \(b_e\), \(b_f\), \(b_g\) and \(b_h\) would be added into the trust network. Therefore, Algorithm 1 further purifies the trust network using "distrust" information.

4.2 The optimal advisor lists generation algorithm

As there may be many sellers who satisfy a buyer agent's requirement, one fixed advisor list may lack pertinence in evaluating all of the recommended sellers' trustworthiness. This is because some members of a fixed advisor list may be experienced in trading with one seller but inexperienced in trading with others, so their advice about the latter sellers will be inaccurate. Similar to the idea of finding the most suitable advisor list (called the optimal advisor list) for evaluating a recommended seller, which we proposed in [5], this paper also tries to find an optimal advisor list for each seller. Algorithm 2 shows the main idea of the optimal advisor lists generation algorithm based on whitelist and blacklist, which is composed of four steps. First, buyer \(b_i\) calculates the pair-wise similarity between itself and every buyer \(b_k\) that rated \(S^{candidate}\), according to equation (1) or (2) (see steps (3)-(6) in Algorithm 2). Second, psychology research [10] showed that people have both a trustworthy aspect and an untrustworthy aspect; based on this result, we randomly set two initial values (denoted \(R_{b_{i},T}(b_k) \in [0,1]\) and \(R_{b_{i},D}(b_k) \in [0,1]\) respectively) for each buyer agent to denote these two aspects, and buyer \(b_i\) updates \(b_k\)'s trustworthy aspect and untrustworthy aspect according to equation (3) (see step (7) in Algorithm 2). Third, considering the trustworthy aspect and the untrustworthy aspect simultaneously, the synthetized trustworthiness of \(b_k\) is calculated according to equation (5) (see step (8) in Algorithm 2). Finally, buyer \(b_i\) updates and generates the optimal advisor list for evaluating each recommended seller (see step (10) in Algorithm 2).

Algorithm 2 The optimal advisor lists generation algorithm

The following definitions illustrate the formulas used in this algorithm. It should be noted that we do not consider \(b_k\)'s layer (or distance) in the social network in Definitions 2-4, for two reasons. One is that, for any buyer \(b_k\) in \(B_{s_{j}^{candidate}}^{H}\) but not in \(B_{b_{i}}^{T}\), there is no relationship between \(b_k\) and \(b_i\), let alone layers; that is to say, layers do not exist in all similarity calculation cases. The other reason is that the similarity between a trustor's and a trustee's viewpoints affects their trust, while the converse does not necessarily hold. For example, the degree to which a trustor trusts a trustee is affected by factors such as the familiarity between the trustor and the trustee, their interaction frequency, the consistency/similarity of their viewpoints, the number of common friends, and so on. However, the similarity of two persons' behaviors or viewpoints is not necessarily affected by their trust: in real life, a person's behavior or viewpoints may be similar to those of distant strangers or even his/her competitors and enemies.

Definition 2

If the defending agent \(b_i\) and a reviewer agent \(b_k\) once traded with the same sellers, the similarity between \(b_i\) and \(b_k\) is determined by the ratings they gave these sellers and the average values of their ratings. Equation (1) defines the calculation method.

$$ sim_{1} (b_{i} ,b_{k} )=\frac{\sum\limits_{s_{j} \in S_{b_{i} ,b_{k}}} {(r_{b_{i} ,s_{j}} -\overline {r_{b_{i}}} )(r_{b_{k} ,s_{j}} -\overline {r_{b_{k}}} )}} {\sqrt {\sum\limits_{s_{j} \in S_{b_{i} ,b_{k}}} {(r_{b_{i} ,s_{j}} -\overline {r_{b_{i}}} )^{2}}} \sqrt {\sum\limits_{s_{j} \in S_{b_{i} ,b_{k}}} {(r_{b_{k} ,s_{j}} -\overline {r_{b_{k}}} )^{2}}}} $$
(1)

where \(S_{b_{i},b_{k}}\) represents the set of sellers who once traded with both \(b_i\) and \(b_k\), \(r_{b_{i},s_{j}}\) is the rating that \(b_i\) gave \(s_j\) (\(s_j \in S_{b_{i},b_{k}}\)), \(\overline{r_{b_{i}}}\) represents the average of the ratings that \(b_i\) gave its trading partners, \(r_{b_{k},s_{j}}\) represents the rating that agent \(b_k\) gave \(s_j\) (\(s_j \in S_{b_{i},b_{k}}\)), and \(\overline{r_{b_{k}}}\) represents the average of the ratings that \(b_k\) gave its trading partners.

Definition 3

If the defending agent \(b_i\) and a reviewer agent \(b_k\) did not trade with any common seller in their histories, the similarity between them is determined by the characteristics of \(b_k\)'s ratings and the ratings given by the buyers trusted by \(b_i\). Equation (2) defines the calculation method.

$$ sim_{2} (b_{i} ,b_{k} )=\frac{\sum\limits_{s_{j} \in S_{b_{k}}^{H}} {(r_{b_{k} ,s_{j}} -\overline {r_{b_{k}}} )(\overline {r_{s_{j}}} -\overline r )}} {\sqrt {\sum\limits_{s_{j} \in S_{b_{k}}^{H}} {(r_{b_{k} ,s_{j}} -\overline {r_{b_{k}}} )^{2}}} \sqrt {\sum\limits_{s_{j} \in S_{b_{k}}^{H}} {(\overline {r_{s_{j}}} -\overline r )^{2}}}} $$
(2)

where \(S_{b_{k}}^{H}\) is the set of sellers that historically traded with \(b_k\), \(r_{b_{k},s_{j}}\) is the rating that \(b_k\) gave \(s_{j}\) (\(s_{j} \in S_{b_{k}}^{H}\)), \(\overline{r_{b_{k}}}\) is the average of the ratings that agent \(b_k\) gave its trading partners, \(\overline{r_{s_{j}}}\) is the average of the ratings that \(B_{b_{i}}^{T}\) (i.e., the set of agents trusted by agent \(b_i\)) gave \(s_j\), and \(\overline{r}\) is the average of the ratings that all members of \(B_{b_{i}}^{T}\) gave the sellers they traded with.
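A direct transcription of equations (1) and (2) into Python follows (a sketch; we assume each buyer's ratings are stored as a dictionary keyed by seller, and that the averages for the trusted agents have been precomputed):

```python
from math import sqrt

def sim1(ratings_i, ratings_k):
    """Equation (1): Pearson-style correlation over the sellers both buyers rated.
    Means are taken over each buyer's ratings of all its trading partners."""
    common = set(ratings_i) & set(ratings_k)
    mean_i = sum(ratings_i.values()) / len(ratings_i)
    mean_k = sum(ratings_k.values()) / len(ratings_k)
    num = sum((ratings_i[s] - mean_i) * (ratings_k[s] - mean_k) for s in common)
    den = sqrt(sum((ratings_i[s] - mean_i) ** 2 for s in common)) * \
          sqrt(sum((ratings_k[s] - mean_k) ** 2 for s in common))
    return num / den if den else 0.0

def sim2(ratings_k, trusted_avg, global_avg):
    """Equation (2): correlate b_k's ratings with the average ratings of b_i's
    trusted agents. trusted_avg: seller -> average rating by B^T_{b_i};
    global_avg: overall average rating given by the trusted agents."""
    sellers = [s for s in ratings_k if s in trusted_avg]
    mean_k = sum(ratings_k.values()) / len(ratings_k)
    num = sum((ratings_k[s] - mean_k) * (trusted_avg[s] - global_avg) for s in sellers)
    den = sqrt(sum((ratings_k[s] - mean_k) ** 2 for s in sellers)) * \
          sqrt(sum((trusted_avg[s] - global_avg) ** 2 for s in sellers))
    return num / den if den else 0.0
```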

Definition 4

From the defending agent \(b_i\)'s viewpoint, the trustworthy aspect and untrustworthy aspect of agent \(b_k\) can be updated based on their similarity. Equation (3) defines the updating method, where the combined similarity \(sim(b_i, b_k)\) is given by equation (4).

$$ \left\{ {\begin{array}{l} R_{b_{i} ,T} (b_{k} )=R_{b_{i} ,T} (b_{k} )+R_{b_{i} ,T} (b_{k} )\times (sim(b_{i} ,b_{k} )-\omega )\times \vert 1-\beta_{1}^{sim(b_{i} ,b_{k} )-\omega} \vert \\ R_{b_{i} ,D} (b_{k} )=R_{b_{i} ,D} (b_{k} )-R_{b_{i} ,D} (b_{k} )\times (sim(b_{i} ,b_{k} )-\omega )\times \vert 1-\beta_{2}^{sim(b_{i} ,b_{k} )-\omega} \vert \\ \end{array}} \right. $$
(3)
$$ sim(b_{i},b_{k})=\left\{ \begin{array}{ll} (sim_{1}(b_{i},b_{k})+1)/2 & \textit{if } b_{i} \textit{ and } b_{k} \textit{ have traded with the same sellers} \\ (sim_{2}(b_{i},b_{k})+1)/2 & \textit{otherwise} \end{array} \right. $$
(4)

where \(R_{b_{i},T}(b_k) \in [0,1]\) represents the trustworthy aspect of \(b_k\) from \(b_i\)'s viewpoint, and \(R_{b_{i},D}(b_k) \in [0,1]\) represents the untrustworthy aspect of \(b_k\) from \(b_i\)'s viewpoint; \(sim(b_i,b_k) \in [0,1]\) is calculated according to equation (4) (i.e., the similarity between \(b_i\) and \(b_k\) is the normalized value of \(sim_1(b_i,b_k)\) if \(b_i\) and \(b_k\) have traded with the same sellers; otherwise it is the normalized value of \(sim_2(b_i,b_k)\)); \(\omega \in (0,1)\) is a classification factor that divides the growth of the trustworthy and untrustworthy aspects into positive, negative and zero; \(\beta_1\) and \(\beta_2\) are factors that control the increment speed of trust and distrust respectively.

In the experiments, \(\omega\) is set to 0.5. This setting ensures that the trustworthy and untrustworthy aspects of \(b_k\) remain unchanged if the similarity between \(b_i\) and \(b_k\) is 0.5 (i.e., their similarity is neither obviously large nor obviously small). If the similarity between \(b_i\) and \(b_k\) is larger than 0.5 (i.e., their similarity is obviously large), the trustworthy/untrustworthy aspect of \(b_k\) increases/decreases by a certain amount; otherwise, the trustworthy/untrustworthy aspect of \(b_k\) decreases/increases by a certain amount. It is also important to note that the factors \(\beta_1\), \(\beta_2\) should satisfy the condition \(0 < \beta_2 < \beta_1 < 1\). The constraint \(\beta_2 < \beta_1\) ensures that the speed of increase of the trustworthy aspect is less than that of the untrustworthy aspect, and that the speed of decrease of the trustworthy aspect is greater than that of the untrustworthy aspect. This constraint is consistent with the research result that "people devote more attention to negative information than to positive information" [17].
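The update rule of equations (3) and (4) can be written directly as follows. This is a sketch: ω = 0.5 follows the experimental setting, the β values shown are merely illustrative (only 0 < β2 < β1 < 1 is required), and the final clamp is our addition to keep both aspects in [0, 1] as the definitions require.

```python
def update_aspects(R_T, R_D, sim, omega=0.5, beta1=0.9, beta2=0.6):
    """Equation (3): adjust the trustworthy aspect R_T and untrustworthy aspect
    R_D of a reviewer, given the normalized similarity sim in [0, 1] (eq. (4))."""
    delta = sim - omega                     # positive if clearly similar, negative if not
    R_T = R_T + R_T * delta * abs(1 - beta1 ** delta)
    R_D = R_D - R_D * delta * abs(1 - beta2 ** delta)
    # clamp to keep both aspects inside [0, 1]
    return min(max(R_T, 0.0), 1.0), min(max(R_D, 0.0), 1.0)
```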

The synthesized trustworthiness of an agent is obtained by synthesizing its trustworthy aspect and untrustworthy aspect (see Definition 5). In constructing the formula for \(STD_{b_{i}}(b_k)\), two thresholds \(\theta_1\) (\(0 < \theta_1 < 1\)) and \(\theta_2\) (\(0 < \theta_2 < 1\)) are introduced based on the human perception that a person is trustable when his/her trustworthy aspect is much larger than his/her untrustworthy aspect, and is not trustable when his/her trustworthy aspect is even slightly smaller than his/her untrustworthy aspect [10]. Therefore, \(\theta_1\) should be larger than \(\theta_2\). In the experiments, \(\theta_1\) and \(\theta_2\) are assigned 0.8 and 0.2 respectively. That is to say, if a person's trustworthy aspect exceeds his/her untrustworthy aspect by more than 0.8 (i.e., \(R_{b_{i},T}(b_k) - R_{b_{i},D}(b_k) > \theta_1\), \(\theta_1 = 0.8\)), the synthesized trustworthiness is taken to be 1. If a person's trustworthy aspect falls below his/her untrustworthy aspect by more than 0.2 (i.e., \(R_{b_{i},T}(b_k) - R_{b_{i},D}(b_k) < -\theta_2\), \(\theta_2 = 0.2\)), the synthesized trustworthiness is taken to be 0. Otherwise, the synthesized trustworthiness is calculated according to the formula \(\frac{1}{\theta_1 + \theta_2}(R_{b_{i},T}(b_k) - R_{b_{i},D}(b_k) + \theta_2)\) (see the third case of formula (5)). This formula is the line connecting the two points \((\theta_1, 1)\) and \((-\theta_2, 0)\). Figure 4 illustrates the construction principle visually: its horizontal axis is \(R_{b_{i},T}(b_k) - R_{b_{i},D}(b_k)\), and its vertical axis is the synthesized trustworthiness.

Fig. 4 Construction principle of synthesized trustworthiness

Definition 5

To facilitate comparison, the trustworthy aspect and untrustworthy aspect of \(b_k\) can be synthetized into one value by \(b_i\). Equation (5) defines the synthetization method.

$$ STD_{b_{i}}(b_{k})=\left\{ \begin{array}{ll} 1 & \textit{if } R_{b_{i},T}(b_{k})-R_{b_{i},D}(b_{k})>\theta_{1} \\ 0 & \textit{if } R_{b_{i},T}(b_{k})-R_{b_{i},D}(b_{k})<-\theta_{2} \\ \frac{1}{\theta_{1}+\theta_{2}}\left(R_{b_{i},T}(b_{k})-R_{b_{i},D}(b_{k})+\theta_{2}\right) & \textit{otherwise} \end{array} \right. $$
(5)

where \(R_{b_{i},T}(b_k)\) represents \(b_k\)'s trustworthy aspect from \(b_i\)'s viewpoint, \(R_{b_{i},D}(b_k)\) represents \(b_k\)'s untrustworthy aspect from \(b_i\)'s viewpoint; \(\theta_1\) (\(0 < \theta_1 < 1\)) and \(\theta_2\) (\(0 < \theta_2 < 1\)) are the two thresholds.
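Putting equations (3)-(5) together gives the core of Algorithm 2. The sketch below uses the experimental thresholds θ1 = 0.8 and θ2 = 0.2; the list size n_advisors and the helper state on the buyer (similarity_to, aspects) are hypothetical names of ours.

```python
def synthesized_trust(R_T, R_D, theta1=0.8, theta2=0.2):
    """Equation (5): map the two aspects to a single trustworthiness in [0, 1]."""
    diff = R_T - R_D
    if diff > theta1:
        return 1.0
    if diff < -theta2:
        return 0.0
    return (diff + theta2) / (theta1 + theta2)

def generate_advisor_list(b_i, reviewers, trusted, n_advisors=5):
    """Sketch of Algorithm 2 for one candidate seller: update each reviewer's
    aspects from similarity (eqs 1-4), synthesize trust (eq 5), keep the top-N."""
    scored = []
    for b_k in reviewers:
        s = b_i.similarity_to(b_k, trusted)              # eq (1) or (2), normalized by eq (4)
        R_T, R_D = update_aspects(*b_i.aspects[b_k], s)  # eq (3)
        b_i.aspects[b_k] = (R_T, R_D)
        scored.append((synthesized_trust(R_T, R_D), b_k))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [b_k for _, b_k in scored[:n_advisors]]
```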

4.3 Seller’s reputation evaluation algorithm

To decrease purchase risk, honest buyers often evaluate sellers' reputations first. Therefore, buyer \(b_i\) must be endowed with the ability to evaluate sellers. To accurately evaluate each seller's reputation, an honest buyer comprehensively considers its private trust in the seller (gained from its own experience) and the public reputation of the seller (calculated from its optimal advisors' comments). Algorithm 3 illustrates the seller's reputation calculation algorithm [5]. Its main idea is as follows: (1) buyer \(b_i\) first calculates each seller's private trustworthiness according to its own trading experience with this seller (see steps (2)-(5) in Algorithm 3); (2) buyer \(b_i\) calculates each seller's public reputation according to the optimal advisor lists obtained from Algorithm 2 (see step (6) in Algorithm 3); (3) the private trustworthiness and the public reputation are combined to obtain the perceived reputation of each seller (see step (7) in Algorithm 3). The formulas used in this algorithm are defined in the following definitions, and are similar to those given by Zhang and Cohen [22] and Ji et al. [5].

Definition 6

\(b_i\)'s private trust in seller \(s_{j}^{candidate}\) is calculated according to \(b_i\)'s ratings of this seller. Formula (6) defines the calculation method [22].

$$ R_{b_{i},pri}^{s_{j}^{candidate}} =\frac{\left[\sum\limits_{t=1}^{n} N_{b_{i},pos}^{s_{j}^{candidate}} \lambda^{t-1}\right]+1}{\left[\sum\limits_{t=1}^{n} \left(N_{b_{i},pos}^{s_{j}^{candidate}} +N_{b_{i},neg}^{s_{j}^{candidate}}\right)\lambda^{t-1}\right]+2} $$
(6)

where \(N_{b_{i},pos}^{s_{j}^{candidate}}\) represents the number of positive ratings that \(b_i\) gave \(s_{j}^{candidate}\), \(N_{b_{i},neg}^{s_{j}^{candidate}}\) represents the number of negative ratings that \(b_i\) gave \(s_{j}^{candidate}\), \(\lambda\) is a discount factor, and \(t\) (\(t = 1,2,\ldots,n\)) indexes the time windows of the ratings.

Definition 7

\(b_i\)'s public reputation of seller \(s_{j}^{candidate}\) is calculated according to the ratings of \(A_{b_{i}}^{s_{j}^{candidate}}\). Formula (7) defines the calculation method.

$$ R_{b_{i},pub}^{s_{j}^{candidate}} =\frac{\left[ \sum\limits_{k=1}^{m} \sum\limits_{t=1}^{n} P_{a_{k},pos}^{s_{j}^{candidate}} \lambda^{t-1} \right]+1}{\left[ \sum\limits_{k=1}^{m} \sum\limits_{t=1}^{n} \left(P_{a_{k},pos}^{s_{j}^{candidate}} +P_{a_{k},neg}^{s_{j}^{candidate}}\right)\lambda^{t-1} \right]+2} $$
(7)

where \(P_{a_{k},pos}^{s_{j}^{candidate}}\) represents the trust-weighted probability of a positive rating that advisor \(a_k \in A_{b_{i}}^{s_{j}^{candidate}}\) (which is evolved in Algorithm 2) gave \(s_{j}^{candidate}\), \(P_{a_{k},neg}^{s_{j}^{candidate}}\) represents the trust-weighted probability of a negative rating that advisor \(a_k\) gave \(s_{j}^{candidate}\), \(\lambda\) is a discount factor, and \(t\) (\(t = 1,2,\ldots,n\)) indexes the time windows of the ratings. The calculation principle of \(P_{a_{k},pos}^{s_{j}^{candidate}}\) and \(P_{a_{k},neg}^{s_{j}^{candidate}}\) is presented in formula (8), which is adapted from Zhang and Cohen [22] and from the formulas given by Jøsang and Ismail [9] and Yu and Singh [20] based on Dempster-Shafer theory.

$$ \left\{\begin{array}{l} P_{b_{i},pos}^{a_{k}} =\dfrac{2\,STD_{b_{i}}(a_{k})\,N_{b_{i},pos}^{a_{k}}}{\left(1-STD_{b_{i}}(a_{k})\right)\left(N_{b_{i},pos}^{a_{k}} +N_{b_{i},neg}^{a_{k}}\right)+2} \\[2ex] P_{b_{i},neg}^{a_{k}} =\dfrac{2\,STD_{b_{i}}(a_{k})\,N_{b_{i},neg}^{a_{k}}}{\left(1-STD_{b_{i}}(a_{k})\right)\left(N_{b_{i},pos}^{a_{k}} +N_{b_{i},neg}^{a_{k}}\right)+2} \end{array}\right. $$
(8)

where \(STD_{b_{i}}(a_k)\) is \(a_k\)'s synthesized trustworthiness estimated by buyer \(b_i\), \(N_{b_{i},pos}^{a_{k}}\) represents the number of positive ratings that \(b_i\) gave \(a_k\), and \(N_{b_{i},neg}^{a_{k}}\) represents the number of negative ratings that \(b_i\) gave \(a_k\).

Definition 8

The reputation of a given seller \(s_{j}^{candidate}\) as perceived by buyer \(b_i\) is the weighted combination of \(b_i\)'s private trustworthiness and the public reputation given by \(A_{b_{i}}^{s_{j}^{candidate}}\), as defined by formula (9) [22].

$$ R_{b_{i}}^{s_{j}^{candidate}} =wR_{b_{i} ,pri}^{s_{j}^{candidate}} +(1-w)R_{b_{i} ,pub}^{s_{j}^{candidate}} $$
(9)

where, w is calculated according to formulas (10) and (11).

$$ w=\left\{\begin{array}{ll} \frac{N_{all}^{b_{i}}}{N_{\min}} & \textit{if } N_{all}^{b_{i}} <N_{\min} \\[1ex] 1 & \textit{otherwise} \end{array}\right. $$
(10)
$$ N_{\min} =-\frac{1}{2\varepsilon^{2}}\ln \frac{1-\eta} {2} $$
(11)

where \(N_{all}^{b_{i}}\) is the total number of ratings provided by \(b_i\) for the seller \(s_{j}^{candidate}\), and \(N_{\min}\) is a threshold calculated according to formula (11), which is similar to that defined in [22]. If \(N_{all}^{b_{i}} \ge N_{\min}\), buyer \(b_i\) is confident about the private trustworthiness estimated from its own ratings, so the weight of the private trustworthiness is simply set to 1. Otherwise, \(b_i\) also considers the public reputation estimated from advisors' ratings. \(\varepsilon\) represents the maximal acceptable level of error, and \(\eta\) represents the confidence level.

Algorithm 3 The seller's reputation calculation algorithm (SRCA)
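A sketch of SRCA following formulas (6)-(11) is given below (Python). Per-window positive/negative rating counts are assumed as inputs, and the default values of λ, ε and η are illustrative placeholders rather than the paper's Table 3 settings.

```python
from math import log

def discounted(counts, lam):
    """Sum per-window counts with discount lam^(t-1); index 0 is window t = 1."""
    return sum(c * lam ** t for t, c in enumerate(counts))

def private_trust(pos, neg, lam=0.9):
    """Formula (6): pos/neg are per-window counts of b_i's own ratings."""
    return (discounted(pos, lam) + 1) / \
           (discounted(pos, lam) + discounted(neg, lam) + 2)

def public_reputation(advisor_probs, lam=0.9):
    """Formula (7): advisor_probs is a list of (pos, neg) per-window
    trust-weighted probabilities (formula (8)), one pair per optimal advisor."""
    num = sum(discounted(p, lam) for p, _ in advisor_probs) + 1
    den = sum(discounted(p, lam) + discounted(n, lam) for p, n in advisor_probs) + 2
    return num / den

def perceived_reputation(pos, neg, advisor_probs, eps=0.2, eta=0.9, lam=0.9):
    """Formulas (9)-(11): weight private trust by the volume of own experience."""
    n_min = -log((1 - eta) / 2) / (2 * eps ** 2)        # formula (11)
    n_all = sum(pos) + sum(neg)
    w = min(n_all / n_min, 1.0)                          # formula (10)
    return w * private_trust(pos, neg, lam) + \
           (1 - w) * public_reputation(advisor_probs, lam)
```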

4.4 The whitelist and blacklist updating algorithm

After the transaction, the defending buyer \(b_i\) rates the selected seller and then updates its own whitelist and blacklist according to this experience. This updating keeps the whitelist and blacklist a timely record of the buyers the honest buyer trusts and distrusts. The main idea of this updating process (see Algorithm 4) is as follows: (1) consider whether each member \(b_k\) in \(B_{s_{j}^{candidate}}^{H}\) should be swapped into \(b_i\)'s whitelist or blacklist (see step (1) in Algorithm 4); (2) if \(b_k\)'s synthesized trustworthiness is larger than that of the most untrustworthy buyer in \(WL_{b_{i}}\) (denoted \(b_{mu}\)), \(b_k\) replaces \(b_{mu}\) (see steps (2-4) in Algorithm 4); if \(b_k\)'s synthesized trustworthiness is smaller than that of the most trustworthy buyer in \(BL_{b_{i}}\) (denoted \(b_{mt}\)), \(b_k\) replaces \(b_{mt}\) (see steps (5-7) in Algorithm 4).

Algorithm 4 The whitelist and blacklist updating algorithm
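A minimal sketch of Algorithm 4 follows, assuming the whitelist and blacklist are kept as Python dicts mapping buyers to their current synthesized trustworthiness (a representation of ours, not the paper's):

```python
def update_lists(whitelist, blacklist, reviewers, std_of):
    """Sketch of Algorithm 4: a reviewer displaces the weakest whitelist member
    (or the strongest blacklist member) when its synthesized trust warrants it."""
    for b_k in reviewers:
        if b_k in whitelist or b_k in blacklist:
            continue                                   # already tracked
        std = std_of(b_k)                              # fresh synthesized trust, eq (5)
        if whitelist:
            b_mu = min(whitelist, key=whitelist.get)   # least trustworthy in whitelist
            if std > whitelist[b_mu]:                  # steps (2-4)
                del whitelist[b_mu]
                whitelist[b_k] = std
                continue
        if blacklist:
            b_mt = max(blacklist, key=blacklist.get)   # most trustworthy in blacklist
            if std < blacklist[b_mt]:                  # steps (5-7)
                del blacklist[b_mt]
                blacklist[b_k] = std
```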

5 Experimental results

To verify the performance of WBCEA as given in Section 4, we design a set of experiments. Similar to the experiments designed in previous studies [5, 6], 6 typical kinds of attacks, including AlwaysUnfair, Camouflage, Sybil, Whitewashing, Sybil-Camouflage, and Sybil-Whitewashing, are selected to attack the reputation system. AlwaysUnfair attackers always give high reputation ratings to dishonest sellers while rating the honest seller low. Camouflage attackers intermittently tell the truth or call white black (i.e., give unfairly high scores to dishonest sellers and unfairly low scores to honest sellers). In the experiments of this paper, each Camouflage attacker rates honestly in the first 20 days, and gives unfair ratings to both the dishonest and honest duopoly sellers in the following days. Similar to AlwaysUnfair attackers, buyers who adopt the Sybil attack always call white black; differently from the AlwaysUnfair attack, the number of dishonest buyers in Sybil is much larger than in AlwaysUnfair. Buyers who use the Whitewashing attack strategy whiten their low reputation by creating a new account. In the experiments, a Whitewashing attacker provides an unfair rating each day and then creates a new account the next day to whitewash its sham behavior.
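For concreteness, the rating behaviors of these attackers can be sketched as follows (illustrative Python; the 20-day Camouflage switch follows the description above, and honest behavior is modeled as rating 1 for honest sellers and 0 for dishonest ones):

```python
def attacker_rating(strategy, seller_is_honest, day):
    """Rating an attacker of the given strategy produces on a given day (sketch)."""
    honest = 1.0 if seller_is_honest else 0.0
    if strategy in ("AlwaysUnfair", "Sybil"):
        return 1.0 - honest                            # always call white black
    if strategy == "Camouflage":
        return honest if day <= 20 else 1.0 - honest   # honest in the first 20 days
    if strategy == "Whitewashing":
        return 1.0 - honest                            # unfair daily; new account next day
    return honest                                      # honest rater
```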

The naive strategy and the oracle strategy are designed as baselines. The naive strategy means that buyers who adopt it believe all raters are good and their ratings are true. The oracle strategy assumes that buyers are omniscient and therefore always know each seller's real reputation. Moreover, strategies such as iCLUB, Personalized, MET, PEALGA, GBR and Multi-faceted are selected for comparison with our strategy (WBCEA). iCLUB [11, 12], Personalized [22], and MET [6] are typical strategies that only consider the trust factor in the evaluation of sellers and buyers; they belong to the filtering, discounting and evolutionary categories respectively. PEALGA [5] is another evolutionary strategy that only considers trust, but it constructs a different customized optimal advisor list for each candidate seller. GBR [13] and Multi-faceted [4] are compared because they consider both trust and distrust information in the evaluation of sellers and buyers. In addition, this paper constructs an algorithm named WBCEA_S. The only difference between WBCEA and WBCEA_S is that the latter evolves only one advisor list to evaluate all the candidate sellers. This algorithm is constructed to verify whether customized optimal advisor lists outperform a single advisor list in evaluating candidate sellers when trust/distrust are both considered and a whitelist/blacklist are maintained.

5.1 Experimental settings

To facilitate the analysis of various defense strategies, Jiang et al. [6] simulated a market with two duopoly sellers, 99 dishonest ordinary sellers, and 99 honest ordinary sellers. The ordinary sellers serve as noise when honest buyers select among the duopoly sellers. To compare with the strategy given by Jiang et al. [6], we choose a similar market setting and relist the electronic market parameters and their values in the trust attack/defense experiment (see Table 2). According to Table 2, under all attacks except Sybil, there are 12 dishonest buyers and 28 honest buyers in the market; under the Sybil attack, there are 28 dishonest buyers and 12 honest buyers. In total, 100 days of transactions are simulated. It should be noted that the initial trustworthy aspect (i.e., \(R_{b_{i},T}(b_k)\)) and untrustworthy aspect (i.e., \(R_{b_{i},D}(b_k)\)) of each reviewer are randomly assigned values in [0, 1]. Each day, every buyer makes one transaction with a partner. The ratings they give sellers range from 0 to 1. It should also be noted that, as recommendation algorithm design is not the topic of this paper, the recommendation list of sellers is randomly generated in the experiments. The settings of the parameters in WBCEA are listed in Table 3.

Table 2 Parameters in simulation
Table 3 Setting and meaning of variables or parameters used in WBCEA

5.2 Evaluative criteria

To compare the experimental results, we choose criteria similar to those given in Jiang et al. [6] to evaluate the performance of each strategy. One criterion is robustness, which is used to evaluate the feasibility (i.e., the anti-attack ability) of each defense strategy at a macroscopic scale. Formula (12) defines the robustness function. According to this formula, the value of robustness ranges from -1 to 1. The more transactions a defending agent makes with the honest duopoly seller, the higher its correct selection rate, the larger the value of robustness, and therefore the better the defending ability.

Definition 9

The robustness of a defense strategy (abbr. Def) against an attack model (abbr. Atk) is the average transaction difference between the two duopoly sellers with respect to the honest buyers over the simulation days. Formula (12) defines the calculation principle [6].

$$ R(Def,Atk)=\frac{\left| Trans(s^{H}) \right|-\left| Trans(s^{D}) \right|}{\left| B^{H} \right|\times Days\times Ratio} $$
(12)

where \(Trans(s^H)\) is the transaction volume of the honest duopoly seller, \(Trans(s^D)\) is the transaction volume of the dishonest duopoly seller, \(|B^H|\) is the number of honest buyers, \(Days\) is the total number of transaction days, and \(Ratio\) is the selection probability of the duopoly sellers.

The mean absolute error (abbr. MAE) of a seller's reputation is used to measure the accuracy of trust models in modeling the seller's reputation. Formula (13) defines the calculation of MAE. The smaller the MAE, the more accurate the defense strategy's prediction, and therefore the better the defense strategy.

Definition 10

The mean absolute error (MAE) of a seller \(s_j\)'s reputation is defined by its real reputation and its estimated reputation when buyers adopt a given defense strategy. Formula (13) [6] defines the calculation.

$$ MAE(s_{j})=\frac{\sum\limits_{t}\sum\limits_{b_{i}}\left|R^{t}(s_{j})-\tilde{R}_{b_{i}}^{t}(s_{j})\right|}{\left| B^{H}\right|\times Days} $$
(13)

where \(|B^H|\) is the number of honest buyers, \(Days\) is the total number of transaction days, \(R^t(s_j)\) is the actual reputation of seller \(s_j\) on day \(t\) (\(t \in [0, Days]\)), and \(\tilde{R}_{b_{i}}^{t}(s_j)\) is the reputation of seller \(s_j\) on day \(t\) as estimated by buyer \(b_i \in B^H\) according to the ratings of its advisors.
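Both criteria are straightforward to compute from the simulation logs; a small sketch (variable names are ours):

```python
def robustness(trans_honest, trans_dishonest, n_honest_buyers, days, ratio):
    """Formula (12): value in [-1, 1]; higher means a better defense."""
    return (trans_honest - trans_dishonest) / (n_honest_buyers * days * ratio)

def mae(real_by_day, estimated, n_honest_buyers, days):
    """Formula (13): estimated[(b_i, t)] holds buyer b_i's estimate on day t."""
    total = sum(abs(real_by_day[t] - est) for (b_i, t), est in estimated.items())
    return total / (n_honest_buyers * days)
```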

5.3 Results and analysis

5.3.1 Robustness analysis

The experimental results for the robustness of the compared strategies are listed in Table 4. Each row represents the robustness of one defending strategy, and each column represents the robustness of the compared strategies when defending against one attack. The number before "±" is the mean robustness, and the number after "±" is the mean square deviation. The larger the mean value and the smaller the mean square deviation, the more robust the corresponding strategy. The following paragraphs explain these results in detail.

Table 4 Robustness of compared strategies

The naïve strategy assumes that all raters are good and their ratings are true. If the majority of reviewers are attackers, then by following these reviewers' advice the naïve strategy may make the defending agent misjudge the trustworthiness of sellers and wrongly choose a dishonest seller to trade with. The oracle strategy assumes that agents always know the real trustworthiness of each reviewer and can always choose the honest seller according to honest reviewers' advice; therefore, oracle always reaches the highest robustness. The oracle and naïve values can be regarded as baselines: the nearer a strategy's robustness is to the oracle's, the more robust the strategy.

Comparing all the strategies, we find that PEALGA and WBCEA achieve the best or nearly the best results in defending against the various pure and combined attacks. In particular, PEALGA and WBCEA achieve the best performance when defending against the Sybil and Sybil&Whitewashing attacks. The other compared strategies score poorly in one or more scenarios. For example, although the GBR and Multi-faceted strategies consider trust and distrust information simultaneously, they cannot defend well against the attacks that include Sybil. WBCEA_S is greatly inferior to PEALGA and WBCEA when defending against the Sybil (0.88 ± 0.09) and Sybil&Whitewashing (0.74 ± 0.45) attacks. The high performance of PEALGA (which considers only trust in the pre-evolution of the optimal customized advisor list) and WBCEA (which considers both trust and distrust factors in the co-evolution of the whitelist and blacklist) is due to the fact that both emphasize evolving an optimal customized advisor list for evaluating each candidate trading seller, which enables defenders to accurately predict the duopoly sellers' reputations and to choose the honest duopoly seller as transaction partner. Therefore, we can conclude that, for accurately evaluating sellers' trustworthiness, the idea of generating an optimal customized advisor list is more influential than the idea of simultaneously considering trust and distrust while maintaining a whitelist and blacklist.

5.3.2 Accuracy analysis

Though Table 4 reveals the differences in robustness when defending against various attacks, it cannot differentiate the defending strategies' estimation accuracy regarding the duopoly sellers' reputations. Tables 5 and 6 list the MAE and variances of the dishonest and honest duopoly sellers' reputations respectively. As the oracle strategy assumes that an agent knows each seller's real reputation, oracle agents always predict sellers' reputations exactly (i.e., MAE equals zero); therefore, the oracle strategy can be taken as the lowest (best) baseline, and the closer a strategy's MAE is to the oracle's, the more accurate it is. In contrast, as the naïve strategy assumes that agents trust all buyers' ratings to be true, the difference between the real reputation and the reputation predicted by the naïve strategy is large (i.e., its MAE is very large); in predicting the reputation of the dishonest duopoly seller, the MAE reaches 0.76 in some cases. Therefore, the naïve strategy can be taken as the highest (worst) baseline, and the closer a strategy's MAE is to the naïve strategy's, the more inaccurate the strategy. From Tables 5 and 6, we can see that the MAEs of PEALGA and WBCEA approach those of the oracle strategy very closely under almost all attacks. WBCEA_S also performs well when predicting the duopoly sellers' reputations under all attacks except Sybil&Whitewashing. The GBR and Multi-faceted strategies can predict the honest duopoly seller's reputation quite accurately, especially under attacks without a Sybil component; however, their predictions for the dishonest duopoly seller are not accurate (i.e., the MAE is as large as 0.52 ± 0.10, 0.46 ± 0.06, and 0.32 ± 0.10).

Table 5 MAE and variance of dishonest duopoly seller’s reputation
Table 6 MAE and variances of honest duopoly seller’s reputation

Beyond Tables 5 and 6, Figs. 5 and 6 illustrate in detail the variation trends of the averaged estimated reputations of the duopoly sellers when defenders (i.e., honest buyers) adopt the various defending strategies under multifarious attacks. In these figures, the horizontal axis represents the 100 days of trading, and the vertical axis is the averaged estimated reputation of the duopoly sellers. From these figures, we can see that oracle always estimates the honest and dishonest sellers' reputations accurately (the real reputation of the dishonest seller is 0, and the real reputation of the honest seller is 1). Therefore, the closer a given strategy's curve is to the oracle curve, the more accurate that strategy's estimation. Conversely, as the naïve strategy assumes the defender trusts all others' ratings, it can be regarded as the worst baseline. The following paragraph analyzes and compares these strategies' ability to defend against the various attacks in detail.

Fig. 5 Reputations of dishonest duopoly sellers when buyer agents adopt various strategies to defend against attacks

Fig. 6 Reputations of honest duopoly sellers when buyer agents adopt various strategies to defend against attacks

In general, PEALGA and WBCEA outperform all the trust-only strategies and all the trust-and-distrust strategies under almost all attacks, especially under attacks such as Sybil, Sybil&Whitewashing, and Sybil&Camouflage. Moreover, PEALGA and WBCEA perform similarly and achieve the best performance under all attacks, as there is very little difference between their reputation MAEs (see the 7th row and the 11th row in Tables 5 and 6). This can also be seen in Figs. 5 and 6, in which PEALGA and WBCEA always approach the oracle baseline best under all attacks. Furthermore, WBCEA outperforms the other strategies, including PEALGA, in defending against Sybil, as it is the quickest strategy to converge to the sellers' real reputations (see Figs. 5d and 6d). These results are achieved because the simultaneous consideration of trust and distrust improves a defending strategy's prediction accuracy slightly, while adopting a customized optimal advisor list for evaluating each candidate seller improves its prediction accuracy greatly.

6 Conclusions

In this paper, based on the psychology research result of Lewicki et al. [10], we assume that each buyer agent has both a trustworthy aspect and an untrustworthy aspect, and we propose assigning two scores to each buyer agent to denote these two aspects respectively. The synthetization of these two aspects serves as the criterion for evaluating a buyer's synthetized trustworthiness. Besides, each buyer maintains a whitelist and a blacklist, which evolve according to the new algorithm called WBCEA for defending against multifarious attacks. The WBCEA algorithm is composed of several sub-algorithms: the trust network construction algorithm, the optimal advisor lists generation algorithm, the seller's reputation calculation algorithm, and the whitelist and blacklist updating algorithm. Based on its whitelist and blacklist, a buyer can construct its own trust network (see Algorithm 1). By doing so, the buyer can select trustworthy advisors for evaluating each candidate seller and choose the most trustworthy seller as trading partner.

A set of experiments is designed and implemented to compare the performance of our strategy with recent typical defending strategies and baseline ones. The experimental results show that the WBCEA strategy and the PEALGA strategy have similar performance in defending against all attacks. Moreover, they outperform the existing related trust strategies and all the trust-and-distrust strategies in robustness and in the MAE of sellers' reputations when defending against various attacks. In particular, WBCEA slightly outperforms PEALGA when defending against the Sybil attack.

The strategies compared in this paper are suitable for B2B electronic markets, where sellers' behaviors do not change frequently. Whether the strategy proposed in this paper is suitable for C2C markets, in which sellers' behaviors often change, is a problem that remains to be explored. Moreover, it will be worthwhile to study the stability of this strategy when the market configuration (the number of sellers, the ratio of dishonest buyers) changes. Besides, using real data to verify the accuracy, robustness and stability of this strategy is a future goal.