1 Introduction

Electronic marketplaces (e-markets) such as E-Bay and Amazon are inter-organisational information systems that enable participating buyers and sellers to exchange information about prices and product offerings online, and typically support all steps of the entire order fulfilment process [9]. One problem with existing e-markets is that their participants must monitor continuously how their activities progress. As the duration of some of these activities can be long, e-market participants may find it tedious and time-consuming to monitor multiple, on-going activities [32]. Consider the following scenario:

Bob needs a second-hand laptop and joins an e-market to purchase it. Bob’s preferred price range is between [700–900] and has a deadline to acquire it within five days. He enters the e-market on a Monday and after a quick search, he finds his preferred laptop being sold in an on-line auction. But as on-line auctions are time consuming (indeed the auction’s expiration deadline is in a month’s time), he also searches for individual sellers who allow items of interest to be negotiated on-line. He finds two possible sellers: Alice and Tom. As shown in Fig. 1, Bob starts a negotiation with both sellers at the same time, by making offers to each one of them separately for the laptops they offer. Alice, sends a counter-offer first, followed by Tom. After evaluating both counter-offers, Bob is inclined to buy the laptop from Tom. But, a new seller, John, joins the e-market on Tuesday. Bob’s choices are now extended as the number of alternatives increases because the market has dynamically changed. Should he (a) buy from Tom? (b) negotiate with John risking to lose Tom’s offer? or (c) continue to negotiate with all three, in which case what offer should he make to them?

Fig. 1
figure 1

Buyer Bob negotiates concurrently with sellers Alice, Tom and John

To answer the questions raised by our scenario, Bob needs a high-level plan, or strategy, about which actions to take. This should consider his preferences, the nature of the e-market and the lack of information about the strategies of the sellers, their deadlines or information about possible competitors (other buyers who wish to buy the same laptop and may be negotiating with Bob’s sellers at the same time as he is). For the dynamic part of the e-market environment Bob’s strategy should also consider if current competitors leave the e-market because their deadlines have been reached or if new competitors have entered the e-market, while Bob is still negotiating. For the incomplete information part too Bob’s strategy should consider different issues like how much longer will Tom remain in the market and if there are competitors interested in Tom’s laptop, to have an idea of how much to propose to him in his next offer.

Motivated by the above scenario, we study the problem of what kind of strategy should a software agent use in order to represent a buyer’s needs in an e-market. We employ a multi-agent systems approach [61] to deal with such dynamic and complex e-markets, whereby agents negotiate bilaterally with others on behalf of their users. The idea here is that agents can satisfy their users’ preferences by isolating emotions and feelings [22] as well as save users time. We are interested in many-to-many negotiations that occur in pairs (bilateral) and at the same time (concurrent). Supporting such negotiations is important because they can enable practically automated buying and selling processes online, e.g. on E-Bay. As agents are inherently multi-threaded, they can negotiate concurrently. The immediate advantage is to eliminate one of the shortcomings of human negotiations, slowness, especially in applications where the task of negotiation can be delegated to a software agent to save time. Another advantage is to evaluate offers from multiple sellers, which increases the chances of reaching an agreement, thus maximising the agent’s utility [6, 47, 63]. Utility can be seen as a function that is designed by the user to measure the performance of the agent during negotiation [36]. Moreover, concurrent negotiation provides the agent with negotiating power, which means the agent can negotiate with many sellers at the same time, to acquire a stronger negotiation stance relative to another seller [50, 58].

Negotiation has gained attention in numerous multi-agent system applications [37], not only in e-commerce [23, 38, 52], but also in other areas such as service-oriented systems [13, 45], formation of virtual organisations [43], supply chains [16, 27], service grids [60] and cloud computing [15]. In this context, many different aspects of negotiation strategies have been studied, ranging from the introduction and evaluation of opponent models [19, 28, 39, 47, 63, 64] to encounter-based learning approaches [40, 59]. Negotiation strategies based on game theoretic approaches are often used for practical applications; however, they have been critiqued in situations where one cannot have complete information about the negotiating environment and the negotiation opponents [31, 50]. Like [6], we are not assuming that agents have complete information about the factors considered in our framework, which makes agents’ reasoning even more difficult to compute optimal negotiation strategies. Therefore, instead of formally modelling the negotiation process, we have followed a heuristic approach that is experimentally tested and evaluated.

In addition, existing strategies abstract away from the e-market environment and its dynamic nature and, as a result, how the environment affects the progress of ongoing negotiations, which in turn lead agents to obtain lower utility. In addition, existing strategies do not always have explicit measures of how well the agent itself is progressing during a negotiation and, at times, they make strong assumptions about the domain, e.g. that deadlines of all participants are public, thus constraining their applicability in a variety of practical negotiation settings. Moreover, the concurrent negotiation literature does not tackle the problem of different agreement zones during negotiation. In the literature, the intersection between the price ranges (known as agreement zones) is always implicitly assumed to be 100% of initial and reservation prices of buyer, an assumption that is unrealistic in practical applications. The reason for exploring different agreement zones is to show how negotiation can move the positions of the agents closer to the areas of the zones where there is agreement and thus improve their utility.

Another major issue related to building practical negotiation strategies for e-markets is how to evaluate the performance of an agent’s strategy. While some researchers test their agents theoretically [21], most agent developers use simulation platforms [18, 29, 39, 48, 63], especially for evaluating heuristic-based strategies. However, some of these platforms limit the types of strategies that can be tested and as a result there is scope for a standardised simulation environment to provide fair and objective comparisons between negotiating agents.

Given the limitations of existing work discussed above, our hypothesis is that: if an agent models the e-market environment and the progress of itself in it as explicit parameters, then the agent can capitalise on the fluctuations of these parameters to make offers that improve the agent’s utility. To test this hypothesis we propose the Conan strategy, a heuristic-based strategy that supports a buyer negotiate in open e-market environments, where agents not only enter and leave at their own will, but also have diverse strategies that are unknown to their negotiation opponents. Our metrics to evaluate our hypothesis is based on a novel combination of environment and self-related factors, which to the best of our knowledge has not been attempted in existing work. Moreover, we keep the deadline of negotiating agents private, so that agents do not know when others will exit from the negotiation. Furthermore, we experiment with a variety of realistic settings by using an agent platform that supports changing the parameters of a simulation to evaluate the performance of Conan and other state-of-the-art strategies. Our contribution can thus be summarised as follows.

  • We formulate the Conan strategy, and we show how it can be specified declaratively as part of an adaptive agent model for concurrent bilateral negotiations. Conan makes concurrent bilateral negotiation suitable in applications where existing work imposes too strong assumptions to be realistic.

  • We build an open and complex negotiation environment to test the strategy. We then design a series of experiments simulating negotiation scenarios between Conan and opponent strategies from the literature, and we evaluate extensively their performance.

  • We demonstrate that Conan significantly outperforms the state-of-the-art [63] in terms of average utility gained from negotiations.

The rest of the paper is structured as follows. Section 3 formalises our concurrent negotiation environment. Section 4 describes our negotiating agent, Conan. Section 5 explains our experimental setting and evaluates the experimental results. Section 2 reviews the relevant literature. Section 6 concludes with potential future directions for our work.

2 Related work

We review here related work according to the number of agents negotiating at a specific time, giving rise to multiple concurrent bilateral negotiation models. Our review will focus on addressing the following questions: (a) How an agent can behave in a dynamic environment where there are both external changes (i.e. changes that are out of the agent’s control, namely changes in the environment) and internal changes (i.e. changes in the agent’s intentions and goals, namely changes in the agent’s mental state); (b) when and how an agent can change its strategy during negotiation; and (c) what information is available for an agent during the negotiation to support the decision-making process. In addition, our review will also include two of the concurrent alternating offers protocols that are similar to ours.

2.1 Multiple concurrent bilateral negotiation models

In this context Nguyen and Jennings [46, 47] developed and evaluated a heuristic concurrent bilateral negotiation model for e-commerce purchasing. The model consists of a coordinator and a number of negotiation threads. The coordinator is responsible for deciding and changing the negotiation strategy for each thread and updating the lowest utility value of an offer that is considered acceptable by the agent. The coordinator also employs the negotiation strategy according to (a) the probability distribution of seller, determining whether the seller is one of conceder or non-conceder; (b) the percentage of success matrix, describing the chance of reaching an agreement when the buyer agent applies a specific strategy to negotiate with a particular type of seller; and (c) the payoff matrix, measuring the average utility value of an agreement reached in similar situations. The success and payoff matrices are initially set to a common value and updated after the negotiation finishes.

While Nguyen and Jennings [46, 47] model is very relevant to our study, it is not directly applicable to our negotiation setting. This is due to their strict assumption about the knowledge of the opponent types, negotiation success and payoff matrices before the negotiation. Moreover, their model is based on a Bayesian learning approach requiring prior knowledge about the environment as well as the opponents. The Conan strategy presented later in this paper is more open than that of Nguyen and Jennings as it does not deal only with two types of sellers, agents can come and go during negotiation, and it does not require prior information about the opponents and the environment in the form of success and payoff matrices.

An et al. [7] proposed a strategy for deciding what proposal to make and when to make it, in multiple concurrent negotiations. The limitation of this strategy is that compared with our strategy, the negotiation situations of different negotiation threads are calculated according to the minimum offer received from all sellers, while in our case, we analyse each negotiation thread based on how the negotiation with the opponent progresses. Also, in the An et al. strategy, there are no rules to decide when to accept an offer, but in our strategy there are. Moreover, their strategy does not employ a concurrent protocol, as we do. Furthermore, An et al. did not benchmark their strategy against other state-of-the-art strategies, while in our strategy we do.

Li et al.  [39] presented a strategy for bilateral negotiations about one issue (i.e. price). Their strategy is composed of sub-strategies about single-threaded negotiations where one buyer and one seller are assumed; synchronised multi-threaded negotiations where there are multiple threads in the buyer account for negotiating with different sellers; and dynamic multi-threaded negotiations where the uncertainty of new sellers joining in the future is factored in to the buyer’s strategy. The sub-strategy that is most related to our setting is that of dynamic multi-threaded negotiations in the presence of other opponents. One limitation of the strategy is that it adopts only the basic time-dependent negotiation strategy, viz., the concession rate of the offer only depends on the negotiation time. Furthermore, the work does not provide implementation details on how the authors set up their empirical evaluation. In addition, the strategy assumes the availability of prior information about opponents (e.g. the estimated reservation price of an opponent), which is again not appropriate for practical negotiation settings like ours.

Kolomvatsos et al. [33] adopted particle swarm optimisation (PSO) and kernel density estimation (KDE) for each thread in the concurrent negotiation to find the proposed prices in every round of the negotiation. Similarly to our strategy, they focused on the buyer side and they studied concurrent negotiations between a buyer and a set of sellers. In this setting, the buyer utilises a number of threads, with each thread following a specific strategy that adopts swarm intelligence techniques for achieving the optimal agreement. The PSO algorithm is adopted by each thread. What is interesting in their approach is that their strategy requires no central coordination. Their strategy does make some assumptions about the opponents. However, their experimental results focused on the time interval where an agreement is possible and were not compared with the current state-of-the-art (as we did). Also, Kolomvatsos et al.  [34] used an artificial bee colony (ABC) algorithm to find the optimal agreement when negotiating with a group of sellers; however, the model has the same limitations as  [33].

Williams et al. [63] proposed a negotiation strategy for concurrent negotiations in which the agent negotiates with multiple opponents about one item with multiple issues. This is the most similar existing research with respect to our setting. As with Nguyen and Jennings, the Williams et al. strategy has two components: a set of negotiation threads and a coordinator. Each negotiation thread is responsible for two tasks: (a) performing Gaussian process regression to predict the future utility of its opponent and (b) calculating the concession rate for an offer by taking into account the best time and best utility values from the coordinator. The coordinator is responsible for determining the best time and utility for each thread to reach an agreement. Williams et al.  [63] developed two versions of their strategy. The first assumes that each thread has a different utility function from the others, while the second presupposes that all threads should have the same utility function.

However, there are weaknesses in the Williams et al. strategy: (a) both the agent and its opponents have the same deadline (and are aware of each other’s), which is not realistic for most practical problems; (b) they model the openness of the market by having agents using the same break-off probability (likelihood of leaving the negotiation) which is prior knowledge that is not applicable in real e-markets where buyers and sellers are concurrently negotiating with each other; and (c) their negotiation setting is not completely open since they only allow agents to leave the environment, but not to enter during run-time. We will use the Williams et al. strategy to benchmark our strategy, as it represents the current state-of-the-art in multiple and concurrent bilateral negotiation.

Table 1 Functionality comparison of the existing work in concurrent negotiation

Table 1 compares existing work in concurrent negotiation according to functionality: (a) continuous time—where real time is used during negotiation instead of using negotiation rounds or discrete time; (b) private deadline—where the buyer’s and sellers’ deadlines are unknown to each other in advance; (c) asynchronous negotiation—where the agent does not wait for all the sellers’ counter-offers to reply; (d) environment model—where the buyer has a model of the market’s environment in terms of the number of sellers, its competitors and other parameters that formulate the market’s dynamics; (e) unknown opponents—where the buyer has no information about the strategy of the sellers; (f) incomplete information—where there is no a priory knowledge about the negotiation environment and the agents in it; (g) model of open market—where a negotiation agent is capable of modelling how agents enter and leave to/from the market at a time; (h) explicit decisions—where clear conditions about when to offer, accept, reqToReserve, cancel and exit are specified; (i) different price ranges—where the agent is tested against three classes of intersections between the price ranges (i.e. agreement zones) \([\mathrm{IP}_{b}, \mathrm{RP}_{b}]\) and \([\mathrm{IP}_{s_{i}},\mathrm{RP}_{s_{i}}]\): 10, 60 and 100%; (j) opponent model—models the behaviour of the opponents; and (k) multi-issue negotiation—where buyers/sellers negotiate multiple issues like price and warranty.

2.2 Alternating offers protocol

We study a practical concurrent negotiation setting where a buyer agent engages in multiple bilateral negotiations in order to acquire a good. In doing so we experimented with the alternating offers protocol [54] due to its simplicity and wide use [37]. As shown in Fig. 2, in this protocol a buyer and a seller take turns to negotiate with possible actions: offer, accept and reject. The negotiation terminates either because an agreement is reached (buyer/seller accepts) or the negotiation has terminated (buyer/seller reaches their deadlines).

Fig. 2
figure 2

Alternating offers state diagram

Fig. 3
figure 3

Williams’ concurrent alternating offers state diagram

There have been two previous attempts to extend the alternating offers protocol. In [4], a confirm action is allowed after an accept; a buyer cannot cancel from confirmed offers. In [63], Fig. 3, a concurrent protocol is proposed and the possible actions are:

\(A = \{\mathrm{offer}[x], \mathrm{accept, exit, confirm}, \textit{de-commit}\}\)

The limitations in this method are that: (a) when the buyer makes a deal via a confirm action, the deal will be completed and the buyer will obtain the item under negotiation and the seller will receive the money for that item, so if the buyer cancels the deal (via a de-commit action), it results in a more complex situation where the buyer has to return the item back to the seller and the seller has to return the money back to the buyer, which can be inefficient in real-time applications; and (b) only the buyer can cancel the deal (via a de-commit action), which is not fair for the seller if the seller receives a better offer from other buyer. Moreover, the de-commit action is not explicitly shown in the state diagram of [62]. Furthermore, it implies that the protocol goes with a de-commit action from a final state (an agreement) to another final state (a conflict deal), which is logically problematic.

We will adopt the model of a percentage of the deal penalty to enable a fair comparison with the benchmark strategy of Williams et al.  [63].

3 Concurrent negotiation setting

Given a domain of application such as buying and selling in an e-market place, automated negotiation is normally viewed as a mechanism [53] that relies upon two main elements:

  • Negotiation protocol defines the space of possible agreements that agents can achieve and the legal actions that negotiating agents can make (i.e. the rules of encounter), including the process they need to follow to achieve these agreements. A protocol is public, viz., every agent knows the rules of negotiation in advance.

  • Negotiation strategy determines an agent’s behaviour during the negotiation process and specifies when and how to act. A strategy is private to the agent, i.e. we cannot assume that agents will know of the other agents’ strategies.

A negotiating agent makes proposals defined by its protocol using its strategy. Negotiation proceeds as a series of rounds, and each round may involve one or more proposals. Negotiation terminates when agents reach an agreement, or a deadline is reached, whichever is earlier. Next, we describe the skeleton of our negotiation framework and the protocol used.

3.1 Skeleton of a concurrent negotiation

We assume that a market environment contains a set of buyers B where \(\left| B \right| =n\) (\(n \ge 1\)) and a set of sellers \(S= \left\{ s_1,\ldots ,s_m\right\} \) (\(m \ge 1\)) that wish to negotiate over a set of resources \(R=\{r_1,\ldots ,r_l\}\) (\(l \ge 1\)) using a protocol p. We develop a strategy for a buyer agent \(b \in B\), who negotiates concurrently with multiple sellers over a resource \(r \in R\) using the protocol p. Negotiation advances in real time t (\(t \in \mathbb {R}^+\)) where agents take decisions to make offers. At any time t during a negotiation, we identify the following information.

\(\mathrm{Seller}_b^t\) is the set of seller agents negotiating with b at time t, where \(\mathrm{Seller}_b^t \subseteq S\).

\(C_b^t\) is the set of competitor agents for b at time t, where \(C_b^t \subseteq B {\setminus }\{b\}\).

The negotiation starts at time \(T_s\). Once started, the negotiation can potentially last for time \(T_b\), which is the maximum duration of time that b can negotiate for. We also refer to \(T_e\) as the deadline of negotiation for b, where \(T_e = T_s + T_b\). We assume that b does not know the sellers’ deadlines, and that the sellers do not know \(T_b\).

b has \(\left| \mathrm{Seller}_b^t \right| \) negotiation threads, one for each seller \(s_i\in \mathrm{Seller}_b^t\).

\(\delta ^{t}_{ag_j\rightarrow ag_k}\) is a negotiation action from \(ag_j\) communicated to \(ag_k\) at time t, where either \((ag_j=b\ \wedge \ ag_k\in \mathrm{Seller}_b^t)\) or \((ag_j \in \mathrm{Seller}_b^t\ \wedge \ ag_k=b)\), and \(\delta \in \) {offer(x), reqToReserve, reserve, cancel, confirm, accept, exit} where offer(x) is a term that contains the price x of the resource r under negotiation at time t.

\(\mathrm{IP}_{b}\) and \(\mathrm{RP}_{b}\) represent the initial and reservation prices of buyer b, while \(\mathrm{IP}_{s_{i}}\) and \(\mathrm{RP}_{s_{i}}\) represent the initial and reservation prices of seller \(s_i\). In the literature, the intersection between the price ranges (i.e. agreement zones) \([\mathrm{IP}_{b}, \mathrm{RP}_{b}]\) and \([\mathrm{RP}_{s_{i}},\mathrm{IP}_{s_{i}}]\) is always implicitly assumed to be 100% of \([\mathrm{IP}_{b}, \mathrm{RP}_{b}]\), a very difficult assumption to make when dealing with practical applications. So, in order to make our setting more realistic, we adopt three classes of intersections for the aforementioned price ranges: 10%, 60% and 100%. The reason behind this is that we want to capture situations where we can obtain higher utility, which was not previously possible without using negotiation. However, we do make the assumption that at any time t if \(\forall s_i \in \mathrm{Seller}_b^t\ (\mathrm{IP}_b \ge \mathrm{IP}_{s_{i}})\) holds, then there is no need to negotiate. This is a much softer assumption to make.

\(\mathrm{EG}_b \in [0,1]\) represents the eagerness of buyer b, viz., specifying the willingness of b to obtain a resource; this is specified by the user. For instance, if the user needs to buy a laptop to use it in his project in the next week, then the user has a high eagerness to buy a laptop. On the other hand, if the user needs to move to a house in the next two years, then he has a low eagerness to buy a house.

\(U_b(x):\mathbb {R}^+ \rightarrow [0,1]\) is the utility function that agent b uses to determine the utility of an offer x made either by b or \(s_i\) as follows:

$$\begin{aligned} U_b(x) = (\mathrm{RP}_{b} - x )/ (\mathrm{RP}_{b} - \mathrm{IP}_{b}) \end{aligned}$$
(1)

Since our focus is on a concurrent negotiation setting where a buyer agent b engages in multiple bilateral negotiations with multiple sellers \(s_i \in \mathrm{Seller}_b^t\) in order to acquire a single resource, we need a protocol that determines rules of negotiation that are appropriate for concurrent negotiation. The agent needs two types of protocols to rule the concurrent negotiation: the individual bilateral negotiation protocol and the overall negotiation protocol.

3.2 Bilateral protocol

In order to decide which protocol p is allowed in our negotiation market, we experimented with the alternating offers protocol [54] because it is simple and widely used. However, this protocol is not sufficient to handle the complexity of concurrent negotiations due to its limited actions (i.e. offer(x), accept and reject). The following two cases show the difficulty in using the alternating offer protocol as is in a concurrent negotiation.

  1. (i)

    While Bob is negotiating with Tom and about to send an accept to Tom’s offer, a new seller John enters the market. This creates a new opportunity for Bob. As a result of using the alternating offers protocol, Bob needs to make a decision on how to proceed in the negotiation, either by: (a) buying from Tom without exploring what John has to offer; or (b) negotiating with John and thus risking to lose all exchanges between him and Tom that led Tom produce his current offer.

  2. (ii)

    At time t1, Bob is negotiating concurrently with Alice, Tom and John. At time t2 Alice sends a counter-offer to Bob, while at time t3 Tom sends back an offer to Bob. At t4 Bob sends back a counter-offer to Alice and an accept to Tom. Also at t4 Bob receives an accept from John. Since Bob needs to buy one item only, the problem now arises at t5 of how will Bob deal with two accepts and whether Bob will continue the negotiation or not.

Fig. 4
figure 4

Conan concurrent alternating offers state diagram

As shown in Fig. 4, we extend the protocol to allow for the additional actions: reqToReserve, reserve, cancel, confirm and exit. At negotiation time t, these actions can be described as follows.

  • offer(x): represents an offer made either by b (or \(s_i\)), proposing price x for resource r;

  • reqToReserve: provides b with the opportunity to hold a preferred offer from a seller \(s_i\) until a better offer is made by another seller \(s_j\). The reason for reqToReserve actions is that b cannot accept an offer directly because maybe a new seller will enter the market with a lower price. reqToReserve action is also useful because apart from helping to keep a good seller, there is a cancellation penalty which prevents the seller from exiting the negotiation. b can send a reqToReserve to \(s_i\) at any time t if b receives an accept from \(s_i\) or b receives an offer from \(s_i\).

  • reserve: gives \(s_i\) with the opportunity to agree on holding a preferred offer from a buyer b. \(s_i\) can send a reserve action to b if \(s_i\) receives a reqToReserve action from b.

  • cancel: allows b or \(s_i\) or both to cancel from their reserved offers, which means that they retract or withdraw from existing reserved agreements. The cancelling agent pays a penalty Penalty to the other agent involved in the cancelled agreement, to ensure fairness and avoid unnecessary cancellation. We will discuss later how to deal with penalties.

  • confirm: enables b and \(s_i\) reach an agreement about the resource r under negotiation being sold at the price of the most current offer at time t. b can send an confirm to \(s_i\) at any time t if \(s_i\) accepts an offer from b.

  • accept: permits b and \(s_i\) reach an agreement about the resource r under negotiation being sold at the price of the most current offer at time t. b can send an accept to \(s_i\) at any time t if: (i) b reaches its deadline \(t_b\) and has a reserved offer with \(s_i\); and (ii) b accepts an offer from \(s_i\). In addition, \(s_i\) can send an accept for any offer made by b.

  • exit: supports b or \(s_i\) to withdraw from the course of negotiation at any time, even without notifying other agents. This implies that the negotiation between b and \(s_i\) has failed.

3.3 How To negotiate concurrently

Our protocol will be used by the buyer to communicate with multiple sellers concurrently. Negotiation starts when the buyer sends an offer to a seller and then waits for the seller’s response. As soon as it receives a response from the seller, the buyer evaluates the seller’s offer based on its strategy. If the buyer does not agree on the offer, it will send a counter-offer. The buyer and the seller continue to exchange offers until either agent exits the negotiation, an offer is reserved, or they reach an agreement. We assume that the agent has to reply to all pending offers and that messages received never expire. If the seller likes the buyer’s offer it will send an accept action and wait for the buyer to:

  • confirm the seller’s accept, in which case the negotiation terminates with an agreement. The reason for this is to avoid the situation where b receives accepts from two different sellers;

  • reqToReserve to the seller’s accept, which implies that the buyer and seller move to the reservation request state and the buyer will continue to search for other better offers. However, the seller is not allowed to reqToReserve to the buyer’s offer since we assume that the seller has large numbers of the same resource to sell; or

  • exit from the negotiation with a seller and thus forcing negotiation to fail.

Likewise, if the buyer likes the seller’s offer, it can either:

  • accept the seller’s offer and the negotiation terminates with an agreement; or

  • reqToReserve to the seller’s offer, so the buyer and seller move to the reservation request state and the buyer will continue to search for other better offers.

In the reservation request state the following may happen:

  • the seller can exit from the reservation request state without paying any penalty. This move would be allowed only when the seller does not want to reserve the item and exits;

  • the seller sends a reserve action agreeing to the reqToReserve from the buyer;

  • the buyer or the seller cancel from the reserved state, which results in a penalty being paid by the participant who initiates the cancel; or

  • the buyer can accept the reserved offer and the negotiation ends with an agreement.

3.4 Overall negotiation protocol

This protocol governs the communication between agents in the e-market. The overall negotiation starts when the agent b enters the e-market and finishes either when:

  1. E1.

    b accepts or confirm an offer from a seller \(s_i\); thus, agreement is reached and the negotiation succeeds; or

  2. E2.

    b exits from all bilateral negotiations with all sellers; thus, no agreement was reached, b exits the e-market, and the negotiation fails.

An agent now needs a strategy to make decisions during negotiation. Such a strategy is protocol-dependent in that participating agents need to conform to the protocol’s actions for progress to be made during negotiation. For this work we assume that agents are protocol conformant, i.e. that they will follow the protocol and will never breach it during negotiation.

3.5 Example run

To give an example of how agents interact with our overall negotiation protocol consider Fig. 5 depicting an example run using the protocol. The run shows the exchanges between agents for the scenario presented in the introduction. Each box corresponds to a bilateral negotiation between the buyer agent Bob and the seller agents Alice, Tom and John. Bob starts first with the two sellers Alice and Tom and, while he is negotiating with them, observes that a new seller John is available, so he decides to negotiate with him too. This example shows that the difficulties that Bob had under the alternating offers protocol in Sect. 3.2, case (i), can be overcome mainly because Bob now is able to handle negotiations with Alice, Tom and the new seller John at the same time. More precisely, Bob can now send a reqToReserve to Tom to hold a good offer and at the same time start negotiating with John to explore what is it that John has to offer, thus improving his options to improve his utility during the negotiation. Eventually, the example shows that the bilateral negotiations of Bob with Alice and John fail, but Bob’s negotiation with Tom succeeds. For a buyer like Bob, bilateral negotiations are separate concurrent threads that encapsulate their own initial states and form part of the buyer’s overall internal state. Interactions within threads are viewed as state transitions that follow the protocol of Fig. 4. \(S1, S2,\ldots S6\) are state labels for the state of the buyer. The state transitions of the buyer are then caused by negotiation actions of the form \(\delta ^{t}_{ag_j\rightarrow ag_k}\) (as discussed earlier).

Fig. 5
figure 5

An example run of concurrent bilateral negotiation in CONAN

3.6 Implementation

We build a negotiation framework using Recon  [2], an experimental simulation platform that supports the development of software agents interacting concurrently with other agents in negotiation domains. Unlike existing simulation toolkits [42], Recon also supports declarative strategies to enable agents explain their decisions to a user, if necessary. Our buyer agent is implemented according to KGP agent model [57] with the specialised agent architecture discussed in [3]. Recon is built on top of GOLEM Footnote 1 [12] suitably specialised with infrastructure agents to manage an e-market and extract statistics from the negotiations that take place. A tutorial video on Recon is available online.Footnote 2

4 CONAN

We study the development of a heuristic negotiation strategy for a buyer agent in a concurrent negotiation market setting. Existing work [47, 63] provides answers only about when to offer or accept and what to offer, but does not explicitly represent reqToReserve and cancel actions in a concurrent negotiation dialogue. In addition, offers are computed using an opponent’s model and do not take into account the environment and/or self models [47, 63]. Moreover, some existing work assumes complete and certain information about the negotiation environment and the opponents [47]. In most approaches the agent strategy is an isolated component without clear illustration of how the model is designed and how different components interact with each other [47, 63]. To develop an adaptive negotiation decision-making model in a dynamic environment, we must support an agent to deal with an enormous number of environmental and attitude changes at the same time. To address this issue, we develop the Conan strategy. Since our buyer agent needs to acquire a resource when negotiating with multiple sellers at the same time, the agent will use Conan to decide (1) what actions to take during negotiation and when to take them; and (2) whether the action is an offer, then how to generate an offer. Algorithm 1 gives an overview of Conan.

figure d

Conan takes \(T_b\), \(\mathrm{IP}_{b}\) and \(\mathrm{RP}_{b}\) as an input from the agent user and returns either an agreement or no agreement to allocate a resource as output. Conan starts by initialising its deadline, then calculates the concession rate \(\mathrm{CR}_{t,{s_i}}\), generates the next offer and decides which action to proceed with for each seller \(s_i\). The designs of each heuristic are discussed in sub-sections below.

4.1 Heuristics to calculate the concession rate

Conan generates offers \(\mathrm{offer}_{t,{s_i}}\) at time t for seller \(s_i\) based on the following equation [18]:

$$\begin{aligned} \mathrm{offer}_{t,{s_i}}= \mathrm{IP}_{b}+ (\mathrm{RP}_{b}-\mathrm{IP}_{b}) * \mathrm{CR}_{t,{s_i}} \end{aligned}$$
(2)

where

  • \(\mathrm{IP}_{b}\): is the initial price;

  • \(\mathrm{RP}_{b}\): is the reservation price;

  • \(\mathrm{CR}_{t,{s_i}}\): is the concession rate at time t for seller \(s_i\). \(\mathrm{CR}_{t,{s_i}} \in [0,1]\) and \(\mathrm{offer}_{t,{s_i}} \in [\mathrm{IP}_{b}, \mathrm{RP}_{b}]\).

Our evaluation of the concession rate depends on the environment and self factors:

$$\begin{aligned} \mathrm{CR}_{t,{s_i}} = \left\{ \begin{array}{ll} 0, &{}\quad \text {if }t= T_{s} ;\\ 0.99, &{}\quad \text {if } t=T_e-\alpha \text { where } \alpha = \text { time to complete one}\\ &{}\quad \text {negotiation round};\\ \mathrm{CR}_{t-1,{s_i}}, &{}\quad \text {if } \mathrm{CR}_{t,{s_i}} - \mathrm{CR}_{t-1,{s_i}}< 0;\\ \mathrm{CR}_{t-1,{s_j}} * 0.9, &{}\quad \text {if offer of }\mathrm{CR}_{t-1,{s_j}} \text { reserved and }j \ne i;\\ w_{\mathrm{Env}_{t}} E_t + w_{\mathrm{Self}_{t}} S_{t}, &{}\quad \text {otherwise} \end{array} \right. \end{aligned}$$
(3)

where

  • \(E_t\): is the effect of the environment factor where \(E_t\in [0,1]\);

  • \(S_{t}\): is the effect of self factor where \(S_t\in [0,1]\);

  • \(w_{\mathrm{Env}_{t}}, w_{\mathrm{Self}_{t}}\): are the environment and self weights, normalised in the following sense: \(w_{\mathrm{Self}_{t}}, w_{\mathrm{Env}_{t}} \in [0,1]\) and \(w_{\mathrm{Self}_{t}}+ w_{\mathrm{Env}_{t}} = 1\).

Before explaining the environment and self factors \(E_t\) and \(S_t\), we demonstrate how we calculate each case of the concession rate \(\mathrm{CR}_{t, {s_i}}\) in Eq. 3 above.

  • Case 1 When b starts the negotiation, it will offer its initial price first.

  • Case 2 When time is \(\alpha \) time units close to the deadline and the buyer b has no reserved offer, then it needs to secure an offer even if it is near to its reservation price. Therefore, b concedes near to its reservation price by setting the concession rate \(\mathrm{CR}_{t,{s_i}}=0.99\).

  • Case 3 When b generates an offer that is non-monotonically increasing (i.e. \(\mathrm{CR}_{t,{s_i}} - \mathrm{CR}_{t-1,{s_i}}<0\)) at t, then, to ensure its own rationality, b will actually offer its previous offer \(\mathrm{CR}_{t-1,{s_i}}\). This case needs a first generation of the value \(\mathrm{CR}_{t,{s_i}}\) with cases 1 or 5.

  • Case 4 If b has a reserved offer at \(t-1\), there is no need for b to concede more to other sellers. Thus, b will generate an offer which is less than 10% of its reserved offer. This case needs a first generation of the value \(\mathrm{CR}_{t,{s_i}}\) with cases 1 or 5.

  • Case 5 Otherwise, b will generate an offer that takes into consideration the environment and self factors. To give a flavour of these, the following are some special cases illustrating how environment and self factors affect the concession rate.

  1. (a)

    Adding and/or removing a seller to/from the negotiation: while agent b is negotiating with agent s1 and agent s2, a new seller agent s3 enters the market and starts a new negotiation thread with b. This creates a new option for b. As a result, b will decrease its concession rate to see how the negotiation process proceeds with agent s3. On the other hand, when s1 exits the negotiation, it means b loses an option; thus, it has to increase its concession.

  2. (b)

    Adding and/or removing negotiation competitors: consider agent b negotiating with agents s1 and s2 as before, but now b perceives that a competitor c1 is negotiating with agent s1. Hence, b may concede by a larger amount to attract s1 to sell the resource. In contrast, if c1 leaves the market, b may concede by a smaller amount since it has no other competitors while negotiating with s1.

  3. (c)

    Change in the negotiation situation: if b is in a bad negotiation situation (a notion to be defined by a function when we explain the self factor \(S_t\)), then b has to concede more in order to secure the resource before the deadline. On the contrary, when in a good negotiation situation, b has a higher chance of winning the negotiation, which in turn leads to b reducing its concession rate.

Now, we will explain the self and environment factors \(S_t\) and \(E_t\).

4.1.1 Self factor

Given a concurrent negotiation setting, the self factor is an indication of how well the agent is doing at a specific time. In Conan, we address this point heuristically by studying a self factor \(S_t \in [0,1]\) that depends on how many reserved offers (\(\mathrm{RO}\)) the agent has made so far, the agent’s negotiation situation (\(\mathrm{NS}\)) representing progress with all negotiation threads (i.e. how well opponents respond to our offers and how opponents concede over time), the effect of the passage of time (\(\mathrm{TE}\)) and the eagerness (\(\mathrm{EG}_b\)) of our buyer agent to obtain the item under negotiation. We compute the self factor as:

$$\begin{aligned} S_t=\left( \frac{1}{\mathrm{RO}+1} + \mathrm{NS} + \mathrm{TE} + \mathrm{EG}_b\right) / 4 \end{aligned}$$
(4)

As we expect each sub-factor \(\mathrm{RO}\), \(\mathrm{NS}\), \(\mathrm{TE}\), \(\mathrm{EG}_b\) \(\in [0,1]\), we divide the final sum by 4 to ensure that the value for \(S_t\in [0,1]\). We are now in the position to explain each sub-factor in detail.

  • \(\mathrm{RO}\): is the number of reserved offers, obtained from the set of reserved offers \(\mathrm{RO}_t\) at time t and evaluated as \(|\mathrm{RO}_t|\). \(\mathrm{RO}_t\) (and as result \(\mathrm{RO}\)) is updated every time the buyer reserves an offer or cancels a reservation. If the number of reserved offers increases, then the agent decreases the concession rate for generated offers. We normalise \(\mathrm{RO} \in [0,1]\).

  • \(\mathrm{NS}\): is the negotiation situation for all threads. The evaluation of the negotiation situation in each thread is used to decide the next offer for each thread. It is calculated using the following two criteria: Opponent’s response time and Concession rate. These criteria are based on combinations of time-dependent factors (for the opponent’s response time) and behaviour-dependent factors (for the opponent’s concession rate), which are inspired by the works described in [18, 20].

Opponent’s response time is the time taken for the opponent to make a counter-offer given the time the buyer b has available to negotiate. More specifically, given an offer from the buyer b to seller \(s_i\) at time \(t_l\), denoted as \(\delta ^{t_l}_{ b\rightarrow s_i}\), and a response to that offer from \(s_i\) to b at time \(t_m\) (\(t_m > t_l\)), denoted as \(\delta ^{t_m}_{s_i\rightarrow b}\), then Opponent’s response time is as described below:

$$\begin{aligned} Opponent ' s \, response \, time = (t_m - t_l) / T_b \end{aligned}$$
(5)

Opponent’s response time \(\in (0, 1]\). The actual value for the response time of an opponent during negotiation classifies opponents into three distinct classes: compatible, moderately compatible or incompatible. Each opponent class is obtained by dividing the possible values of the Opponent’s response time (0, 1] into three equal sub-intervals shown below:

  • compatible \(\in (0, \frac{1}{3}]\);

  • moderately compatible \(\in (\frac{1}{3}, \frac{2}{3}]\);

  • incompatible \(\in (\frac{2}{3},1]\).

The smaller the response time, the more compatible an opponent is. These three notions of compatibility are used here as a heuristic way to classify the behaviour of opponent responses in relation to the expectations of the buyer agent. The intuition is that this classification is sufficiently discriminating if compared with the bipolar alternative compatible/incompatible, while at the same time it is equally simple to explain and has the added advantages of distinguishing behaviours that lie in the middle.

Opponent’s concession rate is the rate of change between two consecutive offers of the opponent at time \(t=t_k\) over the price range the agent has to negotiate. This is based on a window of consecutive opponent offers \(\phi > 1\), containing first the last offer made to b by \(s_i\), denoted as \(\mathrm{offer}^{t_{k-1}}_{s_i\rightarrow b}\), and ending with \(\mathrm{offer}^{t_{k - \phi }}_{s_i\rightarrow b}\). We can then compute the opponent’s concession rate as follows:

$$\begin{aligned} Opponent ' s \, concession ~ rate \ (\phi )= 1 - \Bigg (\frac{\mathrm{offer}^{t_{k -\phi }}_{s_i\rightarrow b}-\mathrm{offer}^{t_{k-1}}_{s_i\rightarrow b}}{\mathrm{RP}_b-\mathrm{IP}_b}\Bigg ) \end{aligned}$$
(6)

For instance, if \(\phi =3\) and the previous consecutive opponent offers of \(s_i\) to b at time \(t_k\) are 770, 790 and 800, then the concession will be \((800 - 770)/ 3\). We then normalise the Opponent’s concession rate to a real number \(\in [0, 1]\) and map the result to incompatible, moderately compatible and compatible, as with the Opponent’s response time.

From these two criteria, our goal is to derive a local situation \(\sigma _i\) for each negotiation thread i (where i represents the negotiation of buyer b with seller \(s_i\)), which also takes the values incompatible, moderately compatible and compatible. To compute \(\sigma _i\), we have chosen the multi-criteria decision-making process known as the Borda method [11], giving a value to every qualitative decision based on its position, as in Table 2. We select this method because it: (1) supports proportional representation of the situation of each thread; (2) is neutral and monotonic [10]; (3) produces a complete, transitive ranking for a set of alternatives [17]; (4) preserves the decisiveness of labels (incompatible, moderately compatible and compatible) [25]; (5) uses straightforward calculation; and (6) employs feasible computation.

Table 2 Mapping procedure using the Borda method

Then \(\sigma _{i}\) sums up the scores Opponent’s response time and Opponent’s concession rate:

$$\begin{aligned} \sigma _{i} = { Opponent's}\,{ response}\,{ time} + { Opponent's}\,{ concession}\,{ rate}. \end{aligned}$$
(7)

Given the score combinations of Opponent’s response time and Opponent’s concession rate, the domain of \(\sigma _{i}\) is [2, 6], with 2 implying that the opponent is compatible with our style of negotiation and 6 implying that the opponent is incompatible. We then derive the global negotiation situation \(\sigma \) for buyer b, which we calculate as the sum of all the local negotiation situations \(\sigma _i\), as shown in Eq. (8). However, \(\sigma \) needs to be normalised, so that we can use the normalised value to generate the concession rate. So \(\mathrm{NS}\) is the normalised global negotiation situation given by Eq. 9, where 2 and 6 are the minimum and maximum values that a thread can score, respectively, while y is the number of sellers \(y = |\mathrm{Seller}_b^t|\).

$$\begin{aligned} \sigma= & {} \sum _{i=1}^{y} \sigma _{i} \end{aligned}$$
(8)
$$\begin{aligned} \mathrm{NS}= & {} \frac{\sigma -(2*y)}{(6*y)-(2*y)} = \frac{\sigma -(2*y)}{4*y} \end{aligned}$$
(9)

\(\mathrm{TE}\): is the effect of the passage of time on the concession rate for buyer b. We use Eq. 10 to normalise \(\mathrm{TE}\) between [0, 1].

$$\begin{aligned} \mathrm{TE}= \frac{t-T_s}{T_b} \end{aligned}$$
(10)

\(\mathrm{EG}_{b}\): is the eagerness of buyer b to obtain the resource and is specified by the user. Eagerness means the degree of desirability of an item by the buyer. In practice, if the eagerness of the buyer for an item sold by a seller is higher than that sold by another, then the buyer is willing to concede more on the price of the more desired item, not only because of the deadline. As discussed in Sect. 3, \(\mathrm{EG}_{b}\) is between [0, 1].

4.1.2 Environment factor

The environment factor allows a buyer agent using Conan to answer the question of how suitable is the environment to the agent at a given time. As with the self factor, the environment factor is a heuristic originating from the social sciences based on the negotiation power principle, which gives more power to the buyer to explore different alternatives when negotiating with many opponents [22]. Given a dynamic market environment, the environment factor \(E_t \in [0,1]\) depends on the number of sellers (\(\mathrm{Se}_t\)) that are available, the number of competitor agents (\(C_t\)) that are after our item and the demand/supply ratio (\(R_\mathrm{ds}\)). We compute the environment factor as:

$$\begin{aligned} E_t = \left( \frac{1}{\mathrm{Se}_t} + C_t + R_\mathrm{ds}\right) / 3 \end{aligned}$$
(11)

As we expect the sub-factors \(\mathrm{Se}_t\), \(C_t\), \(R_\mathrm{ds}\) \(\in [0,1]\), then we need to divide the sum of all sub-factors by 3 to obtain a value for \(E_t\in [0,1]\). We explain next each sub-factor.

  • \(\mathrm{Se}_t\): is the number of sellers that are actively negotiating with b. The value of \(1/\mathrm{Se}_t \in (0,1]\) (see case 5(a)).

  • \(C_t\): is the number of active competitor agents, other buyers who are trying to make a deal with the sellers negotiating with b for the same resource. We assume that this number is obtained from the market. The lower the number of competitors during negotiations, the higher the possibility of b reaching an agreement by conceding less [see case 5(b)]. We normalise the value of \(C_t\) between [0, 1].

  • \(R_\mathrm{ds}\): is the demand/supply ratio calculated as the ratio of the number of buyers to the number of sellers. The higher the ratio, the higher the price for the resource and, hence, the more difficult it is to reach an agreement. There are many e-markets in the literature that provide demand/supply ratios [55], including Tete-a-Tete, Kasbah, AuctionBot and the Fisher market. If the e-market does not provide the agents with the demand/supply ratio, then it can be calculated by dividing the number of sellers by the number of competitors. For instance, E-Bay provides the user with the number of buyers that follow specific item (number of competitors). We normalise the value of \(R_\mathrm{ds}\) between [0, 1].

4.1.3 Assigning weights to factors

We develop a method to assign weights to the environment and self factors presented earlier. First, in Eq. 12, we assign a weight to the self factor \(S_t\) depending on the value of the factor itself. Then, in Eq. 13, we calculate the impact of the price range \(\mathrm{DM}\). The functions below present this method. We classify the value of \(S_t\) as low, medium or high, where:

  • low \(\in (0, \frac{1}{3}]\);

  • medium \(\in (\frac{1}{3}, \frac{2}{3}]\);

  • high \(\in (\frac{2}{3},1]\).

The reason for the classification as low, medium or high is the same as that for choosing to classify as incompatible, moderately compatible or compatible in Sect. 4.1.1. The buyer always looks for its self factor \(S_t\) first, to see how satisfied it is with the current progress of the negotiation. If the value of \(S_t\) is low, it means that the buyer only by looking on this factor will concede less, which implies that the buyer is in a good negotiation position; focusing on \(S_t\) only, b will concede only by a small amount. For this reason, b must put more weight on the self factor, as it suggests that b should concede very little, which in turn implies that the environment factor should be assigned a lower weight. As a result, a high weight will be assigned to \(w_{\mathrm{Self}_{t}}\). This is captured by Eq. 12, where if the value of \(S_t\) is low, we multiply by 0.75 to make it more important; if the value of \(S_t\) is medium, by 0.5 to make it indifferent; and if the value of \(S_t\) is high, by 0.25 to make it less important. In this way, \(w_{\mathrm{Self}_{t}}\) is adjusted according to the negotiation progress.

$$\begin{aligned} w_{\mathrm{Self}_{t}} = \left\{ \begin{array}{ll} \mathrm{DM} * 0.75,&{} \quad \text {if } S_{t}= \text {low } \\ \mathrm{DM} * 0.5,&{} \quad \text {if } S_{t}= \text {medium }\\ \mathrm{DM} * 0.25,&{} \quad \text {if } S_{t}= \text {high } \end{array} \right. \end{aligned}$$
(12)

We also need to cater for the situation where the initial price of the seller is far away from the reservation price of the buyer. In this case, even if \(S_t\) is low, b will need to concede more, in order to close the price gap in fewer negotiation cycles. We therefore introduce a price distance multiplier \(\mathrm{DM}\) in Eq. 12, which measures the negotiation gap between b and \(s_i\) and is defined by Eq. 13.

$$\begin{aligned} \mathrm{DM} = (\mathrm{IP}_{s_{i}}/\mathrm{RP}_{b}) * \mathrm{TE} \end{aligned}$$
(13)

The ratio estimated by the initial price of the seller \(\mathrm{IP}_{s_{i}}\) and the reservation price of the buyer \(R_{b}\) is multiplied by \(\mathrm{TE}\), which is the effect of time passage, to remove a small percentage from the distance. The domain of \(\mathrm{TE}\) is within the interval [0, 1]. We are now in a position to evaluate the weight of the environment factor, as shown in Eq. 14.

$$\begin{aligned} w_{\mathrm{Env}_{t}}=1-w_{\mathrm{Self}_{t}} \end{aligned}$$
(14)

4.2 Cancellation penalty

The cancellation penalty is important to ensure fairness between negotiation participants, to save money and resources and to allow the system to be stable in terms of avoiding unwanted cancelled deals.

The penalty amount needs to be determined by the agent’s user before the start of the negotiation in order to decide when to cancel from an offer. \(\mathrm{Penalty}_b\) represents the total amount of penalties for all the reserved offers. Thus, if the agent, during negotiation, finds a better offer, then the agent will cancel from the least preferred offer (i.e. one cancel only) and reqToReserve to the better offer. At the end of the deadline, the buyer accepts its most preferred reserved offer and cancels the rest of the reserved offers. The buyer keeps an \(\mathrm{MRO}\) (maximum number of reserved offers), which when it is reached the buyer cannot reserve any further offer.

The penalty the buyer has to pay \(\mathrm{Penalty}_b\) is calculated as the difference between the amount of money the buyer has to pay to cancel any reserved offers and the amount of money the buyer has already received from opponents that cancelled their reservations with the buyer, i.e.:

$$\begin{aligned} \mathrm{Penalty}_{b,s_{i}}= & {} \mathrm{Penalty}_{s_{i},b} = D * \mathrm{Price}_\mathrm{reserved} \end{aligned}$$
(15)
$$\begin{aligned} \mathrm{Penalty}_b= & {} (\mathrm{Penalty}_{b,s_{i}} * \mathrm{RO}) - (\mathrm{Penalty}_{s_{i},b} * \mathrm{RO}_{\mathrm{Seller}_b^t}) \end{aligned}$$
(16)

where \(\mathrm{Penalty}_{b,s_{i}}\) is the penalty paid by b to cancel an offer made to seller \(s_i\), \(\mathrm{Penalty}_{s_{i},b}\) is the penalty paid by \(s_i\) to cancel an offer made by b, D is the percentage to be paid from the cancelled offer \(\in [0-1]\), \(\mathrm{RO}\) is the number of reserved offers so far, and \(\mathrm{RO}_{\mathrm{Seller}_b^t}\) is the number of cancelled offers that occur from sellers \(\mathrm{Seller}_b^t\) negotiating with b.

The reason for choosing D (a percentage of the reserved offer) is for fair comparison with the benchmark, viz., the strategy used by Williams et al.  [63].

4.3 Heuristics to decide actions

In order for the buyer b to decide which action to choose at each point in time, we developed a rule-based strategy. We start with the reqToReserve and cancel and then discuss the accept conditions. In these rules, we will use three parameters determined by the user: (a) \(\mathrm{MRO}\)—this is the maximum number of reserved offers; (b) D—this is the percentage of the deal for cancelling from negotiation threads (more details in Sect. 4.2); and (c) \(\mathrm{MAN}\)—this is the maximum number of active threads (ensures that the agent has limited computational resources).

4.3.1 Conditions to request to reserve

Agent b reqToReserve to \(s_i\) when:

  • offer \(\delta ^{t}_{s_i\rightarrow b} + \mathrm{Penalty}_b \le \delta ^{t+1}_{b\rightarrow s_i}\) < any reserved offers and b is not near the deadline.

  • \(s_i\) accepts the last offer from b and b’s last proposed offer + \(\mathrm{Penalty}_b\) \(\le \) any reserved offers and b is not near the deadline.

4.3.2 Conditions to cancel

Agent b cancels from:

  • the least preferred offer when \(\mathrm{RO} \ge \mathrm{MRO}\);

  • all reserved offers immediately after b has accepted an offer.

4.3.3 Conditions to accept

Agent b accepts:

  • \(\delta ^{t}_{s_i\rightarrow b}\) when offer \(\delta ^{t}_{s_i\rightarrow b} + \mathrm{Penalty}_b \le \delta ^{t+1}_{b\rightarrow s_i}<\) any reserved offers and b is nearing the deadline.

  • \(s_i\)’s acceptance when \(s_i\) accepted the last offer from b and b’s last proposed offer + \(\mathrm{Penalty}_b\) \(\le \) any reserved offers and b is nearing the deadline.

  • the highest utility reserved offer when b is nearing the deadline \(t=T_e\).

4.3.4 Conditions to confirm

Agent b confirm:

  • \(s_i\)’s acceptance when \(s_i\) accepted the last offer from b and b’s last proposed offer + \(\mathrm{Penalty}_b\) \(\le \) any reserved offers and b is nearing the deadline.

4.3.5 Conditions to exit

We study next the conditions that the buyer b needs to check before exiting from an individual thread or from the negotiation as a whole.

Conditions to exit from a thread

Agent b exits from:

  • any thread if the deadline has been reached, i.e. \(t=T_e\);

  • \(\sigma _{i}\) if seller \(s_i\) is incompatible (i.e. based on the evaluation \(\mathrm{NS}\) mentioned above) and there is a new seller \(s_j\) and \(|\mathrm{Seller}_b^t|=\mathrm{MAN}\) (which means that the current opponent is incompatible and the buyer has a new opponent to negotiate with, but due to the limited number of active negotiations, the agent has to terminate the negotiation with the incompatible opponent).

  • \(\sigma _{i}\) if \(s_i\) has accepted an offer, but a higher utility offer has been accepted at the same time by \(s_j\) (which means that the buyer receives two accept actions from two different sellers at the same time, leading the buyer to choose the highest utility offer from \(s_j\) and exit from \(s_i\)).

Conditions to exit from the whole negotiation

Agent b exits from the whole of the negotiation

  • if the deadline has been reached, i.e. \(t=T_e\).

  • if \(\delta ^{t}_{b\rightarrow s_i}\) = accept.

4.4 Representation of the strategy

In the context of our implementation discussed in Sect. 3.6, we represent the state of an agent during negotiation as a temporal logic program implemented in Prolog, which we assume the reader is familiar with. Prolog uses the convention that a constant is an identifier starting with a lower-case letter, while a variable is an identifier starting with a upper-case letter. A constant stands for a specific entity, and different constants stand for different entities. On the other hand, a variable can stand for any entity, and different variables can stand for the same entity. The predicate name must be a constant, while each argument can either be a constant or a variable. A Prolog program can be used to answer queries, or to achieve goals, the prompt “?-” denotes a query whose truth will be evaluated by the program.

The temporal part of the logic is represented using the Event Calculus (EC) [35], a logical language for reasoning about actions and their effects. We will use a dialect of the EC based on temporal variables known as multi-valued fluents due to the fact that these variables can take different values at different times. At a time T a fluent is represented in the form F=V to denote that it holds at T that F has value V [8]. For instance, to query that the initial price of specific laptop, say laptop12, costs 1200 at the current time 15, we write:

figure e

The state of the agent will generally contain many other fluents to represent information about negotiation threads, information about other agents and knowledge about the environment.

From a representation of state such as the one above, we develop a strategy as an action selection policy whose specification is defined by Prolog rules of the form:

figure f

An example of a part of the strategy is shown in Listing 1, showing how the buyer b can reason about which action to select next.

figure g

Listing 1 assumes that the buyer has a goal to buy an item from the market, and has to identify a seller Si, and create a thread of negotiation with that seller. This is done by the buyer creating a sub-goal to buy the Item from Si using the concurrent alternating offers protocol ao. The predicate representing the head of the rule

figure h

(line 1) returns an offer (represented by the variable Offer) for the agent’s goal (buyfrom(Si, Item, ao)), seller (Si) and item under negotiation (Item). This predicate calls another predicate that acts as a function to generate the counter-offer at line 6 after checking the buyer’s deadline. The predicate call

figure i

(line 8) generates the counteroffer at time T for seller (Si) based on the initial price (Min), reservation price (Max) and concession rate (CA). The predicate call

figure j

(line 9) calculates the concession rate (CA) at time T for seller (Si) according to Eq. 3. The remaining of the strategy is defined with similar rules that are called from within a declarative agent cycle. Details on how the strategy is incorporated in the buyer’s agent cycle are beyond the scope of this paper and the interested reader is referred to our previous work [3], where this issue has been discussed.

5 Empirical evaluation

In the previous sections we have presented the background that motivated the development of Conan and the model itself. This section presents the experimental results of the work and their evaluation. We proceed as follows. We outline first our negotiation simulation framework Recon  [2], which implements the negotiation agents and facilitates the way we have generated our experimental results. We present two large experiments to test Conan. We report their results and evaluate their significance.

5.1 Experiment 1: comparing CONAN with the state-of-the-art

The aim of this experiment is to determine whether Conan can outperform the state-of-the-art and if so, under what settings. This section starts by explaining the experimental setting in detail, which includes the choice of opponents, the benchmark strategy that we are comparing our work with, the parameters of the proposed experiments, and the performance measurements characterising our results. After that, we report on the experimental results.

5.1.1 E-market setting

As discussed in Sect. 3.6, we implemented Conan agents using the architecture proposed in [3]. Interaction between Conan and other agents is handled using inter-agent communication. As explained in Sect. 3, Recon supports both the imperative definition of agent strategies in Java and the declarative specification of logic-based strategies in Prolog, within the same environment. This feature is important, because we can develop simulations with Conan developed declaratively and then compete with imperative agents, as those currently representing the state-of-the-art in the negotiation literature, without any difficulties. In addition, Recon supports concurrent negotiation which, at the time of our experimentation effort, other existing negotiation platforms did not support [42].

Choice of opponents For our evaluation, we developed extended versions of Faratin’s  [18] and ANAC 2011 (International Automated Negotiating Agents Competition) [24] strategies that allow opponents (sellers) to concurrently negotiate with different buyers. In the simulation, we divide these strategies into three groups:

  • The first group, Faratin’s Sellers [18], contains three combinations of strategies which we combine into one group. The first combination is called time-dependent strategies, where the value of the counter-offers and the acceptance value for the offers depend on the remaining negotiation time. The second and third combinations are the resource-dependent and behaviour-dependent strategies. In resource-dependent strategies, sellers concede more rapidly as the quantity of resources becomes limited. As in Faratin et al.  [18], resources are the number of buyers participating in the negotiation and the negotiation time. In behaviour-dependent strategies, the seller imitates the behaviour of the buyers, so the seller will compute the next offer based on the previous attitude of the buyer.

  • The second group is the ANAC Sellers [24]. We chose this group for fair comparison with the benchmark strategy, which we will describe once we have completed this discussion on opponents. In addition, the ANAC competition provides the most recent and most competitive state-of-the-art practical negotiation agents. It is important to note that in our experiments we noticed, in the 10% agreement zone (for more details see Sect. 3), that three ANAC sellers were accepting offers with negative utility for them. This was due to the fact that ANAC sellers always assume that there will be a 100% agreement zone between them and the buyers. To make the comparison fair with ANAC sellers in the 10% agreement zone, we adjusted their strategies to offer or accept only when they obtained positive utility.

  • The third group, All Sellers, is a combination of all Faratin’s Sellers and ANAC Sellers.

Each seller has two fixed private deadlines. The first is the deadline for how long it will be staying in the market. The second is the deadline related to the negotiation with each buyer. When a seller starts to negotiate with each buyer, it assigns a fixed duration for the negotiation, for instance, three minutes. Hence, when the seller starts a new negotiation with another buyer, it negotiates for a fixed time, three minutes only. The seller will exit the market if (a) the seller sells all its resources, or (b) the seller reaches the deadline for staying in the market, or (c) the market forces the sellers to exit their negotiations to alter the demand/supply ratio of the market.

Benchmark strategy We will use the state-of-the-art strategy developed by Williams et al. [63] as the benchmark strategy. Williams et al. also validated their strategy empirically. The opponents’ strategy was obtained from ANAC Sellers only. They compare their strategy with two other strategies: a random strategy and Nguyen and Jennings’ strategy [47]. Their results show that their strategy outperforms both. Here, we compare Conan with the second version of their strategy, where all negotiation threads have the same utility (see Sect. 2) and this is due to our assumption that all the sellers provide the same identical resource to be negotiated. We set the simulation parameters based on Williams et al. settings in their empirical evaluation in terms of public deadline, break-off probability and penalties.

Simulation parameters The market allows the agents to enter and leave the market at any time. The market simulates the entrance and the exit of the opponents. For each buyer, the market announces the arrival of new sellers and the total number of buyers. The time of the announcement is based on the parameter Market update time as shown in Table 4. Table 3 illustrates the variables used in a simulation. The number of new sellers and buyers added will depend on the parameters Change in market population and Change in demand/supply ratio. For instance, if the change rate in market population is 8 and the demand/supply ratio is 1:1, and the market situation has 5 buyers and 7 sellers, then the market terminates 1 buyer and 3 sellers to reach the correct desired change in the market. These terminated buyers and sellers send an exit action to their opponents.

Table 3 Simulation parameters
Table 4 Simulation parameter values

Performance measurements

To evaluate the performance of Conan and the state-of-the-art strategy, we use the following performance measurements for a buyer agent.

  • Average utility [18, 47, 63]: is the sum of all average utilities of that agent \(U_\mathrm{ag}(x)\) (see Eq. 1) over the number of negotiation runs N:

    $$\begin{aligned} \text {Average utility} = \frac{\sum _{i=1}^{N} U^i_\mathrm{ag}(x)}{N} \end{aligned}$$
    (17)
  • Percentage of successful negotiations is the total number of negotiation runs where the agent reaches an agreement with one of the sellers over all runs.

    $$\begin{aligned} \text {Percentage of successful negotiations }= \frac{ \text {number of successful runs}}{N} *100 \end{aligned}$$
    (18)
  • Average number of rounds [30]: is the average number of negotiation rounds until the accept agreement is reached.

    $$\begin{aligned} \text {Average number of rounds}= \frac{ \text {number of rounds per run}}{N} \end{aligned}$$
    (19)

Experimental set-up In our evaluation, we allow the demand/supply ratio and market population to change during negotiation to create more realistic settings. In addition, as indicated in Table 4, we set the eagerness for all threads to 0.5, the deadline to long \([151s -210s]\) and the market update time to 10s. We set the same deadline for all buyers and sellers in the e-market for fair comparison with the benchmark strategy since it assumes public deadlines. (Both buyers and sellers have identical deadlines.) To test that Conan negotiates using private deadlines, Conan does not know that buyers and sellers have the same deadline. Also, as we explained in Sect. 4, we set the \(\mathrm{MRO}= 1\) and \(\mathrm{MAN} = 100\). We also set the penalty \(D = 0.1\) for fair comparison with Williams’s et al.  [63] strategy.

Similarly, we choose three ranges for the sellers’ initial price and reservation price to create three different agreement zones: 10, 60 and 100%. The reason for this choice is to simulate real-life situations, where the buyer does not know the price range of the sellers. Conan and the benchmark agent run concurrently within the same simulation as competitors, while in the literature the buyer agents run separately in individual simulations. We run three agents for each strategy (state-of-the-art and Conan) and run the simulation 100 times for 81 different settings (3 different demand/supply ratio values * 3 different market population values * 3 price ranges * 3 opponent groups) = 8100 runs. We report the average utilities, percentage of successful negotiations and average number of negotiation rounds for those simulations. While some authors have not applied any statistical significance measures [7, 33, 47], we adopt the standard statistical analysis approaches that has been used in the negotiation literature [63]. We conduct t-tests to report the statistical significance between the utilities of Conan and the state-of-the-art for each setting and use ANOVA test (analysis of variance) to calculate the statistical significance of the resulting utility between each price range for each group of sellers.

Fig. 6
figure 6

Average utility for Faratin’s Sellers

5.1.2 Results

The results for Faratin’s Sellers are shown in Fig. 6. In the top plot, market population is kept at average and we change the rate for the demand/supply ratio. In the bottom plot, the rate for the demand/supply ratio is kept at average and we change the market population. It can be seen from the plots and from the p value < 0.05 that our strategy outperforms the state-of-the-art according to experimental evidence that is statistically significant. However, the utilities of Conan and the state-of-the-art are stable during the changes in the demand/supply ratio among the price ranges. On the other hand, the utility of Conan increases when the market population increases, while the utility of the state-of-the-art decreases when the market population increases. We can observe from Fig. 6 that:

  • CONAN is independent from price range changes Conan obtains a significantly higher utility than Williams et al. when negotiating in a high population market and a 60% agreement zone with Faratin’s Sellers. The most likely explanation for this result is that Conan takes into consideration that the price ranges between buyers and sellers may differ.

Fig. 7
figure 7

Average utility for ANAC Sellers

The next result is about negotiating with ANAC Sellers and is shown in Fig. 7. Our strategy still outperforms the state-of-the-art (\(p<0.05\)). The top plots indicate that each strategy gains the same amount of utility when the rate of demand/supply ratio changes, while in the bottom plots, when the market population is average, Conan has the highest utility. We can observe from Fig. 7 that:

  • CONAN has high utility in 60% agreement zone: Conan obtains significantly higher utility than Williams et al. when negotiating in an average population market and 60% agreement zone with ANAC Sellers. We observed in the experiments that ANAC Sellers cannot cope with high numbers of agents in the market, so as the number of agents rises, they become slower in producing conceded offers. As a result Conan accepts offers from the ANAC Sellers before the deadline since the utility of the offers \(>0\).

Fig. 8
figure 8

Average utility for All Sellers

In the third group of sellers, All Sellers, the results are shown in Fig. 8. As before, Conan outperforms statistically significantly the state-of-the-art (\(p<0.05\)) within each setting and for each group of settings. Conan obtains the highest utility when the market population is high. We can observe from Fig. 8 that:

  • CONAN is seller independent Conan outperforms the state-of-the-art in all price ranges, demand/supply ratios and market densities when negotiating with Faratin’s Sellers, ANAC Sellers and All Sellers;

  • CONAN has high utility in average population Conan obtains significantly higher utility than Williams et al. when negotiating in an average population market and 60% agreement zone with All Sellers, as explained in observation 4.

Fig. 9
figure 9

Percentage of successful negotiations for All Sellers

Fig. 10
figure 10

Number of negotiation rounds for All Sellers

We can see from Fig. 9 that Conan gains more successful negotiations than Williams et al. in all the simulation settings when negotiating with All Sellers. The top plots indicate that each strategy gains the same percentage of successful runs when the rate of demand/supply ratio changes, while in the bottom plots, when the market population is high, Conan has the highest percentage of successful runs. We can observe from Fig. 9 that:

  • CONAN has high successful negotiations Conan outperforms the state-of-the-art in all price ranges, demand/supply ratios and market densities when negotiating with All Sellers, especially in the 10 and 60% agreement zones. This is because it is hard for any agent to get an agreement in the 10 and 60% zones, while it is easier to get an agreement in the 100% zone, since in the 100% agreement zone, the buyer’s initial price \(\ge \) the seller’s reservation price so the utility > 0.

We can see from Fig. 10 that Conan conducts less negotiation rounds compared to Williams et al. when negotiating with All Sellers. This is because Conan needs more time to think than Williams et al. . Thus, Conan talks less but has more time to think about the best agreement and the best opponent to negotiate with, which minimises the agent’s use of communication resources.

The number of rounds is stable (around an average of 100 rounds at low population) for every market population and every demand/supply ratio. On the other hand, Williams et al. have a high number of rounds when the market population is low and a low number of rounds when the e-market population is high.

  • CONAN conducts less negotiation rounds Conan conducts less negotiation rounds and has a stable number of negotiation rounds than the state-of-the-art in all price ranges, demand/supply ratios and market densities when negotiating with All Sellers;

The following is a general observation about the results presented in Figs. 6 7, 8, 9 and  10:

  • CONAN is market-independent Conan is a robust and adaptive strategy that is not affected by changes in the market population and/or in the demand/supply ratio in any price ranges. Experimental evaluation of the strategy produces statistically significant results in terms of average utility compared to the state-of-the-art Williams et al.  [63]. Conan also outperforms the state-of-the-art in terms of the percentage of successful negotiations, with fewer rounds. Conan also achieves the highest utility when the change in demand/supply ratio is low as a result of the market being less competitive, which increases Conan ’s opportunity to negotiate with more sellers. Also, in terms of market population, Conan obtains the highest utility when the market population is high. As a result of the number of sellers increasing, the buyer’s opportunity to get an agreement also increases.

5.2 Experiment 2: comparing CONAN with random and Faratin’s strategies

Given the results of Experiment 1, the aim for this experiment is to determine whether Conan can outperform other negotiation strategies. In this experiment, we will benchmark more strategies and we will keep the simulation parameters and the experimental set-up as in Experiment 1 (Sect. 5.1). The reason for the second experiment is to make the experimentation more credible and convincing by checking the behaviour of Conan compared to other, equally famous and competing strategies. We did not combine Experiments 1 and 2 because:

  • Experiment 1 considers many performance measurements comparing Conan with the key benchmark strategy of Williams etal. In Experiment 2 we focus on a general overview of Conan ’s performance; thus, it has only one performance measurement.

  • Experiment 1 makes sure that Conan outperforms Williams et al. by running 3 agents for each. In Experiment 2, since we have many buyers to assess, we cannot run more than one agent per strategy in order to keep the low population setting of the e-market as low as possible.

  • Experiment 1 tests the performance of Williams et al. and Conan with different groups of opponents (Faratin’s Sellers, ANAC Sellers and All Sellers) since Williams et al. is the key benchmark strategy. In Experiment 2, we need a general overview of Conan ’s performance since All Sellers represents both ANAC Sellers and Faratin’s Sellers.

5.2.1 E-market setting

As in Experiment 1, we use our negotiation simulator Recon with the following settings:

Choice of opponents For our evaluation, we use All Sellers strategies as in Experiment 1. The reason for choosing All Sellers is that it combines all the opponents that we have, to give a general overview of the performance of Conan.

Benchmark strategy We use the Random strategy, and the extended versions of Faratin’s strategies [18] as the benchmark buyers strategies, since there are no more concurrent strategies that fit to our benchmark selection criteria based on the background review in Sect. 2.

  • The Random strategy offer generation function is obtained from the GENIUS Footnote 3 platform. The offer is randomly generated and the utility of the offer should be \(>0\).

  • Faratin’s strategies: there are eight strategies proposed by Faratin et al.  [18]: Linear, Conceder, Boulware, Resource Dependent, Resource Time Dependent, Relative Tit-For-Tat, Random Absolute Tit-For-Tat and Average Tit-For-Tat.

To make a fair comparison of these existing strategies with Conan, we extended the Random and Faratin’s action selection methods to be able to use the concurrent negotiation actions, which are reqToReserve and cancel. More precisely, the action selection mechanisms for Random and Faratin’s agents have been implemented as follows. The action selection criteria for the offer, accept and exit actions are kept the same as in their original implementations. However, these original implementations are extended with reqToReserve and cancel actions as implemented in the Conan strategy so that these agents can make fair use of our concurrent negotiation protocol and to enable fair comparison with Conan.

Fig. 11
figure 11

Average utility for All Sellers with 10% agreement zone

Fig. 12
figure 12

Average utility for All Sellers with 60% agreement zone

5.2.2 Results

Figure 11 depicts the result of negotiation with All Sellers in the 10% agreement zone. As in Experiment 1, in the top plot, market population is kept average and we change the rate for the demand/supply ratio. In the bottom plot, the rate for the demand/supply ratio is kept average and we vary the market population.

It can be seen from the plots and from the p value < 0.05 that our strategy Conan outperforms the Random and Faratin’s strategies according to experimental evidence that is statistically significant. From the results we can say that the utility of Conan, Random and Faratin’s strategies is stable during changes in the demand/supply ratio. On the other hand, the utility of Conan increases when the market population increases, while the utility of Linear, Conceder, Boulware and Resource Time Dependent decreases when market population increases.

Figure 12 shows the result of negotiation with All Sellers in the 60% agreement zone. As in Fig. 11, Conan still outperforms the benchmark strategies (\(p<0.05\)) and the utility of Conan, Random and Faratin’s strategies is stable during changes in the demand/supply ratio. The utility of Conan increases when the market population increases, and the utility of Random and Faratin’s strategies increases when the market population increases.

Fig. 13
figure 13

Average utility for All Sellers with 100% agreement zone

Figure 13 shows the results for negotiation with All Sellers in the 100% agreement zone. As before, Conan outperforms significantly the benchmark strategies (\(p<0.05\)) and the utility of Conan, Random and Faratin’s strategies is stable during changes in the demand/supply ratio. The utility of Conan increases when the market population increases, and the utility of Random and Faratin’s strategies increases when the market population increases.

For this section we combine all the observations from Figs. 11, 12 and 13:

  • CONAN outperforms the Random and Faratin’s strategies Conan outperforms the Random and Faratin’s strategies in the 10, 60 and 100% price ranges, demand/supply ratios and market densities when negotiating with All Sellers;

  • CONAN has high utility in 10% agreement zone Conan obtains significantly higher utility than the Random and Faratin’s strategies (except Relative Tit-For-Tat) when negotiating in a high population market in the 10% agreement zone in Fig. 11. As with the Williams et al. strategy, this is because it is easy for any agent to negotiate in the 100% agreement zone and get a better agreement rather than negotiating in the 10% agreement zone, which is more difficult. Also, Conan obtains significantly higher utility than the Random and Faratin’s strategies when negotiating in a low population market in the 100% agreement zone in Fig. 13. This shows that Conan can perform well even in a simple market setting where the number of agents is low in a 100% agreement zone.

6 Conclusions

We have presented Conan, a novel heuristic strategy for concurrent negotiating agents for open, dynamic and complex negotiation settings. In this context, we have also discussed how to extend the well-known alternating offers protocol to support concurrent bilateral negotiations and how to use Conan to generate offers combining environment and self factors. We have also shown how Conan decides when to accept and reqToReserve offers and when not to further pursue a negotiation thread any more. We have implemented our negotiation framework using the Recon platform [2] to evaluate our work, and we have created realistic e-market experiments by varying parameters such as the market population and the demand/supply ratio. Moreover, we have relaxed most of the assumptions made in previous work, in particular how to deal with private deadlines and the idea of an open market where buyers and sellers may come and go at any time.

We have evaluated the performance of Conan in various settings and showed that it performs significantly better than the state-of-the-art [63], Random and Faratin’s strategies. We have also shown that Conan ’s utility is not affected by the changes in the market population, the demand/supply ratio and the price range, including all of them together. Also, our work contributes to the long tradition of practical agent technology in automated negotiation. As our experimental results indicate, users who adopt the CONAN strategy when negotiating (e.g. in e-Bay) will increase their profits compared to other users who adopt the current state-of-the-art strategy.

The extensive experiments that we have performed in this work validate our approach in the selected application domain of e-markets only. Still, the proposed model opens up new possibilities for developing more realistic and flexible models for negotiating agents, especially in areas where the metaphor of e-market [49] has a natural application. We mention here two areas where our contribution is generalisable: (i) service grids [56, 60]—where the price of a service by a provider is a single issue under bilateral negotiation with a service client to form contracts, and (ii) cloud computing [5]—where our protocol, buyer strategy and existing seller strategies can be used as a mechanism for allocation of a resource (e.g. a storage) using bilateral negotiations over a single issue (e.g. storage price). Also, the GOLEM agent platform [12] used to build our negotiation testbed RECON [2], has facilities that can include humans as avatars in the experiments (see [14] for an example of user immersed in an e-retailing agent environment), so our approach is generalisable for negotiations between humans and agents [41]. However, a detailed discussion of these issues is beyond the scope of this paper but covered by our plan for future work. This plan includes the following:

  • Develop an algorithm to save computation time by determining the most promising sellers as negotiation progresses, and gradually decrease the number of concurrent negotiation threads. This is due to the limited computing resources of the agent, which consume time to reply to all the sellers at the same time.

  • Expand our agent model to be able to negotiate over multiple e-markets. Our model only takes into consideration a single e-market. Considering multiple e-markets will facilitate obtaining the item under negotiation faster, and it may increase the agent’s utility, since the agent will negotiate in different e-markets at the same time. In this context, taking into account trust [44] in matching the buyer with different e-markets will also be beneficial.

  • Extend our agent model to be able to negotiate over multiple issues. Our current agent only considers one issue (i.e. price) rather than multiple issues (i.e. price, warranty, colour...). As a result of considering multiple issues we will be better positioned to simulate real-life applications, such as negotiating with users. For the simple case of independent issues handled sequentially can be addressed straightforwardly by applying the same approach on each issue—given suitable mapping functions for categorical variables. We will use the item’s multiple issues to represent different values of the buyer’s eagerness \(\mathrm{EG}_b\). For instance, the buyer will have high eagerness to buy a red laptop rather than a blue laptop. Eagerness will help the agent to differentiate between different items and then increase the agent’s chance of buying the desired item at a good price.

  • Develop a trader agent using Conan strategy where the agent acts both as a buyer and seller at the same time in a competitive market by expanding the heuristics for deciding actions (Sect. 4.3), in particular the reserve action. We will develop a social welfare metric to measure the agent’s utility which combines the buyer’s utility and the seller’s utility.

  • In our experiments so far all participants have been automated software agents. However, in practical applications not all people will choose agents to negotiate on their behalf. As a result, exploring human avatars in GOLEM to negotiate with software agents will be relevant and works such that of Rosenfeld et al.  [51] to provide an agent that is able to negotiate with agents and humans at the same time will be key. For this we will need to create a flexible negotiation language for an agent to interact with both artificial and human agents alike. In this context, we will also have to address the challenges of designing decision-making strategies that handle social dependencies to other agents, as in the Color Trails Game of Grosz et al. [26].