1 Introduction

A hedonic game [8] specifies for each player a payoff in each coalition, giving players preferences over coalitions. In particular, unlike in games in characteristic function form, there is no competition over payoffs within any coalition: If a coalition forms, each player’s payoff is determined. When solving a hedonic game, one is, hence, only interested in how players partition into coalitions. Arguably, the most prominent solution is the set of core stable partitions: In such a partition no set of players can increase their payoffs by forming a new coalition. Unfortunately, despite their rather simple structure, hedonic games might not have core stable partitions. [4] and [2] provide sufficient conditions for the nonemptiness of the core; [12] provides a condition which is both necessary and sufficient and very similar to the balancedness condition by [16] and [5].

The idea of the core, namely that there be no profitable formation of new coalitions, is quite intriguing; yet, it requires the assumption of players’ naivety: When a coalition considers forming, all its members compare their current payoff with their payoff after forming. No attention is paid to future deviations by some of its members, and no attention is paid to other coalitions that might leave the status quo. That is, players are myopic.

The first investigation of farsighted players in hedonic games was provided by [7], who used the more general models of [9] and [6]. Although their analysis captures players’ rationality in that they can anticipate the consequences of their deviating, there are two potentially problematic assumptions: First, it is assumed that whenever a player deviates from a coalition, the remainder of this coalition stays intact. Second, players compare payoffs from all reasonable future deviations to the status quo. So, they might act together even though their final goals differ (see, for instance, [3]), and they do not take into account that other coalitions might move preemptively. The result is a solution which always exists, but which is too permissive and which does not account for the full rationality of players.

In this paper, we introduce a different way to talk about farsighted players in hedonic games, which is based on [15] and [13]. For that purpose, we translate a hedonic game into an abstract game. This abstract game considers the set of partitions as its state space and specifies, for any partition and coalition, which new partition emerges if this coalition forms. We provide four axioms for this specification that ensure that all players have the same unique expectation about potential moves among partitions.

The abstract game describes a coalition formation game similar to that of [14], where coalitions are endowed with strategies (behaviors) that specify at each partition whether or not to form (if they are not already part of that partition). A behavior profile, thus, defines transitions among partitions, which in turn define a Markov process. The stationary distribution of such a process determines how much time is spent in each partition. Thus, the payoff from any behavior profile is a weighted average of payoffs, where the weights are given by the relative time spent in each partition.

In a weak equilibrium, each coalition behaves optimally in each partition \(\pi \), given the behavior of all coalitions (including itself) at all other partitions, and the behavior of all other coalitions at \(\pi \). Thus, a weak equilibrium is stable with respect to one-shot deviations. In contrast to [14] we allow coalitions to play mixed strategies; in fact, we restrict our analysis to strategies that play each pure behavior with some small but positive probability \(\varepsilon >0\), taking into account the possibility of mistakes. This ensures that there is a path of moves between any two partitions, so that any partition is reached with positive probability. Thus, any optimality condition in equilibrium applies to all partitions in which a coalition might have to decide whether or not to form.

The main result of our paper is that for every \(\varepsilon >0\) a weak equilibrium exists. The mathematical difficulty in showing this result is that players’ payoff functions are not linear in the probability with which each behavior is being played. Thus, showing that the set of best replies is convex (as it is in normal form games) is difficult. (If we allow for deviations that are not one-shot, then the set of best replies is, in fact, not convex.) But once convexity is proven, obtaining the result is straightforward.

The remainder of the paper is structured as follows: In Sect. 2, we introduce the necessary notation, recall the definition of hedonic games and introduce hedonic coalition formation games that are a special class of abstract games. In Sect. 3 we introduce coalition behaviors, translate them into transition matrices and derive the relevant payoff functions. Section 4 introduces the equilibrium and proves its existence. We close the paper with Sect. 5 where we show that weak equilibria are not necessarily stable with respect to arbitrary deviations.

2 Preliminaries

2.1 Hedonic Games

Let N be a finite set of players. Subsets \(S\subseteq N\) are called coalitions. For \(S\subseteq N\) write \(2^S\) for the set of subsets of S, and P(S) for the set of nonempty subsets. A partition is a collection \(\pi =\left\{ S^1,\ldots ,S^m\right\} \) of nonempty coalitions such that \(\bigcup _{k=1}^mS^k=N\) and \(S^k\cap S^l=\emptyset \) for all \(k\ne l\); the set of all partitions is denoted by \(\Pi \). For \(i\in N\) and a partition \(\pi \) we write \(\pi (i)\) for the unique element of \(\pi \) that contains i. A hedonic game is a map v that maps each nonempty coalition S to some \(v(S)\in {\mathbb {R}}^S\). That is, a hedonic game is a cooperative game such that each player’s payoff in each coalition is uniquely determined: There is no negotiation over payoffs within coalitions whatsoever. For any hedonic game v, we define the map \(V:\Pi \rightarrow {\mathbb {R}}^N\) by \(V_i\left( \pi \right) =v_i\left( \pi (i)\right) \). That is, \(V\left( \pi \right) \in {\mathbb {R}}^N\) is the payoff vector if partition \(\pi \) forms.

Arguably, the most prominent solution of a hedonic game is the set of core partitions: Those partitions for which no coalition has an incentive to deviate. To make this precise, we say that a partition \(\pi \) is dominated via S if \(v_i(S)>V_i(\pi )\) for all \(i\in S\). The core is the set of undominated partitions.Footnote 1

Example 2.1

(The roommate problem) There are three players who have to decide which of them will move in together in a two-bedroom flat. They have somewhat conflicting interests: While everybody dislikes squeezing into a two-bedroom flat with three people, 1 prefers to move in with 2 over moving in with 3 over staying alone; 2 prefers moving in with 3 over moving in with 1 over staying alone; and 3 prefers moving in with 1 over moving in with 2 over staying alone. Suppose that the payoff from staying alone is 0, from moving into an overcrowded place \(-1\), from getting the preferred roommate 4, and from getting the other roommate \(a\in \left( 0,4\right) \). This game does not have a core stable outcome: Surely, neither the partition into singletons nor the partition that only contains the grand coalition is core stable. But neither are the others: Whenever two players have formed a coalition, one of the two has an incentive to form a new one with the outside player.Footnote 2\(\square \)
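
The emptiness of the core in this example can also be checked by brute force. The following Python sketch (our illustration, not part of the original analysis) enumerates the five partitions of \(N=\{1,2,3\}\) and tests each against every coalition; the value \(a=2\) is an arbitrary choice from \(\left( 0,4\right) \).

```python
# Brute-force check that the roommate problem of Example 2.1 has an empty core.
a = 2  # any value in (0, 4) leads to the same conclusion
v = {
    frozenset({1}): {1: 0}, frozenset({2}): {2: 0}, frozenset({3}): {3: 0},
    frozenset({1, 2}): {1: 4, 2: a},      # 1's preferred roommate is 2
    frozenset({2, 3}): {2: 4, 3: a},      # 2's preferred roommate is 3
    frozenset({1, 3}): {3: 4, 1: a},      # 3's preferred roommate is 1
    frozenset({1, 2, 3}): {1: -1, 2: -1, 3: -1},
}
partitions = [
    [{1}, {2}, {3}],
    [{1, 2}, {3}], [{1}, {2, 3}], [{2}, {1, 3}],
    [{1, 2, 3}],
]

def payoff(pi, i):
    """V_i(pi): player i's payoff in the coalition pi(i)."""
    return next(v[frozenset(S)][i] for S in pi if i in S)

def dominated(pi):
    """True if some coalition S satisfies v_i(S) > V_i(pi) for all i in S."""
    return any(all(v[S][i] > payoff(pi, i) for i in S) for S in v)

print([pi for pi in partitions if not dominated(pi)])   # [] -- the core is empty
```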

The core is a myopic concept: At any partition \(\pi \), the members of a potential coalition S compare their payoffs from forming with those at \(\pi \). No attention is paid to any moves other coalitions (or even some members of S) could make after S has formed. In particular, the players in S do not take the behavior of those in \(N\setminus S\) into account: It is irrelevant for v(S), and S operates under the presumption that no one will react to their deviation.

If players are not myopic, they will account for the possibility that after their own deviation other coalitions might form. Thus, they have to make assumptions about what happens to those “left behind.” So, a dominance relation cannot simply be defined between a partition and a coalition, but rather between two partitions. [7] define such a dominance relation based on [6]: Partition \(\pi '\) farsightedly dominates partition \(\pi \) if there is a sequence of pairs \(\left( S^l,\pi ^l\right) _{l=1}^m\) such that

$$\begin{aligned} \pi ^l=\left\{ S^l\right\} \cup \left\{ T\setminus S^l\right\} _{T\in \pi ^{l-1}}\setminus \{\emptyset \} \end{aligned}$$

for \(l=1,\ldots ,m\), where \(\pi ^0=\pi \), \(\pi ^m=\pi '\), and \(V_i\left( \pi '\right) >V_i\left( \pi ^{l-1}\right) \) for all \(i\in S^l\). A solution that is based on such a definition seems, at least from a rationality point of view, more plausible than a purely myopic solution. Still, there are some caveats: For instance, it makes the implicit assumption that players who are left behind stay together. Another problem is that different members of a coalition S might only work together because they have different, and potentially contradicting, expectations about how the game unfolds.Footnote 3 The most severe issue, however, seems to be that coalitions still do not behave rationally: They make assumptions about the consequences of their forming, but they ignore the consequences of their not forming.

In order to overcome these issues, we shall translate hedonic games into abstract games, for which farsightedness has recently gained some attention. In general, an abstract game is a tuple \(\left( N,X,\left( \rightarrow _S\right) _{S\in P(N)},\left( U_i(\cdot )\right) _{i\in N}\right) \), where X is a set of states, \(U_i:X\rightarrow {\mathbb {R}}\) is player i’s utility function over states and \(\rightarrow _S\) describes coalition S’s ability to move from one state to another: For two states \(x,y \in X\) we write \(x\rightarrow _Sy\) if S can replace x with y. In this case, we say S is effective for a move from x to y. In the context of a hedonic game we choose \(X=\Pi \), i.e., the set of states is exactly the set of partitions, and \(U_i=V_i\), which is i’s payoff function over partitions.

Example 2.2

Recall the roommate problem in Example 2.1. A potential \(\rightarrow \) for this hedonic game is depicted in Fig. 1, where \(\pi ^0=\{\{1\},\{2\},\{3\}\}\), \(\pi ^1=\{\{1,2\},\{3\}\}\), \(\pi ^2=\{\{1\},\{2,3\}\}\), \(\pi ^3=\{\{2\},\{1,3\}\}\), and \(\pi ^4=\{N\}\). \(\square \)

Fig. 1: The roommate problem

As the profile \(\rightarrow = \left( \rightarrow _S\right) _{S\in P(N)}\) is the most relevant piece of the puzzle, we shall take a closer look at it in the next section.

2.2 Effectivity in Hedonic Coalition Formation

The question of what partitions can arise in a hedonic game hinges on the coalitions’ abilities to change partitions. The hedonic game itself remains quite agnostic about this as it only specifies payoffs for coalitions and nothing more. Thus, we shall derive four assumptions on coalitions’ abilities to affect partitions, which are reflected in \(\rightarrow \). First, we would expect \(\rightarrow \) to satisfy:

H1: If \(\pi \rightarrow _S\pi '\), then \(S\in \pi '\).

That is, S can only move from a partition \(\pi \) to a partition \(\pi '\) if it is a member of the latter. Observe that we do not allow the members of S to jointly form a partition of S: If they collaborate, they must form a coalition. As [15] point out, the action of farsighted players in S depends on the expected reaction by \(N\setminus S\) as this might influence future deviations. To avoid unintuitive results, they propose two conditions and refer to them as coalition sovereigntyFootnote 4:

H2: If \(\pi \rightarrow _S\pi '\), \(T\in \pi \), and \(S\cap T=\emptyset \), then \(T\in \pi '\).

H3: For every \(\pi \in \Pi \) and \(S\in P(N)\) there is \(\pi '\) with \(S\in \pi '\) such that \(\pi \rightarrow _S\pi '\).

Condition H2 requires that a coalition S that deviates from partition \(\pi \) has no influence over coalitions that have not been affected by its deviation. That is, a coalition in \(\pi \) that did not intersect with S will not change. Condition H3 requires that from each partition \(\pi \) each coalition S that is not a member of \(\pi \) can deviate. Both conditions are highly appropriate in the context of hedonic games: They endow coalitions with the power to form at any state, yet they ensure that no coalition has the power to affect the behavior of others when moving.Footnote 5 An observation worth making is that H1 and H2 together imply \(\pi =\pi '\) whenever \(S\in \pi \) and \(\pi \rightarrow _S\pi '\).

The transition between partitions in [7] satisfies Conditions H1–H3, but these conditions alone still allow for quite a range of partitions \(\pi '\) that a coalition S might move to from \(\pi \), as nothing has been said about those players who were “left behind” by S. Define for any partition \(\pi \) and any coalition S the set \(\pi (S)\) by \(\pi (S)=\bigcup _{i\in S}\pi (i)\), which is the set of all players whose coalitions are affected by a deviation of S. There is no reason to presume that S has power over the behavior of \(\pi (S)\setminus S\). Yet, we shall assume that there is a (common) expectation about their behavior. A residual map is a map \(\tau \) which maps each pair \(\left( \pi ,S\right) \) to a partition \(\tau \left( \pi ,S\right) \) of the set \(\pi (S)\) with \(S\in \tau \left( \pi ,S\right) \). For \(i\in \pi (S)\) we write \(\tau \left( i\mid \pi , S\right) \) for the unique element of \(\tau \left( \pi ,S\right) \) that contains i.

H4: There is a residual map \(\tau \) such that if \(\pi \rightarrow _S\pi '\), then \(\pi '(i)=\tau (i\mid \pi ,S)\) for all \(i\in \pi (S)\).

Condition H4 ensures that the behavior of \(\pi (S)\) cannot be chosen by S, yet is uniquely determined and commonly known. Thus, H2 and H4 together simply ensure that all coalitions have a common expectation about the immediate consequences of any move.

We shall not impose any conditions on the residual map; for the remainder its existence is sufficient. Yet, there are several instances of \(\tau \) that have been investigated in the literature before. For instance, [10] consider two variants: The \(\gamma \)-model, where coalitions that are left behind split up into singletons, and the \(\delta \)-model, where they remain as they were.

Example 2.3

For any pair \(\left( \pi ,S\right) \) of a partition and a coalition let \(\gamma \left( \pi ,S\right) =\left\{ S,\left\{ \left\{ i\right\} \right\} _{i\in \pi (S)\setminus S}\right\} \). The unique \(\rightarrow ^{\gamma }\) that satisfies H1–H4 with \(\tau =\gamma \) is

$$\begin{aligned} \pi \rightarrow _S^{\gamma }\pi ' \qquad \text {if and only if} \qquad \pi '=\{S\}\cup \left\{ T\right\} _{T\in \pi \setminus \pi (S)}\cup \left\{ \left\{ i\right\} \right\} _{i\in \pi (S)\setminus S}. \end{aligned}$$

For any pair \(\left( \pi ,S\right) \) of a partition and a coalition let \(\delta \left( \pi ,S\right) =\left\{ S,\left\{ \pi (i)\setminus S\right\} _{i\in \pi (S)\setminus S}\right\} \setminus \left\{ \emptyset \right\} \). The unique \(\rightarrow ^{\delta }\) that satisfies H1–H4 with \(\tau =\delta \) isFootnote 6

$$\begin{aligned} \pi \rightarrow _S^{\delta }\pi ' \qquad \text {if and only if} \qquad \pi ' = \{S\}\cup \left\{ T\setminus S\right\} _{T\in \pi }\setminus \left\{ \emptyset \right\} . \end{aligned}$$

Observe that \(\rightarrow \) in Example 2.2 corresponds to the \(\gamma \)-model. This can be seen from \(\pi ^4 \rightarrow _{\{i\}}\pi ^0\) for \(i=1,2,3\). \(\square \)
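
The two residual maps can be made concrete with a few lines of code. The following Python sketch (our illustration; the names gamma_move and delta_move are not from the paper) computes the unique partition \(\pi '\) with \(\pi \rightarrow _S\pi '\) under the \(\gamma \)- and \(\delta \)-models, with partitions represented as lists of sets.

```python
def gamma_move(pi, S):
    """gamma-model: players left behind by S split up into singletons."""
    untouched = [T for T in pi if not (T & S)]               # coalitions disjoint from S (H2)
    residual = set().union(*[T for T in pi if T & S]) - S    # pi(S) \ S
    return [set(S)] + untouched + [{i} for i in residual]

def delta_move(pi, S):
    """delta-model: players left behind by S stay together."""
    return [set(S)] + [T - S for T in pi if T - S]

pi4 = [{1, 2, 3}]
print(gamma_move(pi4, {1}))   # [{1}, {2}, {3}], i.e. pi^4 ->_{1} pi^0 as in Example 2.3
print(delta_move(pi4, {1}))   # [{1}, {2, 3}]
```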

We allow coalitions to act strategically when deciding whether or not to move, and we are interested in their equilibrium behavior. We assume that in each partition \(\pi \) coalitions are allowed to move in a specified order that is described by a bijection \(\rho ^{\pi }:\left\{ 1,\ldots ,2^{\left| N\right| } -1\right\} \rightarrow P(N)\). Here, \(\rho ^{\pi }(l)\) is the l-th coalition that is allowed to move at \(\pi \).Footnote 7

Definition 2.4

A hedonic coalition formation game is a tuple \(\left( N,V,\rightarrow ,\rho \right) \), where V is the payoff function from a hedonic game with player set N, \(\rightarrow \) satisfies conditions H1–H4, and \(\rho =\left( \rho ^{\pi }\right) _{\pi \in \Pi }\) is an order profile.

As H2 and H4 together uniquely determine the behavior of \(N\setminus S\) for any S and \(\pi \), we obtain the following result. Its proof, like all proofs, can be found in the appendix.

Theorem 2.5

Let \(\left( N,V,\rightarrow ,\rho \right) \) be a hedonic coalition formation game. Then, for each partition \(\pi \) and each coalition \(S\subseteq N\) there is a unique partition \(\pi '\) with \(\pi \rightarrow _S\pi '\).

Observe that if a coalition S could decide to form a partition of S (which would violate H1), then S could move to more than one other partition, and Theorem 2.5 would not hold.

3 Analysing Hedonic Coalition Formation Games

3.1 Coalition Behavior and Transitions

In a hedonic coalition formation game, any coalition (that has not formed yet) has only two options: To be or not to be? That is the question. The only strategic decision that a coalition has to make (at any partition) is, hence, to choose the probability with which to form.Footnote 8 Thus, a (mixed) coalition behaviorFootnote 9 of coalition S is a map \(\beta _S:\Pi \rightarrow [0,1]\), where \(\beta _S(\pi )\) denotes the probability that S forms and deviates from \(\pi \).Footnote 10 We write \(\Delta _S\subseteq [0,1]^{\Pi }\) for the set of all coalition behaviors of S. A behavior profile is a vector \(\beta =\left( \beta _S\right) _{S\in P(N)}\in \Delta =\times _{S\in P(N)}\Delta _S\) of behaviors.

Example 3.1

Recall \(\rightarrow \) for the 3-player roommate problem in Example 2.2 and consider the hedonic coalition formation game \(\left( N,V,\rightarrow ,\rho \right) \) where \(\rho ^{\pi }=\rho \) for all \(\pi \in \Pi \) and \(\rho \) is defined by

$$\begin{aligned} \rho (1)&= \{1\}&\rho (2)&=\{2\}&\rho (3)&= \{3\}&\rho (4)&=\{1,2\}\\ \rho (5)&= \{2,3\}&\rho (6)&= \{1,3\}&\rho (7)&=\{1,2,3\}. \end{aligned}$$

Consider the following behavior profile. \(\beta _{\rho (l)}\left( \pi \right) =0\) for all \(\pi \in \Pi \) and \(l=1,2,3,7\). Further, \(\beta _{\rho (l)}\left( \pi ^0\right) = \beta _{\rho (l)}\left( \pi ^4\right) = 1\) for \(l=4,5,6\). Lastly,

$$\begin{aligned} \beta _{\rho (4)}\left( \pi ^2\right)&= \beta _{\rho (5)}\left( \pi ^3\right) = \beta _{\rho (6)}\left( \pi ^1\right) = p,\\ \beta _{\rho (4)}\left( \pi ^3\right)&= \beta _{\rho (5)}\left( \pi ^1\right) = \beta _{\rho (6)}\left( \pi ^2\right) = \beta _{\rho (4)}\left( \pi ^1\right) = \beta _{\rho (5)}\left( \pi ^2\right) = \beta _{\rho (6)}\left( \pi ^3\right) = 0, \end{aligned}$$

where \(p\in \left( 0,1\right) \). That is, each pair would deviate from the grand coalition and from the singleton partition with probability 1; and whenever a pair has formed, another pair will deviate with probability p. Surely, both \(\pi ^0\) and \(\pi ^4\) will be left for \(\pi ^1\) by \(\{1,2\}\) with probability 1. The probability that partition \(\pi ^1\) will be left is p, and if it is left, then by a move of \(\{1,3\}\) to \(\pi ^3\). Similarly, \(\pi ^2\) will be left with probability p to \(\pi ^1\), and \(\pi ^3\) will be left with probability p to \(\pi ^2\). So, the transition probabilities between partitions are described by the \((\Pi \times \Pi )\)-dimensional matrix

$$\begin{aligned} P^{\beta }=\left( \begin{array}{cccccc} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ 1 &{} 1-p &{} p &{} 0 &{} 1 \\ 0 &{} 0 &{} 1-p &{} p &{} 0 \\ 0 &{} p &{} 0 &{} 1-p &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 \end{array} \right) \end{aligned}$$

where \(P_{\pi ',\pi }\) denotes the probability of a transition from \(\pi \) to \(\pi '\). \(\square \)

3.2 Markov Processes and Expected Payoffs

Given an order profile \(\rho =\left( \rho ^{\pi }\right) _{\pi \in \Pi }\) and a behavior profile \(\beta \), the probability of a transition from \(\pi \) to \(\pi '\) is

$$\begin{aligned} P^{\beta }_{\pi ',\pi }&= \sum _{l:\pi \rightarrow _{\rho ^\pi (l)}\pi '}\prod _{h<l}\left( 1-\beta _{\rho ^\pi (h)}\left( \pi \right) \right) \beta _{\rho ^\pi (l)}\left( \pi \right)&\text { for all } \pi ,\pi '\in \Pi , \pi '\ne \pi , \end{aligned}$$
(1)

and the probability that no coalition will move out of \(\pi \) is, hence,

$$\begin{aligned} P^{\beta }_{\pi ,\pi }&= \prod _{l=1}^{2^{\left| N\right| }-1}\left( 1-\beta _{\rho ^\pi (l)}\left( \pi \right) \right)&\text { for all } \pi \in \Pi . \end{aligned}$$
(2)
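
The following Python sketch (ours; the helper names are hypothetical) assembles \(P^{\beta }\) from (1) and (2). It takes the unique move map of Theorem 2.5 as an input and treats coalitions that are already part of \(\pi \) as not forming there, as in Example 3.1, so that their factors \(1-\beta \) equal one.

```python
import numpy as np

def transition_matrix(partitions, rho, beta, move, index_of):
    """P^beta of equations (1)-(2); all argument names are our own conventions.

    partitions : list of partitions (each a list of frozensets)
    rho        : rho[j] is the order of coalitions at partitions[j]
    beta       : beta[S][j] is the probability that coalition S forms at partitions[j]
    move       : move(pi, S) returns the unique pi' with pi ->_S pi' (Theorem 2.5)
    index_of   : index_of(pi) returns the position of pi in `partitions`
    """
    P = np.zeros((len(partitions), len(partitions)))
    for j, pi in enumerate(partitions):
        stay = 1.0                              # product of (1 - beta) of earlier movers
        for S in rho[j]:
            if S in pi:                         # already formed: taken as not moving
                continue
            P[index_of(move(pi, S)), j] += stay * beta[S][j]   # equation (1)
            stay *= 1.0 - beta[S][j]
        P[j, j] += stay                         # equation (2): no coalition moved
    return P
```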

If the behavior profile \(\beta \) is such that the Markov process with transition matrix \(P^{\beta }\) converges, independently of its starting point, towards a unique partition \(\pi \), then the expected payoff from \(\beta \) is easily determined, namely \(V(\pi )\). But in general, we cannot expect such a partition to exist; in fact, one of the main problems in hedonic games is cycles among partitions such as in the roommate problem. In this case, we can define payoffs by asking: How much time will be spent in each partition given some behavior profile \(\beta \)? For this purpose note that if a partition \(\pi \) is reached, then after n periods of possible moves among states, the expected number of periods spent in some \(\pi '\) is given by the \(\pi '\)-th entry of the \(\pi \)-th column of the matrix \(\sum _{m=1}^n\left( P^{\beta }\right) ^m\). Thus, if the process starts at \(\pi \) and we do not impose any restrictions on the number of periods during which coalitions are allowed to move, then the expected relative amounts of time spent in each partition are given by the \(\pi \)-th column of the matrix \(\lim _{n\rightarrow \infty }\frac{1}{n}\sum _{m=1}^n\left( P^{\beta }\right) ^m\), which is given by the \(\Pi \)-dimensional vector

$$\begin{aligned} \mu ^{\beta }=\lim _{n\rightarrow \infty }\frac{1}{n}\sum _{m=1}^n\left( P^{\beta }\right) ^me^{\pi }, \end{aligned}$$
(3)

where \(e^{\pi }\) is the \(\Pi \)-dimensional unit vector with 1 as its \(\pi \)-th entry. With a slight abuse of notation we shall write \(\mu ^{\beta }\left( \pi '\right) \) for the vector entry \(\mu ^{\beta }_{\pi '}\).Footnote 11 We shall formulate a condition on \(P^{\beta }\) such that \(\mu ^{\beta }\) does not depend on the choice of \(\pi \). This is particularly important for the case of hedonic games as these games do not specify any initial partition.

The Markov process with transition matrix \(P^\beta \) is called irreducible if for every two partitions \(\pi ,\pi '\) there is \(m\in {\mathbb {N}}\) such that \(\left( \left( P^{\beta }\right) ^m\right) _{\pi ',\pi }>0\). The following proposition comprises well-known results about irreducible Markov processes with finite state space that we will need later. We do not provide a proof but refer the reader to the standard literature, e.g., [17].

Proposition 3.2

Let P be the transition matrix of an irreducible Markov process over \(\Pi \). Then there is a unique vector \(\mu \in {\mathbb {R}}^{\Pi }\) with \(\sum _{\pi \in \Pi }\mu (\pi )=1\) that satisfies (3) for all \(\pi \in \Pi \). In particular, \(\mu \left( \pi '\right) >0\) for all \(\pi '\in \Pi \) and \(\mu \) satisfies \(P\mu =\mu \), i.e., \(\mu \) is the unique eigenvector of P for the eigenvalue 1 whose entries sum to 1.

Example 3.3

Recall the transition matrix in Example 3.1. The Markov process that is defined by the transition matrix \(P^{\beta }\) has stationary distribution \(\left( 0,\frac{1}{3},\frac{1}{3},\frac{1}{3},0\right) \). This means that after a very long time it will have spent the same amounts of time in \(\pi ^1,\pi ^2\), and \(\pi ^3\), while it will not have spent any time in \(\pi ^0\) or \(\pi ^4\). Observe that even if the game starts in \(\pi ^0\) or \(\pi ^4\), these partitions will be left in the first period and never be returned to. Thus, the relative time spent there converges towards 0. \(\square \)
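
This stationary distribution can be checked numerically. The sketch below (ours) computes the Cesàro average in (3) for the matrix \(P^{\beta }\) of Example 3.1 with the arbitrary choice \(p=\frac{1}{2}\); every column of the average converges to \(\left( 0,\frac{1}{3},\frac{1}{3},\frac{1}{3},0\right) \).

```python
import numpy as np

p = 0.5                                  # any p in (0, 1) gives the same limit
P = np.array([                           # P^beta from Example 3.1
    [0, 0,     0,     0,     0],
    [1, 1 - p, p,     0,     1],
    [0, 0,     1 - p, p,     0],
    [0, p,     0,     1 - p, 0],
    [0, 0,     0,     0,     0],
], dtype=float)

n, avg, Pm = 50_000, np.zeros((5, 5)), np.eye(5)
for _ in range(n):
    Pm = P @ Pm                          # P^m
    avg += Pm
print(np.round(avg / n, 3))              # every column is approx. (0, 1/3, 1/3, 1/3, 0)
```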

We can now define payoffs for those behavior profiles \(\beta \) for which \(P^{\beta }\) is the transition matrix of an irreducible Markov process.

Definition 3.4

Let \(\beta \) be a behavior profile such that \(P^{\beta }\) is the transition matrix of an irreducible Markov process with the unique stationary distribution \(\mu ^{\beta }\). Then the payoffs from the behavior profile \(\beta \) are

$$\begin{aligned} u_i\left( \beta \right) = \sum _{\pi \in \Pi }\mu ^{\beta }(\pi )V_i(\pi ) \end{aligned}$$
(4)

for all \(i\in N\). \(\square \)

Example 3.5

Recall the game in Example 2.2, the behavior profile in Example 3.1 and the corresponding stationary distribution in Example 3.3. Here, the payoffs are given by \(u_i\left( \beta \right) =\frac{4+a}{3}\) for all \(i\in N\). \(\square \)
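
As a quick numerical check of (4) for this example (our sketch, again with the arbitrary choice \(a=2\)), weighting the roommate payoffs with the stationary distribution from Example 3.3 indeed gives \(\frac{4+a}{3}\) for every player.

```python
a = 2
mu = [0, 1/3, 1/3, 1/3, 0]        # stationary distribution from Example 3.3
V = [                             # V_i(pi^k) for pi^0, ..., pi^4
    [0, 4, 0, a, -1],             # player 1
    [0, a, 4, 0, -1],             # player 2
    [0, 0, a, 4, -1],             # player 3
]
u = [sum(m * vk for m, vk in zip(mu, Vi)) for Vi in V]
print([round(x, 10) for x in u])  # [2.0, 2.0, 2.0] == (4 + a)/3 for a = 2
```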

While the payoff function in (4) is rather intuitive, it has two severe drawbacks: First, it is not well defined if the stationary distribution of \(P^{\beta }\) is not unique. Second, unlike payoff functions in standard normal form games, it is not linear in \(\beta \). Thus, when looking for an equilibrium that is based on (some form of) best replies, it is not trivial to show that the set of best replies is convex.

3.3 Errors and \(\varepsilon \)-Behaviors

We are interested in the following class of behavior profiles which lead to irreducible Markov processes and, hence, well defined expected payoffs according to Definition 3.4.

Definition 3.6

Let \(\varepsilon >0\) and \(S\in P(N)\). An \(\varepsilon \)-behavior of coalition S is a (mixed) coalition behavior \(\beta _S\) such that \(\beta _S\left( \pi \right) \in \left[ \varepsilon ,1-\varepsilon \right] \) for all \(\pi \in \Pi \) with \(S\notin \pi \). The set of \(\varepsilon \)-behaviors of coalition S is denoted by \(\Delta _S^{\varepsilon }\), and the set of \(\varepsilon \)-behavior profiles by \(\Delta ^{\varepsilon }=\times _{S\in P(N)}\Delta _S^{\varepsilon }\). \(\square \)

In an \(\varepsilon \)-behavior, every possible move is implemented with positive probability. This means that coalitions will make mistakes with some small (but positive) probability. In the theory of dynamic games the ability to account for (even one’s own) possible mistakes provides one of the motivations for subgame perfection: Players specify their actions even for histories that would never be reached if they followed their strategy, and after any history their strategy needs to specify some equilibrium behavior. In this paper, we make this option of mistakes explicit as it ensures that for any two partitions \(\pi ,\pi '\) there is some positive probability that a chain of coalitional moves will lead from \(\pi \) to \(\pi '\).

Lemma 3.7

For every \(\varepsilon >0\) and every \(\beta \in \Delta ^{\varepsilon }\) the Markov process with transition matrix \(P^{\beta }\) is irreducible.

This lemma together with Proposition 3.2 implies that the payoff function in (4) is well defined for all \(\beta \in \Delta ^{\varepsilon }\). Observe, however, that irreducibility of the emerging Markov process is not necessary for the payoff function to be well defined: The behavior profile in Example 3.5 is not an \(\varepsilon \)-behavior, yet the payoffs are well defined.

4 Equilibrium

From here on, let \(\varepsilon >0\) and \(\rho =\left( \rho ^{\pi }\right) _{\pi \in \Pi }\) be fixed. We shall use the payoff functions in (4) to obtain an equilibrium coalition behavior that exists for all hedonic games.

4.1 Definition

Let S be a nonempty coalition and let \(\pi \in \Pi \). For a given \(\beta \in \Delta ^{\varepsilon }\), we write \(\beta _{-S}=\left( \beta _T\right) _{\emptyset \ne T\ne S}\) for the profile of \(\varepsilon \)-behaviors of all coalitions but S. It will also be convenient to write \(\beta _{S}\left( -\pi \right) \) for the restriction of the behavior \(\beta _S\) to \(\Pi \setminus \{\pi \}\). In this case, we write \(\left( \beta _S(\pi ),\beta _S\left( -\pi \right) \right) \) for the behavior \(\beta _S\). Let \(\beta \in \Delta ^\varepsilon \). Then \(S\in P(N)\) has a profitable one-shot deviation from \(\beta \) at \(\pi \) if \(S\notin \pi \) and there is \(q\in \left[ \varepsilon ,1-\varepsilon \right] \) such that

$$\begin{aligned} u_i\left( q,\beta _S\left( -\pi \right) ,\beta _{-S}\right) >u_i\left( \beta \right) \end{aligned}$$

for all \(i\in S\). We say that \(\beta _{S}^*(\pi )\) is a weak best reply against \(\beta \) at \(\pi \) if S does not have a profitable one-shot deviation from \(\left( \beta ^*_S\left( \pi \right) ,\beta _S\left( -\pi \right) ,\beta _{-S}\right) \) at \(\pi \). That is, a weak best reply of S against \(\beta \) at \(\pi \) takes the behavior of all coalitions and S’s behavior everywhere but in \(\pi \) as given and specifies an optimal probability at \(\pi \). We denote the set of S’s weak best replies against \(\beta \) at \(\pi \) by \(R_{S,\pi }^{\varepsilon }\left( \beta \right) \).Footnote 12

Definition 4.1

A weak \(\varepsilon \)-equilibrium is an \(\varepsilon \)-behavior profile \(\beta \) such that for each \(S\in P(N)\) and each \(\pi \in \Pi \) it holds that \(\beta _S\left( \pi \right) \in R_{S,\pi }^{\varepsilon }\left( \beta \right) \). \(\square \)

That is, \(\beta \) is a weak \(\varepsilon \)-equilibrium if for each nonempty coalition S and each partition \(\pi \) the behavior \(\beta _S\) specifies a weak best reply \(\beta _S\left( \pi \right) \) against \(\beta \) at \(\pi \). We call such a profile a “weak” equilibrium as it is only stable with respect to one-shot deviations, but not with respect to arbitrary deviations.

Example 4.2

Recall the roommate problem and the behavior profile in Example 3.1. Although this is not an \(\varepsilon \)-behavior, we have well defined payoff functions so that we can try and find weak best replies. For that purpose recall the behavior profile \(\beta \) in Example 3.1 and consider coalition \(\{2,3\}\) at \(\pi ^1\). Suppose this coalition leaves \(\pi ^1\) with probability q. Then the corresponding transition matrix differs from \(P^{\beta }\) only in the second column, where p is replaced by q. The stationary distribution of the new matrix is given by \(\left( 0,\frac{p}{p+2q},\frac{q}{p+2q},\frac{q}{p+2q},0\right) \). So, the payoffs of players 2 and 3 are

$$\begin{aligned} u_2\left( q,\beta _{\{2,3\}}\left( -\pi ^1\right) ,\beta _{-\{2,3\}}\right)&= \frac{ap+4q}{p+2q}&u_3\left( q,\beta _{\{2,3\}}\left( -\pi ^1\right) ,\beta _{-\{2,3\}}\right)&= \frac{(4+a)q}{p+2q}. \end{aligned}$$

Observe that \(u_3\) is always increasing in q while \(u_2\) is increasing in q for \(a<2\) and decreasing in q for \(a>2\). Thus, for \(a<2\) the only weak best response of \(\{2,3\}\) at \(\pi ^1\) is to choose \(q=1-\varepsilon \).Footnote 13 On the other hand, for \(a\ge 2\), every \(q\in \left[ \varepsilon ,1-\varepsilon \right] \) is a weak best response as the interests of 2 and 3 are conflicting. \(\square \)
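
The closed forms in this example can be verified numerically. The sketch below (ours) builds the matrix described above, with p replaced by q in the column of \(\pi ^1\), computes the Cesàro average in (3), and compares the result with the stated stationary distribution and payoffs, for one arbitrary choice of a, p, and q.

```python
import numpy as np

a, p, q = 3.0, 0.4, 0.7              # arbitrary test values, a in (0, 4)
P = np.array([                       # P^beta with p replaced by q in the pi^1 column
    [0, 0,     0,     0,     0],
    [1, 1 - q, p,     0,     1],
    [0, 0,     1 - p, p,     0],
    [0, q,     0,     1 - p, 0],
    [0, 0,     0,     0,     0],
], dtype=float)

n, avg, Pm = 100_000, np.zeros(5), np.eye(5)[:, 0]   # start at pi^0, see (3)
for _ in range(n):
    Pm = P @ Pm
    avg += Pm
mu = avg / n

mu_stated = np.array([0, p, q, q, 0]) / (p + 2 * q)
print(np.allclose(mu, mu_stated, atol=1e-3))                      # True
u2, u3 = a * mu[1] + 4 * mu[2], a * mu[2] + 4 * mu[3]
print(np.isclose(u2, (a * p + 4 * q) / (p + 2 * q), atol=1e-3),   # True
      np.isclose(u3, (4 + a) * q / (p + 2 * q), atol=1e-3))       # True
```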

In the previous example the set \(R^\varepsilon _{\{2,3\},\pi ^1}\left( \beta \right) \) is, depending on a, either a singleton or a compact interval. This is true in general for all \(\beta \in \Delta ^{\varepsilon }\), \(S\in P(N)\), and \(\pi \in \Pi \). (See Lemma A.1 in the appendix.)

4.2 Existence

We have mentioned before that the utility function in (4) is not necessarily linear in \(\beta \). The reason is as follows: Consider two behavior profiles \(\beta \) and \(\gamma \) that induce Markov processes with transition matrices \(P^{\beta }\) and \(P^{\gamma }\), which in turn have stationary distributions \(\mu ^{\beta }\) and \(\mu ^{\gamma }\). It can easily be verified that the convex combination \(r\beta +(1-r)\gamma \) will lead to an irreducible Markov process. However, there is very little that can be said about the stationary distribution \(\mu ^{r\beta +(1-r)\gamma }\) of this process. In particular, it is not necessarily the case that \(\mu ^{r\beta +(1-r)\gamma }\) is a convex combination of \(\mu ^{\beta }\) and \(\mu ^{\gamma }\).

This nonlinearity of the utility functions in \(\beta \) creates a problem as we cannot use standard arguments from normal form games to show that the set of weak best replies is convex. Instead, we prove the following theorem, which considers the stationary distribution of a convex combination of two Markov processes whose transition matrices are identical everywhere but in one column.

Theorem 4.3

Let X be a finite set, and let \(P,Q\in \left[ 0,1\right] ^{X\times X}\) be transition matrices of irreducible Markov processes over X such that there is \(y^*\) with \(P_{x,y}=Q_{x,y}\) for all \(x\in X\) and all \(y\ne y^*\). Let \(\lambda \) and \(\mu \) be the (unique) stationary distributions of P and Q, respectively. Let \(r\in \left[ 0,1\right] \) and define

$$\begin{aligned} t = \frac{r\mu \left( y^*\right) }{r\mu \left( y^*\right) + \left( 1-r\right) \lambda \left( y^*\right) } \end{aligned}$$
(5)

Then \(rP + \left( 1-r\right) Q\) is the transition matrix of an irreducible Markov process, and \(\nu =t\lambda + \left( 1-t\right) \mu \) is the unique stationary distribution of this process.
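
Theorem 4.3 is easy to test numerically. The following Python sketch (ours) draws a random column-stochastic matrix P, changes a single column to obtain Q, and checks that the stationary distribution of \(rP+\left( 1-r\right) Q\) coincides with \(t\lambda +\left( 1-t\right) \mu \) for t as in (5).

```python
import numpy as np

rng = np.random.default_rng(0)
n, y_star, r = 4, 2, 0.3

def random_column_stochastic(n):
    M = rng.random((n, n)) + 0.1          # strictly positive, hence irreducible
    return M / M.sum(axis=0)

def stationary(M):
    """Right eigenvector of M for eigenvalue 1, normalized to sum to 1."""
    w, v = np.linalg.eig(M)
    vec = np.real(v[:, np.argmin(np.abs(w - 1))])
    return vec / vec.sum()

P = random_column_stochastic(n)
Q = P.copy()
Q[:, y_star] = random_column_stochastic(n)[:, y_star]   # P and Q differ only in column y*

lam, mu = stationary(P), stationary(Q)
t = r * mu[y_star] / (r * mu[y_star] + (1 - r) * lam[y_star])
print(np.allclose(t * lam + (1 - t) * mu,               # nu from Theorem 4.3 ...
                  stationary(r * P + (1 - r) * Q)))     # ... is stationary for rP+(1-r)Q: True
```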

Consider a (completely mixed) behavior profile \(\beta \), and fix a partition \(\pi \) and a coalition \(S\in P(N)\). Then, for any two strategies \(\beta _S^1\) and \(\beta _S^2\) that coincide with \(\beta _S\) everywhere but in \(\pi \) the transition matrices of the corresponding Markov processes differ only in column \(\pi \). That is, they satisfy the condition of Theorem 4.3. Thus, we obtain the following result.

Corollary 4.4

Let \(\beta \in \Delta ^{\varepsilon }\), \(S\in P(N)\), and \(\pi ^*\in \Pi \) with \(S\notin \pi ^*\). Let \(\underline{\beta }_S, \overline{\beta }_S\) be such that \(\underline{\beta }_S\left( \pi \right) ={\overline{\beta }}_S\left( \pi \right) =\beta _S\left( \pi \right) \) for all \(\pi \ne \pi ^*\), and \(\underline{\beta }_S\left( \pi ^*\right) =\varepsilon \) and \(\overline{\beta }_S\left( \pi ^*\right) =1-\varepsilon \). Let \(r=\frac{1-\varepsilon -\beta _S\left( \pi ^*\right) }{1-2\varepsilon }\). Then

$$\begin{aligned} u_i\left( \beta \right) = tu_i\left( {\underline{\beta }}_S,\beta _{-S}\right) + \left( 1-t\right) u_i\left( {\overline{\beta }}_S,\beta _{-S}\right) \end{aligned}$$

for all \(i\in N\), where t is defined as in (5).

This solves the issue outlined above: For any behavior profile \(\beta \), each player’s payoff from a convex combination of two deviations of S at \(\pi \) is a convex combination of the payoffs from the two deviations.

For all \(\beta \in \Delta ^{\varepsilon }\) let \(R^{\varepsilon }\left( \beta \right) =\times _{S\in P(N)}\times _{\pi \in \Pi }R^{\varepsilon }_{S,\pi }\left( \beta \right) \). Then a coalition behavior profile \(\beta \) is a weak \(\varepsilon \)-equilibrium if and only if it is a fixed point of the correspondence \(\beta \mapsto R^{\varepsilon }\left( \beta \right) \). Thus, it is sufficient to prove that this correspondence has a fixed point. The most important part, namely convexity, follows from Corollary 4.4. The rest of the proof is in the appendix.

Theorem 4.5

For every hedonic coalition formation game \(\left( N,V,\rightarrow ,\rho \right) \) and every \(\varepsilon >0\), there is a weak \(\varepsilon \)-equilibrium.

5 Best Responses Versus Weak Best Responses

We have seen that the definition of weak \(\varepsilon \)-equilibria ensures stability against one-shot deviations, but not necessarily against deviations at more than one state. So, the coalition formation games that we have defined in Sect. 3 lack some kind of “one-shot deviation principle.” We shall provide an example here where a coalition does not have a profitable one-shot deviation, i.e., is playing a weak best response, but can find a better response by changing its behavior at two states.

Let \(N=\{1,2,3\}\) and v be the hedonic game given by \(v(\{1\})=20,\ v(\{2\})=0, \ v(\{3\})=0,\ v(\{1,2\})=(17,14), \ v(\{1,3\})=(17,0),\ v(\{2,3\})=(15,0)\) and \(v(N)=(1,18,0)\). Let \(\rightarrow \) be defined by the residual map \(\gamma \) in Example 2.3. Define three bijections \(\rho _1,\ \rho _2,\rho _3 :\{1,\ldots ,7\}\rightarrow P(N)\) by

\((\rho _1(1),\rho _1(2),\rho _1(3),\rho _1(4),\rho _1(5),\rho _1(6),\rho _1(7))=(\{1\},\{2\},\{3\},\{1,2\},\{1,3\},\{2,3\},\{1,2,3\}),\)

\((\rho _2(1),\rho _2(2),\rho _2(3),\rho _2(4),\rho _2(5),\rho _2(6),\rho _2(7))=(\{1,2,3\},\{2,3\},\{1,3\},\{1,2\},\{3\},\{2\},\{1\}),\)

\((\rho _3(1),\rho _3(2),\rho _3(3),\rho _3(4),\rho _3(5),\rho _3(6),\rho _3(7))=(\{1,2\},\{1,3\},\{2,3\},\{1,2,3\},\{1\},\{2\},\{3\}).\)

Let the partitions be numbered as in Example 2.2. Define the collection \(\left( \rho ^{\pi }\right) _{\pi \in \Pi }\) by \(\rho ^{\pi ^0}=\rho ^{\pi ^3}=\rho ^{\pi ^4}=\rho _1\), \(\rho ^{\pi ^1}=\rho _2\) and \(\rho ^{\pi ^2}=\rho _3\). Let \(S^0=\{1,2\}\). We now construct a behavior profile \(\beta \) such that coalition \(S^0\) has a profitable deviation at \(\beta \), but no profitable one-shot deviation. For all \(T\ne S^0\) and all \(\pi \in \Pi \) with \(T\notin \pi \) let \(\beta _T\left( \pi \right) =\frac{1}{20}\). That is, \(\beta \) prescribes for any \(T\ne S^0\) at any \(\pi \) with \(T\not \in \pi \) to form and deviate from \(\pi \) with probability \(\frac{1}{20}\) and to remain at \(\pi \) with probability \(\frac{19}{20}\). For each \(k=0,2,3,4\) let \(\beta _{S^0}^p\left( \pi ^k\right) =1-p_k\), where \(p_k\in \left[ \varepsilon ,1-\varepsilon \right] \). Then the transition matrix of the Markov process associated with profile \(\left( \beta _{S^0}^p,\beta _{-S^0}\right) \) is

$$\begin{aligned} P=\left( \begin{array}{rrrrr} p_0(1-x)^3 &{}x(1-x)^3(2-x)&{}p_2(1-x)^2x(2-x)&{}x(2-x)&{}x(x^2-3x+3)\\ 1-p_0&{}(1-x)^5&{}x&{}(1-x)^2(1-p_3)&{}(1-x)^3(1-p_4)\\ p_0(1-x)x&{}x(1-x)&{}p_2(1-x)^4&{}p_3(1-x)^2x&{}p_4(1-x)^4x\\ p_0x&{}x(1-x)^2&{}x(1-x)&{}p_3(1-x)^4&{}p_4(1-x)^3x\\ p_0(1-x)^2x&{}x&{}(1-x)^2(1-p_2)&{}p_3(1-x)^3x&{}p_4(1-x)^5\\ \end{array} \right) . \end{aligned}$$

where \(x=\frac{1}{20}\). The stationary distribution \(\mu \) of P is given by \(\mu \left( \pi _k\right) = \frac{{\overline{\mu }}\left( \pi _k\right) }{\sum _{l=0}^{4}{\overline{\mu }}\left( \pi _l\right) }\), where

$$\begin{aligned} \overline{\mu }(\pi _0)&= 2.659690476 \ 10^{22}-1.121619627 \ 10^{22} p_2 p_3 p_4+1.611058344 \ 10^{22} p_2 p_3\\&\quad +1.449336223 \ 10^{22} p_2 p_4+1.442311007 \ 10^{22} p_3 p_4- {2.081799782} \ 10^{22} p_2\\&\quad - {2.058328825} \ 10^{22} p_3-1.863704548 \ 10^{22} p_4\\ \overline{\mu }(\pi _1)&=2.621440000 \ 10^{23} +1.207584620 \ 10^{23} p_0 p_2 p_3 p_4-1.505504975 \ 10^{23} p_0 p_2 p_3\\&\quad -1.498726885 \ 10^{23} p_0 p_2 p_4-1.479293341 \ 10^{23} p_0 p_3 p_4-1.415672613 \ 10^{23} p_2 p_3 p_4\\&\quad +1.859871285 \ 10^{23} p_0 p_2+1.860700832 \ 10^{23} p_0 p_3+1.831542937 \ 10^{23} p_0 p_4\\&\quad +1.739116855 \ 10^{23} p_2 p_3+1.748510977 \ 10^{23} p_2 p_4+1.725374942 \ 10^{23} p_3 p_4\\&\quad -2.293812673 \ 10^{23} p_0-2.135179264 \ 10^{23} p_2-2.140798157 \ 10^{23} p_3 \\&\quad -2.124770265 \ 10^{23} p_4\\ \overline{\mu }(\pi _2)&=1.245184000 \ 10^{22}-5.434523952 \ 10^{21} p_0 p_3 p_4+7.432904695 \ 10^{21} p_0 p_3\\&\quad + {7.042336316} \ 10^{21} p_0 p_4+ {7.023069493} \ 10^{21} p_3 p_4-9.632257219 \ 10^{21} p_0\\&\quad -9.608306688 \ 10^{21} p_3-9.101201613 \ 10^{21} p_4 \\ \overline{\mu }(\pi _3)&=1.242071040 \ 10^{22}-5.307293743 \ 10^{21} p_0 p_2 p_4+7.351769281 \ 10^{21} p_0 p_2\\&\quad +6.854619389 \ 10^{21} p_0 p_4+6.950743632 \ 10^{21} p_2 p_4-9.478516665 \ 10^{21} p_0\\&\quad -9.634996429 \ 10^{21} p_2-8.976693796 \ 10^{21} p_4 \\ \overline{\mu }(\pi _4)&=2.434498560 \ 10^{22}-1.319357013 \ 10^{22} p_0 p_2 p_3+1.705305641 \ 10^{22} p_0 p_2\\&\quad +1.467654760 \ 10^{22} p_0 p_3+1.695404081 \ 10^{22} p_2 p_3-1.896199018 \ 10^{22} p_0\\&\quad -2.191368192 \ 10^{22} p_2-1.884302724 \ 10^{22} p_3 \end{aligned}$$

The payoffs of players 1 and 2 are

$$\begin{aligned} u_1\left( \beta _{S^0}^p,\beta _{-S^0}\right)&=20( \mu _P(\pi _0)+\mu _P(\pi _2))+17(\mu _P(\pi _1)+\mu _P(\pi _3))+\mu _P(\pi _4)\\ u_2\left( \beta _{S^0}^p,\beta _{-S^0}\right)&=14\mu _P(\pi _1)+15\mu _P(\pi _2)+18\mu _P(\pi _4). \end{aligned}$$

Let \(p_k^*=\frac{19}{20}\) for \(k=0,2,3,4\). Then

$$\begin{aligned} \frac{d}{d p_0}u_1\left( \beta _{S^0}^{p^*},\beta _{-S^0}\right)&>0&\frac{d}{d p_0}u_2\left( \beta _{S^0}^{p^*},\beta _{-S^0}\right)&<0 \\ \frac{d}{d p_2}u_1\left( \beta _{S^0}^{p^*},\beta _{-S^0}\right)&>0&\frac{d}{d p_2}u_2\left( \beta _{S^0}^{p^*},\beta _{-S^0}\right)&<0 \\ \frac{d}{d p_3}u_1\left( \beta _{S^0}^{p^*},\beta _{-S^0}\right)&>0&\frac{d}{d p_3}u_2\left( \beta _{S^0}^{p^*},\beta _{-S^0}\right)&<0 \\ \frac{d}{d p_4}u_1\left( \beta _{S^0}^{p^*},\beta _{-S^0}\right)&<0&\frac{d}{d p_4}u_2\left( \beta _{S^0}^{p^*},\beta _{-S^0}\right)&>0. \end{aligned}$$

That is, for each k any change in \(p_k^*\) makes exactly one player better off and one player worse off, so that \(S^0\) does not have any profitable one-shot deviations from \(\left( \beta ^{p^*}_{S^0},\beta _{-S^0}\right) \).

Finally, define \({\hat{p}}\) by \({\hat{p}}_0={\hat{p}}_2=\frac{19}{20}\) and \({\hat{p}}_3={\hat{p}}_4=\frac{1}{20}\). Then

$$\begin{aligned} u_1\left( \beta _{S^0}^{{\hat{p}}},\beta _{-S^0}\right) = 17.72703770896&> 16.2479670393=u_1\left( \beta _{S^0}^{p^*},\beta _{-S^0}\right) \\ u_2\left( \beta _{S^0}^{{\hat{p}}},\beta _{-S^0}\right) =9.19781147654&> 7.4664207782=u_2\left( \beta _{S^0}^{p^*},\beta _{-S^0}\right) \end{aligned}$$

That is, by changing their behavior both at \(\pi ^3\) and at \(\pi ^4\), both members of \(S^0\) can strictly improve their payoffs.