Mean-Field Game Approach to Admission Control of an M/M/∞ Queue with Shared Service Cost

We study a mean-field approximation of the M/M/∞ queueing system. The problem we deal with is quite different from standard congestion games, as we consider the case in which higher congestion results in smaller costs per user. This is motivated by a situation in which some TV show is broadcast, so that the same cost is incurred no matter how many users follow the show. Using a mean-field approximation, we show that this results in multiple equilibria of threshold type, which we compute explicitly. We further derive the socially optimal policy and compute the price of anarchy. We then study the game with partial information and show that, by appropriately limiting the queue-state information available to the players, we can obtain the same performance as when all the information is available to them. We show that the mean-field approximation becomes tight as the workload increases, so the results obtained for the mean-field model approximate the discrete one well.


Introduction
This paper is devoted to the problem of whether an arrival should queue or not in an M/M/∞ queue. It is assumed that the cost per customer decreases with the number of customers.
In a wireless context, the M/M/∞ queue may model the number of calls in a cell with a large capacity. The assumption that the cost per call decreases with the number of calls is typical for a multicast in which the same content is broadcast to all mobiles, so that the cost of the transmission can be shared among the calls present.
Our first objective in this paper is to study the structure of both individually and globally optimal policies. Our analysis reveals that there exist threshold-type policies in which an individual is admitted if the number of ongoing calls exceeds some threshold (whose value depends on whether globally or individually optimal policies are considered).
The assumption that the cost decreases with the number of customers distinguishes our model from standard congestion control problems, in which the cost increases with the number of customers. The structure of both globally and individually optimal policies can thus be expected to be quite different from that in the standard congestion control problems, which have been studied for over half a century, starting with the seminal paper of Pinhas Naor [10]. Naor considered an M/M/1 queue in which a controller has to decide whether arrivals should enter a queue or not. The objective of his paper was to minimize a weighted difference between the average expected waiting time of those that enter and the acceptance rate of customers. Naor then considered the individually optimal policy (which can be viewed as a Nash equilibrium in a non-cooperative game among the players) and showed that it is also of a threshold type, with a threshold bigger than that of the centralized model. His result revealed that arrivals that join the queue under the individually optimal policy wait longer on average than under the globally optimal policy. Finally, he showed that there exists some toll such that, if it is imposed on arrivals for joining the queue, the threshold value of the individually optimal policy can be made to agree with the socially optimal one. Since this seminal work of Naor, a huge amount of research has extended the model: more general inter-arrival and service times, more general networks, other objective functions and other queueing disciplines have been considered; see e.g. [17,14,13,7,8,3,6,1,12] and references therein.
The importance of the fact that a threshold policy is optimal is that in order to control arrivals we only need partial information: in fact, we only need a signal indicating whether or not the queue length exceeds the threshold value $\Psi$. The fact that this much simpler information structure is sufficient for obtaining the same performance as in the full-information case motivates us to study the performance of threshold policies and related optimization issues for a non-cooperative game in a partial-information setting. We first study the full-information setting, where each individual only optimizes its own cost, and show explicitly that there exists a plethora of threshold-type symmetric Nash equilibrium (NE) strategy profiles. Subsequently, we compare the social cost under an NE strategy profile with the globally optimal social cost. Then, we consider the individual optimization problem with partial information, where we send a green signal if the queue length exceeds the value $\Psi$ and a red signal otherwise, and each individual player selects a strategy in order to optimize its own cost. We note that with this signaling approach, instead of full state information, users cannot choose any threshold policy with a parameter different from $\Psi$. In the individual optimization case one could therefore hope that, by setting the signal according to the value of $\Psi$ that minimizes the social cost, one would obtain the socially optimal performance, i.e. the globally optimal cost would be achieved. We show that this is not the case here; in fact, we observe that, as in [2], where a similar approach was proposed for an M/M/1 queueing system, the best possible signaling policy (in the partial-information case) achieves the same performance as the equilibrium under full information.
We study here a simplified mean-field limit of the M/M/∞ queueing system rather than the actual discrete model since, on the one hand, it is much simpler to handle and solve than the original discrete problem (we obtain closed-form formulas for all the equilibria) and, on the other hand, the approximation becomes tight as the workload increases. To show that, in this paper we establish the convergence of the game to its mean-field limit under appropriate conditions.
The model and some results from sections 4, 8 and 9 appeared in conference proceedings as [16].
The organization of the paper is as follows: We introduce the discrete model in Section 2 and its mean-field approximation in Section 3. In Section 4 we find the equilibria of the mean-field model, and in Section 5 its social optimum. In Section 6 we study the version of the game with partial information. In Section 7 we show that the information can be limited without affecting the performance. In Section 8 we establish the convergence of discrete models to the mean-field one as the workload increases. We numerically evaluate the threshold policies in Section 9. Finally, we conclude the paper with some remarks in Section 10.

Discrete Model
We consider a service facility in which an arriving customer can observe the length of the queue $X_t$ upon arrival. We refer to $X_t$ interchangeably as the system state. The value of service is $\gamma$ and the cost of spending time in service can be computed as an integral of the cost function $c(\cdot)$ over the service time, with $c(\cdot)$ a continuous decreasing function of the number of users in the queue. An arriving customer can either join the queue or leave without being served. The decision is made upon arrival. The situation is modeled as an M/M/∞ system with incoming rate $\lambda$ and service rate $\mu$.
A customer $k$ arriving at time $t_k$ chooses whether to enter the queue ($E$) or not ($N$). It follows that the set of pure actions for any customer is $V = \{E, N\}$. Since the decision that he makes is based on the length of the queue, a policy (or a strategy) of any customer will be a mapping $\pi_k : S \to \Delta(V)$ (since $V$ is only a two-point set, we will identify $\pi_k$ with a function from $\mathbb{N}$ to $[0, 1]$, describing the probability it assigns to action $E$), where $S \subset \mathbb{R}$ denotes the set of possible system states (in the discrete model $S = \mathbb{N}$). In what follows we will assume that the users limit their policies to the sets of so-called impulse or threshold policies, defined below.

Definition 1 A policy $\pi_k$ of a user is called an impulse policy if there are finitely many points $x_1, \ldots, x_n \in S$, with $x_0 := \inf S$, $x_{n+1} := \sup S$, such that $\pi_k$ is constant on any interval $(x_i, x_{i+1})$, $i = 0, \ldots, n$.
A subclass of the set of impulse policies with a very simple structure are threshold policies.

Definition 2 A policy $\pi_k$ of a user is called a $[\Theta, q]$-threshold policy if
$$\pi_k(x) = \begin{cases} 0, & x < \Theta, \\ q, & x = \Theta, \\ 1, & x > \Theta. \end{cases} \qquad (1)$$
At time $t$, an incoming client who employs this policy joins the system if the queue length $X_t$ is bigger than $\Theta$, while if $X_t = \Theta$ he does so with probability $q$. Otherwise he never joins the queue.
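As a minimal illustration of Definition 2, the sketch below encodes a $[\Theta, q]$-threshold policy as a function returning the probability of action $E$ in a given state. The name `threshold_policy` is hypothetical and not part of the original paper.

```python
import math


def threshold_policy(theta, q):
    """Return a [theta, q]-threshold policy as a function of the state x.

    The returned function gives the probability of entering the queue:
    1 above the threshold, q exactly at it, 0 below it.
    """
    def pi(x):
        if x > theta:
            return 1.0
        if x == theta:
            return q
        return 0.0
    return pi


# Two extreme policies: theta = 0, q = 1 always enters,
# theta = infinity never enters.
always_enter = threshold_policy(0.0, 1.0)
never_enter = threshold_policy(math.inf, 0.0)
```

With `theta = 0` and `q = 1` the policy prescribes entering in every state, while `theta = math.inf` prescribes never entering.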
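The cost of a user $k$ arriving at time $t_k$ who enters the queue is the integral of $c(X_t)$ over his service time $[t_k, t_k + \sigma_k]$ minus the value of service $\gamma$, where $\sigma_k$ is user $k$'s service time; a user who does not enter incurs zero cost.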
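For each multi-policy $\pi = (\pi_1, \pi_2, \ldots)$, let $[\pi'_k, \pi_{-k}]$ be the policy which replaces $\pi_k$ by $\pi'_k$ in $\pi$. Now we are ready to define the solution we will be looking for:

Definition 3 A policy $\pi_k$ is an optimal response for user $k$ against a multi-policy $\pi$ if the cost of user $k$ under $[\pi_k, \pi_{-k}]$ is not greater than his cost under $[\pi'_k, \pi_{-k}]$ (inequality (2)) for every policy $\pi'_k$ of player $k$.

A multi-policy $\pi^*$ is called a Nash equilibrium if the policy of every user $k$ is the optimal response for user $k$ against $\pi^*$, for every $k$. If inequalities (2) are true up to some $\varepsilon > 0$, we say that $\pi^*$ is an $\varepsilon$-Nash equilibrium.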

Fluid Model
In what follows we will mostly analyze the fluid approximation, which can be viewed as the weak limit of the system (scaled in a proper way) as the arrival rate of players goes to infinity (see e.g. [15]). Now, we describe the fluid model. The system state (the length of the queue) is $X_t \in \mathbb{R}_+$. Consequently, the policies of the players are defined on $\mathbb{R}_+$. The customers arrive at the queue according to a fluid process with rate $\lambda$. As each of them uses some policy $\pi_k$, the real incoming rate at time $t$ is $\bar\pi(X_t)\lambda$, where $\bar\pi(X_t)$ is the average strategy of the arriving users. Each of them stays in the queue for an exponentially distributed time with parameter $\mu$, and so the outflow is according to a fluid process with rate $\mu X_t$. This can be described by the following ODE:
$$\frac{dX_t}{dt} = \bar\pi(X_t)\lambda - \mu X_t. \qquad (3)$$

Since there are infinitely many players in the game now, we encounter problems with defining the multi-policies. For that reason we assume that in a multi-policy $\pi$ all the players use the same policy $\pi$. If we want to write that only one player, say player $k$, changes his policy to some $\pi'_k$, we write that players apply policy $[\pi_{-k}, \pi'_k]$, meaning that each player uses policy $\pi$ except player $k$. Also note that the game is symmetric, since each player has the same payoff function and strategy space; thus it is very difficult to implement an asymmetric Nash equilibrium. We elucidate the inherent complications considering only two players: if in an NE $\pi^*_1 \neq \pi^*_2$, then, by the symmetric nature of the game, $(\pi^*_2, \pi^*_1)$ is also an NE. If player 2 knows that player 1 selects $\pi^*_1$ ($\pi^*_2$, respectively), then the optimal response for player 2 is to select $\pi^*_2$ ($\pi^*_1$, respectively), but player 2 cannot know the selection of player 1 due to the lack of cooperation between them. Under a symmetric NE, all players select the same strategy and thus the above complication is somewhat alleviated. Moreover, $\bar\pi$ is not always well defined in general, but in the symmetric case $\bar\pi(X_t) \equiv \pi(X_t)$. Also, with these assumptions, both the cost and the equilibrium can be defined as in the discrete model.
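The fluid dynamics described above can be explored numerically. The following is a rough sketch, assuming a forward-Euler discretization (our choice, not the paper's) of $\dot X_t = \pi(X_t)\lambda - \mu X_t$ for a symmetric policy $\pi$; the function name `simulate_fluid` and all parameter values are illustrative assumptions.

```python
def simulate_fluid(policy, x0, lam, mu, horizon=20.0, dt=1e-3):
    """Forward-Euler integration of dX/dt = policy(X) * lam - mu * X.

    `policy` maps the current state to the probability of entering.
    Returns the list of states X_0, X_dt, X_2dt, ...
    """
    x = x0
    path = [x]
    steps = int(round(horizon / dt))
    for _ in range(steps):
        inflow = policy(x) * lam   # admitted arrivals
        outflow = mu * x           # departures of users in service
        x += dt * (inflow - outflow)
        path.append(x)
    return path


# Hypothetical rates, chosen only for illustration.
lam, mu = 10.0, 1.0

# Everybody enters: the state climbs towards lam / mu.
path_up = simulate_fluid(lambda x: 1.0, x0=0.0, lam=lam, mu=mu)
# Nobody enters: the state drains towards 0.
path_down = simulate_fluid(lambda x: 0.0, x0=5.0, lam=lam, mu=mu)
```

The two runs illustrate the two stationary states that matter most in the sequel: $\lambda/\mu$ when everybody enters and $0$ when nobody does.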

Equilibria of the Fluid Model
In this section we characterize the equilibrium points of our game. We begin by characterizing the evolution of the system state in case all the users apply the same impulse policy.
Lemma 1 Suppose all the players (except maybe one) apply the same impulse policy $\pi$. Then, if the initial state of the system is $x_0$, $X_t$ is continuous in $t$ for any $x_0$ and is nondecreasing in $x_0$.

Proof It is clear that for $\bar\pi \equiv \pi$ having finitely many discontinuity points, the (non-classical) solution to equation (3) is well-defined a.e. and continuous in $t$.
Next, suppose that $x_0 < x'_0$ and there exists an $s$ such that $X_s[x_0] > X_s[x'_0]$, where $X_s[x_0]$ denotes the state at time $s$ when the initial state is $x_0$. $X_t$ is continuous in $t$, thus by the intermediate value property there exists a $t^* < s$ such that $X_{t^*}[x_0] = X_{t^*}[x'_0]$. But in both cases and at each time all users apply the same policy $\pi$, depending only on the current state of the system, thus for any $t > t^*$, $X_t[x_0] = X_t[x'_0]$, which is a contradiction, as we assumed that $X_s[x_0] > X_s[x'_0]$.

We have one immediate corollary of the above lemma.
Corollary 1 The expected cost of a player joining the queue at time $t_k$, when all the other players apply policy $\pi$, is a decreasing function of the state $X_{t_k}$.

Note that in the above corollary we have replaced $x_0$ with $X_{t_k}$. This is justified, as the coefficients of (3) depend on $t$ only through $X_t$. Corollary 1 has an important consequence, which is stated in the lemma below:

Lemma 2 Any best response to a symmetric impulse multi-strategy $\pi$ is a threshold strategy. Moreover, the best response is unique up to the value of $q$ (see (1)).
Proof A player $k$ arriving at time $t_k$ has only two pure actions: to enter the queue ($E$) or not to enter the queue ($N$). When he uses the former, his expected cost is
$$E\left[\int_{t_k}^{t_k+\sigma_k} c(X_t)\,dt\right] - \gamma$$
with $\sigma_k \sim \mathrm{Exp}(\mu)$, which is by Corollary 1 decreasing in $X_{t_k}$. On the other hand, when $k$ uses action $N$, his cost is 0. Thus, if $k$ prefers to use action $E$ for $X_{t_k} = x_1$, he will also prefer it for $X_{t_k} = x_2 > x_1$. Similarly, if he prefers to use $N$ for $X_{t_k} = x_2$, he will also prefer it for $X_{t_k} = x_1 < x_2$. Finally, as the cost of using $E$ is strictly decreasing in $X_{t_k}$, there may only exist one point where $k$ is indifferent between $E$ and $N$, and so he may choose to randomize there. Moreover, at any other point the best response is uniquely determined.
An immediate, but very important consequence of Lemma 2 is the following:

Corollary 2 In any symmetric equilibrium of our queueing game any player uses a threshold policy.
Remark 1 Note that the equilibrium specifies the action to take at any state, including states that are in practice never reached. If a state $x$ is never visited, then any variation of the equilibrium at states larger than $x$ will not change the performance of any player. Yet since we allow for any initial state, there may be customers that will find the system in states that are transient and will not be visited again. Therefore specifying the equilibrium in such states is considered to be important in game theory. Equilibria that are specified in all states, including transient ones, are known as perfect equilibria. It can also be shown that such equilibria are good approximations of those that we obtain in case there is some sufficiently small constant uncontrolled inflow. This follows from [5].
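Assuming that all (except maybe one) users apply the same $[\Theta, q]$-threshold strategy, we may write explicitly the evolution of the system state $X_t$:

Lemma 3 Suppose the initial state of the system is $x_0$ and that all the users (except maybe one) apply the $[\Theta, q]$-threshold policy. Then the system state at time $t$ can be explicitly written as:

Proof We know that when $p$ is a constant, the solution of the equation $\frac{dX_t}{dt} = p\lambda - \mu X_t$ with $X_0 = x_0$ is
$$X_t = \frac{p\lambda}{\mu} + \left(x_0 - \frac{p\lambda}{\mu}\right)e^{-\mu t}. \qquad (5)$$
Note that, when $p = 1$, this means that $X_t \to \frac{\lambda}{\mu}$ monotonically when $t \to \infty$. Thus, if $x_0 > \Theta$ and $\frac{\lambda}{\mu} > \Theta$, $X_t$ never leaves the region where policy $\pi$ prescribes to use action $E$, and so
$$X_t = \frac{\lambda}{\mu} + \left(x_0 - \frac{\lambda}{\mu}\right)e^{-\mu t}.$$
Similarly, note that for $p = 0$, $X_t$ decreases monotonically to 0, thus when $x_0 < \Theta$, $X_t$ never leaves the region where policy $\pi$ prescribes to use action $N$, and so (5) reduces to
$$X_t = x_0 e^{-\mu t}.$$
Now suppose that $x_0 > \Theta > \frac{\lambda}{\mu}$. Then $X_t$ starts in the region where $\pi$ prescribes to use action $E$ with probability one, which implies that its trajectory decreases towards $\frac{\lambda}{\mu}$ until time $t_{(x_0,\Theta)}$, when it reaches the threshold $\Theta$. From then on $\pi$ prescribes to use action $N$ with probability 1. It is easy to compute that for $t \le t_{(x_0,\Theta)}$,
$$X_t = \frac{\lambda}{\mu} + \left(x_0 - \frac{\lambda}{\mu}\right)e^{-\mu t}.$$
Since by definition $t_{(x_0,\Theta)}$ is such that $X_{t_{(x_0,\Theta)}} = \Theta$, we easily obtain that
$$t_{(x_0,\Theta)} = \frac{1}{\mu}\ln\frac{x_0 - \lambda/\mu}{\Theta - \lambda/\mu}.$$
Then, for $t \ge t_{(x_0,\Theta)}$, $X_t$ has to satisfy (5) with $p = 0$, initial state $\Theta$ and initial time $t_{(x_0,\Theta)}$ instead of 0, which gives
$$X_t = \Theta e^{-\mu (t - t_{(x_0,\Theta)})}.$$
Finally, when $x_0 = \Theta$, $X_t$ satisfies (5) with $p = q$ at $t = 0$. If $x_0 = \frac{q\lambda}{\mu}$, by (5) $X_t \equiv \frac{q\lambda}{\mu}$. Otherwise, if $x_0 < \frac{q\lambda}{\mu}$, $X_t$ moves upwards and for $t > 0$ behaves as when $x_0 > \Theta$, while if $x_0 > \frac{q\lambda}{\mu}$, $X_t$ moves downwards and for $t > 0$ behaves as when $x_0 < \Theta$.

Now, to simplify the notation, we will make use of the fact that all the players use threshold policies. Let us define $C_k(x, (\Theta_{-k}, q_{-k}))$ to be the expected service cost for player $k$ if he enters the queue when its state is $x$ and all the players except $k$ apply a $[\Theta_{-k}, q_{-k}]$-threshold policy. Since $\sigma_k \sim \mathrm{Exp}(\mu)$, $C_k$ can be written as
$$C_k(x, (\Theta_{-k}, q_{-k})) = \int_0^\infty e^{-\mu t}\, c(X_t[x])\,dt.$$
The following lemma gives exact ways to compute $C_k$ in each of the cases of Lemma 3.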
Lemma 4 $C_k$ can be computed using the following formulas: Then by (a) of Lemma 3 which can be further written as µ, which is the limit of the expression in (a) when $x \to \frac{\lambda}{\mu}$. We will use a similar convention throughout the paper, putting $\frac{1}{a-a}\int_a^a f(u)\,du = f(a)$, if needed. This will reduce the number of cases considered in subsequent results, without affecting the validity of any of them.
which can be further written as Finally, when $x_0 < \Theta$ or $x_0 = \Theta > \frac{q_{-k}\lambda}{\mu}$, we can apply part (d) of Lemma 3, obtaining which can be further written as

In the next two lemmas we characterize the best responses to any given threshold strategies.
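The expected service cost $C_k$ can be checked numerically in the two simplest regimes discussed in the proof of Lemma 3. The sketch below is only an illustration under stated assumptions: it uses the identity $E\left[\int_0^{\sigma} c(X_t)\,dt\right] = \int_0^\infty e^{-\mu t} c(X_t)\,dt$ for $\sigma \sim \mathrm{Exp}(\mu)$ and a deterministic trajectory, a hypothetical cost function $c(u) = 1/(1+u)$, and closed forms obtained by us through the substitution $u = X_t$, namely $\frac{1}{\lambda - \mu x}\int_x^{\lambda/\mu} c(u)\,du$ when everybody keeps entering and $\frac{1}{\mu x}\int_0^x c(u)\,du$ when nobody enters. The function name and the parameter values are not from the paper.

```python
import math


def expected_service_cost(c, trajectory, mu, horizon_in_service_times=40.0, n=100_000):
    """Midpoint-rule estimate of int_0^inf exp(-mu t) c(X_t) dt.

    `trajectory(t)` gives the deterministic fluid state X_t; the integral
    is truncated after a fixed number of mean service times.
    """
    T = horizon_in_service_times / mu
    dt = T / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        total += math.exp(-mu * t) * c(trajectory(t))
    return total * dt


# Hypothetical decreasing cost and rates, for illustration only.
c = lambda u: 1.0 / (1.0 + u)
lam, mu, x = 10.0, 2.0, 2.0
rho = lam / mu

# Everybody keeps entering: X_t = rho + (x - rho) exp(-mu t).
enter_path = lambda t: rho + (x - rho) * math.exp(-mu * t)
# Nobody enters: X_t = x exp(-mu t).
drain_path = lambda t: x * math.exp(-mu * t)

cost_enter = expected_service_cost(c, enter_path, mu)
cost_drain = expected_service_cost(c, drain_path, mu)

# Closed forms for this particular c, whose antiderivative is log(1 + u).
closed_enter = (math.log(1 + rho) - math.log(1 + x)) / (lam - mu * x)
closed_drain = math.log(1 + x) / (mu * x)
```

In both regimes the cost is an average of $c$ over the range of states swept by the trajectory, which is why the convention $\frac{1}{a-a}\int_a^a f(u)\,du = f(a)$ covers the degenerate case of a constant trajectory.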
Lemma 5 A $[\Theta_k, q_k]$-threshold policy is a best response of player $k$ to a $[\Theta_{-k}, q_{-k}]$-threshold policy used by all the others if $\Theta_k$ is obtained by finding the unique solution to the equation and taking any $q_k$. If equation (6) has no solutions then $\Theta_k$ is taken as the only value such that and player $k$ if he joins the queue when its state is $x$, while his cost when he does not join is 0. Moreover, the expected cost of a player joining the queue is by Corollary 1 a monotone decreasing function of $x$. Thus, equation (6) may have at most one solution, and the cost of joining the queue for $x > \Theta_k$ is negative, that for $x < \Theta_k$ is positive, while that for $x = \Theta_k$ is 0, regardless of $q_k$. Thus the $[\Theta_k, q_k]$-threshold policy always gives player $k$ the smallest cost available. Similarly, when (6) has no solutions, from the monotonicity of the cost we can repeat the arguments from the proof of the first part of the lemma, to show that ]-threshold policy of the others in this case.

Lemma 6 Let a $[\Theta_k, q_k]$-threshold policy be a best response of player $k$ to a $[\Theta_{-k}, q_{-k}]$-threshold policy used by all the others and define Θ and Θ as the unique solutions to the following equations: Then $\Theta_k$ and $q_k$ satisfy the following:
(a) If $\gamma \le \frac{1}{\mu}\lim_{u\to\infty} c(u)$ then $\Theta_k = \infty$ and $q_k$ is arbitrary (which means that the best response is a policy never prescribing to enter the queue).
where Θ is some uniquely defined function on λ/µ, Θ satisfying Θ(x) > x. $q_k$ is arbitrary for
(d) If $\gamma \ge \frac{1}{\mu} c(0)$ then $\Theta_k = 0$ and $q_k = 1$ (which means that the best response is a policy always prescribing to enter the queue).
Proof To show (a), first note that any form of $C_k$ described in Lemma 4 is bounded below by $\frac{1}{\mu}\inf_{u\ge 0} c(u)$, which equals $\frac{1}{\mu}\lim_{u\to\infty} c(u)$, as $c$ is a strictly decreasing function. Thus, in case $\gamma \le \frac{1}{\mu}\lim_{u\to\infty} c(u)$, also $\gamma < C_k(x, (\Theta_{-k}, q_{-k}))$ for any value of $x$, thus (7) is satisfied for $\Theta_k = \infty$. This means that the strategy never prescribing player $k$ to enter the queue is his best response to the $[\Theta_{-k}, q_{-k}]$-threshold policy used by all the others.
Now suppose the assumptions of part (b) of the lemma are satisfied. Note that the function µ depend on the relation between $\Theta_{-k}$ and $\frac{q_{-k}\lambda}{\mu}$: If the former is smaller, for $\Theta_k = \Theta_{-k}$ we are in the set where , thus according to Lemma 5 the value of $q_k$ depends on the relation between $\frac{1}{\mu} c(\Theta_{-k})$ and $\gamma$, exactly as it is written in Lemma 6. Finally if and so $q_k = 0$.
To finish the proof of part (b) of the lemma, we need to show that for To do that, it is enough to prove that for any fixed which proves the desired property.
To prove part (c) of the lemma note that since now γ ∈ 1/µ , and thus by Lemma 5, The choice of $q_k$ is made exactly as in part (b) of the lemma.
Finally, suppose that $\gamma \ge \frac{1}{\mu} c(0)$. Then for any value of $x$, $C_k(x, (\Theta_{-k}, q_{-k})) < \gamma$, and thus the optimal response of player $k$ to the $[\Theta_{-k}, q_{-k}]$-threshold strategy of all the others is always to join the queue.

Now we are ready to state the main result of this section.
Theorem 1 The game under consideration always has a symmetric equilibrium where each of the players uses the same $[\Theta, q]$-threshold strategy. Moreover:
(a) If $\gamma \in \left[0, \frac{1}{\mu}\lim_{u\to\infty} c(u)\right]$ then the equilibrium is unique, with $\Theta = \infty$, which means that the equilibrium policies prescribe every user never to enter the queue.

Proof A strategy for any player $k$ will induce a symmetric equilibrium if it is a best response to itself. Below we analyze which strategies may satisfy this condition.
In case (a) it is obvious by (a) of Lemma 6 that the policy prescribing never to join the queue is always the best response to itself, and since this is the only best response to any policy, this is the only possible equilibrium.
Θ = qλ/µ ∈ [Θ, λ/µ] and c(Θ) = µγ. Note however that by the definition of Θ and continuity of c, if Θ < λ/µ then there must exist a solution Θ* to the equation c(Θ) = µγ in [Θ, λ/µ], so Θ* and q = Θ*µ/λ is an equilibrium. In particular, if Θ = λ/µ, then also Θ* = λ/µ and q = 1 is one.
In case (c) Θ has to be in the interval [0, Θ] and needs to be related to q in one of the following ways:
q = 0 and Θ > qλ/µ = 0 or Θ = 0 with c(0) > µγ, which is always true in case (c).
q = 1 and Θ < qλ/µ = λ/µ, which is always true, as Θ ≤ Θ < λ/µ in this case, which was shown in the proof of Lemma 6.
Θ = qλ/µ ≤ Θ and c(Θ) = µγ. Note however that by the definition of Θ and continuity of c, there must exist some Θ* in the interval (0, Θ) such that c(Θ*) = µγ, so Θ* and q = Θ*µ/λ is the only equilibrium in this case.
Finally, in case (d), by (d) of Lemma 6 it is obvious that the policy always prescribing to join the queue is the best response to itself. Since this is the only best response to any policy in this case, this is the only possible equilibrium. This ends the proof of the theorem.
Remark 2 It should be noted here that there are multiple equilibria in certain situations. In that case, it is normally not clear which one would prevail. Nevertheless, as the cost of being served is a decreasing function of $\Theta$ and of $q$ for a fixed value of $\Theta$, we may assume that the customers will naturally choose the equilibrium strategies with the biggest values of $\Theta$ and $q$. In Section 5 we will, nevertheless, analyze the social outcome of all the possible equilibria, comparing them to the social optimum.
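The interior equilibria appearing in the proof of Theorem 1 satisfy $c(\Theta^*) = \mu\gamma$ with $q = \Theta^*\mu/\lambda$. As a hedged sketch, the root can be located by bisection, assuming $c$ is continuous and strictly decreasing. The helper `interior_equilibrium`, the cost function and the parameter values are hypothetical, and the code does not check which case of Theorem 1 applies.

```python
def interior_equilibrium(c, lam, mu, gamma, tol=1e-10):
    """Solve c(theta) = mu * gamma on [0, lam / mu] by bisection.

    Returns (theta, q) with q = theta * mu / lam, or None when the
    equation has no root in that interval.
    """
    lo, hi = 0.0, lam / mu
    target = mu * gamma
    if not (c(hi) <= target <= c(lo)):
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if c(mid) > target:
            lo = mid   # c is decreasing, so the root lies to the right
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    return theta, theta * mu / lam


# Illustrative cost function and parameters (assumptions, not from the paper).
c = lambda u: 1.0 / (1.0 + u)
result = interior_equilibrium(c, lam=10.0, mu=1.0, gamma=0.25)
no_root = interior_equilibrium(c, lam=10.0, mu=1.0, gamma=2.0)
```

At such a pair the state $\Theta^* = q\lambda/\mu$ is stationary and a user arriving there is exactly indifferent, since his service cost $c(\Theta^*)/\mu$ equals $\gamma$.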

Social Optimum
The social cost associated with some symmetric strategy profile $\pi$ can be computed using equality where $x_\infty(\pi, x_0)$ denotes the stationary state of the queue when the players apply multi-policy $\pi$ and the initial state of the queue is $x_0$, while $\pi_\infty(x_0)$ is the limit value of strategy $\pi$ when time goes to infinity (note that it may have three values, depending on whether the trajectory of $X$ approaches $x_\infty(\pi, x_0)$ from above, from below, or is constant from some point on).
If we assume that $\pi$ is a $[\Theta, q]$-threshold policy, $C(x_0, \pi)$ equals: Note however that, as can be clearly seen from Lemma 3, when everyone uses the same threshold policy, the only stationary states possible in the game are $0$, $\frac{\lambda}{\mu}$ and $\frac{q\lambda}{\mu}$. Moreover, they can be easily deduced from the values of $\Theta$, $q$ and $x_0$, and thus the following lemma is true.
Lemma 7 Suppose all the players in the game apply the same $[\Theta, q]$-threshold policy $\pi$. Then the social cost function in the game can be computed as follows: (c) In any other case $C(x_0, (\Theta, q)) = 0$.
Using this lemma we can easily find strategies minimizing the social cost for any $x_0$.
Theorem 2
(a) If $c\left(\frac{\lambda}{\mu}\right) < \gamma\mu$ then the social optimum equals $\frac{1}{\mu}\left(c\left(\frac{\lambda}{\mu}\right) - \gamma\mu\right)$ and is attained for the strategy profile consisting of $[0, 1]$-threshold strategies of all the players, prescribing to always join the queue.
(b) If $c\left(\frac{\lambda}{\mu}\right) = \gamma\mu$ then the social optimum equals 0 and is attained for any symmetric strategy profile consisting of $[\Theta, q]$-threshold strategies such that
(c) If $c\left(\frac{\lambda}{\mu}\right) > \gamma\mu$ then the social optimum equals 0 and is attained for the strategy profile consisting of $[\infty, 0]$-threshold strategies of all the players, prescribing never to join the queue.
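Proof Suppose $c\left(\frac{\lambda}{\mu}\right) < \gamma\mu$. Then $\frac{1}{\mu}\left(c\left(\frac{\lambda}{\mu}\right) - \gamma\mu\right) < 0$ and so it is always more profitable to be in case (a) of Lemma 7 than in case (c). As $c$ is a strictly decreasing function, also $\frac{q}{\mu}\left(c\left(\frac{q\lambda}{\mu}\right) - \gamma\mu\right) > \frac{1}{\mu}\left(c\left(\frac{\lambda}{\mu}\right) - \gamma\mu\right)$ for any $q < 1$. Thus a strategy profile such that the assumptions of case (a) of Lemma 7 are satisfied for any $x_0$ minimizes the social cost function. It is straightforward to see that this is the case when all the players use $[0, 1]$-threshold strategies.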
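The comparison used in this proof can be illustrated numerically. The sketch below assumes that the stationary social cost for a stationary entry probability $q$ is $\frac{q}{\mu}\left(c\left(\frac{q\lambda}{\mu}\right) - \gamma\mu\right)$, which is how we read the expressions compared in the proof; the function name, the cost function and the parameter values are hypothetical.

```python
def stationary_social_cost(c, lam, mu, gamma, q):
    """Social cost when the queue sits at q * lam / mu and a fraction q enters.

    q = 1 corresponds to everybody entering, q = 0 to nobody entering.
    """
    return (q / mu) * (c(q * lam / mu) - gamma * mu)


# Illustrative data with c(lam / mu) < gamma * mu, i.e. case (a) of Theorem 2.
c = lambda u: 1.0 / (1.0 + u)
lam, mu, gamma = 10.0, 1.0, 0.25

grid = [i / 100 for i in range(101)]
costs = [stationary_social_cost(c, lam, mu, gamma, q) for q in grid]
best_q = grid[costs.index(min(costs))]
```

For these values the minimum over the grid is attained at `best_q = 1.0`, in line with the claim that always joining is socially optimal when $c(\lambda/\mu) < \gamma\mu$.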

Price of Anarchy and Price of Stability
A commonly used concept for evaluating the equilibria in any given game is that of the Price of Anarchy, introduced by Koutsoupias and Papadimitriou [9], which is the ratio between the cost of an equilibrium and that of the optimal solution. As in our game there may exist multiple equilibria, each giving a different social cost, we would like to adopt here two quantities describing the quality of equilibria [4]: the Price of Anarchy, being the ratio between a worst (in terms of its social cost) equilibrium's cost and the optimal social cost, and the Price of Stability, defined as the ratio between the cost of a best equilibrium and that of the optimal solution. The problem in using these quantities in our model could be that here, unlike in network congestion games, the social cost may be both negative and positive (and in fact it often equals zero). Note however that in any situation a player can guarantee himself zero cost, so both in the social optimum and in equilibrium it is never positive. This suggests defining the Price of Anarchy and the Price of Stability in the following manner: Here $NE$ denotes the set of Nash equilibria in the game, while $\pi^{Opt}$ is an optimal policy profile in the game. It is also important to note that in both definitions we use the conventions that $\frac{0}{0} = 1$ and $\frac{c}{0} = +\infty$ for a negative value of $c$, so we treat 0 as $0^-$.
The following theorems characterize PoA and PoS in our model.They are direct consequences of Theorems 1 and 2, and Lemma 7, and thus we state them without proofs.
Theorem 4 The Price of Stability: As we can see, both PoA and PoS take only two values, 1 and ∞.This is a consequence of the fact that the social cost of equilibrium happens to be greater than that of optimal solution only if the former equals zero while the latter is negative.
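The conventions $\frac{0}{0} = 1$ and $\frac{c}{0} = +\infty$ for negative $c$ can be encoded directly. The sketch below assumes that both ratios are taken as the optimal social cost divided by the equilibrium social cost, which is what those conventions suggest for non-positive costs (a negative numerator over a zero denominator can only arise that way); this reading, and the function names, are our assumptions rather than formulas taken from the paper.

```python
import math


def cost_ratio(optimal_cost, equilibrium_cost):
    """optimal_cost / equilibrium_cost for non-positive social costs.

    Uses the conventions 0/0 = 1 and c/0 = +inf for negative c.
    """
    if equilibrium_cost == 0:
        return 1.0 if optimal_cost == 0 else math.inf
    return optimal_cost / equilibrium_cost


def price_of_anarchy(optimal_cost, equilibrium_costs):
    # The worst equilibrium is the one with the highest social cost.
    return cost_ratio(optimal_cost, max(equilibrium_costs))


def price_of_stability(optimal_cost, equilibrium_costs):
    # The best equilibrium is the one with the lowest social cost.
    return cost_ratio(optimal_cost, min(equilibrium_costs))
```

With a negative optimum and two equilibria, one of zero cost and one attaining the optimum, this gives an infinite Price of Anarchy and a Price of Stability equal to 1, matching the observation that only the values 1 and $\infty$ occur.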

Fluid Model with Partial Information
In this section we assume that the knowledge of each user, when he decides on entering the queue, is limited to the information whether the state of the queue is above some threshold $\Psi$ or not. Thus instead of $X_t \in \mathbb{R}_+$, the system state perceived by the players will be $\widetilde{X}_t \in \{0, 1\}$, with $\widetilde{X}_t = 0$ denoting $X_t < \Psi$ and $\widetilde{X}_t = 1$ denoting $X_t \ge \Psi$. Consequently the strategies of the players will be of one of the forms $EE$, $EN$, $NE$ or $NN$, where the first letter stands for the strategy in state 0, while the second one stands for the strategy in state 1. Using arguments from Section 4 we can argue that strategy $EN$ will never be used, so for ease of analysis we will only consider the three remaining ones. It is also important to note that these three strategies can also be interpreted as threshold strategies in the original game, only with the set of available thresholds limited to $\{0, \Psi, \infty\}$ (for policies $EE$, $NE$ and $NN$ respectively). We will analyze the equilibria in this model and, in particular, how they depend on the value of $\Psi$. Then we shall check how this affects the social welfare.
We assume that the knowledge of each of the players is limited to the value of the threshold $\Psi$ and the partial information about the state $X_t$. Thus the users will assume that the actual state of the queue at the time they decide on entering the queue is the one for which the cost of joining the queue is the highest. By Corollary 1 this cost is decreasing in $X_t$, thus the players will assume $X_t = 0$ if $\widetilde{X}_t = 0$ and $X_t = \Psi$ if $\widetilde{X}_t = 1$.
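The information structure of this section can be summarized in a few lines of code. The sketch below is only illustrative: it maps the true state to the binary signal, returns the worst-case state a pessimistic player assumes for each signal, and translates the strategies $EE$, $NE$ and $NN$ into the thresholds $0$, $\Psi$ and $\infty$ of the original game. All function names are hypothetical.

```python
import math


def signal(x, psi):
    """Binary signal seen by an arriving user: 1 if x >= psi, else 0."""
    return 1 if x >= psi else 0


def worst_case_state(sig, psi):
    """State assumed by a pessimistic user, i.e. the lowest state
    compatible with the signal (the cost of joining decreases in the state)."""
    return psi if sig == 1 else 0.0


def as_threshold(strategy, psi):
    """Threshold of the original game corresponding to EE, NE or NN."""
    return {"EE": 0.0, "NE": psi, "NN": math.inf}[strategy]
```

For example, with $\Psi = 3$ a user arriving at state $5$ sees signal 1 and evaluates his cost as if the state were exactly $3$.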
We will need some additional notation to formulate our main results. Let $L_{EE}$, $L_{NE}$ and $L_{NN}$ denote the worst-case service cost for a player $i$ entering the queue with $\widetilde{X}_t = 0$, when all the other players apply strategy $EE$, $NE$ or $NN$, respectively. Similarly, let $H_{EE}$, $H_{NE}$ and $H_{NN}$ denote the worst-case service cost for a player $i$ entering the queue with $\widetilde{X}_t = 1$, when all the other users play $EE$, $NE$ or $NN$, respectively. We can use the interpretation of the policies in our new model as threshold strategies in the original game and Lemma 4 to obtain: All the main properties of the functions $L_s$ and $H_s$, $s = EE, NE, NN$, are summarized in the following lemma.

Lemma 8 For any $\Psi > 0$

Proof By the monotonicity of $c$ we can write: and (the inequality is true both in case $0 < \Psi < \frac{\lambda}{\mu}$ and when $\Psi > \frac{\lambda}{\mu}$; in the degenerate case when $\Psi = \frac{\lambda}{\mu}$ the RHS reduces to $\frac{1}{\mu} c(\Psi)$, which is obviously also smaller than the LHS): This establishes the strict inequalities in (a). The equalities are direct consequences of the formulas for $H_{NN}$, $H_{NE}$ and $H_{EE}$ written before the lemma.
Part (b) of the lemma also follows from the monotonicity of $c$, as: for $\Psi > 0$ and

Proof By Lemma 8 cases (a)-(f) cover all the possible situations in the game.
Then each of the cases follows directly from the definition of pure-strategy Nash equilibrium.
The following information about how the functions $L_s$ and $H_s$ behave when $\Psi$ changes can be immediately derived from their definitions and the monotonicity of $c$.

Lemma 9 For any
Using this lemma we can prove how the worst-case equilibria depend on the value of the threshold $\Psi$.
Theorem 6 Worst-case equilibria in the game with partial information depend on $\Psi$ in the following way: 0 c(u) du then for $\Psi$ small enough all the players use policy $NN$ in the equilibrium, while for $\Psi$ approaching infinity all the players use policy $NE$ in the equilibrium.
µ c(0) then for $\Psi$ small enough there are three equilibria, in which all the players use the same policy, which is any of the $EE$, $NE$ or $NN$ policies, while for $\Psi$ approaching infinity either all the players use policy $NE$ or all the players use policy $EE$ in equilibrium. 0 c(u) du then by Lemma 9 for $\Psi$ small enough also $H_{NE}(\Psi) > \gamma$. $L_{EE}(\Psi)$ is independent of $\Psi$ and always bigger than $\gamma$, then by Theorem 5 all the players apply policy $NN$ in the worst-case equilibrium for such $\Psi$. On the other hand, if $\Psi$ is big enough, $H_{NN}(\Psi) < \gamma$. Since, as already mentioned, also $L_{EE}(\Psi) > \gamma$, by Theorem 5 the strategy profile where everybody plays $NE$ is the only worst-case equilibrium for such ≤ γ and $H_{NE}(\Psi) < \gamma$ for any bigger $\Psi$. $L_{EE}$ is independent of $\Psi$ and by assumption smaller than or equal to $\gamma$. So, as long as $\Psi$ satisfies $H_{NN}(\Psi) > \gamma$, Theorem 5 implies that profiles where everybody uses the same strategy, which is any of $EE$, $NE$ or $NN$, are equilibria. But for $\Psi$ close to 0, and $H_{NN}(\Psi)$ goes to $\frac{1}{\mu}\lim_{u\to\infty} c(u) < \gamma$, thus for $\Psi$ big enough we have two worst-case equilibria, where either all the players use policy $NE$ or all use policy $EE$.
(d) If $\gamma = \frac{1}{\mu} c(0)$ then for any value of $\Psi$, both $H_{NN}$ and $L_{EE}$ are smaller than $\gamma$. On the other hand, $L_{NN} \equiv \gamma$ and so by Theorem 5 for any $\Psi$ there are two worst-case equilibria, where either all the players use policy $NE$ or all use policy $EE$.
(e) If $\gamma > \frac{1}{\mu} c(0)$ then $L_{NN}$ is always smaller than $\gamma$, and thus the only worst-case equilibrium for any value of $\Psi$ is when everyone applies policy $EE$.
Remark 3 Note that the limits when $\Psi$ is taken to infinity and to zero make the signal completely uninformative about the state of the system. Thus, we can easily derive from Theorem 6 the equilibria in our game when the queue is completely unobservable. It is enough to look at the action prescribed to be taken above the threshold when $\Psi \to 0$ or the one below the threshold when $\Psi \to \infty$. It turns out that in cases (a) and (b) uninformed players should not enter the queue, in cases (d) and (e) they should enter the queue, while in case (c) there are two equilibria where players either enter or do not enter the queue.
Roughly speaking, Theorem 6 suggests that by increasing $\Psi$ we can increase the set of global states for which the players would enter the queue. Since this will also affect the stationary state of the queue, which, as we can see from Section 5, is crucial for the social welfare, it seems that by a proper choice of $\Psi$ we can make the social welfare very close to its optimal value. We study this idea from the perspective of social cost in Section 7, where a hierarchy will be introduced in the game, with the social planner choosing $\Psi$ at the first stage, and then the users playing the partial-information game from the present section at the second one.

Introducing Hierarchy to Boost the Performance of Equilibria
In this section we assume that the game is played in two stages. In the first stage the social planner, having all the information about the game, including the actual value of $x_0$, chooses $\Psi$ and announces it to the players. His goal is to minimize the social cost $C(x_0, \pi)$ by appropriately limiting the data available to the players. In the second stage the users play the game considered in section 6 using all the information they have, which consists only of the announced value of $\Psi$, assuming that the state of the queue when they decide about entering is the worst possible. We will see that this kind of hierarchical formulation can reduce the social cost of equilibrium.
We first study what equilibria in the hierarchical model look like. To this end, we consider the pessimistic and the optimistic case. In the pessimistic setting the social planner chooses $\Psi$ in order to minimize $\max_{\pi \in \mathcal{NE}} C(x_0, \pi)$, that is, he assumes that whenever the players choose their strategies, they choose the equilibrium which yields the highest social cost. In the optimistic case the social planner chooses $\Psi$ minimizing $\min_{\pi \in \mathcal{NE}} C(x_0, \pi)$, assuming that the players choose the equilibrium which yields the lowest social cost. Here $\mathcal{NE}$ denotes the set of Nash equilibria of the second-stage game. The result is summarized in the following theorem:

Theorem 7 In the worst-case hierarchical model:
(a) If $\gamma \le \frac{1}{\mu} \lim_{u\to\infty} c(u)$ then the social planner chooses any $\Psi$, with all the players using strategy $NN$ in the equilibrium.
µ c(0) then in the optimistic case the social planner chooses $\Psi = 0$ while all the players use strategy $EE$ or $NE$ (see footnote 7) in the equilibrium.
Proof In cases (a) and (b) the social planner wants the players never to enter the queue. In case (a) never entering the queue is the equilibrium, regardless of $\Psi$. In case (b) forcing players to use $NN$ policies requires choosing a threshold $\Psi$ such that $H_{NE}(\Psi) > \gamma$. If we compare the definition of $H_{NE}$ with (8), we obtain that $\Psi < \Theta$, as for $\gamma < \frac{1}{\mu} c\left(\frac{\lambda}{\mu}\right)$, $H_{NE}(\Psi)$ can only attain the value $\gamma$ for $\Psi > \frac{\lambda}{\mu}$. In cases (c)–(e) the social optimum is achieved if players always enter the queue. Thus the social planner forces the players to use $EE$ policies if possible (optimistic scenarios in (d) and (e)). In case (d) this means choosing any value of the threshold $\Psi$; in case (e) this means choosing $\Psi = 0$, so that the two equilibrium policies $NE$ and $EE$ are equivalent. If forcing the players to use $EE$ policies is impossible, the social planner chooses the lowest possible $\Psi$ such that the players use $NE$, and not $NN$, policies in equilibrium. In case (c) this means the smallest $\Psi$ such that $H_{NE}(\Psi) = \gamma$, which for 0 c(u) du equals $\Theta$ (in such a case $H_{NE}(\Psi)$ attains the value $\gamma$ both for $\Psi < \frac{\lambda}{\mu}$ and $\Psi > \frac{\lambda}{\mu}$). In the pessimistic variant of case (d) this means choosing $\Psi$ such that $H_{NN}(\Psi) = \gamma$, which, by the definition of $H_{NN}$ and (8), is equivalent to $\Psi = \Theta$.
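The pessimistic and optimistic objectives of the social planner described in this section can be illustrated with a small search over candidate thresholds. The following is only a sketch under stated assumptions: the function `choose_psi`, the finite grid of candidate values of $\Psi$, and the toy `equilibria` and `social_cost` callables are all hypothetical and are not taken from the model; in the paper the set of equilibria comes from Theorem 6 and the cost from $C(x_0, \pi)$.

```python
def choose_psi(psi_grid, equilibria, social_cost, pessimistic=True):
    """Pick the threshold psi minimizing the planner's objective on a finite grid.

    equilibria(psi)          -> list of equilibrium policies of the second stage
    social_cost(psi, policy) -> social cost when everybody uses that policy
    Pessimistic planner: minimize the maximum cost over equilibria.
    Optimistic planner:  minimize the minimum cost over equilibria.
    """
    select = max if pessimistic else min
    best = None
    for psi in psi_grid:
        profiles = equilibria(psi)
        if not profiles:
            continue  # no equilibrium reported for this psi
        value = select(social_cost(psi, policy) for policy in profiles)
        # Strict inequality: ties are resolved in favour of the earlier grid point.
        if best is None or value < best[1]:
            best = (psi, value)
    return best


# Hypothetical toy data, chosen only to exercise the function.
def toy_equilibria(psi):
    return ["EE", "NE", "NN"] if psi < 0.5 else ["EE", "NE"]


def toy_cost(psi, policy):
    return {"EE": 0.0, "NE": abs(psi - 0.5), "NN": 1.0}[policy]


grid = [0.0, 0.25, 0.5, 0.75]
pessimistic_choice = choose_psi(grid, toy_equilibria, toy_cost, pessimistic=True)
optimistic_choice = choose_psi(grid, toy_equilibria, toy_cost, pessimistic=False)
```

In this toy instance the two planners pick different thresholds, which mirrors the distinction between the pessimistic and optimistic variants in case (d) of Theorem 7.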
Remark 4 Note that in Theorem 7 we did not consider the case $\gamma = \frac{1}{\mu} c\left(\frac{\lambda}{\mu}\right)$. This is because in this case the social cost of any policy (used by all the players) equals 0. Thus the social planner may choose any value of $\Psi$. This, however, may result in different equilibria in the second-stage game.

Footnote 7: For $\Psi = 0$ they are equivalent.
We further analyze how this result affects the Price of Anarchy (PoA) and the Price of Stability (PoS) in our model. Both quantities are computed ex post, that is, we assume that all the users have their knowledge about the state of the queue limited when they make their decisions, but PoA and PoS are computed when all the state information is revealed. This allows us to compare the results obtained in the hierarchical model with the ones obtained for the full information case. The result presented below is an immediate consequence of Theorem 7, and is thus presented without proof.

Theorem 8 In the worst-case hierarchical model: 0 c(u) du and $x_0 < \Theta$. Otherwise it equals 1.

The Price of Anarchy is infinite if µ c(0) and $x_0 < \Theta$. Otherwise it equals 1.
As we can see from Theorem 8, when the only information available to the players is an indicator of the state being above or below $\Psi$, both PoA and PoS stay the same as in the model with full information. Thus we can reduce the information given to the players without degrading the performance. On the other hand, we cannot improve the performance merely by choosing an appropriate signal to send.

Approximation of the Discrete Model
In this section we present a result which links the equilibria of the fluid model with ε-equilibria of the discrete model when the incoming rate is sufficiently high. To formulate it, we need to introduce some additional notation, differentiating between the discrete and the fluid model. Let us start by fixing that the function $c$ and parameters $\lambda$ and $\mu$ define the fluid model, whose state will be denoted by $X_t$. Then let $M_n$ be a discrete model with service cost $c_n(x) = c\left(\frac{x}{n}\right)$, incoming rate $\lambda_n = n\lambda$ and service rate $\mu$. The state in model $M_n$ will be denoted by $X^n_t$, and its normalized state is $X^n_t / n$. Using this notation we can formulate the main result of this section and its proof.
Theorem 9 Suppose that the initial (normalized) state of the queue $x_0 \in [0, x_{\max}]$ for some fixed $x_{\max}$ and that the user $k$ plays against $[\Theta, q]$-threshold policies of all the others (denoted shortly as $\pi$ policies) in the fluid model with service cost $c$, incoming rate $\lambda$ and service rate $\mu$. Then for any $\varepsilon > 0$ there exists an $N$ such that for any $n \ge N$ his expected cost from entering the queue in the discrete model $M_n$, where $\pi^n$ denotes a $[n\Theta, q]$-threshold policy (which is a proper rescaling of policy $\pi$ to fit $M_n$), differs from the expected cost $E[C_k(X_t(\pi))]$ in the fluid model by at most $\varepsilon$.
Proof Let us consider two policies for the discrete model $M_n$:
They are rescalings of the following policies for the fluid model: These policies differ from the $[\Theta, q]$-threshold policy $\pi$ only on the sets $(\Theta - \beta, \Theta)$ or $(\Theta, \Theta + \beta)$ respectively. Next, consider equation (3) when all the players apply policy $\pi^\beta$. It can be directly computed that the solution $X_t(\pi^\beta)$ has the following properties: Properties (ii) and (iii) clearly imply that $X_t(\pi^\beta)$ is Lipschitz-continuous with constant $\max\left\{x_{\max}, \frac{\lambda}{\mu}, \lambda + \mu\Theta\right\}$, independent of $\beta$. Thus all the functions $X_t(\pi^\beta)$ are equicontinuous (as functions of $t$).
Next, we can find $T_\varepsilon$ such that

Clearly, as $X_t(\pi^\beta)$ are equicontinuous and converging to $X_t(\pi)$, by the Arzelà-Ascoli theorem $X_t(\pi^\beta)$ converges to $X_t(\pi)$ uniformly on the interval $[0, T_\varepsilon]$. On the other hand, $c$ is continuous, decreasing and bounded, thus it is uniformly continuous, which means that there exists a $\delta > 0$ such that for any $x, y$ with $|x - y| < \delta$ we have $|c(x) - c(y)| < \frac{\varepsilon}{8T_\varepsilon}$. Using the uniform convergence of $X_t(\pi^\beta)$ we can further conclude that there exists a $\beta > 0$ such that

Now note that by the Kurtz theorem (see Theorem 5.3 in [11]), for some positive constant $D$ and a function $F$ satisfying $\lim_{\eta \searrow 0} \frac{F(\eta)}{\eta^2} \in (0, \infty)$. By this last property, the probability bounded above converges to zero as $n$ goes to infinity at the rate of $e^{-n}$, so for $n$ large enough this probability is not bigger than $\frac{\varepsilon}{8T_\varepsilon c(0)}$. Next, using the uniform continuity of $c$ we can write:

Finally we can write where the last inequality is a consequence of (9), (10), (11) and the bound on the probability that $X^n_t(\pi^{\beta,n})$ and $X_t(\pi^\beta)$ differ by more than $\delta$ (recall that $c(0)$ is the biggest value of $c$).

Now we can repeat all the above considerations for the second pair of policies, obtaining a similar inequality

To complete the proof, note that the processes $X^n_t$ under the two auxiliary policies and $X^n_t(\pi)$ are birth-death processes starting at the same $x_0$, with the same death rate, but with ordered birth rates. As a consequence, for any $t \ge 0$ the process under the auxiliary policy with the smaller birth rate is stochastically dominated by $X^n_t(\pi)$, which in turn is stochastically dominated by the process under the auxiliary policy with the larger birth rate. This, however, implies that . But this, together with (14) and (15), implies the thesis of the theorem.
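The scaling behind Theorem 9 can be explored numerically by comparing one sample path of the normalized discrete queue with the fluid trajectory. The sketch below rests on several assumptions that are not taken from the paper: the fluid dynamics are assumed to be $\dot{x} = \lambda \mathbf{1}\{x \ge \Theta\} - \mu x$ (equation (3) is not reproduced here), customers are assumed to enter exactly when the normalized state is at least $\Theta$ (the randomization parameter $q$ is ignored), and the function names and parameter values are hypothetical.

```python
import random


def fluid_state(x0, lam, mu, theta, horizon, dt=1e-3):
    """Euler scheme for the assumed fluid dynamics dx/dt = lam*1{x >= theta} - mu*x."""
    x = x0
    for _ in range(int(round(horizon / dt))):
        inflow = lam if x >= theta else 0.0
        x += dt * (inflow - mu * x)
    return x


def normalized_discrete_state(n, x0, lam, mu, theta, horizon, rng):
    """Simulate the birth-death queue M_n up to `horizon` and return X^n / n.

    Birth rate: n*lam while the normalized state is at least theta, else 0.
    Death rate: mu times the current number of customers (infinite servers).
    """
    state = int(round(n * x0))
    t = 0.0
    while True:
        birth = n * lam if state >= n * theta else 0.0
        death = mu * state
        total = birth + death
        if total == 0.0:
            return state / n  # empty queue and nobody enters: absorbed
        t += rng.expovariate(total)
        if t > horizon:
            return state / n
        if rng.random() * total < birth:
            state += 1
        else:
            state -= 1


rng = random.Random(12345)  # fixed seed so the run is reproducible
x_fluid = fluid_state(x0=0.2, lam=1.0, mu=1.0, theta=0.0, horizon=10.0)
x_discrete = normalized_discrete_state(2000, 0.2, 1.0, 1.0, 0.0, 10.0, rng)
```

With $\Theta = 0$ every customer enters, so the fluid state approaches $\lambda/\mu$, and for large $n$ the normalized discrete state stays close to it, in line with the concentration given by the Kurtz theorem.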
Using Theorem 9 we can immediately show that all the results proved for the mean-field model can be viewed as good approximations of what happens in the discrete case when the incoming rate goes to infinity. This is formulated in the five corollaries below.
Corollary 3 Suppose that the initial (normalized) state of the queue $x_0 \in [0, x_{\max}]$ for some fixed $x_{\max}$ and that $[\Theta, q]$-threshold policies of all the players form an equilibrium in the fluid model with service cost $c$, incoming rate $\lambda$ and service rate $\mu$. Then for any $\varepsilon > 0$ there exists an $N$ such that for any $n \ge N$ the $[n\Theta, q]$-threshold policies form ε-equilibria in the discrete models $M_n$.
Corollary 4 Suppose all the players have the same statistical information about the system state $\rho$ and that for some $\Psi \ge 0$, $f$ policies of all the players (where $f$ is of one of three types: $EE$, $NE$, $NN$) form an equilibrium in the Bayesian partial information fluid model with service cost $c$, incoming rate $\lambda$ and service rate $\mu$. Then for any $\varepsilon > 0$ there exists an $N$ such that for any $n \ge N$ the $f$ policies form ε-equilibria in the Bayesian partial information counterparts of the discrete models $M_n$.
Corollary 5 Suppose that for some $\Psi \ge 0$, $f$ policies of all the players (where $f$ is of one of three types: $EE$, $NE$, $NN$) form an equilibrium in the worst-case partial information fluid model with service cost $c$, incoming rate $\lambda$ and service rate $\mu$. Then for any $\varepsilon > 0$ there exists an $N$ such that for any $n \ge N$ the $f$ policies form ε-equilibria in the worst-case partial information counterparts of the discrete models $M_n$.
Corollary 6 Suppose all the players have the same statistical information about the system state $\rho$ and that $\Psi$ and $f$ policies of all the players (where $f$ is of one of three types: $EE$, $NE$, $NN$) form an equilibrium in the hierarchical Bayesian partial information fluid model with service cost $c$, incoming rate $\lambda$ and service rate $\mu$. Then for any $\varepsilon > 0$ there exists an $N$ such that for any $n \ge N$, $n\Psi$ and $f$ policies for all the players form ε-equilibria in the hierarchical Bayesian partial information counterparts of the discrete models $M_n$.
Corollary 7 Suppose that $\Psi$ and $f$ policies of all the players (where $f$ is of one of three types: $EE$, $NE$, $NN$) form an equilibrium in the hierarchical worst-case partial information fluid model with service cost $c$, incoming rate $\lambda$ and service rate $\mu$. Then for any $\varepsilon > 0$ there exists an $N$ such that for any $n \ge N$, $n\Psi$ and $f$ policies for all the players form ε-equilibria in the hierarchical worst-case partial information counterparts of the discrete models $M_n$.

Numerical Analysis
Here we numerically evaluate the NE strategy profile in the complete and partial information settings for a special class of cost functions $c(\cdot)$. We consider the following cost function where $a > 0$. It is easy to see that $c(\cdot)$ is strictly decreasing in $u$.
From (8), $\Theta$ is the solution of the following equation

For the rest of the simulation results, we consider $x_0 = 0.2$. Note that $c(\rho) = \gamma\mu$ when $\gamma = \frac{1}{7}$. Figure 2 shows the variation of the optimal social cost and the highest possible social cost at an equilibrium with $\gamma$. The optimal social cost is zero when $\gamma \le \frac{1}{7}$. But when $\gamma > \frac{1}{7}$ the optimal cost increases linearly, as is evident from Theorem 2. From (17) we obtain that when $\gamma \le 0.1865$, then $\Theta \ge 0.2$, and when $\gamma > 0.1865$, then $\Theta < 0.2$. Thus, when $\gamma < 0.1865$ the social cost under the best equilibrium is zero by Theorem 4. But when $\gamma \ge 0.1865$ the social cost under the best equilibrium is exactly the same as the optimal social cost.

Figure 3 shows the variation of the optimal social cost and the worst-case social cost with $\gamma$. Note that for $\gamma \ge 0.3466$, $\Theta \ge 0.2$ and for $\gamma < 0.3466$, $\Theta < 0.2$. Thus, when $\gamma \le 0.3466$ the worst-case social cost is 0 by Theorem 3. On the other hand, when $\gamma > 0.3466$ the worst-case social cost is exactly the same as the optimal social cost.

Fig. 3 Variation of optimal social cost and the worst possible social cost at equilibrium with γ for complete information game

Partial Information Game
In this section, we numerically evaluate the social cost when only partial information is available to each player. We consider $x_0 = 0.2$.
We start the section by noting from Theorem 4 and Theorem 8 that the variation of the best possible social cost at an equilibrium is exactly the same as in the complete information game. Hence, we omit the study of the best-case scenario.
Figure 4 shows the variation of the optimal social cost and the worst-case social cost with $\gamma$. From (17) we obtain that $\Theta \ge 0.2$ for $\gamma \le 0.1865$ and $\Theta < 0.2$ for $\gamma > 0.1865$. Hence, by Theorem 8 the worst-case social cost is 0 when $\gamma \le 0.1865$. Note from (18) that $\frac{1}{\lambda} \int_0^{\rho} c(u)\,du = 0.2503$. Thus, by Theorem 8 the worst-case social cost is equal to the optimal social cost for $\gamma \in (0.1865, 0.2503)$, and the worst-case social cost becomes 0 when $\gamma = 0.2503$ (since $\Theta = 0.5012 > x_0$ when $\gamma = 0.2503$). The worst-case social cost is again equal to the optimal social cost when $\gamma > 0.2503$.
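Threshold values like the ones quoted in this section are obtained by solving a scalar equation of the form $C(x) = \gamma$. The following is a minimal sketch of such a computation by bisection, assuming the averaged cost $C(x) = \frac{1}{x\mu} \int_0^x c(u)\,du$ and a purely hypothetical cost function $c(u) = \frac{1}{1 + au}$ with $a > 0$, for which the integral has the closed form $\ln(1 + ax)/a$. Neither the cost function nor the parameter values are those used for the figures, so the numbers produced do not reproduce the values above.

```python
import math


def averaged_cost(x, mu, a):
    """C(x) = 1/(x*mu) * integral_0^x c(u) du for the hypothetical c(u) = 1/(1 + a*u)."""
    return math.log1p(a * x) / (a * x * mu)


def solve_threshold(gamma, mu, a, lo=1e-9, hi=1e6, iterations=200):
    """Bisection for C(x) = gamma; C is decreasing because c is decreasing.

    Returns None when gamma is outside the range of C on [lo, hi].
    """
    if not (averaged_cost(hi, mu, a) < gamma < averaged_cost(lo, mu, a)):
        return None
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if averaged_cost(mid, mu, a) > gamma:
            lo = mid  # C(mid) still too large: the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


theta = solve_threshold(gamma=0.5, mu=1.0, a=1.0)
no_root = solve_threshold(gamma=2.0, mu=1.0, a=1.0)
```

Bisection is used here only because $C$ is monotone, which makes the bracketing condition easy to check; any scalar root finder would serve equally well.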

Conclusions
We studied in this paper a congestion game in a fluid queueing network in which customers benefit from congestion, i.e. the cost per customer decreases with the congestion. We showed that this could lead to a large number of symmetric equilibria, all of which exhibit a reverse threshold behavior: customers get in if and only if the number of queued customers exceeds the threshold. We computed perfect equilibria of this game and the social optimum. Further, we considered a model where the information provided to the players is limited to an indication of whether the state of the queue is above or below some threshold. It turned out that appropriate limitation of the information obtained by the players can draw the outcome of the game towards the social optimum. Finally, we showed that one can use the equilibrium policies in the fluid queue to approximate equilibria for discrete queues, and provided numerical examples.

Fig. 4 Variation of optimal social cost and the worst possible social cost at equilibrium with γ for partial information game

1 µ
u) du and $\lim_{x\to\infty} C(x) = \lim_{u\to\infty} c(u)$. Thus, by the intermediate value property, and since du , there exists an $x$ such that $C(x) = \gamma$, but this is exactly how $\Theta$ is defined. Similarly, the function $C(x) := \frac{1}{x\mu} \int_0^x c(u)\,du$ is continuous on $\mathbb{R}_+$ and satisfies $C\left(\frac{\lambda}{\mu}\right) = \frac{1}{\lambda} \int_0^{\lambda/\mu} c(u)\,du$ and $\lim_{x\to\infty} C(x) = \lim_{u\to\infty} c(u)$. Thus, again by the intermediate value property, and since γ ∈

1
u) du, $\frac{1}{\mu} c(0)$ , the equation $C(x) = \gamma$ has no solutions. Moreover, its LHS is always smaller than its RHS. On the other hand, since $\lim_{x\to 0^+} C(x) = \frac{1}{\mu} c(0)$ and $C\left(\frac{\lambda}{\mu}\right) = \frac{1}{\lambda} \int_0^{\lambda/\mu} c(u)\,du$, by the intermediate value property the equation $C(x) = \gamma$ has a unique solution $\Theta \in \left(0, \frac{\lambda}{\gamma}\right)$. Next, again by Lemma 4,

Now we are ready to formulate the main result.

Theorem 5 For any $\Psi \ge 0$ the game with partial information has a pure-strategy worst-case equilibrium. Moreover:
(a) When $\gamma > L_{NN}(\Psi)$ then all the players use policy $EE$ in equilibrium;
(b) When $L_{NN}(\Psi) \ge \gamma \ge L_{EE}(\Psi)$ and $\gamma > H_{NN}(\Psi)$ then strategy profiles where all the players use policy $EE$ and where all the players use policy $NE$ are equilibria;
(c) When $H_{NN}(\Psi) \ge \gamma \ge \max\{L_{EE}(\Psi), H_{NE}(\Psi)\}$ then any strategy profile where all the players use the same policy is an equilibrium;
(d) When $H_{NE}(\Psi) > \gamma \ge L_{EE}(\Psi)$ then strategy profiles where all the players use policy $EE$ and where all the players use policy $NN$ are equilibria;
(e) When $L_{EE}(\Psi) > \gamma > H_{NN}(\Psi)$ then all the players use policy $NE$ in equilibrium;
(f) When $\min\{L_{EE}(\Psi), H_{NN}(\Psi)\} \ge \gamma \ge H_{NE}(\Psi)$ then strategy profiles where all the players use policy $NE$ and where all the players use policy $NN$ are equilibria;
(g) When $\min\{L_{EE}(\Psi), H_{NE}(\Psi)\} > \gamma$ then all the players use policy $NN$ in equilibrium.
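The case analysis of Theorem 5 can be read as a lookup from the four quantities $L_{NN}(\Psi)$, $L_{EE}(\Psi)$, $H_{NN}(\Psi)$, $H_{NE}(\Psi)$ and $\gamma$ to a set of symmetric equilibrium policies. The sketch below encodes the seven conditions literally. The function name and the convention of testing the cases in the order (a) to (g) and returning the first match are assumptions made for illustration; the theorem itself does not prescribe an order, and the numerical inputs are supplied by the caller rather than computed from the model.

```python
def worst_case_equilibria(gamma, l_nn, l_ee, h_nn, h_ne):
    """Return the set of symmetric equilibrium policies named in Theorem 5.

    Arguments are the values of L_NN, L_EE, H_NN, H_NE at a fixed Psi.
    Cases are checked in the order (a)-(g); the first matching case wins.
    Returns None if no condition matches the supplied numbers.
    """
    if gamma > l_nn:                                   # case (a)
        return {"EE"}
    if l_nn >= gamma >= l_ee and gamma > h_nn:         # case (b)
        return {"EE", "NE"}
    if h_nn >= gamma >= max(l_ee, h_ne):               # case (c)
        return {"EE", "NE", "NN"}
    if h_ne > gamma >= l_ee:                           # case (d)
        return {"EE", "NN"}
    if l_ee > gamma > h_nn:                            # case (e)
        return {"NE"}
    if min(l_ee, h_nn) >= gamma >= h_ne:               # case (f)
        return {"NE", "NN"}
    if min(l_ee, h_ne) > gamma:                        # case (g)
        return {"NN"}
    return None


# Hypothetical inputs chosen only to hit three different cases.
case_a = worst_case_equilibria(1.0, l_nn=0.5, l_ee=0.5, h_nn=0.5, h_ne=0.5)
case_c = worst_case_equilibria(0.5, l_nn=0.8, l_ee=0.3, h_nn=0.6, h_ne=0.4)
case_g = worst_case_equilibria(0.1, l_nn=0.5, l_ee=0.5, h_nn=0.5, h_ne=0.5)
```

Inputs that violate the orderings between these quantities established in the lemmas may match no case, which is why the function can return `None`.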

1 µ
(d) If $\gamma = \frac{1}{\mu} c(0)$ then there are two equilibria regardless of $\Psi$, where all the players use the same policy, which is either $EE$ or $NE$.
(e) If $\gamma > \frac{1}{\mu} c(0)$ then all the players use strategy $EE$ in the equilibrium, regardless of $\Psi$.
Proof (a) If $\gamma \le \lim_{u\to\infty} c(u)$ then by Lemma 9 $H_{EE}$ is always bigger than $\gamma$. Consequently, by Lemma 8 also $H_{NE} > \gamma$ and $L_{EE} > \gamma$, and thus by Theorem 5 all the players use policy $NN$ in the worst-case equilibrium.
(b) If γ ∈ $\frac{1}{\mu} \lim_{u\to\infty} c(u)$, 1 λ λ µ

(b) If $\gamma \in \left(\frac{1}{\mu} \lim_{u\to\infty} c(u), \frac{1}{\mu} c\left(\frac{\lambda}{\mu}\right)\right)$ then the social planner chooses any $\Psi < \Theta$ and all the players use policy $NN$ in the equilibrium.
(c) If γ ∈ 1 u) du then the social planner chooses $\Psi = \Theta$ with all the players using policy $NE$ in the equilibrium.
(d) If $\gamma \in \left(\frac{1}{\lambda} \int_0^{\lambda/\mu} c(u)\,du, \frac{1}{\mu} c(0)\right)$ then in the pessimistic case the social planner chooses $\Psi = \Theta$ and all the players use policy $NE$ in equilibrium; in the optimistic case the social planner chooses any $\Psi$ and all the players use policy $EE$ in equilibrium.
(e) If γ ≥ 1

Fig. 1 Variation of Θ and Lower Threshold value with γ