A new quantum scheme for normal-form games

We give a strict mathematical description of a refinement of the Marinatto-Weber quantum game scheme. The model allows the players to choose projectors that determine the state on which they perform their local operators. The game induced by the scheme generalizes finite strategic-form games. In particular, it covers normal representations of extensive games, i.e., strategic games generated by extensive ones. We illustrate our idea with an example of an extensive game and prove that rational choices in the classical game and its quantum counterpart may lead to significantly different outcomes.


Introduction
Fifteen years of research on quantum games have produced many ideas of what a quantum game might look like and how it might be played. Certainly, the quantum scheme for 2 × 2 games introduced in [1] (the EWL scheme) has become one of the most common models, and it has already found application in more complex games (see, for example, [2]). However, the more complex a classical game is, the more sophisticated the techniques required to find the players' optimal strategies in an EWL-type scheme. While in the scheme for 2 × 2 games the result of the game depends on six real parameters (each player's strategy is a unitary operator from SU(2), defined by three real parameters), the EWL-type scheme for 3 × 3 games would already require 16 parameters to be taken into account [3], [4].

One way to avoid cumbersome calculations when studying a game in the quantum domain was presented in [5] (see also the recent papers [6], [7], [8] and [9] based on this scheme). The authors defined a model (the MW scheme) for quantum games in which the players' unitary strategies are restricted to the identity and the bit-flip operator. The game becomes quantum when the players' local operators are performed on some fixed entangled state $|\Psi\rangle$ (called the players' joint strategy). The MW scheme appears to be much simpler than the EWL scheme. The number of pure strategies of each player is the same as in the classical game [10]. Thus, the complexity of finding a rational solution is similar in a classical game and in its quantum counterpart.

Unfortunately, that simple scheme exhibits some undesirable properties that we pointed out in [11]. First, the MW scheme implies a non-classical game even if the players' joint strategy is an unentangled state. In particular, if a player's qubit is in an equal superposition of computational basis states, she cannot affect the game outcome, in contrast to her strategic position in the classical game. Moreover, the players have no impact on the form of the initial state. In [11] we showed that these drawbacks vanish when the players are allowed to choose between the basis state that represents the classical game and the state $|\Psi\rangle$.

In this paper, we continue that line of research. We give a formal description of players' strategies that includes the choice of the initial state in the MW scheme. This allows us to move beyond the bimatrix games examined in [11] and to consider more general normal-form games. Then we study possible applications of the scheme.

Some knowledge of game theory is required to follow this paper. While the theory of bimatrix games is commonly used in quantum game theory, the notion of the normal representation of an extensive game may not be known to readers who deal with quantum games. Therefore, we encourage the reader who is not familiar with extensive game theory to consult one of the textbooks [12], [13].

Refinement of the Marinatto-Weber scheme
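In [11] we introduced a new scheme for playing finite bimatrix games in the quantum domain. The idea behind the scheme is that the players can choose whether they play a classical game or its quantum counterpart defined by the MW scheme. In the case of the quantum model for 2 × 2 bimatrix games, this means that the players choose their local operations, the identity $\mathbb{1}$ or the Pauli operator $\sigma_x$, and additionally they decide whether the chosen operators are performed on the state $|00\rangle$ or on some fixed state $|\Psi\rangle \in \mathbb{C}^2 \otimes \mathbb{C}^2$. Now, we give a formal description of the scheme.

Let us consider a 2 × 2 bimatrix game
$$\begin{pmatrix} (a_{00}, b_{00}) & (a_{01}, b_{01}) \\ (a_{10}, b_{10}) & (a_{11}, b_{11}) \end{pmatrix}, \quad \text{where } (a_{ij}, b_{ij}) \in \mathbb{R}^2. \tag{1}$$
The quantum scheme for game (1) is defined on the inner product space $(\mathbb{C}^2)^{\otimes 4}$ by the following components:

1. A positive operator $H$,
$$H = \left(\mathbb{1} \otimes \mathbb{1} - |11\rangle\langle 11|\right) \otimes |00\rangle\langle 00| + |11\rangle\langle 11| \otimes |\Psi\rangle\langle\Psi|, \tag{2}$$
where $|\Psi\rangle \in \mathbb{C}^2 \otimes \mathbb{C}^2$ and $\||\Psi\rangle\| = 1$.

2. Players' pure strategies:
$$P_i^{(1)} \otimes U_j^{(3)} \text{ for player 1}, \qquad P_k^{(2)} \otimes U_l^{(4)} \text{ for player 2}, \tag{3}$$
where $i, j, k, l = 0, 1$, $P_0 = |0\rangle\langle 0|$, $P_1 = |1\rangle\langle 1|$, $U_0 = \mathbb{1}$, $U_1 = \sigma_x$, and the upper indices identify the subspace $\mathbb{C}^2$ of $(\mathbb{C}^2)^{\otimes 4}$ on which the operators are defined. That is, player 1 acts on the first and third qubits, and player 2 acts on the second and fourth ones. The order of qubits is in line with the upper indices.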
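3. Measurement operators $M_1$ and $M_2$ given by the formula
$$M_1 = \mathbb{1} \otimes \mathbb{1} \otimes \sum_{x,y=0,1} a_{xy} |xy\rangle\langle xy|, \qquad M_2 = \mathbb{1} \otimes \mathbb{1} \otimes \sum_{x,y=0,1} b_{xy} |xy\rangle\langle xy|, \tag{4}$$
where $a_{xy}$ and $b_{xy}$ are the payoffs from (1).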
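The scheme proceeds in a similar way to the MW scheme or the EWL scheme: the players determine the final state by choosing their strategies and acting on the operator $H$. As a result, they determine the following density operator:
$$\rho_f = \left(P_i^{(1)} \otimes P_k^{(2)} \otimes U_j^{(3)} \otimes U_l^{(4)}\right) H \left(P_i^{(1)} \otimes P_k^{(2)} \otimes U_j^{(3)} \otimes U_l^{(4)}\right)^{\dagger}.$$
Next, the payoffs for players 1 and 2 are $\operatorname{tr}(\rho_f M_1)$ and $\operatorname{tr}(\rho_f M_2)$, respectively.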
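As in the MW scheme, each player is allowed to use mixed strategies, i.e., to choose her pure strategies according to some probability distribution. Let $(p_{ij})_{i,j=0,1}$ be a probability distribution over the set $\{P_i^{(1)} \otimes U_j^{(3)} : i, j = 0, 1\}$, and let $(q_{kl})_{k,l=0,1}$ be a probability distribution over $\{P_k^{(2)} \otimes U_l^{(4)} : k, l = 0, 1\}$. Then the resulting density operator takes the form
$$\rho_f = \sum_{i,j,k,l=0,1} p_{ij}\, q_{kl} \left(P_i^{(1)} \otimes P_k^{(2)} \otimes U_j^{(3)} \otimes U_l^{(4)}\right) H \left(P_i^{(1)} \otimes P_k^{(2)} \otimes U_j^{(3)} \otimes U_l^{(4)}\right)^{\dagger}. \tag{7}$$
Note that scheme (2)-(4) generalizes the classical way of playing the game. If the players' strategy profile takes the form $P_0^{(1)} \otimes P_0^{(2)} \otimes U_j^{(3)} \otimes U_l^{(4)}$, the players' payoffs depend only on $U_j^{(3)} \otimes U_l^{(4)}$ acting on $|00\rangle$:
$$\operatorname{tr}(\rho_f M_1) = a_{jl}, \qquad \operatorname{tr}(\rho_f M_2) = b_{jl}.$$
Obviously, if $U_j^{(3)}$ and $U_l^{(4)}$ are chosen according to some probability distributions $\{p_{00}, p_{01}\}$ and $\{q_{00}, q_{01}\}$, respectively, the resulting distribution over $a_{jl}$ ($b_{jl}$) coincides with the one given by the corresponding mixed strategy profile in game (1). As a result, scheme (2)-(4) determines a game that is a complete quantization of (1) (see [14] for the definition of complete quantization).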
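The components of scheme (2)-(4) can be checked numerically. The following is a minimal sketch in plain Python, not the author's implementation. It assumes that $H$ has the form $(\mathbb{1} \otimes \mathbb{1} - |11\rangle\langle 11|) \otimes |00\rangle\langle 00| + |11\rangle\langle 11| \otimes |\Psi\rangle\langle\Psi|$, that $P_i = |i\rangle\langle i|$, and that $U_0 = \mathbb{1}$, $U_1 = \sigma_x$. The payoff tables `A`, `B` and the state `PSI` are hypothetical example data, and all function names are illustrative.

```python
from functools import reduce
from math import sqrt

def kron(a, b):
    """Kronecker product of two matrices stored as nested lists."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(x).conjugate() for x in col] for col in zip(*a)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def outer(v):
    """|v><v| for a state vector v."""
    return [[x * complex(y).conjugate() for y in v] for x in v]

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
P = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]  # P[0] = |0><0|, P[1] = |1><1| (assumed)
U = [I2, SX]                              # U[0] = identity, U[1] = sigma_x

def build_H(psi):
    """Assumed form of H: |Psi> is attached only to the |11><11| sector."""
    p11 = kron(P[1], P[1])
    id4 = kron(I2, I2)
    rest = [[id4[r][c] - p11[r][c] for c in range(4)] for r in range(4)]
    a = kron(rest, outer([1, 0, 0, 0]))   # (1 - |11><11|) (x) |00><00|
    b = kron(p11, outer(psi))             # |11><11| (x) |Psi><Psi|
    return [[a[r][c] + b[r][c] for c in range(16)] for r in range(16)]

def payoff_operator(table):
    """M = 1 (x) 1 (x) sum_xy table[x][y] |xy><xy|."""
    diag = [table[x][y] for x in (0, 1) for y in (0, 1)]
    m = [[diag[r] if r == c else 0 for c in range(4)] for r in range(4)]
    return kron(kron(I2, I2), m)

def payoffs(i, j, k, l, H, M1, M2):
    """Payoffs for the pure profile P_i(1) (x) U_j(3), P_k(2) (x) U_l(4)."""
    S = reduce(kron, [P[i], P[k], U[j], U[l]])
    rho = matmul(matmul(S, H), dagger(S))
    return trace(matmul(rho, M1)).real, trace(matmul(rho, M2)).real

# Hypothetical example data: a Prisoner's Dilemma and a Bell state.
A = [[3, 0], [5, 1]]
B = [[3, 5], [0, 1]]
PSI = [1 / sqrt(2), 0, 0, 1 / sqrt(2)]
H = build_H(PSI)
M1, M2 = payoff_operator(A), payoff_operator(B)
print(payoffs(0, 1, 0, 0, H, M1, M2))  # classical sector: operators act on |00>
print(payoffs(1, 0, 1, 0, H, M1, M2))  # both players choose P_1: operators act on |Psi>
```

Under these assumptions, every profile other than the one with both projectors equal to $P_1$ reproduces an entry of the classical table, which is the sense in which the scheme generalizes game (1).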

Nash equilibrium
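In non-cooperative quantum game theory, Nash equilibrium is the most widely used solution concept. It is defined as a profile of strategies of all players in which each strategy is a best response to the other strategies. In view of scheme (2)-(4), it is a mixed strategy profile $\left((p^*_{ij})_{i,j=0,1}, (q^*_{kl})_{k,l=0,1}\right)$ that solves the following optimization problems:
$$(p^*_{ij}) \in \arg\max_{(p_{ij})} \operatorname{tr}\Big(\sum_{i,j,k,l} p_{ij}\, q^*_{kl}\, S_{ikjl} H S_{ikjl}^{\dagger} M_1\Big), \tag{10}$$
$$(q^*_{kl}) \in \arg\max_{(q_{kl})} \operatorname{tr}\Big(\sum_{i,j,k,l} p^*_{ij}\, q_{kl}\, S_{ikjl} H S_{ikjl}^{\dagger} M_2\Big), \tag{11}$$
where $S_{ikjl} = P_i^{(1)} \otimes P_k^{(2)} \otimes U_j^{(3)} \otimes U_l^{(4)}$. As in classical game theory, we can simplify conditions (10) and (11) and only check whether $(p^*_{ij})$ or $(q^*_{kl})$ yields a payoff equal to the maximum payoff obtainable with pure strategies. More formally, condition (10) is equivalent to
$$\operatorname{tr}\Big(\sum_{i,j,k,l} p^*_{ij}\, q^*_{kl}\, S_{ikjl} H S_{ikjl}^{\dagger} M_1\Big) = \max_{i,j \in \{0,1\}} \operatorname{tr}\Big(\sum_{k,l} q^*_{kl}\, S_{ikjl} H S_{ikjl}^{\dagger} M_1\Big).$$
This follows from the fact that $\operatorname{tr}(\rho_f M_1)$ for the density operator $\rho_f$ given by (7) is a convex combination of the elements $\operatorname{tr}\big(\sum_{k,l} q_{kl} S_{ikjl} H S_{ikjl}^{\dagger} M_1\big)$ with weights $p_{ij}$. Condition (11) can be simplified in a similar way.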
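The simplified equilibrium test described above (compare the profile's payoff with the best payoff from a pure deviation) can be sketched as follows. This is an illustrative check written for an arbitrary bimatrix given as two payoff tables, not code from the paper. The function names and the example games are hypothetical.

```python
def expected_payoff(table, p, q):
    """Expected payoff when rows are drawn from p and columns from q."""
    return sum(p[r] * q[c] * table[r][c]
               for r in range(len(p)) for c in range(len(q)))

def is_nash(A, B, p, q, tol=1e-9):
    """True if no pure-strategy deviation improves either player's payoff.

    By convexity of the expected payoff in a player's own mixed strategy,
    it suffices to compare against pure deviations only.
    """
    rows, cols = range(len(p)), range(len(q))
    best1 = max(sum(q[c] * A[r][c] for c in cols) for r in rows)
    best2 = max(sum(p[r] * B[r][c] for r in rows) for c in cols)
    return (expected_payoff(A, p, q) >= best1 - tol
            and expected_payoff(B, p, q) >= best2 - tol)

# Hypothetical examples: matching pennies and a Prisoner's Dilemma.
pennies_A = [[1, -1], [-1, 1]]
pennies_B = [[-1, 1], [1, -1]]
pd_A = [[3, 0], [5, 1]]
pd_B = [[3, 5], [0, 1]]
print(is_nash(pennies_A, pennies_B, [0.5, 0.5], [0.5, 0.5]))
print(is_nash(pd_A, pd_B, [1, 0], [1, 0]))
```

The same function applies to the quantum game once its payoffs are collected into a bimatrix, because the payoffs are linear in the probabilities $p_{ij}$ and $q_{kl}$.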

Bimatrix form
The game given by scheme (2)-(4) can be expressed in bimatrix form. Each entry of the bimatrix is a pair $(\operatorname{tr}(\rho_f M_1), \operatorname{tr}(\rho_f M_2))$ of payoffs that corresponds to a particular profile $P_i^{(1)} \otimes P_k^{(2)} \otimes U_j^{(3)} \otimes U_l^{(4)}$. As a result, we obtain the 4 × 4 bimatrix (14). Bimatrix (14) is a very convenient way to study the game determined by scheme (2)-(4). Once the entries $(\operatorname{tr}(\rho_f M_1), \operatorname{tr}(\rho_f M_2))$ are specified, we can leave the quantum formalism out and use (14). This is due to the linearity of the trace, which makes a density operator (7) and the corresponding probability distribution over pure strategies equivalent in the sense of generated outcomes. For example, in order to find Nash equilibria we can use the techniques for bimatrix games instead of conditions (10) and (11). Note that bimatrix (14) clearly shows the role of the components $P_i$ of the players' strategies. Namely, the operations are performed on the state $|\Psi\rangle$ if and only if both players form the profile $P_1^{(1)} \otimes P_1^{(2)}$.

The scheme can be generalized to include more than one joint strategy $|\Psi\rangle$ by redefining the operator $H$ and the players' pure strategies accordingly.
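The 4 × 4 bimatrix of the quantum game can also be assembled without any operator algebra. The sketch below is an illustration under stated assumptions, not the paper's construction of (14). It assumes that $|\Psi\rangle$ is used only when both players choose $P_1$, that $U_1$ is a bit flip, and it indexes the strategy $P_i \otimes U_j$ by `2 * i + j`. The function name and the example data are hypothetical.

```python
def quantum_bimatrix(a, b, psi):
    """Return the 4 x 4 payoff tables (A, B) of the quantum game.

    a, b : 2 x 2 classical payoff tables of players 1 and 2.
    psi  : amplitudes of |00>, |01>, |10>, |11>.
    Row r = 2*i + j stands for P_i (x) U_j of player 1,
    column c = 2*k + l for P_k (x) U_l of player 2.
    """
    def psi_payoff(table, j, l):
        # U_j (x) U_l maps |xy> to |x^j, y^l>, so outcome (x^j, y^l)
        # occurs with probability |psi_xy|^2.
        return sum(abs(psi[2 * x + y]) ** 2 * table[x ^ j][y ^ l]
                   for x in (0, 1) for y in (0, 1))

    A = [[0.0] * 4 for _ in range(4)]
    B = [[0.0] * 4 for _ in range(4)]
    for i in (0, 1):
        for j in (0, 1):
            for k in (0, 1):
                for l in (0, 1):
                    r, c = 2 * i + j, 2 * k + l
                    if i == 1 and k == 1:      # joint strategy |Psi> is played
                        A[r][c] = psi_payoff(a, j, l)
                        B[r][c] = psi_payoff(b, j, l)
                    else:                       # classical sector: state |00>
                        A[r][c] = a[j][l]
                        B[r][c] = b[j][l]
    return A, B
```

With the tables in hand, equilibria can be searched with any standard bimatrix technique, which is the point made above about leaving the quantum formalism out.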

Quantum model for general bimatrix games
We showed in [?] how to construct the scheme for any finite bimatrix game according to the MW model. The key elements of the scheme are appropriately defined operators for the players. In the case of an $(n+1) \times (m+1)$ bimatrix game, where $n, m \geq 1$, player 1 (player 2) has $n + 1$ operators $U_i$ ($m + 1$ operators $V_j$) defined on the space $\mathbb{C}^{n+1}$ ($\mathbb{C}^{m+1}$) that act on the basis states $\{|0\rangle, |1\rangle, \dots, |n\rangle\}$ ($\{|0\rangle, |1\rangle, \dots, |m\rangle\}$) as specified in (19) and (20). In view of (19) and (20), scheme (2)-(4) can be generalized by building the players' strategies from the operators $U_i$ and $V_j$ and by taking a positive operator of the same form as (2), but with the outer products $|00\rangle\langle 00|$ and $|\Psi\rangle\langle\Psi|$ defined on $\mathbb{C}^{n+1} \otimes \mathbb{C}^{m+1}$.

Quantum approach to finite normal form games
In the previous section we formalized the refinement of the MW scheme that was introduced in [11]. We obtained a scheme that can be applied to any finite bimatrix game. In this section, we construct a framework for general normal-form games. The term normal-form game has two main meanings. One concerns a strategic game given a priori. It is defined by a triple $(N, \{S_i\}_{i \in N}, \{u_i\}_{i \in N})$, where $N$ is a set of players and, for $i \in N$, the components $S_i$ and $u_i$ are player $i$'s strategy set and payoff function, respectively. The second meaning concerns a strategic game $(N, \{S_i\}_{i \in N}, \{u_i\}_{i \in N})$ that is generated by a game in extensive form. The strategic game obtained in this way is called the normal representation of the extensive game. In what follows, we extend scheme (2)-(4) to cover both cases.

Strategic-form game
The difference between bimatrix games and finite strategic games is that more than two players (say n players) are allowed in the latter case.Therefore, operator (2) has to be modified in such a way that it simply outputs a density operator after n players' strategies act on it.
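For simplicity of our analysis, we restrict our attention to $n$-person strategic games with each $S_i$ having two elements. The extension of scheme (2)-(4) is now defined on the space $(\mathbb{C}^2)^{\otimes n} \otimes (\mathbb{C}^2)^{\otimes n}$ with the positive operator
$$H = \left(\mathbb{1}^{\otimes n} - (|1\rangle\langle 1|)^{\otimes n}\right) \otimes (|0\rangle\langle 0|)^{\otimes n} + (|1\rangle\langle 1|)^{\otimes n} \otimes |\Psi\rangle\langle\Psi|, \tag{22}$$
where $|\Psi\rangle \in (\mathbb{C}^2)^{\otimes n}$ and $\||\Psi\rangle\| = 1$. Each player $i \in \{1, \dots, n\}$ has a strategy determined by (3) that acts on qubits $i$ and $n + i$, i.e., it is of the form $P^{(i)}_{j_i} \otimes U^{(n+i)}_{j_{n+i}}$, where $j_i, j_{n+i} = 0, 1$. As a result, a profile of players' strategies forms the operator
$$\bigotimes_{i=1}^{n} P^{(i)}_{j_i} \otimes \bigotimes_{i=1}^{n} U^{(n+i)}_{j_{n+i}}$$
that results in the following density operator:
$$\rho_f = \Big(\bigotimes_{i=1}^{n} P^{(i)}_{j_i} \otimes \bigotimes_{i=1}^{n} U^{(n+i)}_{j_{n+i}}\Big) H \Big(\bigotimes_{i=1}^{n} P^{(i)}_{j_i} \otimes \bigotimes_{i=1}^{n} U^{(n+i)}_{j_{n+i}}\Big)^{\dagger}.$$
Finally, we define for each player $i$ the payoff measurement
$$M_i = \mathbb{1}^{\otimes n} \otimes \sum_{x_1, \dots, x_n = 0,1} a^i_{x_1, \dots, x_n} |x_1 \dots x_n\rangle\langle x_1 \dots x_n|, \tag{24}$$
where $a^i_{x_1, \dots, x_n}$ is player $i$'s payoff in the classical game that corresponds to the strategy profile consisting of the $(x_1 + 1)$th strategy of player 1, the $(x_2 + 1)$th strategy of player 2, ..., and the $(x_n + 1)$th strategy of player $n$.

It is not difficult to check that scheme (22)-(24) generalizes an $n$-person strategic game with two strategies for each player. If the joint strategy $|\Psi\rangle$ is not played, i.e., the element $\bigotimes_{i=1}^{n} P^{(i)}_1$ is not part of the profile, then the local operators act on the state $|0\rangle^{\otimes n}$ and $\operatorname{tr}(\rho_f M_i) = a^i_{j_{n+1}, \dots, j_{2n}}$. Thus, for a strategic-form game
$$(N, \{S_i\}_{i \in N}, \{u_i\}_{i \in N}), \quad S_i = \{s_0, s_1\}, \quad u_i(s_{x_1}, \dots, s_{x_n}) = a^i_{x_1, \dots, x_n}, \tag{26}$$
the game generated by scheme (22)-(24), restricted to such profiles, is equivalent to game (26) if the strategies $s_0$ and $s_1$ are identified, respectively, with $U^{(n+i)}_0$ and $U^{(n+i)}_1$ for each $i$.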
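The payoff rule of the $n$-player scheme can be sketched without building the $2^{2n}$-dimensional operators. The code below is an illustration, not the paper's method. It assumes that $|\Psi\rangle$ is played only when every player chooses $P_1$, that otherwise the operators act on $|0 \dots 0\rangle$, and that $U_1$ is a bit flip. The payoff function `pd3` is a hypothetical three-player Prisoner's Dilemma-like table and the GHZ state is a hypothetical choice of $|\Psi\rangle$. Neither is taken from the source.

```python
from math import sqrt

def n_player_payoffs(projectors, flips, psi, payoff):
    """Expected payoffs for one pure profile of the n-player scheme.

    projectors : bits (j_1, ..., j_n), 1 meaning the player chose P_1.
    flips      : bits (j_{n+1}, ..., j_{2n}), 1 meaning U_1 (bit flip).
    psi        : dict mapping n-bit tuples to amplitudes of |Psi>.
    payoff     : function mapping an n-bit outcome to a tuple of n payoffs.
    """
    n = len(projectors)
    # |Psi> is used only if all players choose P_1 (assumed form of H).
    state = psi if all(projectors) else {(0,) * n: 1.0}
    totals = [0.0] * n
    for basis, amplitude in state.items():
        outcome = tuple(bit ^ f for bit, f in zip(basis, flips))
        for i, u in enumerate(payoff(outcome)):
            totals[i] += abs(amplitude) ** 2 * u
    return tuple(totals)

def pd3(outcome):
    """Hypothetical payoffs: strategy 1 strictly dominates strategy 0."""
    d = sum(outcome)                  # number of players using strategy 1
    first = {0: 3, 1: 2, 2: 0}        # payoff of a player using strategy 0
    second = {1: 5, 2: 4, 3: 1}       # payoff of a player using strategy 1
    return tuple(second[d] if bit else first[d] for bit in outcome)

# Hypothetical joint strategy: the GHZ state (|000> + |111>) / sqrt(2).
GHZ = {(0, 0, 0): 1 / sqrt(2), (1, 1, 1): 1 / sqrt(2)}
print(n_player_payoffs((1, 0, 0), (1, 1, 1), GHZ, pd3))  # classical sector
print(n_player_payoffs((1, 1, 1), (0, 0, 0), GHZ, pd3))  # |Psi> is played
```

A single player switching her projector from $P_0$ to $P_1$ changes nothing unless all other players have already chosen $P_1$, which is why the classical equilibrium analysis carries over to those profiles.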
Example 1 Let us consider the three-person Prisoner's Dilemma that was studied in the quantum domain (via the EWL scheme) by Du et al. [15]. In terms of matrices, the game is given by (27), where players 1 and 2 choose between the rows and the columns, respectively, and player 3 chooses between the matrices. We recall that the only Nash equilibrium in (27) is the profile consisting of the players' second strategies. Thus, the most reasonable result of the game is (1, 1, 1). As in the best-known 2-person Prisoner's Dilemma, the players would increase their payoffs if at least two of them played their first strategies. However, the first strategy cannot be played by a rational player since, for each profile of the opponents' strategies, it always yields a worse payoff than the second strategy. In what follows, we apply scheme (22)-(24) to game (27).
Following the reasoning given immediately before Example 1, we identify each player's strategies in game (27) with the local operators $U_0$ and $U_1$. Moreover, let us assume that player $i$, $i = 1, 2, 3$, acts on the system of the $i$th and $(i + 3)$th qubits. As a result, scheme (22)-(24) comes down to one defined on $(\mathbb{C}^2)^{\otimes 3} \otimes (\mathbb{C}^2)^{\otimes 3}$, with the positive operator, player $i$'s strategy set and the triple of payoff operators obtained from (22)-(24) for $n = 3$. Let us now fix the players' joint strategy $|\Psi\rangle$ and determine the resulting players' payoffs that correspond to the profiles $\bigotimes_{k=1}^{3} P^{(k)}_{j_k} \otimes \bigotimes_{k=4}^{6} U^{(k)}_{j_k}$. Note that for fixed $\bigotimes_{k=4}^{6} U^{(k)}_{j_k}$, the payoffs depend on the projectors only through whether all three players choose $P_1$.

We see from the matrix representation that there are two types of pure Nash equilibria. The first one corresponds to the unique equilibrium in game (27), and it is generated by the profiles
$$\bigotimes_{k=1}^{3} P^{(k)}_{j_k} \otimes \bigotimes_{k=4}^{6} U^{(k)}_{1}, \quad \text{where } (j_1, j_2, j_3) \in \{(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)\}. \tag{38}$$
Each profile of (38) is a Nash equilibrium since each player's unilateral deviation from the equilibrium strategy yields the payoff 0 or 1. This also follows from the construction of (22)-(24). Namely, if a player cannot cause the joint strategy $|\Psi\rangle$ to be played by changing her own strategy, the equilibrium analysis is restricted to studying the local operations on the state $|000\rangle$. That, in turn, coincides with the problem of finding Nash equilibria in game (27), and $\bigotimes_{k=4}^{6} U^{(k)}_1$ is just the counterpart of the profile of the players' second strategies that forms the unique equilibrium in (27). However, in contrast to (27), the quantum game has another equilibrium, given by profile (39). Indeed, player 1 suffers a loss of at least 1/4 by unilaterally deviating from her strategy in (39), which contains $P^{(1)}_1$, and the same holds for players 2 and 3. Profile (39) is more profitable than (38) since it yields 11/4 for each player instead of 1. Thus, the players gain by making use of the joint strategy $|\Psi\rangle$, i.e., by playing $\bigotimes_{k=1}^{3} P^{(k)}_1$.

Normal representation of extensive games
Given an extensive-form game, one can construct a representation of that game in the strategic (normal) form. The resulting strategic game and the given extensive game have the same set of players and the same set of strategies for each player. The payoff functions are determined by the payoffs generated by the strategies in the extensive game. The normal representation is a very convenient way to study the extensive game. In particular, while we lose the sequential structure, we obtain a simpler form of the game that is sufficient for finding all the Nash equilibria.
In our earlier paper [16], we introduced a quantum scheme for playing an extensive game by using its normal representation. Based on the MW and EWL schemes, we assigned an action at each information set in an extensive game to a local operation on a particular qubit in the quantum game. As a result, the number of qubits on which each player was allowed to specify local operations was equal to the number of her information sets. In what follows, we extend our idea to the refinement of the MW scheme. This means that, in addition to the multiple choices of $\mathbb{1}$ and $\sigma_x$, the players specify the state on which they perform the local operators.
Let us modify (22) to cover the normal-form game determined by an extensive game with the set of players $\{1, 2, \dots, k\}$ and $n$ information sets, $n \geq k$. The positive operator is now defined on $(\mathbb{C}^2)^{\otimes k} \otimes (\mathbb{C}^2)^{\otimes n}$ by the formula
$$H = \left(\mathbb{1}^{\otimes k} - (|1\rangle\langle 1|)^{\otimes k}\right) \otimes (|0\rangle\langle 0|)^{\otimes n} + (|1\rangle\langle 1|)^{\otimes k} \otimes |\Psi\rangle\langle\Psi|, \tag{40}$$
where $|\Psi\rangle \in (\mathbb{C}^2)^{\otimes n}$ and $\||\Psi\rangle\| = 1$. Let $\xi\colon \{k + 1, k + 2, \dots, k + n\} \to \{1, 2, \dots, k\}$ be a surjective map that assigns to each of the last $n$ qubits the player who acts on it. We define player $i$'s set of strategies as follows:
$$\Big\{ P^{(i)}_{j_i} \otimes \bigotimes_{y \in \xi^{-1}(i)} U^{(y)}_{j_y} : j_i, j_y \in \{0, 1\} \Big\}, \tag{41}$$
where $P_{j_i}$ and $U_{j_y}$ are defined by (3). As a possible application of (40)-(41), let us consider the following example.

Example 2 (Four-stage centipede game) A centipede game is a 2-person extensive game in which the players move one after another for finitely many rounds. In some sense, it can be treated as an extensive counterpart of the Prisoner's Dilemma. While both players are able to obtain a high payoff, their rationality leads them to one of the worst outcomes. An example of a four-stage centipede game is shown in Fig. 1. Each player has two information sets (in this case, they are represented by the nodes of the game tree) with two available actions at each of them. Each player can stop the game (action S) or continue the game (action C), giving the other player the opportunity to make her choice. One way to learn how the game may end is by backward induction. If player 2 is to choose at her second information set, she certainly plays action S since she obtains 5 instead of 4, the result of playing action C. Since the players' rationality is common knowledge, player 1 knows that by playing C at her second information set she ends up with payoff 3. Thus, player 1 chooses S, which yields 4. A similar analysis shows that the players choose action S at their first information sets. Consequently, backward induction predicts outcome (2, 0).

As we focus on normal-form games, we construct the normal representation associated with the game in Fig. 1. Let us first determine the players' strategies. We recall that a player's strategy in an extensive game is a function that assigns an action to each information set of that player. Thus, each player has four strategies in the case of a four-stage centipede game. They can be written in the form SS, SC, CS and CC, where, for example, CS means that a player chooses C at her first information set and S at the second one. Once the strategies are specified, we determine the payoffs that correspond to all possible strategy profiles. For example, (SC, CC) determines outcome (2, 0) since player 1's strategy SC specifies action S at her first information set. On the other hand, profile (CC, CS) corresponds to payoff (3, 5), as player 1 always plays C and player 2 chooses S at her second information set. The players' strategies together with the payoffs corresponding to the strategy profiles define the normal representation (42). By using bimatrix (42), we can learn that rational players always choose action S at their first information sets. More formally, there are four pure Nash equilibria: (SS, SS), (SS, SC), (SC, SS) and (SC, SC), each resulting in outcome (2, 0).
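The construction of the normal representation and the search for pure Nash equilibria can be reproduced with a short script. This is a sketch, not a reproduction of bimatrix (42). The outcomes (2, 0) and (3, 5), player 1's payoff 4 at her second node and the final outcome (6, 4) are consistent with the text, whereas the outcome (1, 3) at player 2's first node and player 2's payoff 2 in (4, 2) are assumptions chosen to follow the same alternating pattern. All names are illustrative.

```python
from itertools import product

# Payoffs when the game is stopped at nodes 1..4 (players alternate, player 1 first).
# (1, 3) and the second entry of (4, 2) are assumed; see the note above.
STOP = [(2, 0), (1, 3), (4, 2), (3, 5)]
END = (6, 4)                      # payoff if nobody ever stops
STRATEGIES = ["SS", "SC", "CS", "CC"]

def outcome(s1, s2):
    """Play the tree: player 1 moves at nodes 1 and 3, player 2 at nodes 2 and 4."""
    moves = [s1[0], s2[0], s1[1], s2[1]]
    for node, action in enumerate(moves):
        if action == "S":
            return STOP[node]
    return END

def normal_form():
    """Map every strategy profile to its payoff pair."""
    return {(s1, s2): outcome(s1, s2)
            for s1, s2 in product(STRATEGIES, repeat=2)}

def pure_nash(game):
    """Profiles in which neither player gains by a unilateral pure deviation."""
    equilibria = []
    for (s1, s2), (u1, u2) in game.items():
        best1 = max(game[(t, s2)][0] for t in STRATEGIES)
        best2 = max(game[(s1, t)][1] for t in STRATEGIES)
        if u1 == best1 and u2 == best2:
            equilibria.append((s1, s2))
    return equilibria

game = normal_form()
for profile in pure_nash(game):
    print(profile, game[profile])
```

With these payoffs the script finds exactly the four equilibria listed above, all with outcome (2, 0). The equilibrium set does not depend on the two assumed numbers as long as stopping at node 2 gives player 1 less than 2.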
The main advantage of model (40)-(41), or equivalently (22)-(24), is that a classical normal-form game and its quantum counterpart have similar complexity. In particular, given any 2-person finite extensive game with $k$ strategies for each player, the normal-form game implied by scheme (40)-(41) is just a $2k \times 2k$ bimatrix game, since each classical strategy is combined with one of the two projectors $P_0$ and $P_1$. As a result, there is no significant difference between the two games in the problem of determining Nash equilibria.
Example 3 ($n$-stage centipede game) Let us consider a centipede game where this time the number of stages is any even integer $n \geq 2$. The extensive form for this game is given in Fig. 2. Like the four-stage centipede game, the $n$-stage case also has the unique equilibrium outcome (2, 0). Rational players choose action S at their own information sets even though the game enables the players to obtain payoffs approximately equal to the number of stages. We have learned from the preceding example that there is a unique, symmetric, and Pareto-optimal Nash equilibrium if (42) is extended to (46). It turns out that the result is valid in the general case. That is, there is a Nash equilibrium that implies the payoff $n + \tfrac{1}{2}$ for both players (the pair of payoffs $(n + \tfrac{1}{2}, n + \tfrac{1}{2})$ is indeed a Pareto-optimal outcome since it is the midpoint of the segment whose endpoints are $(n - 1, n + 1)$ and $(n + 2, n)$). In order to prove the existence of that equilibrium, let us generalize (43) and (44) to an arbitrary $n$-stage centipede game. Since there are two players and $n$ information sets in the game, the positive operator $H$ and the players' strategies are given by (40) and (41) for $k = 2$. We assume that players 1 and 2 perform their local operators on qubits with odd and even indices, respectively. Thus, the map $\xi\colon \{3, 4, \dots, n + 2\} \to \{1, 2\}$ is given by the formula
$$\xi(x) = \begin{cases} 1 & \text{if } x \text{ is odd}, \\ 2 & \text{if } x \text{ is even}. \end{cases}$$
The appropriately generalized payoff operators take the form

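In the scheme with several joint strategies $|\Psi_i\rangle$, the local operators $U^{(3)}_j \otimes U^{(4)}_l$ are performed on the state $|\Psi_i\rangle$ if and only if the resulting strategy profile takes the form $|ii\rangle\langle ii| \otimes U^{(3)}_j \otimes U^{(4)}_l$.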

Figure 1: Extensive form representation of a four-stage centipede game (left) and the corresponding payoff polytope (right).