1 Introduction

Quantum information experiments can be described as a sequence of three operations: state preparation, evolution and measurement [1]. In most cases, one cannot assume that experiments are conducted perfectly; therefore, imperfections have to be taken into account while modeling them. In this work, we are interested in how the knowledge about imperfect evolution of a quantum system can be exploited by players engaged in a quantum game. We assume that one of the players possesses the knowledge about imperfections in the system, while the other is ignorant of their existence. We ask how much the informed player can exploit this knowledge to his or her advantage.

We consider an implementation of the quantum version of the penny flip game, which is influenced by the environment that causes decoherence of the system. In order to model the decoherence, we assume the Markovian approximation of open quantum system dynamics. This assumption is valid, for example, in the case of a two-level atom coupled to the vacuum, undergoing spontaneous emission (amplitude damping). The coherent part of the atom’s evolution is described by a one-qubit Hamiltonian. Spontaneous emission causes an atom in the excited state to drop down into the ground state, emitting a photon in the process. Similarly, the phase damping channel can be considered. This channel causes a continuous decay of coherence without energy dissipation in a quantum system [2].

The paper is organized as follows: in the two following subsections, we discuss related work and present our motivation to undertake this task. In Sect. 2, we recall the penny flip game and its quantum version; in Sect. 3, we present the noise model; in Sect. 4, we discuss the strategies applied in the presence of noise; and finally, in Sect. 5, we summarize the obtained results.

1.1 Related work

Imperfect realizations of quantum games have been discussed in the literature since the beginning of the century. Johnson [3] discusses a three-player quantum game played with a corrupted source of entangled qubits. The author implicitly assumes that the initial state of the game had passed through a bit-flip noisy channel before the game began. The corruption of quantum states in schemes implementing quantum games was studied by various authors, e.g., in [4], the authors study the general treatment of decoherence in two-player, two-strategy quantum games; in [5], the authors perform an analysis of the two-player prisoners’ dilemma game; in [6], the multiplayer quantum minority game with decoherence is studied; in [7, 8], the authors analyze the influence of the local noisy channels on quantum Magic Squares games, while the quantum Monty Hall problem under decoherence is studied first in [9] and subsequently in [10]. In [11], the authors study the influence of the interaction of qubits forming a spin chain on the qubit flip game. An analysis of trembling hand-perfect equilibria in quantum games was done in [12]. Prisoners’ dilemma in the presence of collective dephasing modeled by using the Markovian approximation of open quantum systems dynamics is studied in [13]. Unfortunately, the model applied in that work assumes that decoherence acts only after the initial state has been prepared and ceases to act before unitary strategies are applied. Another interesting approach to quantum games is the study of relativistic quantum games [14, 15]. This setting has also been studied in the presence of noise [16].

1.2 Motivation

In the quantum game-theoretic literature, decoherence is typically applied to a quantum game in the following way:

  1. The entangled state is prepared,
  2. It is transferred through a noisy channel,
  3. Players’ strategies are applied,
  4. The resulting state is transferred once again through a noisy channel,
  5. The state is disentangled,
  6. Quantum local measurements are performed, and the outcomes of the games are calculated.

In some cases, where it is appropriate, steps 4 and 5 are omitted. The problem with the above procedure is that it separates unitary evolution from the decoherent evolution. In Miszczak et al. [11], it was proposed to observe the behavior of the quantum version of the penny flip game under more physically realistic assumptions, where decoherence due to coupling with the environment and unitary evolution happen simultaneously. In that paper, the authors study an implementation of the qubit flip game on quantum spin chains. First, a design of the game on the trivial, one-qubit spin chain is proposed, expressed in the form of a quantum control problem. Then the environment in the form of an additional qubit is added, and the spin-spin coupling is adjusted so that one of the players, under some assumptions, cannot detect that the system is implemented on two qubits rather than on one. It is shown that if one of the players possesses the knowledge about the spin coupling, he or she can exploit it to increase his or her winning probability.

2 Game as a quantum experiment

In this work, our goal is to follow the work done in [11] and to discuss the quantum penny flip game as a physical experiment consisting of preparation, evolution and measurement of the system. For the purpose of this paper, we assume that preparation and measurement, in contrast to the noisy evolution of the system, are perfect. We investigate the influence of the noise on the players’ odds and how the noisiness of the system can be exploited by them. The noise model we use is described by the Lindblad master equation, and the dynamics of the system is expressed in the language of quantum systems control.

2.1 Penny flip game

In order to provide classical background for our problem, let us consider a classical two-player game, in which the players flip over a coin in three consecutive rounds. As usual, the players are called Alice and Bob. In each round, the acting player performs one of two operations on the coin: flips it over or leaves it unchanged.

At the beginning of the game, the coin is turned heads up. During the course of the game the coin is hidden and the players do not know the opponent’s actions. If, after the last round, the coin is tails up, then Alice wins; otherwise the winner is Bob.

The game consists of three rounds: Alice performs her action in the first and the third round, while Bob performs his in the second round of the game. Therefore, the set of allowed strategies consists of eight sequences \((N,N,N), (N,N,F),\) \( \ldots , (F,F,F)\), where \(N\) corresponds to the non-flipping strategy and \(F\) to the flipping strategy. Bob’s pay-off table for this game is presented in Table 1. Looking at the pay-off table, it can be seen that the players’ utility functions are balanced; thus, the penny flip game is a zero-sum game.

Table 1 Bob’s pay-off table for the penny flip game
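The rules above are enough to enumerate the classical game. The following minimal Python sketch lists Bob's pay-off for every combination of moves; the numerical convention (+1 when Bob wins, -1 when Alice wins) and all function names are assumptions made for illustration and are not taken from Table 1.

```python
from itertools import product

def final_coin(alice_first, bob, alice_last):
    """Return 'H' or 'T' after three rounds; the coin starts heads up."""
    flips = [alice_first, bob, alice_last].count("F")
    return "H" if flips % 2 == 0 else "T"

def bob_payoff(alice_first, bob, alice_last):
    """Assumed convention: +1 if Bob wins (heads), -1 if Alice wins (tails)."""
    return 1 if final_coin(alice_first, bob, alice_last) == "H" else -1

# Keys are (Bob's move, Alice's first and third moves written together).
payoff_table = {
    (b, a1 + a3): bob_payoff(a1, b, a3)
    for b in "NF"
    for a1, a3 in product("NF", repeat=2)
}

for b in "NF":
    row = [payoff_table[(b, a)] for a in ("NN", "NF", "FN", "FF")]
    print("Bob plays", b, "->", row)
```

Each row contains as many wins as losses for Bob, which illustrates that neither player has a winning pure strategy in the classical game.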

A detailed analysis of this game and its asymmetrical quantization can be found in [17]. In that work it was shown that there is no winning strategy for any player in the penny flip game. It was also shown that if Alice was allowed to extend her set of strategies to quantum strategies, she could always win. In Miszczak et al. [11] it was shown that when both players have access to quantum strategies, the game becomes fair and has a Nash equilibrium.

2.2 Qubit flip game

The quantum version of the penny flip game was studied for the first time by Meyer [18]. In our study, we wish to follow the work done in the aforementioned paper [11]. Hence, we consider a quantum version of the penny flip game. In this case, we treat a qubit as a quantum coin. As in the classical case, the game is divided into three rounds. Starting with Alice, in each round, one player performs a unitary operation on the quantum coin. The rules of the game are constrained by its physical implementation. In order to obtain an arbitrary one-qubit unitary operation, it is sufficient to use a control Hamiltonian built using only two traceless Pauli operators [19]. Therefore, we assume that in each round the acting player can choose three control parameters \(\alpha _1,\alpha _2,\alpha _3\) in order to realize his or her strategy. The resulting unitary gate is given by the equation:

$$\begin{aligned} U(\alpha _1,\alpha _2,\alpha _3)=\hbox {e}^{-\mathrm{i}\alpha _3\sigma _z \Delta t} \hbox {e}^{-\mathrm{i}\alpha _2\sigma _y \Delta t} \hbox {e}^{-\mathrm{i}\alpha _1\sigma _z \Delta t}, \end{aligned}$$
(1)

where \(\Delta t\) is an arbitrarily chosen constant time interval.

Therefore, the system defined above forms a single qubit system driven by time-dependent Hamiltonian \(H(t)\), which is a piecewise constant and can be expressed in the following form

$$\begin{aligned} H(t)= {\left\{ \begin{array}{ll} \alpha _1^{A_1}\sigma _z &{} \text { for } 0\le t < \Delta t,\\ \alpha _2^{A_1}\sigma _y &{} \text { for } \Delta t\le t < 2\Delta t,\\ \alpha _3^{A_1}\sigma _z &{} \text { for } 2\Delta t\le t < 3\Delta t,\\ \alpha _1^{B}\sigma _z &{} \text { for } 3\Delta t\le t < 4\Delta t,\\ \alpha _2^{B}\sigma _y &{} \text { for } 4\Delta t\le t < 5\Delta t,\\ \alpha _3^{B}\sigma _z &{} \text { for } 5\Delta t\le t < 6\Delta t,\\ \alpha _1^{A_2}\sigma _z &{} \text { for } 6\Delta t\le t < 7\Delta t,\\ \alpha _2^{A_2}\sigma _y &{} \text { for } 7\Delta t\le t < 8\Delta t,\\ \alpha _3^{A_2}\sigma _z &{} \text { for } 8\Delta t\le t \le 9\Delta t. \end{array}\right. } \end{aligned}$$
(2)

The control parameters in the Hamiltonian \(H(t)\) are collected in the vector \(\mathrm {\alpha }=(\alpha _1^{A_1}, \alpha _2^{A_1}, \alpha _3^{A_1}, \alpha _1^{B}, \alpha _2^{B}, \alpha _3^{B}, \alpha _1^{A_2}, \alpha _2^{A_2}, \alpha _3^{A_2})\), where \(\alpha _i^{A_1},\alpha _i^{A_2}\) are determined by Alice and \(\alpha _i^{B}\) are selected by Bob.

Suppose that players are allowed to play the game by manipulating the control parameters in the Hamiltonian \(H(t)\) representing the coherent part of the dynamics, but they are not aware of the action of the environment on the system. Because of the coupling to the environment, the time evolution of the system is non-unitary and is described by a master equation, which can be written generally in the Lindblad form as

$$\begin{aligned} \frac{\mathrm{d}\rho }{\mathrm{d}t}=-\mathrm{i}[H(t),\rho ] + \sum _j \gamma _j(L_j\rho L_j^\dagger - \frac{1}{2}\{L_j^\dagger L_j,\rho \}), \end{aligned}$$
(3)

where \(H(t)\) is the system Hamiltonian, \(L_j\) are the Lindblad operators, representing the environment influence on the system [2] and \(\rho \) is the state of the system.

For the purpose of this paper we chose three classes of decoherence: amplitude damping, amplitude raising and phase damping, which correspond to the Lindblad operators \(\sigma _{-}=| 0 \rangle \langle 1 |\), \(\sigma _{+}=| 1 \rangle \langle 0 |\) and \(\sigma _z\), respectively.

Let us suppose that initially the quantum coin is in the state \(| 0 \rangle \langle 0 |\). Next, in each round, Alice and Bob perform their sequences of controls on the qubit, where each control pulse is applied according to Eq. (3). After applying all of the nine pulses, we measure the expected value of the \(\sigma _z\) operator. If \(\mathrm{tr}(\sigma _z\rho (T))=-1\) Alice wins, if \(\mathrm{tr}(\sigma _z\rho (T))=1\) Bob wins. Here, \(\rho (T)\) denotes the state of the system at time \(T=9\Delta t\).

Alternatively, we can say that the final step of the procedure consists in performing the orthogonal measurement \(\{O_\mathrm{tails}\rightarrow | 1 \rangle \langle 1 |,O_\mathrm{heads}\rightarrow | 0 \rangle \langle 0 |\}\) on the state \(\rho (T)\). The probabilities of measuring \(O_\mathrm{tails}\) and \(O_\mathrm{heads}\) determine the pay-off functions for Alice and Bob, respectively. These probabilities can be obtained from the relations \(p(\mathrm{tails})=\langle 1 |\rho (T)| 1 \rangle \) and \(p(\mathrm{heads})=\langle 0 |\rho (T)| 0 \rangle \).
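The whole procedure (nine piecewise-constant pulses as in Eq. (2), evolution under Eq. (3), and a final readout of \(\mathrm{tr}(\sigma _z\rho (T))\)) can be sketched in a few lines of Python. This is an illustrative sketch only: the fixed-step Runge-Kutta integrator, the choice \(\Delta t=1\), the number of substeps and all function names are assumptions and do not describe the numerical method used to produce the figures.

```python
import math

SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
SIGMA_MINUS = [[0, 1], [0, 0]]  # |0><1|, the amplitude damping operator

def mm(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)] for i in range(2)]

def dag(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def lin(a, b, s=1.0):
    """Return a + s*b elementwise."""
    return [[a[i][j] + s * b[i][j] for j in range(2)] for i in range(2)]

def lindblad_rhs(rho, h, noise):
    """Right-hand side of Eq. (3); noise is a list of (gamma_j, L_j) pairs."""
    comm = lin(mm(h, rho), mm(rho, h), -1.0)            # [H, rho]
    out = [[-1j * x for x in row] for row in comm]      # -i [H, rho]
    for gamma, l in noise:
        ld = dag(l)
        ldl = mm(ld, l)
        jump = mm(mm(l, rho), ld)                       # L rho L^dagger
        anti = lin(mm(ldl, rho), mm(rho, ldl))          # {L^dagger L, rho}
        out = lin(out, lin(jump, anti, -0.5), gamma)
    return out

def rk4_step(rho, h, noise, dt):
    """One classical fourth-order Runge-Kutta step of length dt."""
    k1 = lindblad_rhs(rho, h, noise)
    k2 = lindblad_rhs(lin(rho, k1, dt / 2), h, noise)
    k3 = lindblad_rhs(lin(rho, k2, dt / 2), h, noise)
    k4 = lindblad_rhs(lin(rho, k3, dt), h, noise)
    total = lin(lin(k1, k2, 2.0), lin(k4, k3, 2.0))     # k1 + 2 k2 + 2 k3 + k4
    return lin(rho, total, dt / 6)

def play(alpha, noise, delta_t=1.0, steps=100):
    """Return tr(sigma_z rho(T)) for the nine control parameters alpha."""
    rho = [[1, 0], [0, 0]]                              # initial state |0><0|
    for k, a in enumerate(alpha):
        axis = SY if k % 3 == 1 else SZ                 # z, y, z pulses in each round
        h = [[a * x for x in row] for row in axis]
        for _ in range(steps):
            rho = rk4_step(rho, h, noise, delta_t / steps)
    return (rho[0][0] - rho[1][1]).real

flip = (0.0, -math.pi / 2, 0.0)   # realizes i*sigma_y when delta_t = 1
idle = (0.0, 0.0, 0.0)            # realizes the identity

noiseless = play(flip + idle + idle, [])
noisy = play(flip + idle + idle, [(0.7, SIGMA_MINUS)])
print("no noise:", noiseless)
print("amplitude damping, gamma = 0.7:", noisy)
```

In this example only Alice's first move flips the coin, so without noise the readout is close to -1 and Alice wins, whereas with amplitude damping the state relaxes back toward \(| 0 \rangle \langle 0 |\) during the remaining pulses and the readout becomes positive.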

2.3 Nash equilibrium

In this game, pure strategies cannot be in Nash equilibrium [18]. Hence, the players choose mixed strategies, which are better than the pure ones. We assume that Alice and Bob use the Pauli strategy, which is mixed and gives Nash equilibrium [11]; therefore, this strategy is a reasonable choice for the players. According to the Pauli strategy, in each round the acting player performs one of the four unitary operations \(\{{1\!\!1}, \mathrm{i}\sigma _{x}, \mathrm{i}\sigma _{y}, \mathrm{i}\sigma _{z}\}\), chosen randomly with a uniform probability distribution. To realize the Pauli strategy, each player chooses a sequence of control parameters \((\alpha _1^\square , \alpha _2^\square , \alpha _3^\square )\) listed in Table 2, where the symbol \(\square \) stands for \(A_1\), \(B\) or \(A_2\).

Table 2 Control parameters for realizing the Pauli strategy
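As a plausibility check of Eq. (1), the sketch below builds \(U(\alpha _1,\alpha _2,\alpha _3)\) and compares it with the four Pauli-strategy operations up to a global phase. The parameter triples are read off the control sequences quoted in the captions of Figs. 1 and 2; the value \(\Delta t=1\) and the helper names are assumptions made for this illustration.

```python
import cmath
import math

ID = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def mm(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)] for i in range(2)]

def exp_z(theta):
    """exp(-i*theta*sigma_z) for a real angle theta."""
    return [[cmath.exp(-1j * theta), 0], [0, cmath.exp(1j * theta)]]

def exp_y(theta):
    """exp(-i*theta*sigma_y) = cos(theta)*1 - i*sin(theta)*sigma_y."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def gate(a1, a2, a3, delta_t=1.0):
    """Unitary of Eq. (1): the z pulse a1 acts first, then y, then z."""
    return mm(exp_z(a3 * delta_t), mm(exp_y(a2 * delta_t), exp_z(a1 * delta_t)))

def same_up_to_phase(u, v, tol=1e-12):
    """True if the 2x2 unitaries u and v differ only by a global phase."""
    overlap = sum(complex(u[i][j]).conjugate() * v[i][j]
                  for i in range(2) for j in range(2))
    return abs(abs(overlap) - 2.0) < tol

q, h = math.pi / 4, math.pi / 2
candidates = {
    "identity": ((0.0, 0.0, 0.0), ID),
    "sigma_x": ((-q, -h, q), SX),
    "sigma_y": ((0.0, -h, 0.0), SY),
    "sigma_z": ((-q, 0.0, -q), SZ),
}
checks = {name: same_up_to_phase(gate(*params), target)
          for name, (params, target) in candidates.items()}
print(checks)
```

Global phases are ignored in the comparison because they do not affect the density matrix and therefore have no influence on the outcome of the game.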

3 Influence of decoherence on the game

In this section, we perform an analytical investigation which shows the influence of decoherence on the game result. In accordance with the Lindblad master equation, the environment influence on the system is represented by Lindblad operators \(L_j\), while the rate of decoherence is described by parameters \(\gamma _j\). In our game, players use the Pauli strategy; hence, the quantum system evolves under Hamiltonians of the form \(H(t)=\alpha _i^\square \sigma _y\) or \(H(t)=\alpha _i^\square \sigma _z\). To simplify the discussion, we consider Hamiltonians represented by diagonal matrices. In our case, \(H=\alpha _i^\square \sigma _z\) is diagonal, but the Hamiltonian \(\alpha _i^\square \sigma _y\) requires diagonalization. Therefore, we will consider solutions of Lindblad equations for the Hamiltonians given by \(H_z = \alpha _i^\square \sigma _z\) and \(H_y = \alpha _i^\square U^\dagger \sigma _y U=\alpha _i^\square \left( \begin{array}{ll} -1 &{} 0\\ 0 &{} 1 \end{array} \right) \), where \(U=\left( \begin{array}{ll} -\frac{\sqrt{2}}{2} &{} -\frac{\sqrt{2}}{2} \\ \mathrm{i}\frac{\sqrt{2}}{2} &{} -\mathrm{i}\frac{\sqrt{2}}{2} \end{array} \right) \) is a unitary matrix whose columns are the eigenvectors of \(\sigma _y\). Thus, we consider the solutions of the Lindblad equation for the Hamiltonian of the form

$$\begin{aligned} H=\beta _1| 0 \rangle \langle 0 | + \beta _2| 1 \rangle \langle 1 |. \end{aligned}$$
(4)
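The diagonalization of \(\sigma _y\) quoted above can be checked numerically. The short sketch below multiplies out \(U^\dagger \sigma _y U\) with the matrix \(U\) given in the text; the helper names are illustrative assumptions.

```python
import math

r = math.sqrt(2) / 2
U = [[-r, -r], [1j * r, -1j * r]]   # columns: eigenvectors of sigma_y
SY = [[0, -1j], [1j, 0]]

def mm(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)] for i in range(2)]

def dag(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

diagonal = mm(mm(dag(U), SY), U)    # expected: diag(-1, 1)
unitarity = mm(dag(U), U)           # expected: identity
print(diagonal)
print(unitarity)
```

With \(\beta _1=-\alpha _i^\square \) and \(\beta _2=\alpha _i^\square \), the diagonalized \(\sigma _y\) pulse therefore has the form of Eq. (4).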

3.1 Amplitude damping and amplitude raising

First we consider the amplitude damping decoherence, which corresponds to the Lindblad operator \(\sigma _{-}\). Thus, the master Eq. (3) is expressed as

$$\begin{aligned} \frac{\mathrm{d}\rho }{\mathrm{d}t}=-\mathrm{i}[H,\rho (t)] + \gamma (\sigma _{-}\rho (t)\sigma _{+}-\frac{1}{2}\sigma _{+}\sigma _{-}\rho (t)-\frac{1}{2}\ \rho (t)\sigma _{+}\sigma _{-}), \end{aligned}$$
(5)

where \(\sigma _{+}=\sigma _{-}^\dagger =| 1 \rangle \langle 0 |\). The equation can be rewritten in the following form

$$\begin{aligned} \frac{\mathrm{d}\rho }{\mathrm{d}t}=A\rho (t)+\rho (t) A^\dagger + \gamma \sigma _{-}\rho (t)\sigma _{+}, \end{aligned}$$
(6)

where \(A=-\mathrm{i}H(t)-\frac{1}{2}\gamma \sigma _{+}\sigma _{-}\). In solving this equation it is helpful to make a change of variables \(\rho (t)=\hbox {e}^{At}\hat{\rho }(t)\hbox {e}^{A^\dagger t}\). Hence, we obtain

$$\begin{aligned} \frac{\mathrm{d}\hat{\rho }}{\mathrm{d}t}=\gamma B(t)\hat{\rho }(t) B^{\dagger }(t), \end{aligned}$$
(7)

where \(B(t)=\hbox {e}^{-At}\sigma _{-}\hbox {e}^{At}=\hbox {e}^{-\mathrm{i}(\beta _2-\beta _1)t-\frac{\gamma }{2}t}\sigma _{-}\). It follows that

$$\begin{aligned} \frac{\mathrm{d}\hat{\rho }}{\mathrm{d}t}=\gamma \hbox {e}^{-\gamma t} \sigma _{-}\hat{\rho }(t) \sigma _{+}. \end{aligned}$$
(8)

Since \(\sigma _{-}\sigma _{-}=\sigma _{+}\sigma _{+}=0\), Eq. (8) implies \(\sigma _{-}\frac{\mathrm{d}\hat{\rho }}{\mathrm{d}t}\sigma _{+}=0\), so \(\sigma _{-}\hat{\rho }(t)\sigma _{+}=\sigma _{-}\hat{\rho }(0)\sigma _{+}\) is constant in time. Integrating Eq. (8) with this constant operator on the right-hand side, we can write \(\hat{\rho }(t)\) as

$$\begin{aligned} \hat{\rho }(t)=\hat{\rho }(0) +\left( 1-\hbox {e}^{-\gamma t}\right) \sigma _{-}\hat{\rho }(0)\sigma _{+}. \end{aligned}$$
(9)

Coming back to the original variables, and using \(\hbox {e}^{At}\sigma _{-}\rho (0)\sigma _{+}\hbox {e}^{A^\dagger t}=\sigma _{-}\rho (0)\sigma _{+}\) for the diagonal Hamiltonian (4), we get the expression

$$\begin{aligned} \rho (t)=\hbox {e}^{At}\rho (0)\hbox {e}^{A^\dagger t}+\left( 1-\hbox {e}^{-\gamma t}\right) \sigma _{-}\rho (0)\sigma _{+}. \end{aligned}$$
(10)

In order to study the asymptotic effects of decoherence on the results of the game, we consider the following limit for fixed \(t>0\)

$$\begin{aligned} \lim _{\gamma \rightarrow \infty } \left[ \hbox {e}^{At}\rho (0)\hbox {e}^{A^\dagger t}+\left( 1-\hbox {e}^{-\gamma t}\right) \sigma _{-}\rho (0)\sigma _{+}\right] = | 0 \rangle \langle 0 |\rho (0)| 0 \rangle \langle 0 | + \sigma _{-}\rho (0)\sigma _{+} = | 0 \rangle \langle 0 |, \end{aligned}$$
(11)

where the last equality follows from \(\mathrm{tr}\rho (0)=1\). Thus, for any initial state, and in particular for \(\rho (0)=| 0 \rangle \langle 0 |\), the above limit is equal to \(| 0 \rangle \langle 0 |\). This result shows that Bob’s probability of winning the game tends to 1 as \(\gamma \) increases. Figure 1 shows an example of the evolution of a quantum system with amplitude damping decoherence for two values of the parameter \(\gamma \). Figure 1a, b show the players’ control pulses. In this case they are the ones implementing the Pauli strategy. Figure 1c, d show the time evolution of the state expressed as the expectation values of the observables \(\sigma _x\), \(\sigma _y\) and \(\sigma _z\) for both cases. Finally, Fig. 1e, f show the evolution of the qubit’s state in the Bloch sphere. This shows how even a small amount of noise influences the evolution of the system and changes the probability of winning the game.
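The large-\(\gamma \) behavior can also be illustrated numerically. For the diagonal Hamiltonian (4), the master equation (5) decouples into scalar equations for the matrix elements: the excited-state population obeys \(\dot{\rho }_{11}=-\gamma \rho _{11}\), and the coherence obeys \(\dot{\rho }_{01}=(-\mathrm{i}(\beta _1-\beta _2)-\frac{\gamma }{2})\rho _{01}\). The sketch below evaluates the resulting element-wise solution; the function name and the sample parameters are assumptions chosen for illustration.

```python
import cmath
import math

def amplitude_damping(rho0, beta1, beta2, gamma, t):
    """Element-wise solution of Eq. (5) for H = diag(beta1, beta2).

    Assumes tr(rho0) = 1, so the ground-state population is 1 - rho_11.
    """
    p1 = rho0[1][1].real * math.exp(-gamma * t)                # decaying population
    coh = rho0[0][1] * cmath.exp((-1j * (beta1 - beta2) - gamma / 2) * t)
    return [[1 - p1, coh], [coh.conjugate(), p1]]

excited = [[0.0, 0.0], [0.0, 1.0]]        # worst case for Bob: start in |1><1|
for gamma in (0.1, 0.7, 5.0, 50.0):
    rho_t = amplitude_damping(excited, 1.0, -1.0, gamma, 1.0)
    print("gamma =", gamma, " <0|rho|0> =", rho_t[0][0])

weak = amplitude_damping(excited, 1.0, -1.0, 0.1, 1.0)
strong = amplitude_damping(excited, 1.0, -1.0, 50.0, 1.0)
```

Even when the coin starts in \(| 1 \rangle \langle 1 |\), the population of \(| 0 \rangle \langle 0 |\) approaches 1 as \(\gamma \) grows, while the trace stays equal to 1 at all times.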

Fig. 1

Example of the time evolution of a quantum system with the amplitude damping decoherence for a sequence of control parameters \(\alpha \) and fixed \(\gamma =0.1\) (left side), \(\gamma =0.7\) (right side). a Control parameters \(\alpha =(-\frac{\pi }{4},-\frac{\pi }{2},\frac{\pi }{4},0,-\frac{\pi }{2},0,-\frac{\pi }{4},-\frac{\pi }{2},\frac{\pi }{4})\). b Control parameters \(\alpha =(0,-\frac{\pi }{2},0,-\frac{\pi }{4},-\frac{\pi }{2},\frac{\pi }{4},-\frac{\pi }{4},0,-\frac{\pi }{4})\). c Mean values of \(\sigma _x,\sigma _y\) and \(\sigma _z\). d Mean values of \(\sigma _x,\sigma _y\) and \(\sigma _z\). e Time evolution of a quantum coin. f Time evolution of a quantum coin

The noise operator \(\sigma _{+}\) is related to amplitude raising decoherence, and the solution of the master equation has the following form

$$\begin{aligned} \rho (t)=\hbox {e}^{At}\rho (0)\hbox {e}^{A^\dagger t}+\left( 1-\hbox {e}^{-\gamma t}\right) \sigma _{+}\rho (0)\sigma _{-}, \end{aligned}$$
(12)

where \(A=-\mathrm{i}H -\frac{1}{2}\gamma \sigma _{-}\sigma _{+}\). It is easy to check that as \(\gamma \rightarrow \infty \) the solution tends to the state \(| 1 \rangle \langle 1 |\), in which case Alice wins.

3.2 Phase damping

Now, we consider the impact of the phase damping decoherence on the outcome of the game. In this case, the Lindblad operator is given by \(\sigma _z\). Hence, the Lindblad equation has the following form

$$\begin{aligned} \frac{\mathrm{d}\rho }{\mathrm{d}t}&= -\mathrm{i}[H,\rho (t)] + \gamma (\sigma _z\rho (t)\sigma _z - \frac{1}{2}\sigma _z\sigma _z\rho (t) -\frac{1}{2}\rho (t)\sigma _z\sigma _z)\nonumber \\&= -\mathrm{i}[H,\rho (t)] + \gamma (\sigma _z\rho (t)\sigma _z - \rho (t)). \end{aligned}$$
(13)

Next, we make a change of variables \(\hat{\rho }(t)=\hbox {e}^{\mathrm{i}Ht}\rho (t)\hbox {e}^{-\mathrm{i}Ht}\), which is helpful to solve the equation. We obtain

$$\begin{aligned} \frac{\mathrm{d}\hat{\rho }}{\mathrm{d}t}&= \frac{\mathrm{d}\hbox {e}^{\mathrm{i}Ht}}{\mathrm{d}t}\rho (t)\hbox {e}^{-\mathrm{i}Ht}+ \hbox {e}^{\mathrm{i}Ht}\frac{\mathrm{d}\rho }{\mathrm{d}t}\hbox {e}^{-\mathrm{i}Ht}+ \hbox {e}^{\mathrm{i}Ht}\rho (t)\frac{\mathrm{d}\hbox {e}^{-\mathrm{i}Ht}}{\mathrm{d}t}\nonumber \\&= \mathrm{i}H \hbox {e}^{\mathrm{i}Ht}\hbox {e}^{-\mathrm{i}Ht}\hat{\rho }(t)\hbox {e}^{\mathrm{i}Ht}\hbox {e}^{-\mathrm{i}Ht} - \mathrm{i}\hbox {e}^{\mathrm{i}Ht}H\hbox {e}^{-\mathrm{i}Ht}\hat{\rho }(t)\hbox {e}^{\mathrm{i}Ht}\hbox {e}^{-\mathrm{i}Ht}\nonumber \\&+\, \mathrm{i}\hbox {e}^{\mathrm{i}Ht}\hbox {e}^{-\mathrm{i}Ht}\hat{\rho }(t)\hbox {e}^{\mathrm{i}Ht}H\hbox {e}^{-\mathrm{i}Ht}+ \gamma \hbox {e}^{\mathrm{i}Ht}\sigma _z\hbox {e}^{-\mathrm{i}Ht}\hat{\rho }(t) \hbox {e}^{\mathrm{i}Ht}\sigma _z\hbox {e}^{-\mathrm{i}Ht}\nonumber \\&-\,\gamma \hbox {e}^{\mathrm{i}Ht}\hbox {e}^{-\mathrm{i}Ht}\hat{\rho }(t)\hbox {e}^{\mathrm{i}Ht}\hbox {e}^{-\mathrm{i}Ht} -\mathrm{i}\hbox {e}^{\mathrm{i}Ht}\hbox {e}^{-\mathrm{i}Ht}\hat{\rho }(t)\hbox {e}^{\mathrm{i}Ht}\hbox {e}^{-\mathrm{i}Ht}H \nonumber \\&= \gamma (\sigma _z\hat{\rho }(t)\sigma _z - \hat{\rho }(t)). \end{aligned}$$
(14)

It follows that the solution of the above equation is given by

$$\begin{aligned} \hat{\rho }(t)&= | 0 \rangle \langle 0 |\rho (0)| 0 \rangle \langle 0 | + | 1 \rangle \langle 1 |\rho (0)| 1 \rangle \langle 1 | \nonumber \\&+\, \mathrm{e}^{-2\gamma t} (| 0 \rangle \langle 0 |\rho (0)| 1 \rangle \langle 1 |+| 1 \rangle \langle 1 |\rho (0)| 0 \rangle \langle 0 |). \end{aligned}$$
(15)

Coming back to the original variables we get the expression

$$\begin{aligned} \rho (t)&= | 0 \rangle \langle 0 |\rho (0)| 0 \rangle \langle 0 | + | 1 \rangle \langle 1 |\rho (0)| 1 \rangle \langle 1 | \nonumber \\&+\, \mathrm{e}^{-2\gamma t}\mathrm{e}^{-\mathrm{i}H t}(| 0 \rangle \langle 0 |\rho (0)| 1 \rangle \langle 1 |+| 1 \rangle \langle 1 |\rho (0)| 0 \rangle \langle 0 |) \mathrm{e}^{\mathrm{i}H t}. \end{aligned}$$
(16)

Consider the following limit

$$\begin{aligned} \lim _{\gamma \rightarrow \infty } \rho (t)= | 0 \rangle \langle 0 |\rho (0)| 0 \rangle \langle 0 | + | 1 \rangle \langle 1 |\rho (0)| 1 \rangle \langle 1 |. \end{aligned}$$
(17)

The above result is a diagonal matrix dependent on the initial state. For high values of \(\gamma \), the initial state \(\rho (0)\) has a significant impact on the game. If \(\rho (0)=| 0 \rangle \langle 0 |\) then \(\lim _{\gamma \rightarrow \infty } \rho (t)=| 0 \rangle \langle 0 |\), so this kind of decoherence favors Bob. Similarly, if \(\rho (0) = | 1 \rangle \langle 1 |\), then Alice wins. The evolution of a quantum system with the phase damping decoherence and fixed Hamiltonian is shown in Fig. 2. Figure 2a, b show the players’ control pulses. In this case they are the ones implementing the Pauli strategy. Figure 2c, d show the time evolution of the state expressed as the expectation values of the observables \(\sigma _x\), \(\sigma _y\) and \(\sigma _z\) for both cases. Finally, Fig. 2e, f show the evolution of the qubit’s state in the Bloch sphere. In this case, we can see that a low amount of phase damping noise does not have a significant impact on the outcome of the game. On the other hand, for higher values of \(\gamma \) we can see mainly the effect of the decoherence rather than the effect of the players’ actions, i.e., the state evolves almost directly toward the maximally mixed state.
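For a \(\sigma _z\) pulse, where \(H\) is diagonal with entries \(\beta _1,\beta _2\) as in Eq. (4), the solution (16) can be evaluated element by element: the diagonal entries are constant, and the coherence \(\rho _{01}\) is multiplied by \(\mathrm{e}^{-2\gamma t}\mathrm{e}^{-\mathrm{i}(\beta _1-\beta _2)t}\). The sketch below does this for an equal superposition and for the two values of \(\gamma \) used in Fig. 2; the function name and the choice of initial state are assumptions made for illustration.

```python
import cmath
import math

def phase_damping(rho0, beta1, beta2, gamma, t):
    """Evaluate Eq. (16) for H = diag(beta1, beta2) and the noise operator sigma_z."""
    factor = math.exp(-2 * gamma * t) * cmath.exp(-1j * (beta1 - beta2) * t)
    coh = rho0[0][1] * factor                     # decaying, rotating coherence
    return [[rho0[0][0], coh], [coh.conjugate(), rho0[1][1]]]

plus = [[0.5, 0.5], [0.5, 0.5]]                   # equal superposition of |0> and |1>
for gamma in (0.5, 5.0):
    rho_t = phase_damping(plus, 1.0, -1.0, gamma, 1.0)
    print("gamma =", gamma, " |rho_01| =", abs(rho_t[0][1]))

weak = phase_damping(plus, 1.0, -1.0, 0.5, 1.0)
strong = phase_damping(plus, 1.0, -1.0, 5.0, 1.0)
```

The populations, and hence the winning probabilities at that instant, are untouched by the noise, while the coherences that later pulses would need in order to move the populations are strongly suppressed for large \(\gamma \).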

Fig. 2

Example of the time evolution of a quantum system with the phase damping decoherence for fixed \(\gamma =0.5\) (left side), \(\gamma =5\) (right side) and a sequence of control parameters \(\alpha \). a Control parameters \(\alpha =(-\frac{\pi }{4},-\frac{\pi }{2},\frac{\pi }{4},-\frac{\pi }{4},-\frac{\pi }{2},\frac{\pi }{4},-\frac{\pi }{4},-\frac{\pi }{2},\frac{\pi }{4})\). b Control parameters \(\alpha =(0,-\frac{\pi }{2},0,0,-\frac{\pi }{2},0,0,-\frac{\pi }{2},0)\). c Mean values of \(\sigma _x,\sigma _y\) and \(\sigma _z\). d Mean values of \(\sigma _x,\sigma _y\) and \(\sigma _z\). e Time evolution of a quantum coin. f Time evolution of a quantum coin

4 Optimal strategy for the players

Due to the noisy evolution of the underlying qubit, the strategy given by Table 2 is no longer a Nash equilibrium. We study the possibility of optimizing one player’s strategy, while the other one uses the Pauli strategy. It turns out that this optimization is not always possible. If the rate of decoherence is high enough, then the players’ strategies have little impact on the game outcome. In the low noise scenario, it is possible to optimize the strategy of both players.

In each round, one player performs a series of unitary operations, which are chosen randomly from a uniform distribution. Therefore, the strategy of a player can be seen as a random unitary channel. In this section \(\Phi _{A_1},\Phi _{A_2}\) denote mixed unitary channels used by Alice who implements the Pauli strategy. Similarly, \(\Phi _B\) denotes channels used by Bob.
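To illustrate what such a random unitary channel does, the sketch below applies the uniform mixture of the four Pauli unitaries to a sample density matrix. The factors of \(\mathrm{i}\) are dropped because global phases cancel in \(U\rho U^\dagger \); the sample state and the function names are assumptions chosen for illustration.

```python
ID = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def mm(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)] for i in range(2)]

def dag(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def pauli_channel(rho):
    """Uniform mixture of the four Pauli unitaries applied to rho."""
    out = [[0j, 0j], [0j, 0j]]
    for u in (ID, SX, SY, SZ):
        term = mm(mm(u, rho), dag(u))             # U rho U^dagger
        out = [[out[i][j] + 0.25 * term[i][j] for j in range(2)] for i in range(2)]
    return out

rho = [[0.8, 0.3 - 0.1j], [0.3 + 0.1j, 0.2]]      # an arbitrary valid qubit state
mixed = pauli_channel(rho)
print(mixed)
```

In the noiseless case the output is the maximally mixed state whatever the input, which is consistent with the Pauli strategy giving each player a winning probability of \(\frac{1}{2}\) regardless of the opponent's move.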

4.1 Optimization method

In order to find optimal strategies for the players, we assume the Hamiltonian in (3) to have the form

$$\begin{aligned} H = H(\varepsilon (t)), \end{aligned}$$
(18)

where \(\varepsilon (t)\) are the control pulses. As the optimization target, we introduce the cost functional

$$\begin{aligned} J(\varepsilon )=\mathrm{tr}\{ F_0(\rho (T)) \}, \end{aligned}$$
(19)

where \(F_0(\rho (T))\) is a functional that is bounded from below and differentiable with respect to \(\rho (T)\). A sequence of control pulses that minimizes the functional (19) is said to be optimal. In our case we assume that

$$\begin{aligned} \mathrm{tr}\{ F_0(\rho (T)) \} = \frac{1}{2} || \rho (T) - \rho _\mathrm{T} ||_\mathrm{F}^2, \end{aligned}$$
(20)

where \(\rho _\mathrm{T}\) is the target density matrix of the system.
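For a single qubit, the cost (20) is a sum over four matrix entries. A minimal sketch, with illustrative names and sample states that are not taken from the paper:

```python
def cost(rho, target):
    """Half the squared Frobenius distance of Eq. (20) for 2x2 matrices."""
    return 0.5 * sum(abs(rho[i][j] - target[i][j]) ** 2
                     for i in range(2) for j in range(2))

heads = [[1.0, 0.0], [0.0, 0.0]]     # |0><0|, Bob's preferred outcome
tails = [[0.0, 0.0], [0.0, 1.0]]     # |1><1|, Alice's preferred outcome
mixed = [[0.5, 0.0], [0.0, 0.5]]     # maximally mixed state

print(cost(heads, tails), cost(mixed, tails), cost(tails, tails))
```

With \(\rho _\mathrm{T}=| 1 \rangle \langle 1 |\) as Alice's target, the cost is 1 when the coin certainly shows heads, \(\frac{1}{4}\) for the maximally mixed state and 0 when the target is reached exactly.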

In order to solve this optimization problem, we need to find an analytical formula for the derivative of the cost functional (19) with respect to control pulses \(\varepsilon (t)\). Using the Pontryagin principle [20], it is possible to show that we need to solve the following equations to obtain the analytical formula for the derivative

$$\begin{aligned} \frac{\mathrm{d}\rho (t)}{\mathrm{d}t}&= -\mathrm{i}[H(\varepsilon (t)) ,\rho (t)] - \mathrm{i}L_\mathrm{D} [\rho (t)],\; t\in [0, T],\end{aligned}$$
(21)
$$\begin{aligned} \frac{\mathrm{d}\lambda (t)}{\mathrm{d}t}&= -\mathrm{i}[H(\varepsilon (t)) ,\lambda (t)] - \mathrm{i}L_\mathrm{D}^\dagger [\lambda (t)],\; t\in [0, T],\end{aligned}$$
(22)
$$\begin{aligned} L_\mathrm{D}[A]&= \mathrm{i}\sum _j \gamma _j(L_j A L_j^\dagger - \frac{1}{2}\{L_j^\dagger L_j,A\}),\end{aligned}$$
(23)
$$\begin{aligned} \rho (0)&= \rho _\mathrm{s}, \end{aligned}$$
(24)
$$\begin{aligned} \lambda (T)&= F'_0(\rho (T)), \end{aligned}$$
(25)

where \(\rho _\mathrm{s}\) denotes the initial density matrix, \(\lambda (t)\) is called the adjoint state and

$$\begin{aligned} F'_0(\rho (T)) = \rho (T) - \rho _\mathrm{T}. \end{aligned}$$
(26)

The derivation of these equations can be found in [21].

In order to optimize the control pulses using a gradient method, we convert the problem from an infinite dimensional (continuous time) to a finite dimensional (discrete time) one. For this purpose, we discretize the time interval \([0, T]\) into \(M\) equal sized subintervals \(\Delta t_k\). Thus, the problem becomes that of finding \(\varepsilon =[\varepsilon _1,\ldots ,\varepsilon _M]^\mathrm{T}\) such that

$$\begin{aligned} J(\varepsilon ) = \inf _{\zeta \in \mathbb {R}^M}J(\zeta ). \end{aligned}$$
(27)

The gradient of the cost functional is

$$\begin{aligned} G = \left[ \frac{\partial J}{\partial \varepsilon _1}, \ldots , \frac{\partial J}{\partial \varepsilon _M} \right] ^\mathrm{T}. \end{aligned}$$
(28)

It can be shown [21] that elements of vector (28) are given by

$$\begin{aligned} \frac{\partial J}{\partial \varepsilon _k} = \mathrm{tr}\left\{ -\mathrm{i}\lambda _k \left[ \frac{\partial H(\varepsilon _k)}{\partial \varepsilon _k}, \rho _k \right] \right\} \Delta t_k, \end{aligned}$$
(29)

where \(\rho _k\) and \(\lambda _k\) are solutions of the Lindblad equation and the adjoint system corresponding to time subinterval \(\Delta t_k\), respectively. To minimize the cost functional (19) using the gradient given in Eq. (28), we use the BFGS algorithm [22].
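The structure of such a gradient-based search can be sketched on a small example. The code below is a deliberately simplified stand-in and not the method described above: it approximates the gradient by forward finite differences instead of the adjoint formula (29), uses plain gradient descent with step halving instead of BFGS, and optimizes a single three-pulse move under amplitude damping. The value \(\gamma =0.1\), the starting pulses, the target \(| 1 \rangle \langle 1 |\) and all names are assumptions made for illustration.

```python
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def mm(a, b):
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)] for i in range(2)]

def lin(a, b, s=1.0):
    """Return a + s*b elementwise."""
    return [[a[i][j] + s * b[i][j] for j in range(2)] for i in range(2)]

def rhs(rho, h, gamma):
    """Eq. (3) with the single Lindblad operator sigma_minus = |0><1|."""
    comm = lin(mm(h, rho), mm(rho, h), -1.0)
    diss = [[rho[1][1], -0.5 * rho[0][1]], [-0.5 * rho[1][0], -rho[1][1]]]
    return [[-1j * comm[i][j] + gamma * diss[i][j] for j in range(2)] for i in range(2)]

def evolve(pulses, gamma, delta_t=1.0, steps=20):
    """Propagate |0><0| through z, y, z pulses with fixed-step Runge-Kutta."""
    rho = [[1.0, 0.0], [0.0, 0.0]]
    dt = delta_t / steps
    for k, a in enumerate(pulses):
        axis = SY if k % 3 == 1 else SZ
        h = [[a * x for x in row] for row in axis]
        for _ in range(steps):
            k1 = rhs(rho, h, gamma)
            k2 = rhs(lin(rho, k1, dt / 2), h, gamma)
            k3 = rhs(lin(rho, k2, dt / 2), h, gamma)
            k4 = rhs(lin(rho, k3, dt), h, gamma)
            rho = lin(rho, lin(lin(k1, k2, 2.0), lin(k4, k3, 2.0)), dt / 6)
    return rho

def cost(pulses, gamma, target):
    """Cost functional of Eqs. (19) and (20) evaluated at the final time."""
    rho = evolve(pulses, gamma)
    return 0.5 * sum(abs(rho[i][j] - target[i][j]) ** 2
                     for i in range(2) for j in range(2))

def fd_gradient(pulses, gamma, target, eps=1e-5):
    """Forward-difference approximation of the gradient vector (28)."""
    base = cost(pulses, gamma, target)
    grad = []
    for k in range(len(pulses)):
        shifted = list(pulses)
        shifted[k] += eps
        grad.append((cost(shifted, gamma, target) - base) / eps)
    return grad

def descend(pulses, gamma, target, iterations=15):
    """Gradient descent; each step is halved until the cost decreases."""
    pulses = list(pulses)
    current = cost(pulses, gamma, target)
    for _ in range(iterations):
        grad = fd_gradient(pulses, gamma, target)
        step = 1.0
        while step > 1e-6:
            trial = [p - step * g for p, g in zip(pulses, grad)]
            value = cost(trial, gamma, target)
            if value < current:
                pulses, current = trial, value
                break
            step /= 2
        else:
            break                       # no descent step found, stop early
    return pulses, current

target = [[0.0, 0.0], [0.0, 1.0]]       # |1><1|
start = [0.1, -0.3, 0.1]
initial_cost = cost(start, 0.1, target)
best_pulses, final_cost = descend(start, 0.1, target)
print("cost before:", initial_cost, " cost after:", final_cost)
print("pulses:", best_pulses)
```

Finite differences need one extra propagation per control parameter, whereas Eq. (29) yields all components of the gradient from one forward and one adjoint propagation; this is the practical reason for preferring the adjoint formulation when the number of pulses grows.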

4.2 Optimization setup

Our goal is to find control strategies for players, which maximize their respective chances of winning the game. We study three noise channels: the amplitude damping, the phase damping and the amplitude raising channel. They are given by the Lindblad operators \(\sigma _-\), \(\sigma _z\) and \(\sigma _+ = \sigma _-^\dagger \), respectively. In all cases, we assume that one of the players uses the Pauli strategy, while for the other player we try to optimize a control strategy that maximizes that player’s probability of winning. However, in our setup it is convenient to use the value of the observable \(\sigma _z\) rather than probabilities. Value 0 means that each player has a probability of \(\frac{1}{2}\) of winning the game. Values closer to 1 mean higher probability of winning for Bob, while values closer to -1 mean higher probability of winning for Alice.
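The conversion between the mean value of \(\sigma _z\) and the winning probabilities follows from \(\mathrm{tr}(\sigma _z\rho (T))=p(\mathrm{heads})-p(\mathrm{tails})\) together with \(p(\mathrm{heads})+p(\mathrm{tails})=1\). A small helper, whose name is an illustrative assumption, makes the mapping explicit:

```python
def win_probabilities(sigma_z_mean):
    """Map tr(sigma_z rho(T)) to the pair (p_alice, p_bob)."""
    p_bob = (1.0 + sigma_z_mean) / 2.0      # heads, state |0><0|
    p_alice = (1.0 - sigma_z_mean) / 2.0    # tails, state |1><1|
    return p_alice, p_bob

for value in (-1.0, 0.0, 0.5, 1.0):
    print(value, win_probabilities(value))
```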

4.3 Optimization results

4.3.1 Phase damping

The results for the phase damping channel are shown in Fig. 3. As can be seen, in this case both players are able to optimize their strategies. Alice can optimize her strategy for low values of \(\gamma \) to obtain a probability of winning greater than \(\frac{1}{2}\). The region where this occurs is shown in the inset. For high noise values, she is able to achieve a probability of winning equal to \(\frac{1}{2}\). In the case of high values of \(\gamma \), the best strategy for Alice is to drive the state as close as possible to the maximally mixed state on her first move. This state can be changed neither by Bob’s actions nor by the phase damping channel. On the other hand, optimization of Bob’s strategy shows that he is able to achieve high probabilities of winning for relatively low values of \(\gamma \). This is consistent with the limit shown in Eq. (17), as our initial state is \(\rho =| 0 \rangle \langle 0 |\). Figure 4 presents optimal game strategies for both players. For Alice we chose \(\gamma =1.172\), which corresponds to her maximal probability of winning the game. In the case of Bob’s strategies we arbitrarily chose the value \(\gamma =1.610\). In these cases the evolution of the qubit is much more complex. This is due to the fact that the players are not restricted to the Pauli strategy.

Fig. 3

Mean value of the pay-off for the phase damping channel with and without optimization of the player’s strategies. The inset shows the region where Alice is able to increase her probability of winning to exceed \(\frac{1}{2}\)

Fig. 4

Game results for the phase damping channel. Optimal Alice’s strategy when \(\gamma = 1.172\) (left side), and optimal Bob’s strategy when \(\gamma = 1.610\) (right side). a Optimal controls for Alice, b Optimal controls for Bob, c Mean values of \(\sigma _x,\sigma _y\) and \(\sigma _z\), d Mean values of \(\sigma _x,\sigma _y\) and \(\sigma _z\), e Time evolution of a quantum coin, f Time evolution of a quantum coin

4.3.2 Amplitude damping

Next, we present the results obtained for the amplitude damping channel. They are shown in Fig. 5. Unfortunately for Alice, for high values of \(\gamma \) Bob always wins. This is due to the fact that in this case the state quickly decays to the state \(| 0 \rangle \langle 0 |\). Bob is also able to optimize his strategy: he is able to achieve a probability of winning equal to 1 for relatively low values of \(\gamma \). However, for low values of \(\gamma \), the interaction allows Alice to achieve a probability of winning higher than \(\frac{1}{2}\). The region where this happens is magnified in the inset. Interestingly, for very low values of \(\gamma \), Alice can increase her probability of winning. This is due to the fact that low noise values are sufficient to distort Bob’s attempts to perform the Pauli strategy. On the other hand, they are not high enough to drive the system toward the state \(| 0 \rangle \langle 0 |\). Optimal game results for both players are shown in Fig. 6. For both players, we chose \(\gamma =0.621\), which corresponds to Alice’s maximal probability of winning the game. As can be seen, in this case, the evolutions of the observables \(\sigma _x\), \(\sigma _y\) and \(\sigma _z\) show rapid oscillations. This behavior is induced by applying control pulses associated with the \(\sigma _y\) Hamiltonian.

Fig. 5

Mean value of the pay-off for the amplitude damping channel with and without optimization of the player’s strategies. The inset shows the region where Alice is able to increase her probability of winning to exceed \(\frac{1}{2}\)

Fig. 6

Game results obtained for the amplitude damping channel with \(\gamma \) equal to \(0.621\). Optimal Alice’s strategy (left side), and optimal Bob’s strategy (right side). a Optimal controls for Alice, b Optimal controls for Bob, c Mean values of \(\sigma _x,\sigma _y\) and \(\sigma _z\), d Mean values of \(\sigma _x,\sigma _y\) and \(\sigma _z\), e Time evolution of a quantum coin, f Time evolution of a quantum coin

4.3.3 Amplitude raising

Finally, we present optimization results for the amplitude raising channel. The optimization results, shown in Fig. 7, indicate that Alice can achieve probability of winning equal to 1 for lower values of \(\gamma \) compared with the unoptimized case. In this case, Bob cannot do any better than in the unoptimized case due to a limited number of available control pulses.

Fig. 7

Mean value of the pay-off for the amplitude raising channel with and without optimization of the player’s strategies

5 Conclusions

We studied the quantum version of the coin flip game under decoherence. To model the interaction with the external environment, we used the Markovian approximation in the form of the Lindblad equation. Because the Pauli strategy is a known Nash equilibrium of the game, it was natural to investigate this strategy in the presence of noise. Our results show that in the presence of noise, the Pauli strategy is no longer a Nash equilibrium. One of the players, Bob in our case, is always favoured by amplitude and phase damping noise. If we had considered a game with another initial state, e.g., \(\rho _0=| 1 \rangle \langle 1 |\), Alice would have been favoured in this case. Our next step was to check if the players were able to do better than the Pauli strategy. For this, we used the BFGS gradient method to optimize the players’ strategies. Our results show that both Alice and Bob are able to increase their respective winning probabilities. Alice can achieve this for all three studied cases, while Bob can only do this for the phase damping and amplitude damping channels.