
Iterative Computation of Security Strategies of Matrix Games with Growing Action Set

Abstract

This paper studies how to efficiently update the saddle-point strategy, or security strategy, of one player in a matrix game when the other player develops new actions in the game. It is well known that the saddle-point strategy of one player can be computed by solving a linear program (LP). Developing a new action adds a new constraint to the existing LP. Our problem therefore becomes how to efficiently solve the new LP with one additional constraint. Considering the potentially huge number of constraints, which corresponds to the large size of the other player’s action set, we use the shadow vertex simplex method, whose computational complexity is sublinear in the number of constraints, as the basis of our iterative algorithm. We first rebuild the main theorems of the shadow vertex method under a relaxed non-degeneracy assumption to make sure such a method works well in our model, then analyze the probability that the old optimum remains optimal in the new LP, and finally provide the iterative shadow vertex method, whose average computational complexity is shown to be strictly less than that of the shadow vertex method. Simulation results demonstrate our main results on the probability of re-computing the optimum and on the computational complexity of the iterative shadow vertex method.
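For concreteness, the LP behind a security strategy can be sketched as follows: the row player of a payoff matrix A maximizes the game value v subject to one constraint per column of A (each column is an opponent action, so a new opponent action adds one constraint). This is a minimal illustrative sketch using `scipy`; the function name `security_strategy` is ours, not from the paper.

```python
import numpy as np
from scipy.optimize import linprog

def security_strategy(A):
    """Security (maximin) strategy of the row player for payoff matrix A.

    Decision variables z = [x_1, ..., x_m, v]; maximize v subject to
    A^T x >= v*1, 1^T x = 1, x >= 0. Each column of A contributes one
    inequality, so a new opponent action appends one constraint row.
    """
    m, n = A.shape
    c = np.zeros(m + 1)
    c[-1] = -1.0  # linprog minimizes, so minimize -v
    # v - (A^T x)_j <= 0 for every column j of A
    A_ub = np.hstack([-A.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # probability constraint: sum(x) = 1
    A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]  # x >= 0, v free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:m], res.x[-1]  # mixed strategy, game value

# Matching pennies: the security strategy is uniform and the value is 0.
x, v = security_strategy(np.array([[1.0, -1.0], [-1.0, 1.0]]))
```

Re-solving this LP from scratch each time a column is added is what the iterative shadow vertex method avoids.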



References

  1. Agrawal S, Wang Z, Ye Y (2014) A dynamic near-optimal algorithm for online linear programming. Oper Res 62(4):876–890


  2. Başar T, Olsder GJ (1998) Dynamic noncooperative game theory. SIAM

  3. Bopardikar SD, Langbort C (2016) On incremental approximate saddle-point computation in zero-sum matrix games. Automatica 69:150–156


  4. Borgwardt KH (1982) The average number of pivot steps required by the simplex-method is polynomial. Math Methods Oper Res 26(1):157–177


  5. Borgwardt KH (2012) The simplex method: a probabilistic analysis, vol 1. Springer, Berlin


  6. Dantzig G (2016) Linear programming and extensions. Princeton University Press, Princeton


  7. Devanur NR, Hayes TP (2009) The adwords problem: online keyword matching with budgeted bidders under random permutations. In: Proceedings of the 10th ACM conference on electronic commerce. ACM, pp 71–78

  8. Eghbali R, Fazel M (2016) Worst case competitive analysis of online algorithms for conic optimization. arXiv:1611.00507

  9. Feldman J, Henzinger M, Korula N, Mirrokni V, Stein C (2010) Online stochastic packing applied to display ad allocation. In: European symposium on algorithms. Springer, pp 182–194

  10. Iouditski A (2016) Lecture notes for Convex Optimization I: introduction. http://ljk.imag.fr/membres/Anatoli.Iouditski/cours/convex/chapitre_2.pdf

  11. Kleinberg R (2005) A multiple-choice secretary algorithm with applications to online auctions. In: Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms. Society for Industrial and Applied Mathematics, pp 630–631

  12. Liebling TM, Henn R, Künzi H, Schubert H (1972) On the number of iterations of the simplex method. Methods Oper Res 2:248–264


  13. Lindberg PO (1981) A note on random LP-problems. Department of Mathematics, Royal Institute of Technology, Stockholm


  14. Molinaro M, Ravi R (2013) The geometry of online packing linear programs. Math Oper Res 39(1):46–59


  15. Schmidt WM (1968) Some results in probabilistic geometry. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 9(2):158–162


  16. Spielman DA, Teng SH (2004) Smoothed analysis of algorithms: why the simplex algorithm usually takes polynomial time. J ACM 51(3):385–463


  17. Tsai J, Yin Z, Kwak Jy, Kempe D, Kiekintveld C, Tambe M (2010) Urban security: game-theoretic resource allocation in networked physical domains. In: National conference on artificial intelligence (AAAI)


Author information


Corresponding author

Correspondence to Lichun Li.

Additional information

This work was supported in part by NSF Grants 1619339 and 1151076 (Directorate for Computer and Information Science and Engineering) to Cedric Langbort.

Appendix


Proof of Lemma 1

Assume that Eq. (15) holds. We know that \(A_{\varOmega _0^i}x_{0}=b_{\varOmega _0^i}\ge A_{\varOmega _0^i}x\) for any feasible x, which, together with the nonnegative coefficients in Eq. (15), implies that \(w^Tx_0\ge w^T x\) for any feasible x.

Now assume that \(w^Tx_0\ge w^T x\) for any feasible x. According to the inhomogeneous Farkas theorem [10], \(w=\sum _{i=1}^{m}\rho _iA_i^T\) for some nonnegative vector \(\rho \) satisfying \(\sum _{i=1}^m \rho _i b_i\le w^T x_0=\sum _{i=1}^{m}\rho _iA_i x_0\). This implies that \(\sum _{i\notin \varOmega _0}\rho _i(b_i-A_ix_0) \le 0\). Since \(\rho _i\ge 0\) and \(b_i-A_ix_0>0\) for \(i\notin \varOmega _0\), the inequality holds only if \(\rho _i=0\) for all \(i\notin \varOmega _0\), and hence Eq. (15) is true.
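The certificate in Lemma 1, that w lies in the convex cone of the active constraint rows, can be checked numerically with nonnegative least squares. The sketch below is ours, not from the paper; the helper `in_active_cone` assumes the active rows are stacked as the rows of a matrix.

```python
import numpy as np
from scipy.optimize import nnls

def in_active_cone(w, A_active, tol=1e-9):
    """Check whether w = sum_i rho_i * A_active[i] for some rho >= 0,
    i.e. whether w lies in the convex cone of the active constraint
    rows (the optimality certificate of Lemma 1)."""
    # nnls solves min ||A_active^T rho - w|| subject to rho >= 0
    rho, residual = nnls(A_active.T, w)
    return residual < tol

# Toy example: maximize w^T x over the unit square. The point (1, 1)
# activates the constraints x_1 <= 1 and x_2 <= 1, with rows (1, 0)
# and (0, 1).
A_active = np.array([[1.0, 0.0], [0.0, 1.0]])
ok = in_active_cone(np.array([2.0, 3.0]), A_active)    # in the cone: (1,1) optimal
bad = in_active_cone(np.array([-1.0, 1.0]), A_active)  # not in the cone
```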

Proof of Proposition 1

1) \(\Rightarrow \) 2): It is clear that if \(\hat{x}\) is a shadow vertex, its projection lies on the boundary of \(\varGamma (X)\).

2) \(\Rightarrow \) 3): Let \(\varGamma (\hat{x})\in \partial \varGamma (X)\). By the convexity of \(\varGamma (X)\), there must exist a \(w\in span(u,c){\setminus } \{\mathbf {0}\}\) such that \(w^T\varGamma (\hat{x}) \ge w^T \varGamma (x)\) for any \(x\in X\). Meanwhile, we know that \(x-\varGamma (x)\bot span(u,c)\), and hence \(w^Tx=w^T\varGamma (x)\) for any \(x\in X\). Therefore, we have \(w^T\hat{x}\ge w^Tx\) for any \(x\in X\).

3) \(\Rightarrow \) 1): Now assume that 3) is true. Because \(x-\varGamma (x)\bot span(u,c)\) and \(w\in span(u,c)\), we know that \(\varGamma (\hat{x})\) is maximal relative to w over \(\varGamma (X)\). Therefore, \(\varGamma (\hat{x})\) lies on the boundary of the shadow \(\varGamma (X)\). Since \(\varGamma (X)\) is a two-dimensional polygon, if \(\varGamma (\hat{x})\) is not a vertex, then it must lie in the interior of an edge.

Together with the fact that \(\varGamma (\hat{x})\) is the optimum w.r.t. w, we know that w is orthogonal to that edge, and there exists a \(v\in span(u,c){\setminus } \{0\}\) such that \(w\bot v\) and \(\varGamma (\hat{x})\) is maximal relative to \(w+\varepsilon v\) over \(\varGamma (X)\) if and only if \(\varepsilon =0\). Since \(x-\varGamma (x)\bot span(u,c)\) and \(w+\varepsilon v\in span(u,c)\), we know that \(\hat{x}\) is maximal over X relative to \(w+\varepsilon v\) if and only if \(\varepsilon =0\).

Let \(\varOmega \) be the active constraint set at \(x=\hat{x}\). Assumption 1 indicates that \(\varOmega \) has n elements. Since \(\hat{x}\) is maximal relative to w, according to Theorem 1, \(w=\sum _{i=1}^n \rho _i A_{\varOmega ^i}^T\ne 0\) for some \(\rho \ge 0\). Let \(v=\sum _{i=1}^n \alpha _i A_{\varOmega ^i}^T\), so that \(w+\varepsilon v=\sum _{i=1}^n (\rho _i+\varepsilon \alpha _i)A_{\varOmega ^i}^T\). That \(\hat{x}\) is not maximal relative to \(w+\varepsilon v\) for \(\varepsilon >0\) implies that there exists an l such that \(\rho _l+\varepsilon \alpha _l<0\) for any \(\varepsilon >0\); by continuity, \(\rho _l+0\alpha _l=0\). Similarly, that \(\hat{x}\) is not maximal relative to \(w+\varepsilon v\) for \(\varepsilon <0\) implies that there exists a k such that \(\rho _k+\varepsilon \alpha _k<0\) for any \(\varepsilon <0\), and continuity implies \(\rho _k+0\alpha _k=0\). Moreover, since \(\rho _i+\varepsilon \alpha _i\) is a linear function of \(\varepsilon \) for \(i=l,k\), and no linear function vanishing at 0 can be negative on both sides of 0, we must have \(l\ne k\).

Therefore, at least two elements of \(\rho \) are 0, and we have

$$\begin{aligned} w=\beta _1 u+\beta _2 c=\sum _{i=1,i\ne k,l}^n \rho _i A_{\varOmega ^i}^T. \end{aligned}$$
(38)

This means that n elements of \(\{A_{\varOmega ^1},\ldots ,A_{\varOmega ^n},u,c\}\) are linearly dependent, which contradicts Assumption 1. Hence, \(\varGamma (\hat{x})\) is a vertex of \(\varGamma (X)\).
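The shadow \(\varGamma (X)\) used throughout this proof can be computed numerically once the polytope's vertices are available. The sketch below is ours, not from the paper: it builds an orthonormal basis of \(span(u,c)\) and returns the 2-D coordinates of each projected point, so that shadow vertices can be read off as extreme points of the resulting planar set.

```python
import numpy as np

def shadow(points, u, c):
    """Project points onto the plane span(u, c) and return their 2-D
    coordinates in an orthonormal basis of that plane. Since the
    projection residual x - Gamma(x) is orthogonal to span(u, c),
    these coordinates are exactly B^T x for an orthonormal basis B."""
    B = np.linalg.qr(np.column_stack([u, c]))[0]  # orthonormal basis of span(u, c)
    return points @ B

# Vertices of the unit cube in R^3, projected onto the x1-x2 plane:
# points differing only in x3 collapse to the same shadow point.
verts = np.array([[i, j, k] for i in (0, 1) for j in (0, 1) for k in (0, 1)],
                 dtype=float)
pts2d = shadow(verts, np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0]))
```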

Results Related to Theorem 1

Lemma 4

Suppose Assumptions 1–3 hold. The unique optimal solution \(x^*\) of the old LP (10, 11) is no longer optimal in the new LP (25, 26) if and only if the new constraint (27) is active at every optimum of the new LP (25, 26), i.e., \([g_n-g_1\ g_n-g_2\ \ldots \ g_n-g_{n-1}\ 1]x^+ = g_n \), where \(x^+\) is any optimum of the new LP (25, 26).

Proof

Suppose \(x^*\) is not an optimum of the new LP. According to Theorem 1, we have \(\hat{A}_{m+1}x^*>\hat{b}_{m+1}\). If there exists an optimum \(x^+\) of the new LP such that \(\hat{A}_{m+1}x^+<\hat{b}_{m+1}\), then there must exist an \(x=\alpha x^+ +(1-\alpha ) x^*\) with \(\alpha \in (0,1)\) such that \(\hat{A}_{m+1}x=\hat{b}_{m+1}\). It is easy to verify that x is a feasible solution of the new LP. Since \(x^+\) is an optimum of the new LP, we have

$$\begin{aligned} c^Tx^+ \ge c^T x. \end{aligned}$$
(39)

Meanwhile, the fact that \(x^*\) is the unique optimum of the old LP implies that \(c^Tx^+<c^Tx^*\), and we have

$$\begin{aligned} c^Tx=\alpha c^Tx^+ +(1-\alpha ) c^Tx^*>c^T x^+ \end{aligned}$$

which contradicts Eq. (39). Therefore, there does not exist an optimum \(x^+\) such that \(\hat{A}_{m+1}x^+<\hat{b}_{m+1}\). Together with the fact that an optimal solution is feasible, we conclude that \(\hat{A}_{m+1}x^+=\hat{b}_{m+1}\) for any optimum of the new LP problem.

For the other direction, suppose the new constraint is active at every optimum of the new LP (25, 26). If the old optimum \(x^*\) were an optimum of the new LP, the new constraint (27), i.e., constraint \(m+1\), would be active at \(x^*\). Together with the n active constraints among the old constraints, there would be \(n+1\) active constraints, which contradicts Assumption 1. Therefore, \(x^*\) is not an optimum of the new LP. \(\square \)
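A practical consequence of this argument: under the non-degeneracy assumption, the old optimum survives exactly when it satisfies the appended inequality \(\hat{A}_{m+1}x\le \hat{b}_{m+1}\), so a single inner product decides whether any re-computation is needed. A minimal sketch of this check (function name ours, not from the paper):

```python
import numpy as np

def old_optimum_still_valid(a_new, b_new, x_star, tol=1e-9):
    """Return True if the old optimum x_star satisfies the newly added
    constraint a_new @ x <= b_new, in which case it remains optimal;
    otherwise the LP must be re-solved (e.g. by shadow vertex pivots
    starting from x_star)."""
    return float(a_new @ x_star) <= b_new + tol

x_star = np.array([0.5, 0.5])
keep = old_optimum_still_valid(np.array([1.0, 1.0]), 1.2, x_star)     # satisfied
resolve = old_optimum_still_valid(np.array([3.0, 0.0]), 1.0, x_star)  # violated
```

The paper's probabilistic analysis bounds how often the second, expensive branch is taken.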

For simplicity in the following two lemmas, we let

$$\begin{aligned} \varOmega _1&=\{i_1,\ldots ,i_\ell ,\kappa _1,\ldots ,\kappa _{n-\ell }\} \end{aligned}$$
(40)
$$\begin{aligned} \varOmega _2&=\{j_1,\ldots ,j_\ell ,\kappa _1,\ldots ,\kappa _{n-\ell }\} \end{aligned}$$
(41)

for some \(\ell =1,\ldots ,n\), where \(\kappa _1,\ldots ,\kappa _{n-\ell }\in \{m+2,\ldots ,m+1+n\}\) are the indices of probability constraints, and \(i_1,\ldots ,i_\ell ,j_1,\ldots ,j_\ell \in \{1,\ldots ,m+1\}\) are the indices of normal constraints.

Lemma 5

Suppose Assumptions 1 and 2 hold. Any n-element subset \(\varDelta \subset \{1,\ldots ,m+1+n\}\) that contains the same probability constraints has the same probability of being the optimal active constraint set of the new LP problem (25, 26). In other words,

$$\begin{aligned} P(\varOmega _1\hbox { is the optimal active constraint set})=P(\varOmega _2\hbox { is the optimal active constraint set}), \end{aligned}$$
(42)

where \(\varOmega _1\) and \(\varOmega _2\) are defined as in (40) and (41), respectively.

Proof

Since the payoff columns are independently and identically distributed, \(\hat{A}_1,\ldots ,\hat{A}_{m+1}\) are also independently and identically distributed. Therefore, the probability that the convex cone of \(\hat{A}_{i_1},\ldots ,\hat{A}_{i_\ell },\hat{A}_{\kappa _1},\ldots ,\hat{A}_{\kappa _{n-\ell }}\) contains c is the same as the probability that the convex cone of \(\hat{A}_{j_1},\ldots ,\hat{A}_{j_\ell },\hat{A}_{\kappa _1},\ldots ,\hat{A}_{\kappa _{n-\ell }}\) contains c. Together with Lemma 1, we see that Eq. (42) is true. \(\square \)

Related Results in the Proof of Theorem 2

Lemma 6

Consider the new LP problem (25, 26), and suppose Assumptions 1 and 2 hold. Any n-element subset \(\varDelta \subset \{1,\ldots ,m+1+n\}\) that contains the same probability constraints has the same probability of forming a shadow vertex of the new LP problem (25, 26). In other words,

$$\begin{aligned} P(\varOmega _1\hbox { forms a shadow vertex})=P(\varOmega _2\hbox { forms a shadow vertex}), \end{aligned}$$
(43)

where \(\varOmega _1\) and \(\varOmega _2\) are defined as in (40) and (41), respectively.

Proof

Since the payoff columns are independently and identically distributed, \(\hat{A}_1,\ldots ,\hat{A}_{m+1}\) are also independently and identically distributed. Therefore, the probability that the convex cone of \(\hat{A}_{i_1},\ldots ,\hat{A}_{i_\ell },\hat{A}_{\kappa _1},\ldots ,\hat{A}_{\kappa _{n-\ell }}\) intersects with \(span(u,c)\) is the same as the probability that the convex cone of \(\hat{A}_{j_1},\ldots ,\hat{A}_{j_\ell },\hat{A}_{\kappa _1},\ldots ,\hat{A}_{\kappa _{n-\ell }}\) intersects with \(span(u,c)\). Together with Theorem 1, we see that Eq. (43) is true. \(\square \)


Cite this article

Li, L., Langbort, C. Iterative Computation of Security Strategies of Matrix Games with Growing Action Set. Dyn Games Appl 9, 942–964 (2019). https://doi.org/10.1007/s13235-018-0283-5


Keywords

  • Game theory
  • Growing action set
  • Iterative computation
  • Shadow vertex method