The Operator Approach to Entropy Games

Theory of Computing Systems

Abstract

Entropy games and matrix multiplication games have been recently introduced by Asarin et al. They model the situation in which one player (Despot) wishes to minimize the growth rate of a matrix product, whereas the other player (Tribune) wishes to maximize it. We develop an operator approach to entropy games. This allows us to show that entropy games can be cast as stochastic mean payoff games in which some action spaces are simplices and payments are given by a relative entropy (Kullback-Leibler divergence). In this way, we show that entropy games with a fixed number of states belonging to Despot can be solved in polynomial time. This approach also allows us to solve these games by a policy iteration algorithm, which we compare with the spectral simplex algorithm developed by Protasov.
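The link between growth rates and relative-entropy payments can be illustrated with the classical Gibbs variational identity: for positive weights m_j and real values v_j, log(sum_j m_j exp(v_j)) equals the maximum over the simplex of sum_j p_j v_j minus sum_j p_j log(p_j / m_j), attained at p_j proportional to m_j exp(v_j). The following is a minimal numerical sketch of that identity only, not the construction used in the article; the function names and the sample data are hypothetical.

```python
import math

def log_sum_exp(weights, values):
    """Return log(sum_j weights[j] * exp(values[j])) for positive weights."""
    shift = max(values)  # shift for numerical stability
    return shift + math.log(
        sum(w * math.exp(v - shift) for w, v in zip(weights, values))
    )

def entropic_payoff(p, weights, values):
    """Return <p, values> minus the relative-entropy term sum_j p_j log(p_j / weights_j)."""
    total = 0.0
    for pj, w, v in zip(p, weights, values):
        if pj > 0.0:  # convention: 0 * log(0) = 0
            total += pj * (v - math.log(pj / w))
    return total

def gibbs_maximizer(weights, values):
    """Return the simplex point with p_j proportional to weights_j * exp(values_j)."""
    shift = max(values)
    unnorm = [w * math.exp(v - shift) for w, v in zip(weights, values)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# Hypothetical data: one positive row of a nonnegative matrix and a value vector.
weights = [2.0, 1.0, 3.0]
values = [0.5, -1.0, 0.2]

p_star = gibbs_maximizer(weights, values)
lhs = log_sum_exp(weights, values)               # "log of a matrix-vector product" side
rhs = entropic_payoff(p_star, weights, values)   # "simplex action with entropy payment" side
uniform = [1.0 / 3.0] * 3                        # any other simplex point does no better
```

Here `lhs` and `rhs` agree up to rounding, while a non-optimal choice such as `uniform` yields a payoff no larger than `lhs`. Read this way, the logarithm of a nonnegative linear form becomes a one-player choice over a simplex with a relative-entropy payment.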

References

  1. Anantharam, V., Borkar, V.S.: A variational formula for risk-sensitive reward. SIAM J. Control Optim. 55(2), 961–988 (2017). arXiv:1501.00676

  2. Asarin, E., Cervelle, J., Degorre, A., Dima, C., Horn, F., Kozyakin, V.: Entropy games and matrix multiplication games. In: 33rd Symposium on Theoretical Aspects of Computer Science, STACS, Orléans, France, pp. 11:1–11:14 (2016)

  3. Akian, M., Gaubert, S., Guterman, A.: Tropical polyhedra are equivalent to mean payoff games. Int. J. Algebra Comput. 22(1), 125001 (43 pages) (2012)

  4. Akian, M., Gaubert, S., Grand-Clément, J., Guillaud, J.: The Operator Approach to Entropy Games. In: Vollmer, H., Vallée, B. (eds.) 34th Symposium on Theoretical Aspects of Computer Science (STACS 2017), volume 66 of Leibniz International Proceedings in Informatics (LIPIcs), pp. 6:1–6:14. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl (2017)

  5. Akian, M., Gaubert, S., Nussbaum, R.: A Collatz-Wielandt characterization of the spectral radius of order-preserving homogeneous maps on cones. arXiv:1112.5968 (2011)

  6. Andersson, D., Miltersen, P.B.: The complexity of solving stochastic games on graphs. In: Proceedings of ISAAC’09, number 5878 in LNCS, pp. 112–121. Springer (2009)

  7. Borwein, J.M., Borwein, P.B.: On the complexity of familiar functions and numbers. SIAM Rev. 30(4), 589–601 (1988)

  8. Baillon, J.B., Bruck, R.E.: Optimal rates of asymptotic regularity for averaged nonexpansive mappings. In: Tan, K. K. (ed.) Proceedings of the Second International Conference on Fixed Point Theory and Applications, pp. 27–66. World Scientific Press (1992)

  9. Bolte, J., Gaubert, S., Vigeral, G.: Definable zero-sum stochastic games. Math. Oper. Res. 40(1), 171–191 (2014)

  10. Bewley, T., Kohlberg, E.: The asymptotic theory of stochastic games. Math. Oper. Res. 1(3), 197–208 (1976)

  11. Blondel, V.D., Nesterov, Y.: Polynomial-time computation of the joint spectral radius for some sets of nonnegative matrices. SIAM J. Matrix Anal. 31(3), 865–876 (2009)

  12. Berman, A., Plemmons, R.J.: Nonnegative matrices in the mathematical sciences. Academic Press, New York (1994)

  13. Chen, T., Han, T.: On the complexity of computing maximum entropy for Markovian models. In: 34th International Conference on Foundation of Software Technology and Theoretical Computer Science, FSTTCS 2014, pp. 571–583, New Delhi (2014)

  14. Crandall, M.G., Tartar, L.: Some relations between non expansive and order preserving maps. Proc. AMS 78(3), 385–390 (1980)

  15. Donsker, M.D., Varadhan, R.: On a variational formula for the principal eigenvalue for operators with maximum principle. Proc. Nat. Acad. Sci. USA 72(3), 780–783 (1975)

  16. Fleming, W.H., Hernández-Hernández, D.: Risk-sensitive control of finite state machines on an infinite horizon. I. SIAM J. Control Optim. 35(5), 1790–1810 (1997)

  17. Fleming, W.H., Hernández-Hernández, D.: Risk-sensitive control of finite state machines on an infinite horizon. II. SIAM J. Control Optim. 37(4), 1048–1069 (electronic) (1999)

  18. Gaubert, S., Gunawardena, J.: A non-linear hierarchy for discrete event dynamical systems. In: Proceedings of the Fourth Workshop on Discrete Event Systems (WODES98), pp. 249–254. IEEE, Cagliari (1998)

  19. Gaubert, S., Gunawardena, J.: The Perron-Frobenius theorem for homogeneous, monotone functions. Trans. AMS 356(12), 4931–4950 (2004)

  20. Grötschel, M., Lovász, L., Schrijver, A.: The ellipsoid method and its consequences in combinatorial optimization. Combinatorica 1(2), 169–197 (1981)

  21. Gaubert, S., Stott, N.: A convergent hierarchy of non-linear eigenproblems to compute the joint spectral radius of nonnegative matrices. Proceedings of the 23rd International Symposium on Mathematical Theory of Networks and Systems (MTNS2018), Hong Kong (2018)

  22. Gaubert, S., Vigeral, G.: A maximin characterization of the escape rate of nonexpansive mappings in metrically convex spaces. Math. Proc. Camb. Phil. Soc. 152, 341–363 (2012)

  23. Hoffman, A.J., Karp, R.M.: On nonterminating stochastic games. Manag. Sci. J. Inst. Manag. Sci. Appl. Theory Ser. 12, 359–370 (1966)

  24. Howard, R.A., Matheson, J.E.: Risk-sensitive Markov decision processes. Manag. Sci. 18(7), 356–369 (1972)

  25. Hansen, T.D., Miltersen, P.B., Zwick, U.: Strategy iteration is strongly polynomial for 2-player turn-based stochastic games with a constant discount factor. In: Innovations in Computer Science 2011, pp. 253–263. Tsinghua University Press (2011)

  26. Ishikawa, S.: Fixed points and iteration of a nonexpansive mapping in a Banach space. Proc. Amer. Math. Soc. 59(1), 65–71 (1976)

  27. Kingman, J.F.C.: A convexity property of positive matrices. Quart. J. Math. Oxford Ser. 2(12), 283–284 (1961)

  28. Kozyakin, V.: Hourglass alternative and the finiteness conjecture for the spectral characteristics of sets of non-negative matrices. Linear Algebra Appl. 489, 167–185 (2016)

  29. Krasnosel’skiĭ, M. A.: Two remarks on the method of successive approximations. Uspekhi Matematicheskikh Nauk 10, 123–127 (1955)

  30. Kullback, S.: Information theory and statistics. Dover Publications, Inc., Mineola (1997). Reprint of the second (1968) edition

  31. Lemmens, B., Lins, B., Nussbaum, R., Wortel, M.: Denjoy-Wolff theorems for Hilbert’s and Thompson’s metric spaces. J. d’Anal. Math. 134, 671–718 (2018)

  32. Lothaire, M.: Applied combinatorics on words. Cambridge, New York (2005)

  33. Mann, W.R.: Mean value methods in iteration. Proc. Amer. Math. Soc. 4, 506–510 (1953)

  34. Mertens, J.-F., Neyman, A.: Stochastic games. Internat. J. Game Theory 10(2), 53–66 (1981)

  35. Müller, J. M.: Elementary functions: algorithms and implementation. Birkhäuser, Cambridge (2005)

  36. Neyman, A.: Stochastic games and nonexpansive maps. In Stochastic games and applications (Stony Brook, NY, 1999), volume 570 of NATO Sci. Ser. C Math. Phys. Sci., pp. 397–415. Kluwer Acad. Publ., Dordrecht (2003)

  37. Nussbaum, R.D.: Convexity and log convexity for the spectral radius. Linear Algebra Appl. 73, 59–122 (1986)

  38. Protasov, V. Yu.: Spectral simplex method. Math. Program. 156(1-2, Ser. A), 485–511 (2016)

  39. Puterman, M.L.: Markov decision processes. Wiley, New York (2005)

  40. Rothblum, U.G.: Multiplicative Markov decision chains. Math. Oper. Res. 9(1), 6–24 (1984)

  41. Rump, S.M.: Polynomial minimum root separation. Math. Comput. 33(145), 327–336 (1979)

  42. Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer, Berlin (1998)

  43. Sladký, K.: On Dynamic Programming Recursions for Multiplicative Markov Decision Chains, pp. 216–226. Springer, Berlin (1976)

  44. van den Dries, L.: Tame topology and o-minimal structures, volume 248 of London Mathematical Society Lecture Note Series. Cambridge University Press, Cambridge (1998)

  45. van den Dries, L.: o-minimal structures and real analytic geometry. In: Current developments in mathematics, 1998 (Cambridge, MA), pp. 105–152. Int. Press, Somerville (1999)

  46. Vigeral, G.: A zero-sum stochastic game with compact action sets and no asymptotic value. Dyn. Games Appl. 3(2), 172–186 (2013)

  47. Whittle, P.: Optimization over time, I. Wiley, New York (1982)

  48. Wilkie, A.J.: Model completeness results for expansions of the ordered field of real numbers by restricted Pfaffian functions and the exponential function. J. Amer. Math. Soc. 9(4), 1051–1094 (1996)

  49. Ye, Y.: The simplex and policy-iteration methods are strongly polynomial for the Markov decision problem with a fixed discount rate. Math. Oper. Res. 36(4), 593–603 (2011)

  50. Zijm, W.H.M.: Asymptotic expansions for dynamic programming recursions with general nonnegative matrices. J. Optim. Theory Appl. 54(1), 157–191 (1987)

Acknowledgments

An announcement of the present results appeared in the proceedings of STACS [4]. We are very grateful to the referees of this STACS paper, and to the referees of the present extended version, for their detailed comments, which helped us to improve this manuscript.

Author information

Corresponding author

Correspondence to Stéphane Gaubert.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the Topical Collection on Special Issue on Theoretical Aspects of Computer Science (STACS 2017)

The authors were partially supported by the ANR through the MALTHY INS project, and by the Gaspard Monge corporate sponsorship Program (PGMO) of EDF, Orange, Thales and Fondation Mathématique Jacques Hadamard.

About this article

Cite this article

Akian, M., Gaubert, S., Grand-Clément, J. et al. The Operator Approach to Entropy Games. Theory Comput Syst 63, 1089–1130 (2019). https://doi.org/10.1007/s00224-019-09925-z

  • DOI: https://doi.org/10.1007/s00224-019-09925-z
