The Operator Approach to Entropy Games

Akian, Marianne; Gaubert, Stéphane; Grand-Clément, Julien; Guillaud, Jérémie

doi:10.1007/s00224-019-09925-z

Published: 30 May 2019

Theory of Computing Systems, volume 63, pages 1089–1130 (2019)

Marianne Akian¹, Stéphane Gaubert¹ (ORCID: orcid.org/0000-0002-2777-9988), Julien Grand-Clément² & Jérémie Guillaud³

Abstract

Entropy games and matrix multiplication games have been recently introduced by Asarin et al. They model the situation in which one player (Despot) wishes to minimize the growth rate of a matrix product, whereas the other player (Tribune) wishes to maximize it. We develop an operator approach to entropy games. This allows us to show that entropy games can be cast as stochastic mean payoff games in which some action spaces are simplices and payments are given by a relative entropy (Kullback-Leibler divergence). In this way, we show that entropy games with a fixed number of states belonging to Despot can be solved in polynomial time. This approach also allows us to solve these games by a policy iteration algorithm, which we compare with the spectral simplex algorithm developed by Protasov.
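
The appearance of simplices and relative entropy in the abstract can be illustrated by the classical Gibbs variational formula: for a nonnegative weight vector m and a real vector x, log Σ_j m_j exp(x_j) is the maximum, over probability vectors p supported where m is positive, of Σ_j p_j x_j − KL(p‖m). The sketch below checks this identity numerically. It illustrates the formula only and is not the authors' construction; the function names and the sample data are assumptions made for this example.

```python
import math


def log_weighted_sum_exp(m, x):
    """Return log(sum_j m[j] * exp(x[j])) for nonnegative weights m, not all zero."""
    # Shift by the largest exponent on the support of m for numerical stability.
    shift = max(xj for mj, xj in zip(m, x) if mj > 0)
    total = sum(mj * math.exp(xj - shift) for mj, xj in zip(m, x) if mj > 0)
    return shift + math.log(total)


def relative_entropy(p, m):
    """Return KL(p || m) = sum_j p[j] * log(p[j] / m[j]), with 0 * log 0 = 0."""
    value = 0.0
    for pj, mj in zip(p, m):
        if pj > 0:
            if mj == 0:
                return math.inf  # p puts mass outside the support of m
            value += pj * math.log(pj / mj)
    return value


def entropic_payoff(p, m, x):
    """Expected value of x under p, minus the relative entropy of p w.r.t. m."""
    return sum(pj * xj for pj, xj in zip(p, x)) - relative_entropy(p, m)


def gibbs_maximizer(m, x):
    """Return the probability vector p with p[j] proportional to m[j] * exp(x[j])."""
    shift = max(xj for mj, xj in zip(m, x) if mj > 0)
    w = [mj * math.exp(xj - shift) if mj > 0 else 0.0 for mj, xj in zip(m, x)]
    total = sum(w)
    return [wj / total for wj in w]


if __name__ == "__main__":
    m = [2.0, 0.0, 1.0]   # hypothetical nonnegative weights (one row of a matrix)
    x = [0.3, 5.0, -1.0]  # hypothetical potential vector
    p_star = gibbs_maximizer(m, x)
    print("log-sum-exp value :", log_weighted_sum_exp(m, x))
    print("payoff at p*      :", entropic_payoff(p_star, m, x))
    print("payoff at uniform :", entropic_payoff([0.5, 0.0, 0.5], m, x))
```

The maximizing p plays the role of a mixed action in a simplex, and the relative-entropy term plays the role of a payment, which is the shape of the reduction the abstract describes.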

References

[1] Anantharam, V., Borkar, V.S.: A variational formula for risk-sensitive reward. SIAM J. Control Optim. 55(2), 961–988 (2017). arXiv:1501.00676
[2] Asarin, E., Cervelle, J., Degorre, A., Dima, C., Horn, F., Kozyakin, V.: Entropy games and matrix multiplication games. In: 33rd Symposium on Theoretical Aspects of Computer Science, STACS, Orléans, France, pp. 11:1–11:14 (2016)
[3] Akian, M., Gaubert, S., Guterman, A.: Tropical polyhedra are equivalent to mean payoff games. Int. J. Algebra Comput. 22(1), 125001 (43 pages) (2012)
[4] Akian, M., Gaubert, S., Grand-Clément, J., Guillaud, J.: The Operator Approach to Entropy Games. In: Vollmer, H., Vallée, B. (eds.) 34th Symposium on Theoretical Aspects of Computer Science (STACS 2017), volume 66 of Leibniz International Proceedings in Informatics (LIPIcs), pp. 6:1–6:14. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl (2017)
[5] Akian, M., Gaubert, S., Nussbaum, R.: A Collatz-Wielandt characterization of the spectral radius of order-preserving homogeneous maps on cones. arXiv:1112.5968 (2011)
[6] Andersson, D., Miltersen, P.B.: The complexity of solving stochastic games on graphs. In: Proceedings of ISAAC’09, number 5878 in LNCS, pp. 112–121. Springer (2009)
[7] Borwein, J.M., Borwein, P.B.: On the complexity of familiar functions and numbers. SIAM Rev. 30(4), 589–601 (1988)
[8] Baillon, J.B., Bruck, R.E.: Optimal rates of asymptotic regularity for averaged nonexpansive mappings. In: Tan, K.K. (ed.) Proceedings of the Second International Conference on Fixed Point Theory and Applications, pp. 27–66. World Scientific Press (1992)
[9] Bolte, J., Gaubert, S., Vigeral, G.: Definable zero-sum stochastic games. Math. Oper. Res. 40(1), 171–191 (2014)
[10] Bewley, T., Kohlberg, E.: The asymptotic theory of stochastic games. Math. Oper. Res. 1(3), 197–208 (1976)
[11] Blondel, V.D., Nesterov, Y.: Polynomial-time computation of the joint spectral radius for some sets of nonnegative matrices. SIAM J. Matrix Anal. 31(3), 865–876 (2009)
[12] Berman, A., Plemmons, R.J.: Nonnegative matrices in the mathematical sciences. Academic Press, New York (1994)
[13] Chen, T., Han, T.: On the complexity of computing maximum entropy for Markovian models. In: 34th International Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2014, pp. 571–583, New Delhi (2014)
[14] Crandall, M.G., Tartar, L.: Some relations between non expansive and order preserving maps. Proc. AMS 78(3), 385–390 (1980)
[15] Donsker, M.D., Varadhan, R.: On a variational formula for the principal eigenvalue for operators with maximum principle. Proc. Nat. Acad. Sci. USA 72(3), 780–783 (1975)
[16] Fleming, W.H., Hernández-Hernández, D.: Risk-sensitive control of finite state machines on an infinite horizon. I. SIAM J. Control Optim. 35(5), 1790–1810 (1997)
[17] Fleming, W.H., Hernández-Hernández, D.: Risk-sensitive control of finite state machines on an infinite horizon. II. SIAM J. Control Optim. 37(4), 1048–1069 (electronic) (1999)
[18] Gaubert, S., Gunawardena, J.: A non-linear hierarchy for discrete event dynamical systems. In: Proceedings of the Fourth Workshop on Discrete Event Systems (WODES98), pp. 249–254. IEEE, Cagliari (1998)
[19] Gaubert, S., Gunawardena, J.: The Perron-Frobenius theorem for homogeneous, monotone functions. Trans. AMS 356(12), 4931–4950 (2004)
[20] Grötschel, M., Lovász, L., Schrijver, A.: The ellipsoid method and its consequences in combinatorial optimization. Combinatorica 1(2), 169–197 (1981)
[21] Gaubert, S., Stott, N.: A convergent hierarchy of non-linear eigenproblems to compute the joint spectral radius of nonnegative matrices. In: Proceedings of the 23rd International Symposium on Mathematical Theory of Networks and Systems (MTNS2018), Hong Kong (2018)
[22] Gaubert, S., Vigeral, G.: A maximin characterization of the escape rate of nonexpansive mappings in metrically convex spaces. Math. Proc. Camb. Phil. Soc. 152, 341–363 (2012)
[23] Hoffman, A.J., Karp, R.M.: On nonterminating stochastic games. Manag. Sci. J. Inst. Manag. Sci. Appl. Theory Ser. 12, 359–370 (1966)
[24] Howard, R.A., Matheson, J.E.: Risk-sensitive Markov decision processes. Manag. Sci. 18(7), 356–369 (1972)
[25] Hansen, T.D., Miltersen, P.B., Zwick, U.: Strategy iteration is strongly polynomial for 2-player turn-based stochastic games with a constant discount factor. In: Innovations in Computer Science 2011, pp. 253–263. Tsinghua University Press (2011)
[26] Ishikawa, S.: Fixed points and iteration of a nonexpansive mapping in a Banach space. Proc. Amer. Math. Soc. 59(1), 65–71 (1976)
[27] Kingman, J.F.C.: A convexity property of positive matrices. Quart. J. Math. Oxford Ser. 2(12), 283–284 (1961)
[28] Kozyakin, V.: Hourglass alternative and the finiteness conjecture for the spectral characteristics of sets of non-negative matrices. Linear Algebra Appl. 489, 167–185 (2016)
[29] Krasnosel’skiĭ, M.A.: Two remarks on the method of successive approximations. Uspekhi Matematicheskikh Nauk 10, 123–127 (1955)
[30] Kullback, S.: Information theory and statistics. Dover Publications, Inc., Mineola (1997). Reprint of the second (1968) edition
[31] Lemmens, B., Lins, B., Nussbaum, R., Wortel, M.: Denjoy-Wolff theorems for Hilbert’s and Thompson’s metric spaces. J. d’Anal. Math. 134, 671–718 (2018)
[32] Lothaire, M.: Applied combinatorics on words. Cambridge, New York (2005)
[33] Mann, W.R.: Mean value methods in iteration. Proc. Amer. Math. Soc. 4, 506–510 (1953)
[34] Mertens, J.-F., Neyman, A.: Stochastic games. Internat. J. Game Theory 10(2), 53–66 (1981)
[35] Müller, J.M.: Elementary functions: algorithms and implementation. Birkhäuser, Cambridge (2005)
[36] Neyman, A.: Stochastic games and nonexpansive maps. In: Stochastic games and applications (Stony Brook, NY, 1999), volume 570 of NATO Sci. Ser. C Math. Phys. Sci., pp. 397–415. Kluwer Acad. Publ., Dordrecht (2003)
[37] Nussbaum, R.D.: Convexity and log convexity for the spectral radius. Linear Algebra Appl. 73, 59–122 (1986)
[38] Protasov, V.Yu.: Spectral simplex method. Math. Program. 156(1-2, Ser. A), 485–511 (2016)
[39] Puterman, M.L.: Markov decision processes. Wiley, New York (2005)
[40] Rothblum, U.G.: Multiplicative Markov decision chains. Math. Oper. Res. 9(1), 6–24 (1984)
[41] Rump, S.M.: Polynomial minimum root separation. Math. Comput. 33(145), 327–336 (1979)
[42] Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer, Berlin (1998)
[43] Sladký, K.: On Dynamic Programming Recursions for Multiplicative Markov Decision Chains, pp. 216–226. Springer, Berlin (1976)
[44] van den Dries, L.: Tame topology and o-minimal structures, volume 248 of London Mathematical Society Lecture Note Series. Cambridge University Press, Cambridge (1998)
[45] van den Dries, L.: o-minimal structures and real analytic geometry. In: Current developments in mathematics, 1998 (Cambridge, MA), pp. 105–152. Int. Press, Somerville (1999)
[46] Vigeral, G.: A zero-sum stochastic game with compact action sets and no asymptotic value. Dyn. Games Appl. 3(2), 172–186 (2013)
[47] Whittle, P.: Optimization over time, I. Wiley, New York (1982)
[48] Wilkie, A.J.: Model completeness results for expansions of the ordered field of real numbers by restricted Pfaffian functions and the exponential function. J. Amer. Math. Soc. 9(4), 1051–1094 (1996)
[49] Ye, Y.: The simplex and policy-iteration methods are strongly polynomial for the Markov decision problem with a fixed discount rate. Math. Oper. Res. 36(4), 593–603 (2011)
[50] Zijm, W.H.M.: Asymptotic expansions for dynamic programming recursions with general nonnegative matrices. J. Optim. Theory Appl. 54(1), 157–191 (1987)

Acknowledgments

An announcement of the present results appeared in the proceedings of STACS [4]. We are very grateful to the referees of this STACS paper, and also to the referees of the present extended version, for their detailed comments, which helped us to improve this manuscript.

Author information

Authors and Affiliations

Inria and CMAP, École polytechnique, CNRS, Palaiseau, France
Marianne Akian & Stéphane Gaubert
IEOR Department, Columbia University, New York, NY, USA
Julien Grand-Clément
Inria Paris, Paris, France
Jérémie Guillaud

Corresponding author

Correspondence to Stéphane Gaubert.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the Topical Collection on Special Issue on Theoretical Aspects of Computer Science (STACS 2017)

The authors were partially supported by the ANR through the MALTHY INS project, and by the Gaspard Monge corporate sponsorship Program (PGMO) of EDF, Orange, Thales and Fondation Mathématique Jacques Hadamard.

About this article

Cite this article

Akian, M., Gaubert, S., Grand-Clément, J. et al. The Operator Approach to Entropy Games. Theory Comput Syst 63, 1089–1130 (2019). https://doi.org/10.1007/s00224-019-09925-z

Issue Date: 15 July 2019
DOI: https://doi.org/10.1007/s00224-019-09925-z
