Abstract
We study optimal solutions to an abstract optimization problem for measures, which generalizes classical variational problems in information theory and statistical physics. In the classical problems, information and relative entropy are defined using the Kullback-Leibler divergence, and for this reason optimal measures belong to a one-parameter exponential family. Measures within such a family have the property of mutual absolute continuity. Here we show that this property characterizes other families of optimal positive measures if a functional representing information has a strictly convex dual. Mutual absolute continuity of optimal probability measures allows us to strictly separate deterministic and non-deterministic Markov transition kernels, which play an important role in theories of decisions, estimation, control, communication and computation. We show that deterministic transitions are strictly sub-optimal, unless the information resource with a strictly convex dual is unconstrained. For illustration, we construct an example where, unlike non-deterministic kernels, any deterministic kernel either has negatively infinite expected utility (unbounded expected error) or communicates infinite information.
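The classical case mentioned above can be sketched numerically. The following is a minimal illustration on a finite set, not the paper's construction: it assumes a uniform reference measure `q`, a utility vector `u`, and a parameter `beta`, all chosen here for illustration. It builds the exponential-family measure p_beta(x) proportional to q(x) exp(beta u(x)), which is the standard maximizer of expected utility under a Kullback-Leibler constraint, and compares it with a deterministic (point-mass) measure.

```python
import math

def exponential_family(q, u, beta):
    """Return p_beta with p_beta(x) proportional to q(x) * exp(beta * u(x))."""
    m = max(u)  # shift utilities by their maximum for numerical stability
    weights = [qi * math.exp(beta * (ui - m)) for qi, ui in zip(q, u)]
    z = sum(weights)
    return [w / z for w in weights]

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q), with 0 * log 0 taken as 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def expected_utility(p, u):
    """Expected value of u under the measure p."""
    return sum(pi * ui for pi, ui in zip(p, u))

# Hypothetical data: uniform reference measure and increasing utilities.
q = [0.25, 0.25, 0.25, 0.25]
u = [0.0, 1.0, 2.0, 3.0]

p = exponential_family(q, u, beta=1.0)
delta = [0.0, 0.0, 1.0, 0.0]  # deterministic choice of the third outcome

print("p_beta:", p)
print("information (p_beta, delta):", kl_divergence(p, q), kl_divergence(delta, q))
print("utility (p_beta, delta):", expected_utility(p, u), expected_utility(delta, u))
```

Under these assumptions, `p` assigns positive mass to every outcome, so it is mutually absolutely continuous with `q`, while the point mass is not. The point mass on the third outcome costs log 4 of information yet yields lower expected utility than `p`, which uses less information. A point mass on the best outcome does attain the maximum utility, but only at the full cost log 4, so it is infeasible under any tighter information constraint.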
Cite this article
Belavkin, R.V. Optimal measures and Markov transition kernels. J Glob Optim 55, 387–416 (2013). https://doi.org/10.1007/s10898-012-9851-1
DOI: https://doi.org/10.1007/s10898-012-9851-1