Using multiobjective optimization to map the entropy region

Abstract

Mapping the structure of the entropy region of at least four jointly distributed random variables is an important open problem. Even partial knowledge about this region has far-reaching consequences in other areas of mathematics, such as information theory, cryptography, probability theory and combinatorics. At present, the only known method of exploring the entropy region is, or is equivalent to, the one of Zhang and Yeung from 1998. Using some non-trivial properties of the entropy function, their method is transformed into solving high-dimensional linear multiobjective optimization problems. Benson’s outer approximation algorithm is a fundamental tool for solving such optimization problems. An improved version of Benson’s algorithm is presented, which requires solving one scalar linear program in each iteration rather than two or three as in previous versions. During the algorithm design, special care is taken for numerical stability. The implemented algorithm is used to verify previous statements about the entropy region, as well as to explore it further. Experimental results demonstrate the viability of the improved Benson’s algorithm for determining the extremal set of medium-sized, numerically ill-posed optimization problems. With larger problem sizes, two limitations of Benson’s algorithm are observed: the inefficiency of the scalar LP solver, and the unexpectedly large number of intermediate vertices.

Notes

  1. Hamel et al. [14] have observed the same improvement independently.

References

  1. Avis, D., Bremner, D., Seidel, R.: How good are convex hull algorithms? Comput. Geom. 7(5–6), 265–301 (1997)

  2. Baber, R., Christofides, D., Dang, A.N., Riis, S., Vaughan, E.R.: Multiple unicasts, graph guessing games, and non-Shannon inequalities. In: Proc. NetCod 2013, Calgary, pp. 1–6 (2013)

  3. Bassoli, R., Marques, H., Rodriguez, J., Shum, K.W., Tafazolli, R.: Network coding theory: a survey. IEEE Commun. Surveys Tutor. 15(4), 1950–1978 (2013)

  4. Beimel, A.: Secret-sharing schemes: a survey. In: Coding and Cryptology. LNCS, pp. 11–46. Springer, Heidelberg (2011)

  5. Benson, H.P.: An outer approximation algorithm for generating all efficient extreme points in the outcome set of a multiple objective linear program. J. Glob. Optim. 13(1), 1–24 (1998)

  6. Bremner, D.: On the complexity of vertex and facet enumeration for convex polytopes. PhD thesis, School of Computer Science, McGill University (1997)

  7. Burton, B.A., Ozlen, M.: Projective geometry and the outer approximation algorithm for multiobjective linear programming. arXiv:1006.3085 (2010)

  8. Chan, T.H.: Balanced information inequalities. IEEE Trans. Inf. Theory 49, 3261–3267 (2003)

  9. Chan, T.H.: Recent progresses in characterising information inequalities. Entropy 13, 379–401 (2011)

  10. Csirmaz, L.: Book inequalities. IEEE Trans. Inf. Theory 60, 6811–6818 (2014)

  11. Dougherty, R., Freiling, C., Zeger, K.: Non-Shannon information inequalities in four random variables. arXiv:1104.3602 (2011)

  12. Ehrgott, M., Löhne, A., Shao, L.: A dual variant of Benson’s outer approximation algorithm for multiple objective linear programming. J. Glob. Optim. 52, 757–778 (2012)

  13. Fukuda, K., Prodon, A.: Double description method revisited. In: Combinatorics and Computer Science (Brest, 1995). LNCS, vol. 1120, pp. 91–111. Springer, Berlin (1996)

  14. Hamel, A.H., Löhne, A., Rudloff, B.: Benson type algorithms for linear vector optimization and applications. J. Glob. Optim. 59, 811–836 (2013)

  15. Heyde, F., Löhne, A.: Geometric duality in multiple objective linear programming. SIAM J. Optim. 19(2), 836–845 (2008)

  16. Kaced, T.: Equivalence of two proof techniques for non-Shannon-type inequalities. In: Proceedings of the 2013 IEEE International Symposium on Information Theory, Istanbul, pp. 236–240 (2013)

  17. Madiman, M., Marcus, A.W., Tetali, P.: Information-theoretic inequalities in additive combinatorics. In: IEEE ITW, pp. 1–4 (2010)

  18. Makarychev, K., Makarychev, Yu., Romashchenko, A., Vereshchagin, N.: A new class of non-Shannon-type inequalities for entropies. Commun. Inf. Syst. 2(2), 147–166 (2002)

  19. Matus, F.: Infinitely many information inequalities. In: Proceedings ISIT 2007, Nice, France, 24–29 June 2007, pp. 41–47

  20. Matus, F.: Two constructions on limits of entropy functions. IEEE Trans. Inf. Theory 53(1), 320–330 (2007)

  21. Matus, F.: Personal communication (2012)

  22. Matus, F., Studeny, M.: Conditional independencies among four random variables I. Comb. Probab. Comput. 4, 269–278 (1995)

  23. McRae, W.B., Davidson, E.R.: An algorithm for the extreme rays of a pointed convex polyhedral cone. SIAM J. Comput. 2(4), 281–293 (1973)

  24. Pippenger, N.: What are the laws of information theory? In: Special Problems on Communication and Computation Conference, Palo Alto, California, 3–5 Sept 1986

  25. Studený, M.: Probabilistic Conditional Independence Structures. Springer, New York (2005)

  26. MacLaren Walsh, J., Weber, S.: Relationships among bounds for the region of entropic vectors in four variables. In: Allerton Conference on Communication, Control, and Computing (2010)

  27. Yeung, R.W.: A First Course in Information Theory. Kluwer Academic/Plenum Publishers, New York (2002)

  28. Zhang, Z., Yeung, R.W.: On characterization of entropy function via information inequalities. IEEE Trans. Inf. Theory 44(4), 1440–1452 (1998)

Acknowledgments

The author would like to acknowledge the help received during the numerous insightful, fruitful, and enjoyable discussions with Frantisek Matúš on the entropy function, matroids, and on the ultimate question of everything. Supported by TAMOP-4.2.2.C-11/1/KONV-2012-0001 and the Lendulet Program.

Author information

Corresponding author

Correspondence to László Csirmaz.

Appendices

Appendix 1

1.1 Shannon inequalities

Recall that given a discrete random variable x with possible values \(\{a_1,a_2,\dots ,a_n\}\) and probability distribution \(\{p(a_i)\}_{i=1}^n\), the Shannon entropy of x is defined as \(\mathop {\mathbf{H}}(x)=-\sum _{i=1}^n\,p(a_i)\log p(a_i)\) which is a measure of the average uncertainty associated with x. Let \(\langle x_i:i\in I\rangle \) be a collection of random variables. For \(A\subseteq I\), we let \(x_A = \langle x_i:i\in A\rangle \), and \(\mathop {\mathbf{H}}(x_A)\) be the entropy of \(x_A\) equipped with the marginal distribution. Thus the entropy function \(\mathop {\mathbf{H}}\) associated with collection \(\langle x_i:i\in I\rangle \) maps the non-empty subsets of I to non-negative real numbers. The Shannon inequalities say that this \(\mathop {\mathbf{H}}\) is a monotone and submodular function, that is,

$$\begin{aligned} 0\le \mathop {\mathbf{H}}(x_A)\le \mathop {\mathbf{H}}(x_B) ~~~\text{ when }\; A\subseteq B, \end{aligned}$$
(5)

and

$$\begin{aligned} \mathop {\mathbf{H}}(x_{A\cup B}) + \mathop {\mathbf{H}}(x_{A\cap B}) \le \mathop {\mathbf{H}}(x_A)+\mathop {\mathbf{H}}(x_B), \end{aligned}$$
(6)

for all subsets A, B of I. There are redundant inequalities among the Shannon inequalities. For example, the following smaller collection implies all others: the inequalities from (5) where \(B=I\) and A is missing only one element of I, together with the inequalities from (6) where both A and B have exactly one element not in \(A\cap B\).
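
For concreteness, the following minimal Python sketch enumerates this reduced collection as coefficient vectors; the function name elemental_inequalities and the bitmask encoding of subsets are illustrative choices made for the example, not taken from the paper's implementation.

from itertools import combinations

def elemental_inequalities(n):
    # Reduced collection of Shannon inequalities for n random variables.
    # Subsets of I = {0, ..., n-1} are encoded as bitmasks; an inequality is a
    # dict mapping a non-empty subset S to its coefficient c_S, and asserts
    # sum_S c_S * H(x_S) >= 0 (the term H of the empty set is 0 and is omitted).
    full = (1 << n) - 1
    # From (5) with B = I and A = I \ {i}:  H(x_I) - H(x_{I\{i}}) >= 0.
    for i in range(n):
        yield {full: 1, full & ~(1 << i): -1}
    # From (6) with A = K + {i}, B = K + {j}, A intersect B = K:
    #   H(x_{Ki}) + H(x_{Kj}) - H(x_{Kij}) - H(x_K) >= 0.
    for i, j in combinations(range(n), 2):
        rest = [k for k in range(n) if k not in (i, j)]
        for r in range(len(rest) + 1):
            for ks in combinations(rest, r):
                K = sum(1 << k for k in ks)
                ineq = {K | (1 << i): 1,
                        K | (1 << j): 1,
                        K | (1 << i) | (1 << j): -1}
                if K:
                    ineq[K] = -1
                yield ineq

# For four random variables this yields 4 + 6*4 = 28 elemental inequalities.
print(sum(1 for _ in elemental_inequalities(4)))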

1.2 Independent copy of random variables

We split a set of random variables into two disjoint groups \(\langle x_i: i\in I\rangle \) and \(\langle y_j: j\in J \rangle \), and create \(\langle x'_i:i\in I\rangle \) as an independent copy of \(\langle x_i\rangle \) over \(\langle y_j\rangle \). This means that \(\langle x'_i\rangle \) and \(\langle x_i\rangle \) have the same set of possible values, and

$$\begin{aligned}&\mathop {\mathrm {Prob}}\big (\langle x'_i{=}a'_i\rangle , \langle x_i{=}a_i\rangle , \langle y_j{=}b_j\rangle \big ) \\&\quad = \frac{\mathop {\mathrm {Prob}}\big (\langle x_i{=}a'_i\rangle , \langle y_j{=}b_j\rangle \big ) \cdot \mathop {\mathrm {Prob}}\big (\langle x_i{=}a_i\rangle , \langle y_j{=}b_j\rangle \big )}{\mathop {\mathrm {Prob}}\big (\langle y_j{=}b_j\rangle \big )}, \end{aligned}$$

expressing that \(\langle x'_i\rangle \) and \(\langle x_i\rangle \) are independent over \(\langle y_j\rangle \). The entropy of certain subsets of \(\langle x'_i,x_i,y_j\rangle \) can be computed from the entropy of other subsets as follows. Let \(A, B \subseteq I\) and \(C\subseteq J\). Then,

$$\begin{aligned} \mathop {\mathbf{H}}(x'_A x_By_C) = \mathop {\mathbf{H}}(x'_B x_A y_C), \end{aligned}$$

which is due to the complete symmetry between \(\langle x'_i\rangle \) and \(\langle x_i\rangle \). The fact that \(x'_I\) and \(x_I\) are independent over \(y_J\) translates into the following entropy equality:

$$\begin{aligned} \mathop {\mathbf{H}}(x'_A x_B y_J) = \mathop {\mathbf{H}}(x'_A y_J) + \mathop {\mathbf{H}}(x_B y_J) - \mathop {\mathbf{H}}(y_J) \end{aligned}$$

for all subsets \(A,B\subseteq I\).
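
The small Python sketch below illustrates the construction numerically; the helper names (H, marginal, independent_copy) and the dictionary representation of distributions are assumptions made for this example only. It builds the joint distribution of \(\langle x', x, y\rangle\) from a given distribution of \(\langle x, y\rangle\) and checks the displayed entropy equality in the special case \(A=B=I\).

import math
from collections import defaultdict

def H(dist):
    # Shannon entropy (in bits) of a distribution given as {outcome: probability}.
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def marginal(dist, keep):
    # Marginal distribution over the listed tuple positions.
    m = defaultdict(float)
    for outcome, p in dist.items():
        m[tuple(outcome[k] for k in keep)] += p
    return m

def independent_copy(pxy):
    # Joint distribution of (x', x, y), where x' is an independent copy of x
    # over y:  Prob(x'=a', x=a, y=b) = P(a', b) * P(a, b) / P(b).
    py = marginal(pxy, keep=[1])
    out = defaultdict(float)
    for (a1, b1), p1 in pxy.items():
        for (a2, b2), p2 in pxy.items():
            if b1 == b2 and py[(b1,)] > 0:
                out[(a1, a2, b1)] += p1 * p2 / py[(b1,)]
    return out

# Toy example: two correlated bits.
pxy = {(0, 0): 0.4, (1, 0): 0.1, (0, 1): 0.1, (1, 1): 0.4}
q = independent_copy(pxy)
# Check H(x' x y) = H(x' y) + H(x y) - H(y).
lhs = H(q)
rhs = H(marginal(q, [0, 2])) + H(marginal(q, [1, 2])) - H(marginal(q, [2]))
print(abs(lhs - rhs) < 1e-12)   # True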

1.3 Copy strings

The process starts by fixing four random variables a, b, c, and d with some joint distribution. Split them into two parts, create an independent copy of the first part over the second, add the newly created random variables to the group, and then repeat this process. To save on the number of variables created, in each step certain newly generated variables can be discarded, or two or more new variables can be merged into a single one. This process is described by a copy string, which has the following form:

$$\begin{aligned} \mathtt{rs=cd:ab;\,t=(cr):ab;\,u=t:acs} \end{aligned}$$

This string describes three iterations, separated by semicolons. In the first step we create an independent copy of cd over ab, and name the two new variables r and s such that r is a copy of c and s is a copy of d. After this step we have six variables abcdrs with some joint distribution. In the next step we make an independent copy of cdrs over ab, merge the copies of c and r into a single variable, name it t, and add it to the pool. In the last step we create an independent copy of bdrt over acs, keep the copy of t, name it u, and discard the other three newly created variables. As a result, we get the eight random variables abcdrstu.
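
As an illustration, a copy string of this form could be parsed as sketched below; the function parse_copy_string and the returned step structure are hypothetical and not the paper's actual notation handling.

import re

def parse_copy_string(s):
    # Parse a copy string such as 'rs=cd:ab; t=(cr):ab; u=t:acs' into a list of
    # steps (new_names, kept_groups, conditioning): new_names[i] is the name of
    # the (possibly merged) copy of the variables in kept_groups[i], and the
    # copy is made independently over the conditioning variables.
    steps = []
    for step in s.replace(' ', '').split(';'):
        lhs, rhs = step.split('=')
        kept, cond = rhs.split(':')
        groups = re.findall(r'\(([a-z]+)\)|([a-z])', kept)   # '(cr)' = merged copy
        kept_groups = [g[0] or g[1] for g in groups]
        if len(lhs) != len(kept_groups):
            raise ValueError('each new name must correspond to one kept group')
        steps.append((list(lhs), kept_groups, list(cond)))
    return steps

print(parse_copy_string('rs=cd:ab; t=(cr):ab; u=t:acs'))
# [(['r', 's'], ['c', 'd'], ['a', 'b']),
#  (['t'], ['cr'], ['a', 'b']),
#  (['u'], ['t'], ['a', 'c', 's'])]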

1.4 A unimodular matrix

It is advantageous to look at the 15 entropies of the four random variables in another coordinate system. The new coordinates can be computed using the unimodular matrix shown in Table 4. Columns represent the entropies of the subsets of the four random variables a, b, c and d, as indicated in the top row. The value of the “Ingleton row” should be set to 1, and rows marked by the letter “z” vanish for all extremal vertices, thus they should be set to 0.

Table 4 The unimodular matrix
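
Since Table 4 itself is not reproduced here, the fragment below only sketches how such a coordinate change would be applied and how unimodularity can be checked; the matrix M is a placeholder for the actual table, and the assumption that it is a square 15 x 15 matrix follows from its 15 columns and unimodularity.

import numpy as np

def to_new_coordinates(M, h):
    # M: the integer matrix of Table 4 (placeholder), columns indexed by the
    # 15 non-empty subsets of {a, b, c, d}; h: an entropy vector in the same
    # subset order.  Returns the coordinates of h in the new system.
    M = np.asarray(M, dtype=float)
    assert M.shape == (15, 15)
    assert round(abs(np.linalg.det(M))) == 1   # unimodular: |det M| = 1
    return M @ np.asarray(h, dtype=float)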

Appendix 2

This section lists new entropy inequalities that were found during the experiments described in Sect. 4 and whose coefficients are all less than 100. Each entry in the list contains nine integers representing the coefficients \(c_0\), \(c_1\), \(\dots \), \(c_8\) of the non-Shannon information inequality of the form

$$\begin{aligned}&c_0\big (\mathbf {I}(c,d)-\mathbf {I}(a,b)+\mathbf {I}(a,b\,|\,c)+\mathbf {I}(a,b\,|\,d)\big ) \\&\quad +\, c_1\mathbf {I}(a,b\,|\,c)+c_2\mathbf {I}(a,b\,|\,d) \\&\quad +\, c_3\mathbf {I}(a,c\,|\,b) + c_4\mathbf {I}(b,c\,|\,a)+ c_5\mathbf {I}(a,d\,|\,b)+c_6\mathbf {I}(b,d\,|\,a) \\&\quad +\, c_7\mathbf {I}(c,d\,|\,a) + c_8\mathbf {I}(c,d\,|\,b) \ge 0. \end{aligned}$$

Here \(\mathbf {I}(A,B) = \mathop {\mathbf{H}}(A)+\mathop {\mathbf{H}}(B)-\mathop {\mathbf{H}}(AB)\) is the mutual information, and \(\mathbf {I}(A,B\,|\,C)=\mathop {\mathbf{H}}(AC)+\mathop {\mathbf{H}}(BC)-\mathop {\mathbf{H}}(ABC)-\mathop {\mathbf{H}}(C)\) is the conditional mutual information. The expression multiplied by \(c_0\) is the Ingleton value. Each list of coefficients is followed by the copy string that was applied.
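
A short Python sketch of how one such inequality could be evaluated for a given entropy function is shown below; the names mutual and inequality_value, and the representation of \(\mathop {\mathbf{H}}\) as a dictionary keyed by frozensets of the letters a–d, are illustrative assumptions. An entropic function must satisfy inequality_value(H, c) >= 0 for every listed coefficient vector c.

def mutual(H, A, B, C=''):
    # I(A, B | C) = H(AC) + H(BC) - H(ABC) - H(C); H is a dict mapping
    # frozensets of the letters a, b, c, d to entropy values; H of '' is 0.
    h = lambda s: H[frozenset(s)] if s else 0.0
    return h(A + C) + h(B + C) - h(A + B + C) - h(C)

def inequality_value(H, c):
    # Left-hand side of the inequality above for coefficients c[0], ..., c[8].
    ingleton = (mutual(H, 'c', 'd') - mutual(H, 'a', 'b')
                + mutual(H, 'a', 'b', 'c') + mutual(H, 'a', 'b', 'd'))
    terms = [mutual(H, 'a', 'b', 'c'), mutual(H, 'a', 'b', 'd'),
             mutual(H, 'a', 'c', 'b'), mutual(H, 'b', 'c', 'a'),
             mutual(H, 'a', 'd', 'b'), mutual(H, 'b', 'd', 'a'),
             mutual(H, 'c', 'd', 'a'), mutual(H, 'c', 'd', 'b')]
    return c[0] * ingleton + sum(ci * t for ci, t in zip(c[1:], terms))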

Figures a–c: the lists of coefficient vectors and the corresponding copy strings.

Cite this article

Csirmaz, L. Using multiobjective optimization to map the entropy region. Comput Optim Appl 63, 45–67 (2016). https://doi.org/10.1007/s10589-015-9760-6
