## Abstract

Information is a valuable asset in socio-economic systems, and a significant part of it is embedded in the network of connections between agents. The different interlinkage patterns that agents establish may, in fact, lead to asymmetries in the knowledge of the network structure; since this entails a different ability to quantify relevant systemic properties (e.g. the risk of contagion in a network of liabilities), agents capable of providing a better estimate of otherwise inaccessible network properties ultimately have a competitive advantage. In this paper, we address the issue of quantifying the information asymmetry of nodes: to this aim, we define a novel index, InfoRank, intended to rank nodes according to their information content. To do so, each node's ego-network is enforced as a constraint of an entropy-maximization problem, and the resulting uncertainty reduction is used to quantify the node-specific accessible information. We then test the performance of our ranking procedure in terms of reconstruction accuracy and show that it outperforms other centrality measures in identifying the “most informative” nodes. Finally, we discuss the socio-economic implications of network information asymmetry.

## References

1. Newman, M.E.J.: Networks: An Introduction. Oxford University Press, New York (2010)
2. Bloch, F., Jackson, M.O., Tebaldi, P.: Centrality measures in networks (2017). arXiv:1608.05845
3. Borgatti, S.P.: Centrality and network flow. Soc. Netw. **27**, 55–71 (2005)
4. Benzi, M., Klymko, C.: A matrix analysis of different centrality measures. SIAM J. Matrix Anal. Appl. **36**, 686–706 (2013). https://doi.org/10.1137/130950550
5. Sabidussi, G.: The centrality index of a graph. Psychometrika **31**, 581–603 (1966)
6. Langville, A.N., Meyer, C.: Google’s PageRank and Beyond. Princeton University Press, Princeton (2006)
7. Squartini, T., Cimini, G., Gabrielli, A., Garlaschelli, D.: Network reconstruction via density sampling. Appl. Netw. Sci. **2**(3) (2017). https://doi.org/10.1007/s41109-017-0021-8
8. Zhang, Q., Meizhu, L., Yuxian, D., Yong, D.: Local structure entropy of complex networks (2014). arXiv:1412.3910v1
9. Bianconi, G., Pin, P., Marsili, M.: Assessing the relevance of node features for network structure. PNAS **106**(28), 11433–11438 (2009). https://doi.org/10.1073/pnas.0811511106
10. Bianconi, G.: The entropy of randomized network ensembles. Europhys. Lett. **81**(2), 28005 (2007)
11. Borgatti, S.P.: Identifying sets of key players in a social network. Comput. Math. Organ. Theory **12**, 21–34 (2006). https://doi.org/10.1007/s10588-006-7084-x
12. Park, J., Newman, M.E.J.: The statistical mechanics of networks. Phys. Rev. E **70**, 066117 (2004). https://doi.org/10.1103/PhysRevE.70.066117
13. Squartini, T., Garlaschelli, D.: Maximum-Entropy Networks. Pattern Detection, Network Reconstruction and Graph Combinatorics. Springer Briefs in Complexity. Springer, Cham (2018)
14. Oshio, K., Iwasaki, Y., Morita, S., Osana, Y., Gomi, S., Akiyama, E., Omata, K., Oka, K., Kawamura, K.: Tech. Rep. of CCeP, Keio Future 3. Keio University, Tokyo (2003)
15. Colizza, V., Pastor-Satorras, R., Vespignani, A.: Reaction-diffusion processes and metapopulation models in heterogeneous networks. Nat. Phys. **3**, 276–282 (2007)
16. Martinez, N.D.: Artifacts or attributes? Effects of resolution on the Little Rock Lake food web. Ecol. Monogr. **61**(4), 367–392 (1991)
17. Fortunato, S., Boguna, M., Flammini, A., Menczer, F.: Approximating PageRank from in-degree. In: Lecture Notes in Computer Science, vol. 4936. Springer, Berlin (2008)
18. Gleditsch, K.S.: Expanded trade and GDP data. J. Confl. Resolut. **46**, 712–724 (2002)
19. Squartini, T., Fagiolo, G., Garlaschelli, D.: Randomizing world trade. I. A binary network analysis. Phys. Rev. E **84**, 046117 (2011). https://doi.org/10.1103/PhysRevE.84.046117
20. Wittenberg-Moerman, R.: The role of information asymmetry and financial reporting quality in debt trading: evidence from the secondary loan market. J. Account. Econ. **46**(2), 240–260 (2008)
21. Eisenberg, L., Noe, T.H.: Systemic risk in financial systems. Manag. Sci. **47**(2), 236–249 (2001)
22. Rogers, L.C.G., Veraart, L.A.M.: Failure and rescue in an interbank network. Manag. Sci. **59**(4), 882–898 (2013)
23. Barucca, P., Lillo, F.: The organization of the interbank network and how ECB unconventional measures affected the e-MID overnight market (2015). arXiv:1511.08068
24. Glasserman, P., Young, P.H.: Contagion in financial networks. J. Econ. Lit. **54**(3), 779–831 (2016)
25. Barucca, P., Bardoscia, M., Caccioli, F., D’Errico, M., Visentin, G., Battiston, S., Caldarelli, G.: Network valuation in financial systems (2016). arXiv:1606.05164

## Acknowledgements

PB and TS acknowledge support from: FET Project DOLFINS No. 640772 and FET IP Project MULTIPLEX No. 317532.

## Appendices

### Appendix A

Here we show how the computation of \(S_0^{(i)}\) can be simplified in two cases of general interest. The first one concerns sparse networks: since, in this case, the probability coefficients defined by Eq. (1) satisfy the requirement \(p_{ij}\ll 1\), the factorization \(p_{ij}\simeq x_ix_j\) holds, further implying that

The second approximation is valid whenever the node *i*-specific probability coefficients are well represented by their average value, i.e. \(p_{ij}\simeq \frac{k_i}{N-1}\equiv \overline{p}_{ij}\); in this case,
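Both approximations can be checked numerically. The following is a minimal sketch, assuming that Eq. (1) has the usual configuration-model form \(p_{ij}=x_ix_j/(1+x_ix_j)\); that form, the function names and the fitness values \(x_i\) are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def link_probabilities(x):
    """Assumed form of Eq. (1): p_ij = x_i x_j / (1 + x_i x_j), with p_ii = 0."""
    xx = np.outer(x, x)
    p = xx / (1.0 + xx)
    np.fill_diagonal(p, 0.0)
    return p

def sparse_approximation(x):
    """Sparse-network factorization p_ij ~ x_i x_j (reasonable when p_ij << 1)."""
    p = np.outer(x, x)
    np.fill_diagonal(p, 0.0)
    return p

def mean_field_approximation(p):
    """Replace each row by its average value k_i / (N - 1), with k_i = sum_j p_ij."""
    n = p.shape[0]
    k = p.sum(axis=1)
    p_bar = np.repeat((k / (n - 1))[:, None], n, axis=1)
    np.fill_diagonal(p_bar, 0.0)
    return p_bar

# Hypothetical fitness values, chosen small enough that x_i x_j << 1.
x = np.array([0.05, 0.10, 0.02, 0.08, 0.04])
p_exact = link_probabilities(x)
p_sparse = sparse_approximation(x)
p_mean = mean_field_approximation(p_exact)

# Largest entry-wise error of the sparse factorization.
max_sparse_error = np.abs(p_exact - p_sparse).max()
# Expected degrees; the mean-field matrix preserves them by construction.
expected_degrees = p_exact.sum(axis=1)
```

The mean-field matrix keeps each node's expected degree unchanged while discarding the heterogeneity of its individual link probabilities, which is why it is adequate only when those probabilities are close to their average.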

### Appendix B

This second appendix collects the details of the derivation of our proposed methodology. Let us focus on the simplest case of a single node (hereafter indexed by *l*): in order to calculate InfoRank, node *l* can be imagined as solving two different problems. The first one concerns the maximization of the functional

i.e. the *constrained* Shannon entropy, where the constraints encode the benchmark information accessible to all nodes, represented by the vector of *M* constraints \(\vec {C}^*\). Notice that the normalization condition of the probability distribution to be determined, \(P(\mathbf {G}|\vec {\eta })\), can be rewritten as an \((M+1)\)-th constraint of the kind \(C_{M+1}(\mathbf {G})=C_{M+1}^*=1\) [13]. By solving the constrained-optimization problem in (14), node *l* finds that

(where \(Z(\vec {\eta })=\sum _{\mathbf {G}}e^{-\vec {\eta }\cdot \vec {C}(\mathbf {G})}\) is the so-called *partition function* and depends on the unknown Lagrange multipliers \(\vec {\eta }\)). On the other hand, the second optimization problem node *l* has to solve concerns the functional

with \(S_{(l)}\) being nothing but the functional in (14), further constrained by also imposing the ego-network of node *l* (i.e. the values of the link-specific variables \(a^*_{lm}\), each either 0 or 1). Upon solving the second problem, the expression

(where \(Z'(\vec {\theta },\vec {\psi })=\sum _{\mathbf {G}}e^{-\vec {\theta }\cdot \vec {C}(\mathbf {G})-\sum _m\psi _{lm}a_{lm}(\mathbf {G})}\)) is found. Notice that although \(S_{(l)}\) and \(S_0\) are defined by the same vector of constraints, \(\vec {C}\), the numerical values of the Lagrange multipliers ensuring that \(\langle \vec {C}\rangle =\vec {C}^*\) will, in general, differ, whence the use of different symbols, i.e. \(\vec {\eta }\) and \(\vec {\theta }\).
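For orientation, the two partition functions quoted above are the normalizations of the standard exponential-family solutions of a constrained entropy maximization [13]. A sketch of the two distributions, written in the notation of this appendix (the exact layout is an assumption), is

```latex
\begin{align}
P(\mathbf{G}|\vec{\eta}) &=
  \frac{e^{-\vec{\eta}\cdot\vec{C}(\mathbf{G})}}{Z(\vec{\eta})}, \\
P(\mathbf{G}|\vec{\theta},\vec{\psi}) &=
  \frac{e^{-\vec{\theta}\cdot\vec{C}(\mathbf{G})-\sum_m\psi_{lm}a_{lm}(\mathbf{G})}}
       {Z'(\vec{\theta},\vec{\psi})}.
\end{align}
```

The second distribution differs from the first only by the additional multipliers \(\psi_{lm}\), one for each entry of the ego-network of node *l*.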

Both functionals achieve a minimum at their stationary point (consistently with our attempt to minimize each node's residual uncertainty). This can be easily proven upon noticing that the Hessian matrix of both \(S_0\) and \(S_{(l)}\) is the covariance matrix of the constraints and, as such, positive-semidefinite. In order to find the stationary point of \(S_{(l)}\), node *l* must solve the equations

which lead to the system of equations in (4). More explicitly, the second group of conditions reads

In order to numerically evaluate the parameters \(\vec {\psi }\), let us focus on a specific value, e.g. \(\psi _{l1}\), controlling for the value of the entry \(a_{l1}\). Let us now explicitly distinguish the configurations characterized by \(a_{l1}=0\) from the ones with \(a_{l1}=1\): upon doing so, condition (19) can be rewritten as

i.e. as a sum over only the configurations with \(a_{l1}=1\) (indicated with the symbol \(\mathbf {G}_1\)). Analogously, we can split \(Z'(\vec {\theta },\vec {\psi })\) into the sum of two terms, i.e. \(Z'(\vec {\theta },\vec {\psi })=Z_0'(\vec {\theta },\vec {\psi })+e^{-\psi _{l1}}Z_1'(\vec {\theta },\vec {\psi })\), where the first sum

runs over the networks having \(a_{l1}=0\) and the second sum

runs over the networks having \(a_{l1}=1\).

Solving Eq. (20) in the case \(a^*_{l1}=0\) leads to \(\psi _{l1}=+\infty \). As a consequence, in this case \(S_{(l)}=\vec {\theta }\cdot \vec {C}^*+\ln Z_0'(\vec {\theta },\vec {\psi })\), since the term \(Z_1'(\vec {\theta },\vec {\psi })\) is suppressed by the coefficient \(e^{-\psi _{l1}}\), which converges to zero. On the other hand, solving Eq. (20) in the case \(a^*_{l1}=1\) leads to \(\psi _{l1}=-\infty \) and \(S_{(l)}=\vec {\theta }\cdot \vec {C}^*+\ln Z_1'(\vec {\theta },\vec {\psi })\), since the term \(Z_0'(\vec {\theta },\vec {\psi })\) is now suppressed by the coefficient \(e^{\psi _{l1}}\) (this is readily seen by multiplying both the numerator and the denominator on the left-hand side of Eq. (20) by \(e^{\psi _{l1}}\)). Specifying the node-specific ego-network, in other words, reduces the number of configurations over which the estimation of the constraints is carried out: \(Z'(\vec {\theta })\), thus, runs over a smaller number of configurations than \(Z(\vec {\eta })\). The estimation of the other parameters \(\psi _{l2}\dots \psi _{lN}\) proceeds in an analogous way, by applying the same line of reasoning to the “surviving” partition functions.
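The two limits can be made explicit with a short calculation. The sketch below assumes that condition (20) takes the form implied by the splitting \(Z'=Z_0'+e^{-\psi _{l1}}Z_1'\), and that the \(\psi _{l1}\)-dependent part of \(S_{(l)}\) is \(\psi _{l1}a^*_{l1}+\ln Z'\). Both are assumptions chosen to be consistent with the expressions quoted in the text, and \(a^*_{l1}\) is treated as a value in \([0,1]\) before taking each limit.

```latex
\begin{align}
a^*_{l1} &= \frac{e^{-\psi_{l1}}Z_1'}{Z_0'+e^{-\psi_{l1}}Z_1'}
\quad\Longrightarrow\quad
e^{-\psi_{l1}} = \frac{a^*_{l1}}{1-a^*_{l1}}\,\frac{Z_0'}{Z_1'}, \\
a^*_{l1}=0 &: \quad \psi_{l1}\to+\infty, \qquad
\ln\left[Z_0'+e^{-\psi_{l1}}Z_1'\right]\to\ln Z_0', \\
a^*_{l1}=1 &: \quad \psi_{l1}\to-\infty, \qquad
\psi_{l1}+\ln\left[Z_0'+e^{-\psi_{l1}}Z_1'\right]
=\ln\left[e^{\psi_{l1}}Z_0'+Z_1'\right]\to\ln Z_1'.
\end{align}
```

In both cases the diverging multiplier removes from the sum every configuration that contradicts the observed entry, leaving only the compatible partition function.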

Let us now evaluate the expressions \(Z(\vec {\eta })\) and \(Z'(\vec {\theta })\) for the *same value* of the parameters (say \(\vec {\mu }\)): since every addendum is a positive exponential and the addenda of \(Z'(\vec {\mu })\) are a subset of those of \(Z(\vec {\mu })\), it holds true that \(\ln Z(\vec {\mu })\ge \ln Z'(\vec {\mu })\), in turn implying that the inequality \(S_0(\vec {\mu })\ge S_{(l)}(\vec {\mu })\) holds as well. Let us now choose a particular value of the parameters, i.e. the point of minimum of \(S_0\): \(\vec {\mu }=\vec {\eta }^*\). Thus,

where the second inequality follows from the very definition of minimum. This ensures that the ratio \(S_{(l)}/S_0\) is not larger than one and that the InfoRank index in Eq. (7) is always well-defined.
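The chain of inequalities can be verified by brute force on a toy example. The sketch below is an illustration under stated assumptions, not the authors' implementation: it uses a hypothetical 4-node network, a single benchmark constraint (the total number of links), the form \(\vec {\mu }\cdot \vec {C}^*+\ln Z\) for both functionals, and a simple ternary search in place of a proper solver.

```python
import itertools
import math

N = 4
PAIRS = list(itertools.combinations(range(N), 2))
# Hypothetical "true" network G*, used only to fix the constraints.
TRUE_EDGES = {(0, 1), (0, 2), (1, 2), (2, 3)}
L_STAR = len(TRUE_EDGES)  # single benchmark constraint C*: total number of links

def all_graphs():
    """Enumerate every undirected simple graph on N nodes as a set of edges."""
    for bits in itertools.product((0, 1), repeat=len(PAIRS)):
        yield {pair for pair, bit in zip(PAIRS, bits) if bit}

def consistent_with_ego(graph, node):
    """True if the graph reproduces the ego-network of `node` in G*."""
    return all((pair in graph) == (pair in TRUE_EDGES)
               for pair in PAIRS if node in pair)

def entropy_functional(multiplier, graphs):
    """multiplier * C* + ln Z, with Z summed over the given configurations."""
    z = sum(math.exp(-multiplier * len(g)) for g in graphs)
    return multiplier * L_STAR + math.log(z)

def minimize_convex(f, lo=-20.0, hi=20.0, iterations=200):
    """Ternary search for the minimum of a convex function on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0

ensemble = list(all_graphs())
eta_star = minimize_convex(lambda m: entropy_functional(m, ensemble))
s0_min = entropy_functional(eta_star, ensemble)

results = {}
for node in range(N):
    # Configurations surviving once the ego-network of `node` is imposed.
    restricted = [g for g in ensemble if consistent_with_ego(g, node)]
    theta_star = minimize_convex(lambda m: entropy_functional(m, restricted))
    results[node] = (
        entropy_functional(eta_star, restricted),    # S_(l) evaluated at eta*
        entropy_functional(theta_star, restricted),  # S_(l) at its own minimum
    )
    print(node, s0_min, results[node])
```

For every node the three printed values should be non-increasing from left to right. Node 3 is a boundary case in this toy network: its ego-network together with the link count leaves a single compatible graph, so its residual uncertainty is driven towards zero and the minimizer sits at the edge of the search interval.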

Our ranking procedure builds upon the evidence that, by imposing more information on top of the common one, each node further reduces its uncertainty about the unknown network structure: the one reducing the residual uncertainty to the largest extent is identified as the “most informative” one.

The same line of reasoning applies when subsets of nodes are considered, although the resolution of such a problem may be computationally demanding: given a network of size *N*, quantifying the InfoRank of all possible subsets of *s* nodes would require computing \(\binom{N}{s}\) different Shannon entropies.
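To give a sense of how quickly this count grows, the short sketch below evaluates the binomial coefficient for a hypothetical network of 100 nodes; the network size and subset sizes are arbitrary illustrative choices.

```python
import math

def number_of_subsets(n_nodes, subset_size):
    """Number of distinct node subsets of a given size (binomial coefficient)."""
    return math.comb(n_nodes, subset_size)

# One entropy evaluation is needed per subset, so these are also the
# numbers of Shannon entropies to compute for each subset size.
counts = {s: number_of_subsets(100, s) for s in (1, 2, 3, 5)}
```

Already for subsets of five nodes the count exceeds 75 million, which is why an exhaustive ranking of subsets is feasible only for small *s*.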

## About this article

### Cite this article

Barucca, P., Caldarelli, G. & Squartini, T. Tackling Information Asymmetry in Networks: A New Entropy-Based Ranking Index. *J Stat Phys* **173**, 1028–1044 (2018). https://doi.org/10.1007/s10955-018-2076-z

### Keywords

- Complex networks
- Shannon entropy
- Information theory
- Ranking algorithm