1 Introduction

An important characteristic of a network is how well communication can flow in it, i.e., how easy or difficult it is to reach one part of the network from another part by following edges. How well information flows through the whole network can be measured by the diameter of the graph that represents the network or by its global efficiency. Both these measures can be determined with the aid of the path length matrix associated with the network. We will discuss these connections and introduce new measures of communication based on the path length matrix.

Let us introduce some notation and definitions that will be used throughout this paper. A network is represented by a graph. A weighted graph \(\mathcal {G}=\langle \mathcal {V},\mathcal {E},\mathcal {W}\rangle \) consists of a set of nodes or vertices \(\mathcal {V}=\{v_1,v_2,\dots ,v_n\}\), a set of edges \(\mathcal {E}=\{e_1,e_2,\dots ,e_m\}\) that connect the vertices, and a set of weights \(\mathcal {W}=\{a_{ij}\}_{i,j=1}^n\); the weights \(a_{ij}\) are the entries of the adjacency matrix \(A=[a_{ij}]_{i,j=1}^n\) associated with the graph \(\mathcal {G}\); see below. An edge is said to be directed if it starts at a vertex \(v_i\) and ends at a vertex \(v_j\), and is denoted by \(e(v_i\rightarrow v_j)\). An edge between the vertices \(v_i\) and \(v_j\) is said to be undirected when the pair of vertices is unordered and the weights \(a_{ij}\) and \(a_{ji}\) are positive and equal. An undirected edge between the vertices \(v_i\) and \(v_j\) is denoted by \(e(v_i\leftrightarrow v_j)\). A graph with only undirected edges is said to be undirected; otherwise the graph is directed. A simple graph is a graph without multiple edges or self-loops. In particular, this implies that the diagonal entries of the adjacency matrix for the graph vanish. This work considers simple graphs.

The adjacency matrix \(A=[a_{ij}]_{i,j=1}^n\in \mathbb {R}^{n\times n}\) for a weighted graph \(\mathcal {G}\) is determined by the weights \(a_{ij}\) of the graph with \(a_{ij}>0\) if there is an edge \(e(v_i\rightarrow v_j)\) in \(\mathcal {G}\). If there is no edge \(e(v_i\rightarrow v_j)\) in \(\mathcal {G}\), then \(a_{ij}=0\). For an unweighted graph, all positive entries \(a_{ij}\) of A equal one. A sequence of k edges (not necessarily distinct) of the form \(\{e(v_1\rightarrow v_2),e(v_2\rightarrow v_3),\ldots ,e(v_k\rightarrow v_{k+1})\}\) is a walk. If \(v_{k+1}=v_1\), then the walk is said to be closed. A sequence of distinct edges of the form \(\{e(v_1\rightarrow v_2),e(v_2\rightarrow v_3),\ldots , e(v_k\rightarrow v_{k+1})\}\) is a path. The length of a path is given by the sum of all weights of the edges in the path. (In the unweighted case, a path made up of k edges has length k.) For further discussions on networks and graphs, see [8, 14].

To construct the path length matrix associated with the network, we will make use of min-plus matrix multiplication, i.e., matrix multiplication in the tropical algebra [12]:

$$\begin{aligned} C=[c_{ij}]_{i,j=1}^{n}=A\star B: \qquad \qquad c_{ij}=\min _{h=1,2,\ldots ,n} \{a_{ih}+b_{hj}\},\qquad 1\le i,j\le n, \end{aligned}$$

where \(A,B,C\in \mathbb {R}^{n\times n}\). We denote by \(A^{1,\star }=[a^{(1,\star )}_{ij}]_{i,j=1}^n\in \mathbb {R}^{n\times n}\) the matrix obtained by setting to \(\infty \) the vanishing off-diagonal entries of the adjacency matrix A associated with the graph \(\mathcal G\) under consideration. For \(k>1\), the \(k^{th}\) min-plus power of \(A^{1,\star }\) is given by

$$\begin{aligned} A^{k,\star }=[a^{(k,\star )}_{ij}]_{i,j=1}^n\in \mathbb {R}^{n\times n}: \qquad \qquad a^{(k,\star )}_{ij} =\min _{h=1,2,\ldots ,n} \{a^{(k-1,\star )}_{ih}+a^{(1,\star )}_{hj}\}. \end{aligned}$$

Notice that the matrix \(A^{k,\star }\) gives vertex distances using paths of at most k edges. In detail, the entry \(a^{(k,\star )}_{ij}\), with \(i\ne j\), represents the length of the shortest path from \(v_i\) to \(v_j\) made up of at most k edges. The diagonal entries of \(A^{k,\star }\) are zero by definition. One has \(a_{ij}^{(k,\star )}=\infty \) if every path from \(v_i\) to \(v_j\) is made up of more than k edges, or if there is no path from \(v_i\) to \(v_j\).
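The min-plus product and the powers \(A^{k,\star }\) can be sketched in a few lines of Python. This is only an illustration and not the MATLAB function discussed in this paper; the names `min_plus` and `min_plus_power` and the small directed example graph are assumptions made for this sketch.

```python
INF = float("inf")

def min_plus(A, B):
    """Min-plus product C = A * B with c_ij = min_h (a_ih + b_hj)."""
    n = len(A)
    return [[min(A[i][h] + B[h][j] for h in range(n)) for j in range(n)]
            for i in range(n)]

def min_plus_power(adj, k):
    """Return A^{k,*} for an adjacency matrix adj with zero diagonal."""
    n = len(adj)
    # A^{1,*}: vanishing off-diagonal entries are set to infinity
    A1 = [[adj[i][j] if i == j or adj[i][j] != 0 else INF for j in range(n)]
          for i in range(n)]
    P = A1
    for _ in range(k - 1):
        P = min_plus(P, A1)
    return P

# Hypothetical directed graph: v1->v2 (weight 2), v2->v3 (weight 3), v1->v3 (weight 7)
adj = [[0, 2, 7],
       [0, 0, 3],
       [0, 0, 0]]
P1 = min_plus_power(adj, 1)   # paths with at most one edge
P2 = min_plus_power(adj, 2)   # paths with at most two edges
print(P1[0][2], P2[0][2])     # the distance from v1 to v3 drops from 7 to 5
print(P2[2][0])               # inf: v1 cannot be reached from v3
```

Because the diagonal of \(A^{1,\star }\) is zero, each additional min-plus multiplication can only keep or decrease an entry, which is why the entries of \(A^{k,\star }\) refer to paths with at most k edges.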

The diameter of a graph \(\mathcal {G}\) is the maximal length \(d_{\mathcal {G}}\) of the shortest path between any distinct vertices of the graph and provides a measure of how easy it is for the vertices of the graph to communicate. One has

$$\begin{aligned} d_{\mathcal {G}}=\max _{1\le i,j\le n} a_{ij}^{(n-1,\star )}. \end{aligned}$$
(1)

Indeed, the entry \(a^{(n-1,\star )}_{ij}\) of the matrix \(A^{n-1,\star }=[a^{(n-1,\star )}_{ij}]_{i,j=1}^n\) yields the length of the shortest path from \(v_i\) to \(v_j\). We will refer to \(A^{n-1,\star }\) as the path length matrix. Note that the triangle inequality holds for the entries of a path length matrix. Specifically,

$$\begin{aligned} a_{ij}^{(n-1,\star )}\le a_{ih}^{(n-1,\star )}+a_{hj}^{(n-1,\star )},\qquad 1\le h,i,j\le n. \end{aligned}$$

Consider an unweighted connected graph \(\mathcal {G}\) with associated adjacency matrix \(A\in \mathbb {R}^{n\times n}\). Recall that the diameter of \(\mathcal {G}\) is the maximal number of edges in the shortest path between all pairs of distinct vertices of the graph. Given the vertices \(v_i\) and \(v_j\), there is an integer \(\widehat{k}\), \(1\le \widehat{k}<n\), such that

$$\begin{aligned} a_{ij}^{(n-1,\star )}=\dots = a_{ij}^{(\widehat{k}+1,\star )}= a_{ij}^{(\widehat{k},\star )}= \widehat{k} \end{aligned}$$

since the graph is connected, whereas for \(1\le h<\widehat{k}\), one has \(a_{ij}^{(h,\star )}=\infty \). Thus, information provided by the path length matrix \(A^{n-1,\star }\) includes information about all powers \(A^{k,\star }\) for \(1\le k<n-1\).

Let \(v_i\) and \(v_j\) be distinct vertices in a connected weighted graph. Then there is an integer \(\widehat{k}\), \(1\le \widehat{k}<n\), such that

$$\begin{aligned} a_{ij}^{(n-1,\star )}\le \dots \le a_{ij}^{(\widehat{k}+1,\star )}\le a_{ij}^{(\widehat{k},\star )}<\infty \end{aligned}$$

and, for \(1\le h<\widehat{k}\), \(a_{ij}^{(h,\star )}=\infty \). Thus, as in the unweighted case, information provided by the path length matrix \(A^{n-1,\star }\) refines information given by the powers \(A^{k,\star }\) for \(1\le k<n-1\). However, the information of the minimal number of steps required to reach vertex \(v_j\) from vertex \(v_i\) is lost.

As mentioned above, the path length matrix may be constructed by carrying out \(n-2\) min-plus multiplications by \(A^{1,\star }\); here \(A^{1,\star }\) is obtained from the adjacency matrix A by setting all zero off-diagonal entries to \(\infty \). The following MATLAB function, with the adjacency matrix A for a graph and the \(\textrm{level} = n-1\) as input arguments, returns the path length matrix associated with the adjacency matrix A. The function implements the dynamic programming Bellman-Ford algorithm for solving the well-known “all-pairs shortest path problem”. The algorithm requires \(\mathcal{O}(n^2m)\) arithmetic floating point operations (flops), where n is the number of vertices and m is the number of edges of the graph; if the graph is undirected, then the cost of the algorithm is halved. Notice that the function can be applied to determine shortest paths in a weighted graph having positive or negative weights.

figure a

In line 2 of the function Pathlength_matrix, the function call \(\textsf{size}(A,1)\) yields the order of the matrix \(A\in \mathbb {R}^{n\times n}\); in line 6, the matrix entry A(i,j) is set to \(\infty \). Similarly, in line 11, C is defined as an \(n\times n\) matrix with all entries equal to \(\infty \). If the graph is unweighted and connected, then the above MATLAB function can be modified by introducing a break before the last \(\mathbf{end~for}\) when there is no entry \(\infty \). The diameter of the graph then is count+1.

For both weighted and unweighted graphs, also when the graph is not connected, the above MATLAB function with argument \(\textrm{level}=K\), where \(1<K<n\), computes \(A^{K,\star }\), i.e., the matrix of the distances between any distinct vertices of the graph using paths with at most K edges. We note that the triangle inequality might not hold for the entries of \(A^{K,\star }\) and some entries of this matrix may have the value \(\infty \).
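The effect of the argument \(\textrm{level}=K\) can be illustrated with a short Python sketch. It is a plain min-plus iteration and not the MATLAB function above; the helper name `k_path_lengths` and the four-vertex path graph are assumptions made for illustration. The sketch shows an entry \(\infty \) in \(A^{2,\star }\) and a violated triangle inequality.

```python
INF = float("inf")

def k_path_lengths(adj, K):
    """Distances using paths with at most K edges (the matrix A^{K,*})."""
    n = len(adj)
    A1 = [[adj[i][j] if i == j or adj[i][j] != 0 else INF for j in range(n)]
          for i in range(n)]
    P = A1
    for _ in range(K - 1):
        P = [[min(P[i][h] + A1[h][j] for h in range(n)) for j in range(n)]
             for i in range(n)]
    return P

# Unweighted undirected path v1 - v2 - v3 - v4 (hypothetical example)
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
P2 = k_path_lengths(adj, 2)
P3 = k_path_lengths(adj, 3)

# v4 cannot be reached from v1 with at most two edges, so a_14 is infinite
# and the inequality a_14 <= a_13 + a_34 fails for A^{2,*}.
triangle_holds_K2 = P2[0][3] <= P2[0][2] + P2[2][3]
triangle_holds_K3 = P3[0][3] <= P3[0][2] + P3[2][3]
print(P2[0][3], triangle_holds_K2, P3[0][3], triangle_holds_K3)
```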

We will see how the matrix \(A^{n-1,\star }\) associated with a connected graph \(\mathcal {G}\) sheds light on the communication within the network determined by the graph. In fact, as a measure of the ease of communication in the graph, we like to use the average inverse geodesic length of \(\mathcal {G}\) (i.e., its global efficiency, cf. Sect. 3) instead of the maximum geodesic length of \(\mathcal {G}\) (i.e., its diameter). To this end, we introduce the reciprocal path length matrix \(A^{n-1,\star ,-1}=[a^{(n-1,\star ,-1)}_{ij}]_{i,j=1}^n\) obtained by replacing the off-diagonal entries of the path length matrix by their reciprocals, i.e.,

$$\begin{aligned} a^{(n-1,\star ,-1)}_{ij}=1/a^{(n-1,\star )}_{ij},\qquad 1\le i,j\le n,\quad i\ne j, \end{aligned}$$

where \(1/\infty \) is identified with 0.

We are interested in determining the shortest paths that use at most K edges. We therefore also consider the reciprocal K-path length matrix \(A^{K,\star ,-1}=[a^{(K,\star ,-1)}_{ij}]_{i,j=1}^n\), with \(a^{(K,\star ,-1)}_{ij}=1/a^{(K,\star )}_{ij}\). Thus, the entry \(a^{(K,\star ,-1)}_{ij}\) vanishes if \(a^{(K,\star )}_{ij}=\infty \). Note that the same would happen if \(K=n-1\), in case the graph \(\mathcal {G}\) that determines the adjacency matrix A is not connected. The matrix \(A^{K,\star ,-1}\) allows us to define the global K-efficiency of \(\mathcal {G}\); cf. Sect. 3.

In order to enhance communication using paths with at most K edges, with \(1<K<n\), i.e., to increase the global K-efficiency of the graph associated with the adjacency matrix A, we select edge weights by analyzing centrality properties of the vertices of the graph and, if computationally feasible, the spectral properties of the reciprocal K-path length matrix \(A^{K,\star ,-1}\). In detail, if \(K\ll n\) and the (sparse) non-negative reciprocal K-path length matrix \(A^{K,\star ,-1}\) is irreducible, then we apply the Perron-Frobenius theory by following the approach in [4, 7]. In our context, the choice of weights is dictated by the analysis of the sensitivity to perturbations in A.

Applications of our approach include city planning and information transmission. As for disease propagation, a recent research study of the Zhejiang City Planning Center (China) pointed out a strong connection between the spread of the Covid-19 epidemic and the shape of the city. Cities with a radial structure (such as Milan) have good internal connections due to the capillarity of the public transportation system - buses, trams, subways, and trains. This dynamic made citizens of such cities more vulnerable to the arrival of the Covid-19 virus: the incidence of infections compared to the number of inhabitants was generally larger in cities with a radial urban structure than in cities without this structure, due to the good communication of the people in cities with radial urban structure.

This paper is organized as follows: Sect. 2 analyzes differences and similarities of powers and tropical powers of the adjacency matrix for undirected and unweighted graphs. Section 3 reviews well-known measures that can be easily computed by means of the path length matrix and introduces novel ones. In Sect. 4 we present two algorithms that determine which edge-weight should be changed in order to boost global efficiency. Changing the edge weights may entail widening streets or increasing the number of trams on a route, decreasing travel times on a highway by increasing the travel speed, or decreasing the waiting time for trams on a route. Finally, numerical tests are reported in Sect. 5 and concluding remarks can be found in Sect. 6.

2 Powers versus tropical powers

Consider an undirected and unweighted simple graph \(\mathcal {G}\) with adjacency matrix \(A\in \mathbb {R}^{n\times n}\). Then the entry \(a^{(k)}_{ij}\) of the matrix \(A^k=[a^{(k)}_{ij}]_{i,j=1}^n\in \mathbb {R}^{n\times n}\) counts the number of walks of length k between the vertices \(v_i\) and \(v_j\). A matrix function based on the powers \(A^k\) that is analytic at the origin, and vanishes there, can be defined by a formal Maclaurin series

$$\begin{aligned} f(A)=\sum _{k=1}^{\infty }c_kA^k, \end{aligned}$$
(2)

where we for the moment ignore the convergence properties of this series. Usually long walks are considered less important than short walks, because information flows more easily through short walks than through long ones. Therefore matrix functions applied in network analysis generally have the property that \(0\le c_{k+1}\le c_k\) for all \(k\ge 1\). The most common matrix function used in network analysis is the matrix exponential; see [5, 6, 8,9,10] for discussions and illustrations. We prefer to use the modified matrix exponential

$$\begin{aligned} \exp _0(A):=\exp (A)-I, \end{aligned}$$

where I denotes the identity matrix, because the first term in the Maclaurin series of \(\exp (A)\) has no natural interpretation in the context of network modeling. For the modified matrix exponential, we have \(c_k=1/k!\), and the series (2) converges for any adjacency matrix A.

The communicability between distinct vertices \(v_i\) and \(v_j\), \(i\ne j\), is defined by

$$\begin{aligned}{}[\exp _0(A)]_{ij}=\sum _{k=1}^{\infty }\frac{a^{(k)}_{ij}}{k!}; \end{aligned}$$

see [8] for the analogous definition based on \(\exp (A)\). The communicability accounts for all possible routes of communication between the vertices \(v_i\) and \(v_j\) in the network defined by the adjacency matrix A, and assigns a larger weight to shorter walks than to longer ones. The larger the value of \([\exp _0(A)]_{ij}\), the better is the communicability between the vertices \(v_i\) and \(v_j\).
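The communicability can be approximated by truncating the series. The following Python sketch is an illustration only; the function names `mat_mul` and `exp0`, the truncation after 30 terms, and the three-vertex star graph are assumptions.

```python
import math

def mat_mul(A, B):
    """Ordinary matrix product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][h] * B[h][j] for h in range(n)) for j in range(n)]
            for i in range(n)]

def exp0(A, terms=30):
    """Truncated series sum_{k=1}^{terms} A^k / k!, approximating exp(A) - I."""
    n = len(A)
    total = [[0.0] * n for _ in range(n)]
    term = [[float(i == j) for j in range(n)] for i in range(n)]  # A^0 / 0!
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, A)]  # A^k / k!
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

# Hypothetical star graph: v3 is adjacent to v1 and v2
A = [[0, 0, 1],
     [0, 0, 1],
     [1, 1, 0]]
C = exp0(A)
print(C[0][1])  # communicability between v1 and v2, which are two steps apart
print(C[0][2])  # communicability between the adjacent vertices v1 and v3
```

For this graph one has \(A^3=2A\), so the two entries can be checked against the closed forms \((\cosh \sqrt{2}-1)/2\) and \(\sinh (\sqrt{2})/\sqrt{2}\); the adjacent pair has the larger communicability.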

Remark 1

Notice that even if there exists an integer \(\widehat{k}\), \(1\le \widehat{k}< n-1\), such that \(a_{ij}^{(\widehat{k})}>0\), one may have \(a_{ij}^{(\widehat{k}+1)}=0\). Information provided by \(A^{n-1}\) does not include information provided by all \(A^{h}\) for \(1\le h<n-1\). This is one of the reasons for the interest in the matrix functions \(\exp _0(A)\) and \(\exp (A)\).

It is straightforward to show the following result.

Proposition 1

Let \(\widehat{k}\), \(1\le \widehat{k}< n\), be the smallest integer power such that \(a_{ij}^{(\widehat{k})}=p>0\), that is to say,

$$\begin{aligned} a^{(h)}_{ij}=0,\quad \forall h<\widehat{k},\quad a^{(\widehat{k})}_{ij}=p. \end{aligned}$$

Then \(\widehat{k}\) is the length of the shortest path that connects \(v_i\) and \(v_j\), i.e.,

$$\begin{aligned} a_{ij}^{(h,\star )}=\infty ,\quad \forall h<\widehat{k},\quad a_{ij}^{(h,\star )} = \widehat{k},\quad \forall h\ge \widehat{k}, \end{aligned}$$

and p is the number of shortest paths that connect \(v_i\) and \(v_j\).

Example 1

In view of Proposition 1, the information provided by both the path length matrix and a suitable power of the adjacency matrix is of interest. Consider the undirected and unweighted graphs \(\mathcal {G}_1\) and \(\mathcal {G}_2\) depicted in Figs. 1-2. The adjacency matrices of these graphs are

$$\begin{aligned} A_1=\begin{bmatrix} 0 &{} 0 &{} 1 &{} 1 &{} 1 \\ 0 &{} 0 &{} 1 &{} 1 &{} 1\\ 1 &{} 1 &{} 0 &{} 0 &{} 0\\ 1 &{} 1 &{} 0 &{} 0 &{} 0\\ 1 &{} 1 &{} 0 &{} 0 &{} 0 \end{bmatrix}, \quad \quad A_2=\begin{bmatrix} 0 &{} 0 &{} 1 \\ 0 &{} 0 &{} 1 \\ 1 &{} 1 &{} 0 \end{bmatrix}. \end{aligned}$$
Fig. 1 Graph \(\mathcal {G}_1\) in Example 1

Fig. 2 Graph \(\mathcal {G}_2\) in Example 1

For both graphs, the shortest path between the vertices \(v_1\) and \(v_2\) has length 2. However, in \(\mathcal {G}_1\) there are three shortest paths that connect \(v_1\) and \(v_2\), while there is only one in \(\mathcal {G}_2\). Thus, there is surely better communication between these vertices in \(\mathcal {G}_1\) than in \(\mathcal {G}_2\), even though this information is not provided by the path length matrices for these graphs. The path length matrices are

$$\begin{aligned} A_1^{4,\star }=\begin{bmatrix} 0 &{} 2 &{} 1 &{} 1 &{} 1 \\ 2 &{} 0 &{} 1 &{} 1 &{} 1\\ 1 &{} 1 &{} 0 &{} 2 &{} 2\\ 1 &{} 1 &{} 2 &{} 0 &{} 2\\ 1 &{} 1 &{} 2&{} 2 &{} 0 \end{bmatrix}, \quad \quad A_2^{2,\star }=\begin{bmatrix} 0 &{} 2 &{} 1 \\ 2 &{} 0 &{} 1 \\ 1 &{} 1 &{} 0 \end{bmatrix}. \end{aligned}$$
(3)

On the other hand, the second powers of the above adjacency matrices are

$$\begin{aligned} A_1^2=\begin{bmatrix} 3 &{} 3 &{} 0 &{} 0 &{} 0 \\ 3 &{} 3 &{} 0 &{} 0 &{} 0\\ 0 &{} 0 &{} 2 &{} 2 &{} 2\\ 0 &{} 0 &{} 2 &{} 2 &{} 2\\ 0 &{} 0 &{} 2 &{} 2 &{} 2 \end{bmatrix}, \quad \quad A_2^2=\begin{bmatrix} 1 &{} 1 &{} 0 \\ 1 &{} 1 &{} 0 \\ 0 &{} 0 &{} 2 \\ \end{bmatrix}. \end{aligned}$$
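Proposition 1 can be checked on the two graphs of this example with a small Python sketch. The helper names `mat_mul` and `first_positive_power` are assumptions made for illustration; indices in the code are 0-based.

```python
def mat_mul(A, B):
    """Ordinary matrix product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][h] * B[h][j] for h in range(n)) for j in range(n)]
            for i in range(n)]

def first_positive_power(A, i, j):
    """Smallest k < n with (A^k)_ij > 0 and the value of that entry, else None."""
    n = len(A)
    P = A
    for k in range(1, n):
        if P[i][j] > 0:
            return k, P[i][j]
        P = mat_mul(P, A)
    return None

A1 = [[0, 0, 1, 1, 1],
      [0, 0, 1, 1, 1],
      [1, 1, 0, 0, 0],
      [1, 1, 0, 0, 0],
      [1, 1, 0, 0, 0]]
A2 = [[0, 0, 1],
      [0, 0, 1],
      [1, 1, 0]]

k1, p1 = first_positive_power(A1, 0, 1)  # shortest path length and number of shortest paths
k2, p2 = first_positive_power(A2, 0, 1)
print(k1, p1, k2, p2)
```

In both graphs the shortest path between \(v_1\) and \(v_2\) has length 2, with three such paths in \(\mathcal {G}_1\) and one in \(\mathcal {G}_2\), in agreement with the entries of \(A_1^2\) and \(A_2^2\).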

Remark 2

For weighted graphs, the interpretation of the entry \(a^{(k)}_{ij}\) of the matrix \(A^k\) has to be modified. Indeed, \(a^{(k)}_{ij}\) yields the sum of all products of all weights of the edges in the walks of length k between the vertices \(v_i\) and \(v_j\). (In the unweighted case, any product of all weights of the edges in a walk is 1.) Remark 1 holds true, while Proposition 1 does not.

3 Measures that depend on the path length matrix

Let for now the graph \(\mathcal {G}\) be undirected.

3.1 Closeness centrality

Let the graph \(\mathcal {G}\) be connected. Then the reciprocal of the sum of all lengths of the shortest paths starting from vertex \(v_i\),

$$\begin{aligned} c_i=\frac{1}{\sum _{j\ne i}a_{ij}^{(n-1,\star )}}=[\textbf{1}_i^TA^{n-1,\star }\textbf{1}]^{-1}, \end{aligned}$$

where \(\textbf{1}_i\in \mathbb {R}^n\) denotes the vector with all zero entries except for the \(i^{th}\) entry, which is one, and \(\textbf{1}\in \mathbb {R}^n\) denotes the vector with all entries one, is referred to as the closeness centrality of \(v_i\) in \(\mathcal {G}\); see, e.g., [2]. This measure gives a large value to vertices that have small shortest path distances to the other vertices of the graph.

3.2 The radius and center of a network

The maximum length over all shortest paths starting from vertex \(v_i\) in a weighted or unweighted connected graph \({\mathcal {G}}\), that is to say,

$$\begin{aligned} e_i=\max _{j\ne i} a_{ij}^{(n-1,\star )}, \end{aligned}$$

is commonly referred to as the eccentricity of the vertex \(v_i\). The diameter may be seen as the maximum eccentricity among the vertices of the network. The radius \(r_{\mathcal {G}}\) of \(\mathcal {G}\) is the minimum eccentricity of a vertex. One has

$$\begin{aligned} r_{\mathcal {G}}=\min _i\max _{j\ne i} a_{ij}^{(n-1,\star )}. \end{aligned}$$

It is shown in [3] that

$$\begin{aligned} r_{\mathcal {G}}\le d_{\mathcal {G}}\le 2r_{\mathcal {G}}. \end{aligned}$$

A vertex is said to be central if its eccentricity is equal to the radius of the graph. The center of the graph is the set of all central vertices; see, e.g., [8].
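Eccentricity, radius, diameter, and center follow directly from the rows of a path length matrix. The Python sketch below is an illustration under the assumption that the path length matrix is already available; it uses the matrices (3) as input, and the function names are hypothetical.

```python
def eccentricities(P):
    """Largest off-diagonal entry in each row of the path length matrix P."""
    n = len(P)
    return [max(P[i][j] for j in range(n) if j != i) for i in range(n)]

def radius_diameter_center(P):
    ecc = eccentricities(P)
    r, d = min(ecc), max(ecc)
    center = [i + 1 for i, e in enumerate(ecc) if e == r]  # 1-based vertex labels
    return r, d, center

# Path length matrices (3) of the graphs G_1 and G_2 of Example 1
P1 = [[0, 2, 1, 1, 1],
      [2, 0, 1, 1, 1],
      [1, 1, 0, 2, 2],
      [1, 1, 2, 0, 2],
      [1, 1, 2, 2, 0]]
P2 = [[0, 2, 1],
      [2, 0, 1],
      [1, 1, 0]]

r1, d1, c1 = radius_diameter_center(P1)
r2, d2, c2 = radius_diameter_center(P2)
print(r1, d1, c1)
print(r2, d2, c2)
```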

3.3 Average shortest path length

The average shortest path length of a weighted or unweighted connected graph \(\mathcal {G}\) computed over all possible pairs of vertices in the network [2] is given by

$$\begin{aligned} a_{\mathcal {G}}=\frac{1}{n(n-1)}{\sum _{i,j\ne i} a_{ij}^{(n-1,\star )}}= \frac{1}{n(n-1)}\textbf{1}^TA^{n-1,\star } \textbf{1}, \end{aligned}$$

where \(\textbf{1}\in \mathbb {R}^n\) denotes the vector with all entries one. The average shortest path length of an unweighted connected graph with n vertices is largest when the graph is formed by a path of n vertices; in this case \(a_{\mathcal {G}}=(n+1)/3\). If \(a_{\mathcal {G}}\) scales logarithmically with n, then \(\mathcal {G}\) displays the “small-world phenomenon”; see, e.g., [8].
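The value \((n+1)/3\) for a path of n vertices can be confirmed with exact rational arithmetic. This Python sketch is an illustration only; the function name is an assumption, and it uses the fact that the distance between \(v_i\) and \(v_j\) in an unweighted path is \(|i-j|\).

```python
from fractions import Fraction

def average_path_length_of_path(n):
    """Average shortest path length of the unweighted path v1 - v2 - ... - vn."""
    # in a path graph the distance between v_i and v_j is |i - j|
    total = sum(abs(i - j) for i in range(n) for j in range(n) if i != j)
    return Fraction(total, n * (n - 1))

values = {n: average_path_length_of_path(n) for n in (3, 5, 10)}
for n, value in values.items():
    print(n, value, Fraction(n + 1, 3))
```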

3.4 Harmonic centrality and global efficiency

The efficiency of a path between any two vertices of a weighted or unweighted graph \(\mathcal {G}\) is defined as the inverse of the length of the path. The sum of the inverses of the lengths of all shortest paths starting from vertex \(v_i\), i.e., the sum of the efficiencies of all shortest paths starting from \(v_i\),

$$\begin{aligned} h_i=\sum _{j\ne i}[a_{ij}^{(n-1,\star )}]^{-1}, \end{aligned}$$
(4)

is referred to as the harmonic centrality of \(v_i\); see, e.g., [2]. The latter measure gives a large centrality to vertices \(v_i\) that have small shortest path distances to the other vertices of the graph. Harmonic centrality will play a central role in our analysis. We define the h-center of a graph as the set of all vertices with the largest harmonic centrality.

If the graph is connected, then the average shortest path efficiency over all possible pairs of vertices, also known as the average inverse geodesic length, is referred to as the global efficiency of the graph [2]:

$$\begin{aligned} e_{\mathcal {G}}=\frac{1}{n(n-1)}{\sum _{i,j\ne i} [a_{ij}^{(n-1,\star )}]^{-1}}. \end{aligned}$$
(5)

Finally, we remark that in the context of molecular chemistry, the sum of reciprocals of distances between all pairs of vertices of an undirected and unweighted connected graph is known as the Harary index and the reciprocal path length matrix \(A^{n-1,\star ,-1}\) is referred to as the Harary matrix [17].

Remark 3

The measures (4) and (5) can be especially useful when the network has more than one connected component, because infinite distances do not contribute to these “harmonic” averages.

Example 2

As an illustration of the above measures, consider again the graphs \(\mathcal {G}_1\) and \(\mathcal {G}_2\) of Example 1. Table 1 reports the diameter, radius, average shortest path length, and global efficiency of these graphs. Notice that the center of \(\mathcal {G}_1\) is given by the set of all vertices and the center of \(\mathcal {G}_2\) is made up of the vertex \(v_3\), only. The h-center of \(\mathcal {G}_1\) is formed by the vertices \(v_1\) and \(v_2\), and the h-center of \(\mathcal {G}_2\) is given by the vertex \(v_3\), only. Table 2 shows the eccentricity, harmonic centrality, and closeness centrality of all the vertices of \(\mathcal {G}_1\) and \(\mathcal {G}_2\). All the measures in this example are computed by using the path length matrices (3).
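The vertex and graph measures of this example can be recomputed from the path length matrices (3) with the following Python sketch. It is an illustration only; the function names `recip` and `measures` are assumptions.

```python
INF = float("inf")

def recip(d):
    """Reciprocal of a distance, with 1/inf identified with 0."""
    return 0.0 if d == INF else 1.0 / d

def measures(P):
    """Closeness, harmonic centrality, average path length, global efficiency."""
    n = len(P)
    rows = [[P[i][j] for j in range(n) if j != i] for i in range(n)]
    closeness = [1.0 / sum(row) for row in rows]
    harmonic = [sum(recip(d) for d in row) for row in rows]
    avg_len = sum(sum(row) for row in rows) / (n * (n - 1))
    efficiency = sum(harmonic) / (n * (n - 1))
    return closeness, harmonic, avg_len, efficiency

P1 = [[0, 2, 1, 1, 1],
      [2, 0, 1, 1, 1],
      [1, 1, 0, 2, 2],
      [1, 1, 2, 0, 2],
      [1, 1, 2, 2, 0]]
P2 = [[0, 2, 1],
      [2, 0, 1],
      [1, 1, 0]]

c1, h1, a1, e1 = measures(P1)
c2, h2, a2, e2 = measures(P2)
print(h1, a1, e1)
print(h2, a2, e2)
```

The largest harmonic centralities are attained at \(v_1\) and \(v_2\) in \(\mathcal {G}_1\) and at \(v_3\) in \(\mathcal {G}_2\), which is consistent with the h-centers stated above.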

Table 1 Example 2. Diameter, radius, average shortest path length, and global efficiency of the graphs \(\mathcal {G}_1\) and \(\mathcal {G}_2\) depicted in Figs. 1-2. Note that the average shortest path length of \(\mathcal {G}_2\) is maximal; indeed, for \(n=3\) one has \((n+1)/3=4/3\approx 1.33\)
Table 2 Example 2. Eccentricity, harmonic centrality, and closeness centrality of the vertices of \(\mathcal {G}_1\) and of \(\mathcal {G}_2\)
Table 3 Example 3. Eccentricity, harmonic centrality, and closeness centrality of the vertices of \(\mathcal {\tilde{G}}_2\)

3.5 Harmonic K-centrality and global K-efficiency

When one considers shortest paths that are made up of at most K edges, the matrix \(A^{K,\star ,-1}\) takes the role of the reciprocal path length matrix \(A^{n-1,\star ,-1}\). We are in a position to introduce the harmonic K-centrality of the vertex \(v_i\). It is given by

$$\begin{aligned} h_i^{K}=\sum _{j\ne i}[a_{ij}^{(K,\star )}]^{-1}. \end{aligned}$$

The global K-efficiency of a graph \(\mathcal {G}\) is defined by

$$\begin{aligned} e_{\mathcal {G}}^{K}=\frac{1}{n(n-1)}{\sum _{i,j\ne i} [a_{ij}^{(K,\star )}]^{-1}}. \end{aligned}$$

The set of all vertices of a graph \(\mathcal {G}\) with the largest harmonic K-centrality is referred to as the \(h^{K}\)-center of the graph.
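A minimal Python sketch of the harmonic K-centrality and the global K-efficiency follows. The function names and the five-vertex path graph are assumptions made for illustration; the matrix \(A^{K,\star }\) is formed by a plain min-plus iteration.

```python
INF = float("inf")

def k_path_lengths(adj, K):
    """Distances using paths with at most K edges (the matrix A^{K,*})."""
    n = len(adj)
    A1 = [[adj[i][j] if i == j or adj[i][j] != 0 else INF for j in range(n)]
          for i in range(n)]
    P = A1
    for _ in range(K - 1):
        P = [[min(P[i][h] + A1[h][j] for h in range(n)) for j in range(n)]
             for i in range(n)]
    return P

def harmonic_K(adj, K):
    """Harmonic K-centralities; infinite distances contribute 1/inf = 0."""
    P = k_path_lengths(adj, K)
    n = len(P)
    return [sum(1.0 / P[i][j] for j in range(n) if j != i and P[i][j] != INF)
            for i in range(n)]

def global_K_efficiency(adj, K):
    h = harmonic_K(adj, K)
    n = len(h)
    return sum(h) / (n * (n - 1))

# Unweighted undirected path v1 - v2 - v3 - v4 - v5 (hypothetical example)
adj = [[0, 1, 0, 0, 0],
       [1, 0, 1, 0, 0],
       [0, 1, 0, 1, 0],
       [0, 0, 1, 0, 1],
       [0, 0, 0, 1, 0]]
hK = harmonic_K(adj, 2)
eK = global_K_efficiency(adj, 2)
hK_center = [i + 1 for i, v in enumerate(hK) if v == max(hK)]  # 1-based labels
e_full = global_K_efficiency(adj, 4)                           # K = n - 1
print(hK, eK, hK_center, e_full)
```

For this graph the \(h^{2}\)-center consists of the middle vertex only, and the global 2-efficiency is smaller than the global efficiency because pairs of vertices more than two edges apart do not contribute.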

3.6 Out-centrality versus in-centrality

Let the graph \(\mathcal {G}\) be directed. Then the above centrality measures (closeness, eccentricity, harmonic centrality, and harmonic K-centrality) determine the importance of a vertex \(v_j\) by taking into account the paths that start at \(v_j\). These measures therefore may be considered measures of out-centrality. One also may be interested in measuring the importance of a vertex \(v_j\) by considering the paths that end at it, that is to say by measuring the in-centrality of \(v_j\). This can be achieved by replacing the path length matrix \(A^{n-1,\star }\) in the measures mentioned by its transpose. This allows us to introduce the measures closeness in-centrality, in-eccentricity, harmonic in-centrality, and harmonic \(K_\textrm{in}\)-centrality. These measures are defined as

$$\begin{aligned} c_j^\textrm{in}= & {} \frac{1}{\sum _{i\ne j}a_{ij}^{(n-1,\star )}} \,,\;\;\; e_j^\textrm{in}= \max _{i\ne j} a_{ij}^{(n-1,\star )}\,, \\ h_j^\textrm{in}= & {} \sum _{i\ne j}[a_{ij}^{(n-1,\star )}]^{-1}\,,\;\;\; h_j^{K_\textrm{in}}= \sum _{i\ne j}[a_{ij}^{(K,\star )}]^{-1}. \end{aligned}$$

We also define the in-radius of \(\mathcal {G}\),

$$\begin{aligned} r^\textrm{in}_{\mathcal {G}}=\min _j\max _{i\ne j} a_{ij}^{(n-1,\star )}. \end{aligned}$$

Finally, the vertex \(v_j\) is said to be in-central if its in-eccentricity equals the in-radius of the graph. The \(h^{K_\textrm{in}}\)-center of a graph \(\mathcal {G}\) is the set of all vertices with the largest harmonic \(K_\textrm{in}\)-centrality. The notions of \(h^{K_\textrm{in}}\)-center and \(h^{K_\textrm{out}}\)-center will be of interest in the sequel.
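The in-centrality measures only require the columns of the path length matrix. The Python sketch below is an illustration; the function name `in_measures` and the small directed graph, whose path length matrix is entered directly, are assumptions.

```python
def in_measures(P):
    """Closeness in-centrality, in-eccentricity, and harmonic in-centrality."""
    n = len(P)
    PT = [list(col) for col in zip(*P)]  # transpose of the path length matrix
    cols = [[PT[j][i] for i in range(n) if i != j] for j in range(n)]
    closeness_in = [1.0 / sum(c) for c in cols]
    ecc_in = [max(c) for c in cols]
    harmonic_in = [sum(1.0 / d for d in c) for c in cols]
    return closeness_in, ecc_in, harmonic_in

# Path length matrix of a hypothetical unweighted directed graph with the edges
# v1->v2, v1->v3, v2->v3, v3->v1
P = [[0, 1, 1],
     [2, 0, 1],
     [1, 2, 0]]
c_in, e_in, h_in = in_measures(P)
in_radius = min(e_in)
in_central = [j + 1 for j, e in enumerate(e_in) if e == in_radius]  # 1-based labels
print(c_in, e_in, h_in, in_radius, in_central)
```

In this graph \(v_1\) has the smallest eccentricity when paths that start at a vertex are considered, whereas \(v_3\) is the only in-central vertex, so the out- and in-measures can rank vertices differently.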

4 Enhancing network communication

The diameter of a weighted or unweighted graph provides a measure of how easy it is for the vertices of the graph to communicate. When graphs are used as models for communication networks, the diameter plays an important role in the performance analysis and cost optimization. A simple way to decrease the diameter of a graph so that information can be transmitted more easily between vertices of the graph [4] is to decrease the weight of an edge that belongs to all maximal shortest paths, if feasible.

Example 3

Consider the graph \(\mathcal {G}_2\) in Example 1. The edge \(e(v_2\leftrightarrow v_3)\) belongs to the maximal shortest path and is a bridge, i.e., its removal would make the vertex \(v_2\) unreachable and the resulting graph disconnected. In detail, the edge \(e(v_2\leftrightarrow v_3)\) belongs to the shortest path between the vertices \(v_1\) and \(v_2\), which is the only shortest path of length \(d_\mathcal {G}\). Thus, decreasing the weight of the edge \(e(v_2\leftrightarrow v_3)\) decreases both the average shortest path length and the diameter of the graph. Specifically, if one decreases the weights \(a_{23}\) and \(a_{32}\) from 1 to 0.5, one obtains the matrices

$$\begin{aligned} \tilde{A}_2=\begin{bmatrix} 0 &{} 0 &{} 1 \\ 0 &{} 0 &{} 0.5 \\ 1 &{} 0.5 &{} 0 \end{bmatrix}, \quad \quad \tilde{A}_2^{2,\star }=\begin{bmatrix} 0 &{} 1.5 &{} 1 \\ 1.5&{} 0 &{} 0.5 \\ 1 &{} 0.5 &{} 0 \end{bmatrix}. \end{aligned}$$

The perturbed graph \(\tilde{\mathcal {G}}_2\) has diameter \(d_{{\mathcal {\tilde{G}}}_2}=1.5\), radius \(r_{{\mathcal {\tilde{G}}}_2}=1\), average shortest path length \(a_{{\mathcal {\tilde{G}}}_2}=1\), and global efficiency \(e_{{\mathcal {\tilde{G}}}_2}=1.22\). The eccentricity, harmonic centrality, and closeness centrality of the vertices of the graph \(\tilde{\mathcal {G}}_2\) are reported in Table 3.
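The values reported in this example can be reproduced with a few lines of Python. The sketch is an illustration only; the function name `path_length_matrix` is an assumption, and the matrix is formed by \(n-2\) min-plus multiplications.

```python
INF = float("inf")

def path_length_matrix(adj):
    """Path length matrix A^{n-1,*} obtained by n-2 min-plus multiplications."""
    n = len(adj)
    A1 = [[adj[i][j] if i == j or adj[i][j] != 0 else INF for j in range(n)]
          for i in range(n)]
    P = A1
    for _ in range(n - 2):
        P = [[min(P[i][h] + A1[h][j] for h in range(n)) for j in range(n)]
             for i in range(n)]
    return P

A2_tilde = [[0, 0, 1],
            [0, 0, 0.5],
            [1, 0.5, 0]]
P = path_length_matrix(A2_tilde)
n = len(P)
off = [P[i][j] for i in range(n) for j in range(n) if i != j]
ecc = [max(P[i][j] for j in range(n) if j != i) for i in range(n)]
diameter, radius = max(ecc), min(ecc)
avg_len = sum(off) / (n * (n - 1))
efficiency = sum(1.0 / d for d in off) / (n * (n - 1))
print(diameter, radius, avg_len, round(efficiency, 2))
```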

4.1 Increasing the global K-efficiency

We propose two approaches to increase the global K-efficiency of a network.

4.1.1 The function eKG1

Let for now the graph \(\mathcal {G}\) be directed. The first approach is based on the observation that the most important vertices with respect to the global K-efficiency live in the vertex subsets \(h^{K_\textrm{out}}\)-center and \(h^{K_\textrm{in}}\)-center of the graph. These vertices may be interpreted as important intermediaries that quickly collect information from many vertices and quickly broadcast it to many other vertices. Indeed, strengthening an existing connection from a vertex of the \(h^{K_\textrm{in}}\)-center to a vertex of the \(h^{K_\textrm{out}}\)-center is likely to strengthen their communicability by having new shorter paths with at most K steps that exploit these connections. This is likely to increase the global K-efficiency more than strengthening an existing connection between vertices with lower harmonic \(K_\textrm{in}\)- and \(K_\textrm{out}\)-centrality. Here we consider graphs whose edge weights represent travel times or waiting times. Hence, strengthening is achieved by decreasing appropriate weights.

We halve the weight of the selected edge \(e(v_{h_1}\rightarrow v_{h_2})\), which yields the perturbed adjacency matrix

$$\begin{aligned} \tilde{A}=A+\gamma _{h_1,h_2}\textbf{1}_{h_1}\textbf{1}_{h_2}^T, \;\;\; \text{ with }\;\;\gamma _{h_1,h_2} = - a_{h_1,h_2}/2. \end{aligned}$$
(6)

If the graph is undirected, the above approach simplifies, because the \(h^{K_\textrm{out}}\)- and \(h^{K_\textrm{in}}\)-centers coincide. Hence, the idea is to strengthen the connection between vertices with the largest harmonic K-centrality. The perturbed adjacency matrix \(\tilde{A}\) will be

$$\begin{aligned} \tilde{A}=A+\gamma _{h_1,h_2}(\textbf{1}_{h_1}\textbf{1}_{h_2}^T + \textbf{1}_{h_2}\textbf{1}_{h_1}^T) . \end{aligned}$$
(7)
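Applying the perturbations (6) and (7) amounts to halving one or two entries of A. The following Python sketch illustrates only this update step and not the selection of the indices carried out by the function eKG1; the function name `halve_edge_weight` is an assumption, and the indices are 0-based.

```python
def halve_edge_weight(A, h1, h2, undirected=False):
    """Return A + gamma * 1_{h1} 1_{h2}^T with gamma = -a_{h1,h2} / 2."""
    if A[h1][h2] <= 0:
        raise ValueError("there is no edge from vertex h1 to vertex h2")
    gamma = -A[h1][h2] / 2
    A_tilde = [row[:] for row in A]   # the input matrix is left unchanged
    A_tilde[h1][h2] += gamma          # perturbation (6)
    if undirected:
        A_tilde[h2][h1] += gamma      # symmetric counterpart, as in (7)
    return A_tilde

# Graph G_2 of Example 1: halving the weight of the edge between v2 and v3
A2 = [[0, 0, 1],
      [0, 0, 1],
      [1, 1, 0]]
A2_tilde = halve_edge_weight(A2, 1, 2, undirected=True)
print(A2_tilde)
```

With these indices the sketch reproduces the matrix \(\tilde{A}_2\) of Example 3.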

The MATLAB function eKG1 describes the necessary computations. The operator \(==\) in line 5 of the function eKG1 tests for equality, and the symbol ./ in line 6 denotes element-wise division. The function call \(\textsf{sum}(M)\) for a matrix \(M\in \mathbb {R}^{n\times n}\) computes a row vector \(m\in \mathbb {R}^n\), whose \(j^{th}\) component is the sum of the entries of the \(j^{th}\) column; the function call \(\textsf{sum}(M,2)\) for a matrix \(M\in \mathbb {R}^{n\times n}\) computes a column vector, whose \(i^{th}\) entry is the sum of the elements of row i of M. The apostrophe in line 8 denotes transposition, and the operator \(.*\) in line 11 stands for vector-vector element-wise product.

figure b

4.1.2 The function eKG2

Assume for now that the graph \(\mathcal {G}\) is directed. If computing the Perron root \(\rho _K\) and the unique positive left and right eigenvectors of unit norm (the Perron vectors) of the reciprocal K-path length matrix \(A^{K,\star ,-1}\) is not computationally feasible or if this matrix is not irreducible, then one can use the function eKG1. However, if \(A^{K,\star ,-1}\) is irreducible and its right and left Perron vectors \(\textbf{x}_K=(x_{K,i})\) and \(\textbf{y}_K=(y_{K,i})\) can be computed, then these vectors determine the Wilkinson perturbation \(W_K=\textbf{y}_K\textbf{x}_K^T\); see [20, Section 2]. Following [7], to induce the maximal perturbation in \(\rho _K\), one chooses the indices \((h_1,h_2)\) such that \(W_K(h_1,h_2)\) is the largest entry of \(W_K\) among those with \(A(h_1,h_2)>0\), i.e., the indices of the largest entry of the Wilkinson perturbation “projected” onto the zero-structure of A; see, e.g., [15]. Thus,

$$\begin{aligned} (h_1,h_2): x_{K,h_2}\,y_{K,h_1}=(\textbf{y}_K\textbf{x}_K^T)_{h_1, h_2}= \max _{i,j:A(i,j)>0} (W_K)_{i,j}. \end{aligned}$$

As in function eKG1, one strengthens the edge \(e(v_{h_1}\rightarrow v_{h_2})\) by halving its weight. The perturbed adjacency matrix then is given by (6).
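The selection of \((h_1,h_2)\) can be sketched in Python as follows. This is not the MATLAB function eKG2: the Perron vectors are approximated here by power iteration applied to the shifted matrix \(A^{K,\star ,-1}+I\), which is a technique chosen for this illustration, and the function names `perron` and `wilkinson_edge` are assumptions. The vectors are scaled to unit 1-norm, which does not affect the location of the largest entry.

```python
import math

def perron(R, iters=500):
    """Perron root and right Perron vector of an irreducible nonnegative R.

    Power iteration is applied to R + I, which has the same eigenvectors as R
    and is primitive, so the iteration converges."""
    n = len(R)
    x = [1.0] * n
    for _ in range(iters):
        y = [x[i] + sum(R[i][j] * x[j] for j in range(n)) for i in range(n)]
        s = sum(y)
        x = [v / s for v in y]
    # at convergence R x = rho x and sum(x) = 1, hence rho = sum(R x)
    rho = sum(sum(R[i][j] * x[j] for j in range(n)) for i in range(n))
    return rho, x

def wilkinson_edge(A, R):
    """Indices (0-based) of the largest entry of y x^T over the edges of A."""
    n = len(A)
    _, x = perron(R)                                  # right Perron vector
    _, y = perron([list(col) for col in zip(*R)])     # left Perron vector
    candidates = [(y[i] * x[j], (i, j))
                  for i in range(n) for j in range(n) if A[i][j] > 0]
    return max(candidates)[1]

# Graph G_2 of Example 1 with K = 2: reciprocals of the entries of A_2^{2,*} in (3)
A2 = [[0, 0, 1],
      [0, 0, 1],
      [1, 1, 0]]
R2 = [[0, 0.5, 1],
      [0.5, 0, 1],
      [1, 1, 0]]
rho, x = perron(R2)
h1, h2 = wilkinson_edge(A2, R2)
print(rho, (1 + math.sqrt(33)) / 4)
print(h1 + 1, h2 + 1)   # 1-based indices of one maximizing edge
```

For \(\mathcal {G}_2\) all entries of \(W_K\) that correspond to edges are equal up to rounding, so the indices returned are one of several equally good choices.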

When the graph \(\mathcal {G}\) is undirected, the left and right Perron vectors coincide and the perturbed adjacency matrix is constructed as in (7). The outlined approach is implemented by the MATLAB function eKG2. We recall that the function \(\textsf{abs}(\mathsf v)\) of a vector \(\textsf{v}\), used in the function eKG2, returns a vector whose components are the absolute values of the components of \(\mathsf v\). The expression \(\textsf{sum}(\textsf{sum}(Pr))\) on line 7 sums all entries of the matrix Pr. The operator \(.*\) on line 20 denotes the Hadamard product of two matrices; the entries of the matrix \(A>0\) are one if the corresponding entry of A is positive; they are zero otherwise. The function ind2sub determines the equivalent subscript values corresponding to a given single index in an array.

figure c

We expect the global K-efficiency to increase the most when decreasing the edge-weight that makes the Perron root \(\rho _K\) change the most. The perturbation of the Perron root generated by the function eKG2 typically is larger than the perturbation determined by the function eKG1. The difference between these perturbations is analogous to the difference between identifying the most important vertex of a graph as the one with the largest degree and identifying it as the one with maximal eigenvector centrality.

Remark 4

Both functions eKG1 and eKG2 maximize lower bounds for the global K-efficiency of the network. Consider for the sake of clarity the undirected case. Let \(\textbf{h}_K\) denote the vector of the harmonic K-centralities of the vertices of the graph. Its 1-norm is the sum in the numerator of the global K-efficiency, and its \(\infty \)-norm is what function eKG1 is maximizing. Indeed, one has

$$ n(n-1)\,e_{\mathcal {G}}^{K}=\Vert \textbf{h}_K\Vert _1\ge \Vert \textbf{h}_K\Vert _{\infty }= \Vert A^{K,\star ,-1}\Vert _{\infty } \ge \rho _K. $$
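As an illustration, consider the values that Example 4 reports for the graph \(\mathcal {G}_1\) with \(K=2\), namely \(n=5\), \(e_{\mathcal {G}_1}^2=0.80\), and \(\textbf{h}_2=[3.5,3.5,3,3,3]^T\). The value \(\rho _2\approx 3.21\) is not stated in the text; it is computed here as the largest eigenvalue of \(A_1^{2,\star ,-1}\). The chain of inequalities then reads

$$ 20\cdot 0.80=16=\Vert \textbf{h}_2\Vert _1\ge \Vert \textbf{h}_2\Vert _{\infty }=3.5= \Vert A_1^{2,\star ,-1}\Vert _{\infty } \ge \rho _2\approx 3.21. $$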

Remark 5

The Perron root is bounded from below and from above by the minimal and maximal entries of \(\textbf{h}_K\), respectively. We have

$$\begin{aligned} \min _{i}\sum _{j\ne i}[a_{ij}^{(K,\star )}]^{-1}\le \rho _K\le \max _{i}\sum _{j\ne i}[a_{ij}^{(K,\star )}]^{-1}= \Vert \textbf{h}_K\Vert _{\infty }. \end{aligned}$$
(8)

Indeed, one has

$$\begin{aligned} \textbf{1}^T A^{K,\star ,-1} \textbf{x}_K = \textbf{1}^T \rho _K \textbf{x}_K = \rho _K \Vert \textbf{x}_K\Vert _1 \end{aligned}$$

and

$$\begin{aligned} \textbf{1}^T A^{K,\star ,-1} \textbf{x}_K = \textbf{x}_K^T A^{K,\star ,-1} \textbf{1} =\sum _{i=1}^n(x_{K,i} \sum _{j \ne i} [a_{ij}^{(K,\star )}]^{-1} ), \end{aligned}$$

so that one obtains (8) by observing that

$$\begin{aligned} \min _{i}\sum _{j\ne i}[a_{ij}^{(K,\star )}]^{-1}\Vert \textbf{x}_K\Vert _1\le \rho _K\Vert \textbf{x}_K\Vert _1\le \max _{i}\sum _{j\ne i}[a_{ij}^{(K,\star )}]^{-1}\Vert \textbf{x}_K\Vert _1. \end{aligned}$$
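The bounds (8) can be checked numerically on a small case. The sketch below uses the reciprocal 2-path length matrix of \(\mathcal {G}_1\) given in Example 4; the function name `perron_root` is hypothetical, and the shifted power iteration is an assumption made to keep the example self-contained. The Perron root is recovered from the identity \(\textbf{1}^T A^{K,\star ,-1}\textbf{x}_K=\rho _K\Vert \textbf{x}_K\Vert _1\) used in the argument above.

```python
def perron_root(M, iters=200):
    """Approximate the Perron root of an irreducible nonnegative matrix M."""
    n = len(M)
    x = [1.0 / n] * n
    for _ in range(iters):
        # One power-iteration step with M + I (same eigenvectors as M).
        y = [x[i] + sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)                 # 1-norm, since all entries are positive
        x = [v / total for v in y]
    Mx = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(Mx) / sum(x)            # rho = 1^T M x / ||x||_1


# Reciprocal 2-path length matrix of the graph G_1 (Example 4).
P1 = [
    [0.0, 0.5, 1.0, 1.0, 1.0],
    [0.5, 0.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 0.5, 0.5],
    [1.0, 1.0, 0.5, 0.0, 0.5],
    [1.0, 1.0, 0.5, 0.5, 0.0],
]
h = [sum(row) for row in P1]           # harmonic 2-centralities (row sums)
rho = perron_root(P1)
print(min(h), round(rho, 4), max(h))   # lower bound, Perron root, upper bound
```

Here the row sums are 3.5 and 3, and the computed Perron root lies strictly between them, as (8) predicts for a matrix whose row sums are not all equal.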

Example 4

We apply the functions eKG1 and eKG2 to the graphs of Example 1. For neither graph is there a unique “best choice” to report.

First consider the graph \(\mathcal {G}_1\) in Example 1. Then \(A_1^{2,\star }=A_1^{3,\star }= A_1^{4,\star }\). We let \(K=2\) and obtain

$$\begin{aligned} A_1^{2,\star ,-1}=\begin{bmatrix} 0 &{} 0.5 &{} 1 &{} 1 &{} 1 \\ 0.5 &{} 0 &{} 1 &{} 1 &{} 1\\ 1 &{} 1 &{} 0 &{} 0.5&{} 0.5\\ 1 &{} 1 &{} 0.5 &{} 0 &{} 0.5\\ 1 &{} 1 &{} 0.5&{} 0.5 &{} 0 \end{bmatrix}\,. \end{aligned}$$

The vector \(\textbf{h}_2\) of harmonic 2-centralities is \(\textbf{h}_2=[3.5,3.5,3,3,3]^T\), while the Perron vector \(\textbf{x}_2\) is given by \(\textbf{x}_2=[0.47,0.47,0.43,0.43,0.43]^T\). This tells us that the vertices \(v_1\) and \(v_2\) are the most important ones in the sense of both harmonic centrality and eigenvector centrality. Indeed, these are the only vertices that are each connected to three other vertices by paths of minimal length. Thus, they are well connected. However, \(A_1(1,2)=0\).

Both functions eKG1 and eKG2 give \((h_1,h_2)=(1,3)\), and yield the matrix

$$\begin{aligned} \tilde{A}_1=\begin{bmatrix} 0 &{} 0 &{} 0.5 &{} 1 &{} 1 \\ 0 &{} 0 &{} 1 &{} 1 &{} 1\\ 0.5 &{} 1 &{} 0 &{} 0 &{} 0\\ 1 &{} 1 &{} 0 &{} 0 &{} 0\\ 1 &{} 1 &{} 0 &{} 0 &{} 0 \end{bmatrix}\,, \end{aligned}$$

even though other choices of existing edges are equally valid. The global 2-efficiency \(e_{\tilde{\mathcal {G}}_1}^2\) of \(\tilde{A}_1\) is 0.95; compare with the global 2-efficiency \(e_{\mathcal {G}_1}^2=0.80\) of \(A_1\). We remark that a perturbation of another existing edge would have led to the same increase of the global 2-efficiency.

Consider the graph \(\mathcal {G}_2\) in Example 1. One has

$$\begin{aligned} A_2^{2,\star ,-1}=\begin{bmatrix} 0 &{} 0.5 &{} 1 \\ 0.5 &{} 0 &{} 1 \\ 1 &{} 1 &{} 0 \\ \end{bmatrix}\,. \end{aligned}$$

The vector of harmonic 2-centralities is \(\textbf{h}_2=[1.5,1.5,2]^T\), while the Perron vector is given by \(\textbf{x}_2=[0.54, 0.54, 0.64]^T\). Both functions eKG1 and eKG2 yield \((h_1,h_2)=(3,1)\), even though the choice \((h_1,h_2)=(3,2)\) is equally valid. The choice \((h_1,h_2)=(3,1)\) gives the matrix

$$\begin{aligned} \tilde{A}_2=\begin{bmatrix} 0 &{} 0 &{} 0.5 \\ 0 &{} 0 &{} 1 \\ 0.5 &{} 1 &{} 0 \\ \end{bmatrix}. \end{aligned}$$

The global 2-efficiency of \(\tilde{A}_2\) is \(e_{\tilde{\mathcal {G}}_2}^2=1.22\), while the global 2-efficiency of the matrix \(A_2\) is \(e_{\mathcal {G}_2}^2=0.83\). The diameter associated with the graph determined by \(\tilde{A}_2\) is only 1.5, while the diameter of the graph associated with \(A_2\) is 2. The choice \((h_1,h_2)=(3,2)\), which was considered in Example 3, gives the same results.
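The efficiencies and diameters quoted in this example can be reproduced with a short script. The sketch below assumes that the entry \(a_{ij}^{(K,\star )}\) of the K-path length matrix is the length of a shortest path from \(v_i\) to \(v_j\) with at most K edges, and that the global K-efficiency is the sum of the reciprocals of the off-diagonal entries divided by \(n(n-1)\), in agreement with the identity in Remark 4. The function names are hypothetical, and the adjacency matrices \(A_1\) and \(A_2\) are inferred from the matrices displayed above.

```python
INF = float("inf")


def k_path_length_matrix(A, K):
    """Lengths of shortest paths that use at most K edges (min-plus powers of A)."""
    n = len(A)
    # One-edge lengths; a missing edge is treated as infinitely long.
    one = [[A[i][j] if A[i][j] > 0 else INF for j in range(n)] for i in range(n)]
    D = [[0.0 if i == j else one[i][j] for j in range(n)] for i in range(n)]
    for _ in range(K - 1):
        # Allow one more edge at the end of each path.
        D = [[min(D[i][j], min(D[i][k] + one[k][j] for k in range(n)))
              for j in range(n)] for i in range(n)]
    return D


def global_k_efficiency(A, K):
    """Average of the reciprocal off-diagonal K-path lengths."""
    n = len(A)
    D = k_path_length_matrix(A, K)
    total = sum(1.0 / D[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))


def k_diameter(A, K):
    """Largest off-diagonal entry of the K-path length matrix."""
    n = len(A)
    D = k_path_length_matrix(A, K)
    return max(D[i][j] for i in range(n) for j in range(n) if i != j)


A1 = [[0, 0, 1, 1, 1], [0, 0, 1, 1, 1], [1, 1, 0, 0, 0], [1, 1, 0, 0, 0], [1, 1, 0, 0, 0]]
A1_tilde = [[0, 0, 0.5, 1, 1], [0, 0, 1, 1, 1], [0.5, 1, 0, 0, 0], [1, 1, 0, 0, 0], [1, 1, 0, 0, 0]]
A2 = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]
A2_tilde = [[0, 0, 0.5], [0, 0, 1], [0.5, 1, 0]]

for name, M in [("A1", A1), ("A1~", A1_tilde), ("A2", A2), ("A2~", A2_tilde)]:
    print(name, round(global_k_efficiency(M, 2), 2), k_diameter(M, 2))
```

Under these assumptions the script returns the global 2-efficiencies 0.80, 0.95, 0.83, and 1.22 reported above, and the diameters 2 and 1.5 for \(A_2\) and \(\tilde{A}_2\).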

5 Numerical tests

The numerical tests reported in this section have been carried out using MATLAB R2022b on an iMac with a 3.2 GHz 6-core Intel Core i7 processor. The Perron root and left and right Perron vectors for small to moderately sized graphs can easily be evaluated by using the MATLAB function eig. For large-scale graphs these quantities can be computed by the MATLAB function eigs or by the two-sided Arnoldi algorithm, introduced by Ruhe [18] and improved by Zwaan and Hochstenbach [21].

Example 5

Consider the adjacency matrix for the network Air500 in [1]. This data set describes flight connections for the top 500 airports worldwide based on total passenger volume. The flight connections between airports are for the year from 1 July 2007 to 30 June 2008. The network is represented by a directed unweighted connected graph with 500 vertices and 24009 directed edges. The vertices of the network are the airports and the edges represent direct flight routes between two airports.

The path length matrix \(A^{5,\star }=A^{499,\star }\) shows that the diameter and the radius of the graph are 5 and 3, respectively. The information provided by the vector of the harmonic centralities and the Perron vector for the reciprocal path length matrix \(A^{5,\star ,-1}\) is the same as the one given by the vector of harmonic K-centralities and the Perron vector for \(A^{K,\star ,-1}\) with \(K=2\); cf. Table 4. Therefore, the perturbation that increases the global K-efficiency the most also will enhance the global efficiency the most. The information provided by Table 4 suggests that the number of flights from the Frankfurt FRA Airport (vertex \(v_{161}\)) to the JFK Airport in New York (vertex \(v_{224}\)) should be doubled in order to halve the wait time between these flights. Doubling the number of flights corresponds to halving the weight for the corresponding edge.

Table 4 Example 5. Indices chosen by the functions eKG1 and eKG2 and the global K-efficiency of both the given graph, \({\mathcal {G}}\), and the perturbed graph, \({\tilde{\mathcal {G}}}\), for \(K=2,3,4,5\)

Example 6

This example considers an undirected unweighted connected graph \({\mathcal {G}}\) that represents the German highway system network Autobahn. The graph is available at [1]. Its 1168 vertices are German locations and its 1243 edges represent highway segments that connect them.

Let A be the adjacency matrix associated with \({\mathcal {G}}\). The path length matrix \(A^{62,\star }=A^{1167,\star }\) shows that the diameter and the radius of \({\mathcal {G}}\) are 62 and 34, respectively, whereas its global efficiency equals \(6.7175\cdot 10^{-2}\). One notices that there is only one shortest path of length 62, which connects the vertices \(v_{116}\) and \(v_{1154}\). The diameter of the graph can be decreased by halving the weights \(a_{120,116}\) and \(a_{116, 120}\) (both entries, since the graph is undirected), because the corresponding edge is the unique edge that connects the vertex \(v_{116}\) to the other vertices of the graph.
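The effect of halving the weight of such a pendant edge can be seen on a toy graph. The sketch below is an illustration only: it uses a path graph with four vertices instead of the Autobahn data, computes all shortest path lengths with the Floyd-Warshall algorithm, and assumes the usual definitions of the diameter and the radius as the largest and smallest vertex eccentricity. The function names are hypothetical.

```python
def path_lengths(A):
    """All-pairs shortest path lengths of a weighted graph (Floyd-Warshall)."""
    n = len(A)
    inf = float("inf")
    D = [[0.0 if i == j else (A[i][j] if A[i][j] > 0 else inf)
          for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D


def diameter_and_radius(A):
    """Largest and smallest eccentricity; the eccentricity of a vertex is the
    length of the longest shortest path that starts at it."""
    ecc = [max(row) for row in path_lengths(A)]
    return max(ecc), min(ecc)


# Path graph v1 - v2 - v3 - v4 with unit weights.
A = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
# Same graph with the weight of the pendant edge between v1 and v2 halved.
A_hat = [[0, 0.5, 0, 0], [0.5, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]

print(diameter_and_radius(A))
print(diameter_and_radius(A_hat))
```

In this toy case the diameter drops from 3 to 2.5, since the halved edge lies on the only longest shortest path; this mirrors the situation of the edge incident to \(v_{116}\).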

Table 5 Example 6. Indices chosen by function eKG1 and global K-efficiency of both \({\mathcal {G}}\) and \({\tilde{\mathcal {G}}}\), for \(K=2,3,4,5\) and for \(K=12:10:62\)
Table 6 Example 6. Indices chosen by function eKG2 and global K-efficiency of both \({\mathcal {G}}\) and \({\tilde{\mathcal {G}}}\), for \(K=2,3,4,5\) and for \(K=12:10:62\)

Let \({\hat{A}}\) denote the perturbed adjacency matrix and \(\hat{\mathcal {G}}\) the corresponding graph. The global efficiency of \(\hat{\mathcal {G}}\) is \(6.7177\cdot 10^{-2}\) and the diameter is 61.5. We turn to the application of the functions eKG1 and eKG2 to increasing the global efficiency. Table 5 reports the global K-efficiency (for several values of K) of the graph \({\tilde{\mathcal {G}}}\) associated with the adjacency matrix \(\tilde{A}\) obtained by halving both entries \(a_{h_1,h_2}\) and \(a_{h_2,h_1}\) of A, computed by the function eKG1. Table 6 shows the global K-efficiency (for several values of K) of the graph \({\tilde{\mathcal {G}}}\) associated with the adjacency matrix \(\tilde{A}\), computed by the function eKG2. Also in this example, the information provided by the vector of harmonic centralities and the Perron vector for the reciprocal path length matrix \(A^{62,\star ,-1}\) is exactly the same information that is provided by the vector of harmonic K-centralities and the Perron vector for \(A^{K,\star ,-1}\) with \(K\ge 4\). We note that the latter vectors are less expensive to determine than the former. The information provided by both tables suggests that one should double the width of the highway that connects the cities of Duisburg (vertex \(v_{219}\)) and Krefeld (vertex \(v_{565}\)) to halve the travel time. These cities are 10 miles apart. Doubling the width corresponds to halving the weight associated with the corresponding edge.

6 Concluding remarks

The adjacency matrix of a graph is a well-known tool for studying properties of a network defined by the graph. The path length matrix associated with a graph also sheds light on properties of the network, but so far has not received much attention. A review of measures that can be defined in terms of the path length matrix is provided, and new such measures are introduced. The sensitivity of the transmission of information to perturbations of the entries of the adjacency matrix is investigated.