1 Introduction

In network theory, the task of detecting groups of vertices that share a common feature is termed clustering or community detection. Formally, a cluster, or a community, is a subgraph whose similarity or internal connections are stronger than those with the rest of the graph (Fortunato 2010). In recent years there has been a surge of interest in the community structure of economic networks (Hajdu et al. 2019) and, specifically, of international trade (Barigozzi et al. 2011; Garlaschelli and Loffredo 2004; 2005; Li et al. 2003; Piccardi 2011; Serrano and Boguñá 2003; Serrano et al. 2007). The classical approach consists in finding sets of countries that are densely connected through preferential economic relationships. This phenomenon is typically represented by a directed and weighted network, where nodes are countries and weighted links represent the aggregate trade flows. This representation is known in the literature as the International Trade Network (ITN).

Under this perspective, it becomes important to map the input-output interrelations among the countries through an inspection of the communities, where two countries share the same community if they have a comparable intensity in the trade flows or if they have preferential trade flows.

International trade has been widely studied in the literature, which shows that its main characteristics have changed over time, with changes accelerating in recent decades. In particular, over the years, the composition of trade flows has changed, making countries even more deeply interconnected. The geographical distribution of trade has also varied, with an increasing role of the emerging countries, especially in Asia.

To detect the network structure, vertex centrality plays a key role. The idea of centrality is quite simple to grasp: a numerical score is assigned to each node of the network so that the higher the score, the more central the node in the network. The literature has highlighted the importance of being central in an economic network (see Varela et al. (2015), Blöchl et al. (2011), and Barbero and Zofío (2016)). In particular, centrality may be associated with countries that are the most important hubs of the ITN, even though they are not leading import or export countries (Blöchl et al. 2011; De Benedictis and Tajoli 2011). There are different metrics describing centrality, and it has been shown that different measures (degree, coreness, etc.) identify different influential nodes (Ferraz de Arruda et al. 2014). For instance, a node could be central if it is directly connected with many other nodes, if it has an intermediary role in communication, and so on. Indeed, there is no consensus on a univocal definition of network centrality, because each measure considers only one specific concept (see, e.g., Newman (2010)). Resorting to only one of them, however, discards a large amount of the available information. Related to centrality, the clustering coefficient is also an important index for measuring the interconnections within a community. This coefficient has been developed for weighted, unweighted, directed and undirected networks (see Wasserman and Faust (1994), Watts and Strogatz (1998), Barrat et al. (2004), Onnela et al. (2005), Clemente and Grassi (2018), and Fagiolo (2007)). In particular, Rotundo and Ausloos (2010) discuss the clustering coefficient in the presence of already established communities for directed networks, and Cerqueti et al. (2018) present a concept of clustering coefficient that also accounts for missing indirect links in the construction of triangles.
The association between communities and clustering coefficients is quite natural. Triangles are the easiest geometric visualization of communities, providing a picture of non-exclusive interactions among different agents. The relevance of this coefficient has been investigated also in the context of ITN (see, e.g., De Benedictis and Tajoli (2011), De Benedictis and Tajoli (2016), Fagiolo et al. (2010), and Cepeda-López et al. (2019)).

As stressed in Barigozzi et al. (2011), detecting the community structure of the ITN and how it correlates with country-specific variables and geography (e.g., distances between countries) is crucial from an international-trade perspective. Indeed, finding communities in the ITN means identifying clusters of countries that carry tightly interrelated trade linkages among them, while being relatively less interconnected with countries outside the cluster.

In this work, we provide a new methodology for clustering countries based on a multi-criteria assessment of several topological indicators of centrality. The method consists of two steps. In the first step, we rank countries in the ITN according to various centrality measures. In the second step, based on those rankings, we compute the similarities between countries and then apply a clustering algorithm based on the Clique Partition model.

More specifically, in the first step, and unlike classical methodologies, we consider all the most prominent centrality definitions proposed in the literature that are relevant to international trade. Rather than advocate the superiority of one of them, we aggregate this rich multi-criteria assessment by defining a proper measure of similarity/dissimilarity between nations using their ranking positions. Next, we group together countries that have common structural features in terms of those rankings. The main advantage of our proposal is that we do not focus on a single and specific indicator of centrality, nor do we produce a detailed ranking of countries. Rather, we are able to identify groups of countries that have similar structural properties in the ITN. A specific tool developed for our project is a new heuristic algorithm to find clusters, based on the Clique Partition model (Grötschel and Wakabayashi 1989; 1990; de Amorim et al. 1992). The Clique Partition model consists of partitioning the vertices of a graph into the smallest number of cliques. First, a measure of similarity/dissimilarity between units must be established. This measure can take both positive and negative values, depending on whether two units are similar or dissimilar. Then units must be partitioned into subsets so as to maximize the similarity among units in the same subset. This model has some advantages over the classical k-means or hierarchical models. First of all, the clique partition model requires neither that the number of clusters be fixed in advance (e.g. the parameter k) nor that the user arbitrarily analyse the chart of the hierarchical clusters. Rather, the number of clusters results from the optimization of an objective function. Moreover, outliers are not forced to be in a cluster, but they can form peculiar groups of a single element.
Finally, the principle of the method is that clusters are composed of mutually homogeneous data, while the k-means models first try to establish the clusters’ centres and then compose groups of units that are similar to those centres. Conversely, clique partitioning forms groups of similar units. An experimental comparison between the clique partition and other clustering methods can be found in Wang et al. (2008).

The paper is organized as follows. In Section 2, we recall the main literature related to network theory, the analysis of the ITN and the main solution methods for clique partitioning problems. In Section 3, we describe the methodological framework and the integer linear programming problem. In Section 3.2, we define the maximum clique partition problem as well as the algorithm applied for identifying the optimal solution. In Section 4, a numerical application is developed by using the paradigmatic case of the ITN. Conclusions follow in Section 5.

1.1 Novelty and Advantages of the Proposed Methodology

The classical meaning of community refers to the clustering of nodes on the basis of the intensity of the connections between them: the community structure maximizes the density or the intensity of the connections between nodes inside each cluster, while members of different clusters are as weakly connected as possible (Newman and Girvan 2004; Fortunato and Hric 2016). The efforts of the literature have focused on finding new methodologies to detect communities under specific conditions (e.g. large or overlapping data, node-attributed graphs, multilayer networks, and so on). Some methods are algorithm-based, such as hierarchical clustering or edge removal (Clauset et al. 2008). Others are based on the optimization of specific criteria over all possible network partitions. In this context, a well-known approach is the optimization of a modularity function according to Newman’s definition (2004).

We go one step beyond this idea by applying a graph partitioning method, namely clique partitioning, to the graph in which arcs are weighted by node similarities. For instance, in terms of centrality, nodes can be grouped together if they have strategic importance in transmitting information, or if they have similar power or control in the network. Moreover, our method is not limited to grouping nodes based on a single characteristic, but is able to consider more than one feature simultaneously. This aggregation is general enough to be applied to various frameworks.

We show an application to the ITN. In this context, the identification of communities of countries has been addressed, among others, by Piccardi and Tajoli (2012), Barigozzi et al. (2011) and Bartesaghi et al. (2020). In Piccardi and Tajoli (2012), the authors apply the classical maximum modularity criterion showing that the recognition of the mesoscale structure is increasingly difficult due to the growing complexity and globalization of the international economic interactions. The correlation between the world partition in communities obtained by a modularity criterion and geographical distances has been investigated also in Barigozzi et al. (2011). The authors, both at an aggregate level and at a number of commodity-specific levels, compare the two maximum modularity partitions of the input-output network and of the weighted network of the geographical closenesses. They find a high similarity between aggregate trade and geography-based communities, greater than, for instance, communities determined by regional trade agreements. They conclude that geographically-related factors explain the patterns of global trade more than political determinants. In Bartesaghi et al. (2020), the authors interpret the ITN as a metric space by using two different distance measures that overcome the limitations of the shortest-path distance. They highlight strong interconnections between countries and identify communities as clusters of close countries in terms of such distances, according to a varying threshold.

Our approach instead applies a modularity criterion not to the immediate network of economic exchanges between countries but to a network in which the connection between countries is represented by a measure of similarity in the role they play within the global framework. This similarity measure exploits indicators of different nature and, as a consequence, our results are less dependent on immediate factors that can strengthen or weaken the relationship between pairs of countries, such as geographical proximity, trade agreements, common language or traditional partnerships. By taking into account the relevance of countries in the network, the methodology proposed in this paper provides a different approach for identifying clusters. Indeed, the results obtained here may be used to highlight aspects of the hidden structure of the ITN that differ from those revealed by traditional community detection approaches. In particular, we aim at merging in the same community countries that have an analogous role in the network. As emphasized in the literature (see Cingolani et al. (2017)), to shed light on a country’s participation in global trade it is important to understand where the country is positioned in the network. Although there is a growing literature concerned with measures for assessing countries’ position, the main results typically show that rankings of countries vary according to the measures of centrality that are used. Our proposal instead aims at detecting communities of countries with similar relevance by aggregating several indicators and taking into account peculiarities and heterogeneity of different measures.

2 Related literature

In this section we briefly review the main literature related to network theory and international trade, as well as clique partitioning problems and the main solution methods.

Network theory has traditionally been used in sociology and political science to investigate international trade relations, being an effective tool for revealing the core-periphery structure of the countries or for studying the impact of globalization on the international trade structure (Snyder and Kick 1979; Smith and White 1992; Kim and Shin 2002). The topological and statistical properties of international trade, also in a time perspective, have been studied in depth in several works (see for instance, Serrano and Boguñá (2003), Garlaschelli et al. (2007), and Fagiolo et al. (2008)). More recently, complex networks have also been used to investigate economic and financial implications of world trade. For instance, Kali and Reyes (2007; 2010) study the country’s role in the ITN, deducing important implications in terms of economic growth and explaining the phenomenon of financial contagion. Both international trade and financial integration patterns are investigated in Schiavo et al. (2010). Another important issue is the identification of communities in the trade network. Barigozzi et al. (2011) study in depth the topology of the international trade multi-network, aiming at discovering its community structure. In Tzekina et al. (2008), the authors analyse the evolution of communities (“islands”): from two large trading communities, centred on UK and US, to a fairly heterogeneous “archipelago” of trade, which seems to reflect a phenomenon of globalization. Finally, dissimilarities between different layers of an international trade multiplex network have been studied in Zhang et al. (2017). The authors characterize each layer as a commodity network in a specific time period.

The definition of communities can be naturally associated with a partition in clusters, and one of the most important models of community detection is the clique partition. The presence of communities inside the network is revealed by the modularity index (see Newman and Girvan (2004) and Santiago and Lamb (2017)), which corresponds to the objective function of a clique partition model. By maximizing the partition modularity, one can determine the community structure of the network (Newman 2004; Clauset et al. 2004; Blondel et al. 2008; Danon et al. 2006; Aloise et al. 2010). The clique partition model, as a combinatorial approach to cluster qualitative data, had a methodological development independent of the problem of community detection, as it was introduced in Grötschel and Wakabayashi (1989), Grötschel and Wakabayashi (1990), de Amorim et al. (1992), and Pattillo et al. (2013), and its applications range over many different fields (see, for instance, Butenko and Wilhelm (2006)). It has been recognized as an NP-hard problem, implying that the exact solution cannot be computed in polynomial time, unless P=NP. In practice, exact methods can solve instances that do not exceed one hundred nodes (Mehrotra and Trick 1998; Aloise et al. 2010), so that the use of a heuristic procedure is necessary in our applications (Santiago and Lamb 2017; Chelouah and Siarry 2000).

3 The model

In this section, we describe our methodology for clustering countries on the basis of the similarity attributes.

A network is described by a graph G = (V,E) where V and E are respectively the set of n vertices and the set of m links (or edges). Two nodes are adjacent if there is a link (i,j) connecting them. The degree \(d_{i}\) of a node i is the number of links incident to it. If a weight \(w_{ij} > 0\) is associated with each link (i,j), a weighted graph G = (V,E,W) is obtained, where W is the set of weights. In general, both adjacency relationships between vertices of G and weights on the links are described by a nonnegative, real n-square matrix W. In the unweighted case, matrix W is simply the classical binary adjacency matrix A, of entries \(a_{ij}\), where \(a_{ij} = 1\) if (i,j) ∈ E, and 0 otherwise. Since we consider networks without loops, \(a_{ii} = 0\) (or \(w_{ii} = 0\)). The (i,j)-element of the k-th power of A is the number of walks of length k from i to j. The Laplacian matrix is defined as L = D − A, where D is the diagonal matrix having the vertex degrees on the diagonal entries.

A network is directed if each link is directed, that is an arc (i,j) ∈ E means that there is a link starting from i and ending in j. The in-degree \(d^{in}_{i}\) (out-degree \(d^{out}_{i}\)) of a node i is the number of arcs pointing towards (starting from) i. The degree \(d^{tot}_{i}\) of a vertex is then the sum of the in and out-degree. In the directed case, matrices A, for a binary network, and W, for a weighted network, are not symmetric.

3.1 Network attributes and rankings

We are interested in specific characteristics of the nodes, such as their centrality or their level of interconnection within the network. Since the network is weighted and directed, we need appropriate measures that take into account both weights and directions. Thus, according to the four-dimension classification of centrality indices in Brandes and Erlebach (2005), we focus on four classes of network indicators, each one computed using both incoming and outgoing links. These are in- and out-strength, in- and out-clustering, hub and authority, and Laplacian centrality.

The strength (in and out) is the natural extension to the weighted and directed case of the degree centrality. It counts both the number of ties and their intensity. Formally, for a node i, we have:

$$ s^{in}_{i}=(\textbf{A}^{T}\textbf{W})_{ii}=\textbf{W}^{T}_{i}\textbf{1} $$
(1)
$$ s^{out}_{i}=(\textbf{AW}^{T})_{ii}=\textbf{W}_{i}\textbf{1} $$
(2)

where \(\textbf{W}_{i}\) corresponds to the i-th row of the matrix W.

In particular, in our application, the in-strength \(s_{i}^{in}\) measures the total trade flows incoming to the country i, that is the import. The out-strength \(s_{i}^{out}\) measures the total trade flows outgoing from the country i, that is the export.
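As a minimal illustration of Eqs. (1) and (2), the sketch below computes in- and out-strength as column and row sums of a small weight matrix. The matrix `W` and the function names are hypothetical and chosen only for illustration; the values are not taken from the trade dataset.

```python
# Toy weighted, directed network: W[i][j] is the flow from node i to node j.
# Illustrative values only (assumption), not data from the ITN.
W = [
    [0.0, 5.0, 2.0],
    [1.0, 0.0, 0.0],
    [4.0, 3.0, 0.0],
]


def out_strength(W):
    """Row sums: total outgoing flow (exports) of each node, as in Eq. (2)."""
    return [sum(row) for row in W]


def in_strength(W):
    """Column sums: total incoming flow (imports) of each node, as in Eq. (1)."""
    n = len(W)
    return [sum(W[k][i] for k in range(n)) for i in range(n)]


s_out = out_strength(W)
s_in = in_strength(W)
```

Since every flow leaves one node and enters another, the two strength vectors always have the same total.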

The clustering coefficient measures the tendency of a node to be well interconnected with its neighbours. The local clustering coefficient of a node i counts the number of observed weighted directed triangles connected to i, divided by the number of all its potential unweighted directed triangles:

$$ c_{i}(\tilde{\mathbf{W}})=\frac{\frac{1}{2}\left[\tilde{\mathbf{W}}^{\left[\frac{1}{3}\right]}+(\tilde{\mathbf{W}}^{T})^{\left[\frac{1}{3}\right]}\right]^{3}_{ii}}{d^{tot}_{i}\left( d^{tot}_{i}-1\right)-2d^{\leftrightarrow}_{i}}, $$
(3)

where \(\tilde {\mathbf {W}}=[\tilde {w}_{ij}]_{i,j\in V}\) is the normalized weighted matrix whose elements are defined as \(\tilde {w}_{ij}=\frac {w_{ij}}{max(w_{ij})}\) and \(d^{\leftrightarrow }_{i}={\sum }_{j\neq {i}}a_{ij}a_{ji}\) is the degree of bilateral arcs between the node i and its adjacent nodes.

As pointed out in Clemente and Grassi (2018) and Fagiolo (2007), we have four types of directed triangles to which i could belong. They generate four types of clustering coefficients, that can be separately computed.

Formula (3) includes all the four coefficients described in Fagiolo (2007). Nevertheless, the country i is part of the in-type and out-type triangles, highlighting the presence/role of the node i in import/export between its neighbouring countries. Thus, in our analysis, in-clustering and out-clustering coefficients seem more appropriate in capturing the role of the node i in the exchanges between the closest countries, distinguishing between import and export:

$$ c^{in}_{i}(\tilde{\mathbf{W}})=\frac{\frac{1}{2}(\tilde{\mathbf{W}}^{T}\tilde{\mathbf{W}}^{2})_{ii}}{d^{in}_{i}\left( d^{in}_{i}-1\right)}, $$
(4)
$$ c^{out}_{i}(\tilde{\mathbf{W}})=\frac{\frac{1}{2}(\tilde{\mathbf{W}}^{2}\tilde{\mathbf{W}}^{T})_{ii}}{d^{out}_{i}\left( d^{out}_{i}-1\right)}. $$
(5)
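The following sketch evaluates Eqs. (4) and (5) literally as printed, including the factor 1/2 and the normalization of the weights by their maximum. The helper names are hypothetical, and returning 0 for nodes with fewer than two in-neighbours (or out-neighbours) is an assumption made here to avoid a division by zero; the text does not specify that case.

```python
def matmul(X, Y):
    """Plain product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(X):
    return [list(col) for col in zip(*X)]


def in_out_clustering(W):
    """Return (c_in, c_out) following Eqs. (4) and (5) as printed."""
    n = len(W)
    w_max = max(max(row) for row in W)
    Wn = [[w / w_max for w in row] for row in W]      # normalized weights
    Wt = transpose(Wn)
    W2 = matmul(Wn, Wn)
    num_in = matmul(Wt, W2)                           # W~^T W~^2
    num_out = matmul(W2, Wt)                          # W~^2 W~^T
    d_in = [sum(1 for k in range(n) if W[k][i] > 0) for i in range(n)]
    d_out = [sum(1 for k in range(n) if W[i][k] > 0) for i in range(n)]
    # Assumption: the coefficient is set to 0 when the denominator vanishes.
    c_in = [0.5 * num_in[i][i] / (d_in[i] * (d_in[i] - 1)) if d_in[i] > 1 else 0.0
            for i in range(n)]
    c_out = [0.5 * num_out[i][i] / (d_out[i] * (d_out[i] - 1)) if d_out[i] > 1 else 0.0
             for i in range(n)]
    return c_in, c_out
```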

In order to model the influence, or the prominence, of a country in a global scenario of trade flows, the eigenvector centrality is the most suitable measure. The generalization of this measure to directed networks allows two statuses to be associated with a node: authority and hubness. The idea arises in the context of web page search to rank the importance of a page (Kleinberg 1999). A web page is an authority if it is pointed to by many other pages. Hubs are pages that link to many authoritative pages. Formally, let \(a_{i}\) and \(h_{i}\) be the authority and hub scores of node i, respectively. Then, the following relations hold:

$$ a_{i}=(\textbf{W}^{T}\textbf{h})_{i} $$
(6)

and

$$ h_{i}=(\textbf{W}\textbf{a})_{i} $$
(7)

where the vectors a and h collect respectively authority and hub scores of all nodes.

By formulas (6) and (7), the definitions of hubs and authorities are characterized by a mutually reinforcing relationship: essentially, a good hub is a page that points to many good authorities; a good authority is a page that is pointed to by many good hubs. The use of these measures is motivated by their interpretation: on the one hand, authorities are central countries as they import in turn from central countries. On the other hand, hubs are central as they export towards central countries. To compute the scores (6) and (7), an iterative algorithm (HITS, Hyperlink-Induced Topic Search) is proposed in Kleinberg (1999). Starting with initial score vectors \(\mathbf{a}_{0}\) and \(\mathbf{h}_{0}\), the power iteration method converges to the authority vector \(\mathbf{a}^{*}\), the principal eigenvector of \(\mathbf{A}^{T}\mathbf{A}\), and to the hub vector \(\mathbf{h}^{*}\), the principal eigenvector of \(\mathbf{A}\mathbf{A}^{T}\).
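The mutual reinforcement in Eqs. (6) and (7) can be sketched as a simple power iteration. This is an illustrative implementation under stated assumptions, not the code used for the analysis: the function name `hits`, the fixed number of iterations and the Euclidean normalization at each step are choices made here.

```python
def normalize(v):
    """Scale a vector to unit Euclidean norm (left unchanged if it is all zeros)."""
    norm = sum(x * x for x in v) ** 0.5
    return [x / norm for x in v] if norm > 0 else v


def hits(W, iterations=100):
    """Alternate a = W^T h and h = W a, as in Eqs. (6) and (7)."""
    n = len(W)
    a = [1.0] * n
    h = [1.0] * n
    for _ in range(iterations):
        a = [sum(W[k][i] * h[k] for k in range(n)) for i in range(n)]  # Eq. (6)
        h = [sum(W[i][k] * a[k] for k in range(n)) for i in range(n)]  # Eq. (7)
        a = normalize(a)
        h = normalize(h)
    return a, h


# Hypothetical example: node 0 exports to nodes 1 and 2, which export nothing.
authority, hub = hits([[0.0, 1.0, 1.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
```

In this toy case node 0 is the only hub and nodes 1 and 2 share the authority score equally.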

The idea behind the Laplacian centrality is that the importance of a vertex i is related to the ability of the network to adapt itself to the deletion of the vertex, i.e. its resilience. The Laplacian centrality of a vertex i is reflected by the drop in the Laplacian energy of the network resulting from the deletion of i. According to Lazić (2006), the definition of the Laplacian energy is:

$$ E_{L}(G)=\sum\limits_{k}{\lambda_{k}^{2}} $$
(8)

where \(\lambda_{k}\) are the eigenvalues of the Laplacian L.

Therefore, letting \(G_{i}\) be the graph obtained by deleting the node i from G, the Laplacian centrality is (see Qi et al. (2012)):

$$ l_{i}=\frac{E_{L}(G)-E_{L}(G_{i})}{E_{L}(G)}=\frac{({\Delta} E)_{i}}{E_{L}(G)}. $$
(9)

where \(E_{L}(G)\) and \(E_{L}(G_{i})\) are the Laplacian energies computed on G and \(G_{i}\), respectively, and \(({\Delta} E)_{i}\) measures the effect on the Laplacian energy of the network of the removal of i. Since the denominator \(E_{L}(G)\) has the same value for all vertices, we focus on the numerator \(({\Delta} E)_{i}\), which is always nonnegative by the interlacing property of the eigenvalues of the Laplacian matrix (see Haemers (1995)). The Laplacian energy can be re-expressed in terms of strength (see Qi et al. (2012), Th. 1):

$$ E_{L}(G)=\sum\limits_{k}{s_{k}^{2}}+2\sum\limits_{k<j}w_{kj}^{2}. $$
(10)

Hence, the difference (ΔE)i is:

$$ ({\Delta} E)_{i}={s_{i}^{2}}+\sum\limits_{k\in N(i)}(w_{ki}^{2}+2s_{k}w_{ki}) $$
(11)

where N(i) is the set of neighbours of the node i. This expression allows the following interpretation of the Laplacian centrality of i: it depends (in a quadratic way) on the strength of i and on the strengths and link weights of the neighbours of i. As stressed in Qi et al. (2012) and Baruah and Bharali (2017), compared with other standard centrality measures proposed for weighted networks (e.g. strength or betweenness centrality), the Laplacian centrality is an intermediate measure between global and local characterization of the importance of a vertex. The generalization to the directed and weighted case follows, giving an expression for the weighted and directed Laplacian centrality (in and out) \(l_{i}^{in}\) and \(l_{i}^{out}\) derived from formula (11).
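The consistency between Eqs. (10) and (11) can be checked numerically on a small undirected weighted graph: the energy drop obtained by actually deleting a node should equal the closed-form expression. The sketch below covers only the undirected case; the function names are hypothetical, and the directed in/out variants mentioned above are not implemented.

```python
def laplacian_energy(W):
    """Eq. (10): sum of squared strengths plus twice the sum of squared weights."""
    n = len(W)
    strengths = [sum(row) for row in W]
    return (sum(s * s for s in strengths)
            + 2 * sum(W[k][j] ** 2 for k in range(n) for j in range(k + 1, n)))


def delete_node(W, i):
    """Weight matrix of the graph G_i obtained by removing node i."""
    return [[w for c, w in enumerate(row) if c != i]
            for r, row in enumerate(W) if r != i]


def energy_drop(W, i):
    """Eq. (11): closed-form drop in Laplacian energy when node i is removed."""
    n = len(W)
    strengths = [sum(row) for row in W]
    return strengths[i] ** 2 + sum(
        W[k][i] ** 2 + 2 * strengths[k] * W[k][i]
        for k in range(n) if W[k][i] > 0
    )


# Hypothetical symmetric example: a path 1 - 0 - 2 with weights 1 and 2.
W_path = [[0.0, 1.0, 2.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
drops = [energy_drop(W_path, i) for i in range(3)]
```

Dividing each drop by `laplacian_energy(W_path)` gives the normalized centrality of Eq. (9).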

In our analysis, we intend to aggregate different indicators. Indeed, as already stressed, each measure has peculiarities and characteristics that highlight various aspects of the exchange relations between countries.

This heterogeneity requires an approach that cannot be simply based on the direct comparison among extremely different measures.

Given that each index has its own unit of measurement and range of variation, we focus on the various country centrality rankings rather than on the absolute values. More specifically, we first calculate the country rankings according to all the indices, then we cluster countries according to their positions in those rankings. Indeed, each indicator induces a ranking which represents the structural importance of a single node in the network. The analysis of rankings allows us to compare more than one centrality simultaneously. The comparison is developed by computing a distance function between rankings. In particular, in this work we refer to the Minkowski distance, also known as the Lp-norm distance.

Let us order the scores of each node obtained for each centrality measure k and let \({r_{i}^{k}}\) be the position of the node i with respect to k. The Minkowski distance \(d(\textbf{r}_{i},\textbf{r}_{j})\) is

$$ d(\textbf{r}_{i},\textbf{r}_{j})=||{\textbf{r}_{i}}-{\textbf{r}_{j}}||_{p}=\left( \sum\limits_{k=1}^{K}\left| {r_{i}^{k}}-{r_{j}^{k}} \right|^{p}\right)^{1/p} $$
(12)

where \(\textbf{r}_{i}\) is the ranking vector of node i, K is the number of centrality measures considered and p is any real value such that p ≥ 1.

This distance measure is commonly used in the literature for computing the dissimilarity of objects described by numeric attributes. It is a generalized distance metric that includes others as special cases. In fact, although infinitely many measures can be obtained by varying the value of p, just three have gained importance (Manhattan distance for p = 1, Euclidean distance for p = 2 and Chebyshev distance for \(p\rightarrow \infty \)).

A remarkable feature of this distance is that it aggregates more than one attribute: it allows all the network indicators to be considered simultaneously, producing a global fictitious distance between the rankings of each pair of nodes. Furthermore, this distance allows several values of p to be explored in order to better highlight the general features of the analysed data (see de Amorim and Mirkin (2012) and Rudin (2009)). For instance, Rudin (2009) highlights how different configurations of data concentration can be captured by varying p, so that the Minkowski distance can be used for effectively tackling data analysis problems.

In our context, we use this distance to construct a complete network \(K_{n}\) having the same node set and weighted adjacency matrix Ω, whose entries are defined as:

$$ \omega_{ij}= \left\{ \begin{array}{ll} \frac{1}{1+d(\mathbf{r}_{i},\textbf{r}_{j})} & \text{for}\ i\neq j \\ 0 & \text{for}\ i = j \end{array} .\right. $$
(13)

These weights range in [0,1] and turn out to be effective in describing the similarities between countries. Indeed, the more two countries have a similar behaviour, the smaller is the distance and the higher is the weight.
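The chain from centrality scores to rankings, Minkowski distances (Eq. (12)) and similarity weights (Eq. (13)) can be sketched as follows. The function names and the toy scores are hypothetical, and breaking ties by node index is an assumption made here, since the text does not state how tied scores are ranked.

```python
def rank_positions(scores):
    """1-based position of each node when scores are sorted in descending order.

    Assumption: ties are broken by node index (stable sort).
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, node in enumerate(order, start=1):
        ranks[node] = position
    return ranks


def minkowski(r_i, r_j, p=2.0):
    """Eq. (12): L_p distance between two ranking vectors, for p >= 1."""
    return sum(abs(a - b) ** p for a, b in zip(r_i, r_j)) ** (1.0 / p)


def similarity_matrix(rank_vectors, p=2.0):
    """Eq. (13): omega_ij = 1 / (1 + d(r_i, r_j)) off the diagonal, 0 on it."""
    n = len(rank_vectors)
    return [[0.0 if i == j
             else 1.0 / (1.0 + minkowski(rank_vectors[i], rank_vectors[j], p))
             for j in range(n)] for i in range(n)]


# Two hypothetical centrality measures evaluated on three nodes.
measures = [[10.0, 30.0, 20.0], [5.0, 9.0, 7.0]]
rankings = [rank_positions(scores) for scores in measures]
# One ranking vector per node, collecting its position under every measure.
rank_vectors = [list(node_ranks) for node_ranks in zip(*rankings)]
omega = similarity_matrix(rank_vectors, p=1.0)
```

With p = 1 the first two nodes are at distance 4, so their similarity is 1/5, while nodes whose positions coincide under every measure would reach the maximum off-diagonal weight of 1.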

3.2 The Maximum Clique Partition Problem

The Clique Partition (CP) problem, as applied to our model, is defined as follows. The complete undirected graph G = (V,E) is given, with V = {1,…,n}. For each (i,j) ∈ E, gains/costs \(g_{ij}\) are defined, which can take both positive and negative values. In our application, positive values of \(g_{ij}\) are similarities, negative values are dissimilarities. Let \(P = \{V_{1},V_{2},\ldots ,V_{q}\}\) be a feasible partition of V and let \(\pi (V_{k}) = {\sum }_{i,j \in V_{k}} g_{ij}\) be the gains/costs sum of subset \(V_{k}\), for 1 ≤ k ≤ q. The CP problem consists of finding the node partition P that maximizes the objective function \(f(P) = {\sum }_{k = 1}^{q} \pi (V_{k})\).

It is important to note that the values \(g_{ij}\) must take both positive and negative signs: if they were all positive, as the \(\omega _{ij}\) are, there would be no incentive to separate units and the best partition would be the whole set P = {V}. Therefore, we calculate \(g_{ij}\) as the difference between \(\omega _{ij}\) (which are positive and bounded between 0 and 1) and benchmark values \(\omega ^{*}_{ij}\), representing a neutral threshold. Neutral thresholds are calculated as follows. Let \(\omega = {\sum }_{ij} \omega _{ij}\) be the total network similarity and let \(\omega _{i} = {\sum }_{j} \omega _{ij}\) be the sum of similarities attributed to unit i. The probability that a unit x of network similarity is allocated to node i is \(Pr[x \text{ incident to } i] = \omega _{i}/\omega \). If similarity has no structure, that is, it is independent of pairs (i,j) because data do not have clusters, then:

$$ \begin{array}{l} Pr[x \text{ incident to } i \cap x \text{ incident to } j] \\ \quad = Pr[x \text{ incident to } i]\times Pr[x \text{ incident to } j] = 2 \omega_{i} \omega_{j}/\omega^{2}. \end{array} $$

Then, if similarities are independent, the expected similarity between i and j should be \(\omega ^{*}_{ij} = \frac {\omega _{i} \omega _{j}}{\omega }\). So, we can calculate the gain/cost \(g_{ij}\) as the difference between the actual and the hypothetical similarity: \(g_{ij} = \omega _{ij} - \omega ^{*}_{ij}\). In this way we obtain values \(g_{ij}\) that are both positive and negative. The integer linear programming formulation of the Clique Partition is then:

$$ \max \sum\limits_{i > j}g_{ij} x_{ij} $$
(14)

subject to

$$ \left\{ \begin{array}{c} -x_{ij}+x_{ik}+x_{jk}\leq 1, \quad \forall i< j < k, \ \ i,j,k \in V \\ -x_{ik}+x_{jk}+x_{ij}\leq 1, \quad \forall i< j < k, \ \ i,j,k \in V \\ -x_{jk}+x_{ij}+x_{ik}\leq 1, \quad \forall i< j < k, \ \ i,j,k \in V \\ x_{ij} \in \{0,1\}, \quad i < j, \ \ i,j \in V \end{array} \right. $$

where \(x_{ij}\) is equal to 1 if nodes i and j are in the same cluster and 0 otherwise.
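For very small instances, the objective (14) can be maximized by enumerating every partition directly, which is useful for checking a heuristic. The sketch below first builds the gains \(g_{ij} = \omega _{ij} - \omega _{i}\omega _{j}/\omega \) and then performs this brute-force search. It is not an ILP solver and not the procedure used in the paper; the function names and the toy similarity matrix are assumptions. Representing a partition by cluster labels satisfies the triangle constraints automatically, because sharing a label is a transitive relation.

```python
def gains(omega):
    """g_ij = omega_ij - omega_i * omega_j / omega, with a zero diagonal."""
    n = len(omega)
    row = [sum(r) for r in omega]
    total = sum(row)
    return [[0.0 if i == j else omega[i][j] - row[i] * row[j] / total
             for j in range(n)] for i in range(n)]


def partitions(n):
    """Yield every partition of {0, ..., n-1} as a list of cluster labels."""
    def extend(labels):
        if len(labels) == n:
            yield labels[:]
            return
        # A node joins an existing cluster or opens the next new one.
        for c in range(max(labels, default=-1) + 2):
            labels.append(c)
            yield from extend(labels)
            labels.pop()
    yield from extend([])


def objective(g, labels):
    """Eq. (14): sum of g_ij over pairs i > j placed in the same cluster."""
    return sum(g[i][j] for i in range(len(g)) for j in range(i)
               if labels[i] == labels[j])


def exact_clique_partition(g):
    """Exhaustive search; feasible only for a handful of nodes."""
    best = max(partitions(len(g)), key=lambda labels: objective(g, labels))
    return best, objective(g, best)


# Hypothetical similarities: nodes {0, 1} and {2, 3} form two tight pairs.
omega = [[0.0, 0.9, 0.1, 0.1],
         [0.9, 0.0, 0.1, 0.1],
         [0.1, 0.1, 0.0, 0.9],
         [0.1, 0.1, 0.9, 0.0]]
g = gains(omega)
best_labels, best_value = exact_clique_partition(g)
```

The number of partitions grows as the Bell numbers, which is why this enumeration cannot replace a heuristic on a network of realistic size.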

We experienced very long computational times when we tried to solve it through Integer Linear Programming. Therefore, we implemented a heuristic procedure based on shrinking the vertices of the graph. Shrink is the subroutine by which we take two vertices, representing single units or clusters, and merge them to obtain a single cluster. Shrink is described in Algorithm 1. The input is a data structure \(G^{h} = \langle V^{h},g^{h},\pi ^{h} \rangle \), in which \(V^{h}\) is the active node set, each node representing a set of the partition, \(g^{h}\) are the shrunken costs, defined for every pair \(i,j \in V^{h}\), and \(\pi ^{h}\) are the clique costs, defined for every active node \(i \in V^{h}\). The output is a data structure \(G^{q} = \langle V^{q},g^{q},\pi ^{q} \rangle \) in which \(|V^{q}| = |V^{h}| - 1\). When we shrink \(i,j \in V^{q}\), we delete j from the active nodes, see Line 1, and the clique profit \({\pi _{i}^{h}}\) of i increases by the arc profit \(g^{h}_{ij}\), while all others remain the same, see Lines 2 and 3. In the next steps, the profit of i inherits the profits of j’s connections, see Lines 5–7.

[Algorithm 1: subroutine Shrink]

Subroutine Shrink is used to join nodes or clusters every time we find an improvement of the objective function, that is, when we find a pair (i,j) such that \(g_{ij}^{h} > 0\). The procedure is described in Algorithm 2. At the beginning, in Lines 1 and 2, the partition \(V^{q}\) is composed of subsets of one element and the profits π associated with them are null. Then, in the loop in Lines 3–9, the greatest profit \(g_{ij}\) is selected and, if it is positive, vertices (i,j) are shrunk. Otherwise, the algorithm stops. The objective function is calculated in Line 10.

[Algorithm 2: heuristic for the Clique Partition problem based on Shrink]
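The shrinking heuristic described above can be sketched in a few lines. This is a reconstruction from the prose only, not a transcription of Algorithms 1 and 2: the data layout, the tie-breaking rule and the way the objective is accumulated (adding the shrunken gain at every merge) are assumptions, and the gain matrix is assumed to be symmetric.

```python
def greedy_clique_partition(g):
    """Repeatedly merge the pair of active clusters with the largest positive gain."""
    n = len(g)
    gh = [row[:] for row in g]                # shrunken gains between active nodes
    members = {i: [i] for i in range(n)}      # active node -> original units it holds
    value = 0.0
    while len(members) > 1:
        active = sorted(members)
        best, i, j = max((gh[a][b], a, b)
                         for pos, a in enumerate(active)
                         for b in active[pos + 1:])
        if best <= 0:
            break                             # no merge improves the objective
        # Shrink: j is absorbed by i and leaves the active set.
        value += best
        members[i].extend(members.pop(j))
        # The merged node inherits the gains of j towards every other active node.
        for k in members:
            if k != i:
                gh[i][k] += gh[j][k]
                gh[k][i] = gh[i][k]
    return list(members.values()), value
```

Each merge adds the total gain between the two clusters being joined, so the accumulated value equals the sum of \(g_{ij}\) over all pairs that end up in the same cluster.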

We found that Algorithm 2 quickly calculates good-quality solutions. However, the selected partition may be suboptimal. Therefore, we implemented a version of the Neighborhood Search procedure proposed in Brusco and Köhn (2009). The procedure starts with a feasible partition P, in our case the one calculated through Algorithm 2. Then we select k vertices of V at random and try to relocate them to different clusters, searching for an improvement of the objective function. The procedure is repeated several times and for different values of k, until no improvement is found for many consecutive attempts. In our data, however, we found that most of the time the results of Algorithm 2 were not improved.
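A simplified random relocation search in the spirit of the procedure just described might look as follows. It is only a sketch and not the Brusco and Köhn (2009) algorithm itself: the stopping rule (`max_failures` consecutive non-improving trials), the fixed seed, the uniform choice of the target cluster and all names are assumptions introduced for illustration.

```python
import random


def objective(g, labels):
    """Sum of g_ij over pairs of units that share a cluster label."""
    n = len(g)
    return sum(g[i][j] for i in range(n) for j in range(i + 1, n)
               if labels[i] == labels[j])


def relocation_search(g, labels, k=1, max_failures=200, seed=0):
    """Move k random units (k <= number of units) and keep only improving moves."""
    rng = random.Random(seed)
    labels = labels[:]
    best = objective(g, labels)
    failures = 0
    while failures < max_failures:
        trial = labels[:]
        for v in rng.sample(range(len(g)), k):
            # Candidate targets: any existing cluster or a new singleton cluster.
            trial[v] = rng.choice(sorted(set(trial)) + [max(trial) + 1])
        value = objective(g, trial)
        if value > best + 1e-12:
            labels, best, failures = trial, value, 0
        else:
            failures += 1
    return labels, best
```

Because a trial is accepted only when it strictly improves the objective, the returned partition is never worse than the starting one.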

3.3 An Overview of the Ranking Aggregation/Clique Partitioning Procedure

The next pseudo-code (see Algorithm 3) summarizes the methodology that we are proposing:

Algorithm 3

In Step 1, we have K centrality measures, as defined in Section 3.1. For every measure k (k = 1,...,K), we obtain the ranking \(r^{k}\), whose element \({r_{i}^{k}}\) is the position of country i in the ranking according to measure k. In Step 2, we calculate the values \(\omega_{ij}\) according to Formula (13). In Step 3, we calculate the gains/costs needed to define the Clique Partition model explained in Section 3.2. Lastly, in Step 4, we apply Algorithm 2.
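As an illustration of Step 1, the sketch below turns the scores of each measure into ranking positions and then compares two countries through a generic Minkowski distance between their ranking vectors. The distance is only a stand-in: Formula (13) is not reproduced here, and the function names, the toy scores and the tie-breaking rule (ties keep the input order) are assumptions.

```python
def ranking(scores):
    """Map each country to its position (1 = highest score) under one measure."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {country: pos for pos, country in enumerate(ordered, start=1)}


def minkowski(u, v, p=2):
    """Generic Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1 / p)


# Hypothetical scores of three countries under K = 2 measures
measures = [
    {"A": 0.9, "B": 0.1, "C": 0.5},
    {"A": 7.0, "B": 9.0, "C": 1.0},
]
ranks = [ranking(m) for m in measures]
# ranking vector of each country across the K measures
vectors = {c: [r[c] for r in ranks] for c in measures[0]}
dist_ab = minkowski(vectors["A"], vectors["B"], p=1)
```

Here country A is ranked 1st and 2nd, country B 3rd and 1st, so their distance of order 1 is |1 - 3| + |2 - 1| = 3.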

4 Numerical Application

4.1 International Trade Network

In this section, we apply the model previously described in order to study the structure of the ITN. We focus on a World Trade dataset, made available by the Observatory of Economic Complexity.Footnote 6 In particular, the data come from the world trade database developed by the research and expertise centre on the world economy (CEPII) at a high level of product disaggregation. The original data are provided by the United Nations Statistical Division (UN Comtrade), and the dataset is then constructed by CEPII using an original procedure that reconciles the declarations of the exporter and the importer. This harmonization procedure considerably extends the number of countries for which trade data are available, as compared to the original dataset (see Gaulier and Zignago (2010)). In particular, we consider the last version published in 2017, which is based on the Harmonized Commodity Description and Coding System and provides aggregated bilateral values of exports for each pair of origin and destination countries. We focus on the aggregated data of the last available year, namely, 2014.

Hence, we construct a directed and weighted network (see Fig. 1), where each node is a country and weighted links represent the value of the products traded between each pair of countries, expressed in US dollars. This network is characterized by 220 countries and 26034 links. Its arc density is approximately 0.54, because on average each country has a large number of trade partners and the entire system is intensely connected. However, the network is far from being complete or, in other words, most countries do not trade with all other countries, but rather select their partners. Furthermore, world trade tends to be concentrated among a sub-group of countries, and a small percentage of the total number of flows accounts for a disproportionately large share of world trade. Indeed, on average, each country trades with more than half of the other countries in the world, but the top 10 countries export more than 50% of the total flow. To highlight the most relevant trades, we report in Fig. 2 the directed links whose weight is higher than 10 billion US dollars. The network is, in this case, characterized by 61 countries and 330 links. Additionally, key importers and exporters, classified in terms of strength, are displayed in Fig. 3. Differences between import and export rankings are remarkable. The United States, China, Japan, South Korea and some European countries (namely, France, Germany, Italy, the Netherlands and the United Kingdom) are the world’s largest importers and exporters. Russia and Canada instead display a top ranking in terms of volume of exports. In particular, Russia is characterized by a significant positive trade balance, equal to approximately 30% of its total exports.
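The density figure can be checked directly from the reported counts. The short sketch below assumes the usual definition of arc density for a directed network without self-loops, that is, the number of arcs divided by n(n - 1); the function name is ours.

```python
def arc_density(n_nodes, n_arcs):
    """Share of the n(n-1) possible directed arcs (no self-loops) that are present."""
    return n_arcs / (n_nodes * (n_nodes - 1))


full = arc_density(220, 26034)      # whole ITN
filtered = arc_density(61, 330)     # links above 10 billion US dollars
avg_partners = 26034 / 220          # mean number of export destinations per country
print(round(full, 2), round(filtered, 2), round(avg_partners, 1))
```

With these counts the full network has density 26034 / 48180 ≈ 0.54, the filtered network of Fig. 2 has density 330 / 3660 ≈ 0.09, and each country exports on average to about 118 of the 219 other countries, that is, to more than half of them.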

Fig. 1 World Trade Network of imports and exports at the end of 2014. Each node is a country and weighted directed links represent the value of the products traded between each pair of countries. The opacity of a link is proportional to the amount exchanged between the countries

Fig. 2 World Trade Network of imports and exports at the end of 2014. To highlight the most relevant trades, we report only directed links whose weights are higher than 10 billion US dollars. This amount approximately corresponds to the quantile at level 99.3% of the distribution of weights

Fig. 3 In- and out-strength of countries in the world trade network. Categories are based on the classes \([0, q_{50}]\), \((q_{50}, q_{75}]\), \((q_{75}, q_{95}]\), \((q_{95}, q_{100}]\), where \(q_{p}\) is the quantile at level p% of the in-strength and out-strength distribution, respectively

Furthermore, as expected, larger countries have more partners and generally account for a larger share of world trade. However, the relationship between economic size and number of partners is far from perfect, as indicated by the correlation, around 0.5, between the total value of (in or out) flows and the number of partners of each country.

4.2 Numerical Results and Discussion

As described in Section 3, we aggregate the centrality indexes through a community detection method. As a result, communities are determined by the Clique Partition model, whose input is a weighted network constructed from the original one, in which weights are determined by taking into account all the topological indicators in a multi-criteria approach. Four classes of network indicators are initially computed by using the network depicted in Fig. 1. We report in Fig. 4 the scatter plots of each pair of centrality measures and the Spearman’s rank-order correlation, in order to assess the strength and the direction of association between different ranked indicators. All the correlation coefficients are positive, because a country with a high volume of exports is also highly interconnected in the network. However, no pair of measures is fully correlated and, in many cases, the correlation is far from one. Also noteworthy is the strong dependence between the in and out versions of the same indicator. This is mainly explained by the similar patterns of imports and exports for several countries. Only hubs and authorities seem to emphasize the presence of specific exceptions. Table 1 reports the top ten countries according to the rankings of the four indicators used. The rankings reflect the results about the correlations and they exemplify the differences in the role of each country as importer or exporter.
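For readers who wish to reproduce this kind of comparison, the following sketch computes the Spearman rank-order correlation as the Pearson correlation of the ranks, with tied values receiving their mean rank. It is a generic, library-free illustration: the function names and the toy values are hypothetical and are not taken from the dataset.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values receive their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # extend the block [pos, end] over all values tied with values[order[pos]]
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical in- and out-strength values of four countries
in_strength = [5.0, 3.0, 9.0, 1.0]
out_strength = [4.0, 6.0, 8.0, 2.0]
rho = spearman(in_strength, out_strength)
```

In the toy example the two rankings differ only by a swap of two adjacent positions, which gives a correlation of 0.8: positive, but not equal to one.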

Fig. 4 On the left-hand side, Spearman correlation between each pair of measures. On the right-hand side, matrix of scatter plots between different indicators

Table 1 The top ten countries for each network indicator.

By applying the methodologyFootnote 7 described in Section 3, we obtain at the first step three communities, comprising 69, 87 and 64 countries, respectively. We display in Fig. 5 the communities initially identified by the algorithm. These three clusters are also well separated in terms of countries’ centrality: countries belonging to community 1 have an average ranking of 38, those in the second community have an average ranking of 113, and those in the lowest community have an average ranking of around 185. In other words, the most central countries are all included in the top community. The three clusters also differ markedly in intra-group density: the density of the subgraphs (of the original ITN) induced by the countries belonging to the three clusters is 0.97, 0.53 and 0.05, respectively. This behaviour can be partially explained by the fact that central countries tend to concentrate a high number of transactions among themselves.
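The intra-group density quoted above is the arc density of the subgraph induced by a community. A minimal sketch of this computation, assuming the network is stored as a collection of distinct directed arcs (the function name and the toy arcs are hypothetical), is the following.

```python
def induced_density(arcs, community):
    """Arc density of the directed subgraph induced by the given set of countries.

    arcs: iterable of distinct (origin, destination) pairs
    """
    members = set(community)
    n = len(members)
    if n < 2:
        return 0.0                      # density is not meaningful for one node
    inside = sum(1 for u, v in arcs
                 if u != v and u in members and v in members)
    return inside / (n * (n - 1))


# Toy directed network (hypothetical arcs)
arcs = {("A", "B"), ("B", "A"), ("A", "C"), ("C", "D")}
dens = induced_density(arcs, {"A", "B", "C"})
```

In the toy network, three of the six possible arcs among A, B and C are present, so the induced density is 0.5; a value of 1 would correspond to a complete subgraph in which every country trades with every other one in both directions.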

Fig. 5 Clusters of countries identified at the first step by the community detection algorithm. The communities are ordered in terms of average ranking

Since in several contexts this initial division could be too coarse, we can refine the procedure in order to reduce the heterogeneity in each group. To this end, at the subsequent step, we consider the centrality rankings of the countries of each group separately and apply the proposed community detection method to that group alone. Specifically, at step 2 we apply the proposed algorithm within each community detected at the previous step. In other words, at this step the algorithm takes into account how a specific country is ranked with respect to the other countries of the same subgroup, on the basis of the centrality indicators computed on the whole network. The position of a country within its subgroup may change, but the global ranking remains the original one. For instance, community 1, characterized by 69 countries, splits into two groups of 32 and 37 countries, respectively. The two groups obtained have an average ranking of 19 and 55. The procedure is repeated in a similar way for the other two communities identified at step 1, resulting in 8 communities at step 2 (see the dendrogram in Fig. 6 and the top left-hand side of Fig. 7).

Fig. 6 Dendrogram that illustrates the arrangement of clusters by applying the algorithm at four different levels. Communities are ordered in terms of average ranking

Fig. 7 Clusters of countries identified at the second, third and fourth step, respectively, by the community detection algorithm. The communities are ordered in terms of average ranking

Further reductions of the heterogeneity in each cluster are of course possible by repeating this process at the next steps, so that, in general, a stopping criterion is needed. A possible one consists in looking at the volatility of the ranking inside each cluster. If we focus on the communities with a larger standard deviation, we tend to produce a more refined breakdown among low-ranking countries. Vice versa, looking at a measure of relative volatility, such as the coefficient of variation (CV), we obtain a finer decomposition of the top-ranking clusters. Here we follow this second approach and, at each step, we further divide a community only if the CV of the countries’ average rankings is higher than 7.5%.
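The difference between the two criteria can be seen on a small example. The sketch below is illustrative only: the function names, the use of the population standard deviation and the toy rankings are assumptions, while the 7.5% threshold is the one stated above.

```python
from statistics import mean, pstdev


def coefficient_of_variation(avg_rankings):
    """Standard deviation of the average rankings divided by their mean."""
    return pstdev(avg_rankings) / mean(avg_rankings)


def should_split(avg_rankings, threshold=0.075):
    """Split a community further only if its CV exceeds the threshold (7.5%)."""
    return len(avg_rankings) > 1 and coefficient_of_variation(avg_rankings) > threshold


# Two hypothetical communities with the same standard deviation of rankings
top_cluster = [10.0, 20.0, 30.0]        # top-ranking countries
low_cluster = [180.0, 190.0, 200.0]     # low-ranking countries
split_top = should_split(top_cluster)
split_low = should_split(low_cluster)
```

Both toy communities have the same standard deviation (about 8.2), but the CV is about 41% for the top-ranking community and about 4% for the low-ranking one, so only the former would be divided further. This is why a CV-based rule yields a finer breakdown of the top of the ranking.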

The complete structure of the various division steps is represented by the dendrogram in Fig. 6. We notice that the number of communities increases at each step, leading to 22 communities at step 4. As expected, the criterion based on the CV leads to a more granular breakdown for clusters characterized by a higher average ranking. In this way, we are able to classify key countries in different clusters. In Fig. 7 we report the subnetworks induced by the clusters. The analysis confirms the tendency of top communities to show a higher intra-group density. For instance, the top community at step 3 and the three highest-ranking communities at step 4 are complete, that is, all central countries trade with each other. However, the relationship between ranking and intra-group density is not monotonic. For instance, at step 2 community 4 has a higher average ranking than community 5 (124 against 128), but a significantly lower intra-group density (0.05 against 0.58). This peculiar behaviour can be explained by the composition of the groups.Footnote 8 Indeed, we are grouping countries on the basis of the similarity of their central role in the network, instead of using preferential economic relationships.

It is worth comparing our results with a well-known country-classification method based on the Economic Complexity Index (ECI). This index, introduced by Hausmann et al. (2014), makes it possible to rank countries in the ITN according to the diversification of their export flows, which reflects the amount of knowledge that drives their growth. The higher the ECI, the more advanced and diversified the economy. In particular, countries whose economic complexity is greater than expected (on the basis of their global income) tend to grow faster than rich countries with a low ECI. In this perspective, the ECI represents a suitable tool for comparing countries in the ITN independently of their total output, and it provides an independent measure of similarity. For instance, in Table 2, we list the values of the ECI for the countries in the top four clusters detected. As shown in Table 3, the mean value of the index in each cluster is positively correlated with the ranking of the cluster in the final partition found at step 4. However, some exceptions are noticeable. For instance, China, in cluster 1, is characterised by a lower ECI than some countries in cluster 2 (e.g. UK and Italy) because of a lower diversification of exported commodities. Indeed, its wealth comes from a more homogeneous set of assets than that of the UK and Italy, which can express a wider diversification in their total output. This could explain why the standard deviation inside each of our communities is significantly high.

Table 2 Composition of top four clusters (in terms of average ranking) derived at step 4. Last column displays the ECI for each country
Table 3 Mean and standard deviation of ECI inside each of the four top clusters

Now, we focus on the countries’ role within the network. As shown in Fig. 8, the initial breakdown in communities gives a general picture of the relevance of different macro-regions in the whole trade network. Indeed, the top cluster, characterized by 69 countries at step 1, includes all the most developed European countries,Footnote 9 the largest economies in Asia and the Middle East, several countries in South America, Canada, Mexico, the USA, Australia and New Zealand. Furthermore, Algeria, Angola, Egypt, Morocco, Nigeria and South Africa are included for the African continent. Except for some small countries, this community includes all the advanced economies identified in the World Economic Outlook (WEO) by the International Monetary Fund (IMF)Footnote 10 and the emerging economies identified by the IMF and by other analystsFootnote 11.

Fig. 8 Structure of communities at different steps. Darker colours are associated with communities with a higher average ranking. The number of communities is respectively equal to 3, 8, 16, 22

At the end of the procedure, we obtain that the most central group is composed of China, Germany, Japan and the United States. These countries indeed move the highest volumes of trade (e.g., see the ranking of in- and out-strength in Table 1) and, at the same time, they also show the highest levels of interconnection.

In the second group, we have countries which either are positioned at a slightly lower level (such as GBR, FRA, ITA and NLD) or stand out for one specific indicator but, on average, show a less relevant role in the network. For instance, Canada has the second position in terms of hubs centrality (see Table 1), but shows an average ranking around 14, because of a lower clustering. This is in line with its low value of the ECI.

It is worth briefly comparing our results with those obtained by other community detection methods on the same network (see Barigozzi et al. (2011), Piccardi and Tajoli (2012) and Bartesaghi et al. (2020)). In particular, in Piccardi and Tajoli (2012), both directed and undirected networks have been tested without significant differences. In Bartesaghi et al. (2020), the authors follow an approach based on the maximisation of a specific quality function defined for general metric spaces. A quantitative correlation between the world partition in communities obtained by a modularity criterion and geographical distances has been investigated in Barigozzi et al. (2011). A common point of these alternative approaches is that the applied methodologies focus on the strength of countries’ relationships in order to group together countries that trade with each other. As a consequence, a common result is that geographical proximity still matters for international trade, jointly with trade agreements, common language or religion, and traditional partnerships. In all cases, a large relevant community including China and North America is observed.

As described in Section 1.1, the methodology proposed in this paper follows a different path, identifying clusters on the basis of the relevance of countries in the network. Indeed, the results place in the same community countries that have an analogous role in the network. Hence, it is interesting to compare them with papers that study how countries are positioned in the ITN. In this field, the main approaches in the literature are based on the application of alternative centrality measures, and the main results show how different centrality measures, catching alternative aspects of the network structure, can provide a different ranking (see, e.g., Cingolani et al. (2017)). In this context, the main advantage of our approach is that we jointly take into account several indicators, considering the peculiarities and the heterogeneity of the different measures, and we group together countries with a similar role according to the considered features. Comparing the results, we observe that the four countries that belong to the most central group (China, Germany, Japan and the United States) are on average also in the top positions of the economic sectors explored in Cingolani et al. (2017). Similarly, we found in our second group countries (such as Mexico, Canada and South Korea) that in Cingolani et al. (2017) appear to play an intermediary role, having connections with both focal countries and less central ones. To conclude, it seems that the proposed approach is able to catch different elements of the network structure, providing, at the same time, a univocal classification of countries in terms of their relevance.

5 Conclusions

Community detection is a widely discussed topic in network theory. The analysis of the mesoscale structure of a real network sheds light on its inner structure. This plays an even more significant role when applied to the ITN, in view of its multiple implications. This work aimed at clustering countries according to similarities in their role in the global market, rather than using only the preferential channels of exchange between them. Centrality measures are by now a classical tool to rank such a role in the network. In particular, each centrality measure conveys different information about a node’s position. We proposed a way to collect all the information content, represented by suitable centrality measures, through a distance measure between countries.

Among all possible similarity-dissimilarity distances, the Minkowski distance makes it possible to capture different data distributions, depending on a specific parameter p. In this way, we constructed a weighted complete network where nodes are countries and weighted links are related to the similarities between them. By means of this similarity network, we set up a classical Clique Partitioning problem to identify the community structure that maximizes the modularity. We proposed here a new algorithm which, loosely speaking, merges different nodes or clusters and shrinks the network, so that a solution is obtained in polynomial time.

When applied to the ITN in the year 2014, the solution obtained shows three big clusters, more or less equivalent in size but very different in terms of intra-cluster density. This is easily interpreted, since the rate of exchanges between top countries is far more intense than between poor ones. We iterated the same methodology on each cluster, in order to reduce the internal heterogeneity. This makes it possible to build a dendrogram that branches at each step.

The leading economies in the world turn out to be those of China, Japan, the USA and Germany. This is not unexpected, but our proposal shows that these countries also play a very similar role in the world economy on the basis of the set of selected indicators, making our approach suitable for other network applications.