The new challenges of multiplex networks: measures and models

What do societies, the Internet, and the human brain have in common? The immediate answer might be "not that much", but in reality they are all examples of complex relational systems, whose emerging behaviours are largely determined by the non-trivial networks of interactions among their constituents, namely individuals, computers, or neurons. In the last two decades, network scientists have proposed models of increasing complexity to better understand real-world systems. Only recently have we realised that multiplexity, i.e. the coexistence of several types of interactions among the constituents of a complex system, is responsible for substantial qualitative and quantitative differences in the type and variety of behaviours that a complex system can exhibit. As a consequence, multilayer and multiplex networks have become a hot topic in complexity science. Here we provide an overview of some of the measures proposed so far to characterise the structure of multiplex networks, and a selection of models aiming at reproducing those structural properties and at quantifying their statistical significance. Focusing on a subset of relevant topics, this brief review is a fairly comprehensive introduction to the most basic tools for the analysis of multiplex networks observed in the real world. The wide applicability of multiplex networks as a framework to model complex systems in different fields, from biology to social sciences, and the colloquial tone of the paper will make it an interesting read for researchers working on both theoretical and experimental analysis of networked systems.


I. INTRODUCTION
One of the most intriguing characteristics of complex systems is that most of the collective behaviours they exhibit cannot be predicted from the knowledge of the properties of their elementary constituents. Indeed, in the last two decades network science has shown that in many cases, from biology to economics, the structure of the interactions among the constituents of the system plays a fundamental role in shaping the emergence of complex behaviours, much more important than the role played by the specific properties of the single units of the system [1][2][3][4]. Surprisingly, systems as diverse as social networks, transportation systems, cities, and the human brain were shown to share a significant number of features and a comparable structure of interactions [5][6][7][8][9][10].
Recently, the availability of new data sets, the rediscovery of old ones and the access to more powerful computers have highlighted the necessity to develop a new framework to represent networks whose units interact through more than one type of relation. These systems are usually called multiplex networks, and are characterised by the fact that all the connections of a given type are embedded into a distinct layer. A full description of early research on this topic can be found in [11][12][13]. This Article provides an informal, yet comprehensive, handbook for the experimental investigation of systems which can be described as multiplex networks. In the first part we provide an overview of the most basic measures to characterise the structure of multiplex networks, focusing on the properties of nodes, edges, and layers. In the second part we review a few models which can be used to reproduce the empirical patterns observed in real-world multi-layer systems, or to assess the statistical significance of such patterns.

II. MEASURES FOR MULTIPLEX NETWORKS
We consider a multiplex network $\mathcal{M}$ consisting of $N$ nodes and $M$ different types of relations, represented by $M$ graphs, or layers. We can fully describe the structure of the system by considering the set of adjacency matrices $\mathcal{M} \equiv \mathcal{A} = \{A^{[1]}, \ldots, A^{[M]}\}$, (1) where $A^{[\alpha]} = \{a^{[\alpha]}_{ij}\}$, with $a^{[\alpha]}_{ij} = 1$ if $i$ and $j$ share a bond of type $\alpha$ and $a^{[\alpha]}_{ij} = 0$ otherwise [17]. When the connections among nodes are weighted, the system can be described by a set of weighted adjacency matrices $\mathcal{W} = \{W^{[1]}, \ldots, W^{[M]}\}$, where $W^{[\alpha]} = \{w^{[\alpha]}_{ij}\}$ and $w^{[\alpha]}_{ij}$ is the weight of the link between nodes $i$ and $j$ at layer $\alpha$ [18]. This formulation implicitly assumes that each node $i$ consists of $M$ replicas, one at each layer, and that a link can only connect two replicas lying on the same layer. Consequently, although both $\mathcal{A} = \{a^{[\alpha]}_{ij}\}$ and $\mathcal{W} = \{w^{[\alpha]}_{ij}\}$ can be considered as generic order-3 tensors, it is important to stress that node $i$ on layer $\alpha$ and node $i$ on layer $\beta$ effectively represent the same unit of the system and not two different ones. In other words, the replicas of the same node are identified across layers. Social systems can be naturally cast within this framework, where different layers can represent for instance different interaction channels among the same nodes (e.g., face-to-face communication, email exchange, online chat, etc.), but the different replicas are just a mathematical representation of the same individual in each of the $M$ contexts.
However, there are cases in which there exists some sort of communication or flow between the replicas of the same node at different layers. A typical example is that of multimodal transportation systems, where nodes are locations and layers represent different transport modalities, e.g. bus, underground, trains, etc. In this case, an accurate modelling of the system has to take into account inter-layer transitions between the replicas of the same node at different layers, and it is more convenient to model the structure of the system through an order-4 tensor $M^{j\beta}_{i\alpha}$ [14]. This formulation makes explicit (and adjustable) the relative importance of intra-layer and inter-layer connections [15,16]. Unless specified otherwise, in the following we will mostly use the simpler formulation based on order-3 tensors and given in Eq. 1.

A. Node properties
Unlike the traditional single-layer approach, where node properties are described by scalar variables, node features in multiplex networks are naturally described in vectorial terms. As an example, for each node $i$ we consider its total number of connections, or degree, at layer $\alpha$, i.e. $k^{[\alpha]}_i = \sum_j a^{[\alpha]}_{ij}$, and the multilayer degree $\mathbf{k}_i = \{k^{[1]}_i, \ldots, k^{[M]}_i\}$. A crucial piece of empirical evidence is that in many multiplex networks not all nodes have connections at all layers. As a consequence, a node $i$ is defined as active on a layer $\alpha$ if it is connected to at least one other node at that layer, i.e. if $k^{[\alpha]}_i > 0$. The activity pattern of each node can be compactly stored into the node-activity vector $\mathbf{b}_i = \{b^{[1]}_i, \ldots, b^{[M]}_i\}$, where $b^{[\alpha]}_i = 1$ if node $i$ is active on layer $\alpha$ and $b^{[\alpha]}_i = 0$ otherwise. The node activity $B_i = \sum_{\alpha} b^{[\alpha]}_i$ represents the number of layers in which node $i$ is active, with $0 \le B_i \le M$ [19]. It has been found that most real-world multiplex networks are characterised by heterogeneous distributions of node activity [19], and it has been shown that such heterogeneity might be responsible for the increased fragility of multiplex networks to random failures [20].
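The vectorial node properties just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the multiplex is stored as a list of dense adjacency matrices (one per layer), the toy edge list is invented, and the function names `degree_vector`, `activity_vector` and `node_activity` are hypothetical rather than taken from any library.

```python
# Toy multiplex with M = 3 layers and N = 4 nodes (undirected, unweighted).
# layers[alpha][i][j] plays the role of a_ij^[alpha]; the values are made up.
layers = [
    [[0, 1, 1, 0], [1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]],
    [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0], [1, 0, 0, 0]],
]


def degree_vector(layers, i):
    """Multilayer degree of node i: one degree per layer."""
    return [sum(layer[i]) for layer in layers]


def activity_vector(layers, i):
    """Node-activity vector: 1 on layers where node i has at least one edge."""
    return [1 if k > 0 else 0 for k in degree_vector(layers, i)]


def node_activity(layers, i):
    """B_i: number of layers on which node i is active."""
    return sum(activity_vector(layers, i))


for node in range(4):
    print(node, degree_vector(layers, node), node_activity(layers, node))
```

Dense nested lists keep the sketch dependency-free; for large systems a sparse representation per layer would be the more natural choice.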
Given a generic vectorial property $\boldsymbol{\xi}_i = \{\xi^{[1]}_i, \ldots, \xi^{[M]}_i\}$, it is important to be able to compress the information into meaningful scalar descriptors, especially for systems composed of a large number of layers. A typical way to approach this problem is to consider the first and the second moment of the vector $\boldsymbol{\xi}_i$, accounting for its mean value $\mu(\boldsymbol{\xi}_i)$ (or, analogously, the sum of its components) and its variance $\sigma^2(\boldsymbol{\xi}_i)$, or related quantities. In the particular case of the degree, the total number of connections of node $i$ is usually called total or overlapping degree [17], $o_i = \sum_{\alpha=1}^{M} k^{[\alpha]}_i$, while the heterogeneity of the number of neighbours of node $i$ across the layers can be measured through the multiplex participation coefficient [17] $P_i = \frac{M}{M-1}\left[1 - \sum_{\alpha=1}^{M}\left(\frac{k^{[\alpha]}_i}{o_i}\right)^2\right]$, where $P_i = 1$ when the links incident on node $i$ are equally distributed across the layers, and $P_i = 0$ when a node is only active on one layer. We note that similar information about the heterogeneity of the distribution of a node's connections across layers is provided by the Shannon entropy of the degree vector [17], $H_i = -\sum_{\alpha=1}^{M} \frac{k^{[\alpha]}_i}{o_i} \ln\left(\frac{k^{[\alpha]}_i}{o_i}\right)$. The pair of variables $(P_i, o_i)$ can be used to classify nodes via the so-called multiplex cartography [17], efficiently distinguishing multiplex hubs (high $o_i$ and high $P_i$), focused hubs (high $o_i$ and low $P_i$), multiplex leaves (low $o_i$ and high $P_i$) and focused leaves (low $o_i$ and low $P_i$).
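As a minimal sketch of these scalar descriptors, the snippet below assumes the usual definitions: overlapping degree $o_i = \sum_\alpha k^{[\alpha]}_i$, participation coefficient $P_i = \frac{M}{M-1}[1 - \sum_\alpha (k^{[\alpha]}_i/o_i)^2]$, and Shannon entropy of the normalised degree vector. The function names and the conventions for degenerate inputs (isolated nodes, a single layer) are assumptions made for the example.

```python
import math


def overlapping_degree(k):
    """Total (overlapping) degree: sum of the degrees across layers."""
    return sum(k)


def participation_coefficient(k):
    """Multiplex participation coefficient of a node with degree vector k."""
    M = len(k)
    o = sum(k)
    if M < 2 or o == 0:
        return 0.0  # assumed convention for degenerate cases
    return M / (M - 1) * (1.0 - sum((x / o) ** 2 for x in k))


def degree_entropy(k):
    """Shannon entropy of the distribution of a node's links across layers."""
    o = sum(k)
    return -sum((x / o) * math.log(x / o) for x in k if x > 0)


print(participation_coefficient([3, 3, 3]))  # links spread evenly
print(participation_coefficient([5, 0, 0]))  # links on a single layer
print(degree_entropy([2, 2]))
```

A node with its links spread evenly over all layers gets $P_i = 1$, while one concentrated on a single layer gets $P_i = 0$, matching the two limiting cases quoted in the text.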
A remarkable property of real-world networks is the tendency of nodes to form triangles, a phenomenon usually known as transitivity. In single-layer networks, the abundance of triangles is typically measured through the average clustering coefficient $C$, where $C = \frac{1}{N}\sum_{i=1}^{N} C_i$ and $C_i$ accounts for the fraction of triads centred on node $i$ which are closed into triangles. In multiplex networks, triads and triangles can effectively extend over more than one layer. We define an $m$-triad ($m$-triangle) as a triad (triangle) which uses edges from $m$ different layers. It is possible to define two multiplex clustering coefficients to quantify the added value provided to transitivity by the layered structure [17]. For each node $i$, the first coefficient $C_{i,1}$ is defined as the ratio between the number of 2-triangles with a vertex in $i$ and the number of 1-triads centred in $i$. In formulas: $C_{i,1} = \frac{\sum_{\alpha}\sum_{\alpha' \neq \alpha}\sum_{j \neq i, m \neq i} a^{[\alpha]}_{ij} a^{[\alpha']}_{jm} a^{[\alpha]}_{mi}}{(M-1)\sum_{\alpha}\sum_{j \neq i, m \neq i} a^{[\alpha]}_{ij} a^{[\alpha]}_{mi}}$. The second multiplex clustering coefficient $C_{i,2}$ is instead defined as the ratio between the number of 3-triangles with node $i$ as a vertex, and the number of 2-triads centred in $i$. In formulas: $C_{i,2} = \frac{\sum_{\alpha}\sum_{\alpha' \neq \alpha}\sum_{\alpha'' \neq \alpha, \alpha'}\sum_{j \neq i, m \neq i} a^{[\alpha]}_{ij} a^{[\alpha'']}_{jm} a^{[\alpha']}_{mi}}{(M-2)\sum_{\alpha}\sum_{\alpha' \neq \alpha}\sum_{j \neq i, m \neq i} a^{[\alpha]}_{ij} a^{[\alpha']}_{mi}}$. These two measures are defined respectively for $M \ge 2$ and $M \ge 3$, and are a natural generalisation of the clustering coefficient to the case of multiplex networks. A related generalisation of the clustering coefficient, based on the order-4 tensorial formulation for multiplex networks, has been suggested in Ref. [21].
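The first multiplex clustering coefficient can be illustrated with a brute-force count. The sketch below assumes the normalisation $C_{i,1} = (\text{number of 2-triangles}) / [(M-1) \times (\text{number of 1-triads})]$, with both counts taken over ordered pairs of distinct neighbours; the function name and the toy network are invented for the example, and the cubic loop is meant for small graphs only.

```python
def multiplex_clustering_1(layers, i):
    """Ratio of 2-triangles through node i to (M - 1) times its 1-triads."""
    M = len(layers)
    N = len(layers[0])
    triads = 0
    triangles = 0
    for a in range(M):
        A = layers[a]
        for j in range(N):
            for m in range(N):
                if j == i or m == i or j == m:
                    continue
                if A[i][j] and A[m][i]:
                    # 1-triad: both edges i-j and i-m lie on layer a
                    triads += 1
                    # closed into a 2-triangle by an edge j-m on another layer
                    triangles += sum(
                        1 for b in range(M) if b != a and layers[b][j][m]
                    )
    if triads == 0:
        return 0.0  # assumed convention when node i centres no 1-triad
    return triangles / ((M - 1) * triads)


# Layer 0 holds the edges 0-1 and 0-2, layer 1 holds the closing edge 1-2.
layers = [
    [[0, 1, 1], [1, 0, 0], [1, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
]
print(multiplex_clustering_1(layers, 0))
```

In this toy case the only triad centred on node 0 is closed by an edge on the other layer, so the coefficient of node 0 is 1.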
Another characteristic property of real-world networks is the presence of heterogeneity in the relative importance of nodes, as measured by different notions of node centrality [3]. A number of different approaches have been suggested to define and compute the centrality of a node in a multiplex network. A first possibility consists in defining the multiplex centrality as a combination of the centrality scores of each node at the different layers. For instance, starting from the centrality vector of node $i$, $\mathbf{c}_i = \{c^{[1]}_i, \ldots, c^{[M]}_i\}$, one can try to condense the information into a single scalar variable, as is normally done for the degree. However, computing averages of centrality scores across layers is not always meaningful. One reason is that in general a node can play different roles on different layers, and averaging over layers will only level down such heterogeneities.
The presence of more than one layer allows one to define new genuinely multiplex centrality measures in which the role of a node explicitly depends on the structure of the multiplex at all layers. For instance, the authors of Ref. [23] suggested computing the eigenvector centrality of nodes on each layer $\alpha$ as the normalised eigenvector relative to the largest eigenvalue of $\sum_{\beta=1}^{M} i^{[\alpha,\beta]} A^{[\beta]}$, where $I = \{i^{[\alpha,\beta]}\}$ is a given influence matrix which determines how the centrality of layer $\alpha$ depends on the structure of layer $\beta$. In Ref. [17], instead, the authors studied the contributions of the different layers to the centrality of the nodes by varying the coefficients $i^{[\alpha]}$, $\alpha = 1, \ldots, M$, of the matrix $\sum_{\alpha=1}^{M} i^{[\alpha]} A^{[\alpha]}$, with $i^{[\alpha]} \ge 0$ and $\sum_{\alpha} i^{[\alpha]} = 1$, which is a convex combination of the adjacency matrices of the layers.
An entire class of node centrality measures can be defined by using the properties of random walks on multiplex networks [25,26]. A particularly interesting example is that of the multiplex PageRank centrality proposed in Ref. [24]. The authors of Ref. [24] considered the case of a two-layer multiplex network and defined the multiplex PageRank of the nodes in layer $\alpha = 2$ as a function of the PageRank scores of the nodes in layer $\alpha = 1$. The main idea of this genuinely multiplex measure is that, especially in social systems, nodes can leverage their centrality in one context, such as personal relationships (represented by layer 1), to gain centrality in another context, e.g. professional relationships (represented by layer 2).
Real-world networks often exhibit the small-world property, meaning that the typical distance between any pair of nodes in the system scales logarithmically with the total number of nodes. An important observation is that not all the nodes of a system are equally important in mediating paths between other nodes, which is the main idea behind the concept of node betweenness [3]. In a multiplex system, the reachability of a node might significantly depend on the interplay between different layers. The added value introduced by multiplexity can be measured through the interdependence [22,48] $\lambda_i = \frac{1}{N-1}\sum_{j \neq i} \frac{\psi_{ij}}{\sigma_{ij}}$, where $\psi_{ij}$ accounts for the number of shortest paths between $i$ and $j$ that use edges in more than one layer, and $\sigma_{ij}$ is the total number of shortest paths between $i$ and $j$. The quantity $\lambda_i$ takes values in the interval $[0, 1]$, with larger values corresponding to a higher advantage for the reachability of node $i$ provided by the interplay of the different layers. Finally, if one represents a multiplex network using an order-4 tensor, an entire class of centrality measures can be obtained as natural extensions to adjacency tensors of the corresponding measures defined on adjacency matrices [28]. For instance, the eigenvector centrality of a node in this formalism can be computed by considering either the eigenvectors of the order-4 tensorial representation of the multiplex or the eigenvectors of the associated supra-adjacency matrix. An interesting application of this class of measures, described in Ref. [27], allows one to define the versatility of nodes, assigning higher centrality scores to those nodes which act as bridges among different layers.

B. Layer properties
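Similarly to the case of node activity, it is possible to define the activity vector of each layer $\alpha$ [19] as $\mathbf{d}^{[\alpha]} = \{b^{[\alpha]}_1, \ldots, b^{[\alpha]}_N\}$, where $b^{[\alpha]}_i = 1$ if node $i$ is active on layer $\alpha$ and $b^{[\alpha]}_i = 0$ otherwise. For each layer $\alpha$, the total layer activity $N^{[\alpha]} = \sum_{i=1}^{N} b^{[\alpha]}_i$ describes the total number of nodes with at least one connection in layer $\alpha$, with $0 \le N^{[\alpha]} \le N$. The similarity between the activity vectors of two layers $\alpha$ and $\beta$ can be measured by means of the pairwise multiplexity [19], which accounts for the fraction of nodes of the multiplex which are active on both layers: $Q_{\alpha,\beta} = \frac{1}{N}\sum_{i=1}^{N} b^{[\alpha]}_i b^{[\beta]}_i$. In general $0 \le Q_{\alpha,\beta} \le 1$, with $Q_{\alpha,\beta} = 1$ when all nodes are active in both layers, and $Q_{\alpha,\beta} = 0$ when no node is active on both layers. The similarity among the patterns of activity in two layers can also be measured through the normalised Hamming distance [19] $H_{\alpha,\beta} = \frac{\sum_{i}\left[b^{[\alpha]}_i (1 - b^{[\beta]}_i) + b^{[\beta]}_i (1 - b^{[\alpha]}_i)\right]}{\min(N, N^{[\alpha]} + N^{[\beta]})}$, where $H_{\alpha,\beta} = 0$ if $\mathbf{d}^{[\alpha]} = \mathbf{d}^{[\beta]}$ and $H_{\alpha,\beta} = 1$ if no node is active on both layers. It has been suggested that real multiplex networks are normally characterised by heterogeneous distributions of layer activity and of pairwise multiplexity [19].
Another interesting property observed in real multiplex networks is the presence of correlations between the degrees of the same node at different layers. This is normally signalled by the fact that the joint probability $P(k^{[\alpha]} = k_1, k^{[\beta]} = k_2)$ to find a node with degree $k_1$ on layer $\alpha$ and degree $k_2$ on layer $\beta$ does not factorise into the product $P^{[\alpha]}(k_1) P^{[\beta]}(k_2)$ of the degree distributions of the two layers. In general, given two layers $\alpha$ and $\beta$ and a generic node property $\xi_i$, the correlation between $\xi^{[\alpha]}_i$ and $\xi^{[\beta]}_i$ can be computed using the rank correlation coefficient [19]: $\rho_{\alpha,\beta} = \frac{\sum_i \left(R^{[\alpha]}_i - \bar{R}^{[\alpha]}\right)\left(R^{[\beta]}_i - \bar{R}^{[\beta]}\right)}{\sqrt{\sum_i \left(R^{[\alpha]}_i - \bar{R}^{[\alpha]}\right)^2 \sum_j \left(R^{[\beta]}_j - \bar{R}^{[\beta]}\right)^2}}$, where $R^{[\alpha]}_i$ is the rank of node $i$ at layer $\alpha$ induced by the property $\xi$ and $\bar{R}^{[\alpha]}$ is the average rank at layer $\alpha$. When the property of interest is the degree, it makes sense to define the quantity $\bar{k}^{[\beta]}(k^{[\alpha]}) = \sum_{k^{[\beta]}} k^{[\beta]} P(k^{[\beta]} | k^{[\alpha]})$, that is the average degree at layer $\beta$ of a node having degree $k^{[\alpha]}$ at layer $\alpha$, and is the multiplex analogue of the nearest-neighbours average degree function $k_{nn}(k)$ traditionally used to quantify degree-degree correlations in single-layer graphs [29]. An increasing (decreasing) trend in $\bar{k}^{[\beta]}(k^{[\alpha]})$ will signal the presence of positive (negative) inter-layer degree correlations between layer $\alpha$ and layer $\beta$.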
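The two layer-similarity measures above can be sketched as follows. The snippet assumes that pairwise multiplexity is the fraction of all $N$ nodes active on both layers, and that the Hamming distance counts nodes active on exactly one of the two layers, normalised by $\min(N, N^{[\alpha]} + N^{[\beta]})$. Function names and the toy activity vectors are illustrative assumptions.

```python
def pairwise_multiplexity(d_alpha, d_beta):
    """Fraction of nodes active on both layers."""
    n = len(d_alpha)
    return sum(x * y for x, y in zip(d_alpha, d_beta)) / n


def hamming_distance(d_alpha, d_beta):
    """Normalised number of nodes active on exactly one of the two layers."""
    n = len(d_alpha)
    differing = sum(1 for x, y in zip(d_alpha, d_beta) if x != y)
    norm = min(n, sum(d_alpha) + sum(d_beta))
    return differing / norm if norm else 0.0  # assumed convention: empty layers


d_a = [1, 1, 0, 0]  # activity vector of layer alpha (made-up data)
d_b = [0, 1, 1, 0]  # activity vector of layer beta (made-up data)
print(pairwise_multiplexity(d_a, d_b))
print(hamming_distance(d_a, d_b))
```

With these vectors one node out of four is active on both layers and two nodes are active on exactly one of them.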
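The authors of Ref. [31] proposed to quantify inter-layer degree correlations by using the pairwise mutual information between the degree sequences of the two layers: $I_{\alpha,\beta} = \sum_{k^{[\alpha]}}\sum_{k^{[\beta]}} P(k^{[\alpha]}, k^{[\beta]}) \log\frac{P(k^{[\alpha]}, k^{[\beta]})}{P(k^{[\alpha]}) P(k^{[\beta]})}$, which is maximal when the degree sequences $k^{[\alpha]}_i$ and $k^{[\beta]}_i$ are perfectly correlated (or perfectly anticorrelated), and minimal when they are uncorrelated.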
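A plug-in estimate of this mutual information can be obtained directly from the two degree sequences. The sketch below assumes the standard definition $I = \sum P(k, k') \log[P(k, k') / (P(k) P(k'))]$, with all probabilities estimated as empirical frequencies over the nodes; the function name and the toy sequences are assumptions.

```python
import math
from collections import Counter


def degree_mutual_information(k_alpha, k_beta):
    """Mutual information (in nats) between two degree sequences."""
    n = len(k_alpha)
    joint = Counter(zip(k_alpha, k_beta))
    marg_a = Counter(k_alpha)
    marg_b = Counter(k_beta)
    mi = 0.0
    for (ka, kb), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log(p_ab / ((marg_a[ka] / n) * (marg_b[kb] / n)))
    return mi


# Degrees on layer beta fully determined by those on layer alpha.
print(degree_mutual_information([1, 1, 2, 2], [3, 3, 4, 4]))
# Degrees on the two layers statistically independent.
print(degree_mutual_information([1, 1, 2, 2], [3, 4, 3, 4]))
```

On small networks this frequency-based estimator is biased, so in practice it is worth comparing the value with the one obtained after shuffling one of the two sequences.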
An equivalent set of quantities to measure the interlayer assortativity has been defined for the order-4 tensorial formulation in Ref. [33].
A fundamental research question in the field of multiplex networks is to assess whether the presence of more than one interaction layer indeed provides more information about the structure of a system compared to a classical single-layer network representation. In particular, it is interesting to quantify how much information is lost (if any at all) when we aggregate some or all the layers of a multiplex network to obtain a lower-dimensional representation. The authors of Ref. [45] tackled the problem of multiplex reducibility by drawing on an existing formal parallel between density operators of quantum systems and Laplacian matrices of graphs, and extending the concept of Von Neumann entropy of a graph to the case of multiplex networks. They proposed a greedy procedure, based on the estimation of the quantum Jensen-Shannon divergence between layers, which allows one to successively aggregate the most redundant layers of a multiplex and to obtain a more compact representation which uses the minimal number of layers while maximising the distinguishability between the multiplex and the single-layer representation of the same system. An interesting result of the paper is that different multilayer systems allow different levels of reducibility, with man-made systems being the least reducible and biological and social systems showing the highest levels of redundancy [45].

C. Edge properties
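Due to the presence of multiple layers, a pair of nodes $(i, j)$ can be connected through several edges. Given two layers $\alpha$ and $\beta$, the edge overlap of the pair $(i, j)$ [17,30] is defined as $o^{[\alpha,\beta]}_{ij} = a^{[\alpha]}_{ij} + a^{[\beta]}_{ij}$, where $o^{[\alpha,\beta]}_{ij} = 0$ if the two nodes are not connected. For a generic number of layers $M$, the edge overlap is defined as $o_{ij} = \sum_{\alpha=1}^{M} a^{[\alpha]}_{ij}$. This measure can be easily extended to the whole network by averaging $o_{ij}$ over all possible pairs of nodes [17], or instead by restricting the average to the pairs of nodes which share at least one edge [31]. Alternative definitions for the local edge overlap on a node $i$ and the total overlap of two layers are suggested in [30] and respectively read $o^{[\alpha,\beta]}_i = \sum_{j} \tilde{o}^{[\alpha,\beta]}_{ij}$ and $O^{[\alpha,\beta]} = \sum_{i<j} \tilde{o}^{[\alpha,\beta]}_{ij}$, where $\tilde{o}^{[\alpha,\beta]}_{ij} = 1$ when both layers have a link between $i$ and $j$ and $\tilde{o}^{[\alpha,\beta]}_{ij} = 0$ otherwise. In the same spirit, a similar measure of edge correlations is the so-called multiplexity [32], defined as $m^{[\alpha,\beta]} = \frac{2\sum_{i}\sum_{j<i} \min\{a^{[\alpha]}_{ij}, a^{[\beta]}_{ij}\}}{K^{[\alpha]} + K^{[\beta]}}$, where $K^{[\alpha]}$ ($K^{[\beta]}$) is the total number of edges at layer $\alpha$ ($\beta$). Notice that $m^{[\alpha,\beta]}$ takes values in the range $[0, 1]$.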
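A few of these edge-level quantities are easy to compute on a toy example. The sketch assumes the following forms: the edge overlap of a pair is the number of layers on which the pair is linked, the total overlap of two layers is the number of pairs linked on both, and the multiplexity is twice the total overlap divided by the sum of the edge counts of the two layers. These forms, like the function names, are assumptions made for illustration.

```python
def edge_overlap(layers, i, j):
    """Number of layers on which nodes i and j are connected."""
    return sum(layer[i][j] for layer in layers)


def total_overlap(A, B):
    """Number of node pairs linked on both layers."""
    n = len(A)
    return sum(A[i][j] * B[i][j] for i in range(n) for j in range(i + 1, n))


def edge_count(A):
    """Number of undirected edges of a layer."""
    n = len(A)
    return sum(A[i][j] for i in range(n) for j in range(i + 1, n))


def multiplexity(A, B):
    """Twice the shared edges over the total edges of the two layers."""
    total = edge_count(A) + edge_count(B)
    return 2 * total_overlap(A, B) / total if total else 0.0


A = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]  # edges 0-1 and 0-2
B = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # edges 0-1 and 1-2
print(edge_overlap([A, B], 0, 1), total_overlap(A, B), multiplexity(A, B))
```

Here only the edge 0-1 is shared, so the total overlap is 1 and the multiplexity is 0.5; two identical non-empty layers give a multiplexity of 1.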
A somewhat dual quantity is the so-called edge intersection index, which measures the probability of finding a pair of nodes that is connected by an edge on all the $M$ layers of the multiplex [45].
An alternative characterisation of edge correlations can be based on the conditional probability to find a link at layer $\alpha$ given the presence of an edge between the same nodes at layer $\beta$ [17]: $P(a^{[\alpha]}_{ij} | a^{[\beta]}_{ij}) = \frac{\sum_{ij} a^{[\alpha]}_{ij} a^{[\beta]}_{ij}}{\sum_{ij} a^{[\beta]}_{ij}}$. If layer $\beta$ has weighted edges, it is also possible to look at the conditional probability $P_w(a^{[\alpha]}_{ij} | w^{[\beta]}_{ij})$ to have a link at layer $\alpha$ given its weight on layer $\beta$. If $P_w$ shows an increasing trend as a function of $w$, this phenomenon goes under the name of edge reinforcement, since a stronger link on one layer implies a higher chance to find the same edge on a different layer [17].
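Both conditional probabilities can be estimated by simple counting. In the sketch below, the unweighted version is the fraction of the edges of layer $\beta$ that also exist on layer $\alpha$, and the weighted version groups the edges of layer $\beta$ by weight before taking that fraction. The function names, the integer weights and the toy matrices are assumptions for the example.

```python
def conditional_edge_probability(A, B):
    """Fraction of edges of layer beta (B) that also exist on layer alpha (A)."""
    n = len(A)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    edges_b = sum(B[i][j] for i, j in pairs)
    shared = sum(A[i][j] * B[i][j] for i, j in pairs)
    return shared / edges_b if edges_b else 0.0


def reinforcement_curve(A, W):
    """For each weight w on layer beta, fraction of those edges present in A."""
    n = len(A)
    stats = {}
    for i in range(n):
        for j in range(i + 1, n):
            w = W[i][j]
            if w > 0:
                total, hits = stats.get(w, (0, 0))
                stats[w] = (total + 1, hits + A[i][j])
    return {w: hits / total for w, (total, hits) in sorted(stats.items())}


A = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]  # unweighted layer alpha
B = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # unweighted layer beta
W = [[0, 3, 0], [3, 0, 1], [0, 1, 0]]  # made-up weights on layer beta
print(conditional_edge_probability(A, B))
print(reinforcement_curve(A, W))
```

An increasing `reinforcement_curve` is the signature of edge reinforcement; with real-valued weights one would bin the weights rather than group by exact value.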

D. Mesoscale properties
Complex networks are usually characterised by non-trivial structural patterns not only at the level of single-node properties but also, and more importantly, at the level of sub-graphs. A lot of attention has been devoted to the analysis of statistically significant sub-graphs in single-layer networks, also known as motifs. It has been found that a few specific sub-graphs are over-represented in real systems compared to their abundance in equivalent networks obtained by randomising the original graph [34,35]. Due to the additional level of richness provided by the layered structure of multiplex networks, the multilink, i.e. the organisation of the edges between the same pair of nodes $(i, j)$ across the $M$ layers, is the most basic motif [18,30]. Similarly, the $m$-triads and $m$-triangles used for the definition of node clustering coefficients are multiplex motifs [17]. On top of that, higher-order motifs [36,37] have also been studied and characterised, especially in two-layer networks based on structural and functional connectivity in the human brain [37].
Another remarkable feature of networked systems is the tendency of their units to cluster together in tightly-knit groups, giving rise to non-trivial community structures. Communities are also observed in multiplex networks, even if there is not to date an agreed definition of what a multiplex community is [38]. Some of the efforts in the characterisation of the communities of a multiplex have been focused on the quantification of the similarities in the community structure observed at different layers. In general, given two layers $\alpha$ and $\beta$ and their partitions in communities $\mathcal{P}_\alpha$ and $\mathcal{P}_\beta$, their similarity can be measured through the normalised mutual information (NMI) [39,40] $\mathrm{NMI}(\mathcal{P}_\alpha, \mathcal{P}_\beta) = \frac{-2\sum_{m}\sum_{m'} N_{mm'} \log\left(\frac{N_{mm'} N}{N_m N_{m'}}\right)}{\sum_{m} N_m \log\left(\frac{N_m}{N}\right) + \sum_{m'} N_{m'} \log\left(\frac{N_{m'}}{N}\right)}$, where $N_{mm'}$ is the number of nodes in common between community $m$ of partition $\mathcal{P}_\alpha$ and community $m'$ of partition $\mathcal{P}_\beta$, while $N_m$ and $N_{m'}$ are respectively the number of nodes in the two communities $m$ and $m'$. We note that such a measure was originally suggested to compare the community structures obtained on the same single-layer network from different algorithms. A different similarity measure is suggested in Ref. [40], in terms of the possibility to infer the community structure at layer $\alpha$ using information about the community structure at layer $\beta$.
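The NMI between the partitions of two layers can be computed from the community labels alone. The sketch assumes the usual form, $-2\sum_{mm'} N_{mm'} \log(N_{mm'} N / (N_m N_{m'}))$ divided by $\sum_m N_m \log(N_m/N) + \sum_{m'} N_{m'} \log(N_{m'}/N)$, with each partition given as a list of labels indexed by node. The function name and the convention for two trivial partitions are assumptions.

```python
import math
from collections import Counter


def normalised_mutual_information(part_a, part_b):
    """NMI between two partitions given as per-node community labels."""
    n = len(part_a)
    sizes_a = Counter(part_a)
    sizes_b = Counter(part_b)
    joint = Counter(zip(part_a, part_b))
    numerator = -2.0 * sum(
        c * math.log(c * n / (sizes_a[m] * sizes_b[mp]))
        for (m, mp), c in joint.items()
    )
    denominator = sum(c * math.log(c / n) for c in sizes_a.values()) + sum(
        c * math.log(c / n) for c in sizes_b.values()
    )
    if denominator == 0:
        return 1.0  # assumed convention: both partitions are a single community
    return numerator / denominator


# Same grouping with relabelled communities.
print(normalised_mutual_information([0, 0, 1, 1], [1, 1, 0, 0]))
# Two groupings carrying no information about each other.
print(normalised_mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))
```

Because the measure depends only on co-membership counts, it is invariant under relabelling of the communities, which is what makes it suitable for comparing partitions found independently on different layers.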
The information about the decomposition in communities of different layers can be combined together to define a multilayer partition in communities. A classical approach is that described in Ref. [38], which proposed a generalisation of the concept of modularity to multiplex, multi-slice, and temporal networks. Among the genuinely multiplex methods to extract the community decomposition of a system, we find particularly interesting the approach proposed in Ref. [39], which extends the Infomap algorithm [42], based on the minimisation of the description length of a partition in communities, to the case of multiplex networks.

III. MODELS OF MULTIPLEX NETWORKS
The characterisation of the structure of a network is normally accompanied by a modelling effort aiming at quantifying how special or peculiar the observed patterns are, e.g. in terms of how probable it is to find them in an appropriately chosen family of random graphs, and which mechanisms determine their appearance. In the following we review two classes of models of multiplex networks, namely static random graph models and growing networks.

A. Microcanonical and Canonical Ensembles
A standard approach to study the structure of a given network is to quantify how probable it is to observe a network with similar properties in an appropriately defined ensemble of random graphs whose elements satisfy certain constraints. For instance, it is a well-known fact that graphs with power-law degree distributions are extremely rare in the classical Erdős-Rényi random graph ensemble, where each pair of nodes is connected with a given probability $p$. As a consequence, the hypothesis that power-law degree distributions arise as a result of a uniform distribution of edges across the nodes can be safely rejected, and we can conclude that some other mechanism should be at work in the formation of graphs with heterogeneous degree sequences.
An ensemble of graphs is determined by a set of constraints that its elements should satisfy. According to the type of constraints, we can identify at least two classes of random network ensembles, namely canonical ensembles, where the graphs of the ensemble satisfy the set of constraints on average (soft constraints), and microcanonical ensembles, where each graph satisfies all the constraints exactly (hard constraints). It is possible to define a sequence of canonical and microcanonical ensembles of multiplex networks [30], where the constraints are just the average degree at each of the $M$ layers, or the degree distribution of each layer, or the degree distribution together with the distribution of edge overlap, and so on.
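Each multiplex network ensemble is defined by providing the probability $P(\mathcal{A})$ of each of the possible multiplex configurations $\mathcal{A}$ which satisfy the constraints. Starting from $P(\mathcal{A})$, the Shannon entropy of the ensemble is defined as $S = -\sum_{\mathcal{A}} P(\mathcal{A}) \ln P(\mathcal{A})$ [30].
For the special case of uncorrelated multiplex networks, the probability $P(\mathcal{M}) \equiv P(\mathcal{A})$ can be factorised into the probabilities of observing each single layer, i.e. $P(\mathcal{A}) = \prod_{\alpha=1}^{M} P^{[\alpha]}(A^{[\alpha]})$.
In this particular case, the entropy of the multiplex ensemble reads $S = -\sum_{\alpha=1}^{M}\sum_{A^{[\alpha]}} P^{[\alpha]}(A^{[\alpha]}) \ln P^{[\alpha]}(A^{[\alpha]})$. (30) In the following we focus on the canonical (indicated by C) and microcanonical (denoted by M) ensembles of multiplex networks. Let us assume that we have $T$ soft constraints such that $\sum_{\mathcal{M}} P(\mathcal{M}) F_\mu(\mathcal{M}) = C_\mu$, (31) where $\mu = 1, \ldots, T$, and $F_\mu(\mathcal{M})$ describes how such constraints are imposed on the network, such as the degree of each node of the network at each layer $\alpha$, or the total number of edges $K^{[\alpha]}$ for $\alpha = 1, \ldots, M$. The probability $P_{\mathrm{C}}(\mathcal{M})$ of observing the multiplex $\mathcal{M}$ can be obtained by maximising the entropy $S$ under the given set of constraints. By solving the optimisation problem one obtains $P_{\mathrm{C}}(\mathcal{M}) = \frac{1}{Z_{\mathrm{C}}} \exp\left[-\sum_{\mu} \lambda_\mu F_\mu(\mathcal{M})\right]$, where $Z_{\mathrm{C}}$ is the partition function of the canonical multiplex ensemble and the values of the Lagrangian multipliers $\lambda_\mu$ are obtained by satisfying Eq. 31 while imposing such a functional form for $P_{\mathrm{C}}(\mathcal{M})$. In the canonical multiplex ensembles we have $S = \sum_{\mu} \lambda_\mu C_\mu + \ln Z_{\mathrm{C}}$. Conversely, in the microcanonical multiplex ensemble each multiplex configuration compatible with the hard constraints has the same probability $P_{\mathrm{M}}(\mathcal{M}) = \frac{1}{Z_{\mathrm{M}}} \prod_{\mu=1}^{T} \delta\left[F_\mu(\mathcal{M}), C_\mu\right]$, where $\delta$ is the Kronecker delta function and $Z_{\mathrm{M}} = \sum_{\mathcal{M}} \prod_{\mu=1}^{T} \delta\left[F_\mu(\mathcal{M}), C_\mu\right]$ is the microcanonical partition function, accounting for the number of multiplex networks satisfying the $T$ hard constraints $F_\mu(\mathcal{M}) = C_\mu$. By defining the entropy of these ensembles as $N\Sigma$, such entropy reads $N\Sigma = \ln Z_{\mathrm{M}}$, where $\Sigma$ is the Gibbs entropy of the multiplex ensemble.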
It can be shown that the Gibbs entropy Σ is related to the corresponding Shannon entropy S by N Σ = S − N Ω, where Ω is the logarithm of the probability that in the related canonical multiplex ensemble the hard constraints F µ (M) are satisfied.The author of Ref. [30] provided an exhaustive explanation of how the entropy and the partition function can be computed in different classes of multiplex networks with increasingly stringent sets of constraints, both for the canonical and for the microcanonical ensembles.The same approach has been generalised to a number of more complicated structures, including spatial multiplex networks [47] and multiplex networks with heterogeneous activities of the nodes [20].
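As a minimal worked instance of an uncorrelated canonical ensemble, assume that the only soft constraint is the expected number of edges $K^{[\alpha]}$ on each layer. The maximum-entropy solution is then the standard one: every one of the $N(N-1)/2$ possible links of layer $\alpha$ is placed independently with probability $p^{[\alpha]} = K^{[\alpha]} / [N(N-1)/2]$, and the Shannon entropy of the multiplex ensemble is the sum of the layer entropies. The function names below are hypothetical.

```python
import math


def layer_entropy(n_nodes, expected_edges):
    """Shannon entropy of one layer with independent links of equal probability."""
    pairs = n_nodes * (n_nodes - 1) // 2
    p = expected_edges / pairs
    if p <= 0.0 or p >= 1.0:
        return 0.0  # empty or complete layer: a single configuration
    return -pairs * (p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def multiplex_entropy(n_nodes, expected_edges_per_layer):
    """Entropy of the uncorrelated ensemble: sum over the layers."""
    return sum(layer_entropy(n_nodes, k) for k in expected_edges_per_layer)


# N = 4 nodes, two layers, each with 3 expected edges out of 6 possible pairs.
print(multiplex_entropy(4, [3, 3]))
```

With $p = 1/2$ every configuration of a layer is equally likely, so each layer contributes $6 \ln 2$ and the two-layer ensemble has entropy $12 \ln 2$.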

B. Models of node and layer activity
The concept of node and layer activity is peculiar to multilayer networks, and it is interesting to assess whether simple models can account for the observed heterogeneous distributions of node and layer activities.
In the following we provide a brief review of some null models proposed so far to quantify the peculiarity of given distributions of node and layer activities.
Let us consider two layers $\alpha$ and $\beta$ with $N^{[\alpha]}$ and $N^{[\beta]}$ active nodes respectively. If initially the two layers have no active nodes, and we then sample uniformly at random from $\{1, 2, 3, \ldots, N\}$ $N^{[\alpha]}$ nodes on layer $\alpha$ and $N^{[\beta]}$ nodes on layer $\beta$ and activate them, then the probability that $m$ nodes are active at both layers follows a hypergeometric distribution $p(m; N, N^{[\alpha]}, N^{[\beta]}) = \binom{N^{[\alpha]}}{m}\binom{N - N^{[\alpha]}}{N^{[\beta]} - m} \Big/ \binom{N}{N^{[\beta]}}$, according to which the expected number of nodes active at both layers is equal to $N^{[\alpha]} N^{[\beta]} / N$, the expected pairwise multiplexity is $\bar{Q}_{\alpha,\beta} = \frac{N^{[\alpha]} N^{[\beta]}}{N^2}$ (37) and the expected Hamming distance reads $\bar{H}_{\alpha,\beta} = \frac{N^{[\alpha]} + N^{[\beta]} - 2 N^{[\alpha]} N^{[\beta]} / N}{\min(N, N^{[\alpha]} + N^{[\beta]})}$. (38) This is the simplest model of node activation and is known as the hypergeometric model [19]. However, the authors of Ref. [19] have shown that the distributions of pairwise multiplexity and pairwise Hamming distance in real-world multiplex networks are not compatible with those given in Eq. 37 and Eq. 38.
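The hypergeometric null model can be checked numerically with the standard library. The sketch below uses the textbook hypergeometric probability for the number $m$ of nodes active on both layers, and assumes that the Hamming distance is normalised by $\min(N, N^{[\alpha]} + N^{[\beta]})$; the function names are illustrative.

```python
from math import comb


def overlap_pmf(m, n, n_alpha, n_beta):
    """Probability that exactly m nodes are active on both layers."""
    if m < max(0, n_alpha + n_beta - n) or m > min(n_alpha, n_beta):
        return 0.0
    return comb(n_alpha, m) * comb(n - n_alpha, n_beta - m) / comb(n, n_beta)


def expected_multiplexity(n, n_alpha, n_beta):
    """Expected fraction of nodes active on both layers."""
    return n_alpha * n_beta / n**2


def expected_hamming(n, n_alpha, n_beta):
    """Expected normalised number of nodes active on exactly one layer."""
    expected_both = n_alpha * n_beta / n
    return (n_alpha + n_beta - 2 * expected_both) / min(n, n_alpha + n_beta)


n, n_alpha, n_beta = 10, 4, 5
probs = [overlap_pmf(m, n, n_alpha, n_beta) for m in range(n_alpha + 1)]
mean_both = sum(m * p for m, p in enumerate(probs))
print(sum(probs), mean_both)
print(expected_multiplexity(n, n_alpha, n_beta), expected_hamming(n, n_alpha, n_beta))
```

Comparing these expected values with the empirical multiplexity and Hamming distance of each pair of layers is the kind of test that the text refers to when it states that real data are not compatible with this model.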
Let us now consider the problem of constructing a multiplex network with a fixed number of layers $M$, a fixed number of nodes $N$ which are active on at least one of the $M$ layers, and where each node $i$ has an assigned node activity $B_i$, which is for instance set equal to the node activity observed in the real network. By sampling for each node one of the $\binom{M}{B_i}$ vectors of node activity with $B_i$ non-zero entries, the distribution of the total node activity of the original system is kept fixed, whereas the correlations in the layer activity and the distribution of the node-activity vectors are not preserved. Moreover, in such a model all layers have the same expected number of active nodes, $\bar{N}^{[\alpha]} = \frac{1}{M}\sum_{i=1}^{N} B_i$. This model is known as the multi-activity deterministic model [19]. A variation of the model is constructed by activating node $i$ in each layer $\alpha$ with probability $B_i / M$, so that the expected activity of each layer stays the same but the original node-activity distribution is not preserved. This model is known as the multi-activity stochastic model [19].
Finally, it is possible to construct a model for a 2-layer multiplex network where the degree distribution of each layer is kept fixed, and where one can control the edge overlap $\omega$ by rewiring a certain fraction $r$ of the edges. The model was introduced in Ref. [46]. For simplicity, let us assume that the two layers have the same number of edges $K^{[\alpha]} = K^{[\beta]} = K$. If we start from two identical networks, we have maximum edge overlap $\omega = 1$. If we now keep fixed the structure of one of the two layers, and rewire one of the edges of the other layer, the number of links present in both layers decreases by one unit, while the number of those present in only one of the two layers increases by two units. Consequently, if we rewire a fraction $r$ of the $K$ edges of the second layer in such a way that each rewire decreases the number of edges existing on both layers, we obtain $\omega = \frac{K - rK}{K + rK} = \frac{1 - r}{1 + r}$. By inverting this relation, we find that a given overlap $\omega$ corresponds to a rewiring fraction $r = (1 - \omega)/(1 + \omega)$. In practice, this model allows one to obtain a prescribed value of edge overlap by rewiring a certain fraction $r$ of the edges in one of the two layers.
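The null models of this subsection are simple enough to sketch directly. The snippet below samples node-activity vectors for the two multi-activity models and evaluates the overlap-rewiring relation, assuming that $\omega$ is the ratio between the number of node pairs linked on both layers and the number of pairs linked on at least one layer. All function names are hypothetical.

```python
import random


def multi_activity_deterministic(B, M, rng):
    """Give node i a uniformly chosen activity vector with exactly B[i] ones."""
    vectors = []
    for b in B:
        active = set(rng.sample(range(M), b))
        vectors.append([1 if layer in active else 0 for layer in range(M)])
    return vectors


def multi_activity_stochastic(B, M, rng):
    """Activate node i on each layer independently with probability B[i] / M."""
    return [[1 if rng.random() < b / M else 0 for _ in range(M)] for b in B]


def overlap_after_rewiring(K, r):
    """Edge overlap after rewiring a fraction r of the K edges of one layer."""
    rewired = round(r * K)
    shared = K - rewired     # pairs still linked on both layers
    single = 2 * rewired     # pairs now linked on exactly one layer
    return shared / (shared + single)


def rewiring_fraction(omega):
    """Fraction of edges to rewire to reach a target overlap omega."""
    return (1.0 - omega) / (1.0 + omega)


rng = random.Random(0)
print(multi_activity_deterministic([1, 2, 3], 3, rng))
print(overlap_after_rewiring(100, 0.2), rewiring_fraction(2 / 3))
```

The deterministic variant preserves each node activity exactly, while the stochastic one preserves it only on average, which is the distinction drawn in the text.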

C. Growth models of multiplex networks
In this section we review a few growth models for multiplex networks. The simplest example of this class is a model of layer growth, aimed at explaining the fat-tailed distribution of layer activity observed in empirical data [19]. The model works as follows. We start at time $t_0 = 0$ with a multiplex with $M_0$ layers and $N$ nodes. At each time $t$, a new layer $\alpha$ joins the network with $N^{[\alpha]}$ nodes to be activated, where $N^{[\alpha]}$ can be taken from the data set we are attempting to reproduce. Each node $i$ then has a probability to be active on that layer at time $t$ equal to $P_i(t) \propto A + B_i(t)$, where $B_i(t)$ is the number of layers where node $i$ is already active and $A > 0$ is a constant that allows the activation of nodes not yet active in the multiplex. When the number of layers in the model increases, the distribution of layer activity $P(N^{[\alpha]})$ approaches a power law.
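The layer-growth mechanism can be simulated in a few lines. The sketch assumes that each new layer activates its prescribed number of distinct nodes one at a time, each draw picking a not-yet-chosen node with probability proportional to $A + B_i(t)$; starting from zero layers and drawing without replacement in this sequential way are simplifying assumptions, and the function name is hypothetical.

```python
import random


def grow_layers(n_nodes, layer_sizes, A=1.0, seed=0):
    """Add layers one by one, activating nodes with weight A + B_i."""
    rng = random.Random(seed)
    B = [0] * n_nodes  # B[i]: number of layers on which node i is active
    layers = []
    for size in layer_sizes:
        candidates = list(range(n_nodes))
        chosen = []
        for _ in range(size):
            weights = [A + B[i] for i in candidates]
            pick = rng.choices(candidates, weights=weights, k=1)[0]
            candidates.remove(pick)
            chosen.append(pick)
        # node activities are updated only once the layer is complete
        for i in chosen:
            B[i] += 1
        layers.append(sorted(chosen))
    return layers, B


layers, B = grow_layers(20, [5, 3, 8])
print(layers)
print(B)
```

Larger values of `A` make the activation closer to uniform, while small values strengthen the rich-get-richer effect on node activity.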
Another important class of growth models is that where not layers, but individual nodes join the network sequentially, for instance by connecting to pre-existing vertices on possibly different layers. In this regard, it is clear that the specific shape of the attachment function determines the long-term statistical properties of the final multiplex graph. In single-layer networks, a particularly well-studied case is the so-called preferential attachment, where nodes choose to attach to older vertices depending on a function (in the simplest case linear) of their degree $k$. In a multiplex network the degree of each node $j$ is a vector, and the probability $\Pi^{[\alpha]}_{i \to j}$ that a new node $i$ attaches to $j$ on a given layer $\alpha$ in general depends on all its components. In formula: $\Pi^{[\alpha]}_{i \to j} \propto F^{[\alpha]}_j(k^{[1]}_j, \ldots, k^{[M]}_j)$. The simplest class of preferential attachment models is obtained by considering linear attachment kernels, i.e. by setting $F^{[\alpha]}_j$ as a convex combination of the degrees of node $j$ at all the layers [48,49]. The interesting result is that linear attachment kernels produce multiplex networks whose layers have power-law degree distributions, but where inter-layer degree correlations are always positive, meaning that a hub on one layer is also a hub on the other layer. This is due to the fact that the expected final degree of a node on a certain layer is determined solely by the time at which it joins the network [48]. A generalisation of the closed-form solutions for the joint degree distribution of heterogeneously growing multiplex networks with an arbitrary number of layers and arbitrary times can be found in [51].
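Linear preferential attachment on a two-layer multiplex can be sketched as follows. The example assumes that each new node brings one link per layer, that the two targets are drawn before any degree is updated, and that the kernel on layer $\alpha$ is $c_{\alpha,1} k^{[1]}_j + c_{\alpha,2} k^{[2]}_j$ with non-negative coefficients. The seed graph, the parameter names and the default coefficients are assumptions.

```python
import random


def grow_linear_multiplex(n_nodes, c=((1.0, 0.0), (0.0, 1.0)), seed=0):
    """Grow a 2-layer multiplex; c[a] are the kernel coefficients of layer a."""
    rng = random.Random(seed)
    # Seed graph (assumption): nodes 0 and 1 linked on both layers.
    k = [[1, 1], [1, 1]]  # k[layer][node]
    edges = [[(0, 1)], [(0, 1)]]
    for new in range(2, n_nodes):
        targets = []
        for a in range(2):
            weights = [c[a][0] * k[0][j] + c[a][1] * k[1][j] for j in range(new)]
            targets.append(rng.choices(range(new), weights=weights, k=1)[0])
        for a, j in enumerate(targets):
            k[a][j] += 1
            k[a].append(1)  # the new node has degree 1 on each layer
            edges[a].append((j, new))
    return k, edges


k, edges = grow_linear_multiplex(50)
print(max(k[0]), max(k[1]))
```

With the default coefficients each layer grows independently of the other; setting `c=((0.5, 0.5), (0.5, 0.5))` couples the layers through the total degree of the target node.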
A more interesting class of multiplex networks is obtained by considering non-linear attachment kernels. The authors of Ref. [50] started from the case of multiplex networks with two layers, using an attachment kernel governed by two exponents $\alpha, \beta \in \mathbb{R}$. By tuning the relative values of the exponents $\alpha$ and $\beta$, one can obtain multiplex networks where each layer has either an exponential, a power-law, or a condensed degree distribution (where super-hubs with extensive degrees appear), and where inter-layer degree correlations can be either positive, null, or negative. In the same work the authors suggested several possible generalisations of the model to the case of multiplex networks with $M > 2$ layers. An interesting model of multiplex network growth which takes into account weighted links, aimed at reproducing the structure of some layered social networks, can be found in Ref. [52].
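To make the role of the two exponents concrete, the sketch below computes attachment probabilities for a two-layer multiplex under one possible non-linear kernel. The product form used here, with weight $(k_j^{[1]})^{\alpha} (k_j^{[2]})^{\beta}$ on the first layer and the exponents swapped on the second, is an assumption chosen purely for illustration and should be checked against Ref. [50]. The function name `attachment_probabilities` is hypothetical.

```python
def attachment_probabilities(k1, k2, alpha, beta):
    """Return normalised attachment probabilities on the two layers.

    k1[j] and k2[j] are the degrees of node j on layers 1 and 2 (assumed >= 1).
    Assumed kernel (illustrative only):
        layer 1 weight: k1[j]**alpha * k2[j]**beta
        layer 2 weight: k1[j]**beta  * k2[j]**alpha
    """
    w1 = [a**alpha * b**beta for a, b in zip(k1, k2)]
    w2 = [a**beta * b**alpha for a, b in zip(k1, k2)]
    z1, z2 = sum(w1), sum(w2)
    return [w / z1 for w in w1], [w / z2 for w in w2]
```

Under this assumed form, setting `alpha=1, beta=0` recovers independent linear preferential attachment on each layer, whereas a negative `beta` penalises nodes that are already hubs on the other layer.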

D. Models of multiplex communities
Simple preferential attachment models, while able to reproduce some empirical patterns such as inter-layer degree correlations, do not allow the construction of multiplex networks with strong community structure. More sophisticated models able to produce tunable intra-layer and inter-layer community structure have been suggested, based on intra-layer and inter-layer triadic closure mechanisms on 2-layer multiplexes [40]. In that model a node $i$ arrives and selects one of the layers at random, and a node $n_1$ in that layer as its first neighbour. Each of the following $m - 1$ links on the same layer goes to a neighbour of $n_1$ with probability $p$, or to a uniformly sampled node in the same layer with probability $1 - p$. Once $m$ links have been created on the first layer, node $i$ starts creating links on the other layer. In particular, the first edge on the other layer is created to the same node $n_1$ with probability $p^*$, and to a node sampled uniformly at random with probability $1 - p^*$. The remaining links on the second layer are again placed with probabilities $p$ and $1 - p$. In this model, the value of the parameter $p$ determines the strength of communities on each layer, with higher values of $p$ corresponding to tighter communities, while $p^*$ tunes the extent of overlap (i.e., the number of shared nodes) between communities in different layers. Interestingly, the model was able to reproduce some of the salient characteristics of multiplex collaboration networks.
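The growth rules above can be sketched as follows. This is a minimal sketch rather than the implementation of Ref. [40]: the seed graph (a clique of $m + 1$ nodes on both layers), the uniform choice of $n_1$, the avoidance of duplicate links, and the fallback to a uniform choice when $n_1$ has no unused neighbour are all assumptions, and the function name `grow_two_layer_communities` is hypothetical.

```python
import random


def grow_two_layer_communities(n_nodes, m, p, p_star, seed=None):
    """Grow a 2-layer multiplex with intra- and inter-layer triadic closure.

    Returns adj, where adj[layer][node] is the set of neighbours of node.
    """
    rng = random.Random(seed)
    n0 = m + 1  # seed clique size (assumption)
    adj = [
        {u: {v for v in range(n0) if v != u} for u in range(n0)}
        for _ in range(2)
    ]

    def add_links(layer, new, first):
        linked = [first]
        while len(linked) < m:
            pool = []
            if rng.random() < p:
                # Triadic closure: link to a neighbour of the first neighbour.
                pool = sorted(v for v in adj[layer][first] if v not in linked)
            if not pool:
                # Uniform choice among older nodes (also used as a fallback).
                pool = [v for v in range(new) if v not in linked]
            linked.append(rng.choice(pool))
        adj[layer][new] = set(linked)
        for v in linked:
            adj[layer][v].add(new)

    for new in range(n0, n_nodes):
        first_layer = rng.randrange(2)
        n1 = rng.randrange(new)  # first neighbour, chosen uniformly (assumption)
        add_links(first_layer, new, n1)
        # On the other layer, reuse n1 with probability p_star.
        n1_other = n1 if rng.random() < p_star else rng.randrange(new)
        add_links(1 - first_layer, new, n1_other)
    return adj
```

Running the function with a high `p` and comparing the resulting clustering on each layer against a run with `p = 0` gives a quick qualitative check of how triadic closure tightens communities.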

IV. CONCLUSIONS
Networks are responsible for the emergence of a variety of complex behaviours in social, economic, technological and biological systems, and multilayer networks are the latest frontier of research in this field. The theory of multiplex networks has already proven quite successful in modelling the structure of intrinsically multidimensional relational systems, showing at the same time that the presence of more than a single type of interaction is responsible for new levels of complexity. The advances made in this field in the last few years are definitely encouraging, and there are still many open problems to address in depth and many questions waiting to be asked. We strongly believe that multiplex networks are an extremely active and interesting area of research, and we hope that this brief review will help to spur the curiosity of researchers who are interested in studying the structure of real-world systems.