Constraints and entropy in a model of network evolution

Barabási–Albert’s “Scale Free” model is the starting point for much of the accepted theory of the evolution of real world communication networks. Careful comparison of the theory with a wide range of real world networks, however, indicates that the model is, in some cases, only a rough approximation to the dynamical evolution of real networks. In particular, the exponent γ of the power law distribution of degree is predicted by the model to be exactly 3, whereas in a number of real world networks it has values between 1.2 and 2.9. In addition, the degree distributions of real networks exhibit cut offs at high node degree, which indicates the existence of maximal node degrees for these networks. In this paper we propose a simple extension to the “Scale Free” model, which offers better agreement with the experimental data. This improvement is satisfying, but the model still does not explain why the attachment probabilities should favor high degree nodes, or indeed how constraints arise in non-physical networks. Using recent advances in the analysis of the entropy of graphs at the node level, we propose a first principles derivation of the “Scale Free” and “constraints” models from thermodynamic principles, and demonstrate that both preferential attachment and constraints could arise as a natural consequence of the second law of thermodynamics.


Introduction and Background

Overview
The 'Scale Free' model of Barabási-Albert [1] is widely accepted as the definitive model of how real world networks evolve. This and other dynamic network models consider real world networks as graphs G(V, E), where V(t) is the set of vertices and E(t) the set of edges. Its success at overcoming the difficulties of applying the Erdős-Rényi (ER) random graph model (for a detailed description see [2]) to real world networks is well understood. In particular the model naturally results in a power law degree distribution. The random graph model, by contrast, has a binomial distribution of node degree, which in the continuum limit of a very large network is approximately Poisson, with well defined higher statistical moments that establish the 'scale' of the graph. The scale free model does not have well defined moments above the mean. The model described in [1] builds upon, and provides an explanation for, the notion of the small world network, first introduced by Watts and Strogatz [4], and has been used to analyze a wide variety of real world graphs.
On close examination, the scale free model has a number of theoretical challenges, and it is well understood that the behavior of real world networks has deeper complexity than a single constant power law degree distribution. Balanced against the success of the model in generating networks that share the small world property and scale free degree distributions, these challenges can be viewed as opportunities for refinement of the fundamental approach. In this work we focus on extensions to the model which provide improvements in the following three areas:
- Absence of Constraints: There is an assumption that a graph can continue to evolve indefinitely, unconstrained by any system wide or external resources. For most real world networks this is not the case. For example, in communication networks every node in the network has a natural maximum connectivity. In the scale free model there is no such upper limit to node degree.
- Fit to Real World Data: The standard scale free model produces a degree distribution that follows a power law with exponent γ = 3. It is well understood that this is not an exact fit to real world data, which we highlight in Section 3. Many extensions exist that produce a better fit, some of which we survey later. It is clear that the degree distributions of real networks have more complex behavior than a simple fixed exponent power law.
- Absence of a Physical Model: The notion of scale freedom derives directly from the hypothesis of preferential attachment, that is, in a dynamically evolving graph new nodes will more likely attach to nodes of higher degree. Whilst the scale free model provides a theoretical framework that points to high node degree making a node more likely to attract new connections, there is no fundamental explanation of why that should be so, and what physical processes may be at work that could produce that effect. It would be desirable if this could be explained using a first principles argument involving well understood mechanisms. This would further strengthen the fundamental premise of the scale free model.
arXiv:1612.03115v3 [physics.soc-ph] 7 Sep 2017
In this paper we will attempt to address these challenges. We do so by proposing a simple extension to the standard scale free model, which introduces a hard cut off in the degree of a node, motivated by considerations from communications network design. This model has some attractive features, amongst which is a more accurate prediction of the power law exponent. Although extensions to the preferential attachment approach (most notably [5], [6] and [7]) can result in values of the power law exponent less than 3, we believe our model achieves this through a simple and natural extension to the traditional preferential attachment paradigm. Furthermore, as a consequence of introducing the constraint, we identify that the attachment probability introduces superlinear polynomial terms in node degree. This additional structure to the attachment probability is responsible for a richer scaling regime in node degree evolution. This structure allows us to compare in Section 4 both the constraints and scale free model to a novel model of evolution that argues from a stochastic perspective based upon recent developments in the structural entropy of a graph. By developing the outline of an entropic model we illustrate how both the standard scale free and our constrained model could be viewed as approximations to a more fundamental, statistical thermodynamic model of network growth.
In this section we will begin with a brief overview of the continuum analysis used in [1] to derive the principal results of scale free models, and at a very high level subsequent attempts to build upon and extend the model. We will make use of the same continuum approximation in our analysis. We show in Section 2 how the introduction of a simple environmental constraint into the scale free model can significantly improve its predictive power, and compare our constrained model to a range of more contemporary network data in Section 3. As part of the verification of our constrained model, we also present results of simulations of network growth using our modified attachment probability defined in Section 2. An attractive feature of our extended model is that it reproduces the scale free model when we allow our constraint to tend to infinity. We are able to significantly outperform the ability of the scale free model to predict the exponent γ of the power law distribution across a wide range of real world data (results are summarized in Table 2). In particular for ten of the twenty three data sets analyzed (marked in Table 2 in bold) we are able to predict γ to within 10%, whereas the scale free model overestimates the value of γ by an average of 35% and in only four cases does it predict within the range 10-20%. Our constrained model therefore performs better than the standard scale free model on the first two issues identified above, but not on the third.
In Section 4 we propose a novel statistical thermodynamical (i.e. entropic) model of network growth. This addresses the third objective. Recent work on the behavior of communications networks by Tee et al [8,9] introduced a measure of the structural entropy of a node, derived from its degree and clustering coefficient. We show how this can lead to a direct derivation of scale free and constraint models, potentially explaining why scale freedom arises and why our constrained model is a better fit for networks as they grow and encounter connectivity limitations. We present in the same section some early results from numerical simulations of the entropic model, which show many of the features of the real world data we analyzed in Section 3.

The Scale Free Model
The Scale Free Model of Barabási, Albert and Jeong [3], [1] is based on two simple and fundamental assumptions:
- Growth: Starting with $m_0$ nodes and $e_0$ edges, we add a new node at each unit time step. When this node is added to the network, it connects to $m \le m_0$ other nodes. This process continues indefinitely, such that after t unit time steps there are $m_0 + t$ nodes and $e_0 + mt$ edges. Eventually the constants in these expressions can be dropped as they are insignificant compared to t.
- Preferential Attachment: The node attaches to other nodes with a probability determined by the degree of the target node, such that more highly connected nodes are preferred over lower degree nodes.
Using a mean field theory approach the analysis explains both the power law scaling of real world networks [10], and the simultaneous resilience and vulnerability of networks to random and targeted attacks, respectively [11].
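The two assumptions above can be illustrated with a minimal simulation sketch. The seed graph (a ring of $m_0$ nodes), the function name `grow_ba` and the parameter values are assumptions made for illustration only; the text fixes only $m_0$, $e_0$ and m.

```python
import random
from collections import Counter

def grow_ba(n_steps, m=3, m0=5, seed=0):
    """Grow a graph by preferential attachment and return each node's degree."""
    rng = random.Random(seed)
    # Seed graph: a ring of m0 nodes, so every seed node starts with degree 2
    # (an illustrative assumption, not taken from the paper).
    degree = [2] * m0
    # Each node appears in `stubs` once per incident edge, so a uniform draw
    # from `stubs` picks node i with probability k_i / sum_j k_j.
    stubs = [i for i in range(m0) for _ in range(2)]
    for _ in range(n_steps):
        targets = set()
        while len(targets) < m:            # m distinct targets, m <= m0
            targets.add(rng.choice(stubs))
        new = len(degree)
        degree.append(m)
        for target in targets:
            degree[target] += 1
            stubs.extend((target, new))
    return degree

degrees = grow_ba(2000)
# Degree histogram: most nodes stay near degree m, a few become hubs.
print(Counter(degrees).most_common(5))
```

Sampling from the list of edge endpoints avoids recomputing the normalization $\sum_j k_j$ at every step, which keeps the sketch short.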
The approach taken in [3] begins by proposing that the probability of a randomly chosen node i capturing a connection to a new node depends solely upon its degree $k_i$:

$$\Pi(k_i) = \frac{k_i}{\sum_j k_j} = \frac{k_i}{2mt} \tag{1}$$

In the strictest sense the approximation $\sum_j k_j = 2mt$ should include the original $m_0$ nodes and their degrees; for large values of t, however, this contribution can be ignored, as $2e_0 \ll 2mt$. Taking the continuous approximation leads to the following ordinary differential equation for the time evolution of node i's degree $k_i(t)$:

$$\frac{\partial k_i}{\partial t} = m\,\Pi(k_i) = \frac{k_i}{2t} \tag{2}$$

Equation (2) can be solved subject to the initial condition that at time $t_i$, when node i is added, its degree is $k_i = m$, to yield:

$$k_i(t) = m\left(\frac{t}{t_i}\right)^{1/2} \tag{3}$$

In order to derive the degree distribution we begin by assuming that t is fixed. At this stage the probability that $k_i(t)$ is smaller than a given degree k is:

$$P(k_i(t) < k) = P\left(t_i > \frac{m^2 t}{k^2}\right)$$

Developing the mean field approach, we note that the ith node was chosen at random, so its time of introduction into the network $t_i$ is a random variable. Given that nodes are added at each time step, the possible values for $t_i$ are $1, 2, \ldots, (m_0 + t)$, and each value occurs with probability $\frac{1}{m_0 + t}$. We can conclude that the random variable $t_i$ is uniformly distributed and can write the probability of choosing a node i with a $t_i$ smaller than $\frac{m^2 t}{k^2}$ as:

$$P\left(t_i \le \frac{m^2 t}{k^2}\right) = \frac{m^2 t}{k^2 (m_0 + t)}$$

We can now state the probability of a node having degree $k_i(t) < k$ as:

$$P(k_i(t) < k) = 1 - \frac{m^2 t}{k^2 (m_0 + t)}$$

The degree distribution follows from $P(k) = \frac{\partial P(k_i(t) < k)}{\partial k}$, yielding the principal result of the Barabási-Albert Scale Free model:

$$P(k) = \frac{2 m^2 t}{m_0 + t}\,\frac{1}{k^3} \tag{4}$$

This predicts that on a log/log scale the slope of the degree distribution γ is identically 3. The result has been compared against many real world networks, and indeed the power law behavior has been seen in many examples and is one of the triumphs of the scale free model. The model, however, generally overestimates the value of γ and cannot explain the non linear behavior of the degree distribution at high values of k (as outlined in [12]). Reproduced in Table 1 from the data in [1] are some key parameters from a selection of the analyzed real world networks. The data is taken from a wide range of sources, which we supplement in Section 3, including the classic movie actor collaboration network from IMDB, a physical communications network, a biological network and a number of collaboration networks. A striking feature of all of these networks is both a limit to the degree of a node, and also that the value of γ is significantly lower than predicted by the scale free model (γ is calculated as described in Section 3.1). Recent work [13] has highlighted a number of deficiencies in the scale free model, including deviations from the scale free degree distributions and the presence of cut offs in the maximum degree.

It must be stated, however, that the model is strikingly powerful in its ability, from a simple set of assumptions, to explain many features of complex networks, from their small world property to the absence of a 'scale' in the degree distributions. This simplicity is powerful and hints at fundamental processes underlying the dynamics of network evolution. Failure to capture the detail of the degree distributions of real world networks, however, indicates that this simplicity must be supplemented with additional facets to the model of node attachment. In addition, the appeal to node degree being the primary determinant of attachment probability is a modeling assumption and does not explain why that is the case. The principal argument is based on the concept of "the rich get richer", which is an equivalent statement to equation (1). In our view this is not a 'first principles argument' based upon fundamental physics. Given the success of the model and widespread acceptance of its validity and application in many fields from genetics to network design, it would be satisfying to link the derivation of equation (1) to core principles of physics. In this paper we start by exploring a next degree of approximation to the model to identify how environmental influences, such as the presence of a top constraint for node degree, alter the form of equation (1). In the model we propose this yields polynomial terms in k, which we hypothesize may be part of a series of corrections to the attachment probability.
Using arguments based upon applying ensemble statistical mechanics to the entropy of a network vertex, we then propose an entropic model which naturally produces the concept of preferential attachment and constraints, and hints at further structure to the form of attachment probability in equation (1).
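As a quick numerical check of the scale free prediction, the sketch below evaluates the continuum degree distribution of the Barabási-Albert model, $P(k) = 2m^2 t / ((m_0 + t) k^3)$, and measures its slope on a log/log scale. The function names and parameter values are illustrative assumptions.

```python
import math

def ba_degree_pdf(k, m=3, m0=5, t=10_000):
    """Continuum-limit BA degree distribution P(k) = 2 m^2 t / ((m0 + t) k^3)."""
    return 2 * m**2 * t / ((m0 + t) * k**3)

def loglog_slope(f, k1, k2):
    """Slope of log f(k) against log k between two degrees k1 and k2."""
    return (math.log(f(k2)) - math.log(f(k1))) / (math.log(k2) - math.log(k1))

# The exponent gamma is minus the log/log slope; it is 3 for any choice of
# m, m0 and t, since those only enter through the prefactor.
gamma = -loglog_slope(ba_degree_pdf, 10, 1000)
print(gamma)
```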

Extensions to the Scale Free Model
Before embarking on an investigation of our model, it is important to stress that many proposals to extend preferential attachment have been advanced. These alternative models rely upon modifications to the probability of attachment beyond simple dependence on the degree of the node. The extensions range from ecologically inspired models, such as the competition based approach of D'Souza in [15], to direct alterations of the form of equation (1) by introducing 'super-linear' terms in k, that is arbitrary powers of k. The model of Krapivsky et al [7] explicitly explores forms of attachment probability where the term in k is replaced by a power law form $k^\alpha$, where the exponent α can vary in the range 0 < α < ∞. By varying α it is possible to produce very different forms of the degree distribution. These range from a stretched exponential degree distribution to a super-linear zone for α > 2 where one node captures a connection to all other nodes. In other work, notably Dorogovtsev et al [5], the concept of initial attractiveness of a node is introduced, which permits the power law exponent to vary and produces values in the range 2 < γ < 3. These models depend upon the concept of some nodes starting with a higher initial attractiveness than others in their ability to gain connections to new nodes. In some ways this is the opposite approach to the constrained model we propose in this paper, where nodes become progressively less attractive as they acquire connections and approach their limit. It is perhaps the ecological and physically inspired extensions that are the most attractive alternatives to preferential attachment. We have already mentioned the competition based model of D'Souza [15], which uses an optimization approach in which the minimization of a cost function upon every node addition is used to determine which node the new node attaches to. This model produces an exponentially corrected degree distribution of the form $P(k) \propto k^{-\gamma} e^{-\alpha k}$. This degree distribution is similar to that which we see in the data analyzed in Section 3, and is an encouraging advance on the original preferential attachment model.

Another widely accepted approach, which builds upon the work of Dorogovtsev, was developed by Barabási in collaboration with Bianconi. This model parametrizes the attractiveness of the node using a fitness measure, $\eta_i$, and was introduced in [6], [16] and further developed in the work of Moriano et al [17], and Su et al [18]. The extended model proposes that the probability of attachment is modified to include the fitness parameter in the most general sense, as follows:

$$\Pi_i = \frac{\eta_i k_i}{\sum_j \eta_j k_j} \tag{5}$$

To prevent this model requiring as many independent variables as there are nodes, the attractiveness η is fixed, or quenched, at node addition and is randomly assigned from an assumed probability distribution ρ(η) for the parameter. The model permits an analogy between the graph and the Bose-Einstein treatment of ideal gases. This analogy relies upon the identification of a vertex with an energy level $\epsilon_i$ of the gas, with the degree corresponding to the occupancy number of the energy level. Derivation of graph properties from statistical mechanical arguments is long established, including in the work of Newman and Park on exponential random graphs described in [19]. In the Bianconi-Barabási model the energy level is related to the fitness parameter by $\epsilon_i = -\frac{1}{\beta} \log \eta_i$, with β being identified as the classical inverse thermodynamic temperature. The denominator of equation (5) is then easily identified with the partition function Z, familiar from the Bose-Einstein model of statistical mechanics. Using the probability distribution ρ(η) of the nodes' fitness parameter as outlined in [6], P(k) can be solved analytically in the case of the uniform distribution to yield:

$$P(k) \propto \frac{k^{-(1+C)}}{\log k}, \text{ where } C \text{ is a constant} \tag{6}$$

This model is attractive, and indeed does provide a closer fit to the data, including the presence of a cut-off on the maximum degree of a node.

The models described thus far all share a similar set up to the original preferential attachment mechanism, in that they consider a stepwise addition of a single node which connects to a variable number of pre-existing nodes. In recent work by Bianconi et al, this has been generalized to investigate models based upon the addition of simplicial complexes to a network rather than nodes, as described in [20,21]. These models, referred to as Network Geometry with Flavor (NGF), introduce the concept of a d dimensional simplex, which is a fully connected clique of d + 1 nodes. When d = 1 the model reduces to the Bianconi-Barabási model, but higher dimensional simplices are hypothesized to more correctly represent the growth of networks where the unit of addition is a clique, such as a citation network being built from sub networks of frequently collaborating authors. The NGF model proceeds by adding a single node and links, so as to produce a new d dimensional simplex in the graph, by attaching the simplex to a randomly chosen d − 1 dimensional existing face in the graph, governed by a generalized form of equation (5). The attachment probability is further parameterized by a flavor variable s, which can take the values −1, 0, 1 and allows the introduction of a generalized degree which counts the number of d dimensional simplices incident to a node. The range of flavor ensures that the form of attachment probability, which is beyond the scope of this survey to outline, produces a well behaved probability. The survey in [20] has a full and complete overview of the model. The attraction of these models is the generation of a rich set of possible graph geometries, including scale free, Apollonian and a form of graph deeply analogous to the form of graphs proposed in a range of approaches to Quantum Gravity.
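The fitness-weighted attachment probability of the Bianconi-Barabási model, in which node i is chosen with probability proportional to $\eta_i k_i$, can be sketched as follows. The function name, the example degrees and the uniform fitness distribution are assumptions chosen for illustration.

```python
import random

def fitness_attachment_probabilities(degrees, fitness):
    """Return eta_i * k_i / sum_j(eta_j * k_j) for every node."""
    weights = [eta * k for eta, k in zip(fitness, degrees)]
    total = sum(weights)
    return [w / total for w in weights]

rng = random.Random(1)
degrees = [1, 2, 3, 6]
# Quenched fitness values drawn once from a uniform distribution rho(eta).
fitness = [rng.random() for _ in degrees]
probs = fitness_attachment_probabilities(degrees, fitness)
# With equal fitness for every node the rule reduces to plain
# preferential attachment, k_i / sum_j k_j.
uniform = fitness_attachment_probabilities(degrees, [1.0] * len(degrees))
print(probs, uniform)
```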
Together with the competition model of D'Souza these more physically and ecologically inspired models provide motivation to explore other analogies with such processes to improve upon the standard preferential attachment. It would be a significant insight if we could explain the experimental data based solely upon intrinsic properties of the graph such as node degree and local clustering coefficient of a node, with reference to how these relate to fundamental properties such as entropy and constraints.
In the next section we propose an extension, based upon the concept of constraints to the maximum degree of a node. This constraint is motivated from real world concerns in many networks. For example, in communications networks the number of physical connections a node can maintain has a hard limit, and even in social networks building a network of friends is subject to constraints of time and physical space. In Section 4 we show how both constraints and non-linear preferential attachment could arise from a deeper, more fundamental, entropic model.

A Pure Constraint Based Model
A core assumption of the scale free model is that new nodes attach to other nodes with a probability that is determined only by the degree of the target node; no other factors affect $\Pi_i$ and attachment is unconditional. In most networks, though, this is not a fully accurate assumption, as most nodes will have some inherent upper limit on their capability to establish connections. We can imagine a network comprised of nodes capable of maintaining a maximum of c connections, with $c_i(t)$ being the remaining capacity of node i at time t. To simplify the treatment we assume the maximum capacity of all nodes is equal across the network. In this case we could imagine modifying the probability of attachment to account for the nodes' capacity as they accumulate connections, with a multiplicative factor applied to the preferential attachment probability $\Pi_i$. This assumption of uniform maximum capacity is an approximation that we justify by the simplicity of the theoretical analysis it permits. We seek to avoid introducing a family of free parameters, which would equate to a family of constraints, to preserve the theoretical elegance of the treatment. When we come to compare our constrained model to real world data it does require us to make reasonable estimates for the effective average constraint. We assume that this acts as a scaling factor for the attachment probability, similarly to the fitness factor introduced in the Barabási-Bianconi model [6], [16], in essence conditioning the probability of attachment on the probability that the node can accept the connection. In the most general sense, we can write this factor as the ratio of the node's capacity to the time varying average capacity of an arbitrary node, $\langle c(t) \rangle$:

$$\zeta_i = \frac{c_i(t)}{\langle c(t) \rangle}$$

and

$$\Pi^c_i = \zeta_i \Pi_i = \zeta_i \frac{k_i}{\sum_j k_j} \tag{7}$$

To calculate $\langle c(t) \rangle$, we observe that at any time t a given node i will have an expected value of capacity $\langle c_i(t) \rangle = \langle c - k_i(t) \rangle$. As we assume that c is a shared maximum capacity across all nodes this reduces to $\langle c_i(t) \rangle = c - \langle k_i(t) \rangle$, and we note that $\langle k_i(t) \rangle$ is the expected value of a node's degree, $\langle k_i \rangle = \langle k \rangle$, which will be useful in Section 3 when we compare our constrained model against real networks. We can also estimate the expected value of the capacity of a node by assuming a base uniform distribution of attachments in the absence of preference.
After n nodes have been added, we will have added nc capacity to the graph, and consumed 2nm connections.
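In the simplest case for the average capacity of a node, after adding a large number of nodes n, we note that the average capacity must evolve to a constant as follows:

$$\langle c(t) \rangle = \frac{nc - 2nm}{n} = c - 2m \tag{8}$$

Unfortunately, as written this attachment probability is not sufficient, as it does not sum to unity over the nodes of the graph:

$$\sum_i \Pi^c_i \neq 1$$

This can be demonstrated by expanding Equation (7), using $\sum_i k_i = 2mt$ and $\sum_i k_i^2 = t \langle k^2 \rangle$ for a graph of approximately t nodes with mean degree $\langle k \rangle = 2m$:

$$\sum_i \Pi^c_i = \sum_i \frac{(c - k_i)\,k_i}{(c - 2m)\,2mt} = 1 - \frac{\langle k^2 \rangle - \langle k \rangle^2}{2m\,(c - 2m)} \tag{9}$$

If we define δ as

$$\delta = \frac{\langle k^2 \rangle - \langle k \rangle^2}{2m} \tag{10}$$

the normalization sum becomes

$$\sum_i \Pi^c_i = 1 - \frac{\delta}{c - 2m}$$

In general δ could be a function of time and degree, but as an approximation in our model we treat it as a constant of the system. We test that assumption in the simulations presented later in this section, which indicate that it is valid to assume that δ eventually stabilizes to a constant as the network evolves. We run these simulations of network growth to mimic the parameters for a selection of the real network data we analyze. Investigation of models where δ is a function of time (and potentially $k_i$) is a current avenue of research, and the subject of future work. For our attachment probability to be a valid probability measure we need to establish that $\frac{\delta}{c - 2m} \ge 0$ and that $\frac{\delta}{c - 2m} \le 1$. In the first instance the numerator of the correction term in Equation (9), which defines δ in Equation (10), is the variance of $k_i$ across the graph, and so is non-negative. Providing that c > 2m, we can safely assume δ ≥ 0. Regarding the upper limit of δ, we can appeal to Popoviciu's inequality (see [22]) for a bounded distribution, with $k_{max} = c$ and $k_{min} = m$. This states:

$$\langle k^2 \rangle - \langle k \rangle^2 \le \frac{(k_{max} - k_{min})^2}{4} = \frac{(c - m)^2}{4}$$

For times $t > \frac{(c - m)^2}{8m(c - 2m)}$, we then conclude that as required $\frac{\delta}{c - 2m} \le 1$. With these limits established, we can modify the attachment probability by adding in δ to produce a form for the attachment probability which sums to unity at each time step across all nodes:

$$\Pi^c_i = \frac{c + \delta - k_i}{c - 2m}\,\frac{k_i}{\sum_j k_j}$$

For convenience, we can further simplify the expression for $\zeta_i$ as follows:

$$\zeta_i = \frac{c + \delta - k_i}{c - 2m} = \frac{c + \delta}{c - 2m}\left(1 - \frac{k_i}{c + \delta}\right)$$

We can now write the complete probability of attachment as:

$$\Pi^c_i = \frac{c + \delta}{c - 2m}\left(1 - \frac{k_i}{c + \delta}\right)\frac{k_i}{2mt} \tag{13}$$

For comparison with the Barabási-Albert model, defining $\alpha = \frac{c + \delta}{c - 2m}$ with the average capacity c − 2m taken from equation (8), we can rewrite $\Pi^c_i$ as follows:

$$\Pi^c_i = \alpha\left(1 - \frac{k_i}{c + \delta}\right)\frac{k_i}{2mt} \approx \frac{k_i}{2mt}, \text{ for large } c.$$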
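The role of δ in restoring normalization can be checked numerically. The sketch below assumes δ is the variance of the degree divided by the mean degree $\langle k \rangle = 2m$, which is the reading under which the corrected weights $(c + \delta - k_i) k_i$ sum to unity; the function names and the toy degree list are illustrative assumptions.

```python
def delta(degrees):
    """Degree variance divided by the mean degree <k> = 2m (assumed definition)."""
    n = len(degrees)
    mean = sum(degrees) / n
    variance = sum((k - mean) ** 2 for k in degrees) / n
    return variance / mean

def constrained_probabilities(degrees, c, correct=True):
    """Attachment probabilities (c + delta - k_i) k_i / ((c - <k>) sum_j k_j).

    With correct=False the delta term is dropped, giving the unnormalized
    form (c - k_i) k_i / ((c - <k>) sum_j k_j).
    """
    mean = sum(degrees) / len(degrees)
    shift = delta(degrees) if correct else 0.0
    total = sum(degrees)
    return [(c + shift - k) * k / ((c - mean) * total) for k in degrees]

degrees = [2, 3, 3, 4, 8]   # toy degree sequence with mean 4
c = 10                      # shared maximum capacity
uncorrected = sum(constrained_probabilities(degrees, c, correct=False))
corrected = sum(constrained_probabilities(degrees, c))
print(uncorrected, corrected)
```

The uncorrected sum falls short of one by exactly δ/(c − ⟨k⟩), which is the shortfall the δ term is introduced to absorb.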
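This recovers the standard Barabási-Albert model in the case that the constraint c is infinite and therefore does not interfere with the dynamics of the network's evolution. Following the continuum approach, and dropping the explicit time dependency of $k_i$ for clarity, we can substitute this into equation (2), to obtain

$$\frac{\partial k_i}{\partial t} = m\,\Pi^c_i = \frac{\alpha}{2t}\left(k_i - \frac{k_i^2}{c + \delta}\right)$$

with the fraction multiplied out for convenience later. This is directly solvable by separating as follows:

$$\int \frac{(c + \delta)\,dk_i}{k_i\,(c + \delta - k_i)} = \int \frac{\alpha\,dt}{2t}$$

whose solution is:

$$\ln\left(\frac{k_i}{c + \delta - k_i}\right) = \frac{\alpha}{2}\ln t + \text{const}$$

or in simplified form

$$\frac{k_i}{c + \delta - k_i} = A\,t^{\alpha/2}$$

where A is a constant of integration. Following the continuum method in [1] we apply the initial condition that $k_i(t) = m$ at time $t = t_i$, to obtain:

$$k_i(t) = \frac{\rho\,(c + \delta)\left(t/t_i\right)^{\alpha/2}}{1 + \rho\left(t/t_i\right)^{\alpha/2}}, \quad \rho = \frac{m}{c + \delta - m}$$

Again, we note that as c → ∞, ρ(c + δ) → m and α → 1, and so this solution reduces to $k_i(t) = m\left(t/t_i\right)^{1/2}$, the standard result from the continuum analysis of Barabási and Albert [1], [3]. We then note that the probability that a node has degree $k_i(t) < k$ is:

$$P(k_i(t) < k) = P\left(t_i > t\left[\frac{\rho\,(c + \delta - k)}{k}\right]^{2/\alpha}\right)$$

Assuming uniform probability $\frac{1}{m_0 + t}$ for the choice of node introduction time $t_i$, we arrive at the expression:

$$P(k_i(t) < k) = 1 - \frac{t}{m_0 + t}\left[\frac{\rho\,(c + \delta - k)}{k}\right]^{2/\alpha}$$

Although somewhat more complex than the expression in [1], it is nevertheless simple to compute the distribution $P(k) = \frac{\partial P(k_i(t) < k)}{\partial k}$ to obtain the main result of our constrained model:

$$P(k) = \frac{2\,\rho^{2/\alpha}\,(c + \delta)\,t}{\alpha\,(m_0 + t)}\,\frac{(c + \delta - k)^{2/\alpha - 1}}{k^{2/\alpha + 1}} \tag{16}$$

In appendix A we examine the asymptotic behavior of Equation (16), which verifies that by careful manipulation the standard result of the scale free model, γ = 3, is recovered in the limit c → ∞. Further, this analysis also indicates that the dominant contribution to the degree distribution for $k \ll (c + \delta)$ produces a scale free log linearity with power law exponent $\gamma = \frac{2}{\alpha} + 1$. This equivalence to a more straightforward power law, but with an exponent γ < 3 for values of $k \ll (c + \delta)$, indicates that the presence of a constraint influences the behavior of our model even for nodes early in their evolution. This is a significant result and we make use of it to compare the predictions of our theory against real network data and simulations in Section 3.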
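The closed form for the degree trajectory can be sanity-checked numerically. The sketch below assumes the logistic-type solution $k_i(t) = (c+\delta)\,x/(1+x)$ with $x = \rho\,(t/t_i)^{\alpha/2}$, $\rho = m/(c+\delta-m)$ and $\alpha = (c+\delta)/(c-2m)$; the function name and the parameter values are illustrative assumptions.

```python
import math

def degree_trajectory(t, t_i, m, c, delta):
    """Continuum degree of a node added at time t_i under a capacity limit c."""
    alpha = (c + delta) / (c - 2 * m)
    rho = m / (c + delta - m)
    x = rho * (t / t_i) ** (alpha / 2)
    return (c + delta) * x / (1 + x)

m, c, dlt = 3, 50, 5.0
k_start = degree_trajectory(10, 10, m, c, dlt)         # initial condition k = m
k_late = degree_trajectory(1e12, 10, m, c, dlt)        # saturates below c + delta
k_unconstrained = degree_trajectory(1000, 10, m, 1e9, dlt)
ba_value = m * math.sqrt(1000 / 10)                    # m (t / t_i)^(1/2)
print(k_start, k_late, k_unconstrained, ba_value)
```

For a very large constraint the trajectory is indistinguishable from the square-root growth of the unconstrained model, while for a finite constraint it levels off as the node approaches its limit.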
The result in equation (16) has some interesting implications, as the presence of a finite capacity c alters the scale factor for the distribution of the nodes, whilst preserving the essential aspects of scale free behavior. By way of example, the data for the IMDB movie actor database, as presented in Table 1, is plotted in Figure 1b, along with results from a simulation of our model. The movie actor database naturally produces a graph by assigning a vertex for each actor and connecting two vertices when the actors have acted in the same film. Figure 1b contains a theoretical plot of the distribution taken directly from equation (16), using $\langle k \rangle = 127$, c = 900 and with initial conditions of $m_0 = 100$, which we take from Table 1. For this plot we set δ = 205, which we take directly from the simulation discussed in the next paragraph. The unmodified scale free model would give a value of γ of exactly 3, but our modification has an initial value of $\gamma = \frac{2}{\alpha} + 1$, which increases as k → c and reaches a limit when k = c. To calculate γ we can take c = 900 from the dataset in Table 1 and $\langle k \rangle = 127.33$, with the estimated value of δ = 243 (we average the ratio of δ to c), to yield γ = 2.35, versus the measured value of 2.3 in [1] and 2.43 from our simulation. By comparison with the scale free model, our approach predicts the value of γ to within 2.29%, compared to 30.4% for scale free, a significant improvement. In addition, there is no explanation in the scale free model for the degree of a node in the graph having a maximum value.
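The IMDB arithmetic can be reproduced in a few lines, assuming the relations $\alpha = (c + \delta)/(c - \langle k \rangle)$ (with $\langle k \rangle = 2m$) and $\gamma = 2/\alpha + 1$ for the low degree regime. The variable names are illustrative; the input values are those quoted in the text.

```python
# Values quoted in the text for the IMDB movie actor network.
c = 900            # estimated maximum degree
mean_k = 127.33    # expected degree <k>, standing in for 2m
delta = 243        # estimated normalization correction
measured_gamma = 2.3

alpha = (c + delta) / (c - mean_k)
gamma = 2 / alpha + 1

# Relative error of each prediction against the measured exponent.
constrained_error = abs(gamma - measured_gamma) / measured_gamma
scale_free_error = abs(3.0 - measured_gamma) / measured_gamma
print(round(gamma, 2), constrained_error, scale_free_error)
```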
To further verify our model, and in particular the assumption that δ can be effectively treated as a constant, simulations were run using the form of preferential attachment probability in equation (13), for a network sharing the same parameters of maximum degree and average degree as the IMDB network. We present those results in Figure 1a. The simulation was run for a selection of initial parameters to assess the evolution of δ, and in each case the value quickly converges to a constant. Turning to the simulation of degree distribution, in Figure 1b the essential scale free nature of the network obtained is visible on the log scale graph, as is the goodness of fit and agreement between the simulation and a theoretical plot of P(k) using the same simulation parameters. Using the techniques described in [23], we can measure γ, and obtain a value of 2.40 versus a calculated value from equation (16) of 2.41, which is in close agreement.
We also ran simulations for the Patents Citation graph (Figure 1c) and the Web Provider network (Figure 1d), which both produce similarly good results to the IMDB network in terms of the closeness of fit between the simulated and theoretically obtained P(k). We can conclude that the constrained model is a good representation of networks with a simple maximum degree constraint.
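A growth simulation of this kind might be sketched as below. This is not the authors' simulation code: the seed ring, the parameter values, the re-estimation of δ from the current degree sequence at every step, and the clipping of negative weights to zero are all assumptions made for illustration.

```python
import random

def grow_constrained(n_steps, m=2, m0=4, c=20, seed=0):
    """Grow a graph with attachment weights (c + delta - k_i) * k_i."""
    rng = random.Random(seed)
    degree = [2] * m0                 # seed ring of m0 nodes (assumption)
    delta_history = []
    for _ in range(n_steps):
        n = len(degree)
        mean = sum(degree) / n
        # delta taken as degree variance over mean degree (assumed definition).
        delta = sum((k - mean) ** 2 for k in degree) / n / mean
        delta_history.append(delta)
        # Nodes at or beyond c + delta receive zero weight.
        weights = [max(c + delta - k, 0.0) * k for k in degree]
        targets = set()
        while len(targets) < m:       # m distinct existing nodes
            targets.add(rng.choices(range(n), weights=weights)[0])
        for target in targets:
            degree[target] += 1
        degree.append(m)
    return degree, delta_history

degree, delta_history = grow_constrained(500)
# Inspect how delta evolves as the network grows.
print(delta_history[::100], max(degree))
```

Recomputing δ at each step costs time proportional to the number of nodes, which is acceptable for a sketch but would need an incremental update for networks of realistic size.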
Motivated by this example and simulation, in the following section we extend our analysis to a range of more recent, publicly available, network data to investigate further the accuracy of our constrained model.

Data and Methods
In this section we present the analysis of an extensive collection of network datasets comprising virtual, transport, and communications networks. The bulk of this data is publicly available through the Stanford Large Datasets Collection [24], which comprises an excellent repository of large graphs. The Twitter follower data is provided by [25], and the rest of the datasets are reproduced from publications such as [1] and the Internet Topology Zoo [26]. We have one proprietary graph built from the topology taken from a large commercial deployment of network infrastructure used to deliver a top 10 Internet portal service (see [8]). The produced graphs fall into the following categories:
- Social Networks.
Analysis of the data was undertaken using a program and graph datastore which is available from the authors on request. The source data was often very large (the Twitter data, for example, contains over 10 million edges), and extracting values for the maximum degree and $\langle k \rangle$ is not necessarily straightforward. Some of the data had some extreme outliers in terms of node degree, and to avoid skewing the results we estimated the constraint at the 99th percentile of k rather than the maximum value in the data. This is consistent with the methodology taken in the theoretical analysis, where we made an assumption of the node degree constraint being constant for all nodes. This is a simplification, but one with great benefit in the analytical treatment of the model. The elimination of outliers at first sight may seem inconsistent with the assumption of a single constraint in the capacity of a node, but it is expected that the real world data will contain perhaps many different constraints, and that the average behavior of the graph will be most influenced by the effective maximum established at the 99th percentile. Further, the data above the 99th percentile in k is typically very sparse and may contain spurious data points, which this cut off eliminates. In Figure 2 we present the variation of the calculated value of γ with the choice of percentile at which to choose c. The range of calculated values as we move from the 98.2th to the 100th percentile is 2.20 to 2.69, a range of ±9% either side of the chosen value of c = 41. We believe this further strengthens our choice of the 99th percentile as the appropriate cut off for measuring c.
For ⟨k⟩ we require the expected value of the degree. This was calculated by computing the weighted mean, a discrete approximation of ⟨k⟩, which is strictly valid only if k is a continuous variable. This is consistent with the approximation of continuity inherent in the continuum analysis approach.
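The two measurements described above can be sketched in a few lines. This is a minimal illustration, not the authors' program: the helper name `degree_summary`, the nearest-rank percentile convention, and the toy degree sequence are all assumptions made here.

```python
import math
from collections import Counter


def degree_summary(degrees, percentile=99.0):
    """Return (c, mean_k) for a list of node degrees.

    c is the nearest-rank percentile of the degree sequence (an assumed
    percentile convention), and mean_k is the mean degree computed as a
    weighted mean over the empirical degree distribution P(k).
    """
    if not degrees:
        raise ValueError("degree sequence is empty")
    n = len(degrees)
    ordered = sorted(degrees)
    # Nearest rank: smallest value with at least `percentile` % of the data at or below it.
    rank = max(1, math.ceil(percentile * n / 100.0))
    c = ordered[rank - 1]
    # Weighted mean: sum over k of k * P(k), with P(k) = count(k) / n.
    counts = Counter(degrees)
    mean_k = sum(k * count / n for k, count in counts.items())
    return c, mean_k


# Hypothetical toy degree sequence with one extreme outlier.
toy_degrees = [1] * 60 + [2] * 25 + [5] * 10 + [12] * 4 + [500]
c, mean_k = degree_summary(toy_degrees)
print(c, mean_k)
```

In this toy sequence the single node of degree 500 sets the maximum, while the percentile estimate of c ignores it, which is the effect the percentile cut off is meant to achieve.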
To compare against the actual value of the power law exponent γ, we followed the techniques outlined in [23] both to assess the presence of a scale free distribution and to obtain the value of γ. For the datasets we analyzed, which can be seen in Figures 3, 4 and 5, a considerable portion of the distribution follows a well defined straight line on the log/log plots, illustrating the intrinsic power law distribution of node degree. We capture the measured values of these power law exponents in Table 2.
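As an illustration of how an exponent can be extracted from data, the sketch below uses the widely used maximum-likelihood estimator for a power law tail. Whether [23] uses exactly this estimator is an assumption; the function name `estimate_gamma`, the cut off `k_min`, and the synthetic samples are hypothetical and introduced only for this example.

```python
import math
import random


def estimate_gamma(values, k_min, discrete=False):
    """Maximum-likelihood estimate of gamma for the tail values >= k_min.

    Uses gamma = 1 + n / sum(ln(x / x0)), where x0 = k_min for continuous
    data and x0 = k_min - 0.5 as the usual approximation for integer data.
    """
    tail = [v for v in values if v >= k_min]
    if not tail:
        raise ValueError("no observations at or above k_min")
    x0 = k_min - 0.5 if discrete else k_min
    log_sum = sum(math.log(v / x0) for v in tail)
    if log_sum <= 0:
        raise ValueError("tail has no spread above k_min")
    return 1.0 + len(tail) / log_sum


# Synthetic check: draw continuous power-law samples with a known exponent
# by inverse-transform sampling, then recover the exponent.
rng = random.Random(0)
true_gamma, x_min = 2.5, 1.0
samples = [x_min * (1.0 - rng.random()) ** (-1.0 / (true_gamma - 1.0))
           for _ in range(20000)]
gamma_hat = estimate_gamma(samples, k_min=x_min)
print(round(gamma_hat, 2))
```

For real degree data one would pass `discrete=True` and choose `k_min` so that only the straight-line portion of the log/log plot enters the fit.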

Analysis
In the summary Table 2 it is compelling to note that in all but a few cases the constrained model is more accurate in its predictions of γ than the standard scale free model. Indeed, in the case of the Patent Citation, Internet Topology Zoo and Pokec graphs, the real world network from a Web Provider, and a number of the citation and social networks, it comes very close to an exact prediction. Given that the motivation to investigate the constrained model originated from considerations of network design in communications networks, it is interesting to see that it has strong applicability to non-physical networks.

We present the analysis as a collection of log/log distribution graphs in Figures 3, 4 and 5, and summarize the key prediction of γ against the standard value of 3.0 from preferential attachment in Table 2. In the log/log plots we overlay the value of c at the 99th percentile, the average value of γ to this constraint, and the expected value of the node degree ⟨k⟩. In each of Figures 3, 4 and 5 we also overlay the theoretical prediction for the distribution P(k) obtained by substituting the values of γ from Table 2 into Equation (16). The agreement between the predicted values of γ and the measured ones for our datasets is evident from these combined theoretical and experimental plots, at least for portions of the distribution. A consequence of selecting c at the 99th percentile is that our theoretical curve displays a cut off earlier than the experimental data, which is to be expected. The striking feature of many of the degree distributions is the absence of strict linearity, contrary to the predictions of the standard scale free model, together with the marked increase in γ at high values of k, a key prediction of our constrained model and a necessary precursor to a hard constraint on the value of k. In the social network data we analyzed this is best illustrated in Figures 3a, 3b and 3c.
Similar behavior is also present in the citation network (perhaps the best example being Figure 4d), and again in the infrastructure graphs, particularly the Internet Topology Zoo (Figure 5a). It is interesting to speculate on the nature of the constraint in the social networks, but it is perhaps explained by the effective limitations, no matter how small, on the amount of time people can feasibly spend on social networking platforms. Indeed, in almost every conceivable network a constraint is a natural feature. Whether the node in the graph is a physical device, an individual engaged in an activity such as writing papers, or a web site with hyperlinks, there is a limitation to the connections a node can have. In some cases these are hard design limits such as the ports on a network switch; in others it is simply the capacity of a human being, with a fixed lifespan, to blog, interact, star in a movie or engage in any other social activity. In every case our experimental data bears this out.
In the following Section 4 we point out how the two models may well be related to a fundamental dynamical principle that arises from thermodynamic considerations of network evolution. Critically, this analysis derives the form of preferential attachment presented as an axiom in the scale free model.

Dynamical Evolution of Scale Freedom
In our treatment thus far we have followed the continuum model of Barabási-Albert with the addition of a constraint-based factor to the attachment probability. However, we can attack the problem from a more fundamental viewpoint. Essentially, we argue that the evolution of a graph satisfies the criteria for a treatment based upon considerations of entropy from a statistical mechanics perspective, in accordance with the second law of thermodynamics.
In any isolated physical system the entropy of the system will tend to a maximum unless energy is input to prevent that. For a classic treatment see [32]. In natural processes this tendency to increase entropy can be modeled as a macroscopic force on the system. This entropic force is responsible for both the elasticity of certain polymers and the biological process of osmosis. Indeed, if thermodynamic temperature is written as T and entropy as S, one can state the entropic force F acting on a body when a process changes entropy as follows:

To begin our treatment of graph evolution from fundamental thermodynamic principles, it suffices to pose the problem in an appropriate manner. Consider an existing graph of $m_0$ nodes and $e_0$ edges in thermal equilibrium with an infinite supply of unattached nodes, each capable of connecting to m nodes in the event that it comes into contact with the existing graph. At every time-step we imagine that such an interaction occurs and the new node connects to m others. Our problem is to identify the probability of attachment for a node according to its degree k, and thus derive the degree distribution.

Fig. 4: Degree Distributions from Collaboration and Citation Networks on a Logarithmic Scale. Panel (e): Arxiv High Energy Physics Collaboration Network [27]

More strictly, it is necessary to consider an ensemble of all possible graph configurations, at every time step, to enable statistical treatment of this process. This requirement to consider an ensemble of configurations is at first sight an added complication, but is in fact critical in permitting the analysis of the model. Whenever we consider a randomly selected node, for example in equation (18), it is important to recognize that we must average any interaction with the remaining graph over all possible graphs that can be constructed from the subgraph obtained by removing the randomly selected node and all edges connected to it. This ensemble average is further constrained by the total number of vertices and edges being unchanged after the removal of the random node. This requirement to average over all possible graph configurations at each time step justifies the approximation we make to calculate, for example, the average clustering coefficient.

The probability of attachment to a random node must statistically and universally seek to maximize total entropy. Our model proposes that the probability of this random node acquiring new links is a result of the relative strength of the entropic force of attachment to the randomly chosen node versus any other node in the graph. Those nodes which exert the highest entropic force relative to the rest of the nodes in the network will gain the most links, and we write this mathematically as:

where $F(v_i)$ is the entropic force of attraction to node i. This expression governs the individual interaction that our randomly selected node has with a particular graph configuration, analogous to the elastic collision equations used to formulate the statistical treatment of ideal gases. In a similar way, we cannot easily formulate the dynamical equations of the graph analytically from this equation, as they are very large, and so to derive the degree evolution equations from this formulation we utilize statistical ensemble arguments.

Considering all possible configurations of the graph G(V(t), E(t)) at a fixed time t, the denominator of equation (18) is computed as an expectation value of the relative force of attaching to any other node, across all possible graphs at time t in the ensemble that our random node could be connected to. At a given time t in the evolution of the graph the numbers of vertices |V(t)| and edges |E(t)| are constant, but we do have to consider all possible graph configurations of that number of vertices and edges. This will ultimately change the average of the change in entropy that the node could make on connecting to any node in the graph other than our randomly selected node $v_i$. In this way we collapse the denominator to the expected value of this entropy change, averaged across all possible connection points in all possible members of the ensemble. We write this as T × |V| × E(∆S).

As the graph becomes larger, we make the assumption that the value of |V| × E(∆S) is effectively constant, and factor this out. We base this assumption on the fact that most real world networks do indeed demonstrate some form of steep drop in the distribution of node degrees, so that the vast majority of nodes possess low degree (an important claim of [4] and [1]). It seems reasonable to assume that with such a restricted degree sequence most nodes will contribute a similar amount to the change in entropy, and this expected value will stabilize to a constant. More complex analysis could admit a time varying value of this constant, as strictly both |V| and E(∆S) may have complex time dependence, but for simplicity we assume:
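The chain of reasoning in this passage can be summarized symbolically. The following is a hedged sketch inferred from the surrounding prose, not a reproduction of the typeset equations: it assumes the entropic force takes the form $F = T\Delta S$, and the symbol $\Pi_i$ for the attachment probability is chosen here purely for illustration.

```latex
\begin{align}
F(v_i) &= T\,\Delta S_i, \\
\Pi_i &= \frac{F(v_i)}{\sum_{j} F(v_j)}, \\
\sum_{j} F(v_j) &\approx T \times |V| \times E(\Delta S)
\quad\Longrightarrow\quad
\Pi_i \approx \frac{\Delta S_i}{|V| \times E(\Delta S)} .
\end{align}
```

Under the stated assumption that $|V| \times E(\Delta S)$ is effectively constant, the temperature cancels and the attachment probability becomes proportional to the entropy change $\Delta S_i$ produced by attaching to node $v_i$.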
With this assumption equation (18) simplifies and T factors out to yield

In general $S_i$ is a function of potentially many variables $x_j$, but certainly depends upon $k_i$ and time t. We can calculate $\Delta S_i$ as a total differential, $\Delta S_i(x_j) = \sum_{x_j} \frac{\partial S_i}{\partial x_j} \Delta x_j$, but we can assume for simplicity that t is fixed and the dependence is purely upon $k_i$. In this case $\Delta S_i = \frac{dS_i}{dk_i} \times \Delta k_i$, with, for a single time step, $\Delta k_i = 2m$. This gives us our expression for attachment probability:

To make use of equation (20) we require an expression for the entropy of a node in the graph. The subject of the entropy of a graph has a long history, originating in the work of Körner on the informational entropy of signals described in [33] and [34]. Many approaches to calculating the entropy of a graph have been proposed, including the use of the eigenvalues of the adjacency matrix (see [35]) and ensembles of networks with similar degree sequences (proposed in [36]). Unfortunately these concepts relate to the global value of entropy for a graph, and do not have utility when calculating the change in entropy as a new node connects. A series of papers by Dehmer ([37], [38]) formalized the concept of the individual entropy of a node. In recent work [8] we built upon this formulation to define a local vertex measure (referred to in [8] as NVE, and equivalent to our definition of $S_i$ here) in terms of its relative degree as:

where $C_i^1$ represents a modified clustering coefficient of the 1-hop neighborhood of the node $v_i$. Contrary to the more common point-deleted neighborhood clustering coefficient, $C_i^1$ preserves the node in the calculation to measure similarity to the local perfect graph $K_n$ of order $n = k_i + 1$. For convenience we give an explicit definition of the 1-hop neighborhood $N_i^1$:

and the related '1-edges' $E_i^1$ as

We can then define the modified clustering coefficient to be

At this point we can make use of the fact that we must consider all possible intermediate graph configurations to assume effective uniformity in the graph to calculate $|E_i^1|$, and assert that for a given node,

This then yields for the clustering coefficient the following expression:

Given that at every time-step we add one node to the graph, connecting to m other nodes, we can write $|V| = m_0 + t$ and $|E| = e_0 + mt$. In general, as the model evolves, $t \gg m_0$ and similarly $mt \gg e_0$, so these simplify to $|V| = t$ and $|E| = mt$. Substituting back in we obtain the following equation for vertex entropy at $v_i$ at time t:

In the analysis undertaken by Tee et al. in [8,9], this quantity was identified as sharing some of the properties of the structural entropy of the graph when summed across all vertices. In particular, the extremal behavior of the summed vertex entropy was proven to be minimized by the perfect graph of order n, $K_n$, and maximized by the star graph of order n, $S_n$, for simply connected undirected graphs. From the perspective of dynamical evolution of networks, this is consistent with the approach in our analysis. The perfect graph $K_n$ will tend towards a more node level disordered graph such as $S_n$, as the addition of nodes selects targets so as to increase the value of $S_i$ in Equation (24). From a purely statistical mechanics perspective one can consider each connected graph on n nodes and |E| edges as representing a micro-state. The perfect graph is achievable in precisely one unique configuration if edges are indistinguishable, whereas other configurations, $S_n$ for example, can be achieved by selecting any one of the nodes as the hub vertex. In this way the result that increases in entropy tend to destroy cliques and regular ordered graphs is consistent. From this perspective we would expect dynamic processes to favor attachment to nodes where the increase in $S_i$ is greatest.

From here it is straightforward to follow through the continuum analysis as described in [1]. For the time evolution of k the following equation is obtained:

Although at first sight this nonlinear ODE appears intractable, in fact an analytic solution is available. Making the change of variables y = log k and x = log t, so that

This is now a linear ODE which can be solved by standard methods. Applying the initial condition $k_i(t_i) = m$, the solution is found to be most conveniently expressed in the form

For values of the model's free parameter less than 1, the behavior of $k_i(t)$ is similar to the Barabási-Albert model: degrees increase monotonically but at an ever decreasing rate. An analytic form for the degree distribution, analogous to (3), does not seem straightforward to derive. Figure 6 compares numerically computed degree distributions from the model (26) (shown in Figure 6a) and the Barabási-Albert model (shown in Figure 6b). In each case a new node was added to the network every 0.5 time units, setting m = 5 and growing the degrees of existing nodes according to (26) or (3) respectively. Degree distributions are plotted for fixed end times $t_{end}$, taking the values 3 × 10^2, 10^3, 3 × 10^3, 10^4, and 3 × 10^4. The degree distributions for the entropy-based model do not clearly follow any power law behavior, at least in the regime explored here, while the Barabási-Albert model quickly assumes a form very close to a power law degree distribution with exponent γ = 3, as we expect.

While any systematic analysis of (26) seems difficult, for large enough networks we might expect that this model is comparable to the classes of sub-linear preferential attachment models studied rigorously by Dereich & Mörters [39,40]. These authors prove that preferential attachment rules based on concave functions of node degree will asymptotically result in degree distributions with exponent γ = 3. This suggests that the long time dynamics of the entropy-based model might also show this behavior, but at intermediate times the more complex distributions illustrated in Figure 6a might well be more typical.
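The Barabási-Albert half of the numerical comparison described above can be sketched as follows. This is a minimal illustration and not the authors' simulation code: it assumes that equation (3) is the standard continuum growth law $k_i(t) = m\sqrt{t/t_i}$, and the function names are hypothetical. The entropy-based solution (26) is not implemented here because its explicit form is not available in this text.

```python
import math


def ba_continuum_degrees(t_end, m=5, dt=0.5):
    """Degrees at time t_end when one node is introduced every dt time units.

    Each node starts with degree m at its introduction time t_i and then grows
    as k_i(t) = m * sqrt(t / t_i), the standard Barabasi-Albert continuum result.
    """
    n_nodes = int(round(t_end / dt))
    return [m * math.sqrt(t_end / (dt * j)) for j in range(1, n_nodes + 1)]


def tail_fraction(degrees, k):
    """Fraction of nodes with degree strictly greater than k."""
    return sum(1 for d in degrees if d > k) / len(degrees)


degrees = ba_continuum_degrees(t_end=1000.0)
# For a density P(k) ~ k**-3 the tail fraction falls as (m / k)**2.
for k in (10, 20, 40):
    print(k, tail_fraction(degrees, k), (5 / k) ** 2)
```

Comparing the measured tail fraction with $(m/k)^2$ is a simple way to confirm the exponent γ = 3 without binning the distribution.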

Conclusion and Future Directions
In Section 2 we introduced a modification to the preferential attachment model to account for the maximum number of connections a node may have in a network. From the mathematical analysis we were able to predict both the value of the power law exponent γ and the presence of a hard limit on the degree distribution. In Section 3 we applied the analysis to an extensive range of social, citation and physical infrastructure graphs, and found that the constraint model's values for γ fitted the data more accurately. In addition, the constrained model implicitly contains a hard limit on the node degree, and the data analyzed had degree distributions with far fewer nodes of extremely large k than a pure power law would predict. This is an important result because the value is arrived at as a natural consequence of the presence of constraints on the maximum node degree, rather than by introducing a distribution of additional parameters such as in the fitness model. Fitness is a valuable concept, and in further work we intend to investigate the role of a top constraint in a model extended to include the concept of fitness, or indeed generalized in a similar way to the NGF models. In particular, the analogy with Bose-Einstein statistical mechanics is interesting, and opens up many applications of network science in more general theoretical physics, but the method outlined in this paper captures the essential features of real degree distributions without requiring the concept of fitness.

Motivated by the interesting results obtained when applying concepts from statistical mechanics, and the results for vertex entropy arrived at in [8], we also set out to see if scale free models could be arrived at from pure thermodynamic principles of entropic force. In Section 4 we were able to obtain, from first principles, an evolution equation for the degree of a random node which, although soluble analytically, presents challenges when deriving the degree distributions according to the continuum analysis.

The Taylor series for log(x) converges only for values of x in the range 0 < x ≤ 2, but as k ≤ 2mt, and both terms are always strictly positive, we can safely expand the log term in equation (25). This expansion is not valid for $k \ll 2mt$, as the series for log(x) converges very slowly as x → 0. However, at early times after the introduction of the node into the graph, $\frac{k}{2mt}$ will be closer to 1 and we can expand the log to yield: $\frac{k}{2mt} - 1$ + higher order terms.
For the same period of time that this expression is valid, we can see that the leading terms in this expansion contribute the following to the ODE for the time evolution of k:

What can be asserted is that for a period of time after a node is introduced into the network its behavior will be governed by the first terms in this expansion, with much more complex behavior as the network evolves. This is illustrated nicely in Figure 6, obtained from our numerical simulations. The first two terms in the expansion comprise a term identical in form to the evolution of k with time in the Barabási-Albert model, and a correction identical in form to that of our constrained model. This would indicate that for small t the behavior of the entropic model should closely resemble scale free, with a correction for constraints. As t increases the model will become more complex.
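The convergence argument for the logarithm series can be checked numerically. The sketch below is illustrative only: `log_series` is a hypothetical helper, and the two values of x, standing for the ratio k/(2mt), are chosen simply to contrast a ratio near 1 with a ratio near 0.

```python
import math


def log_series(x, n_terms):
    """Partial sum of the Taylor series of log(x) about x = 1.

    log(x) = sum_{n>=1} (-1)**(n + 1) * (x - 1)**n / n, valid for 0 < x <= 2.
    """
    if not 0.0 < x <= 2.0:
        raise ValueError("series converges only for 0 < x <= 2")
    return sum((-1) ** (n + 1) * (x - 1.0) ** n / n
               for n in range(1, n_terms + 1))


# x stands for the ratio k / (2mt); the two values below are illustrative.
for x in (0.9, 0.05):
    for n_terms in (1, 5, 50):
        error = abs(log_series(x, n_terms) - math.log(x))
        print(f"x={x}, terms={n_terms}, abs error={error:.2e}")
```

Near x = 1 a handful of terms already reproduces the logarithm closely, whereas near x = 0 the truncation error after the same number of terms remains of order one, which is why the leading-term approximation is restricted to early times.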
The model introduces a free parameter, and it is a legitimate question to ask what its correct value should be. In the numerical simulations we chose, for illustrative purposes, a value of 0.1. The choice of this parameter will have a profound effect on the family of graphs that can emerge from the initial conditions, and in particular on the slope of the power law degree distribution obtained. For example, values greater than 1 will tend to generate power laws with γ < 3, and conversely values less than 1 will produce γ > 3, at least in the regime where the first term of equation (27) dominates. Given that the origin of the parameter is in the relative entropic force of the graph compared to a randomly picked node of degree k, one could speculate that its value measures the relative effect of an additional link on the bulk of the graph in increasing entropy, compared to an individual node of varying degree. High values perhaps indicate relatively more homogeneous graphs than low values, indicating that degree distributions drop off more slowly the more ordered a graph's initial state. In future work we intend to investigate the dependency of graph evolution on this parameter in more detail, and whether the more complex evolution behavior of our dynamic model has utility in revealing more detail on the internal structure of dynamically evolving graphs.
We believe that there is a deep connection between vertex entropy and the evolution of networks. An attractive feature of our model is that it predicts scale free and more complex network evolution behavior from a first principles argument, without appeal to any heuristics, node by node parameters, or indeed a stated but not justified property of nodes to seek out other high degree nodes with which to preferentially attach. Instead we argue from the safety of the second law of thermodynamics to a model which reproduces the essential features of scale freedom, and also of the constrained model, which we demonstrated provides a better fit to the experimental data. It is possible that higher terms in the expansion of equation (25) could yield insight into the detailed evolution of networks, and provide powerful analytical tools, for example to determine the age of a network. Nevertheless, it is attractive to speculate that scale freedom, and similar models, may be a manifestation of the second law of thermodynamics as applied to graph evolution.
Beyond investigating the entropic model, there are many potential enhancements to the constrained model. In further work we intend to analyze more network datasets and also investigate corrections to the constrained model to improve our estimate of (c − 2m) or (c − ⟨k⟩) for the average occupancy of a node, by iterating the resultant distribution in equation (16) to calculate ⟨k⟩ as $\langle k \rangle = \int_{-\infty}^{+\infty} k P(k)\,dk$.

Note that, as c > 2m by definition, α ≥ 1, with equality in the limit that c → ∞. This yields a range for the power law exponent γ of 1 ≤ γ ≤ 3, with the familiar result of γ = 3 recovered in the case of the constraint being infinite, and therefore unimportant to the dynamics of the network growth. We can also examine Equation (28) in the asymptotic limit of c → ∞. We recall that $\rho = \frac{m}{c+\delta-m}$ and that $\alpha = \frac{c+\delta}{c-2m}$. In the limit c → ∞, α = 1, which reduces Equation (28) to:

which, multiplying out and allowing c → ∞, gives

As expected, this is precisely the form of the degree distribution in the standard preferential attachment model, which emerges as the constraint becomes infinite, and therefore unimportant in the dynamical growth of the network.
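The limiting behavior described above can be illustrated numerically. The sketch below uses the expression for α recalled in the text, with the mean degree ⟨k⟩ standing in for 2m. The relation γ = 1 + 2/α is an assumption made here: it is inferred from the stated range 1 ≤ γ ≤ 3 and the limit γ = 3 at α = 1, and is not quoted from this section. The mean degree of 10 is hypothetical.

```python
def alpha(c, mean_degree, delta=0.0):
    """alpha = (c + delta) / (c - 2m), with the mean degree standing in for 2m."""
    if c <= mean_degree:
        raise ValueError("the constraint c must exceed the mean degree 2m")
    return (c + delta) / (c - mean_degree)


def predicted_gamma(c, mean_degree, delta=0.0):
    """Assumed relation gamma = 1 + 2 / alpha (inferred, not quoted from the text)."""
    return 1.0 + 2.0 / alpha(c, mean_degree, delta)


# Hypothetical mean degree of 10; c = 41 is the percentile value quoted for Figure 2.
for c in (41, 100, 1000, 10**6):
    print(c, round(alpha(c, 10.0), 4), round(predicted_gamma(c, 10.0), 4))
```

As c grows, α decreases towards 1 and the predicted exponent approaches the preferential attachment value of 3, in line with the asymptotic argument.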
2m) over 50,000 Iterations in a Simulation of Constrained Attachment
Simulation and Theoretical Degree Distribution using Equation (13) and IMDB Parameters at t = 50,000
Simulation and Theoretical Degree Distribution using Equation (13) and Patents Parameters at t = 50,000
Simulation and Theoretical Degree Distribution using Equation (13) and Web Provider Parameters at t = 50,000

Fig. 2: Variation of Calculated Values of γ with Choice of Percentile for c for the Patents Graph

Fig. 3: Degree Distributions from Social Networking and Web Networks on a Logarithmic Scale. Panel title: Pokec, Slovakian Social Network Friendship Graph, Theoretical and Experimental [28]

Fig. 4 panel titles: Arxiv Condensed Matter Citation Network, Theoretical and Experimental [27]; Arxiv Astro-Physics Citation Network [27]; Arxiv High Energy Physics Citation Network [27]

Fig. 5: Degree Distributions from Infrastructure and Communications Networks on a Logarithmic Scale

Table 1: Degree Distribution Parameters of some Real Networks [1]

These include Twitter, Facebook and Pokec graphs of the relationships between users. Typically each user is a node, and nodes have links if the users have some form of relationship with each other. For example, in the case of Twitter this relationship derives from one user 'following' another.

- Collaboration and Citation Networks. These cover a wide range of publicly available data, including the Arxiv citation, Patent Citation and co-authorship graphs as examples. Graphs are constructed by creating a vertex for each unique user or paper and then connecting the vertices if they share authorship with another vertex or directly cite it.

Table 2: Comparison of γ Predictions Between Preferential Attachment and Constraints Model