Taming vagueness: the philosophy of network science

In the last 20 years network science has become an independent scientific field. We argue that by building network models network scientists are able to tame the vagueness of propositions about complex systems and networks, that is, to make these propositions precise. This makes it possible to study important vague properties such as modularity, near-decomposability, scale-freeness or being a small world. Using an epistemic model of network science, we systematically analyse the specific nature of network models and the logic behind the taming mechanism.


Introduction
In the last years of the twentieth century network theory turned into an independent scientific field. Watts and Strogatz, approaching from nonlinear dynamics, and Albert and Barabási, approaching from statistical physics, came to the same conclusion: the behavior of certain complex systems, such as the human brain, the Internet or biological or molecular structures, is fundamentally determined by their network properties. In the limelight of modern network science stood some simple, elegant network models that implied hidden universal laws behind the formation of real-world networks.
Network science did not come out of the blue. From Jacob Moreno's sociometry to Mark Granovetter's "strength of weak ties" theory, networks appeared in social science in the form of social network analysis. Several crucial notions of modern network science, from betweenness centrality to the small world property, are deeply rooted [...]

Gábor Elek (g.elek@lancaster.ac.uk), Eszter Babarczy (ebabarczy@mome.hu)

[...] the network. To continue Kostic's train of thought, the identification of the underlying network "abstracts away from the details" of a particular fundamental unit or the interconnectivity units. We call this part of the network representation the topological representation.
Note however, that Craver (2016) raised a slight objection to the definition above, by pointing out the difference between a living and a dead brain represented by exactly the same network structure. In the frequently quoted editorial (Brandes et al., 2013, p. 4) of the first issue of the scientific journal "Network Science", the authors made clear that network representations entail not only the structural relations of the system, but also the network abstraction of the system phenomenon in question such as the spreading of a disease. We call the abstraction of the system phenomenon the functional representation (see Sect. 3). Hence, the network representation consists of a topological representation and a functional representation. This view is in harmony with the mechanistic model concept of Craver (2006).
Nevertheless, we must add that the idiosyncratic characteristics of the mechanisms investigated by network scientists differ somewhat from the typical features of the mechanisms which are the main interest of the new mechanism school of the philosophy of science. In network science, the mechanisms usually consist of a large number of entities identified with the fundamental units and, similarly, an abundance of basic, low-level activities related to the interconnectivity units of the complex system. In other words, the complexity of the mechanisms studied in network science lies in the size and interconnectedness of the system, not in the fine spatio-temporal structure of a small web of highly structured entities and activities that constitutes the mechanisms typically studied by philosophers of neuroscience or biology. The phenomenon under study is, as Huneman (2010) put it, "a feature, a trait, a property or an outcome of the system" (e.g. the fast spread of viruses or gossip, vulnerability/robustness under attack) and not some sort of complex, higher-level activity (e.g. the Krebs cycle or the neuronal action potential mechanism).
According to the classical school of network science (Watts, Strogatz, Barabási, Newman) the epistemic value of network science lies in the explanatory power of certain crucial network properties. This view is reflected in the structural explanation thesis. Huneman observes that certain graph properties of the topological representations can function as constraints that instantiate the above features, traits or properties. Hence, they can provide a purely mathematical/structural explanation of them.
The crucial graph property constraint in the network explanations of Huneman (2010, 2018) is the small world property. Note, however, that in Huneman (2010) it is quite clear that by graph property Huneman means features that classify graphs into equivalence classes: graphs which possess the given property and graphs which do not. On the other hand, it is important to observe that small worldness is a vague property. A network can be more or less small worldly. The archetypal small world networks of Watts and Strogatz (1998) can, for a very small rewiring parameter, have a much larger average path length than a 20-by-20 regular lattice, which frequently illustrates the lack of the small world property. In this case, the small world phenomenon becomes visible only when we consider a rather large number of nodes. Vagueness is also characteristic of some other classical constraints of network science (scale-freeness, non-decomposability, near-decomposability), as well as of the features mentioned by Huneman (2018), such as the fast spread of epidemics or stability in the face of random extinction. Network science is inherently related to the phenomenon of vagueness and consequently to the understanding and "taming" of vague properties.
However, as Brandes et al. claim in the paper cited above: "Network science is the study of network models." In the case of systems of astronomical size, the actual topological representation is substituted by cleverly constructed network models. These models are built in such a way that they satisfy some essential constraints of the original topological representation.

The nomothetist and the idiographer
The relatively simple definitions of network representations and network models make it possible for us to create an epistemic model of network science. In this model two scientific agents try to explain a phenomenon (feature, trait, etc.) related to a mechanism of a complex system (the System).
Such division into roles in network science has also been studied by Jacomy (2020) under the name of nomothetic and idiographic subcultures (the terms were originally coined by the turn of the century Neo-Kantian philosopher Wilhelm Windelband). Hence, we call the two agents the Nomothetist and the Idiographer.
The Nomothetist studies a concrete entity, the Network (the topological realization of the System), and she is much more concerned with the general than the particular. As we pointed out in the previous section, in our epistemic model the System is considered to be enormously large, so the scientific agents have only limited access to it. The exact structure of the Network (a database of ordered pairs of numbers, where (i, j) is in the database if the i-th node of the Network is linked to the j-th node) is not accessible to them. Thus, the Nomothetist studies the Network by statistical measurements, and based on her findings she builds a Network Model (see below).
On the other hand, the Idiographer prefers meticulous descriptions of the idiosyncrasies of the System over generalities. In social network analysis the role of the Idiographer had clear priority over that of the Nomothetist,¹ while the theory-driven nature of modern network science gave a prominent role to the Nomothetist. In our epistemic model (which is slightly tilted towards the Nomothetist's point of view) the Idiographer's role seems to be secondary; nevertheless, in Sect. 9 we look at some important notions of network science from the Idiographer's point of view.

The nomothetist and the network model
In our epistemic model the Nomothetist's most important goal is the construction of the Network Model. Based on information about the System and the Network, the Nomothetist sets up certain constraints satisfied by the Network and constructs a computer algorithm that outputs the Network Model (see Sect. 6 for a description of network models and their properties). The constraints can be:
• Statistical parameters of the Network, such as the degree distribution (see below), that the Nomothetist can learn from measurement.
• Some intuitive and rather coarsely formulated dynamical constraints that reflect some intuition about the dynamical formation/evolution of the Network (see Sect. 8).
Using the Network Model the Nomothetist can produce reasonably small, easier to study toy-models of the Network, that is, networks that share the constraints with the Network.

How can the nomothetist measure the network?
The most intriguing promise of network science is the understanding of complex networks of astronomical size. Hence, in our epistemic model we assume that the System is so large that the Network's size (that is, the number of nodes) is not accessible to the Nomothetist. This assumption is in line with the size complexity characterization of complex systems given by Herbert Simon (1976). If the Nomothetist does not know the true size of the Network, she must choose a measurement strategy that does not depend on the size. This seems somewhat controversial at first sight, since it means that the Nomothetist can access only a negligibly small part of the nodes of the Network in the measurement process. Note, however, that this small part will be a random sample. Suppose that the Nomothetist is interested in the proportion of leaves in the network. A leaf is a node that has only one neighbour. So, the Nomothetist wants to measure the parameter $p(N) = l(N)/|N|$, where $l(N)$ is the number of leaves and $|N|$ is the size of the Network. Neither $l(N)$ nor $|N|$ is accessible to the Nomothetist. However, she picks a random sample of one thousand nodes from the Network and calculates the proportion of leaves in the sample. With high probability this number will be very close to $p(N)$, even if the size of the Network is much more than 1000, say one trillion. She learns $p(N)$ by statistical inference. If the Nomothetist wishes to know the number $p(N)$ with error plus or minus 0.01 and confidence 99.99%, the classical laws of statistics tell her how large a sample she must choose. This sample size is independent of the number of nodes, that is, of the size of the Network. Therefore we can say that the proportion of leaves, a typical example of a statistical parameter, is epistemically accessible to the Nomothetist: she can use this parameter in the construction of the Network Model.
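The independence of the sample size from the size of the Network can be sketched as follows. This is only an illustration under stated assumptions: Hoeffding's inequality stands in for the "classical laws of statistics", and the function names, the node-sampling oracle and the toy degree oracle are hypothetical and not part of the epistemic model.

```python
import math
import random

def hoeffding_sample_size(error, confidence):
    """Smallest n with 2*exp(-2*n*error**2) <= 1 - confidence (Hoeffding bound)."""
    delta = 1.0 - confidence
    return math.ceil(math.log(2.0 / delta) / (2.0 * error ** 2))

def estimate_leaf_proportion(sample_node, degree_of, sample_size):
    """Estimate p(N) = l(N)/|N| from uniformly sampled nodes.

    The estimator only calls the two oracles; it never sees the size |N|.
    """
    leaves = sum(1 for _ in range(sample_size) if degree_of(sample_node()) == 1)
    return leaves / sample_size

# Hypothetical toy Network with one trillion nodes in which every fourth
# node is a leaf, so the true value is p(N) = 0.25.
rng = random.Random(0)
sample_node = lambda: rng.randrange(10**12)
degree_of = lambda node: 1 if node % 4 == 0 else 3

n = hoeffding_sample_size(error=0.01, confidence=0.9999)
estimate = estimate_leaf_proportion(sample_node, degree_of, n)
print(n, estimate)
```

The bound is conservative, and nothing in `hoeffding_sample_size` refers to the number of nodes: only the required error and confidence enter the computation.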
Degree distribution. Using statistical sampling as above, the Nomothetist can compute the elements of the degree distribution of the Network, that is, the numbers $d_1(N), d_2(N), \dots$, where $d_k(N)$ is the proportion of nodes of degree $k$ (so the proportion of leaves equals $d_1(N)$).
Correlations. As we mentioned above, degree correlation is a crucial vague property studied in network science. The correlation number $c_{k,l}(N)$ describes how big a proportion of the neighbours of degree $k$ nodes (instead of all nodes) have degree $l$. For random networks $c_{k,l}(N)$ approximately equals $d_k(N)$, so the correlation number can be a sign of departure from disordered random models (see "Appendix") towards some order. Note that by sampling nodes and checking their neighbours one can compute the correlation numbers with prescribed accuracy and confidence, hence we also consider them epistemically accessible statistical parameters.
Network motif parameters. By exploring the local relational structures of sampled nodes, the Nomothetist can compute, for any possible subnetwork $H$, the value $p_H(N)$, the $H$-count of the Network, that is, the number of copies of $H$ in the Network divided by the size of the Network. These parameters are called the network motif distribution parameters.² They were introduced into network science by the influential paper of Milo et al. (2002); however, they were already studied in social network analysis in the seventies (Holland & Leinhardt, 1974). The sampling procedures were also studied in depth by Granovetter (1976) and Frank (1978).
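A minimal sketch of how such local statistical parameters might be estimated from sampled nodes is given below. The two oracles (`sample_node`, `neighbours_of`), the function names and the toy cycle network are illustrative assumptions; the estimators only ever look at sampled nodes and their neighbours.

```python
import random
from collections import Counter

def estimate_degree_distribution(sample_node, neighbours_of, sample_size):
    """Estimate d_k(N), the proportion of nodes of degree k, from sampled nodes."""
    counts = Counter(len(neighbours_of(sample_node())) for _ in range(sample_size))
    return {k: c / sample_size for k, c in counts.items()}

def estimate_correlation(sample_node, neighbours_of, k, l, sample_size):
    """Estimate c_{k,l}(N): the proportion of neighbours of degree-k nodes
    that have degree l. Returns None if no degree-k node was sampled."""
    seen = hits = 0
    for _ in range(sample_size):
        nbrs = neighbours_of(sample_node())
        if len(nbrs) != k:
            continue
        for v in nbrs:
            seen += 1
            if len(neighbours_of(v)) == l:
                hits += 1
    return hits / seen if seen else None

# Hypothetical toy Network: a cycle on one million nodes (every degree is 2).
SIZE = 10**6
rng = random.Random(1)
sample_node = lambda: rng.randrange(SIZE)
neighbours_of = lambda v: [(v - 1) % SIZE, (v + 1) % SIZE]

d = estimate_degree_distribution(sample_node, neighbours_of, 1000)
c = estimate_correlation(sample_node, neighbours_of, k=2, l=2, sample_size=1000)
print(d, c)
```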

The idiographer and the functional representation
The main task of the Idiographer in our epistemic model is the creation of the System's functional representation. Recall the definition of a mechanism from the foundational paper on mechanistic explanations (Machamer et al., 2000):
"Mechanisms are entities and activities organized such that they are productive of regular changes from start or set-up to finish or termination conditions."
The entities taking part in the basic, low-level activities are the fundamental units. Although the phenomena studied by the scientific agents are usually global, the low-level activities are local: they are directly related to the interconnectivity structure of the System. After studying the mechanism of the System, the Idiographer assigns certain labels (numbers, symbols etc.) to the nodes and constructs a local rule of change.
That is, the Idiographer creates a computer program. She expects the Network to be the input of the program together with a labeling of the nodes, representing the set-up state of the System. The program makes a certain number of steps. In each step the labels are changing according to the local rule. The output is another labeling of the nodes, representing the terminating state of the system, where the phenomenon (trait, feature, etc.) can be observed.
We should notice however, that although the computer program of the Idiographer was constructed by the careful study of the System, it can take basically any network as an input, that is, it could explain/predict not only the phenomenon of the System but any similarly interpretable phenomenon in other existing or theoretically possible systems which are similar to the System in an essential way. That is, if the program explains the fast spread of viruses or gossip in the System, it can explain or predict fast spread of virus or gossip in similar systems.
In short, the Idiographer converts a system phenomenon (trait, feature, etc.) into a network property P in the following way: the network N possesses the property P if, when we input N into the Idiographer's program, we observe the phenomenon in the output.
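To make the shape of such a program concrete, here is a minimal sketch under our own assumptions: the labels are 0 and 1, the local rule is a hypothetical "gossip" rule, and the phenomenon is "everyone has heard the gossip within a given number of steps". None of these choices comes from a particular System; they only illustrate how a phenomenon becomes a network property P.

```python
def run_local_rule(network, labels, rule, steps):
    """Apply a synchronous local rule `steps` times.

    network: dict mapping each node to the list of its neighbours
    labels:  dict mapping each node to its label (the set-up state)
    rule:    function (own_label, neighbour_labels) -> new label
    """
    for _ in range(steps):
        labels = {v: rule(labels[v], [labels[u] for u in network[v]])
                  for v in network}
    return labels

def gossip_rule(own, neighbour_labels):
    """Hypothetical rule: a node knows the gossip (1) once any neighbour does."""
    return 1 if own == 1 or 1 in neighbour_labels else 0

def has_fast_spread(network, source, steps):
    """Network property P: gossip started at `source` reaches all nodes in `steps`."""
    start = {v: int(v == source) for v in network}
    final = run_local_rule(network, start, gossip_rule, steps)
    return all(final[v] == 1 for v in network)

# The same program accepts any network as input: a path and a star on five nodes.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
print(has_fast_spread(path, 0, 2), has_fast_spread(star, 0, 2))
```

With two steps the star possesses the property and the path does not, although the program itself was written without reference to either network.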

Vague properties, vague propositions
As we noted in the Introduction, the notion of vagueness has central significance in network science because of the specific epistemological features of the discipline. According to Crispin Wright's tolerance principle (Wright, 1976), vague propositions are those that are insensitive to small changes or inaccuracies. On the other hand, Williamson (1994) rejects the tolerance principle and presents an epistemic view on which vague properties are, in fact, bivalent and their vagueness is just the consequence of our ignorance. The more we know about the position and structure of the borderline cases, the less we see the property as truly vague. The epistemological peculiarity of network science is underlined by two of our standing assumptions:
(1) Although the vague propositions are associated with purely mathematical objects (networks), they are derived from observations of complex systems of the real world.
(2) The actual size (and in some cases even the order of magnitude) of the networks is epistemically inaccessible.
So, the lack of knowledge about the size of the Network is not the product of ignorance, but a rather hard epistemic limitation. Also, there is an element of context-dependence of certain vague propositions. Nevertheless, our concept of taming vagueness is not very far from the epistemic view of Williamson (1994). We also regard the vague propositions about networks as precise propositions, but precise propositions about network models. As we will see below, the small world property is characterized by the size of two parameters (see "Appendix"), C (clustering coefficient) and L (average path length). The former is a statistical parameter, the latter is not (so it is inaccessible for the Nomothetist using an empirical approach). Roughly speaking, small worldliness means that C is not very small and L is not very large. The following propositions are considered to be true in network science.
(1) The Watts-Strogatz networks are of small average path length.
(2) The regular lattice networks are not of small average path length.
We already alluded to the fact that for Watts-Strogatz networks with a small rewiring coefficient, the average path length can be much larger than the average path length of a 20-by-20 regular lattice. This observation seems to contradict the tentative mathematical definition above. Still, we will be able to interpret the fact above in a consistent logical system (see Sect. 5). The inconsistency can be removed because the mathematical definition refers to an empirically inaccessible nonstatistical parameter and, even more importantly, because the knowledge of the Nomothetist concerns a tendency of evolving networks that idealizes the concrete networks. The evolving Watts-Strogatz networks with a small but fixed parameter have a tendency towards small average path length; the evolving larger and larger regular lattices do not. The tendency approach to vagueness preserves the tolerance principle (in the disguise of robustness or resilience under change) without sacrificing the ultimate preciseness of the vague properties.

The vague quantitative language
In the standard language of finite sets we use two quantifiers, ∀ and ∃. In the vague language of finite sets we use vague quantifiers as well: sometimes we use "almost all" instead of ∀, or "there exist a lot" instead of ∃. More precisely, in the vague quantitative language a lot of objects means so many that they are hard or practically impossible to count, such as the grains of sand in a heap in the famous Sorites Paradox, the classical source of vagueness in philosophy. A few objects means that they are easily counted and numbered, say, seven grains of sand. A tiny number is a positive real number very close to zero; a huge number is the real counterpart of a lot of, and it can be regarded as the reciprocal of a tiny number. We also name relative sizes. A small part of a family of objects might mean a lot of objects, but the proportion of a small part in the whole is tiny. A large part of the whole is the opposite of a small part. The adjective large is also used to describe something containing a lot of parts (such as "a large network"). So, a few of the large parts could make up the whole. Almost all means the whole but a small part. We say that a real parameter $t$ is much bigger than a real parameter $s$ if $s/t$ is tiny; in this case $s$ is much smaller than $t$. Two numbers $a$ and $b$ are approximately the same if $a/b$ is neither tiny nor huge.
Typical "sorites type" vague propositions are:
(1) If we take a few parts out of a lot of parts, there still remain a lot of parts.
(2) If we take a small amount of parts out of almost all parts, there still remain almost all the parts.
(3) If we take a small amount of parts out of a large amount of parts, there still remain a large amount of parts.
We also have certain vague propositions about our vague notions that explain the possible compatibilities.
(1) A few families of small amount of parts together is still a small amount of parts.
(2) A few families of few parts together is still only a few parts.
(3) A few times tiny is still tiny.
Now we turn to the vague properties the Nomothetist may characterize the Network with.

Vague network properties
It might come as a surprise, but most, if not all, of the important network properties are vague. That is, these properties use vague quantifiers as above concerning sizes or node degrees. Let us list the most classical vague properties in network science. Simon (1976) observed that certain important complex networks in the real world have a very sparse interconnectivity structure. This observation can be translated into the following vague network properties.
The Network is sparse. Using the vague quantitative language, this property means that there are approximately the same number of links as nodes in the Network. Another classical vague property is the following.
There are hubs in the Network. It means that there is a node with a huge degree, in other words, with a lot of incident links. Note, however, that in the context of the preferential attachment scheme there is an even more refined picture.
The Network is strongly sparse. That is, only a small part of the links are incident to hubs. It is important to note that the last property is, in fact, a bit stronger than the sparsity notion above, and in our epistemic environment the Nomothetist will assume this vague property. By the strong sparsity assumption our epistemic model avoids the combinatorial explosion mentioned by Rathkopf (2018).³
The network is scale-free. This proposition means that up to some large number $k$, the degree distribution parameter $d_k(N)$ is approximately the same as $k^{-\gamma}$, where $\gamma$ is the so-called scale-parameter.
The network has high clustering coefficient. It means that the clustering coefficient of the Network (see "Appendix") is not tiny.
The network has degree correlation. It means that the correlation numbers are not approximately the same as in a random network with the same degree distribution (see "Appendix").
The network is small world. This property entails that the Network has high clustering coefficient and small average path length, that is, the average path length of the Network is smaller than or approximately the same as the logarithm of the size of the Network. The very fact that the base of the logarithm is rarely discussed in this context underlines the vague nature of this definition.
The network is low-dimensional. This means that the size of neighbourhoods in the Network (see "Appendix") is not bigger than the size of the neighbourhoods in a small dimensional lattice network.
The network is non-decomposable. The network is near-decomposable. These two vague properties, which we regard as meta-properties in our epistemic model, were introduced by Herbert Simon (1962) and will be discussed in detail in Sect. 7.

Set sequences
One of the basic phenomena behind vagueness is indiscriminability. If two objects are indiscriminable they must have the same vague properties. In some cases, due to the non-transitivity of the indiscriminability relation, items that are clearly different can be linked by a sorites series, of which each member is indiscriminable from its neighbours (Williamson, 1994). In the vague quantitative language the most obvious indiscriminability relation is the "being different by at most one element" relation. E.g. a heap of sand is indiscriminable from another heap of sand if the two heaps are equal, or the first heap can be obtained by adding one single grain of sand to, or taking one away from, the second heap. It seems that we can remove the sorites-type paradoxes only by getting rid of indiscriminability. In this way one also removes the tolerance principle. We suggest a different remedy, where we preserve indiscriminability by making it transitive. So, we tame vagueness by taming indiscriminability.
In this new model of the vague quantitative language the items are sequences. A set-sequence $X$ is a sequence of finite sets $A_1, A_2, A_3, \dots$ We have the following interpretation of the vague quantitative properties.
(1) The set-sequence $X$ is large if the sizes of $A_1, A_2, \dots$ tend to infinity.
(2) [...]
(3) A subset-sequence $Y = B_1, B_2, \dots$ of $X$ is a large part of $X$ if the ratios $|B_n|/|A_n|$⁴ will eventually be above some positive number $r$ (it is possible that finitely many $B_n$'s are empty). That is, there is a threshold $T$ such that if $n$ is larger than $T$, then the ratio $|B_n|/|A_n|$ is greater than $r$.
(4) A subset-sequence $Y$ is a small part of $X$ if the ratios $|B_n|/|A_n|$ tend to zero as $n$ tends to infinity.
Clearly, all these properties are bivalent in the realm of set-sequences. They either hold or not, there are no borderline cases and the following three propositions are true.
(1) If we take away a small subset-sequence from a large sequence, the remaining sequence will still be large.
(2) If a subset-sequence Y is a large part of a large sequence X and we take out a small part of X from Y, then the remaining subset-sequence is still large.
(3) A few small subset-sequences together are still forming a small subset-sequence.
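As a worked check, propositions (1) and (3) follow directly from the definitions. The sketch below rests on our reading of the terms: a "small subset-sequence" $B_1, B_2, \dots$ is a small part of $A_1, A_2, \dots$ in the sense of (4), and "a few" is a fixed number $k$ that does not depend on $n$.

```latex
% (1): removing a small part B_n from a large sequence A_n leaves a large sequence.
\[
  |A_n \setminus B_n| \;=\; |A_n|\Bigl(1 - \frac{|B_n|}{|A_n|}\Bigr) \;\ge\; \tfrac12\,|A_n|
  \quad\text{once } \frac{|B_n|}{|A_n|} \le \tfrac12 ,
  \qquad\text{so } |A_n \setminus B_n| \to \infty .
\]
% (3): a fixed number k of small parts B_n^{(1)}, ..., B_n^{(k)} is still a small part.
\[
  \frac{\bigl|B_n^{(1)} \cup \dots \cup B_n^{(k)}\bigr|}{|A_n|}
  \;\le\; \sum_{i=1}^{k} \frac{|B_n^{(i)}|}{|A_n|} \;\longrightarrow\; 0
  \quad (n \to \infty).
\]
```

In the first line the condition $|B_n|/|A_n| \le 1/2$ holds beyond some threshold because the ratios tend to zero; in the second, each of the $k$ summands tends to zero, and $k$ is fixed.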
The original indiscriminability relation can be extended to set-sequences in the obvious way. The set-sequence $X = A_1, A_2, \dots$ is indiscriminable from the set-sequence $Y = B_1, B_2, \dots$ if for all $n$, the set $A_n$ is indiscriminable from the set $B_n$. Then, we can define a new relation by saying that two set-sequences are S-indiscriminable if they can be bridged by a finite sorites series. Clearly, S-indiscriminability is transitive, and if a set-sequence $X$ is large, then any sequence which is S-indiscriminable from $X$ is large as well. So, we have interpreted the vague quantitative language in a consistent and non-trivial way, preserving indiscriminability and Wright's tolerance principle.

Properties of network sequences
Now we show how to interpret the vague network properties of Sect. 4.2, preserving indiscriminability, via network sequences. Network sequences are not just useful mathematical tools, but natural objects studied in network science. For example, the famous Preferential Attachment network sequence (Barabási and Albert, 1999) starts with a network $H_1$ consisting of one single node, then $H_2$ consists of two nodes and one link, and so on. Adding one node and one link to $H_n$ using the preferential attachment rule gives rise to $H_{n+1}$ (see "Appendix"). It is the intuitive description of a network evolution that is based on the "rich get richer" principle. Thus, a Preferential Attachment network sequence is a growing sequence of trees, and at any given time the new tree is obtained by adding one node and one link to the existing tree.
Quite obviously, any given tree network on n nodes can possibly be the n-th network of a network sequence produced by the Preferential Attachment Algorithm. This observation underlines the possible world nature of network sequences that will be even more clear when we consider network models, since the network sequence will be situated in the multiverse created by the network model.
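One possible reading of such a growing sequence of trees is sketched below. We assume the usual degree-proportional attachment rule (the precise rule of the "Appendix" is not reproduced here), and the function name and the representation of a tree as a list of links are our own illustrative choices.

```python
import random

def preferential_attachment_trees(steps, rng):
    """Yield H_1, H_2, ..., H_steps, each as a list of links (pairs of nodes).

    H_1 is a single node without links and H_2 is a single link; afterwards
    each new node links to an existing node chosen with probability
    proportional to that node's degree ("rich get richer").
    """
    links = []       # links of the current tree
    endpoints = []   # every node appears here once per incident link
    yield list(links)                        # H_1
    for new_node in range(1, steps):
        if new_node == 1:
            target = 0                       # H_2: the only possible link
        else:
            target = rng.choice(endpoints)   # degree-proportional choice
        links.append((target, new_node))
        endpoints.extend((target, new_node))
        yield list(links)

sequence = list(preferential_attachment_trees(1000, random.Random(0)))
degrees = {}
for a, b in sequence[-1]:
    degrees[a] = degrees.get(a, 0) + 1
    degrees[b] = degrees.get(b, 0) + 1
print(len(sequence[-1]), max(degrees.values()))
```

Each run with a different seed produces a different tree evolution, that is, a different possible world within the same space of network sequences.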
Network parameters. If $p$ is a network parameter and $W = H_1, H_2, H_3, \dots$ is a network sequence, we say that $p$ can be interpreted on $W$ and has value $x$ if, as $n$ tends to infinity, the parameter values $p(H_1), p(H_2), p(H_3), \dots$ tend to the given number $x$.⁵
Vague network properties. As we see below, the vague network properties of Sect. 4 can be interpreted on network sequences as precise, bivalent properties. Below, let $W = H_1, H_2, H_3, \dots$ be a network sequence.
There are hubs in W. It means that for any given number $k$ there exists a threshold $T_k$ such that if $n$ is larger than $T_k$, then $H_n$ contains a node of degree larger than $k$. Note that the larger $k$ is, the larger the threshold that is needed. That is, eventually the networks in the sequence will have large degrees. It is possible that the maximal degrees of the first couple of hundred networks in the sequence are less than ten, but after a while the maximum degrees of the networks in the sequence will be greater than one thousand.
We add that the strong sparsity condition, which is an important commitment of the Nomothetist, translates to a bivalent network sequence property as well: "For any positive number $r$ there exist an integer $k$ and a threshold $T_r$ such that if $n$ is larger than $T_r$, then the fraction of links incident to nodes of degree larger than $k$ is less than $r$."
W is scale-free. This means that there exists some positive number $C$ such that for all integers $k$ the ratio $d_k(H_n)/k^{-\gamma}$ is in between $C$ and $1/C$. That is, the degree distributions of the networks in the sequence are more and more similar to a power-law distribution. Of course, for an individual finite network $N$, the degree distribution cannot follow a power-law, since $d_k(N)$ will be zero if $k$ is larger than the size of $N$.
W has high clustering coefficient. This means that there exist some positive number $r$ and a threshold $T$ such that if $n$ is larger than $T$, then the clustering coefficient of $H_n$ is larger than $r$. This entails that the clustering coefficients will eventually be significantly larger than in a random network of the same size, for which the clustering coefficient is very close to zero.
W is low-dimensional. This means that there exist a dimension constant $d$ and some other constant $C$ such that for any integer $r$ the size of the $r$-neighbourhoods (see "Appendix") in any of the networks $H_n$ is smaller than $C r^d$. That is, the neighbourhood growth in $W$ is not much greater than the neighbourhood growth in lattice networks. It is a crucial property of networks that are topological representations of designed physical systems (where $d$ is usually 2 or 3).
W is small world. This entails that $W$ has high clustering coefficient and small average path length, that is, there exist some positive number $C$ and a threshold $T$ such that the average path length of $H_n$ is less than $C$ times the natural logarithm of the size of $H_n$, provided that $n$ is larger than the threshold. Notice that if $W_1$ is the sequence of larger and larger regular lattices and $W_2$ is a Watts-Strogatz sequence of networks of the same sizes with a very small rewiring parameter, then for a long while the average path lengths of the networks in $W_1$ will be much smaller than the average path lengths of the corresponding networks in $W_2$. However, after a while the average path lengths of the networks in $W_2$ will be much smaller than the average path lengths of the corresponding networks in $W_1$.
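The first half of this observation can be checked on a small scale. The sketch below compares a 20-by-20 square lattice with a ring lattice on the same 400 nodes. We assume the common Watts-Strogatz starting point in which each node is linked to its two nearest nodes on each side, and we take the limiting case of no rewired links, which is what a very small rewiring parameter typically yields at this size. All function names are illustrative.

```python
from collections import deque

def average_path_length(network):
    """Mean shortest-path distance over ordered pairs of distinct, connected nodes."""
    total = pairs = 0
    for source in network:
        dist = {source: 0}
        queue = deque([source])
        while queue:                      # breadth-first search from `source`
            v = queue.popleft()
            for u in network[v]:
                if u not in dist:
                    dist[u] = dist[v] + 1
                    queue.append(u)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def square_lattice(side):
    """Grid in which each node is linked to its horizontal and vertical neighbours."""
    net = {(x, y): [] for x in range(side) for y in range(side)}
    for (x, y) in net:
        for nx, ny in ((x + 1, y), (x, y + 1)):
            if (nx, ny) in net:
                net[(x, y)].append((nx, ny))
                net[(nx, ny)].append((x, y))
    return net

def ring_lattice(size, reach=2):
    """Ring in which every node is linked to its `reach` nearest nodes on each side."""
    return {v: [(v + s) % size for s in range(-reach, reach + 1) if s != 0]
            for v in range(size)}

lattice = average_path_length(square_lattice(20))   # 400 nodes
ring = average_path_length(ring_lattice(400))       # 400 nodes, nothing rewired
print(round(lattice, 2), round(ring, 2))             # prints 13.33 50.38
```

At this size the lattice has by far the shorter paths. The reversal described above appears only for much larger networks, which is why the property is read as a tendency of the sequence rather than as a property of a single network.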
Network sequences reflect the size complexity commitment of the Nomothetist. The Nomothetist knows that the Network is the topological representation of the System of our real world, hence she assumes that the Network is finite. On the other hand, since she considers the true size of the Network epistemically inaccessible, the Nomothetist expects that the following proposition is false.
"The Network has exactly 1 node." Since the Nomothetist must have the size complexity commitment on the network obtained by removing one node from the Network, she expects that for any $n \ge 1$ the proposition "The Network has exactly n nodes." is false. This small epistemic paradox can be resolved if we consider that the Nomothetist views the Network as an unknown element of a network sequence. It seems that this is not very far from the actual views of actual network scientists.⁶

Network models

The network algorithm and the network model
Similarly to the Preferential Attachment Tree Algorithm one can consider the Nonpreferential Attachment Tree Algorithm as well. This algorithm also produces growing trees, except that in the n-th step the newcomer node attaches itself to one of the existing nodes completely at random, without considering the degrees. Clearly, the Nonpreferential Attachment Tree Algorithm produces exactly the same tree evolution network sequences as the Preferential Attachment Tree Algorithm. That is, all the tree evolutions are possible worlds for both the Preferential Attachment Tree Algorithm and the Nonpreferential Attachment Tree Algorithm.
Note, however, that the Preferential Attachment Tree Algorithm not only produces network sequences but also defines a probability distribution on the space of certain network sequences (in this case tree evolution network sequences). When network scientists, stating a vague network proposition, say that preferential attachment networks are scale-free and have short path length, they actually mean that with probability one the network sequence generated by the Preferential Attachment Tree Algorithm is both scale-free and has short path length. By contrast, the network sequences generated by the Nonpreferential Attachment Tree Algorithm have exponential degree distribution decay, yet they still have short path lengths with probability one.
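The difference between "the same possible worlds" and "the same probability distribution" can be made concrete on very short tree evolutions. The sketch below computes the exact probability of a given evolution under both algorithms, assuming degree-proportional attachment for the preferential rule and uniform attachment for the nonpreferential one. The encoding of an evolution as a list of attachment targets is our own illustrative choice.

```python
from fractions import Fraction

def sequence_probability(parents, preferential):
    """Exact probability of one tree evolution under either algorithm.

    parents[i] is the existing node that newcomer i + 1 links to
    (node 0 is the single node of H_1).
    """
    degree = [0]
    prob = Fraction(1)
    for newcomer, target in enumerate(parents, start=1):
        if newcomer > 1:                 # the first link (H_2) is forced
            if preferential:
                prob *= Fraction(degree[target], sum(degree))
            else:
                prob *= Fraction(1, newcomer)   # uniform over existing nodes
        degree[target] += 1
        degree.append(1)
    return prob

star = [0, 0, 0]   # every newcomer attaches to node 0
path = [0, 1, 2]   # every newcomer attaches to the previous newcomer
for name, parents in (("star", star), ("path", path)):
    print(name,
          sequence_probability(parents, preferential=True),
          sequence_probability(parents, preferential=False))
```

Both evolutions are possible under both algorithms, but the preferential rule already favours the star (1/4 against 1/8 for the path) after four nodes, while the nonpreferential rule gives both the same probability 1/6.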
In general, a network model M consists of a space S of network sequences (the possible worlds) and a probability distribution μ on S.
If P is a precise, bivalent network sequence property, then we say that the network model M possesses the property P if the probability that an element of S possesses the property is one.
• The network model given by the Preferential Attachment Algorithm (with one or more incoming links) possesses the scale-free, the small path length and the degree correlation properties.
• The sparse Erdős-Rényi random network model possesses the exponential degree decay and the small average path length properties.
On the other hand, if p is a network parameter (see Sect. 5), then p can be interpreted on M and p(M) has value x if, with probability one, the parameter is interpreted on an element of S and has value x.
For example, if p is the proportion of the leaves, then p is interpreted on the Preferential Attachment Tree Model (see "Appendix") and has value 2/3 (Bollobás et al., 2001).
In our epistemic model the Nomothetist creates a randomized Network Algorithm A, and the Network Model M_A is produced by A. The algorithm A is manifested in a computer program that is able to run infinitely and outputs a single network sequence for each run. The space S_A is the set of all possible outputs of the program. The randomness in the algorithm uniquely determines the probability distribution μ_A. So, taming the vagueness is built into the creation of the Network Algorithm.

Algorithms, models and the epistemology of simulation
Simulating the network. As Winsberg writes in his paper "Sanctioning Models: The Epistemology of Simulation" (1999), "...the simulationist hopes to infer, from existing theoretical knowledge about the system simulated." The Nomothetist intends to infer new knowledge about a system that relates to the topological representation of the system. Her knowledge about the system (the System) is manifested by statistical parameters of the network, and possibly by some intuition about the growth process/dynamical formation that led to the creation of the Network. Using her knowledge as constraints, the Nomothetist builds her Network Algorithm A, which is, in fact, a computer program.
Case 1. As Winsberg puts it, sometimes the algorithm is analytically tractable, that is, even without actually running the program the Nomothetist may infer that the Network Model M_A associated to the Network Algorithm possesses a certain property. Then, the Nomothetist can predict (as new knowledge) that the Network possesses the given property in its vague form. Similarly, the Nomothetist may infer that the parameter can be interpreted on M_A and has value x. Then, the Nomothetist can predict that the value of the parameter on the Network is approximately x. The classical example of analytical tractability is the Preferential Attachment Algorithm (e.g. Bollobás et al. (2001)).
Case 2. It is much more common, however, that the Network Algorithm is not analytically tractable, so the Nomothetist must use simulation to infer new knowledge. Again, she intends to infer some properties or the value of some parameters of M_A. The Nomothetist runs the program of A, which theoretically could output an infinite network sequence; nevertheless, she stops the program at a certain time to get a single network H of reasonable (that is, computationally accessible) size. Since H is relatively small, she can compute the value p(H). In some cases, say for the Preferential Attachment Model or for the Erdős-Rényi random model, the fact that a certain parameter can be interpreted is provable in a rigorous way, but the exact value is out of reach. Then, by repeating the process above, the Nomothetist can convince herself, with a high degree of confidence and precision, about the actual value of p on the Network Model. In some other cases, it is only conjectured that the parameter can be interpreted (say, for some highly complex variations of the Preferential Attachment Model); then the repeated simulations can serve as strong evidence for this conjecture.
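The repeated-simulation procedure of Case 2 can be sketched on a parameter whose value is known, so that the estimate can be checked: the proportion of leaves in the Preferential Attachment Tree Model, which is 2/3 according to the result cited above. The starting configuration (two linked nodes), the stopping size of 20000 nodes, the 20 repetitions and the helper name are assumptions made for illustration.

```python
import random
import statistics

def leaf_fraction_pa_tree(n_nodes, rng):
    """Proportion of degree-one nodes in one preferential attachment tree H.

    Hypothetical helper: the program is stopped after n_nodes nodes.
    """
    degree = [1, 1]        # two linked nodes to start with
    endpoints = [0, 1]     # node v occurs degree[v] times in this list
    for new in range(2, n_nodes):
        target = rng.choice(endpoints)   # attach proportionally to degree
        degree[target] += 1
        degree.append(1)
        endpoints.extend((target, new))
    return sum(1 for d in degree if d == 1) / n_nodes

rng = random.Random(1)
estimates = [leaf_fraction_pa_tree(20000, rng) for _ in range(20)]
mean = statistics.mean(estimates)
spread = statistics.stdev(estimates)
print(f"estimated leaf proportion: {mean:.4f} +/- {spread:.4f}")
```

The small spread across independent runs is what gives the Nomothetist confidence that p(H) approximates the value of p on the Network Model.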
Simulating the system. Our full epistemic model is in line with the description of modelling a phenomenon that Winsberg calls the final goal of simulation study. As Winsberg formulates it, "They (simulations) involve a complex chain of inferences that serve to transform theoretical structures into specific concrete knowledge of physical systems." We have a specific phenomenon that should be visualized, explained or subjected to mathematical analysis. Two programs are created: the Idiographer's program, which actually describes the complex chain of inferences, and the Nomothetist's program discussed above. The simulation process goes as follows. First, the Nomothetist produces the network H of reasonable size by her program. This small network H serves as a toy-model of the Network and it is fed into the Idiographer's program. Then, the output of the Idiographer's program is the product of the whole simulation process. If the network property (frequently manifested in a dataset or a curve) associated with the phenomenon in question can be detected, then the simulation serves as a prediction or explanation of the given phenomenon.

Modularity
The notion of modularity became popular in cognitive science after the publication of Jerry Fodor (1983). Nevertheless, modularity had already been the central concept of Marr (1982) and Chomsky (1972). The term module in all the fields above captures the functional or evolutionary independence or quasi-independence of certain parts of the whole system, where the ability to perform a particular task is only very slightly influenced by the rest of the system in the short run. In our somewhat idealized epistemic model, we assume that in such parts, which we call functional modules, most of the interactions associated with the elementary parts are intramodular. This condition can clearly count as a reason for the negligibility of the influence by the rest of the system in the short run. In network theory a network module is a subnetwork K such that the number of links outgoing from the subnetwork is much less than the number of links within the subnetwork. That is, the fraction o(K)/|K| is tiny, where o(K) stands for the number of links connecting K with the rest of the network (intermodular links) and, as usual, |K| stands for the number of nodes in the subnetwork. We call the fraction o(K)/|K| the modularity parameter of the subnetwork K (so the smaller the modularity parameter, the more module-like the subnetwork). This definition might raise some eyebrows in the network science community. There is a multitude of modularity definitions studied in network science and we will discuss some of them in Sect. 9. Note however, that in our epistemic model we are committed to the strong sparsity assumption, meaning that in most of the subnetworks the number of links is not much more than the number of nodes, that is, we assume very low density all over the Network. Under this assumption all the modularity definitions express the same module feature: only a very small fraction of the nodes in the network module are connected to the rest of the Network. As we will see in Sect. 9, the Idiographer is very much interested in small, concrete networks in which the density is much larger than in the very large networks the Nomothetist is interested in.
Again, the vague notion of network modules can be tamed into a precise, bivalent notion of network sequences in a natural way. Let W = H_1, H_2, H_3, ... be a network sequence and M = K_1, K_2, K_3, ... a sequence of subnetworks in W, where K_n is a subnetwork of H_n. Then, we call M a module sequence if o(K_n)/|K_n| tends to zero as n tends to infinity. We do not wish to regard too large subnetworks as proper modules, so we add the extra condition that a subnetwork above does not contain almost all the parts of the whole network. This vague property translates to the condition that there is a number r strictly less than one such that eventually |K_n|/|H_n|, the relative size of K_n, is smaller than r. We say that the network sequence contains a large module if there is also a positive number s such that eventually |K_n|/|H_n| is greater than s. It is important to note the directionality of this definition. According to our assumption in the first paragraph of the section, functional modularity always implies network modularity. However, there is no guarantee that a component in the System realized by a very module-like subnetwork in the Network performs any particular task; in other words, network modularity does not necessarily imply functional modularity.
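A minimal sketch of the modularity parameter and of a module sequence follows. The example sequence (two rings of n nodes joined by a single link, with K_n the first ring) and the function names are hypothetical and chosen only because the values can be checked by hand: o(K_n)/|K_n| = 1/n tends to zero, while the relative size |K_n|/|H_n| stays at 1/2, so the sequence contains a large module.

```python
def modularity_parameter(links, subnetwork):
    """o(K)/|K|: number of links leaving K divided by the number of nodes in K."""
    K = set(subnetwork)
    outgoing = sum(1 for u, v in links if (u in K) != (v in K))
    return outgoing / len(K)

def two_rings(n):
    """Hypothetical example H_n: two rings of n nodes joined by one link."""
    ring_a = [(i, (i + 1) % n) for i in range(n)]
    ring_b = [(n + i, n + (i + 1) % n) for i in range(n)]
    return ring_a + ring_b + [(0, n)]

for n in (3, 10, 100, 1000):
    H = two_rings(n)
    K = range(n)                      # K_n is the first ring
    print(n, modularity_parameter(H, K), len(K) / (2 * n))
```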

Near-decomposable networks
In his seminal paper "The Architecture of Complexity" (1962) Herbert Simon presented a strong nomothetic view about complexity, a search for common properties among diverse kinds of complex systems. He was mostly concerned with organized complexity, explicitly referring to the complexity paper of Weaver (1948), and focused on two basic properties of complex systems: hierarchy and near-decomposability. By hierarchy, he meant a system that is composed of interrelated subsystems, each of the latter being, in turn, hierarchic in structure until we reach some lowest level of elementary subsystem. Simon also assumed that the subsystems in the hierarchy are divided into a small or moderate number of subsystems, each of which may be further divided.
He argued that in their dynamics, hierarchies have a property, near-decomposability, that greatly simplifies their behaviour. Although near-decomposability as a system property appears first in the Architecture of Complexity, it already had a key implicit role in the process of breaking down complex simultaneous equation systems that were crucial in the development of Simon's seminal work on bounded rationality theory in the decision-making process, for which he later was awarded the Nobel Prize in Economics.[8]
We analyse the notion of near-decomposability, keeping in mind that we will consider it for complex systems with a clear network topology. Simon's definition invokes modularity in a straightforward way: "In a nearly decomposable system, the short-run behavior of each of the component subsystems is approximately independent of the short-run behaviour of the other components". He also makes it clear that the subsystems become more modular, that is, they have fewer intermodular connections, as we go up from the lowest level to higher ones. Simon's famous metaphorical Hora watch presented in the Architecture of Complexity perfectly illustrates the build-up of a hierarchical, near-decomposable system.[9] How can we convert Simon's hierarchical near-decomposability into a vague network property?
• First, since we abstract away functionality, causality and the evolutionary features of the complex systems, the vague property of their topological representations must point towards the possibility of module structures and leveled hierarchy without explicitly prescribing them.
• Second, based on the vague property we must be able to explain/predict the advantages and features of hierarchical, near-decomposable systems as Simon did. In particular, the vague network property should explain why the various complex system equations attached to such systems can be broken into much smaller equations.

[8] As Sent (2001) puts it: "Consider Simon's valuable insights on causality and econometric identifiability. What connects them to his research on managerial decision-making and economic bounded rationality is, again, his interpretation of nearly decomposable systems. Specifically, systems of simultaneous equations and sets of variables appearing in these equations can themselves be approached as complex, hierarchical systems."
[9] "... as we proceed upward, from level to level, in the hierarchy, the strengths of the interactions between elements belonging to different components become weaker and weaker" (Simon, 1976).

Simon explicitly assumed that
(1) subsystems are formed from a few smaller subsystems,
(2) going upward, the proportion of intermodular connections becomes smaller and smaller.
Therefore, based on Simon's assumptions about hierarchy and near-decomposability, we can argue for the following simple network property.
The network is near-decomposable
It means that by deleting only a small fraction of the links we can cut the Network into parts containing only a few nodes. Note that the definition describes that the Network is very near to being completely decomposed into small parts. Simon (2002) explained that he meant "nearly-completely decomposable" by "nearly-decomposable". We can also argue that the vague network property above facilitates the vast simplification of very complex symbolic equations. The example below seems to be in line with some examples in Simon (1962). Suppose that the complex system is a hierarchical, near-decomposable organization in which the decision-maker/manager wants to assign some supervisors such that each member of the organization has a connection to a supervisor. The manager wants to minimize the number of supervisors. This is a purely graph theoretical problem known by the name of "dominating set problem". The manager wants to choose the minimum number of nodes such that all nodes are either chosen or adjacent to a chosen node in the "connection network" of the organization. It has been known for a long time that this problem is computationally extremely hard. If the "connection network" of the organization contains thousands of nodes, finding the optimal solution might take more time than the age of the Universe. In general, not even good approximation algorithms are available.
The manager uses the fact that the connection network of the organization is near-decomposable. Deleting one tenth of the links from the connection network in a clever way, the manager can form small groups of people. In the small groups the dominating set problem can be solved in reasonable time (even in a parallel fashion) and the supervisor nodes are chosen. Since each node belongs to one group, each node will be either a supervisor node or a node adjacent to a supervisor node. Also, restricted to the groups the solution will be optimal. The actual, global optimal solution might be somewhat smaller, but the difference cannot be larger than the number of deleted links, since deleting a single link can increase the minimum number of supervisors by at most one. So, using near-decomposability, the manager obtained a satisficing solution to a completely hopeless optimization problem.
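The manager's procedure can be sketched on a toy connection network. Everything concrete here is an assumption made for illustration: the organization is a path of 100 members, the groups are obtained by deleting every tenth link, and the helper names are hypothetical. Inside each group the supervisor problem (every member is a supervisor or is linked to one) is solved by brute force, which is feasible only because the groups are small.

```python
from itertools import combinations

def min_dominating_set(nodes, links):
    """Brute-force smallest set S such that every node is in S or adjacent to S."""
    nodes = list(nodes)
    neighbours = {v: {v} for v in nodes}
    for u, v in links:
        neighbours[u].add(v)
        neighbours[v].add(u)
    for size in range(1, len(nodes) + 1):
        for chosen in combinations(nodes, size):
            covered = set().union(*(neighbours[v] for v in chosen))
            if len(covered) == len(nodes):
                return set(chosen)

def components(nodes, links):
    """Connected components of the network, found by depth-first search."""
    adjacency = {v: [] for v in nodes}
    for u, v in links:
        adjacency[u].append(v)
        adjacency[v].append(u)
    seen, parts = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, part = [start], []
        seen.add(start)
        while stack:
            v = stack.pop()
            part.append(v)
            for w in adjacency[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        parts.append(part)
    return parts

n, group = 100, 10
nodes = range(n)
links = [(i, i + 1) for i in range(n - 1)]                       # toy network: a path
deleted = [(i, i + 1) for i in range(group - 1, n - 1, group)]   # every tenth link
kept = [e for e in links if e not in deleted]

supervisors = set()
for part in components(nodes, kept):
    members = set(part)
    inner = [(u, v) for u, v in kept if u in members and v in members]
    supervisors |= min_dominating_set(part, inner)

print(len(deleted), "links deleted,", len(supervisors), "supervisors chosen")
```

For a path the true optimum is known (one supervisor per three consecutive members, rounded up, hence 34 here), so the gap between the group-wise solution and the optimum can be compared directly with the number of deleted links.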
The vague network property above can be tamed into the following precise, bivalent property of a network sequence W = H_1, H_2, H_3, ...:
W is near-decomposable
It means that for any small positive constant r there exists some integer K such that, for every n, one can delete at most r·l(H_n) links from the network H_n to partition it into components of size at most K (here l(H_n) is the total number of links in H_n).
We believe it is worth giving a short and simple example. Let W be the sequence of paths (see "Appendix"). So, H_1 consists of two nodes linked together, H_2 consists of three nodes linked together in a linear fashion by two links, and, in general, H_n consists of n + 1 nodes linked together by n links. Let r be 1/1000. Then, K can be chosen to be 1000, since one can delete every thousandth link to cut a long path into paths containing at most 1000 nodes. Note that in the definition of near-decomposability there is no direct reference to hierarchy. The example illustrates that near-decomposability implies the existence of some sort of hierarchical structure of the network; nevertheless, this structure is not unique (there are many ways a long path can be built up as a hierarchical system of shorter paths) and does not necessarily reflect the hierarchical nature of the system with which the network is associated.
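The path example can be checked mechanically. The sketch below, with the hypothetical helper `cut_path`, deletes every K-th link of a path with n links and reports the number of deleted links and the sizes of the resulting pieces; with K = 1000 the deleted fraction never exceeds r = 1/1000 and no piece has more than 1000 nodes, whatever n is.

```python
def cut_path(n_links, K):
    """Delete every K-th link of a path with n_links links (n_links + 1 nodes).

    Returns the number of deleted links and the node counts of the pieces.
    """
    deleted = n_links // K
    sizes, current = [], 1          # the first node opens the first piece
    for link in range(1, n_links + 1):
        if link % K == 0:           # this link is deleted: close the piece
            sizes.append(current)
            current = 1             # the next node opens a new piece
        else:
            current += 1
    sizes.append(current)
    return deleted, sizes

for n in (10, 999, 1000, 123456):
    deleted, sizes = cut_path(n, 1000)
    print(n, deleted, max(sizes))
```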
As in the previous section, it is clear when a network model M possesses the near-decomposability property.
Nearly-decomposable network sequences were introduced into graph theory under the name of "hyperfinite graph sequences" (Elek, 2008) and have some relevant applications in computer science that we briefly mention below in the context of Simon's work and our epistemic model.
• There are some important network models that are near-decomposable.
(2) Tree network models, e.g. the Preferential Attachment Tree Model, are near-decomposable.
(3) The classical Hierarchical Modular Model of Ravasz et al. (2002) is in fact near-decomposable.
• The relation of near-decomposability to other important network properties can be summarized as follows.
(1) By definition, near-decomposable networks are always vulnerable to planned attacks.
(2) Low-dimensional models never have small average path length, but the Preferential Attachment Tree Model and the Hierarchical Modular Model both have the small average path length and scale-free properties.
(3) In network science it is almost proverbial that hierarchical modularity is closely related to higher clustering coefficients (Ravasz et al., 2002). Note however, that lattice and tree models have zero clustering coefficient. On the other hand, certain models that are very far from being near-decomposable have high clustering coefficients due to triadic closure (see "Appendix"). Nevertheless, it is true that many hierarchical complex systems have topological representations with high clustering coefficients.
• As we have already seen, with nearly-decomposable networks some notoriously hard problems can be broken into easy-to-solve subproblems. Nevertheless, we have some further arguments to support Simon's thesis on the advantages of near-decomposability. In the supervisor choice problem above one can define a parameter of a network: the relative proportion of the minimum sized supervisor set. This parameter is extremely hard to estimate, let alone to calculate exactly. However, if we assume (an ontological condition) that our real life complex system is hierarchically near-decomposable in the sense of Simon, the supervisor parameter becomes epistemically accessible. This is a consequence of a rather deep result in computer science due to Hassidim et al. (2009). In fact, they proved that any reasonable parameter is epistemically accessible in the universe of near-decomposable networks, that is, it can be estimated using only statistical parameters. Hassidim et al. show that near-decomposable networks can be learned, that is, for a given large near-decomposable network A, based only on statistical samplings, one can figure out how to build another network B, so that B is almost the same as network A. This result gives a rather strong argument for the self-organizing reconstruction hypothesis of Simon (1962).

Non-decomposable networks
The notion of non-decomposability can also be traced back to Herbert Simon (1962). Non-decomposability is a feature of a complex system that can be characterized by the complete lack of modularity, a strong sign of disorganized complexity. A particularly clear definition is given by Rathkopf (2018): "...a system is non-decomposable just in case the behavior of any given component part, even over a short time period, depends on the behavior of many other individual components." This definition is to be interpreted in such a way that the term "component" can refer either to the basic elements in a system or to any collection of basic elements other than the entire set. A component can be influenced in a short time period by the rest of the System only if it is very far from being a module. There must be a sizeable amount of interconnectivity units connecting the component with the rest of the System. That is, the subnetwork in the Network corresponding to the component cannot be a network module. Hence, non-decomposability of the System converts to the following vague property of the Network:

The network has no modules
The above vague property can be tamed into the following precise, bivalent property of a network sequence W = H_1, H_2, H_3, ...:
W is non-decomposable
It means that there exists some positive constant l (the expansion) such that for any subnetwork K_n in H_n, the modularity parameter o(K_n)/|K_n| is larger than l, provided that the size of K_n is at most (say) half of the size of H_n.
Therefore, as in the previous section, it is clear when a network model M possesses the non-decomposability property.
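For very small networks the quantity bounded in this definition can be computed by brute force, which gives a feel for what the constant l measures. The sketch below is illustrative only: the function names are hypothetical, the search is exponential in the number of nodes, and the complete networks used as a contrast are dense, so they serve merely as an extreme case and not as an example of the sparse sequences considered in the paper.

```python
from itertools import combinations

def expansion(nodes, links):
    """Smallest o(K)/|K| over all node sets K with |K| at most half of the nodes."""
    nodes = list(nodes)
    best = float("inf")
    for size in range(1, len(nodes) // 2 + 1):
        for subset in combinations(nodes, size):
            K = set(subset)
            outgoing = sum(1 for u, v in links if (u in K) != (v in K))
            best = min(best, outgoing / size)
    return best

def ring(n):
    """Ring of n nodes: each node linked to its successor."""
    return [(i, (i + 1) % n) for i in range(n)]

def complete(n):
    """Complete network on n nodes: every pair linked."""
    return list(combinations(range(n), 2))

for n in (6, 10, 14):
    print(n, expansion(range(n), ring(n)), expansion(range(n), complete(n)))
```

On rings the minimum is attained by half of the ring, which has only two outgoing links, so the value shrinks as n grows and no positive constant l works for the sequence of rings.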
Independently of the work of Simon, the non-decomposability property of network sequences was introduced into graph theory in the early seventies under the name of expander property (see e.g. the monograph of Lubotzky (2012)). In our paper however, we will use the phrase non-decomposability instead of expander to avoid confusion. Below, we give a brief summary of the relevance of non-decomposability in network science.
• Leskovec et al. (2009) concluded that several social networks are basically non-decomposable or at least have a natural core-periphery decomposition, where the cores are non-decomposable.
• Most of the classical network models are non-decomposable.
(1) The Preferential Attachment Model (with more than one incoming link) is non-decomposable (Hofstad).
(2) The sparse version of the Watts-Strogatz Model and the zero configuration models (see "Appendix") are non-decomposable (Flaxman, 2007).
(3) In the sparse Erdős-Rényi random model, the sequence of the giant components is very close to being non-decomposable (Benjamini et al., 2014).
• Non-decomposability is closely related to various important network properties.
(1) Non-decomposable networks always have small average path lengths.
(2) The random walk is rapidly mixing on non-decomposable networks (see Lubotzky 2012). This is the reason why search algorithms such as PageRank are very effective on certain real networks.

Organized and disorganized complexity
In his frequently cited essay Warren Weaver (1948) distinguished two kinds of complexities: organized complexity and disorganized complexity. As Weaver observed, the study of simple systems that eighteenth century scientists had dealt with involved two variables or at most three or four. The slogan of the nineteenth century was "Let us develop analytical methods which can deal with two billion variables." The new statistical methods pioneered by Gibbs made such inquiries possible. Weaver went even further and stated that in such disorganized complexity problems the number of variables is perhaps totally unknown. These observations are in line with our epistemic model and with early twenty-first-century network science as well. As Borgatti et al. (2009) put it, these network constructions are somewhat simplistic and coarse-grained, not unlike the powerful statistical physical models of Josiah Willard Gibbs. On the other hand, there are problems that might still involve a considerable number of variables, while, as Weaver remarked, there are "a sizeable number of factors which are interrelated into an organic whole". As an example he asked how we can explain the behavior pattern of an organized group of persons such as a labor union or a group of manufacturers or a racial minority. These sorts of problems are somewhat strange to a network scientist with a nomothetic inclination, who is always looking for general laws and patterns, but they are bread and butter for social network analysts with an idiographic disposition.

Network models for organized complexity
Arguably, the first social network model dealing with complex behavior patterns that can be incorporated into our epistemic picture was constructed by the American economist Thomas Schelling (1971). Such agent-based models are extremely important in social network analysis, but somewhat neglected in the network science literature for their strongly idiographic nature. In our epistemic model the Nomothetist builds the Network Algorithm that creates a probability distribution on a space of network sequences, using statistical parameters and some rather vague intuition (see the next subsection) about the evolution/dynamical formation of the System. In many cases the Network Algorithm provides a relatively clear mathematical model. In an agent-based model an Idiographer-Nomothetist still creates a probability distribution on a space of network sequences, as we explain below. However, instead of using carefully measured statistical parameters, she applies deep insights concerning a phenomenon that is behind the formation of the System. That is, agent-based models, as in Epstein (1999), "situate an initial population of autonomous heterogeneous agents in a relevant spatial environment; allow them to interact according to simple local rules, and thereby generate - or 'grow' - the macroscopic regularity from the bottom up."
Recall that in our epistemic model the Idiographer creates a local rule of change to study a phenomenon. For agent-based models, the phenomenon under study is exactly the one that drives the formation of the network structure. For example, racial tension drives the formation of the neighbourhood structure of homes of segregated dwellers in the model of Schelling. Using the characterization of Epstein (1999) we can sketch what such algorithms look like.
(1) At the starting point the links are not fixed, but the agents (the nodes) are situated in some explicit space. This space can be a 3-dimensional lattice or a part of the plane, etc.
(2) The agents are autonomous, that is, they decide with whom they will be linked without a centralized authority.
(3) Pairs of agents that are in the vicinity of each other use local interactions until they find out whether they want to be linked or not. This local interaction can be playing games with each other to build trust, exchange of demands and supplies, revealing political interests etc.
(4) This linking process, which usually involves some randomness, leads to the formation of a network.
One should note that such algorithms still lead to the creation of probability distributions on a space of network sequences. For each fixed value of n, starting with n agents the algorithm randomly produces a finite network H_n. So, running the algorithm for larger and larger values of n (quite similarly to the Watts-Strogatz Model), the Idiographer-Nomothetist creates a sequence of networks as possible outcomes. Also, the randomness of the algorithm defines a probability distribution on the network sequences above. As opposed to the classical models of network science, these algorithms are usually analytically intractable, therefore the use of simulation is crucial. There are no real claims that the network properties of the original Network and the Network Models are shared; only the key phenomenon is intended to be explained (Epstein, 1999). Indeed, it is rather rare (but not impossible, see De Caux et al. (2014) or Hamill and Gilbert (2009)) that the classical vague properties of network science can be deduced for these constructions. On the other hand, there are some rather crude near-decomposable models in classical network science, such as the Hierarchical Modular Model of Ravasz et al. (2002), for which vague properties can be inferred, but they are "proof-of-concept demonstrations" rather than models of concrete organized systems (Serban forthcoming).
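The four steps listed above can be made concrete with a deliberately crude toy, which is not Schelling's model nor any model from the cited literature. All specifics are assumptions for illustration: agents are random points of the unit square, "vicinity" means distance at most `radius`, and the local interaction is reduced to two independent coin flips with success probability `trust`.

```python
import math
import random

def grow_agent_network(n_agents, radius=0.1, trust=0.5, seed=0):
    """Toy agent-based network formation following steps (1)-(4)."""
    rng = random.Random(seed)
    # (1) agents are situated in an explicit space, here the unit square
    position = [(rng.random(), rng.random()) for _ in range(n_agents)]
    links = []
    for a in range(n_agents):
        for b in range(a + 1, n_agents):
            # (3) only agents in each other's vicinity interact
            if math.dist(position[a], position[b]) <= radius:
                # (2) each agent decides on its own; a link needs both to agree
                if rng.random() < trust and rng.random() < trust:
                    links.append((a, b))
    # (4) the accepted links form the network H_n
    return position, links

for n in (100, 200, 400):
    position, links = grow_agent_network(n, seed=n)
    print(n, "agents,", len(links), "links")
```

Running the function for growing n yields one finite network H_n per run, and the random choices inside it induce the probability distribution on the possible outcomes.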

Network models for disorganized complexity
Our epistemic view admittedly has a Gibbsian nature. While creating the Network Model, the Nomothetist mostly depends upon some statistical parameters and perhaps on a simplistic and coarse-grained (Borgatti et al., 2009) intuitive picture about the formation/creation of the System's network topology.
• Zero models (see the "Appendix" and also Sect. 9 for further explanations) are the extreme cases, where no intuition about the formation of the Network is used. Still, the random bipartite zero model of Newman et al. (2001) was applied to study the boards of directors of Fortune 1000 companies. Actually, a zero model is a random model conditioned on some constraint parameter, which gives zero models some Bayesian flavour. In network science, the zero models (particularly the Erdős-Rényi model) represent chaos/disorder in its totality as benchmark models. The main goal of Nomothetists such as Barabási (see Jacomy 2020) is to find "nature's unmistakable sign that chaos is departing in favor of order" (Barabási, 2002). Note that a departure from order in favour of chaos was studied in the agent-based models. Indeed, the autonomous behaviour of the individual agents introduces some randomness into the benchmark model of total order (regular lattices), creating some disorder. However, as we have seen above, moving from total order in the direction of disorder, these models still preserve near-decomposability, the mark of organized complexity. As we see below, the introduction of the simplistic and coarse-grained intuitive picture about the formation/creation of the network topology will still preserve non-decomposability, the mark of disorganized complexity.
• Preferential growth models are network models in which new nodes connect themselves to existing nodes using some rules involving randomness. Of course, these models can be viewed as primitive agent-based models with autonomy and very simple interactions, without the explicit space and the locality assumption. Note that the formation of online social networks such as Facebook or Twitter, very roughly speaking, works in such a way.
In the classical Preferential Attachment Model the rather crude intuition about the creation of the Network is "the rich get richer" characteristic. Some versions of the preferential attachment model use fitness of the individual nodes (Bianconi and Barabási, 2001) or some copying mechanism (Kumar et al., 2000) as a more realistic and less crude intuition about network formation. Still, these models are non-decomposable.
• Models with underlying topography are even closer to agent-based models since they satisfy the explicit space condition. The first such model (Holland and Leinhardt, 1971) is just a slight modification of the Erdős-Rényi model, where the location of the individual agents in the explicit space is taken into consideration when the probability that a link is chosen is defined. The famous Watts-Strogatz Model starts with a regular ring lattice as an explicit space. In these models the rather crude intuition about the creation of the network is the existence of the explicit space itself. In the Watts-Strogatz Model and its variants there is a quite visible underlying near-decomposable structure on the nodes; nevertheless, due to the amount of randomness introduced in the construction of additional links, the network itself is non-decomposable. However, the less randomness is introduced, the closer the network remains to being near-decomposable. In certain cases, such as the Navigation Model of Kleinberg (2000), one can see the phase transition where, due to the change of some tuning parameter, randomness starts to dominate structure or, in other words, organized complexity turns into disorganized complexity.

On Rathkopf's theses about network models
Rathkopf's first thesis (2018) is that "network models embrace complexity". In the light of our considerations above one must add: network models constructed by Nomothetists (and Rathkopf studies only this kind of model) embrace disorganized complexity. The other thesis of Rathkopf needs a bit more reflection. Simon (1962) doubted that properties of non-decomposable systems can be satisfactorily explained, remarking that "they may to a considerable extent escape our observation and our understanding". Rathkopf's second thesis seems to contradict Simon's thesis: network representations are particularly helpful in explaining the properties of non-decomposable systems. The simulation of the Watts-Strogatz Model or the Preferential Attachment Model explained "six-degrees of separation" and the fast spread of epidemics. However, one should note that these phenomena are consequences of vague network properties that are in turn implied by non-decomposability. That is why non-decomposability is a meta-property in our epistemic model of network science. The Nomothetist must have a correct intuition about the non-decomposability of the Network to use the very elegant but "simplistic and coarsely-grained" network models. Then, she might be able to predict quite interesting properties of the System, without truly contradicting the maxim of Herbert Simon.

The idiographer's view of network science
Our epistemic model recounts network science from an undeniably nomothetic point of view. By looking at the network of an immensely large system, the Nomothetist creates a universe of networks, the elements of the network sequences evolving from her algorithm. Social network analysts, by contrast, dealt with rather small systems, where the "nodes" had names, lives and history. In this last section we try to look at network science through the eyes of scientists with a strong idiographic predisposition, favouring the actual and the concrete over the theoretical and the universal.

The perception of the typical
One of the most compelling differences between the view of the Nomothetist and the view of the Idiographer is the perception of the Typical. Barabási postulates in his book Network Science (2016) that scale-free networks have smaller path lengths. What Barabási meant is that scale-free networks typically have small average path length, even smaller than that of the giant component of an Erdős-Rényi network with the same number of links. If one considers all the networks that have $10^{100}$ nodes and a degree distribution close to the power-law, then more than 99.999% of them have smaller average path length than the corresponding Erdős-Rényi network. This is a true and rather uncontroversial statement. In the same book Barabási described a similar typicality result from Albert et al. (2000), which states that scale-free networks are, in the correct mathematical sense, more vulnerable under targeted attack than the corresponding random network. This statement is still true; however, it led to the "Achilles Heel of the Internet" paradigm and stirred quite a bit of controversy (Jacomy, 2020). Surely, in systems of disorganized complexity the Gibbsian statistical typicality notion prevails. However, the Internet router topology seems to be more than just a possible outcome of a zero model based on its purported degree distribution: its complexity has a more organized character. Even if the Internet router-network were scale-free and 99.9999% of the scale-free networks of the size of the Internet were vulnerable under targeted attack, the proposition "Hence the Internet is vulnerable under targeted attack" could not be inferred in the usual inductive/statistical way. The nomothetically minded network scientist is prone to use typicality for inductive/statistical inference, whereas some network scientists with an idiographic inclination use typicality mostly as a frame of reference, as we will see below.
One should note, however, that some social network analysts reject even the comparison with the typical. As Borgatti et al. (2009) put it: "To them, baseline models like simple random graphs seem naïve in the extreme - like comparing the structure of a skyscraper to a random distribution of the same quantities of materials." That is, although a skyscraper might be viewed as a network of bricks where two nodes are linked if the corresponding bricks are touching each other, the random network that has exactly the same number of nodes and links as the skyscraper is not "building-like" enough for the comparison. The network of a skyscraper should be compared to the networks of real buildings.

The world of the small
"From a social scientist's point of view, network research in the physical sciences can seem alarmingly simplistic and coarse-grained. And, no doubt, from a physical scientist's point of view, network research in the social sciences must appear oddly mired in the minute and the particular, using tiny data sets and treating every context as different." (Borgatti et al., 2009)

One of the most studied networks in social science is Zachary's Karate Club Network, which contains only 34 nodes and 78 links; another well-studied network, the Santa Fe Institute Collaboration Network, contains 118 nodes and 200 links. For such small networks the limitation we have in our epistemic model does not really exist, since the whole data set of the relation structure of the network is known; in particular, the size of the network and the number of links are known to the Idiographer. In our epistemic model average path length has a rather crude quantification. If $N$ is a connected (see "Appendix") network, then we can define its normalized average path length parameter by

$$\mathrm{apl}(N) = \frac{\text{average path length of } N}{\ln |N|},$$

where $|N|$ is the number of nodes and $\ln$ stands for the natural logarithm. If this number is not a large number, then we call the network $N$ of small average path length. Now, for a small network $N$ we can consider the family of all connected networks with the same number of nodes and links. Thus, we can ask ourselves: how large is the average path length of $N$ in comparison with the average of the average path lengths of the networks in the family above? If the average path length of $N$ is much larger than the average of the average path lengths, then we can consider $N$ of large average path length, even if $\mathrm{apl}(N)$ is tiny, since in its own context $N$ counts as a network of large average path length.
In practice (and there is a strong probability-theoretic reason for this that is beyond the scope of our paper) the average of the average path lengths is very typical, that is, the average path lengths of most of the networks with the same number of nodes and links concentrate very close to their average, and this statement holds for some other interesting parameters as well. Note, however, that computing the average path lengths of all such networks is technically impossible, so Idiographers use the expected value of the studied parameters in the Erdős-Rényi random network having the same size and expected number of links as the network they investigate. For example, Humphries and Gurney (2008) quantified small-worldliness by comparing the given network to the Erdős-Rényi random network with the same number of nodes and expected number of links (see "Appendix").
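The comparison described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the procedure of any cited paper: nodes are labelled 0 to n-1, the function names `average_path_length` and `random_baseline_apl` are hypothetical, and disconnected random samples are simply discarded (a simplification; the Appendix mentions the giant component as an alternative).

```python
import random
from collections import deque
from itertools import combinations


def average_path_length(n, links):
    """Average shortest-path distance over all node pairs.

    Returns None if the network on nodes 0..n-1 is not connected.
    """
    adj = {v: set() for v in range(n)}
    for u, v in links:
        adj[u].add(v)
        adj[v].add(u)
    total = 0
    for source in range(n):
        # Breadth-first search gives the distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        if len(dist) < n:
            return None
        total += sum(dist.values())
    # Every unordered pair was counted twice, hence n * (n - 1).
    return total / (n * (n - 1))


def random_baseline_apl(n, num_links, samples=200, seed=0):
    """Mean average path length over connected random networks
    with n nodes and exactly num_links links (assumed baseline)."""
    rng = random.Random(seed)
    all_pairs = list(combinations(range(n), 2))
    values = []
    attempts = 0
    while len(values) < samples and attempts < 100 * samples:
        attempts += 1
        apl = average_path_length(n, rng.sample(all_pairs, num_links))
        if apl is not None:  # assumption: skip disconnected samples
            values.append(apl)
    if not values:
        raise ValueError("no connected sample found")
    return sum(values) / len(values)
```

For a path on six nodes, for instance, `average_path_length(6, [(i, i + 1) for i in range(5)])` gives 35/15, roughly 2.33, which can then be set against `random_baseline_apl(6, 5)` to judge whether the path counts as long "in its own context".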

Community structures
In Sect. 7 we introduced an ad hoc modularity parameter that worked quite well for network models, but it is not very useful for Idiographers studying networks of small size. It is an important task in sociology to identify social groups in societies that are defined by kinship, occupation or common religious or political interest. The subnetworks corresponding to these social groups in the relation network of the given social system are supposed to be highly module-like, and these subnetworks are called communities. The mathematical task of the social network analyst is to detect communities in the given social network and even to partition the network into non-overlapping communities. The problem in general is to quantify how good a partition is, using some associated quality function. Again, the famous quality function of Girvan and Newman (2002) compares the partition of a given network with the partition of all networks with the same number of nodes and links, that is, with the corresponding Erdős-Rényi random network. However, there are competing modularity notions and associated quality functions (see Fortunato (2010) for a detailed survey), and even the use of quality functions to compare partitions with different numbers of modules has serious theoretical limitations (Fortunato and Barthelemy, 2007). The large, sparse near-decomposable networks of our epistemic model are well-partitionable (Hassidim et al., 2009), but these partitioning methods are very rarely applicable to the Idiographers' concrete small networks to identify social groups.
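To make the idea of a quality function with a random baseline concrete, here is a small Python sketch. It is an illustrative assumption of ours and not the formula of Girvan and Newman: it scores a partition by the fraction of links that fall inside communities, minus the fraction one would expect in a random network with the same number of nodes and links. The name `er_baseline_quality` and the input conventions (nodes 0 to n-1, communities as lists of nodes) are hypothetical.

```python
from math import comb


def er_baseline_quality(n, links, communities):
    """Observed fraction of intra-community links minus the fraction
    expected when the same number of links is placed uniformly at random."""
    label = {}
    for index, group in enumerate(communities):
        for node in group:
            label[node] = index
    if len(label) != n or sum(len(g) for g in communities) != n:
        raise ValueError("communities must partition all n nodes")
    inside = sum(1 for u, v in links if label[u] == label[v])
    observed = inside / len(links)
    # Under the uniform baseline every node pair is equally likely to be
    # linked, so the expected fraction is the share of intra-community pairs.
    intra_pairs = sum(comb(len(group), 2) for group in communities)
    expected = intra_pairs / comb(n, 2)
    return observed - expected
```

With this convention the trivial partition into one community always scores zero, and a partition is "good" only to the extent that it beats the random baseline.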

The problem of scale-freeness
A prefiguration of the Preferential Attachment Model had appeared in a paper of Herbert Simon (1955). Scale-freeness, that is, power-law distribution, had already been studied by Pareto and Yule in the early twentieth century, and it had been used in both the natural and the social sciences. As we have already seen, classical Nomothetists of network science viewed networks from the point of view of a statistical physicist dealing with disorganized complexity. They looked at networks as if they emerged at the criticality of some continuous phase transition. This picture was already studied in the case of the emergence of the giant component in percolation theory, where the cluster sizes near the critical probability obey a power-law. Percolation theory had a crucial role in early key papers of network science, such as Callaway et al. (2000), which was co-authored by Newman, Strogatz and Watts, and in the important papers of Albert and Barabási (1999) and Albert et al. (2000). At the same time, Faloutsos et al. (1999) claimed that the Internet router-level topology had a power-law degree distribution, and several similar claims of scale-freeness appeared in a very short time, creating a perfect storm. As Jacomy (2020) observed, for the nomothetic network scientist scale-freeness is the sign of a universal law. The classical Nomothetists practically had an ontological commitment to scale-freeness, a commitment that has been supported, or seemed to be supported, by some empirical results.
On the other hand, Jacomy points out that for scientists of idiographic disposition, scale-freeness is an empirical characterization of the network. Since the scale-free distribution is infinite, for concrete, finite networks scale-freeness is only a vague property, as opposed to the Preferential Attachment Model, which exhibits perfect scale-freeness. The refined statistical tools of the Idiographer enable her to study the empirical degree distributions of concrete networks and compare them to several other distributions, not only to the power-law distribution. It turned out that scale-freeness in its strict form is not as common in real-world networks as it was believed at the beginning of the twenty-first century (Broido and Clauset, 2019). Although these results cannot completely reject the emergence thesis of some nomothetically minded network scientists, they cast a shadow over the primacy of the Nomothetists over the Idiographers in the science of networks.
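As a rough illustration of what studying an empirical degree distribution can look like, the following Python sketch computes the empirical distribution of a degree sequence and a power-law exponent estimate. The estimator is the widely used approximate maximum-likelihood formula for discrete power laws, which is our choice for illustration and is not taken from the text; the function names and the cutoff parameter `d_min` are hypothetical. A serious analysis would also compare the fit against alternative distributions.

```python
from collections import Counter
from math import log


def empirical_degree_distribution(degrees):
    """Map each observed degree to its relative frequency."""
    counts = Counter(degrees)
    total = len(degrees)
    return {d: c / total for d, c in sorted(counts.items())}


def power_law_exponent(degrees, d_min=1):
    """Approximate maximum-likelihood exponent for degrees >= d_min.

    Uses alpha = 1 + k / sum(ln(d / (d_min - 1/2))) over the k tail
    degrees, an approximation for integer-valued data.
    """
    if d_min < 1:
        raise ValueError("d_min must be at least 1")
    tail = [d for d in degrees if d >= d_min]
    if not tail:
        raise ValueError("no degrees at or above d_min")
    log_sum = sum(log(d / (d_min - 0.5)) for d in tail)
    return 1.0 + len(tail) / log_sum
```

The point of the sketch is only that the exponent is an estimate from finite data: for a concrete network it comes with a cutoff choice and statistical uncertainty, which is one way in which scale-freeness remains vague.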
• Basic definitions. Throughout the paper a network means a finite set of nodes, where some pairs of different nodes are connected by one undirected link. The degree of a node is the number of links incident to it. A path of length $n$ between two nodes $u$ and $v$ in a network $N$ is a sequence of different nodes $a_0, a_1, a_2, \dots, a_n$ in $N$, where $a_i$ and $a_{i+1}$ are linked, $a_0 = u$ and $a_n$ is linked to $v$. If $n$ is at least 2 and $u$ equals $v$, the path above is called a cycle. A network is connected if there is a path between all pairs of nodes in the network. A connected network without cycles is called a tree. In a connected network the distance of two nodes is the length of the shortest path between them. The average path length in a connected network is the average of the distances taken over all pairs of nodes in the network. For a positive integer $r$, the $r$-neighbourhood of a node $u$ of a network $N$ is the set of all nodes $v$ in $N$ such that the distance of $u$ and $v$ is not larger than $r$.
• The clustering coefficient. This parameter was introduced by Watts and Strogatz (1998) to measure small-worldliness. If $N$ is a network and $v$ is a node of $N$, the local clustering coefficient is defined as

$$C(v) = \frac{2\,T(v)}{d(v)\,(d(v)-1)},$$

where $T(v)$ is the number of links between the neighbours of $v$ and $d(v)$ is the degree of $v$. We assume here that $d(v)$ is larger than 1; otherwise $C(v)$ is set to be zero. The clustering coefficient of a network is defined as the average of all the local clustering coefficients of the nodes in the network. Note that in large sparse networks or in the preferential attachment model the clustering coefficient is very small, thus a high clustering coefficient is frequently considered as a sign of some hierarchical modularity (Ravasz et al., 2002). However, one should note that by using triadic closure we can raise the value of the clustering coefficient of a network.
Triadic closure means that if the distance of two nodes is exactly two, then we link them with a certain probability; hence there are completely random procedures as well that result in high clustering coefficients.
• The Erdős-Rényi random network model. The model originates from the seminal paper of Erdős and Rényi (1959). In their construction $n$ nodes are given, and for a number $L$ they consider all networks on the $n$ nodes having $L$ links with equal probability, defining a random network $G_{ER}(n, L)$ (that is, a probability distribution on the set of all networks on the $n$ nodes having exactly $L$ links). Gilbert (1959) introduced a similar random network model, $G_G(n, p)$, where for each pair of nodes a link is assigned with probability $p$. If $p = \frac{2L}{n(n-1)}$, then the expected number of links in the Gilbert model is $L$, and in fact the behavior of $G_{ER}(n, L)$ and $G_G\left(n, \frac{2L}{n(n-1)}\right)$ is very similar. In the sparse random network model $G_S(n, t)$ with parameter $t$, frequently used by Nomothetists, for each $n$ the probability $p_n$ is chosen in such a way that the expected number of links is $tn$. Idiographers mostly use the Erdős-Rényi model, since it perfectly describes typicality (see Sect. 9). The definition of the average path length of the Erdős-Rényi random network is slightly controversial, since if $L$ is not much larger than $n$, then with high probability $G_{ER}(n, L)$ is not connected. However, for large $n$ a giant component emerges, whose size is not much smaller than $n$, and one can consider its average path length. One might notice that the level of rigor concerning network models differs somewhat between graph theoretical and network theoretical papers. However, in practice Idiographers (e.g. Humphries and Gurney (2008)) use simulations of the Erdős-Rényi random network and not exact mathematical formulas, so the controversy does not have too much effect.
• Zero models. These models, which are also called configuration models, are variations on the theme of Erdős and Rényi. Fix an $n$ and some numbers $a_1, a_2, \dots, a_k$ such that there exists some network $N$ with exactly $a_i$ nodes of degree $i$. Choosing a random network on $n$ nodes with the same degree distribution defines a probability distribution on networks of size $n$. Varying $n$ and the degrees, one can construct various network models with scale-free, Poisson, regular (all the degrees are the same) and other distributions.
• The preferential attachment model. The model was first constructed in Barabási and Albert (1999). Bollobás et al. (2001) observed that the original model is slightly imprecise and made it mathematically sound. However, for the Preferential Attachment Tree Model the original construction is completely precise. Before starting the construction, notice that for each network $N$ there is an interesting probability distribution $\pi_N$ on the nodes of $N$: the probability of a node $u$ is proportional to the degree of $u$, and this determines $\pi_N$ completely. Now the construction of the Preferential Attachment Tree Model goes as follows. In the first step we have a network $N_1$ consisting of one single node. In the second step we have a network $N_2$ consisting of two nodes that form a link. In the third step we have a network $N_3$ consisting of three nodes that form a path of length 2. In the fourth step we add a new node to $N_3$ and link it to one of the nodes in $N_3$ according to the distribution $\pi_{N_3}$, obtaining the network $N_4$. Inductively, if $N_k$ has already been constructed, a new node is linked to one of the old nodes according to the probability distribution $\pi_{N_k}$. This is why the model is called preferential attachment: the new node will "prefer" old nodes with larger degree to old nodes with smaller degree. In the general, more complicated model the new node will be linked to $m$ old nodes, where $m$ is some fixed number.
• The Watts-Strogatz Model and its sparse variant. For a given $n$ we can construct a base graph on $n$ nodes. First, we consider a cycle of length $n$ and link two nodes if their path distance on the cycle is less than the square of the natural logarithm of $n$. Then each link of the base graph is rewired with probability $p$, that is, with probability $p$ the link is replaced by a random link. Hence, we obtain a probability distribution $W(n, p)$ on networks that have $n$ nodes and as many links as the base graph has. Note that in the original paper the definition is a bit vague: instead of the square of the logarithm, any number can be taken that is much smaller than $n$ but much larger than the logarithm of $n$. Due to this technical condition, the Watts-Strogatz Model is not sparse. Flaxman (2007) considered a very simple, strongly sparse variation on the theme of Watts and Strogatz. Let $L_n$ be the $n$ by $n$ regular lattice. For each node $u$ of $L_n$, add a random link incident to $u$ to obtain a random network. Flaxman proved that his model is non-decomposable (for any connected base graph system, not only for lattices).
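Two of the definitions in this appendix, the clustering coefficient and the Preferential Attachment Tree Model, can be sketched in a few lines of Python. This is a minimal sketch under our own assumptions rather than a reference implementation: nodes are labelled 0, 1, 2, ... in order of arrival, links are pairs of node labels, and the function names are hypothetical.

```python
import random


def clustering_coefficient(n, links):
    """Average of the local clustering coefficients of nodes 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in links:
        adj[u].add(v)
        adj[v].add(u)
    total = 0.0
    for v in range(n):
        d = len(adj[v])
        if d < 2:
            continue  # C(v) is set to zero when the degree is below 2
        neighbours = list(adj[v])
        # t counts the links between the neighbours of v.
        t = sum(
            1
            for i, a in enumerate(neighbours)
            for b in neighbours[i + 1:]
            if b in adj[a]
        )
        total += 2 * t / (d * (d - 1))
    return total / n


def preferential_attachment_tree(size, seed=0):
    """Links of a tree on `size` nodes grown by preferential attachment."""
    if size < 1:
        raise ValueError("size must be at least 1")
    rng = random.Random(seed)
    links = []
    # Each node appears in `endpoints` once per incident link, so a uniform
    # choice from this list picks a node with probability proportional
    # to its degree.
    endpoints = []
    for new in range(1, size):
        old = 0 if new == 1 else rng.choice(endpoints)
        links.append((old, new))
        endpoints.extend((old, new))
    return links
```

Since every network produced by `preferential_attachment_tree` is a tree, `clustering_coefficient` returns zero on it, in line with the remark that the clustering coefficient is very small in the preferential attachment model.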