Abstract
The network models discussed in the previous chapter serve a variety of useful purposes. Yet for the purpose of statistical model building, they come up short. Indeed, as Robins and Morris [125] write, “A good [statistical network graph] model needs to be both estimable from data and a reasonable representation of that data, to be theoretically plausible about the type of effects that might have produced the network, and to be amenable to examining which competing effects might be the best explanation of the data.” None of the models we have seen up until this point are really intended to meet such criteria.
Notes
- 1.
These models have also been referred to as p* models, particularly in the social network literature, where they are seen as one of the later examples of a series of model classes introduced in succession over a roughly 20-year period covering the late 1970s, 1980s, and early 1990s. See the review of Wasserman and Pattison [145], for example. Our use of the term ‘exponential random graph models’ reflects current practice, which emphasizes the connection of these models with traditional exponential family models in classical statistics.
- 2.
Recall that an arbitrary (discrete) random vector Z is said to belong to an exponential family if its probability mass function may be expressed in the form
$$\mathbb{P}_{\theta}\left(\mathbf{Z} = \mathbf{z}\right) = \exp\left\{\theta^{T}\mathbf{g}(\mathbf{z}) - \psi(\theta)\right\}, \tag{6.1}$$
where \(\theta \in \mathbb{R}^{p}\) is a p × 1 vector of parameters, \(\mathbf{g}(\cdot)\) is a p-dimensional function of \(\mathbf{z}\), and \(\psi(\theta)\) is a normalization term, ensuring that \(\mathbb{P}_{\theta}(\cdot)\) sums to one.
- 3.
However, it is important to realize that it is not the case that simply any collection of (in)dependence relations among the elements of Y yields a proper joint distribution on Y. Rather, certain conditions must be satisfied, as formalized in the celebrated Hammersley-Clifford theorem (e.g., Besag [12]).
- 4.
The statnet suite is arguably the most sophisticated single collection of R packages for doing statistical modeling of network graphs, particularly from the perspective of social network analysis.
- 5.
That is, for each pair {i, j}, we assume that \(Y_{ij}\) is independent of \(Y_{i^{\prime}j^{\prime}}\), for any \(\{i^{\prime},j^{\prime}\}\neq \{i,j\}\).
- 6.
Note that \(S_{1}(\mathbf{y}) = N_{e}\) is the number of edges.
- 7.
Formally, Frank and Strauss introduced the notion of Markov dependence for network graph models, which specifies that two possible edges are dependent whenever they share a vertex, conditional on all other possible edges. A random graph G arising under Markov dependence conditions is called a Markov graph.
- 8.
In this context the term is used to refer to a probability distribution that places a disproportionately large amount of its mass on a correspondingly small set of outcomes.
- 9.
Hunter [78] offers an equivalent formulation of this definition, in terms of geometrically weighted counts of the neighbors common to adjacent vertices.
- 10.
We note that the ergm package provides not only summary statistics but also p-values. However, as mentioned earlier, the theoretical justification for the asymptotic chi-square and F-distributions used by ergm to compute these values has not been established formally to date. Therefore, our preference is to interpret these values informally, as additional summary statistics.
- 11.
Goodness-of-fit has been found to be particularly important where ERGMs are concerned, due in large part to the issue of potential model degeneracy.
- 12.
A random variable X is said to follow a Q-class mixture distribution if its probability density function is of the form \(f(x) =\sum _{q=1}^{Q}\alpha _{q}f_{q}(x)\), for class-specific densities \(f_{q}\), where the mixing weights \(\alpha_{q}\) are all non-negative and sum to one.
- 13.
The entropy of a discrete probability distribution \(\mathbf{p} = (p_{1},\ldots,p_{Q})\) is defined as \(H(\mathbf{p}) = -\sum _{q=1}^{Q}p_{q}\log _{2}p_{q}\), with smaller values indicating a distribution concentrated on fewer classes. This value is bounded above by \(\log_{2} Q\), corresponding to a uniform distribution on {1, …, Q}.
- 14.
A set of random variables is said to be exchangeable if their joint distribution is the same for any ordering.
- 15.
In general, a probit model specifies, for a binary response Y and covariates \(\mathbf{x}\), that \(\mathbb{P}(Y = 1\vert \mathbf{X} = \mathbf{x}) =\varPhi ({\mathbf{x}}^{T}\beta )\), for some β, where \(\varPhi\) is the standard normal cumulative distribution function.
- 16.
The package latentnet, in the statnet suite of tools, implements other variants of latent network models, such as latent distance models.
- 17.
The arguments S and burn chosen in our example ask that a ‘burn-in’ of 10,000 iterations be used to initiate our MCMC sampler, after which the following 1,000 iterations are used to perform posterior inference.
- 18.
- 19.
An ROC curve is commonly used in classification problems. The term refers to a curve obtained by plotting the true positive rate of a classifier against the false positive rate, as a threshold (or similar parameter) is varied across its natural range, where the threshold is applied to the predicted values to discriminate between two classes of interest. Here, since the predictions are posterior probabilities, the threshold is varied from 0 to 1, and each vertex pair whose posterior probability of an edge exceeds the threshold is predicted to have an edge.
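As an illustration of the exponential family form (6.1) in note 2, the following minimal Python sketch enumerates a small outcome space and computes the normalization term \(\psi(\theta)\) by brute force. The function name `exp_family_pmf`, the three-vertex toy outcome space, the edge-count statistic, and the value θ = 0.5 are all assumptions made for illustration; none of them come from the chapter.

```python
import itertools
import math

def exp_family_pmf(theta, g, outcomes):
    """Return {z: P_theta(Z = z)} for the form exp{theta^T g(z) - psi(theta)}."""
    # Unnormalized log-weights theta^T g(z) for every outcome z.
    scores = {z: sum(t * s for t, s in zip(theta, g(z))) for z in outcomes}
    # psi(theta) is the log of the normalizing sum over all outcomes.
    psi = math.log(sum(math.exp(v) for v in scores.values()))
    return {z: math.exp(v - psi) for z, v in scores.items()}

# Hypothetical toy example: z records the 3 possible edges among 3 vertices,
# and g(z) is the single statistic "number of edges".
outcomes = list(itertools.product([0, 1], repeat=3))
pmf = exp_family_pmf(theta=[0.5], g=lambda z: [sum(z)], outcomes=outcomes)
total = sum(pmf.values())  # should equal one up to rounding error
```

Brute-force enumeration is only feasible for very small outcome spaces, since the number of graphs grows exponentially in the number of vertex pairs.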
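The entropy defined in note 13 can be checked numerically with a short Python sketch. The function name `entropy_bits` and the two example distributions are hypothetical, chosen only to show that a uniform distribution attains the upper bound \(\log_{2} Q\) while a concentrated one falls well below it.

```python
import math

def entropy_bits(p):
    """Entropy H(p) = -sum_q p_q log2 p_q, with 0 * log2(0) taken as 0."""
    return -sum(pq * math.log2(pq) for pq in p if pq > 0)

# Illustrative distributions over Q = 4 classes (made-up values).
uniform = [0.25, 0.25, 0.25, 0.25]
concentrated = [0.97, 0.01, 0.01, 0.01]

h_uniform = entropy_bits(uniform)            # attains the bound log2(4) = 2
h_concentrated = entropy_bits(concentrated)  # much smaller than 2
```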
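The thresholding procedure described in note 19 can be sketched in a few lines of Python, using the standard convention of true positive rate on the vertical axis and false positive rate (one minus the true negative rate) on the horizontal axis. The function name `roc_points`, the posterior probabilities, and the edge labels below are made up for illustration and are not results from the chapter.

```python
def roc_points(probs, labels, thresholds):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for t in thresholds:
        # Predict an edge for pairs whose posterior probability exceeds t.
        pred = [p > t for p in probs]
        tp = sum(1 for yhat, y in zip(pred, labels) if yhat and y == 1)
        fp = sum(1 for yhat, y in zip(pred, labels) if yhat and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

# Made-up posterior edge probabilities and true edge indicators.
probs = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
thresholds = [0.0, 0.25, 0.5, 0.75, 1.0]
curve = roc_points(probs, labels, thresholds)
```

As the threshold rises from 0 to 1, the points move from (1, 1), where every pair is predicted to have an edge, to (0, 0), where none is.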
References
E.M. Airoldi, D.M. Blei, S.E. Fienberg, E.P. Xing, Mixed membership stochastic blockmodels. J. Mach. Learn. Res. 9, 1981–2014 (2008)
D. Aldous, Exchangeability and related topics. In École d’Été de Probabilités de Saint-Flour XIII—1983, (Springer, Berlin, 1985), pp. 1–198
J. Besag, Spatial interaction and the statistical analysis of lattice systems. J. Roy. Stat. Soc. Ser. B 36(2), 192–236 (1974)
A. Coja-Oghlan, A. Lanka, Finding planted partitions in random graphs with general degree distributions. SIAM J. Discrete Math. 23(4), 1682–1714 (2009)
J.-J. Daudin, F. Picard, S. Robin, A mixture model for random graphs. Stat. Comput. 18(2), 173–183 (2008)
O. Frank, D. Strauss, Markov graphs. J. Am. Stat. Assoc. 81(395), 832–842 (1986)
C. Geyer, E. Thompson, Constrained Monte Carlo maximum likelihood for dependent data. J. Roy. Stat. Soc. Ser. B 54(3), 657–699 (1992)
M. Handcock, Assessing degeneracy in statistical models of social networks. Technical Report No. 39, Center for Statistics and the Social Sciences, University of Washington, 2003
P. Hoff, Modeling homophily and stochastic equivalence in symmetric relational data. Advances in Neural Information Processing Systems, NIPS (MIT Press, Cambridge, 2008)
D.N. Hoover, Row-column exchangeability and a generalized model for probability. In Exchangeability in Probability and Statistics (North-Holland, Amsterdam, 1982), pp. 281–291
D. Hunter, Curved exponential family models for social networks. Soc. Network. 29(2), 216–230 (2007)
D. Hunter, M. Handcock, Inference in curved exponential family models for networks. J. Comput. Graph. Stat. 15(3), 565–583 (2006)
B. Karrer, M.E. Newman, Stochastic blockmodels and community structure in networks. Phys. Rev. E 83(1), 016107 (2011)
D. Lusher, J. Koskinen, G. Robins, Exponential Random Graph Models for Social Networks: Theory, Methods, and Applications (Cambridge University Press, Cambridge, 2012)
G. McLachlan, T. Krishnan, The EM Algorithm and Extensions, vol. 382 (Wiley, New York, 2007)
K. Nowicki, T. Snijders, Estimation and prediction for stochastic blockstructures. J. Am. Stat. Assoc. 96(455), 1077–1087 (2001)
P. Pattison, G. Robins, Neighborhood-based models for social networks. Socio. Meth. 32(1), 301–337 (2002)
G. Robins, M. Morris, Advances in exponential random graph (p*) models. Soc. Network. 29(2), 169–172 (2007)
G. Robins, P. Pattison, Y. Kalish, D. Lusher, An introduction to exponential random graph (p*) models for social networks. Soc. Network. 29(2), 173–191 (2007)
T. Snijders, P. Pattison, G. Robins, M. Handcock, New specifications for exponential random graph models. Socio. Meth. 36(1), 99–153 (2006)
S. Wasserman, K. Faust, Social Network Analysis: Methods and Applications (Cambridge University Press, New York, 1994)
S. Wasserman, P. Pattison, Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika 61(3), 401–425 (1996)
© 2014 Springer Science+Business Media New York
About this chapter
Cite this chapter
Kolaczyk, E.D., Csárdi, G. (2014). Statistical Models for Network Graphs. In: Statistical Analysis of Network Data with R. Use R!, vol 65. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-0983-4_6
DOI: https://doi.org/10.1007/978-1-4939-0983-4_6
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-0982-7
Online ISBN: 978-1-4939-0983-4
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)