Abstract
Models of social interaction have mostly focused on the dyad, the smallest possible social structure, as the unit of network analysis. In the context of friendship networks, we argue that the triad can also be seen as a building block that ensures the cohesion and stability of larger group structures. By explicitly modeling the mechanism behind network formation, individual attributes (such as gender and ethnicity) are often dissociated from purely structural network effects (such as popularity), acknowledging the presence of more complex configurations. Allowing structural configurations to emerge when nodes share similar attribute values describes real-world networks more adequately. We present a comprehensive set of network statistics that allow continuous attributes to be accounted for. We also draw on the literature on endogenous social effects to further explore the role of network structures in individual outcomes. A series of Monte Carlo experiments and an empirical example analyzing students’ friendship networks illustrate the importance of properly modeling attribute-based structural effects. In addition, we model unobserved nodal heterogeneity in the network formation process to control for possible friendship selection bias on educational outcomes. A critical issue discussed is whether friendships are driven by homogeneity across several attributes or by a balance between homophily on some, such as gender and race, and heterophily on others, such as socio-economic factors.
References
Battaglini, M., Sciabolazza, V.L., Patacchini, E.: Effectiveness of connected legislators. Am. J. Political Sci. 64(4), 739–756 (2020)
Boucher, V., Mourifié, I.: My friend far, far away: a random field approach to exponential random graph models. Economet. J. 20(3), S14–S46 (2017)
Bramoullé, Y., Djebbari, H., Fortin, B.: Identification of peer effects through social networks. J. Econom. 150(1), 41–55 (2009)
Burt, R. S.: Structural Holes: The Social Structure of Competition. Harvard University Press (1992)
Caimo, A., Friel, N.: Bayesian inference for exponential random graph models. Social Netw. 33(1), 41–55 (2011)
Caimo, A., Friel, N.: Bergm: Bayesian exponential random graphs in R. J. Stat. Softw. 61(2), 1–25 (2014)
Calvó-Armengol, A., Patacchini, E., Zenou, Y.: Peer effects and social networks in education. Rev. Econ. Stud. 76(4), 1239–1267 (2009)
Carnegie, N.B., Krivitsky, P.N., Hunter, D.R., Goodreau, S.M.: An approximation method for improving dynamic network model fitting. J. Comput. Graph. Stat. 24(2), 502–519 (2015)
Coleman, J.S.: Social capital in the creation of human capital. Am. J. Sociol. 94, S95–S120 (1988)
Cranmer, S.J., Desmarais, B.A.: Inferential network analysis with exponential random graph models. Polit. Anal. 19(1), 66–86 (2011)
Davis, J. A., Leinhardt, S.: The structure of positive interpersonal relations in small groups (1967)
De Paula, Á., Richards-Shubik, S., Tamer, E.: Identifying preferences in networks with bounded degree. Econometrica 86(1), 263–288 (2018)
Faust, K.: A puzzle concerning triads in social networks: Graph constraints and the triad census. Social Netw. 32(3), 221–233 (2010)
Fernandez, R.M., Gould, R.V.: A dilemma of state power: Brokerage and influence in the national health policy domain. Am. J. Sociol. 99(6), 1455–1491 (1994)
Frank, O., Strauss, D.: Markov graphs. J. Am. Stat. Assoc. 81(395), 832–842 (1986)
Goldsmith-Pinkham, P., Imbens, G.W.: Social networks and the identification of peer effects. J. Bus. Econ. Stat. 31(3), 253–264 (2013)
Goodreau, S.M., Kitts, J.A., Morris, M.: Birds of a feather, or friend of a friend? using exponential random graph models to investigate adolescent social networks. Demography 46(1), 103–125 (2009)
Graham, B.S.: An econometric model of network formation with degree heterogeneity. Econometrica 85(4), 1033–1063 (2017)
Han, X., Hsieh, C.-S., Ko, S.I.: Spatial modeling approach for dynamic network formation and interactions. J. Bus. Econ. Stat. 39(1), 120–135 (2021)
Harris, K. M., Halpern, C. T., Whitsel, E., Hussey, J., Tabor, J., Entzel, P., Udry, J. R.: The national longitudinal study of adolescent to adult health: research design (2009)
Holland, P.W., Leinhardt, S.: An omnibus test for social structure using triads. Sociol. Methods Res. 7(2), 227–256 (1978)
Hsieh, C.-S., Lee, L.F.: A social interactions model with endogenous friendship formation and selectivity. J. Appl. Economet. 31(2), 301–319 (2016)
Hsieh, C.-S., Lee, L.-F., Boucher, V.: Specification and estimation of network formation and network interaction models with the exponential probability distribution. Quant. Econ. 11(4), 1349–1390 (2020)
Hunter, D.R., Goodreau, S.M., Handcock, M.S.: Goodness of fit of social network models. J. Am. Stat. Assoc. 103(481), 248–258 (2008)
Hunter, D.R., Handcock, M.S., Butts, C.T., Goodreau, S.M., Morris, M.: ergm: a package to fit, simulate and diagnose exponential-family models for networks. J. Stat. Softw. 24(3), nihpa54860 (2008)
Hunter, D.R., Krivitsky, P.N., Schweinberger, M.: Computational statistical methods for social network models. J. Comput. Graph. Stat. 21(4), 856–882 (2012)
Jackson, M.O., Rogers, B.W., Zenou, Y.: The economic consequences of social-network structure. J. Econ. Lit. 55(1), 49–95 (2017)
Johnsson, I., Moon, H.R.: Estimation of peer effects in endogenous social networks: control function approach. Rev. Econ. Stat. 103(2), 328–345 (2021)
Krackhardt, D., Kilduff, M.: Structure, culture and simmelian ties in entrepreneurial firms. Social Netw. 24(3), 279–290 (2002)
Krivitsky, P.N., Koehly, L.M., Marcum, C.S.: Exponential-family random graph models for multi-layer networks. Psychometrika 85(3), 630–659 (2020)
Lee, L.-F.: Identification and estimation of econometric models with group interactions, contextual factors and fixed effects. J. Econom. 140(2), 333–374 (2007)
LeSage, J., Pace, R.K.: Introduction to Spatial Econometrics. Chapman and Hall/CRC (2009)
Lusher, D., Koskinen, J., Robins, G.: Exponential Random Graph Models for Social Networks: Theory, Methods, and Applications. Cambridge University Press (2013)
Lusher, D., Robins, G.: Formation of social network structure. In: Exponential Random Graph Models for Social Networks, pp. 16–28 (2013)
Manski, C.F.: Identification of endogenous social effects: the reflection problem. Rev. Econ. Stud. 60(3), 531–542 (1993)
McPherson, M., Smith-Lovin, L., Cook, J.M.: Birds of a feather: Homophily in social networks. Ann. Rev. Sociol. 27, 415–444 (2001)
Mele, A.: A structural model of dense network formation. Econometrica 85(3), 825–850 (2017)
Miyauchi, Y.: Structural estimation of pairwise stable networks with nonnegative externality. J. Econom. 195(2), 224–235 (2016)
Morris, M., Handcock, M.S., Hunter, D.R.: Specification of exponential-family random graph models: terms and computational aspects. J. Stat. Softw. 24(4), 1548 (2008)
Murray, I., Ghahramani, Z., MacKay, D.J.: MCMC for doubly-intractable distributions. In: Proceedings of the 22nd Annual Conference on Uncertainty in Artificial Intelligence (UAI-06), pp. 359–366. AUAI Press (2006)
Neal, R.M.: Taking bigger Metropolis steps by dragging fast variables. arXiv preprint arXiv:math/0502099 (2005)
O’Malley, A.J.: The analysis of social network data: an exciting frontier for statisticians. Stat. Med. 32(4), 539–555 (2013)
O’Malley, A.J., Marsden, P.V.: The analysis of social networks. Health Serv. Outcomes Res. Method. 8(4), 222–269 (2008)
Park, J., Haran, M.: Bayesian inference in the presence of intractable normalizing functions. J. Am. Stat. Assoc. 113(523), 1372–1390 (2018)
Pattison, P., Robins, G.: Neighborhood-based models for social networks. Sociol. Methodol. 32(1), 301–337 (2002)
Ridder, G., Sheng, S.: Estimation of large network formation games. arXiv preprint arXiv:2001.03838 (2020)
Robins, G., Elliott, P., Pattison, P.: Network models for social selection processes. Social Netw. 23(1), 1–30 (2001)
Robins, G., Pattison, P.: Random graph models for temporal processes in social networks. J. Math. Sociol. 25(1), 5–41 (2001)
Robins, G., Pattison, P., Kalish, Y., Lusher, D.: An introduction to exponential random graph (p*) models for social networks. Social Netw. 29(2), 173–191 (2007)
Simmel, G.: The Sociology of Georg Simmel, vol. 92892. Simon and Schuster (1950)
Snijders, T., Steglich, C., Schweinberger, M.: Modeling the coevolution of networks and behavior. In Longitudinal models in the behavioral and related sciences, pages 41–71. Routledge (2017)
Snijders, T.A., Pattison, P.E., Robins, G.L., Handcock, M.S.: New specifications for exponential random graph models. Sociol. Methodol. 36(1), 99–153 (2006)
Snijders, T.A., Van de Bunt, G.G., Steglich, C.E.: Introduction to stochastic actor-based models for network dynamics. Social Netw. 32(1), 44–60 (2010)
Steglich, C., Snijders, T.A., Pearson, M.: Dynamic networks and behavior: separating selection from influence. Sociol. Methodol. 40(1), 329–393 (2010)
Thiemichen, S., Friel, N., Caimo, A., Kauermann, G.: Bayesian exponential random graph models with nodal random effects. Social Netw. 46, 11–28 (2016)
Wang, P., Robins, G., Pattison, P., Lazega, E.: Exponential random graph models for multilevel networks. Social Netw. 35(1), 96–115 (2013)
Wang, P., Robins, G., Pattison, P., Lazega, E.: Social selection models for multilevel networks. Social Netw. 44, 346–362 (2016)
Wasserman, S., Pattison, P.: Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika 61(3), 401–425 (1996)
Weng, H., Parent, O.: A Social Interaction Model with Endogenous Network Formation. Working paper (2021)
Funding
Financial support from the Charles Phelps Taft Research Center is gratefully acknowledged. The authors would like to thank the referees and editor for their very constructive comments and suggestions, as well as Lung-fei Lee, James P. LeSage, Xiaodong Liu, Jeffrey Mills and other participants at the 2019 Midwest Econometric Group, the 2019 North American Regional Science Conference, the 2019 Southern Regional Science Association, and the 2019 Annual Summit on Applied Economics, Regional, and Urban Studies for their comments.
Appendices
Appendix A: Modeling dyadic and triadic covariates
Individuals tend to associate and bond with similar others. To estimate this homophily effect on network formation, we combine network configurations with attributes from which similarities between individuals can be measured. For example, we extend the mutual link \(w_{ij}w_{ji}\) to the similarity-based mutual link \(w_{ij}w_{ji}f(Z_{i,q},Z_{j,q})\), and the cycle link \(w_{ij}w_{jk}w_{ki}\) to the similarity-based cycle link \(w_{ij}w_{jk}w_{ki}g(Z_{i,q},Z_{j,q},Z_{k,q})\). To that end, we introduce the functions \(f(Z_{i,q},Z_{j,q})\) and \(g(Z_{i,q},Z_{j,q},Z_{k,q})\) to measure similarities within dyads (composed of individuals i, j) and triads (made of individuals i, j, k), respectively.
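As an illustration of how such similarity-weighted configurations can be counted, the following minimal sketch sums \(w_{ij}w_{ji}f(\cdot)\) over dyads and \(w_{ij}w_{jk}w_{ki}g(\cdot)\) over directed three-cycles. The function names, the toy network, and the convention of counting each dyad and each cycle once are assumptions made for the example, not the authors' implementation.

```python
import numpy as np

def mutual_similarity_stat(W, z, f):
    """Sum of w_ij * w_ji * f(z_i, z_j) over unordered pairs i < j."""
    n = W.shape[0]
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if W[i, j] and W[j, i]:
                total += f(z[i], z[j])
    return total

def cycle_similarity_stat(W, z, g):
    """Sum of w_ij * w_jk * w_ki * g(z_i, z_j, z_k) over directed 3-cycles."""
    n = W.shape[0]
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # requiring i to be the smallest index visits each cycle once
                if i < j and i < k and j != k and W[i, j] and W[j, k] and W[k, i]:
                    total += g(z[i], z[j], z[k])
    return total

# Toy directed network (hypothetical): 0<->1 is mutual, 0->1->2->0 is a cycle.
W = np.array([[0, 1, 0],
              [1, 0, 1],
              [1, 0, 0]])
same_pair = lambda a, b: float(a == b)
same_triple = lambda a, b, c: float(a == b == c)

mutual_mixed = mutual_similarity_stat(W, ["a", "a", "b"], same_pair)
cycle_mixed = cycle_similarity_stat(W, ["a", "a", "b"], same_triple)
cycle_same = cycle_similarity_stat(W, ["a", "a", "a"], same_triple)
```

The triple loop is cubic in the number of nodes and is meant only to make the definition explicit; a practical implementation would update these counts incrementally when a single link is toggled.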
For categorical attributes Z, we distinguish individuals’ similarity depending on whether they belong to the same category. Therefore, when Z is discrete, the functions f and g are defined as indicator functions: the similarity measure is equal to 1 when the attributes of all individuals in the dyad or triad match and 0 otherwise. Such similarity-based functions are common in statistical network analysis and are available in mainstream network packages such as Statnet.
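Written out, the 0/1 matching rule described above can be expressed with indicator functions. The symbol \(\mathbb{1}\{\cdot\}\) is notation assumed here for illustration:

```latex
f(Z_{i,q},Z_{j,q}) = \mathbb{1}\{Z_{i,q}=Z_{j,q}\},
\qquad
g(Z_{i,q},Z_{j,q},Z_{k,q}) = \mathbb{1}\{Z_{i,q}=Z_{j,q}=Z_{k,q}\}.
```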
For continuous attributes Z, the similarity measure, or observed homophily, is generally a function of the absolute value of the difference in characteristics between two individuals. For convenience, we transform the similarity measure so that it lies within the interval [0, 1] and takes greater values when attributes are more similar:
$$f(Z_{i,q},Z_{j,q}) = 1-\frac{|Z_{i,q}-Z_{j,q}|}{R_q},$$
where \(R_q\) represents the range of attribute \(Z_q\). Extending the same idea to triads, we use the sum of distances to measure the deviation between individual attributes, rescaled so that the similarity measure is also contained in the interval [0, 1]. To our knowledge, there is no statistical network analysis package that models higher-order connectivity terms with continuous attributes.
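A minimal sketch of the continuous similarity measures may help fix ideas. The dyadic function follows the description above (one minus the absolute difference scaled by the range). For the triad, the sketch assumes that the deviation is the sum of the three pairwise absolute differences, divided by \(2R_q\), its largest possible value; this normalization and the function names are assumptions of the example.

```python
def dyad_similarity(z_i, z_j, z_range):
    """1 - |z_i - z_j| / R: 1 for identical values, 0 at the full range."""
    return 1.0 - abs(z_i - z_j) / z_range

def triad_similarity(z_i, z_j, z_k, z_range):
    """Assumed form: 1 - (sum of pairwise distances) / (2 * R).

    For three points on a line the pairwise distances sum to
    2 * (max - min), so the result stays within [0, 1].
    """
    total = abs(z_i - z_j) + abs(z_j - z_k) + abs(z_i - z_k)
    return 1.0 - total / (2.0 * z_range)

# Hypothetical attribute values, e.g. ages of three students.
ages = [12.0, 14.0, 18.0]
R = max(ages) - min(ages)          # range of the attribute, here 6

close_pair = dyad_similarity(ages[0], ages[1], R)
spread_triad = triad_similarity(ages[0], ages[1], ages[2], R)
identical_triad = triad_similarity(14.0, 14.0, 14.0, R)
```

Under these assumptions a triad whose members span the whole range of the attribute receives similarity 0, while three identical values receive similarity 1.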
Appendix B: Double Metropolis-Hastings algorithm
The estimation procedure is based on Markov chain Monte Carlo (MCMC) methods. We use the traditional Gibbs sampler to estimate the outcome equation and introduce an extended version of the double Metropolis-Hastings (DMH) sampler to estimate the network formation model. The joint likelihood of the network and outcome equation is described in (4).
The unobserved sender heterogeneity parameter \(\eta _{s,i} \sim N(0,\sigma ^2_{\eta })\) follows a Normal distribution with mean zero and variance \(\sigma ^2_{\eta }\). For identification purposes, the prior mean has to be set to zero, as the intercept is captured by the statistic counting the number of arcs in the graph (Thiemichen et al. 2016). The variance \(\sigma ^2_{\eta }\) has a hyperprior that follows an inverse Gamma distribution. Since we assume independence across all prior distributions, the joint posterior distribution is proportional to the joint likelihood in (4) multiplied by the product of the prior distributions \(\pi (.)\) of all parameters of interest.
We assume \(\theta \sim N(0,\sigma _{\theta }^2I_q)\), \(\gamma \sim N(0,\sigma _{\gamma }^2 I_{2})\) and \({{\tilde{\beta }}} \sim N(\beta _0,\sigma _{{{\tilde{\beta }}}}^2I_{2k+3})\), with \({\tilde{\beta }}=(\alpha ,\beta ',\gamma ')'\), where all hyperparameters are chosen to form uninformative prior distributions. The prior for \(\rho \) is defined over its stability region by the uniform distribution \(\rho \sim U(1/\xi _{min},1/\xi _{max})\), where \(\xi _{min}\) and \(\xi _{max}\) are the minimum and maximum eigenvalues of the matrix W, respectively. This stability constraint is imposed during the random walk Metropolis-Hastings step of the main MCMC procedure (see LeSage and Pace 2009). Finally, the variance parameters have the prior distributions \(\sigma ^2\sim IG(a,b)\) and \(\sigma _{\eta }^2\sim IG(c,d)\), where a, b, c and d are chosen so that the priors are uninformative.
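The stability region for \(\rho \) can be computed directly from the spectrum of W. The sketch below is illustrative only: it assumes that only the real eigenvalues of W are used to define \(\xi _{min}\) and \(\xi _{max}\) (a directed network can have complex eigenvalues), and the function names are hypothetical.

```python
import numpy as np

def rho_support(W):
    """Return (1/xi_min, 1/xi_max) from the extreme real eigenvalues of W."""
    eig = np.linalg.eigvals(W)
    # Assumption of this sketch: discard complex eigenvalues.
    real = eig[np.abs(eig.imag) < 1e-10].real
    xi_min, xi_max = real.min(), real.max()
    if not (xi_min < 0 < xi_max):
        raise ValueError("need one negative and one positive real eigenvalue")
    return 1.0 / xi_min, 1.0 / xi_max

def log_prior_rho(rho, W):
    """Log density of the uniform prior on the stability region."""
    lower, upper = rho_support(W)
    return -np.log(upper - lower) if lower < rho < upper else -np.inf

# Row-normalised complete network on three nodes: eigenvalues 1, -0.5, -0.5.
W = (np.ones((3, 3)) - np.eye(3)) / 2.0
lower, upper = rho_support(W)
```

In a random walk Metropolis-Hastings step, a proposal \(\rho ^{\star }\) falling outside this interval has log prior \(-\infty \) and is therefore always rejected, which is one way to impose the constraint.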
The key challenge of the entire estimation procedure is to sample from the posterior distribution of the network parameters \(p(\theta |\eta _s,W,Z)\) while avoiding the intractable normalizing constant. To that end, we rely on Murray et al. (2006), who proposed an extended exchange algorithm based on annealed importance sampling (AIS). The main idea relies on the simulation of an auxiliary network u sharing the same state space as W. The posterior \(p(\theta |\eta _s,W,Z)\) is then obtained from the joint target distribution \(p(\theta ,u|\eta _s,W,Z)\) by integrating over the auxiliary network u. The joint target distribution \(p(\theta ,u|\eta _s,W,Z)\) is simulated using a Metropolis-Hastings (MH) algorithm on the \((\theta ,u)\) space, with proposal \(\theta ^{\star } \sim q(.|\theta )\) and auxiliary network \(u\sim f(.|\theta ^{\star },\eta _s,W,Z)\).
To improve the normalizing constant approximation, we generate not one latent network but an entire sequence of networks that have a greater probability of being generated under \(\theta \). A sequence of transitions \(T_{l}(u_{l}|u_{l-1},\theta ^{\star },\theta ,\eta _s,W,Z)\), \(l=1,\ldots ,L\), provides a route from the first distribution \(f (u_{0}|\theta ^{\star },\eta _s,W,Z)\) to the last distribution \(f (u_{L}|u_{L-1},\theta ^{\star },\eta _s,W,Z)\), with the auxiliary distributions interpolating between them. Each transitional network \(u_l\) is generated from \(T_{l}(u_{l}|u_{l-1},\theta ^{\star },\theta ,\eta _s,W,Z)\), the MH transition kernel whose stationary distribution is \(\varphi _{l}(.|\theta ^{\star },\theta ,\eta _s,W,Z)\). Following Neal (2005), the L target distributions \(\varphi _l\) interpolate between the unnormalized likelihoods evaluated at \(\theta ^{\star }\) and at \(\theta \), with weights \(\xi _l = (L-l+1)/(L+1)\) for \(l=1,\ldots ,L\).
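To make the schedule concrete, the sketch below computes the weights \(\xi _l\) and evaluates a bridging distribution for an exponential-family network model with unnormalized likelihood \(\varphi (u|\theta )=\exp (\theta 's(u))\). It assumes the geometric bridge used by Murray et al. (2006), \(\varphi _l(u)=\varphi (u|\theta ^{\star })^{\xi _l}\varphi (u|\theta )^{1-\xi _l}\), and extends the index to \(l=0\) and \(l=L+1\) so that the endpoints are \(\theta ^{\star }\) and \(\theta \). Both choices, and the function names, are assumptions of the example.

```python
import numpy as np

def bridge_weights(L):
    """xi_l = (L - l + 1) / (L + 1) for l = 0, ..., L + 1."""
    l = np.arange(L + 2)
    return (L - l + 1) / (L + 1)

def log_phi_bridge(stats_u, theta_star, theta, xi):
    """log[phi(u|theta*)^xi * phi(u|theta)^(1 - xi)] with phi = exp(theta' s(u))."""
    return xi * (theta_star @ stats_u) + (1.0 - xi) * (theta @ stats_u)

weights = bridge_weights(3)            # xi_0 = 1 (all theta*), xi_4 = 0 (all theta)

# Hypothetical sufficient statistics s(u) and parameter values.
s_u = np.array([10.0, 3.0])
theta_star = np.array([-1.0, 0.5])
theta = np.array([-1.2, 0.4])
start = log_phi_bridge(s_u, theta_star, theta, weights[0])
end = log_phi_bridge(s_u, theta_star, theta, weights[-1])
```

For an exponential-family model the bridge is itself a model of the same family evaluated at the mixed parameter \(\xi _l\theta ^{\star }+(1-\xi _l)\theta \), so the same MH kernel can be reused for every transition.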
This set of bridging distributions leads to the proposed exchange algorithm:
1. Draw \(\theta ^{\star }\sim q(.|\theta )\) and \(u_{0}\sim f(.|\theta ^{\star },\eta _s,Z)\). The proposal function q(.) is Normally distributed, centered around the current parameter values.
2. Generate the sequence of transitions:
$$\begin{aligned} u_1&\sim T_1(u_1|u_0,\theta ^{\star },\theta ,\eta _s,Z),\ldots ,\\ u_L&\sim T_L(u_L|u_{L-1},\theta ^{\star },\theta ,\eta _s,Z). \end{aligned}$$
3. Propose to move from \(\theta \) to \(\theta ^{\star }\) with probability
$$\begin{aligned} \alpha =\min \left\{ 1,\frac{p(\theta ^{\star })q(\theta |\theta ^{\star })\varphi (W|\theta ^{\star },\eta _s,Z)}{p(\theta )q(\theta ^{\star }|\theta )\varphi (W|\theta ,\eta _s,Z)} \prod _{l=0}^{L} \frac{\varphi _{l+1}(u_{l+1}|\theta ^{\star },\theta ,\eta _s,Z)}{\varphi _l(u_l|\theta ^{\star },\theta ,\eta _s,Z)}\right\} , \end{aligned} \tag{14}$$
where the function \(\varphi (.)\) indicates the unnormalized likelihood.
4. Draw \(\eta _s^{\star }\sim q(.|\eta _s)\) and \(W^{\star }\sim f(.|\theta ,\eta _s^{\star },Z)\).
5. Propose to move from \(\eta _s\) to \(\eta _s^{\star }\) with probability
$$\begin{aligned} \alpha =\min \left\{ 1,\frac{p(\eta _s^{\star })q(\eta _s|\eta _s^{\star })\varphi (W|\theta ,\eta _s^{\star },Z)}{p(\eta _s)q(\eta _s^{\star }|\eta _s)\varphi (W|\theta ,\eta _s,Z)}\frac{\varphi (W^{\star }|\theta ,\eta _s^{\star },Z)}{\varphi (W^{\star }|\theta ,\eta _s,Z)}\frac{f(y|X,W,\alpha ,\beta ,\gamma ,\eta _s^{\star })}{f(y|X,W,\alpha ,\beta ,\gamma ,\eta _s)}\right\} . \end{aligned} \tag{15}$$
6. Let \({\tilde{\beta }}=(\alpha ,\beta ',\gamma ')'\) and sample \({\tilde{\beta }}|X,W,y,\eta _s \sim N({\tilde{\beta }};{\hat{\beta }},\sigma ^2B^{-1})\), where \({\hat{\beta }}=B^{-1}({\check{X}}'(I_N-\rho W)y+\sigma ^{2}B_0^{-1}\beta _0)\) and \(B=(\sigma ^{2}B_0^{-1}+{\check{X}}'{\check{X}})\) with \({\check{X}}=(\iota _n,{\tilde{X}},{\tilde{\eta }})\), \({\tilde{X}}= \left[ X \ W X\right] \), and \({\tilde{\eta }}= \left[ \eta _s \ W \eta _s\right] \).
7. Sample \(\sigma ^2|X,W,y,{\hat{\beta }},\eta _s \sim IG(a_1,b_1)\), where \(a_1=a+n/2\), \(b_1=b+e'e/2\), and \(e=((I_N-\rho W)y-{\check{X}}{\tilde{\beta }})\).
8. Sample \(\sigma _\eta ^2|\eta _s \sim IG(c_1,d_1)\), where \(c_1=c+n/2\) and \(d_1=d+\eta _s'\eta _s/2\).
9. Draw \(\rho ^{\star }\sim N(\rho ,\kappa _{\rho }^2)\) and accept \(\rho ^{\star }\) with probability
$$\begin{aligned} \alpha =\min \left\{ 1,\frac{ l(y|X,W,{\tilde{\beta }},\sigma ^2,\rho ^{\star },\eta _s)p(\rho ^{\star }) }{l(y|X,W,{\tilde{\beta }},\sigma ^2,\rho ,\eta _s)p(\rho )} \right\} \end{aligned}$$
where \(\kappa _{\rho }\) is a tuning parameter adjusted during the burn-in period to keep the acceptance rate between 40% and 60%. Note that all proposal functions q(.) centered around the current parameter values contain tuning parameters as well. To generate each auxiliary network \(u_l\), \(l=0,\ldots ,L\), from the space of all possible networks, we use the MH step detailed in Sect. 4.
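Steps 1 to 3 of the algorithm above can be sketched on a deliberately simplified model. The example below is not the authors' implementation: it assumes a toy directed network model with only two statistics (arcs and mutual dyads), no sender effects \(\eta _s\) and no covariates Z, a Gaussian prior on \(\theta \), a symmetric random walk proposal, and an approximate draw of \(u_0\) by MH link toggles started from W. Following Murray et al. (2006), each bridging ratio is evaluated at the same auxiliary network \(u_l\) in numerator and denominator, which is an assumption of the sketch. All names (`stats`, `mh_toggles`, `exchange_step`, `step`, `prior_sd`, `n_toggles`) are hypothetical.

```python
import numpy as np

def stats(u):
    """s(u) = (number of arcs, number of mutual dyads) of a 0/1 matrix."""
    return np.array([u.sum(), np.sum(u * u.T) / 2.0])

def mh_toggles(u, theta, n_steps, rng):
    """MH kernel leaving exp(theta' s(u)) invariant: toggle one arc at a time."""
    u = u.copy()
    n = u.shape[0]
    for _ in range(n_steps):
        i, j = rng.choice(n, size=2, replace=False)   # i != j, no self-loops
        v = u.copy()
        v[i, j] = 1 - v[i, j]
        if np.log(rng.uniform()) < theta @ (stats(v) - stats(u)):
            u = v
    return u

def exchange_step(theta, W, L, rng, step=0.1, prior_sd=10.0, n_toggles=100):
    """One update of theta with L bridging transitions (steps 1-3)."""
    theta_star = theta + step * rng.standard_normal(theta.size)  # symmetric q
    xi = (L - np.arange(L + 2) + 1) / (L + 1)       # xi_0 = 1, ..., xi_{L+1} = 0
    u = mh_toggles(W, theta_star, n_toggles, rng)   # u_0, approximate draw
    log_bridge = 0.0
    for l in range(L + 1):
        if l > 0:
            # T_l targets phi_l, the model at the mixed parameter value
            mixed = xi[l] * theta_star + (1.0 - xi[l]) * theta
            u = mh_toggles(u, mixed, n_toggles, rng)
        # log phi_{l+1}(u_l) - log phi_l(u_l)
        log_bridge += (xi[l + 1] - xi[l]) * ((theta_star - theta) @ stats(u))
    log_prior = (theta @ theta - theta_star @ theta_star) / (2.0 * prior_sd**2)
    log_alpha = log_prior + (theta_star - theta) @ stats(W) + log_bridge
    return theta_star if np.log(rng.uniform()) < log_alpha else theta

# Hypothetical usage on a small random directed network.
rng = np.random.default_rng(0)
W = (rng.uniform(size=(8, 8)) < 0.2).astype(int)
np.fill_diagonal(W, 0)
theta = np.zeros(2)
for _ in range(20):
    theta = exchange_step(theta, W, L=3, rng=rng)
```

With L = 0 the bridging term reduces to \((\theta -\theta ^{\star })'s(u_0)\), which is the plain exchange ratio; larger L averages the statistics over the intermediate networks and typically raises the acceptance rate at the cost of more auxiliary simulation.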
About this article
Cite this article
Weng, H., Parent, O. Beyond homophilic dyadic interactions: the impact of network formation on individual outcomes. Stat Comput 33, 43 (2023). https://doi.org/10.1007/s11222-023-10215-5
DOI: https://doi.org/10.1007/s11222-023-10215-5