Community informed experimental design

  • Original Paper
Statistical Methods & Applications

Abstract

Network information has become a common feature of many modern experiments. From vaccine efficacy studies to marketing for product adoption, stakeholders aim to estimate global treatment effects: what happens if everyone in a network is treated versus if no one is treated. Because individual outcomes are potentially influenced by the treatments or behaviors of others in the network, experimental designs must condition on the underlying network. Social networks frequently exhibit homophilous community structure, meaning that individuals within observed or latent communities are more similar to each other. This observation motivates the development of community-aware experimental design. This design recognizes that information between individuals likely flows along within-community edges rather than across-community edges. We demonstrate that this design reduces the bias of a simple difference-in-means estimator, even when the community structure of the graph needs to be estimated. Further, we show that as the community detection problem gets more difficult or if the community structure does not affect the causal question, the proposed design maintains its performance.
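The abstract does not spell out the design itself, so the following is only a minimal sketch of the general idea, not the authors' method. It compares a node-level randomization with a design that assigns whole communities to treatment, and measures the bias of the difference-in-means estimator for the global treatment effect. Everything specific here is an assumption made for illustration: the stochastic block model parameters, the known community labels, the outcome model (a direct effect `beta` plus a spillover `gamma` times the fraction of treated neighbors), and all function names.

```python
import random


def sbm_graph(labels, p_in, p_out, rng):
    """Sample an undirected stochastic block model graph as adjacency lists."""
    n = len(labels)
    nbrs = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if labels[i] == labels[j] else p_out
            if rng.random() < p:
                nbrs[i].append(j)
                nbrs[j].append(i)
    return nbrs


def node_design(n, rng):
    """Baseline: treat exactly half of the nodes, ignoring the network."""
    treated = set(rng.sample(range(n), n // 2))
    return [1 if i in treated else 0 for i in range(n)]


def community_design(labels, rng):
    """Treat every node in a randomly chosen half of the communities."""
    comms = sorted(set(labels))
    treated = set(rng.sample(comms, len(comms) // 2))
    return [1 if c in treated else 0 for c in labels]


def outcomes(z, nbrs, beta, gamma, noise_sd, rng):
    """ASSUMED outcome model: direct effect beta, plus spillover gamma
    times the fraction of treated neighbors, plus Gaussian noise."""
    y = []
    for i, ni in enumerate(nbrs):
        frac = sum(z[j] for j in ni) / len(ni) if ni else 0.0
        y.append(beta * z[i] + gamma * frac + rng.gauss(0.0, noise_sd))
    return y


def diff_in_means(y, z):
    """Mean outcome of treated nodes minus mean outcome of control nodes."""
    treated = [yi for yi, zi in zip(y, z) if zi == 1]
    control = [yi for yi, zi in zip(y, z) if zi == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def simulate(seed=0, n=200, k=10, p_in=0.3, p_out=0.01,
             beta=1.0, gamma=1.0, noise_sd=0.5, reps=200):
    """Return (bias under node design, bias under community design)."""
    rng = random.Random(seed)
    labels = [i % k for i in range(n)]  # k equal-sized, known communities
    nbrs = sbm_graph(labels, p_in, p_out, rng)
    # Global effect under the assumed model: all treated minus none treated.
    tau = sum(beta + (gamma if ni else 0.0) for ni in nbrs) / n
    est_node, est_comm = 0.0, 0.0
    for _ in range(reps):
        z = node_design(n, rng)
        est_node += diff_in_means(outcomes(z, nbrs, beta, gamma, noise_sd, rng), z)
        z = community_design(labels, rng)
        est_comm += diff_in_means(outcomes(z, nbrs, beta, gamma, noise_sd, rng), z)
    return est_node / reps - tau, est_comm / reps - tau


if __name__ == "__main__":
    bias_node, bias_comm = simulate()
    print("bias, node-level design:     ", round(bias_node, 3))
    print("bias, community-level design:", round(bias_comm, 3))
```

Under these assumed settings, the community-level design should show the smaller absolute bias, because most of a node's neighbors lie in its own community and therefore share its assignment. The sketch uses the true labels; the abstract's claim about estimated communities would require replacing `labels` in `community_design` with the output of a community detection step.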

Acknowledgements

The authors gratefully acknowledge financial support from the Statistical and Applied Mathematical Sciences Institute, the National Science Foundation (DMS 2046880) and the Army Research Institute (W911NF1810233).

Author information

Corresponding author

Correspondence to Alexander Volfovsky.

Ethics declarations

Competing interests

The authors do not have any competing interests.

Code availability

Code for all simulation studies will be made available. No additional data were generated for this manuscript.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mathews, H., Volfovsky, A. Community informed experimental design. Stat Methods Appl 32, 1141–1166 (2023). https://doi.org/10.1007/s10260-022-00679-6

  • DOI: https://doi.org/10.1007/s10260-022-00679-6
