A fixed degree sequence model for the one-mode projection of multiplex bipartite graphs

Abstract

A bipartite structure is a common property of many real-world network data sets such as agents which are affiliated with societies, customers who buy, rent, or rate products, and authors who write scientific papers. The one-mode projection of these networks onto either set of entities (e.g., societies, products, and articles) is a well-established approach for the analysis of such data and deduces relations between these entities. Some bipartite data sets of key importance contain several distinct types of relations between their entities. These networks require a projection method which accounts for multiple edge types. In this article, we present the multiplex extension of an existing projection algorithm for simplex bipartite networks, i.e., networks that contain a single type of relation. We use synthetic data to show the robustness of our method before applying it to a real-world network of user ratings for films, namely, the Netflix data set. Based on the assumption that co-ratings of films contain information about the films’ similarity, we analyse the multiplex projection as an approximation of the similarity landscape of the films. Besides comparing the projection to the coarse-grained classification of films into genres, we validate the resulting similarities based on ground truth data sets containing film series. Our analysis confirms the predictive power of the network of positive co-ratings. We furthermore explore the potential of additional, mixed co-rating patterns in improving the prediction of similarities and highlight necessary criteria for this approach.
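To make the notion of co-ratings in a multiplex bipartite graph concrete, the following Python sketch counts, for each pair of films, how many users liked both, disliked both, or rated the two films differently. It is only an illustration of raw co-rating counts on hypothetical toy data, and it is not the fixed degree sequence model named in the title. The names `co_rating_counts` and `ratings`, as well as the pattern labels, are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def co_rating_counts(ratings):
    """Count co-rating patterns for every film pair.

    ratings maps user -> {film: "like" | "dislike"}; each user rates a
    film at most once. Returns a Counter keyed by (film_a, film_b, pattern)
    with film_a < film_b and pattern one of "like-like",
    "dislike-dislike", or "mixed".
    """
    counts = Counter()
    for films in ratings.values():
        # Every pair of films rated by the same user is one co-rating.
        for a, b in combinations(sorted(films), 2):
            if films[a] != films[b]:
                pattern = "mixed"
            else:
                pattern = f"{films[a]}-{films[b]}"
            counts[(a, b, pattern)] += 1
    return counts


# Hypothetical toy data (not from the Netflix data set).
ratings = {
    "u1": {"F1": "like", "F2": "like", "F3": "dislike"},
    "u2": {"F1": "like", "F2": "like"},
    "u3": {"F2": "dislike", "F3": "dislike"},
}
counts = co_rating_counts(ratings)
```

Keeping one counter per pattern mirrors the idea of one projection layer per co-rating type; how such counts are turned into significant edges is outside the scope of this sketch.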

Notes

  1. Note that in the Netflix data set used here, a given user rates any particular film only once, by either liking or disliking it. Thus, the maximal multiplicity of the resulting bipartite graph is 1.

  2. We also ran experiments on synthetic data where the degree sequence on one of the node sets of the bipartite graph was more homogeneous. This work in progress shows that the presented multiplex one-mode projection is robust when using a different network model as well.

  3. The term ground truth is a standard term in machine learning which defines the set of observations that is to be re-discovered by a good algorithm. Any algorithm can then be evaluated by the number of true positive predictions, i.e., those that are in the ground truth, the number of false positives, i.e., those not in the ground truth set but predicted by the algorithm, the number of true negatives (not predicted, not present in ground truth), and the number of false negatives (not predicted, but present in ground truth).

  4. The Area Under the receiver operating characteristic (ROC) Curve (AUC) is a standard machine learning measure that quantifies the probability that a randomly chosen positive instance (one in the ground truth) is assigned a higher score than a randomly chosen negative instance by a given algorithm (Fawcett 2006). Thus, a one-mode projection algorithm that perfectly recovers the ground truth has an AUC of 1, while random guessing results in an AUC of 0.5.
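As a minimal sketch of the evaluation described in notes 3 and 4, the following Python snippet counts true and false positives and negatives against a ground truth set and computes the AUC as the fraction of (positive, negative) pairs in which the positive item receives the higher score, with ties counted as one half. The function names, the toy film pairs, the scores, and the threshold of 0.5 are hypothetical and not taken from the article.

```python
from itertools import product


def confusion_counts(predicted, ground_truth, universe):
    """Return (TP, FP, TN, FN) for a set of predicted items."""
    tp = len(predicted & ground_truth)
    fp = len(predicted - ground_truth)
    fn = len(ground_truth - predicted)
    tn = len(universe - predicted - ground_truth)
    return tp, fp, tn, fn


def auc(scores, ground_truth):
    """Probability that a ground-truth item outscores a non-ground-truth item."""
    pos = [s for item, s in scores.items() if item in ground_truth]
    neg = [s for item, s in scores.items() if item not in ground_truth]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative item")
    # Compare every positive with every negative; a tie counts as half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: candidate film pairs with similarity scores.
scores = {("A", "B"): 0.9, ("A", "C"): 0.2, ("B", "C"): 0.7, ("C", "D"): 0.4}
ground_truth = {("A", "B"), ("C", "D")}  # pairs assumed to share a film series
predicted = {pair for pair, s in scores.items() if s >= 0.5}

tp, fp, tn, fn = confusion_counts(predicted, ground_truth, set(scores))
auc_value = auc(scores, ground_truth)
```

The pairwise formulation is quadratic in the number of items; it is chosen here because it matches the probabilistic reading of the AUC directly, not because it is efficient.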

References

  1. Ahn YY, Bagrow JP, Lehmann S (2010) Link communities reveal multiscale complexity in networks. Nature 466:761–764

  2. Barabási AL, Jeong H, Néda Z, Ravasz E, Schubert A, Vicsek T (2002) Evolution of the social network of scientific collaborations. Physica A 311:590–614

  3. boyd d, Crawford K (2011) Six provocations for big data. In: A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society

  4. Breiger RL (1974) The duality of persons and groups. Soc Forces 53(2):181–190

  5. Bródka P, Stawiak P, Kazienko P (2011) Shortest path discovery in the multi-layered social network. In: Proceedings of the 2011 International Conference on Advances in Social Networks Analysis and Mining (ASONAM ’11), pp 497–501

  6. Campbell C, Yang S, Albert R, Shea K (2011) A network model for plant–pollinator community assembly. Proc Natl Acad Sci 108:197–202

  7. Davis D, Lichtenwalter R, Chawla NV (2012) Supervised methods for multi-relational link prediction. Soc Netw Anal Min, pp 1–15

  8. Eagle N, Pentland AS, Lazer D (2009) Inferring friendship network structure by using mobile phone data. Proc Natl Acad Sci 106:15274–15278

  9. Fawcett T (2006) An introduction to ROC analysis. Pattern Recognit Lett 27:861–874

  10. Foster JG, Foster DV, Grassberger P, Paczuski M (2010) Edge direction and the structure of networks. Proc Natl Acad Sci 107(24):10815–10820

  11. Film series at Wikipedia. http://en.wikipedia.org/wiki/Film_series/

  12. Gionis A, Mannila H, Mielikäinen T, Tsaparas P (2007) Assessing data mining results via swap randomization. ACM Trans Knowl Discov Data 1(3). Art no 14

  13. Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci 99:7821–7826

  14. Gómez-Gardeñes J, Vilone D, Sanchez A (2011) Disentangling social and group heterogeneities: Public Goods games on complex networks. Europhys Lett 95:68003

  15. Gotelli NJ, Graves GR (1996) Null-Models in Ecology. Smithsonian Institution Press, Washington, DC

  16. Holme P, Liljeros F, Edling CR, Kim BJ (2003) Network bipartivity. Phys Rev E 68:056107

  17. Horvát EÁ, Zweig KA (2012) One-mode projection of bipartite graphs. In: Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM ’12), pp 598–605

  18. Kazienko P, Musial K, Kajdanowicz T (2011) Multidimensional social network in the social recommender system. IEEE Trans Syst Man Cybern Part A Syst Hum 41(4):746–759

  19. Lancichinetti A, Fortunato S, Radicchi F (2008) Benchmark graphs for testing community detection algorithms. Phys Rev E 78:046110

  20. Lehmann S, Schwartz M, Hansen LK (2008) Biclique communities. Phys Rev E 78:016108

  21. Lewis K, Kaufman J, Gonzalez M, Wimmer A, Christakis N (2008) Tastes, ties, and time: a new social network dataset using facebook.com. Soc Netw 30:330–342

  22. Li M, Fan Y, Chen J, Gao L, Di Z, Wu J (2005) Weighted networks of scientific communication: the measurement and topological role of weight. Physica A 350:643–656

  23. Li N, Chen G (2009) Multi-layered friendship modeling for location-based mobile social networks. In: Proceedings of Mobiquitous 2009 (MobiQuitous ’09), pp 1–10

  24. Magnani M, Rossi L (2011) The ML-model for multi-layer social networks. In: Proceedings of the 2011 International Conference on Advances in Social Networks Analysis and Mining (ASONAM ’11), pp 5–12

  25. Mane KK, Börner K (2004) Mapping topics and topic bursts in PNAS. Proc Natl Acad Sci 101:5287–5290

  26. McPherson M, Smith-Lovin L, Cook JM (2001) Birds of a feather: homophily in social networks. Annu Rev Sociol 27:415–444

  27. Milo R, Shen-Orr SS, Itzkovitz S, Kashtan N, Chklovskii D, Alon U (2002) Network motifs: simple building blocks of complex networks. Science 298:824–827

  28. Mucha PJ, Richardson T, Macon K, Porter MA, Onnela JP (2010) Community structure in time-dependent, multiscale, and multiplex networks. Science 328:876–878

  29. Neal Z (2013) Identifying statistically significant edges in one-mode projections. Soc Netw Anal Min, pp 1–10

  30. Newman MEJ (2001a) Scientific collaboration networks. I. Network construction and fundamental results. Phys Rev E 64:016131

  31. Newman MEJ (2001b) Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Phys Rev E 64:016132

  32. Newman MEJ (2002) Assortative mixing in networks. Phys Rev Lett 89:208701

  33. Newman MEJ (2004) Coauthorship networks and patterns of scientific collaboration. Proc Natl Acad Sci 101:5200–5205

  34. Park J, Barabási AL (2007) Distribution of node characteristics in complex networks. Proc Natl Acad Sci 104(46):17916–17920

  35. Piatetsky-Shapiro G, Frawley W (1991) Knowledge Discovery in Databases. AAAI/MIT Press, Cambridge, pp 229–248

  36. Ramasco JJ, Dorogovtsev S, Pastor-Satorras R (2004) Self-organization of collaboration networks. Phys Rev E 70:036106

  37. Ramasco JJ, Morris SA (2006) Social inertia in collaboration networks. Phys Rev E 73:016122

  38. Saavedra S, Reed-Tsochas F, Uzzi B (2009) A simple model of bipartite cooperation for ecological and organizational networks. Nature 457:463–466

  39. Shen-Orr SS, Milo R, Mangan S, Alon U (2002) Network motifs in the transcriptional regulation network of Escherichia coli. Nat Genet 31:64–68

  40. Szell M, Lambiotte R, Thurner S (2010) Multirelational organization of large-scale social networks in an online world. Proc Natl Acad Sci 107:13636–13641

  41. Szell M, Thurner S (2010) Measuring social dynamics in a massive multiplayer online game. Soc Netw 32:313–329

  42. The Internet Movie Database (IMDb). Alternative interfaces. http://imdb.com/interfaces

  43. The Netflix Prize. http://www.netflixprize.com/

  44. Uhlmann S, Mannsperger H, Zhang JD, Horvát EÁ, Schmidt C, Küblbeck M, Ward A, Tschulena U, Zweig K, Korf U, Wiemann S, Sahin Ö (2012) Global miRNA regulation of a local protein network: case study with the EGFR-driven cell cycle network in breast cancer. Mol Syst Biol 8:570

  45. Wasserman S, Faust K (1994) Social network analysis: methods and applications. Cambridge University Press, Cambridge

  46. Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393:440–442

  47. Yeger-Lotem E, Sattath S, Kashtan N, Itzkovitz S, Milo R, Pinter RY, Alon U, Margalit H (2004) Network motifs in integrated cellular networks of transcription–regulation and protein–protein interaction. Proc Natl Acad Sci 101:5934–5939

  48. Zahoránszky L, Katona G, Hári P, Málnási-Csizmadia A, Zweig K, Zahoránszky-Kőhalmi G (2009) Breaking the hierarchy—a new cluster selection mechanism for hierarchical clustering methods. Algorithms Mol Biol 4:12

  49. Zhou T, Ren J, Medo M, Zhang YC (2007) Bipartite network projection and personal recommendation. Phys Rev E 76:046115

  50. Zweig KA (2010) How to forget the second side of the story: a new method for the one-mode projection of bipartite graphs. In: Proceedings of the Second International Conference on Advances in Social Networks Analysis and Mining (ASONAM ’10), pp 200–207

  51. Zweig KA, Kaufmann M (2011) A systematic approach to the one-mode projection of bipartite graphs. Soc Netw Anal Min 1(3):187–218

Acknowledgments

The authors would like to thank Andreas Spitz for useful discussions, ground truth data, and software. The authors are also grateful to the anonymous reviewers for their helpful comments. EÁH is supported by the Heidelberg Graduate School of Mathematical and Computational Methods for the Sciences, University of Heidelberg, Germany, which is funded by the German Excellence Initiative (GSC 220).

Author information

Corresponding author

Correspondence to Emőke-Ágnes Horvát.

About this article

Cite this article

Horvát, E., Zweig, K.A. A fixed degree sequence model for the one-mode projection of multiplex bipartite graphs. Soc. Netw. Anal. Min. 3, 1209–1224 (2013). https://doi.org/10.1007/s13278-013-0133-9

Keywords

  • Bipartite graphs
  • One-mode projection
  • Multiplex networks