## Abstract

A bipartite structure is a common property of many real-world network data sets such as agents which are affiliated with societies, customers who buy, rent, or rate products, and authors who write scientific papers. The one-mode projection of these networks onto either set of entities (e.g., societies, products, and articles) is a well-established approach for the analysis of such data and deduces relations between these entities. Some bipartite data sets of key importance contain several distinct types of relations between their entities. These networks require a projection method which accounts for multiple edge types. In this article, we present the multiplex extension of an existing projection algorithm for simplex bipartite networks, i.e., networks that contain a single type of relation. We use synthetic data to show the robustness of our method before applying it to a real-world network of user ratings for films, namely, the Netflix data set. Based on the assumption that co-ratings of films contain information about the films’ similarity, we analyse the multiplex projection as an approximation of the similarity landscape of the films. Besides comparing the projection to the coarse-grained classification of films into genres, we validate the resulting similarities based on ground truth data sets containing film series. Our analysis confirms the predictive power of the network of positive co-ratings. We furthermore explore the potential of additional, mixed co-rating patterns in improving the prediction of similarities and highlight necessary criteria for this approach.

This is a preview of subscription content, log in to check access.

## Notes

- 1.
Note that in the used Netflix data set the same user rates a certain film only once by either liking or disliking it. Thus, the maximal multiplicity of the resulting bipartite graph is 1.

- 2.
We also ran experiments on synthetic data where the degree sequence on one of the node sets of the bipartite graph was more homogenous. This work in progress shows that the presented multiplex one-mode projection is robust when using a different network model as well.

- 3.
The term

*ground truth*is a standard term in machine learning which defines the set of observations that is to be re-discovered by a good algorithm. Any algorithm can then be evaluated by the number of*true positive*predictions, i.e., those that are in the ground truth, the number of*false positives*, i.e., those not in the ground truth set but predicted by the algorithm, the number of*true negatives*(not predicted, not present in ground truth), and the number of*false negatives*(not predicted, but present in ground truth). - 4.
The Area Under (the receiver operating, ROC) Curve is a standard machine learning measure, which quantifies the probability that true positives are assigned lower scores than true negatives by a given algorithm (Fawcett 2006). Thus, a perfect one-mode projection algorithm regarding ground truth has an

*AUC*of 1 while random guessing results in an*AUC*of 0.5.

## References

Ahn YY, Bagrow JP, Lehmann S (2010) Link communities reveal multiscale complexity in networks. Nature 466:761–764

Barabási AL, Jeong H, Néda Z, Ravasz E, Schubert A, Vicsek T (2002) Evolution of the social network of scientific collaborations. Physica A 311:590–614

boyd d, Crawford K (2011) Six provocations for big data. In: A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society

Breiger RL (1974) The duality of persons and groups. Soc Forces 53(2):181–190

Bródka, Stawiak P, Kazienko P (2011) Shortest path discovery in the multi-layered social network. In: Proceedings of the 2011 Interntional Conference on Advances in Social Networks Analysis and Mining (ASONAM ’11), pp 497–501

Campbell C, Yang S, Albert R, Sheab K (2011) A network model for plant–pollinator community assembly. Proc Natl Acad Sci 108:197–202

Davis D, Lichtenwalter R, Chawla NV (2012) Supervised methods for multi-relational link prediction. Soc Netw Anal Min, pp 1–15

Eagle N, Pentland AS, Lazer D (2009) Inferring friendship network structure by using mobile phone data. Proc Natl Acad Sci 106:15274–15278

Fawcett T (2006) An introduction to ROC analysis. Pattern Recognit Lett 27:861–874

Foster JG, Foster DV, Grassberger P, Paczuski M (2010) Edge direction and the structure of networks. Proc Natl Acad Sci 107(24):10815–10820

Film series at Wikipedia. http://en.wikipedia.org/wiki/Film_series/

Gionis A, Mannila H, Mielikinen T, Tsaparas P (2007) Assessing data mining results via swap randomization. ACM Trans Knowl Discov Data 1(3). Art no 14

Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci 99:7821–7826

Gómez-Gardeñes J, Vilone D, Sanchez A (2011) Disentangling social and group heterogeneities: Public Goods games on complex networks. Eur J Phys 95:68003

Gotelli NJ, Graves GR (1996) Null-Models in Ecology. Smithsonian Institution Press, Washington, DC

Holme P, Liljeros F, Edling CR, Kim BJ (2003) Network bipartivity. Phys Rev E 68:056107

Horvát EÁ, Zweig KA (2012) One-mode projection of bipartite graphs. In: Proceedings of the 2012 Interntional Conference on Advances in Social Networks Analysis and Mining (ASONAM ’12), pp 598–605

Kazienko P, Musial K, Kajdanowicz T (2011) Multidimensional social network in the social recommender system. IEEE Trans Syst Man Cybern Part A Syst Hum 41(4):746–759

Lancichinetti A, Fortunato S, Radicchi F (2008) Benchmark graphs for testing community detection algorithms. Phys Rev E 78:046110

Lehmann S, Schwartz M, Hansen LK (2008) Biclique communities. Phys Rev E 78:016108

Lewis K, Kaufman J, Gonzalez M, Wimmer A, Christachis N (2008) Tastes, ties, and time: a new social network dataset using facebook.com. Soc Netw 30:330–342

Li M, Fan Y, Chen J, Gao L, Di Z, Wu J (2005) Weighted networks of scientific communication: the measurement and topological role of weight. Physica A 350:643–656

Li N, Chen G (2009) Multi-layered friendship modeling for location-based mobile social networks. In: Proceedings of Mobiquitous 2009 (MobiQuitous ’09), pp 1–10

Magnani M, Rossi L (2011) The ML-model for multi-layer social networks. In: Proceedings of the 2011 Interntional Conference on Advances in Social Networks Analysis and Mining (ASONAM ’11), pp 5–12

Mane KK, Börner K (2004) Mapping topics and topic bursts in PNAS. Proc Natl Acad Sci 101:5287–5290

McPherson M, Smith-Lovin L, Cook JM (2001) Birds of a feather: homophily in social networks. Annu Rev Sociol 27:415–444

Milo R, Shen-Orr SS, Itzkovitz S, Kashtan N, Chklovskii D, Alon U (2004) Network motifs: simple building blocks of complex networks. Science 298:824–827

Mucha PJ, Richardson T, Macon K, Porter MA, Onnela JP (2010) Community structure in time-dependent, multiscale, and multiplex networks. Science 328:876–878

Neal Z (2013) Identifying statistically significant edges in one-mode projections. Soc Netw Anal Min, pp 1–10

Newman MEJ (2001a) Scientific collaboration networks. I. Network construction and fundamental results. Phys Rev Lett 64:016131

Newman MEJ (2001b) Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Phys Rev Lett 64:016132

Newman MEJ (2002) Assortative mixing in networks. Phys Rev Lett 89:208701

Newman MEJ (2004) Coauthorship networks and patterns of scientific collaboration. Proc Natl Acad Sci 101:5200–5205

Park J, Barabási AL (2007) Distribution of node characteristics in complex networks. Proc Natl Acad Sci 104(46):17916–17920

Piatetsky-Shapiro G, Frawley W (1991) Knowledge Discovery in Databases. AAAI/MIT Press, Cambridge, pp 229–248

Ramasco JJ, Dorogovtsev S, Pastor-Satorras R (2004) Self-organization of collaboration networks. Phys Rev E 70:036106

Ramasco JJ, Morris SA (2006) Social inertia in collaboration networks. Phys Rev E 73:016122

Saavedra S, Reed-Tsochas F, Uzzi B (2009) A simple model of bipartite cooperation for ecological and organizational networks. Nature 457:463–466

Shen-Orr SS, Milo R, Mangan S, Alon U (2002) Network motifs in the transcriptional regulation network of

*Escherichia coli*. Nat Genet 31:64–68Szell M, Lambiotte R, Thurner S (2010) Multirelational organization of large-scale social networks in an online world. Proc Natl Acad Sci 107:13636–13641

Szell M, Thurner S (2010) Measuring social dynamics in a massive multiplayer online game. Soc Netw 32:313–329

The Internet Movie Database (IMDb). Alternative interfaces. http://imdb.com/interfaces

The Netflix Prize. http://www.netflixprize.com/

Uhlmann S, Mannsperger H, Zhang JD, Horvát EÁ, Schmidt C, Küblbeck M, Ward A, Tschulena U, Zweig K, Korf U, Wiemann S, Sahin Ö (2012) Global miRNA regulation of a local protein network: case study with the EGFR-driven cell cycle network in breast cancer. Mol Syst Biol 570:8

Wasserman S, Faust K (1994) Social network analysis: methods and applications. Cambridge University Press, Cambridge

Watts DJ, Strogatz SH (1998) Collective dynamics of ’small-world’ networks. Nature 393:440–442

Yeger-Lotem E, Sattath S, Kashtan N, Itzkovitz S, Milo R, Pinter RY, Alon U, Margalit H (2004) Network motifs in integrated cellular networks of transcription–regulation and protein–protein interaction. Proc Natl Acad Sci 101:5934–5939

Zahoránszky L, Katona G, Hári P, Málnási-Csizmadia A, Zweig K, Zahoránszky-Kőhalmi G (2009) Breaking the hierarchy—a new cluster selection mechanism for hierarchical clustering methods. Algorithms Mol Biol 4:12

Zhou T, Ren J, Medo M, Zhang YC (2007) Bipartite network projection and personal recommendation. Phys Rev E 76:046115

Zweig KA (2010) How to forget the second side of the story: a new method for the one-mode projection of bipartite graphs. In: Proceedings of the second Interntional Conference on Advances in Social Networks Analysis and Mining (ASONAM’10), pp 200–207

Zweig KA, Kaufmann M (2011) A systematic approach to the one-mode projection of bipartite graphs. Soc Netw Anal Min 1(3):187–218

## Acknowledgments

The authors would like to thank Andreas Spitz for useful discussions, ground truth data, and software. The authors are also grateful to the anonymous reviewers for their helpful comments. EÁH is supported by the Heidelberg Graduate School of Mathematical and Computational Methods for the Sciences, University of Heidelberg, Germany, which is funded by the German Excellence Initiative (GSC 220).

## Author information

### Affiliations

### Corresponding author

## Rights and permissions

## About this article

### Cite this article

Horvát, E., Zweig, K.A. A fixed degree sequence model for the one-mode projection of multiplex bipartite graphs.
*Soc. Netw. Anal. Min.* **3, **1209–1224 (2013). https://doi.org/10.1007/s13278-013-0133-9

Received:

Revised:

Accepted:

Published:

Issue Date:

### Keywords

- Bipartite graphs
- One-mode projection
- Multiplex networks