Parallel collective factorization for modeling large heterogeneous networks

  • Ryan A. Rossi
  • Rong Zhou
Original Article


Relational learning methods for heterogeneous network data are becoming increasingly important for many real-world applications. However, existing relational learning approaches are sequential and inefficient, cannot scale to large heterogeneous networks, and suffer from other limitations related to convergence and parameter tuning. In this paper, we propose Parallel Collective Matrix Factorization (PCMF), a fast and flexible framework for jointly modeling a variety of heterogeneous network data. The PCMF learning algorithm solves for a single parameter given the others, leading to a parallel scheme that is fast, flexible, and general across relational learning tasks and heterogeneous data types. The proposed approach is designed to be (1) efficient for large heterogeneous networks (linear in the total number of observations from the set of input matrices), (2) flexible, as many components are interchangeable and easily adaptable, and (3) effective for a variety of applications and for different types of data. The experiments demonstrate the scalability, flexibility, and effectiveness of PCMF for a variety of relational modeling tasks. In particular, PCMF outperforms a recent state-of-the-art approach in runtime, scalability, and prediction quality. Finally, we investigate variants of PCMF for serving predictions in a real-time streaming fashion.
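To make the phrase "solves for a single parameter given the others" concrete, the following is a minimal sketch of a cyclic coordinate descent update. It is not the PCMF implementation: it assumes a single partially observed matrix A approximated by U Vᵀ with a squared-error loss and L2 regularization, and the names objective, update_entry, ccd_sweep, mask, and lam are hypothetical and chosen only for illustration. For clarity it recomputes the residual on every update, whereas an efficient implementation would typically maintain it incrementally.

```python
import numpy as np

def objective(A, mask, U, V, lam):
    """Squared error on observed entries plus L2 regularization (assumed loss)."""
    err = (A - U @ V.T)[mask]
    return err @ err + lam * (np.sum(U * U) + np.sum(V * V))

def update_entry(A, mask, U, V, i, k, lam):
    """Exact minimizer of the objective over U[i, k] with all else fixed."""
    obs = mask[i]                       # observed columns of row i
    v_k = V[obs, k]
    # Residual of row i with factor k's own contribution added back
    resid = A[i, obs] - V[obs] @ U[i] + U[i, k] * v_k
    return (resid @ v_k) / (lam + v_k @ v_k)

def ccd_sweep(A, mask, U, V, lam):
    """One pass over every parameter, updated one at a time, in place."""
    m, n = A.shape
    for k in range(U.shape[1]):
        # For a fixed k, each update reads and writes only its own row of U
        # (or of V), so these inner loops could be run in parallel.
        for i in range(m):
            U[i, k] = update_entry(A, mask, U, V, i, k, lam)
        for j in range(n):
            V[j, k] = update_entry(A.T, mask.T, V, U, j, k, lam)

# Illustrative data only: a small random matrix with about 70% observed entries
rng = np.random.default_rng(0)
A = rng.random((6, 5))
mask = rng.random((6, 5)) < 0.7
U = rng.random((6, 2))
V = rng.random((5, 2))
for _ in range(10):
    ccd_sweep(A, mask, U, V, lam=0.1)
```

Because each update is an exact one-dimensional minimization, the objective in this sketch cannot increase from one sweep to the next. A collective variant would add further input matrices that share a factor such as U, which is beyond what this sketch shows.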


Recommender systems; Missing value estimation; Matrix completion; Relational learning; Low-rank approximation; Parallelization; Scalable graph models; Matrix factorization; Collective factorization; Coupled matrix–tensor factorization; Cyclic coordinate descent; Heterogeneous networks; Prediction; Social networks; Link prediction; Role discovery; Network analysis



Copyright information

© Springer-Verlag Wien 2016

Authors and Affiliations

  1. Palo Alto Research Center (PARC, a Xerox Company), Palo Alto, USA
