A matrix factorization approach to graph compression with partial information

  • Farshad Nourbakhsh
  • Samuel Rota Bulò
  • Marcello Pelillo
Original Article

Abstract

We address the problem of encoding a graph of order \(\mathsf {n}\) into a graph of order \(\mathsf {k}<\mathsf {n}\) so as to minimize the reconstruction error. The encoding is characterized in terms of a particular factorization of the adjacency matrix of the original graph. The factorization is obtained as the solution of a discrete optimization problem, which for convenience is relaxed into an equivalent continuous one. Our formulation does not require the full graph: it can also factorize the graph when only partial information is available. We propose a multiplicative update rule for the optimization task, resembling those introduced for nonnegative matrix factorization, and prove its convergence properties. Experiments are conducted to assess the effectiveness of the proposed approach.
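To make the setting concrete, the following is a minimal sketch and not the method of the paper: it assumes a hard assignment of the \(\mathsf {n}\) vertices to \(\mathsf {k}\) groups is already given, builds the reduced graph by averaging the observed entries of each block, and measures the squared reconstruction error on the observed entries only. The names `compress_graph`, `reconstruction_error`, `labels` and `mask` are hypothetical and introduced purely for illustration; the paper instead optimizes the factorization itself with a multiplicative update rule.

```python
import numpy as np

def compress_graph(A, labels, k, mask=None):
    """Reduce an n-vertex graph to k vertices for a FIXED hard assignment.

    Assumption (not from the paper): `labels[i]` gives the group of vertex i,
    and `mask[i, j]` is 1 where the entry A[i, j] is observed, 0 otherwise.
    """
    n = A.shape[0]
    if mask is None:
        mask = np.ones_like(A, dtype=float)
    # Y[i, c] = 1 when vertex i is assigned to group c.
    Y = np.zeros((n, k))
    Y[np.arange(n), labels] = 1.0
    observed_sum = Y.T @ (mask * A) @ Y   # sum of observed weights per block
    observed_cnt = Y.T @ mask @ Y         # number of observed entries per block
    # Block mean over observed entries; blocks with no observation get 0.
    R = np.divide(observed_sum, observed_cnt,
                  out=np.zeros((k, k)), where=observed_cnt > 0)
    return Y, R

def reconstruction_error(A, Y, R, mask=None):
    """Squared error between A and Y R Y^T, restricted to observed entries."""
    if mask is None:
        mask = np.ones_like(A, dtype=float)
    A_hat = Y @ R @ Y.T
    return float(np.sum(mask * (A - A_hat) ** 2))

# Toy graph: two groups of three vertices, fully connected inside each group.
A = np.kron(np.eye(2), np.ones((3, 3)))
labels = np.array([0, 0, 0, 1, 1, 1])
mask = np.ones((6, 6))
mask[0, 4] = mask[4, 0] = 0.0             # pretend one pair is unobserved
Y, R = compress_graph(A, labels, 2, mask)
print(R)
print(reconstruction_error(A, Y, R, mask))
```

For a fixed assignment, the block mean is the value that minimizes the squared error over the observed entries of each block, which is why it is used here; choosing the assignment itself is the hard part that the paper addresses.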

Keywords

Matrix factorization; Graph compression; Stochastic blockmodel

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Farshad Nourbakhsh (1)
  • Samuel Rota Bulò (2)
  • Marcello Pelillo (1)

  1. DAIS, Università Ca’ Foscari Venezia, Venezia Mestre, Italy
  2. Fondazione Bruno Kessler, Povo, Italy
