Abstract

We study the problem of simplifying a given directed graph by keeping a small subset of its arcs. Our goal is to maintain the connectivity required to explain a set of observed traces of information propagation across the graph. Unlike previous work, we do not make any assumption about an underlying model of information propagation. Instead, we approach the task as a combinatorial problem. We prove that the resulting optimization problem is \(\mathbf{NP}\)-hard. We show that a standard greedy algorithm performs very well in practice, even though it does not have theoretical guarantees. Additionally, if the activity traces have a tree structure, we show that the objective function is supermodular, and experimentally verify that the approach for size-constrained submodular minimization recently proposed by Nagano et al. (28th International Conference on Machine Learning, 2011) produces very good results. Moreover, when applied to the task of reconstructing an unobserved graph, our methods perform comparably to a state-of-the-art algorithm devised specifically for this task.
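The greedy strategy mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the exact objective, so everything below is an assumption made for illustration: each trace is represented as a list of arcs `(u, v)`, a node counts as "explained" when it is reachable from a root of its trace using kept arcs only, and the hypothetical helpers `covered` and `greedy_simplify` are not taken from the article.

```python
from collections import defaultdict


def covered(trace_arcs, kept):
    """Count non-root nodes of one trace reachable from its roots via kept arcs.

    Assumed objective for illustration only; the abstract does not define it.
    """
    nodes = {n for arc in trace_arcs for n in arc}
    roots = nodes - {v for _, v in trace_arcs}  # nodes with no incoming arc
    succ = defaultdict(list)
    for u, v in trace_arcs:
        if (u, v) in kept:
            succ[u].append(v)
    seen, stack = set(roots), list(roots)
    while stack:  # depth-first search restricted to kept arcs
        u = stack.pop()
        for v in succ[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) - len(roots)


def total_score(traces, kept):
    """Sum the assumed coverage over all traces."""
    return sum(covered(t, kept) for t in traces)


def greedy_simplify(traces, k):
    """Keep at most k arcs, adding the arc with the largest marginal gain each round."""
    candidates = {arc for t in traces for arc in t}
    kept = set()
    for _ in range(min(k, len(candidates))):
        base = total_score(traces, kept)
        # sorted() makes tie-breaking deterministic
        best = max(sorted(candidates - kept),
                   key=lambda a: total_score(traces, kept | {a}) - base)
        kept.add(best)
    return kept


traces = [[("a", "b"), ("b", "c")], [("a", "b"), ("a", "d")]]
first = greedy_simplify(traces, 1)
both = greedy_simplify(traces, 2)
```

In this toy input the arc `("a", "b")` appears in both traces and is chosen first. The sketch recomputes the score from scratch for every candidate, which is simple but slow; it is meant only to show the shape of a greedy selection loop, not the implementation evaluated in the article.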

Notes

  1. www.gurobi.com.

  2. Yahoo! Meme was a microblogging service that was discontinued on May 25, 2012.

  3. http://snap.stanford.edu/data/memetracker9.html.

  4. www.cs.sfu.ca/~sja25/personal/datasets.

  5. www.cs.utexas.edu/users/dml/Software/graclus.html.

References

  • Arenas A, Duch J, Fernández A, Gómez S (2007) Size reduction of complex networks preserving modularity. New J Phys 9(6):176

  • Edmonds J (2003) Submodular functions, matroids, and certain polyhedra. In: Combinatorial optimization—Eureka, You Shrink!, Springer, Berlin, pp 11–26

  • Elkin M, Peleg D (2005) Approximating \(k\)-spanner problems for \(k > 2\). Theor Comput Sci 337(1):249–277

  • Foti NJ, Hughes JM, Rockmore DN (2011) Nonparametric sparsification of complex multiscale networks. PLoS One 6(2):e16431

  • Fujishige S (2005) Submodular functions and optimization, vol 58. Elsevier Science, Amsterdam

  • Fung WS, Hariharan R, Harvey NJ, Panigrahi D (2011) A general framework for graph sparsification. In: Proceedings of the 43rd annual ACM symposium on theory of computing, ACM, pp 71–80

  • Gomez-Rodriguez M, Leskovec J, Krause A (2010) Inferring networks of diffusion and influence. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 1019–1028

  • Gomez-Rodriguez M, Balduzzi D, Schölkopf B (2011) Uncovering the temporal dynamics of diffusion networks. In: Proceedings of the 28th international conference on machine learning, pp 561–568

  • Iwata S, Orlin JB (2009) A simple combinatorial algorithm for submodular function minimization. In: Proceedings of the twentieth annual ACM-SIAM symposium on discrete algorithms, Society for Industrial and Applied Mathematics, pp 1230–1237

  • Jamali M, Ester M (2010) Modeling and comparing the influence of neighbors on the behavior of users in social and similarity networks. In: 2010 IEEE international conference on data mining workshops (ICDMW), IEEE, pp 336–343

  • Kempe D, Kleinberg J, Tardos É (2003) Maximizing the spread of influence through a social network. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 137–146

  • Krause A (2010) SFO: a toolbox for submodular function optimization. J Mach Learn Res 11:1141–1144

  • Leskovec J, Faloutsos C (2007) Scalable modeling of real graphs using Kronecker multiplication. In: Proceedings of the 24th international conference on machine learning, ACM, pp 497–504

  • Leskovec J, Backstrom L, Kleinberg J (2009) Meme-tracking and the dynamics of the news cycle. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 497–506

  • Mathioudakis M, Bonchi F, Castillo C, Gionis A, Ukkonen A (2011) Sparsification of influence networks. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, pp 529–537

  • Misiołek E, Chen DZ (2006) Two flow network simplification algorithms. Inf Process Lett 97(5):197–202

  • Nagano K, Kawahara Y, Aihara K (2011) Size-constrained submodular minimization through minimum norm base. In: Proceedings of the 28th international conference on machine learning, pp 977–984

  • Nemhauser GL, Wolsey LA, Fisher ML (1978) An analysis of approximations for maximizing submodular set functions-I. Math Progr 14(1):265–294

  • Peleg D, Schäffer AA (1989) Graph spanners. J Graph Theory 13(1):99–116

  • Quirin A, Cordon O, Santamaria J, Vargas-Quesada B, Moya-Anegón F (2008) A new variant of the pathfinder algorithm to generate large visual science maps in cubic time. Inf Process Manag 44(4):1611–1623

  • Serrano E, Quirin A, Botia J, Cordón O (2010) Debugging complex software systems by means of pathfinder networks. Inf Sci 180(5):561–583

  • Serrano MÁ, Boguñá M, Vespignani A (2009) Extracting the multiscale backbone of complex weighted networks. Proc Nat Acad Sci USA 106(16):6483–6488

  • Srikant R, Yang Y (2001) Mining web logs to improve website organization. In: Proceedings of the 10th international conference on World Wide Web, ACM, pp 430–437

  • Svitkina Z, Fleischer L (2011) Submodular approximation: sampling-based algorithms and lower bounds. SIAM J Comput 40(6):1715–1737

  • Toivonen H, Mahler S, Zhou F (2010) A framework for path-oriented network simplification. In: Advances in intelligent data analysis IX, Springer, Berlin, pp 220–231

  • Wolfe P (1976) Finding the nearest point in a polytope. Math Progr 11(1):128–149

  • Zhou F, Mahler S, Toivonen H (2010) Network simplification with minimal loss of connectivity. In: 2010 IEEE 10th international conference on data mining (ICDM), IEEE, pp 659–668

Author information

Corresponding author

Correspondence to Antti Ukkonen.

Additional information

Responsible editor: Hendrik Blockeel, Kristian Kersting, Siegfried Nijssen, Filip Zelezny.

About this article

Cite this article

Bonchi, F., De Francisci Morales, G., Gionis, A. et al. Activity preserving graph simplification. Data Min Knowl Disc 27, 321–343 (2013). https://doi.org/10.1007/s10618-013-0328-8

  • DOI: https://doi.org/10.1007/s10618-013-0328-8
