Maintaining Sliding-Window Neighborhood Profiles in Interaction Networks

  • Rohit Kumar
  • Toon Calders
  • Aristides Gionis
  • Nikolaj Tatti
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9285)

Abstract

Large networks are being generated by applications that keep track of relationships between different data entities. Examples include online social networks recording interactions between individuals and sensor networks logging information exchanges between sensors. There is a large body of literature on computing exact or approximate properties of large networks, although most methods assume static networks. In most modern real-world applications, however, networks are highly dynamic, and interactions are continuously generated along existing connections. Furthermore, it is desirable to account for old edges becoming less important, so that their contribution to the current view of the network diminishes over time.

We study the problem of maintaining the neighborhood profile of each node in an interaction network. Maintaining such a profile has applications in modeling network evolution and in monitoring the importance of the nodes of the network over time. We present an online streaming algorithm to maintain neighborhood profiles in the sliding-window model. The algorithm is highly scalable: it permits parallel processing and its computation is node-centric, so it scales easily to very large networks on a distributed system such as Apache Giraph. We present results from both serial and parallel implementations of the algorithm on different social networks. The summary of the graph is maintained in such a way that queries for any window length can be answered.
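
To make the problem statement concrete, the following is a minimal brute-force sketch, not the streaming algorithm of the paper. It assumes (this is an interpretation, not stated in the abstract) that the neighborhood profile of a node is the number of other nodes within 1, 2, ..., `max_hops` hops in the undirected graph formed by the interactions whose timestamps fall inside the current window. The function name `neighborhood_profile` and its parameters are hypothetical.

```python
from collections import defaultdict, deque


def neighborhood_profile(interactions, now, window, max_hops):
    """Brute-force baseline (illustrative only, not the paper's method).

    interactions: iterable of (u, v, t) tuples, t being a timestamp.
    Returns {node: [n_1, ..., n_max_hops]}, where n_h is the number of
    other nodes within h hops, using only edges with now - window < t <= now.
    """
    # Build the snapshot graph of the current window (assumed undirected).
    adj = defaultdict(set)
    for u, v, t in interactions:
        if now - window < t <= now:
            adj[u].add(v)
            adj[v].add(u)

    profiles = {}
    for source in list(adj):
        # Breadth-first search truncated at max_hops.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            if dist[node] == max_hops:
                continue
            for nb in adj[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)

        # Turn per-distance counts into cumulative counts, excluding the source.
        per_hop = [0] * (max_hops + 1)
        for d in dist.values():
            per_hop[d] += 1
        profile, total = [], 0
        for h in range(1, max_hops + 1):
            total += per_hop[h]
            profile.append(total)
        profiles[source] = profile
    return profiles


if __name__ == "__main__":
    stream = [("a", "b", 1), ("b", "c", 5), ("c", "d", 8)]
    print(neighborhood_profile(stream, now=10, window=6, max_hops=2))
```

Recomputing every profile from scratch for each window, as this baseline does, is exactly the cost that an online sliding-window summary is meant to avoid; the sketch is useful only as a reference against which such a summary could be checked on small inputs.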

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Rohit Kumar (1)
  • Toon Calders (1)
  • Aristides Gionis (2)
  • Nikolaj Tatti (2)

  1. Department of Computer and Decision Engineering, Université Libre de Bruxelles, Brussels, Belgium
  2. Helsinki Institute for Information Technology and Department of Computer Science, Aalto University, Espoo, Finland