Maintaining Sliding-Window Neighborhood Profiles in Interaction Networks
Large networks are being generated by applications that keep track of relationships between different data entities. Examples include online social networks recording interactions between individuals and sensor networks logging information exchanges between sensors. There is a large body of literature on computing exact or approximate properties of large networks, but most methods assume static networks. In most modern real-world applications, however, networks are highly dynamic, and interactions are continuously generated along existing connections. Furthermore, it is desirable to treat old edges as less important, so that their contribution to the current view of the network diminishes over time.
We study the problem of maintaining the neighborhood profile of each node in an interaction network. Maintaining such a profile has applications in modeling network evolution and monitoring the importance of nodes over time. We present an online streaming algorithm that maintains neighborhood profiles in the sliding-window model. The algorithm is highly scalable: its computation is node-centric and permits parallel processing, so it scales easily to very large networks on a distributed system such as Apache Giraph. We present results from both serial and parallel implementations of the algorithm on different social networks. The graph summary is maintained so that queries can be answered for any window length.
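To make the query concrete, the following is a minimal brute-force sketch, not the streaming algorithm described above. It assumes, for illustration only, that a node's neighborhood profile is the number of distinct nodes reachable within 1, 2, ..., `max_hops` hops in the undirected graph formed by the interactions whose timestamps fall inside the window. The function name `neighborhood_profile` and its parameters are hypothetical. It rebuilds the graph for every query, which is exactly the cost a maintained summary is meant to avoid.

```python
from collections import defaultdict, deque


def neighborhood_profile(interactions, now, window, max_hops):
    """Brute-force baseline (illustrative assumption, not the paper's method).

    interactions: iterable of (u, v, timestamp) tuples.
    Returns {node: [n_1, ..., n_max_hops]}, where n_h is the number of
    other nodes within h hops using only interactions with
    now - window < timestamp <= now.
    """
    # Build the undirected snapshot of interactions inside the window.
    adj = defaultdict(set)
    for u, v, t in interactions:
        if now - window < t <= now:
            adj[u].add(v)
            adj[v].add(u)

    profile = {}
    for source in list(adj):
        # Breadth-first search truncated at max_hops.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            if dist[node] == max_hops:
                continue
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)

        # Cumulative counts: a node at distance d counts for every h >= d.
        counts = [0] * max_hops
        for d in dist.values():
            if d >= 1:
                for h in range(d - 1, max_hops):
                    counts[h] += 1
        profile[source] = counts
    return profile


if __name__ == "__main__":
    stream = [("a", "b", 1), ("b", "c", 5), ("c", "d", 9)]
    # A short window drops the old a-b interaction; a long one keeps it.
    print(neighborhood_profile(stream, now=10, window=6, max_hops=2))
    print(neighborhood_profile(stream, now=10, window=10, max_hops=2))
```

Because the per-node search touches only that node's surroundings, the same query maps naturally onto a node-centric, parallel setting. A scalable version would replace the exact sets with compact approximate summaries rather than recomputing from the raw interactions.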