Abstract
Estimating the number of triangles in graph streams using a limited amount of memory has become a popular topic in the last decade. Different variations of the problem have been studied, depending on whether the graph edges are provided in an arbitrary order or as incidence lists. However, with a few exceptions, the algorithms have considered insert-only streams. We present a new algorithm estimating the number of triangles in dynamic graph streams where edges can be both inserted and deleted. We show that our algorithm achieves better time and space complexity than previous solutions for various graph classes, for example sparse graphs with a relatively small number of triangles. Also, for graphs with constant transitivity coefficient, a common situation in real graphs, this is the first algorithm achieving constant processing time per edge. The result is achieved by a novel approach combining sampling of vertex triples and sparsification of the input graph. In the course of the analysis of the algorithm we present a lower bound on the number of pairwise independent 2-paths in general graphs which might be of independent interest. At the end of the paper we discuss lower bounds on the space complexity of triangle counting algorithms that make no assumptions on the structure of the graph.
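As a rough illustration of the "sampling of vertex triples" ingredient only, the sketch below maintains, for a fixed random sample of vertex triples, how many of each triple's three possible edges are currently present. It is not the algorithm of the paper: the sparsification step is omitted, and the class name `TripleSampler`, its parameters, and the assumption of a consistent stream (an edge is deleted only while present and never inserted twice) are all illustrative choices introduced here.

```python
import random
from itertools import combinations
from math import comb


class TripleSampler:
    """Hypothetical sketch: estimate triangles from sampled vertex triples."""

    def __init__(self, n, k, seed=0):
        rng = random.Random(seed)
        self.n = n
        # k vertex triples, each drawn uniformly at random (with replacement
        # across triples, without replacement inside a triple).
        self.triples = [tuple(sorted(rng.sample(range(n), 3))) for _ in range(k)]
        # Number of edges currently present inside each sampled triple.
        self.edge_count = [0] * k
        # Map each vertex pair to the sampled triples that contain it,
        # so an update touches only the relevant counters.
        self.index = {}
        for i, triple in enumerate(self.triples):
            for pair in combinations(triple, 2):
                self.index.setdefault(pair, []).append(i)

    def update(self, u, v, delta):
        """Apply an insertion (delta=+1) or deletion (delta=-1) of edge {u, v}."""
        pair = (min(u, v), max(u, v))
        for i in self.index.get(pair, ()):
            self.edge_count[i] += delta

    def estimate(self):
        """Fraction of sampled triples that are triangles, scaled to all triples."""
        hits = sum(1 for c in self.edge_count if c == 3)
        return hits / len(self.triples) * comb(self.n, 3)


# Usage: build K4 (every triple is a triangle), then delete all edges again.
sampler = TripleSampler(n=4, k=50, seed=1)
edges = list(combinations(range(4), 2))
for u, v in edges:
    sampler.update(u, v, +1)
k4_estimate = sampler.estimate()
for u, v in edges:
    sampler.update(u, v, -1)
empty_estimate = sampler.estimate()
```

Because each counter only ever adds or subtracts one per update, deletions are handled the same way as insertions. On sparse graphs with few triangles, plain triple sampling of this kind needs many samples to hit a triangle at all, which is the regime the abstract says the combined approach targets.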
Notes
More generally, our results hold when the n vertices come from some arbitrary universe U known in advance.
Additional information
L. Bulteau: Work done while the author was at Technische Universität Berlin and supported by the Alexander von Humboldt Foundation, Bonn, Germany.
V. Froese: Supported by the DFG project DAMM (NI 369/13).
K. Kutzkov: Work done while the author was at IT University of Copenhagen and supported by the Danish National Research Foundation under the Sapere Aude program.
R. Pagh: Supported by the Danish National Research Foundation under the Sapere Aude program.
About this article
Cite this article
Bulteau, L., Froese, V., Kutzkov, K. et al. Triangle Counting in Dynamic Graph Streams. Algorithmica 76, 259–278 (2016). https://doi.org/10.1007/s00453-015-0036-4
DOI: https://doi.org/10.1007/s00453-015-0036-4