Induced Edge Samplings and Triangle Count Distributions in Large Networks

  • Nelson Antunes
  • Tianjian Guo
  • Vladas Pipiras
Conference paper
Part of the Studies in Computational Intelligence book series (SCI, volume 882)

Abstract

This work focuses on the distributions of triangle counts per node and per edge as a means of network description, analysis, model building, and other tasks. The main interest is in estimating these distributions through sampling, especially for large networks. Suitable sampling schemes are introduced and then adapted to situations where network access is restricted or where edges arrive as streaming data. Estimation under the proposed sampling schemes is studied through several methods and examined on simulated and real-world networks.
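
To make the quantities in the abstract concrete, the following minimal Python sketch computes triangle counts per node and per edge on a small undirected graph, together with their empirical distributions. It is an illustration only and does not reproduce the sampling schemes or estimators of the paper; the function names `triangle_counts` and `empirical_distribution` and the toy edge list are assumptions made here, and nodes are assumed to be comparable labels such as integers.

```python
from collections import Counter


def triangle_counts(edges):
    """Return (per_node, per_edge) triangle counts of a simple undirected graph.

    Hypothetical helper for illustration; nodes must be comparable (e.g. ints).
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Triangles through edge (u, v) = number of common neighbours of u and v.
    per_edge = {}
    for u in adj:
        for v in adj[u]:
            if u < v:
                per_edge[(u, v)] = len(adj[u] & adj[v])

    # Each triangle at a node is seen through its two edges incident to
    # that node, so the summed edge counts are halved.
    totals = {u: 0 for u in adj}
    for (u, v), t in per_edge.items():
        totals[u] += t
        totals[v] += t
    per_node = {u: t // 2 for u, t in totals.items()}
    return per_node, per_edge


def empirical_distribution(counts):
    """Map each triangle count k to the fraction of items having count k."""
    freq = Counter(counts.values())
    n = len(counts)
    return {k: freq[k] / n for k in sorted(freq)}


# Toy graph (assumed for illustration): triangles {1,2,3} and {2,3,4}.
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4), (4, 5)]
per_node, per_edge = triangle_counts(edges)
node_dist = empirical_distribution(per_node)
edge_dist = empirical_distribution(per_edge)
print(per_node)
print(node_dist)
print(edge_dist)
```

On a large network the full counts above would be costly to compute, which is the motivation for estimating the two distributions from a sample instead.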

Keywords

Triangles, Random sampling, Distribution estimation, Static and streaming graphs, Power laws

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Center for Computational and Stochastic Mathematics, University of Lisbon, Lisbon, Portugal
  2. University of Algarve, Faro, Portugal
  3. Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, USA
