A Scalable Parallel Approach for Subgraph Census Computation

  • David Aparicio
  • Pedro Paredes
  • Pedro Ribeiro
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8806)

Abstract

Counting the occurrences of small subgraphs in large networks is a fundamental graph mining task with several possible applications. Computing the frequencies of those subgraphs is known as the subgraph census problem, a computationally hard task. In this paper we present a parallel multicore algorithm for it. At its core is FaSE, an efficient network-centric sequential subgraph census algorithm that substantially decreases the number of isomorphism tests needed compared to past approaches. We use one thread per core and employ a dynamic load balancing scheme that copes with the highly unbalanced search tree induced by FaSE and effectively redistributes work during execution. We assessed the scalability of our algorithm on a varied set of representative networks and achieved near-linear speedup up to 32 cores, while maintaining high efficiency on all 64 cores of our machine.
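
To make the subgraph census problem concrete, here is a minimal sequential sketch in Python. It is not FaSE and not the parallel algorithm of the paper: it assumes a simple undirected graph with integer node labels, enumerates connected induced k-node subgraphs with an ESU-style extension scheme, and groups them by a brute-force canonical form. The names `canonical_form` and `subgraph_census` are hypothetical and chosen for illustration only.

```python
from collections import Counter
from itertools import permutations


def canonical_form(nodes, adj):
    """Smallest upper-triangle adjacency tuple over all orderings of `nodes`.

    Brute force (k! orderings), so only sensible for very small k.
    """
    best = None
    for perm in permutations(nodes):
        key = tuple(
            1 if perm[j] in adj[perm[i]] else 0
            for i in range(len(perm))
            for j in range(i + 1, len(perm))
        )
        if best is None or key < best:
            best = key
    return best


def subgraph_census(adj, k):
    """Count connected induced k-node subgraphs per isomorphism class.

    `adj` maps each integer node to the set of its neighbours (assumed
    undirected, no self-loops). Each subgraph is found exactly once, rooted
    at its smallest node.
    """
    census = Counter()

    def extend(sub, ext, root):
        if len(sub) == k:
            census[canonical_form(sub, adj)] += 1
            return
        ext = set(ext)
        while ext:
            w = ext.pop()
            new_ext = set(ext)
            # Add neighbours of w that are larger than the root and are
            # neither in the current subgraph nor adjacent to it.
            for u in adj[w]:
                if u > root and u not in sub and all(u not in adj[s] for s in sub):
                    new_ext.add(u)
            extend(sub + [w], new_ext, root)

    for v in adj:
        extend([v], {u for u in adj[v] if u > v}, v)
    return census


# Example: a triangle 0-1-2 with a pendant node 3 attached to node 2.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
counts = subgraph_census(example, 3)
# counts holds one triangle, key (1, 1, 1), and two 3-node paths, key (0, 1, 1).
```

Each root node starts an independent search tree, and those trees can differ greatly in size. That imbalance is what the dynamic load balancing described in the abstract is meant to address.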

Keywords

Graph Mining, Subgraph Census, Parallelism, Multicores

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • David Aparicio (1)
  • Pedro Paredes (1)
  • Pedro Ribeiro (1)

  1. CRACS & INESC-TEC, Faculdade de Ciencias, Universidade do Porto, Porto, Portugal
