Advertisement

A Scalable Parallel Approach for Subgraph Census Computation

  • David Aparicio
  • Pedro Paredes
  • Pedro Ribeiro
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8806)

Abstract

Counting the occurrences of small subgraphs in large networks is a fundamental graph mining metric with several possible applications. Computing frequencies of those subgraphs is also known as the subgraph census problem, which is a computationally hard task. In this paper we provide a parallel multicore algorithm for this purpose. At its core we use FaSE, an efficient network-centric sequential subgraph census algorithm, which is able to substantially decrease the number of isomorphism tests needed when compared to past approaches. We use one thread per core and employ a dynamic load balancing scheme capable of dealing with the highly unbalanced search tree induced by FaSE and effectively redistributing work during execution. We assessed the scalability of our algorithm on a varied set of representative networks and achieved near linear speedup up to 32 cores while obtaining a high efficiency for the total 64 cores of our machine.

Keywords

Graph Mining Subgraph Census Parallelism Multicores 

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Afrati, F.N., Fotakis, D., Ullman, J.D.: Enumerating subgraph instances using map-reduce. In: IEEE 29th International Conference on Data Engineering (ICDE), pp. 62–73. IEEE CS, Los Alamitos (2013)Google Scholar
  2. 2.
    Aparicio, D., Ribeiro, P., Silva, F.: Parallel subgraph counting for multicore architectures. In: IEEE International Symposium on Parallel and Distributed Processing with Applications. IEEE CS (August 2014)Google Scholar
  3. 3.
    Grochow, J., Kellis, M.: Network motif discovery using subgraph enumeration and symmetry-breaking. Research in Computational Molecular Biology, 92–106 (2007)Google Scholar
  4. 4.
    Kashani, Z., Ahrabian, H., Elahi, E., Nowzari-Dalini, A., Ansari, E., Asadi, S., Mohammadi, S., Schreiber, F., Masoudi-Nejad, A.: Kavosh: a new algorithm for finding network motifs. BMC Bioinformatics 10(1), 318 (2009)CrossRefGoogle Scholar
  5. 5.
    Khakabimamaghani, S., Sharafuddin, I., Dichter, N., Koch, I., Masoudi-Nejad, A.: Quatexelero: An accelerated exact network motif detection algorithm. PLoS One 8(7), e68073 (2013)Google Scholar
  6. 6.
    Milo, R., Shen-Orr, S., Itzkovitz, S., Kashtan, N., Chklovskii, D., Alon, U.: Network Motifs: Simple Building Blocks of Complex Networks. Science 298(5594) (2002)Google Scholar
  7. 7.
    Paredes, P., Ribeiro, P.: Towards a faster network-centric subgraph census. In: Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 264–271. ACM, NY (2013)CrossRefGoogle Scholar
  8. 8.
    Pržulj, N.: Biological network comparison using graphlet degree distribution. Bioinformatics 26(6), 853–854 (2010)CrossRefGoogle Scholar
  9. 9.
    Ribeiro, P., Silva, F.: G-tries: a data structure for storing and finding subgraphs. Data Mining and Knowledge Discovery 28, 337–377 (2014)CrossRefMathSciNetzbMATHGoogle Scholar
  10. 10.
    Ribeiro, P., Silva, F., Kaiser, M.: Strategies for network motifs discovery. In: IEEE International Conference on e-Science, pp. 80–87. e-Science (2009)Google Scholar
  11. 11.
    Ribeiro, P., Silva, F., Lopes, L.: Efficient parallel subgraph counting using g-tries. In: IEEE International Conference on Cluster Computing (Cluster), pp. 1559–1566. IEEE CS (September 2010)Google Scholar
  12. 12.
    Ribeiro, P., Silva, F., Lopes, L.: Parallel discovery of network motifs. Journal of Parallel and Distributed Computing 72, 144–154 (2012)CrossRefGoogle Scholar
  13. 13.
    Sanders, P.: Asynchronous random polling dynamic load balancing. In: Aggarwal, A.K., Pandu Rangan, C. (eds.) ISAAC 1999. LNCS, vol. 1741, pp. 37–48. Springer, Heidelberg (1999)CrossRefGoogle Scholar
  14. 14.
    Slota, G.M., Madduri, K.: Fast approximate subgraph counting and enumeration. In: 42nd International Conference on Parallel Processing, pp. 210–219 (2013)Google Scholar
  15. 15.
    Wang, T., Touchman, J.W., Zhang, W., Suh, E.B., Xue, G.: A parallel algorithm for extracting transcription regulatory network motifs. In: IEEE International Symposium on Bioinformatics and Bioengineering, pp. 193–200 (2005)Google Scholar
  16. 16.
    Wernicke, S.: Efficient detection of network motifs. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 347–359 (2006)Google Scholar
  17. 17.
    Zhao, Z., Wang, G., Butt, A.R., Khan, M., Kumar, V.A., Marathe, M.V.: Sahad: Subgraph analysis in massive networks using hadoop. In: International Parallel and Distributed Processing Symposium, pp. 390–401 (2012)Google Scholar

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • David Aparicio
    • 1
  • Pedro Paredes
    • 1
  • Pedro Ribeiro
    • 1
  1. 1.CRACS & INESC-TEC, Faculdade de CienciasUniversidade do PortoPortoPortugal

Personalised recommendations