Very Large Graph Partitioning by Means of Parallel DBMS

  • Constantin S. Pan
  • Mikhail L. Zymbler
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8133)

Abstract

The paper introduces an approach to partitioning very large graphs by means of a parallel relational database management system (DBMS) named PargreSQL. A very large graph and its intermediate data, which do not fit into main memory, are represented as relational tables and processed by the parallel DBMS. Multilevel partitioning is used: the parallel DBMS first carries out coarsening to reduce the graph size, then an initial partitioning is performed by a third-party main-memory tool, and finally the parallel DBMS is used again to perform uncoarsening. The architecture of PargreSQL is described in brief. PargreSQL was developed by the authors by embedding parallelism into the open-source PostgreSQL DBMS. Experimental results show that the approach achieves very good running time and speedup at an acceptable loss of quality.
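The three phases named in the abstract (coarsening, initial partitioning, uncoarsening) can be illustrated with a small in-memory sketch. This is not the PargreSQL implementation: the function names, the greedy heavy-edge matching, and the trivial split used in place of the third-party partitioning tool are all assumptions made for illustration, and the refinement step that multilevel schemes usually apply during uncoarsening is omitted.

```python
from collections import defaultdict


def coarsen(vertices, edges):
    """One coarsening step (assumed strategy: greedy heavy-edge matching).

    edges maps (u, v) with u < v to a positive weight. Each vertex is merged
    with at most one neighbour; mapping[v] is the coarse vertex that holds v.
    """
    mapping = {}
    # Visit heavier edges first so they end up hidden inside coarse vertices.
    for (u, v), _w in sorted(edges.items(), key=lambda kv: (-kv[1], kv[0])):
        if u not in mapping and v not in mapping:
            mapping[u] = mapping[v] = u  # the smaller endpoint names the pair
    for v in vertices:
        mapping.setdefault(v, v)  # unmatched vertices map to themselves
    coarse_edges = defaultdict(int)
    for (u, v), w in edges.items():
        cu, cv = sorted((mapping[u], mapping[v]))
        if cu != cv:  # edges inside a merged pair disappear
            coarse_edges[(cu, cv)] += w
    return set(mapping.values()), dict(coarse_edges), mapping


def initial_partition(vertices):
    """Stand-in for the third-party in-memory tool: split the vertices in half."""
    ordered = sorted(vertices)
    half = len(ordered) // 2
    return {v: (0 if i < half else 1) for i, v in enumerate(ordered)}


def multilevel_bisect(vertices, edges, target_size=2):
    """Coarsen until the graph is small, partition it, then project back."""
    mappings = []
    while len(vertices) > target_size:
        coarse_vertices, coarse_edges, mapping = coarsen(vertices, edges)
        if len(coarse_vertices) == len(vertices):
            break  # nothing was merged, so further coarsening cannot help
        mappings.append(mapping)
        vertices, edges = coarse_vertices, coarse_edges
    part = initial_partition(vertices)
    # Uncoarsening: every fine vertex inherits the part of its coarse vertex.
    for mapping in reversed(mappings):
        part = {v: part[c] for v, c in mapping.items()}
    return part


def edge_cut(edges, part):
    """Total weight of edges whose endpoints lie in different parts."""
    return sum(w for (u, v), w in edges.items() if part[u] != part[v])


# Two triangles joined by one light bridge edge (2, 3).
vertices = {0, 1, 2, 3, 4, 5}
edges = {(0, 1): 2, (0, 2): 2, (1, 2): 2,
         (3, 4): 2, (3, 5): 2, (4, 5): 2,
         (2, 3): 1}
part = multilevel_bisect(vertices, edges)
print(edge_cut(edges, part))  # 1
```

In the approach the abstract describes, the graph and its intermediate data are stored as relational tables and the coarsening and uncoarsening phases are carried out by the parallel DBMS; the dictionaries above merely play the role of those tables on a toy input.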

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Constantin S. Pan (1)
  • Mikhail L. Zymbler (1)
  1. South Ural State University, Chelyabinsk, Russia
