A Review of Engines for Graph Storage and Mutations
Abstract
With the continuous generation of big data, structuring large amounts of information is becoming a vital step in extracting useful insights from raw data. Among the technologies that have emerged for this purpose are graph processing systems, which support network analysis. Data can be collected and stored in a graph structure, with vertices representing entities and edges representing their relationships, in order to reveal correlations between components, e.g., to determine which group of users is more likely to follow a certain Twitter account. To achieve high performance in graph analytics, graph processing engines exploit hardware resources and design efficient data structures to store graphs. Moreover, to track the evolution of graphs, systems need to support fast structural mutations, i.e., the addition or removal of vertices or edges. This paper characterizes engines based on their hardware infrastructure, their graph storage, and their support for graph mutations.
Keywords
Literature review, Graph analysis, Data structures, Graph mutation