Abstract
Large real-world networks such as social networks, web page link graphs, and traffic networks form complex graph structures with millions of vertices and edges. Among the many operations on these graphs, shortest path discovery is one of the most important and most expensive. Besides in-memory approaches, many efficient shortest path computation methods have been developed on top of distributed and parallel platforms. Pregel, a bulk synchronous parallel framework, is one such platform for processing large graphs. The known shortest path computation approach on Pregel is computation-intensive and unsuitable for real-time services. In this paper, we propose a Pregel-based k-distance index technique that enables efficient single-pair shortest path discovery. We reduce network cost and unnecessary operations by transmitting more information in a single superstep. Extensive experiments on both real and synthetic datasets demonstrate the superiority of the proposed approach.
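For orientation, the baseline that the abstract calls computation-intensive can be sketched as a vertex-centric, bulk synchronous shortest path computation. The following is only a minimal single-machine simulation under assumptions: the function name `pregel_sssp`, the adjacency-list layout, and the example graph are hypothetical, and the code illustrates the generic superstep pattern, not the k-distance index proposed in the paper.

```python
import math
from collections import defaultdict


def pregel_sssp(graph, source):
    """Simulate a Pregel-style single-source shortest path computation.

    graph: dict mapping every vertex to a list of (neighbor, weight) pairs
           (assumed layout; every vertex must appear as a key).
    Returns (dist, supersteps), where dist maps each vertex to its distance.
    """
    dist = {v: math.inf for v in graph}
    inbox = {source: [0]}  # messages to be delivered in the next superstep
    supersteps = 0

    # A vertex is active only if it received messages; the computation halts
    # when no messages are in flight.
    while inbox:
        outbox = defaultdict(list)
        for vertex, messages in inbox.items():
            best = min(messages)
            if best < dist[vertex]:
                # The distance improved, so notify all out-neighbors.
                dist[vertex] = best
                for neighbor, weight in graph[vertex]:
                    outbox[neighbor].append(best + weight)
        inbox = outbox  # synchronization barrier between supersteps
        supersteps += 1

    return dist, supersteps


# Small hypothetical graph used only for illustration.
EXAMPLE_GRAPH = {
    "a": [("b", 1), ("c", 4)],
    "b": [("c", 2), ("d", 6)],
    "c": [("d", 1)],
    "d": [],
}

if __name__ == "__main__":
    distances, steps = pregel_sssp(EXAMPLE_GRAPH, "a")
    print(distances, steps)
```

Each message in this sketch carries a single tentative distance, so answering one query can take many supersteps and many messages. That per-query cost is the one the abstract says the proposed approach reduces by transmitting more information in a single superstep.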
Acknowledgments
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No. 2013R1A2A1A05056375).
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hong, J., Kim, H., Nawaz, W., Park, K., Jeong, BS., Lee, YK. (2014). Distributed K-Distance Indexing Approach for Efficient Shortest Path Discovery on Large Graphs. In: Han, WS., Lee, M., Muliantara, A., Sanjaya, N., Thalheim, B., Zhou, S. (eds) Database Systems for Advanced Applications. DASFAA 2014. Lecture Notes in Computer Science, vol 8505. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43984-5_6
DOI: https://doi.org/10.1007/978-3-662-43984-5_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-43983-8
Online ISBN: 978-3-662-43984-5
eBook Packages: Computer Science (R0)