Abstract
Many real-world applications need to obtain exact distance information efficiently from both directed and undirected graphs. In this work, we unify query indexing for directed and undirected graphs by proposing the TreeMap approach. TreeMap has tight bounds on query time, index size, and construction time for answering queries on both kinds of graph. The query time complexity is close to constant for graphs whose tree decomposition has small width, and the index can be constructed without materializing the distance matrix or performing other high-cost operations. In an empirical study, we demonstrate that TreeMap generally performs much better than competing methods at indexing real graphs for exact distance queries.
Cite this article
Xiang, Y. Answering exact distance queries on real-world graphs with bounded performance guarantees. The VLDB Journal 23, 677–695 (2014). https://doi.org/10.1007/s00778-013-0338-6
DOI: https://doi.org/10.1007/s00778-013-0338-6