The VLDB Journal, Volume 22, Issue 3, pp 275–294

Computing weight constraint reachability in large networks

  • Miao Qiao
  • Hong Cheng
  • Lu Qin
  • Jeffrey Xu Yu
  • Philip S. Yu
  • Lijun Chang
Regular Paper

Abstract

Reachability is a fundamental problem on the large-scale networks now emerging in various application domains, such as social networks, communication networks, biological networks, and road networks, and it has been studied extensively. However, little existing work has studied reachability with realistic constraints imposed on graphs with real-valued edge or node weights. Such weights are very common in real-world networks: for example, the bandwidth of a link in a communication network, the reliability of an interaction between two proteins in a PPI network, and the handling capacity of a warehouse/storage point in a distribution network. In this paper, we formalize a new yet important reachability query in weighted undirected graphs, called the weight constraint reachability (WCR) query, which asks: is there a path between nodes \(a\) and \(b\) on which each real-valued edge (or node) weight satisfies a range constraint? We discover an interesting property of WCR, based on which we design a novel edge-based index structure that answers the WCR query in \(O(1)\) time. Furthermore, we consider the case in which the index cannot fit entirely in memory, which can be very common for emerging massive networks. For this case we propose an I/O-efficient index, which provides constant I/O query time (precisely four I/Os) with an \(O(|V|\log |V|)\) disk-based index size. Extensive experimental studies on both real and synthetic datasets demonstrate the efficiency and scalability of our solutions in answering the WCR query.
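To make the WCR query concrete, the following is a minimal sketch of a naive baseline, not the index structure proposed in the paper: a breadth-first search that only follows edges whose weight lies in the queried range. The function names `build_adjacency` and `wcr_query`, the edge-list input format, and the choice of a closed interval \([lo, hi]\) for the range constraint are all assumptions made for illustration. Each query costs \(O(|V| + |E|)\) time here, which is the per-query cost that a constant-time index avoids.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Build an undirected adjacency list from (u, v, weight) triples.

    The edge-list input format is an assumption for this sketch.
    """
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    return adj


def wcr_query(adj, a, b, lo, hi):
    """Naive WCR baseline (hypothetical helper, not the paper's index).

    Returns True if some path from a to b uses only edges whose weight w
    satisfies lo <= w <= hi. The closed interval is an assumption.
    """
    if a == b:
        return True
    seen = {a}
    queue = deque([a])
    while queue:
        u = queue.popleft()
        for v, w in adj.get(u, ()):
            # Skip edges that violate the range constraint.
            if lo <= w <= hi and v not in seen:
                if v == b:
                    return True
                seen.add(v)
                queue.append(v)
    return False


# Toy graph: a direct light edge a-b and a heavier two-hop route via c.
edges = [("a", "c", 5.0), ("c", "b", 7.0), ("a", "b", 1.0)]
adj = build_adjacency(edges)
```

On this toy graph, `wcr_query(adj, "a", "b", 4.0, 8.0)` succeeds through node `c`, while `wcr_query(adj, "a", "b", 2.0, 6.0)` fails because neither the direct edge (weight 1.0) nor the edge `c`-`b` (weight 7.0) falls inside the range.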

Keywords

  • Weight constraint reachability
  • Minimum spanning tree
  • Lowest common ancestor
  • Vertex coding
  • I/O-efficient index

Copyright information

© Springer-Verlag 2012

Authors and Affiliations

  • Miao Qiao (1)
  • Hong Cheng (1)
  • Lu Qin (1)
  • Jeffrey Xu Yu (1)
  • Philip S. Yu (2, 3)
  • Lijun Chang (1)

  1. Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, New Territories, Hong Kong
  2. Department of Computer Science, University of Illinois at Chicago, Chicago, USA
  3. Computer Science Department, King Abdulaziz University, Jeddah, Saudi Arabia
