Fast Computation of Reachability Labeling for Large Graphs

  • Jiefeng Cheng
  • Jeffrey Xu Yu
  • Xuemin Lin
  • Haixun Wang
  • Philip S. Yu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3896)


The need to process graph reachability queries stems from many applications that manage complex data as graphs. These applications include transportation networks, Internet traffic analysis, Web navigation, the semantic web, chemical informatics and bioinformatics systems, and computer vision. A graph reachability query, one of the primary tasks in such applications, asks whether two given data objects, u and v, are related in any way in a large and complex dataset. Formally, the query asks whether v is reachable from u in a large directed graph. In this paper, we focus on building a reachability labeling for a large directed graph so that reachability queries can be processed efficiently. Such a labeling must be small, so that queries can be answered efficiently, and must be fast to compute, so that it can be constructed efficiently. One such labeling, the 2-hop cover, was proposed for arbitrary graphs with theoretical bounds on both the construction cost and the size of the resulting labeling. In practice, however, the reported construction cost of a 2-hop cover is very high, even on very powerful machines. We propose a novel geometry-based algorithm that computes a high-quality 2-hop cover quickly. Our experimental results verify the effectiveness of our techniques on large real and synthetic graph datasets.
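To make the idea of a reachability labeling concrete, the following is a minimal sketch of how a query can be answered from 2-hop labels. It does not reproduce the geometry-based construction proposed in the paper. It assumes the usual 2-hop convention that every vertex u carries two sets, here called L_OUT[u] (centers that u reaches) and L_IN[u] (centers that reach u), and that v is reachable from u exactly when the two sets share a center. The toy graph, the hand-built labels, and all function names are hypothetical and chosen only for illustration.

```python
from collections import deque

# Hypothetical toy DAG (adjacency lists): a->b, a->d, b->c, d->c, c->e.
GRAPH = {"a": ["b", "d"], "b": ["c"], "c": ["e"], "d": ["c"], "e": []}

# Hand-built 2-hop labels for the toy graph (an assumption, not the output
# of the paper's algorithm). Each vertex is included in its own labels.
L_OUT = {"a": {"a", "c"}, "b": {"b", "c"}, "c": {"c"}, "d": {"d", "c"}, "e": {"e"}}
L_IN = {"a": {"a"}, "b": {"b", "a"}, "c": {"c"}, "d": {"d", "a"}, "e": {"e", "c"}}


def reachable_bfs(graph, u, v):
    """Baseline: answer the query by breadth-first search from u."""
    seen = {u}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return True
        for y in graph[x]:
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return False


def reachable_2hop(l_out, l_in, u, v):
    """v is reachable from u iff L_out(u) and L_in(v) share a center."""
    return not l_out[u].isdisjoint(l_in[v])


def labeling_size(l_out, l_in):
    """Total number of label entries; the quantity a 2-hop cover tries to keep small."""
    return sum(len(s) for s in l_out.values()) + sum(len(s) for s in l_in.values())


if __name__ == "__main__":
    for u in GRAPH:
        for v in GRAPH:
            assert reachable_2hop(L_OUT, L_IN, u, v) == reachable_bfs(GRAPH, u, v)
    print("label entries:", labeling_size(L_OUT, L_IN))
```

The label lookup touches only the two label sets of u and v, whereas the breadth-first search may traverse the whole graph. This is why the size of the labeling governs query cost, and why computing a small labeling quickly is the hard part.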


Bipartite Graph, Directed Graph, Directed Acyclic Graph, Transitive Closure, Large Graph
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jiefeng Cheng (1)
  • Jeffrey Xu Yu (1)
  • Xuemin Lin (2)
  • Haixun Wang (3)
  • Philip S. Yu (3)
  1. The Chinese University of Hong Kong, China
  2. University of New South Wales & NICTA, Australia
  3. T. J. Watson Research Center, IBM, USA
