Fast Reachability Query Processing

  • Jiefeng Cheng
  • Jeffrey Xu Yu
  • Nan Tang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3882)


Graphs have great expressive power for describing complex relationships among data objects, and large graph datasets are available. In this paper, we focus on processing a primitive graph query, which we call the reachability query. The reachability query, denoted \(A \rightsquigarrow D\), finds all elements of a type D that are reachable from some element of another type A. The problem is challenging because the existing structural join algorithms, studied in XML query processing, rely heavily on the tree structure of the data and therefore cannot be applied to it directly. We propose a novel approach that processes reachability queries on the fly while keeping small the space needed to store the information required for processing them. In brief, our approach is based on 2-hop labeling for a directed graph G, which consumes O(|V| log |E|) space. We construct a novel join-index built on a small table and a B+-tree, and this join-index is what makes query processing efficient. We conducted extensive experimental studies, which confirm that our approach efficiently processes reachability queries over a graph or a tree.
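To make the idea of answering \(A \rightsquigarrow D\) from 2-hop labels concrete, here is a minimal Python sketch. It is not the construction or the join-index described in the paper: the labels are built with a simple pruned breadth-first search chosen for illustration, and the names `build_two_hop_labels`, `reaches`, `reachability_query`, and the toy graph are all assumptions introduced here. The sketch only illustrates the labeling property itself: u reaches v exactly when the out-label of u and the in-label of v share a center.

```python
from collections import deque


def build_two_hop_labels(nodes, edges):
    """Build a valid (not minimal) 2-hop labeling using pruned BFS.

    Illustrative heuristic only; the paper's own labeling algorithm differs.
    """
    succ = {v: [] for v in nodes}
    pred = {v: [] for v in nodes}
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)

    l_out = {v: set() for v in nodes}  # centers that v can reach
    l_in = {v: set() for v in nodes}   # centers that can reach v

    def pruned_bfs(center, adjacency, labels, covered):
        queue, seen = deque([center]), {center}
        while queue:
            w = queue.popleft()
            if w != center and covered(w):
                continue  # an earlier center already covers this pair
            labels[w].add(center)
            for x in adjacency[w]:
                if x not in seen:
                    seen.add(x)
                    queue.append(x)

    # Process high-degree nodes first so they tend to become shared centers.
    order = sorted(nodes, key=lambda v: -(len(succ[v]) + len(pred[v])))
    for c in order:
        # Forward pass: c reaches w, so c may be recorded in l_in[w].
        pruned_bfs(c, succ, l_in, lambda w: not l_out[c].isdisjoint(l_in[w]))
        # Backward pass: w reaches c, so c may be recorded in l_out[w].
        pruned_bfs(c, pred, l_out, lambda w: not l_out[w].isdisjoint(l_in[c]))
    return l_out, l_in


def reaches(u, v, l_out, l_in):
    """u reaches v iff some center lies in both l_out[u] and l_in[v]."""
    return not l_out[u].isdisjoint(l_in[v])


def reachability_query(node_type, l_out, l_in, a, d):
    """Return all nodes of type d reachable from some node of type a."""
    a_centers = set()
    for v, t in node_type.items():
        if t == a:
            a_centers |= l_out[v]
    return {v for v, t in node_type.items()
            if t == d and not a_centers.isdisjoint(l_in[v])}


# Hypothetical toy graph: d3 is reachable only from c1, which has type C.
nodes = ["a1", "a2", "b1", "b2", "c1", "d1", "d2", "d3"]
edges = [("a1", "b1"), ("b1", "d1"), ("a2", "b2"),
         ("b2", "d2"), ("c1", "d3"), ("b1", "d2")]
node_type = {"a1": "A", "a2": "A", "b1": "B", "b2": "B",
             "c1": "C", "d1": "D", "d2": "D", "d3": "D"}

l_out, l_in = build_two_hop_labels(nodes, edges)
result = reachability_query(node_type, l_out, l_in, "A", "D")
print(sorted(result))  # ['d1', 'd2']
```

In this sketch the query merges the out-labels of all A-elements once and then tests each D-element with a single set intersection, so no graph traversal happens at query time. A disk-based system would instead look up centers through an index, which is the role the abstract assigns to the join-index.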


Directed Acyclic Graph; Query Processing; Semistructured Data; Very Large Data Base; Graph Code
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jiefeng Cheng (1)
  • Jeffrey Xu Yu (1)
  • Nan Tang (1)
  1. The Chinese University of Hong Kong, China
