Parallel processing of graph reachability in databases

Abstract

In this paper we consider parallel processing of a graph represented by a database relation, and we achieve two objectives. First, we propose a methodology for analyzing the speedup of a parallel processing strategy, with the purpose of selecting at runtime one of several candidate strategies depending on the hardware architecture and the input graph. Second, we study the single-source reachability problem, namely the problem of computing the set of nodes reachable from a given node in a directed graph. We propose several parallel strategies for solving this problem, and we analyze their performance using our new methodology. The analysis is confirmed experimentally in a UNIX-Ethernet environment. We also extend the results to the transitive closure problem.
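To make the problem statement concrete, the following is a minimal sequential sketch of single-source reachability over a graph stored as an edge relation. It is not one of the parallel strategies proposed in the paper; it only illustrates the computation those strategies parallelize. The function name `reachable_from`, the tuple representation of the relation, and the convention that a node counts as reachable only via a path of at least one edge are all assumptions made for this illustration.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

# Assumed representation: the database relation is a collection of
# (from_node, to_node) tuples, one tuple per directed edge.
Edge = Tuple[Hashable, Hashable]


def reachable_from(edges: Iterable[Edge], source: Hashable) -> Set[Hashable]:
    """Return the nodes reachable from `source` by a path of one or more edges."""
    # Index the edge relation by its first column so each "join" is a lookup.
    successors: Dict[Hashable, Set[Hashable]] = {}
    for u, v in edges:
        successors.setdefault(u, set()).add(v)

    reached: Set[Hashable] = set()
    frontier: Set[Hashable] = {source}
    while frontier:
        # Join the current frontier with the edge relation.
        next_nodes: Set[Hashable] = set()
        for u in frontier:
            next_nodes |= successors.get(u, set())
        # Keep only newly discovered nodes, so each node is expanded once.
        frontier = next_nodes - reached
        reached |= frontier
    return reached


if __name__ == "__main__":
    relation = [(1, 2), (2, 3), (3, 1), (4, 5)]
    print(sorted(reachable_from(relation, 1)))
    print(sorted(reachable_from(relation, 4)))
```

Each loop iteration corresponds to one round of joining the newly found nodes with the edge relation, which is the unit of work a parallel strategy would distribute across processors.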

References

  1. O. Wolfson, W. Zhang, H. Butani, A. Kawaguchi, and K. Mok, A Methodology for Evaluating Parallel Graph Algorithms and Its Application to Single Source Reachability, Parallel and Distributed Information Systems (1993).

  2. R. Agrawal and H. V. Jagadish, Multiprocessor Transitive Closure Algorithms, Proc. Int'l. Symp. on Databases in Parallel and Distr. Syst. (December 1988).

  3. J. Cheiney and C. De Maindreville, A Parallel Strategy for the Transitive Closure Using Double Hash-based Clustering, Proc. of VLDB Conf. (August 1990).

  4. M. Houtsma, P. Apers, and S. Ceri, Distributed Transitive Closure Computation: the Disconnection Set Approach, Proc. of VLDB Conf. (August 1990).

  5. P. Valduriez and S. Khoshafian, Parallel Evaluation of the Transitive Closure of a Database Relation, IJPP, 17(1) (February 1988).

  6. F. Cacace, S. Ceri, and M. Houtsma, A Survey of Parallel Execution Strategies for Transitive Closure and Logic Programs, Manuscript (October 1991).

  7. F. Bancilhon, D. Maier, Y. Sagiv, and J. Ullman, Magic Sets and Other Strange Ways to Implement Logic Programs, Proc. 5th ACM Symp. on PODS, pp. 1–15 (1986).

  8. R. J. Lipton and J. F. Naughton, Estimating the size of generalized transitive closures, Proc. Fifteenth Very Large Data Bases, pp. 165–172 (August 1989).

  9. O. Wolfson and A. Silberschatz, Distributed processing of logic programs, Proc. of ACM Sigmod Conf., Chicago, Illinois, pp. 329–336 (1988).

  10. O. Wolfson, Sharing the load of logic program evaluation, Proc. Int'l. Symp. on Databases in Parallel and Distributed Systems (December 1988).

  11. S. R. Cohen and O. Wolfson, Why a single parallelization strategy is not enough in knowledge bases, Proc. of ACM Symp. on PODS, Philadelphia, Pennsylvania, pp. 200–216 (March 1989).

  12. G. Dong, On distributed processibility of logic programs by decomposing databases, Proc. of ACM Sigmod Conf., Portland, Oregon, pp. 26–35 (June 1989).

  13. S. Ganguly, A. Silberschatz, and S. Tsur, A framework for the parallel processing of datalog queries, Proc. of ACM Sigmod Conf., Atlantic City, New Jersey, pp. 143–152 (May 1990).

  14. O. Wolfson and A. Ozeri, A new paradigm for parallel and distributed rule-processing, Proc. of ACM Sigmod Conf., Atlantic City, New Jersey, pp. 133–142 (May 1990).

  15. J. Seib and G. Lausen, Parallelizing Datalog programs by generalized pivoting, Proc. of ACM Symp. on PODS, Denver, Colorado, pp. 241–251 (May 1991).

  16. W. Zhang, K. Wang, and S. C. Chau, Data partition: a practical parallel evaluation of Datalog programs, Proc. of the First Int'l Conf. on Parallel and Distributed Information Systems, Miami Beach, Florida, pp. 98–105 (1991).

  17. S. Stolfo, D. Miranker, and R. Mills, A simple processing scheme to extract and load balance implicit parallelism in the concurrent match of production rules, Proc. of the AFIPS Symp. on Fifth Generation Computing (1985).

  18. Q. Yang and C. Yu, A Parallel Scheme Using the Divide and Conquer Method, Workshop on Deductive Databases in Conjunction with ILPS'91, San Diego, California. Journal version submitted for publication.

  19. J. D. Ullman, Principles of Database and Knowledge-Base Systems, Vol. 2, Computer Science Press, Rockville, Maryland (1989).

  20. G. C. Canavos, Applied Probability and Statistical Methods, Little Brown & Company, Boston (1984).

  21. M. Ajtai, J. Komlós, and E. Szemerédi, The longest path in a random graph, Combinatorica, 1:1–12 (1981).

  22. F. Bancilhon and R. Ramakrishnan, An amateur's introduction to recursive query processing strategies, Proc. of ACM Sigmod Conf., pp. 16–52 (1986).

  23. F. Bancilhon and R. Ramakrishnan, Performance evaluation of data intensive logic programs, Foundations of Deductive Databases and Logic Programming, J. Minker (ed.), Morgan Kaufmann, Los Altos, California, pp. 439–517 (1988).

  24. B. Bollobás, Random Graphs, Academic Press, London (1985).

  25. S. Ganguly, R. Krishnamurthy, and A. Silberschatz, An analysis technique for transitive closure algorithms: a statistical approach, Proc. IEEE Data Engineering Conference, Kobe, Japan, pp. 728–735 (April 1991).

  26. J. Han and H. Lu, Some performance results on recursive query processing in relational database systems, Proc. IEEE Data Engineering Conf., Los Angeles, California, pp. 533–541 (1986).

  27. R. M. Karp, The transitive closure of a random graph, Random Structures and Algorithms, 1(1):73–93 (1990).

  28. A. Marchetti-Spaccamela, A. Pelaggi, and D. Sacca, Worst-case complexity analysis of methods for logic query implementation, Proc. ACM Symp. on PODS, San Diego, California, pp. 294–301 (March 1987).

  29. S. Seshadri and J. F. Naughton, On the estimated size of recursive Datalog queries, Proc. of ACM Symp. on PODS, Denver, Colorado, pp. 268–279 (1991).

  30. J. D. Ullman, Principles of Database and Knowledge-Base Systems, Vol. 1, Computer Science Press, Rockville, Maryland (1988).

Additional information

A preliminary shortened version of this paper has appeared in PDIS. See Ref. 1.

This author's work was supported in part by NSF Grant 90-03341.

This author's work was supported in part by the Natural Sciences and Engineering Research Council of Canada.

This author's work was supported in part by NSF Grant 90-03341.

About this article

Cite this article

Wolfson, O., Zhang, W., Butani, H. et al. Parallel processing of graph reachability in databases. Int J Parallel Prog 21, 269–302 (1992). https://doi.org/10.1007/BF01421676

  • DOI: https://doi.org/10.1007/BF01421676

Key Words

  • Parallel and distributed databases
  • data reduction paradigm
  • graph reachability
  • sampling
  • transitive closure