Abstract
An important feature of database technology of the nineties is the use of distributed computation for speeding up the execution of complex queries. Today, the use of parallelism is tested in several experimental database architectures and a few commercial systems for conventional select-project-join queries. In particular, in multi-processor architectures without shared memory, hash-based fragmentation is used to distribute data across disks controlled by different processors, so that selections and joins can be performed in parallel.
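As a rough illustration of the hash-based fragmentation mentioned above, the following sketch splits two relations on their join attribute so that each pair of fragments can be joined independently. It is only a single-process simulation under stated assumptions: the function names (`fragment`, `local_join`, `fragmented_join`) and the tuple-based relation layout are hypothetical and do not come from any of the systems discussed.

```python
from collections import defaultdict


def fragment(relation, key_index, n_fragments):
    """Split a relation (list of tuples) into buckets by hashing one attribute."""
    fragments = [[] for _ in range(n_fragments)]
    for row in relation:
        fragments[hash(row[key_index]) % n_fragments].append(row)
    return fragments


def local_join(r_frag, s_frag, r_key, s_key):
    """Hash join of two fragments; this is the work one processor would do."""
    index = defaultdict(list)
    for s_row in s_frag:
        index[s_row[s_key]].append(s_row)
    return [r_row + s_row
            for r_row in r_frag
            for s_row in index.get(r_row[r_key], [])]


def fragmented_join(r, s, r_key, s_key, n_fragments):
    """Equi-join r and s by joining matching fragments one pair at a time.

    Tuples with equal join values hash to the same fragment number, so no
    pair of fragments with different numbers can contribute to the result.
    The loop below is sequential; in a shared-nothing machine each iteration
    would run on its own processor (an assumption of this sketch).
    """
    r_frags = fragment(r, r_key, n_fragments)
    s_frags = fragment(s, s_key, n_fragments)
    result = []
    for r_frag, s_frag in zip(r_frags, s_frags):
        result.extend(local_join(r_frag, s_frag, r_key, s_key))
    return result
```

Because both relations are fragmented with the same hash function on the join attribute, the union of the local joins equals the join of the full relations.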
With the development of new (logic) query languages and deductive databases, the new dimension of recursion has been added to query processing. Transitive closure queries, such as bill-of-material, allow important database problems to be solved by the database system itself; and more general logic programming queries allow us to study queries not considered before. Although recursive queries are very complex, their regular structure makes them particularly suited for parallel execution. Well-considered use of parallelism can give a high efficiency gain when processing recursive queries.
In this paper, we give an overview of approaches to parallel execution of recursive queries as they have been presented in recent literature. After showing that the most typical Datalog queries have exactly the same expressive power as the transitive closure of simple algebraic expressions, we focus on describing algebraic approaches to recursion.
To give a good overview of the problems that are inherent to parallel computation, we introduce a graphical formalism to describe parallel execution. This formalism enables us to clearly show the behaviour of parallel execution strategies. We first review algorithms developed in the framework of algebraic transitive closures that operate on entire relations; then we introduce fragmentation, distinguishing between hash-based and semantic fragmentation.
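The two settings named above can be sketched as follows. This is a minimal illustration, not one of the surveyed algorithms: it uses the well-known semi-naive iteration as a stand-in for an algebraic transitive closure over an entire relation, followed by a variant in which the base relation is hash-fragmented on its source attribute. All function names are hypothetical, and the "processors" are simulated by a sequential loop.

```python
from collections import defaultdict


def transitive_closure(edges):
    """Semi-naive transitive closure of a binary relation given as (a, b) pairs."""
    succ = defaultdict(set)
    for a, b in edges:
        succ[a].add(b)
    closure = set(edges)
    delta = set(edges)
    while delta:
        # Join only the tuples found in the previous round with the base relation.
        new = {(a, c) for a, b in delta for c in succ.get(b, ())}
        delta = new - closure
        closure |= delta
    return closure


def fragmented_transitive_closure(edges, n_fragments):
    """Same iteration, with the base relation hash-fragmented on its source node."""
    base = [defaultdict(set) for _ in range(n_fragments)]
    for a, b in edges:
        base[hash(a) % n_fragments][a].add(b)
    closure = set(edges)
    delta = set(edges)
    while delta:
        # Route each new tuple to the fragment that owns its end node.
        routed = [[] for _ in range(n_fragments)]
        for a, b in delta:
            routed[hash(b) % n_fragments].append((a, b))
        new = set()
        # Each fragment's join is independent of the others, so in a real
        # system these iterations could run on separate processors.
        for i in range(n_fragments):
            for a, b in routed[i]:
                for c in base[i].get(b, ()):
                    new.add((a, c))
        delta = new - closure
        closure |= delta
    return closure
```

Both functions return the same set of pairs; the fragmented variant only changes where each join step would be executed, at the cost of redistributing the newly derived tuples in every round.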
This research is partially supported by the LOGIDATA+ Project of the National Research Council of Italy.
© 1991 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cacace, F., Ceri, S., Houtsma, M.A.W. (1991). An overview of parallel strategies for transitive closure on algebraic machines. In: America, P. (eds) Parallel Database Systems. PDS 1990. Lecture Notes in Computer Science, vol 503. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-54132-2_49
DOI: https://doi.org/10.1007/3-540-54132-2_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-54132-5
Online ISBN: 978-3-540-47432-6
eBook Packages: Springer Book Archive