Abstract
We explore an approach to developing Datalog parallelization strategies that aims at good expected rather than worst-case performance. To illustrate, we consider a very simple parallelization strategy that applies to all Datalog programs. We prove that this strategy has very good expected performance under equal distribution of inputs. The proof uses an extension of 0–1 laws adapted to this context. The analysis is confirmed by experimental results on randomly generated data.
This work was done while the author was affiliated with the Ecole Nationale Supérieure des Télécommunications (ENST), Paris, France, and supported in part by CAPES/MEC Brasil under grant #1245/90-13.
Work performed in part while visiting ENST Paris, and supported in part by the NSF under grant IRI-9221268.
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lifschitz, S., Vianu, V. (1995). A probabilistic view of Datalog parallelization. In: Gottlob, G., Vardi, M.Y. (eds) Database Theory — ICDT '95. ICDT 1995. Lecture Notes in Computer Science, vol 893. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-58907-4_23
DOI: https://doi.org/10.1007/3-540-58907-4_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-58907-5
Online ISBN: 978-3-540-49136-1
eBook Packages: Springer Book Archive