A probabilistic view of Datalog parallelization

  • Sérgio Lifschitz
  • Victor Vianu
Contributed Papers Probabilistic Methods
Part of the Lecture Notes in Computer Science book series (LNCS, volume 893)

Abstract

We explore an approach to developing Datalog parallelization strategies that aims at good expected rather than worst-case performance. To illustrate, we consider a very simple parallelization strategy that applies to all Datalog programs. We prove that this strategy has very good expected performance under a uniform distribution of inputs. The proof uses an extension of 0–1 laws adapted to this context. The analysis is confirmed by experimental results on randomly generated data.

Copyright information

© Springer-Verlag 1995

Authors and Affiliations

  • Sérgio Lifschitz (1)
  • Victor Vianu (2)
  1. Depto. Informática, PUC-Rio, Rio de Janeiro, RJ, Brasil
  2. Univ. of California at San Diego, La Jolla, USA
