Dataflow query execution in a parallel main-memory environment

Wilschut, Annita N.; Apers, Peter M. G.

doi:10.1007/BF01277522


Published: January 1993

Volume 1, pages 103–128, (1993)

Distributed and Parallel Databases


Abstract

This paper studies the performance and characteristics of the execution of various join-trees on a parallel DBMS. The results are a step toward the design of a query optimization strategy suited to the parallel execution of complex queries.

Synchronization issues, among others, are identified as limiting the performance gain from parallelism. A new hash-join algorithm is introduced that has fewer synchronization constraints than the known hash-join algorithms. The behavior of individual join operations in a join-tree is also studied in a simulation experiment. The results show that the introduced Pipelining hash-join algorithm yields better performance for multi-join queries. The shape of the optimal join-tree appears to depend on the size of the join operands: a multi-join between small operands performs best with a bushy schedule, whereas larger operands are better off with a linear schedule. The results of the simulation study are confirmed by an analytic model for dataflow query execution.
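The abstract does not spell out the Pipelining hash-join itself. The sketch below illustrates the general idea usually associated with it, a symmetric hash join: a hash table is kept for each operand, and every arriving tuple is inserted into its own table and immediately probed against the other, so neither operand has to be consumed completely before results flow. It is a single-process illustration under assumptions, not the authors' implementation. The function name, the "L"/"R" tagging of the input stream, and the dictionary-based rows are all hypothetical choices made for the example.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, Any]


def pipelining_hash_join(
    arrivals: Iterable[Tuple[str, Row]],
    left_key: str,
    right_key: str,
) -> Iterator[Tuple[Row, Row]]:
    """Yield (left_row, right_row) pairs as soon as both halves have arrived.

    `arrivals` is an interleaved stream of ("L", row) or ("R", row) items
    (a hypothetical encoding of two operand streams merged in arrival order).
    """
    left_table: Dict[Any, List[Row]] = defaultdict(list)
    right_table: Dict[Any, List[Row]] = defaultdict(list)

    for side, row in arrivals:
        if side == "L":
            key = row[left_key]
            # Insert into the left table, then probe the right table.
            left_table[key].append(row)
            for match in right_table.get(key, ()):
                yield (row, match)
        elif side == "R":
            key = row[right_key]
            # Insert into the right table, then probe the left table.
            right_table[key].append(row)
            for match in left_table.get(key, ()):
                yield (match, row)
        else:
            raise ValueError(f"unknown operand tag: {side!r}")


if __name__ == "__main__":
    stream = [
        ("L", {"id": 1, "a": "x"}),
        ("R", {"id": 1, "b": "y"}),
        ("R", {"id": 2, "b": "z"}),
        ("L", {"id": 2, "a": "w"}),
        ("L", {"id": 3, "a": "q"}),
    ]
    for left_row, right_row in pipelining_hash_join(stream, "id", "id"):
        print(left_row, right_row)
```

Because the function is a generator, a consumer (for example, the next join in a join-tree) can start working on the first match before either input stream is exhausted. A conventional build-then-probe hash join, by contrast, must finish reading one operand before it can emit anything. The cost in this sketch is memory: both operands end up stored in hash tables.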



Author information

Authors and Affiliations

University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
Annita N. Wilschut
University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
Peter M. G. Apers


About this article

Cite this article

Wilschut, A.N., Apers, P.M.G. Dataflow query execution in a parallel main-memory environment. Distrib Parallel Databases 1, 103–128 (1993). https://doi.org/10.1007/BF01277522


Issue Date: January 1993
DOI: https://doi.org/10.1007/BF01277522
