Encyclopedia of Database Systems

Living Edition | Editors: Ling Liu, M. Tamer Özsu

Query Load Balancing in Parallel Database Systems

  • Luc Bouganim
Living reference work entry
DOI: https://doi.org/10.1007/978-1-4899-7993-3_1080-2

Synonyms

Definition

The goal of parallel query execution is to minimize query response time using inter- and intraoperator parallelism. Interoperator parallelism assigns different operators of a query execution plan to distinct (sets of) processors, while intraoperator parallelism uses several processors to execute a single operator, thanks to data partitioning. Conceptually, parallelizing a query amounts to dividing the query work into small pieces, or tasks, assigned to different processors. Because the response time of a set of parallel tasks is that of the longest one, the main difficulty is to produce and execute these tasks such that the query load is evenly balanced across the processors. This is made more complex by dependencies between tasks (e.g., pipeline parallelism) and synchronization points. Query load balancing relates to static and/or dynamic techniques and algorithms to balance the query load across the processors so that the...
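
The observation that the response time of a set of parallel tasks is that of the longest one can be illustrated with a minimal sketch. It is not taken from the entry: the function names (`response_time`, `round_robin`, `greedy_lpt`) and the task costs are hypothetical, tasks are assumed independent (no pipeline dependencies or synchronization points), and the greedy heuristic shown is the classic longest-processing-time-first rule from scheduling, used here only as one possible static balancing strategy.

```python
import heapq


def response_time(assignment):
    """Response time of parallel tasks: the load of the busiest processor."""
    return max(sum(tasks) for tasks in assignment)


def round_robin(costs, n_procs):
    """Static assignment that ignores cost: task i goes to processor i mod n_procs."""
    assignment = [[] for _ in range(n_procs)]
    for i, cost in enumerate(costs):
        assignment[i % n_procs].append(cost)
    return assignment


def greedy_lpt(costs, n_procs):
    """Assign tasks, largest first, to the currently least loaded processor."""
    heap = [(0, p) for p in range(n_procs)]  # (current load, processor id)
    assignment = [[] for _ in range(n_procs)]
    for cost in sorted(costs, reverse=True):
        load, p = heapq.heappop(heap)
        assignment[p].append(cost)
        heapq.heappush(heap, (load + cost, p))
    return assignment


# Hypothetical skewed task costs (e.g., partitions of very different sizes).
costs = [9, 1, 8, 1, 7, 1, 6, 1]

print(response_time(round_robin(costs, 2)))  # 30: one processor gets all large tasks
print(response_time(greedy_lpt(costs, 2)))   # 17: equal to total work / 2
```

With the same total work of 34 cost units, the unbalanced assignment finishes in 30 units while the balanced one finishes in 17, which is why skew in task sizes dominates parallel response time in this simplified setting.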

Copyright information

© Springer Science+Business Media LLC 2017

Authors and Affiliations

  1. INRIA Saclay & UVSQ, Le Chesnay, France

Section editors and affiliations

  • Patrick Valduriez (1)
  1. INRIA, LINA, Nantes, France