Parallel query processing
In this chapter, we have examined the issues and techniques encountered in parallel query processing. The methods used to exploit query parallelism fall into three categories: intra-operator, inter-operator, and inter-query parallelism. We have concentrated on join operations because they are the most expensive operations to execute, especially as database size and query complexity increase. In intra-operator parallelism, the major issue is task creation, where the objective is to split an operation into tasks so that the load can be spread evenly across a given number of processors. In the presence of data skew in the join attribute, naive range splitting or hashing does not suffice to balance the load. Inter-operator parallelism can be achieved either through parallel execution of independent operations or through pipelining. In either case, the major issues are join sequence selection and processor allocation for each operation. Join sequence selection determines the precedence relations among the operations. For inter-query parallelism, the issue is again processor allocation, but now among multiple queries. We explored query parallelism based on a hierarchical approach under a unified framework, so that the potential integration of the techniques used to address each type of parallelism could be illustrated.
Keywords: Execution Time, Query Plan, Query Tree, Processor Allocation, Good Efficiency Point
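The remark in the abstract that naive hashing does not balance the load under data skew can be illustrated with a small sketch. It is not taken from the chapter: the function names `hash_partition` and `imbalance`, the modulo hash, the four processors, and the synthetic key distributions are all assumptions chosen for illustration.

```python
def hash_partition(keys, num_processors):
    """Count how many tuples each processor receives when tuples are
    routed by a simple modulo hash of the join-attribute value
    (hypothetical hash function, chosen for determinism)."""
    load = [0] * num_processors
    for key in keys:
        load[key % num_processors] += 1
    return load


def imbalance(load):
    """Ratio of the heaviest load to the average load (1.0 = balanced)."""
    return max(load) / (sum(load) / len(load))


# Uniform join attribute: 1000 distinct values, one tuple each.
uniform_keys = list(range(1000))

# Skewed join attribute: the value 7 is carried by half of the 1000 tuples.
skewed_keys = [7] * 500 + list(range(500))

uniform_load = hash_partition(uniform_keys, 4)
skewed_load = hash_partition(skewed_keys, 4)

print(uniform_load, imbalance(uniform_load))
print(skewed_load, imbalance(skewed_load))
```

Under hashing, all tuples that share a join-attribute value must land on the same processor, so a single heavy value cannot be spread out. In this sketch the processor that receives value 7 ends up with several times the load of the others, which is the situation that skew-aware task creation is meant to avoid.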