A hash partition strategy for distributed query processing

  • Chengwen Liu
  • Hao Chen
Performance
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1057)

Abstract

This paper describes a hash partitioning strategy for distributed query processing in a multi-database environment in which relations are unfragmented and replicated. Methods and efficient algorithms are provided to determine the sets of relations that can be hash partitioned, the copies of the relations to be partitioned and the partition sites, how the relations are to be partitioned, and where the fragments are to be sent for processing. For a given query, there is usually more than one set of relations that can be hash partitioned. Among the alternatives, our algorithm picks the plan that gives the minimum response time. The paper also presents a simulation study that compares the hash partition strategy to the PRS strategy. The study shows that our strategy outperforms the PRS strategy.
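
The general idea of hash partitioning two relations on a join attribute can be illustrated with a minimal sketch. This is a generic hash-partitioned equijoin, not the algorithm from the paper: the function names (`hash_partition`, `local_join`, `partitioned_join`), the list-of-dicts representation of a relation, and the fixed number of processing sites are all assumptions made for illustration. The selection of copies, partition sites, and the minimum-response-time plan is not modeled here.

```python
from collections import defaultdict


def hash_partition(relation, key, num_sites):
    """Split a relation (a list of dicts) into num_sites fragments
    by hashing the join attribute. Hypothetical helper."""
    fragments = [[] for _ in range(num_sites)]
    for row in relation:
        fragments[hash(row[key]) % num_sites].append(row)
    return fragments


def local_join(left, right, key):
    """Plain in-memory hash equijoin, standing in for the work
    done at a single processing site."""
    index = defaultdict(list)
    for row in left:
        index[row[key]].append(row)
    return [{**l, **r} for r in right for l in index.get(r[key], [])]


def partitioned_join(r, s, key, num_sites):
    """Partition both relations with the same hash function, join the
    matching fragments site by site, and union the partial results."""
    r_frags = hash_partition(r, key, num_sites)
    s_frags = hash_partition(s, key, num_sites)
    result = []
    for site in range(num_sites):
        # Rows with equal join values always hash to the same site,
        # so each site can join its two fragments independently.
        result.extend(local_join(r_frags[site], s_frags[site], key))
    return result


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    employees = [{"dept": d % 3, "emp": f"e{d}"} for d in range(6)]
    departments = [{"dept": d, "name": f"dept{d}"} for d in range(3)]
    for row in partitioned_join(employees, departments, "dept", num_sites=2):
        print(row)
```

Because both relations are partitioned with the same hash function, the union of the per-site joins contains the same rows as a join over the unpartitioned relations, which is what allows the fragments to be processed in parallel at different sites.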

Keywords

Query Processing; Processing Site; Partition Strategy; Query Graph; Very Large Data Base
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 1996

Authors and Affiliations

  • Chengwen Liu (1)
  • Hao Chen (1)
  1. School of Computer Science, Telecommunications and Information Systems, DePaul University, Chicago, USA
