Optimization of Multi-way Join Cost Using System R* and SharesSkew

  • Conference paper
ICDSMLA 2019

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 601)

Abstract

In a distributed environment, relations are stored at different sites. To perform algebraic operations such as a join, relations must be transferred from one site to another in such a way that the total communication cost is minimized. This paper addresses the problem of computing the transmission cost using two approaches. The first uses the System R* algorithm when the data is not skewed. The second uses the SharesSkew algorithm when the data is skewed, i.e., when many tuples share the same value of a join attribute; such a value is called a Heavy Hitter (HH). The rules that each algorithm follows to perform a join are specified, and the communication cost is evaluated using a banking system as an illustration.
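
The distinction drawn above between skewed and non-skewed data can be illustrated with a toy sketch. It is not the paper's cost model and it does not implement the rules of System R* or SharesSkew. The relation names, the frequency threshold, and the use of tuple counts as the cost measure are all assumptions made for illustration.

```python
from collections import Counter


def heavy_hitters(rows, attr, threshold):
    """Return the values of `attr` that occur in more than `threshold` rows.

    The threshold is a hypothetical tuning parameter, not taken from the paper.
    """
    counts = Counter(row[attr] for row in rows)
    return {value for value, n in counts.items() if n > threshold}


def ship_whole_cost(size_r, size_s):
    """Tuples transferred if the smaller relation is shipped to the other site.

    A deliberately simple cost measure assumed for this sketch.
    """
    return min(size_r, size_s)


# Hypothetical banking data: accounts at one site, transactions at another.
accounts = [{"acct": a} for a in ("A1", "A2", "A3")]
transactions = [{"acct": "A1"}] * 6 + [{"acct": "A2"}, {"acct": "A3"}]

# Account A1 dominates the transactions relation, so it is flagged as skewed.
hh = heavy_hitters(transactions, "acct", threshold=3)

# Without skew handling, shipping the smaller relation moves the fewest tuples.
cost = ship_whole_cost(len(accounts), len(transactions))
```

In this sketch a value flagged by `heavy_hitters` is one that a skew-aware strategy would treat separately, while `ship_whole_cost` stands in for the simplest non-skewed case of moving one whole relation to the other site.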

Author information

Corresponding author

Correspondence to Chittem Leela Krishna.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Krishna, C.L., Reddy, P.V.S. (2020). Optimization of Multi-way Join Cost Using System R* and SharesSkew. In: Kumar, A., Paprzycki, M., Gunjan, V. (eds) ICDSMLA 2019. Lecture Notes in Electrical Engineering, vol 601. Springer, Singapore. https://doi.org/10.1007/978-981-15-1420-3_5
