# Flexible partitioning for selective binary theta-joins in a massively parallel setting

## Abstract

Efficient join processing plays an important role in big data analysis. In this work, we focus on generic theta joins in massively parallel environments such as MapReduce and Spark. Theta joins are notoriously slow due to their inherent quadratic complexity, even when their selectivity is low, e.g., 1%. The main performance bottleneck differs between cases and stems from one or more of the following factors: the amount of data being shuffled, the memory load on reducers, or the computation load on reducers. We propose an ensemble-based partitioning approach that tackles all three aspects. In this way, we save communication cost, better respect the memory and computation limitations of reducers, and reduce the total execution time overall. The key idea behind our partitioning is to cluster join key values using two techniques, namely matrix re-arrangement and agglomerative clustering. These techniques can run either in isolation or in combination. We present thorough experimental results using both band queries on real data and arbitrary synthetic predicates. We show that we can save up to 45% of the communication cost and reduce the computation load of a single reducer by up to 50% in band queries, whereas the savings are up to 74% and 80%, respectively, in queries with arbitrary theta predicates. Apart from being effective, our approach has potential benefits that can be estimated before execution from metadata, which allows for informed partitioning decisions. Finally, our solutions are flexible in that they can account for any weighted combination of the three bottleneck factors.
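The three bottleneck factors can be made concrete with a small sketch. It is an illustration only, not the partitioning algorithm of the paper: it builds a boolean join matrix over distinct key values for a band predicate, splits rows and columns into fixed groups, and reports per-region input size and candidate-pair count. The names `join_matrix` and `region_costs`, the unit weight per key value, and the hand-picked grouping are all assumptions made for this example.

```python
from itertools import product


def join_matrix(r_keys, s_keys, theta):
    """Boolean matrix M[i][j] = theta(r_keys[i], s_keys[j])."""
    return [[theta(r, s) for s in s_keys] for r in r_keys]


def region_costs(matrix, row_groups, col_groups):
    """Cost of each (row group, column group) region that holds candidates.

    Assumption: every key value stands for exactly one tuple, so the input
    shipped to a region is simply len(rows) + len(cols).
    """
    costs = []
    for rows, cols in product(row_groups, col_groups):
        candidates = sum(matrix[i][j] for i in rows for j in cols)
        if candidates:  # a region without candidate pairs needs no reducer
            costs.append({"input": len(rows) + len(cols),
                          "candidates": candidates})
    return costs


# Band predicate |r - s| <= 1 over eight key values on each side.
r_keys = list(range(8))
s_keys = list(range(8))
m = join_matrix(r_keys, s_keys, lambda r, s: abs(r - s) <= 1)

# Hypothetical grouping of key positions into two clusters per side.
groups = [[0, 1, 2, 3], [4, 5, 6, 7]]
costs = region_costs(m, groups, groups)

total_shuffled = sum(c["input"] for c in costs)      # communication cost
max_memory = max(c["input"] for c in costs)          # memory load, worst reducer
max_compute = max(c["candidates"] for c in costs)    # computation load, worst reducer
print(total_shuffled, max_memory, max_compute)
```

In this toy setting, a grouping that keeps candidate cells inside fewer regions lowers `total_shuffled`, while smaller groups lower `max_memory` and `max_compute`. A weighted sum of the three quantities is one way to compare alternative groupings.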

## Keywords

Theta-joins, Query processing, MapReduce, Spark

## Notes

### Acknowledgements

We would like to thank Jordi Torres, Rubèn Tous and Carlos Tripiana from the Barcelona Supercomputing Center for their help in running the Spark experiments.
