Distributed and Parallel Databases, Volume 36, Issue 2, pp 301–337

Flexible partitioning for selective binary theta-joins in a massively parallel setting

  • Ioannis Koumarelas
  • Athanasios Naskos
  • Anastasios Gounaris


Efficient join processing plays an important role in big data analysis. In this work, we focus on generic theta joins in massively parallel environments such as MapReduce and Spark. Theta joins are notoriously slow due to their inherent quadratic complexity, even when their selectivity is low, e.g., 1%. The main performance bottleneck differs between cases and stems from one or more of the following factors: the amount of data being shuffled, the memory load on reducers, or the computation load on reducers. We propose an ensemble-based partitioning approach that tackles all three aspects. In this way, we save communication cost, better respect the memory and computation limitations of reducers, and, overall, reduce the total execution time. The key idea behind our partitioning is to cluster join key values using two techniques, namely matrix re-arrangement and agglomerative clustering. These techniques can run either in isolation or in combination. We present thorough experimental results using both band queries on real data and arbitrary synthetic predicates. We show that we can save up to 45% of the communication cost and reduce the computation load of a single reducer by up to 50% in band queries, whereas the savings are up to 74% and 80%, respectively, in queries with arbitrary theta predicates. Apart from being effective, our approach allows its potential benefits to be estimated before execution from metadata, which enables informed partitioning decisions. Finally, our solutions are flexible in that they can account for any weighted combination of the three bottleneck factors.
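To make the three bottleneck factors concrete, the following is a minimal Python sketch and not the authors' algorithm. It assumes a toy cost model in which the join is represented as a binary matrix over distinct key values, the matrix is cut into a grid of blocks, and every block containing at least one candidate pair is handled by its own reducer. The function names (`band_join_matrix`, `partition_cost`), the grid layout, and the way the three factors are counted are all illustrative assumptions; the matrix re-arrangement and agglomerative clustering techniques of the paper are not reproduced here.

```python
from itertools import product


def band_join_matrix(s_keys, t_keys, width):
    """Join matrix for the band predicate |s - t| <= width (1 = candidate pair)."""
    return [[1 if abs(s - t) <= width else 0 for t in t_keys] for s in s_keys]


def partition_cost(matrix, row_groups, col_groups, weights=(1.0, 1.0, 1.0)):
    """Toy cost of a grid partitioning of the join matrix (hypothetical model).

    Each (row group, column group) block that contains at least one candidate
    cell is assumed to be handled by its own reducer, which receives every key
    of both groups and compares all of their pairs.
    """
    shuffled = 0   # total keys sent to reducers (communication)
    max_input = 0  # largest input of any single reducer (memory)
    max_pairs = 0  # most pairs compared by any single reducer (computation)
    for rows, cols in product(row_groups, col_groups):
        if not any(matrix[r][c] for r in rows for c in cols):
            continue  # no joining pair in this block, so no reducer is needed
        reducer_input = len(rows) + len(cols)
        shuffled += reducer_input
        max_input = max(max_input, reducer_input)
        max_pairs = max(max_pairs, len(rows) * len(cols))
    w_comm, w_mem, w_cpu = weights
    total = w_comm * shuffled + w_mem * max_input + w_cpu * max_pairs
    return {"shuffled": shuffled, "max_input": max_input,
            "max_pairs": max_pairs, "weighted": total}


keys = [1, 2, 3, 4]
matrix = band_join_matrix(keys, keys, width=1)

# One reducer receives everything versus a 2x2 grid of blocks.
one_reducer = partition_cost(matrix, [[0, 1, 2, 3]], [[0, 1, 2, 3]])
four_blocks = partition_cost(matrix, [[0, 1], [2, 3]], [[0, 1], [2, 3]])
print(one_reducer)
print(four_blocks)
```

In this toy setting the 2x2 grid shuffles twice as many keys (16 instead of 8) but cuts the largest reducer input from 8 to 4 and the largest number of compared pairs from 16 to 4. Changing `weights` shifts which partitioning scores better, which is the kind of weighted trade-off among the three factors that the abstract describes.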


Keywords: Theta-joins, Query processing, MapReduce, Spark



We would like to thank Jordi Torres, Rubèn Tous and Carlos Tripiana from the Barcelona Supercomputing Center for their help in running the Spark experiments.



Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  • Ioannis Koumarelas (1)
  • Athanasios Naskos (2)
  • Anastasios Gounaris (2)

  1. Department of Informatics, Hasso-Plattner-Institut, Potsdam, Germany
  2. Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
