Malleable scheduling for flows of jobs and applications to MapReduce

Abstract

This paper provides a unified family of algorithms with performance guarantees for malleable scheduling problems on flows. A flow represents a set of jobs with precedence constraints. Each job has a speedup function that governs the rate at which work is done on the job as a function of the number of processors allocated to it. In our setting, each speedup function is linear up to some job-specific processor maximum. A key aspect of malleable scheduling is that the number of processors allocated to any job is allowed to vary with time. The overall objective is to minimize either the total cost (minisum) or the maximum cost (minimax) of the flows. Our approach handles a very general class of cost functions, and in particular provides the first constant-factor approximation algorithms for total and maximum weighted completion time. Our motivation for this work was scheduling in MapReduce, and we also provide experimental evaluations that show good practical performance.
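The speedup model described in the abstract can be illustrated with a minimal sketch. It assumes a single job with a fixed amount of work and a piecewise-constant processor allocation. The names `speedup`, `completion_time`, `max_processors`, and the `(duration, processors)` encoding are hypothetical choices made for this illustration and are not taken from the paper.

```python
from typing import List, Optional, Tuple


def speedup(processors: float, max_processors: float) -> float:
    """Rate of work: linear in the allocation, capped at the job's maximum.

    Illustrative assumption: one processor performs one unit of work
    per unit of time.
    """
    return min(processors, max_processors)


def completion_time(
    work: float,
    max_processors: float,
    allocation: List[Tuple[float, float]],
) -> Optional[float]:
    """Return the time at which the job finishes, or None if it never does.

    `allocation` is a sequence of (duration, processors) pieces applied in
    order, which models the time-varying allocation that makes the
    schedule malleable.
    """
    elapsed = 0.0
    remaining = work
    for duration, processors in allocation:
        rate = speedup(processors, max_processors)
        # The job finishes inside this piece if the piece can absorb
        # all of the remaining work.
        if rate > 0 and remaining <= rate * duration:
            return elapsed + remaining / rate
        remaining -= rate * duration
        elapsed += duration
    return None


# A job with 12 units of work that can use at most 4 processors.
# The second piece offers 8 processors, but only 4 are useful.
schedule = [(2.0, 2.0), (1.0, 8.0), (5.0, 1.0)]
print(completion_time(12.0, 4.0, schedule))  # 7.0
```

In this example the job completes 4 units of work in the first piece, 4 more in the second (the cap limits the rate to 4), and the last 4 units at rate 1, finishing at time 2 + 1 + 4 = 7. Allocating more than the job-specific maximum does not help, which is why the maximum matters when processors are shared among many jobs.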


Notes

  1. Hadoop: https://hadoop.apache.org. Accessed 4 Aug 2018.

  2. IBM BigInsights: www-01.ibm.com/software/data/infosphere/biginsights/. Accessed 4 Aug 2018.

  3. Gridmix: https://hadoop.apache.org/docs/r1.2.1/gridmix.html. Accessed 4 Aug 2018.

  4. Facebook: https://code.facebook.com/posts/229861827208629/scaling-the-facebook-data-warehouse-to-300-pb/. Accessed 4 Aug 2018.

  5. The \(\tilde{O}\) notation ignores logarithmic factors.

  6. IBM BigInsights: www-01.ibm.com/software/data/infosphere/biginsights/. Accessed 4 Aug 2018.

  7. Gridmix: https://hadoop.apache.org/docs/r1.2.1/gridmix.html. Accessed 4 Aug 2018.

Author information

Corresponding author

Correspondence to Viswanath Nagarajan.

Cite this article

Nagarajan, V., Wolf, J., Balmin, A. et al. Malleable scheduling for flows of jobs and applications to MapReduce. J Sched 22, 393–411 (2019). https://doi.org/10.1007/s10951-018-0576-y

Keywords

  • Parallel scheduling
  • Precedence constraints
  • Approximation algorithms
  • MapReduce