The Commutativity Problem of the MapReduce Framework: A Transducer-Based Approach

  • Yu-Fang ChenEmail author
  • Lei Song
  • Zhilin Wu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9780)


MapReduce is a popular programming model for data parallel computation. In MapReduce, the reducer produces an output from a list of inputs. Due to the scheduling policy of the platform, the inputs may arrive at the reducers in different order. The commutativity problem of reducers asks if the output of a reducer is independent of the order of its inputs. Although the problem is undecidable in general, the MapReduce programs in practice are usually used for data analytics and thus require very simple control flow. By exploiting the simplicity, we propose a programming language for reducers where the commutativity problem is decidable. The main idea of the reducer language is to separate the control and data flow of programs and disallow arithmetic operations in the control flow. The decision procedure for the commutativity problem is obtained through a reduction to the equivalence problem of streaming numerical transducers (SNTs), a novel automata model over infinite alphabets introduced in this paper. The design of SNTs is inspired by streaming transducers (Alur and Cerny, POPL 2011). Nevertheless, the two models are intrinsically different since the outputs of SNTs are integers while those of streaming transducers are data words. The decidability of the equivalence of SNTs is achieved with an involved combinatorial analysis of the evolvement of the values of the integer variables during the runs of SNTs.


Decision Procedure Reducer Program Transition Graph Simple Cycle Control Flow Graph 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Yu-Fang Chen is partially supported by the MOST project No. 103-2221-E-001-019-MY3. Zhilin Wu is partially supported by the NSFC grants No. 61100062, 61272135, 61472474, and 61572478.


  1. 1.
    Alur, R., Cerny, P.: Streaming transducers for algorithmic verification of single-pass list-processing programs. In: POPL, pp. 599–610. ACM (2011)Google Scholar
  2. 2.
    Alur, R., Antoni, L.D, Deshmukh, J., Raghothaman, M., Yuan, Y.: Regular functions and cost register automata. In: LICS, pp. 13–22 (2013)Google Scholar
  3. 3.
    Chen, Y.F., Hong, C.D., Sinha, N., Wang, B.Y.: Commutativity of reducers. In: Baier, C., Tinelli, C. (eds.) TACAS 2015. LNCS, vol. 9035, pp. 131–146. Springer, Heidelberg (2015)Google Scholar
  4. 4.
    Chen, Y., Lei, S., Wu, Z.: The commutativity problem of the mapreduce framework: a transducer-based approach. CoRR, abs/1605.01497 (2016)Google Scholar
  5. 5.
    Comon, H., Jurski, Y.: Multiple counters automata, safety analysis and presburger arithmetic. In: Hu, A.J., Vardi, M.Y. (eds.) CAV 1998. LNCS, vol. 1427, pp. 268–279. Springer, Heidelberg (1998)CrossRefGoogle Scholar
  6. 6.
    Finkel, A., Göller, S., Haase, C.: Reachability in register machines with polynomial updates. In: Chatterjee, K., Sgall, J. (eds.) MFCS 2013. LNCS, vol. 8087, pp. 409–420. Springer, Heidelberg (2013)CrossRefGoogle Scholar
  7. 7.
    Haase, C., Halfon, S.: Integer vector addition systems with states. In: Ouaknine, J., Potapov, I., Worrell, J. (eds.) RP 2014. LNCS, vol. 8762, pp. 112–124. Springer, Heidelberg (2014)Google Scholar
  8. 8.
  9. 9.
    Ibarra, O.H.: Reversal-bounded multicounter machines and their decision problems. J. ACM 25(1), 116–133 (1978)MathSciNetCrossRefzbMATHGoogle Scholar
  10. 10.
    Kaminski, M., Francez, N.: Finite-memory automata. Theor. Comput. Sci. 134(2), 329–363 (1994)MathSciNetCrossRefzbMATHGoogle Scholar
  11. 11.
    Leroux, J., Sutre, G.: Flat counter automata almost everywhere! In: Software Verification: Infinite-State Model Checking and Static Program Analysis. Dagstuhl Seminar Proceedings, vol. 6081 (2006)Google Scholar
  12. 12.
    Minsky, M.L.: Computation: Finite and Infinite Machines. Prentice-Hall, Englewood Cliffs (1971)zbMATHGoogle Scholar
  13. 13.
    Neven, F., Schweikardt, N., Servais, F., Tan, T.: Distributed streaming with finite memory. In: ICDT, pp. 324–341 (2015)Google Scholar
  14. 14.
    Neven, F., Schwentick, T., Vianu, V.: Finite state machines for strings over infinite alphabets. ACM Trans. Comput. Logic 5(3), 403–435 (2004)MathSciNetCrossRefGoogle Scholar
  15. 15.
    Reinhardt, K.: Reachability in petri nets with inhibitor arcs. Electron. Notes Theor. Comput. Sci. 223, 239–264 (2008). RPCrossRefzbMATHGoogle Scholar
  16. 16.
  17. 17.
    Veanes, M., Hooimeijer, P., Livshits, B., Molnar, D., Bjorner, N.: Symbolic finite state transducers: algorithms and applications. ACM SIGPLAN Not. 47(1), 137–150 (2012)CrossRefzbMATHGoogle Scholar
  18. 18.
    Xiao, T., Zhang, J., Zhou, H., Guo, Z., McDirmid, S., Lin, W., Chen, W., Zhou, L.: Nondeterminism in mapreduce considered harmful? An empirical study on non-commutative aggregators in mapreduce programs. In: ICSE, pp. 44–53 (2014)Google Scholar

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. 1.Institute of Information ScienceAcademia SinicaTaipeiTaiwan
  2. 2.State Key Laboratory of Computer ScienceInstitute of Software, Chinese Academy of SciencesBeijingChina

Personalised recommendations