Skip to main content

PGAS Approach to Implement Mapreduce Framework Based on UPC Language

  • Conference paper
  • First Online:

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 10421))

Abstract

Over the years from its introduction Mapreduce technology proved to be very effective parallel programming technique to process large volumes of data. One of the most prevalent implementations of Mapreduce is Hadoop framework and Google proprietary Mapreduce system.

Out of other notable implementations one should mention recent PGAS (partitioned global address space) – based X10, UPC (Unified Parallel C) versions. These implementations present a new viewpoint when Mapreduce application developers can benefit from using global address space model while writing data parallel tasks. In this paper we introduce a novel UPC implementation of Mapreduce technology based on idea of using purely UPC based implementation of shared hashmap data structure as an intermediate key/value store. Shared hashmap is used in to perform exchange of key/values between parallel UPC threads during shuffle phase of Mapreduce framework. The framework also allows to express data parallel applications using simple sequential code.

Additionally, we present a heuristic approach based on genetic algorithm that could efficiently perform load balancing optimization to distribute key/values among threads such that we minimize data movement operations and evenly distribute computational workload.

Results of evaluation of Mapreduce on UPC framework based on WordCount benchmark application are presented and compared to Apache Hadoop implementation.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

References

  1. Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. In: Sixth Symposium on Operating System Design and Implementation (OSDI2004), p. 10. USENIX Association, San Francisco (2004)

    Google Scholar 

  2. Carlson, W.W., Draper, J.M., Culler, D.E., Yelick, K., Brooks, E., Warren, K.: Introduction to UPC and language specification. Technical report, IDA Center for Computing Sciences (1999)

    Google Scholar 

  3. Teijeiro, C., Taboada, G.L., Tourino, J., Doallo, R.: Design and implementation of Mapreduce using the PGAS programming model with UPC. In: 17th International Conference on Parallel and Distributed Systems (ICPADS 2011), pp. 196–203. IEEE Computer Society, Washington (2011). doi:10.1109/ICPADS.2011.162

  4. Dong, H., Zhou, S., Grove, D.: X10-enabled MapReduce. In: 4th Conference on Partitioned Global Address Space Programming Model (PGAS 2010), pp. 1–6. ACM, New York (2010). doi:10.1145/2020373.2020382

  5. Man, K.F., Tang, K.S., Kwong, S.: Genetic algorithms: Concepts and applications. IEEE Trans. Industr. Electron. 43(5), 519–534 (1996). doi:10.1109/41.538609

    Article  Google Scholar 

  6. Chu, P.C., Beasley, J.E.: A genetic algorithm for the generalised assignment problem. Comput. Oper. Res. 24(1), 17–23 (1997). doi:10.1016/S0305-0548(96)00032-9

    Article  MathSciNet  MATH  Google Scholar 

  7. Liu, Y.Y., Wang, S.: A scalable parallel genetic algorithm for the generalized Assignment Problem. Parallel Comput. 46, 98–119 (2015). doi:10.1016/j.parco.2014.04.008

    Article  MathSciNet  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Shomanov Aday .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Aday, S., Darkhan, AZ., Madina, M. (2017). PGAS Approach to Implement Mapreduce Framework Based on UPC Language. In: Malyshkin, V. (eds) Parallel Computing Technologies. PaCT 2017. Lecture Notes in Computer Science(), vol 10421. Springer, Cham. https://doi.org/10.1007/978-3-319-62932-2_33

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-62932-2_33

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-62931-5

  • Online ISBN: 978-3-319-62932-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics