Abstract
From automatically translating documents to analyzing electoral voting patterns; from computing personalized movie recommendations to predicting flu epidemics: all of these tasks are possible due to the success and proliferation of the MapReduce parallel programming paradigm. Yet almost ten years after the system was introduced, we still do not have a good understanding of what problems can and cannot be efficiently computed in MapReduce.
In this talk I will give an overview of the MapReduce framework and explain its connections to both Valiant’s Bulk Synchronous Parallel (BSP) model and the classical PRAM model of parallel computing. To demonstrate the power of the MapReduce model, I will present the Sample and Prune approach, which finds an approximate coreset of manageable size, thereby reducing the problem from the realm of ‘Big Data’ to that of ‘Small Data.’
I will conclude by discussing other considerations that make a large difference when working with MapReduce in practice, but have so far resisted a careful theoretical analysis.
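For readers unfamiliar with the framework the talk surveys, a single MapReduce round can be sketched as three phases: map, shuffle (group by key), and reduce. The snippet below is a minimal in-memory illustration only; the names `run_round`, `count_words_mapper`, and `sum_reducer` are hypothetical and are not drawn from the talk or from any MapReduce library, and a real deployment would distribute each phase across many machines.

```python
from collections import defaultdict


def run_round(records, mapper, reducer):
    """Simulate one MapReduce round on a single machine (illustrative only)."""
    # Map phase: each input record independently emits (key, value) pairs.
    # Shuffle phase: values are grouped by key as they are emitted.
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)

    # Reduce phase: each key and its list of values is processed independently.
    # Keys are visited in sorted order only to make the output deterministic.
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output


def count_words_mapper(line):
    # Hypothetical mapper: emit (word, 1) for every word in a line of text.
    for word in line.lower().split():
        yield word, 1


def sum_reducer(key, values):
    # Hypothetical reducer: add up all counts emitted for one word.
    yield key, sum(values)


docs = ["map reduce map", "reduce shuffle"]
counts = run_round(docs, count_words_mapper, sum_reducer)
print(counts)
```

Because mappers see one record at a time and reducers see one key at a time, both phases can run in parallel; the synchronization happens only at the shuffle, which is what ties the model to round-based models such as BSP.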
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vassilvitskii, S. (2013). MapReduce Algorithmics. In: Dehne, F., Solis-Oba, R., Sack, JR. (eds) Algorithms and Data Structures. WADS 2013. Lecture Notes in Computer Science, vol 8037. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40104-6_45
DOI: https://doi.org/10.1007/978-3-642-40104-6_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40103-9
Online ISBN: 978-3-642-40104-6
eBook Packages: Computer Science, Computer Science (R0)