Encyclopedia of Big Data Technologies

Living Edition
Editors: Sherif Sakr, Albert Zomaya

Advancements in YARN Resource Manager

  • Konstantinos Karanasos
  • Arun Suresh
  • Chris Douglas
Living reference work entry
DOI: https://doi.org/10.1007/978-3-319-63962-8_207-1

Definitions

YARN is currently one of the most popular frameworks for scheduling jobs and managing resources in shared clusters. In this entry, we focus on the new features introduced in YARN since its initial version.

Overview

Apache Hadoop (2017), one of the most widely adopted implementations of MapReduce (Dean and Ghemawat 2004), revolutionized the way that companies perform analytics over vast amounts of data. It enables parallel data processing over clusters comprising thousands of machines while relieving the user of the need to implement complex communication patterns and fault tolerance mechanisms.

As Hadoop rose in popularity, it became clear that its resource model for MapReduce, albeit flexible, is not suitable for every application, especially those relying on low-latency or iterative computations. This motivated decoupling the cluster resource management infrastructure from specific programming models...

Notes

Acknowledgements

The authors would like to thank Subru Krishnan and Carlo Curino for their feedback while preparing this entry. We would also like to thank the diverse community of developers, operators, and users that have contributed to Apache Hadoop YARN since its inception.

References

  1. Apache Hadoop (2017) Apache Hadoop. http://hadoop.apache.org
  2. Apache HBase (2017) Apache HBase. http://hbase.apache.org
  3. Apache Slider (2017) Apache Slider (incubating). http://slider.incubator.apache.org
  4. Burd R, Sharma H, Sakalanaga S (2017) Lessons learned from scaling YARN to 40 K machines in a multi-tenancy environment. In: DataWorks Summit, San Jose
  5. Curino C, Difallah DE, Douglas C, Krishnan S, Ramakrishnan R, Rao S (2014) Reservation-based scheduling: if you’re late don’t blame us! In: ACM symposium on cloud computing (SoCC)
  6. Dean J, Ghemawat S (2004) MapReduce: simplified data processing on large clusters. In: USENIX symposium on operating systems design and implementation (OSDI)
  7. Distributed scheduling (2017) Extend YARN to support distributed scheduling. https://issues.apache.org/jira/browse/YARN-2877
  8. Ghodsi A, Zaharia M, Hindman B, Konwinski A, Shenker S, Stoica I (2011) Dominant resource fairness: fair allocation of multiple resource types. In: USENIX symposium on networked systems design and implementation (NSDI)
  9. HDFS Federation (2017) Router-based HDFS federation. https://issues.apache.org/jira/browse/HDFS-10467
  10. Jyothi SA, Curino C, Menache I, Narayanamurthy SM, Tumanov A, Yaniv J, Mavlyutov R, Goiri I, Krishnan S, Kulkarni J, Rao S (2016) Morpheus: towards automated SLOs for enterprise clusters. In: USENIX symposium on operating systems design and implementation (OSDI)
  11. Karanasos K, Rao S, Curino C, Douglas C, Chaliparambil K, Fumarola GM, Heddaya S, Ramakrishnan R, Sakalanaga S (2015) Mercury: hybrid centralized and distributed scheduling in large shared clusters. In: USENIX annual technical conference (USENIX ATC)
  12. Node Labels (2017) Allow for (admin) labels on nodes and resource-requests. https://issues.apache.org/jira/browse/YARN-796
  13. Opportunistic Scheduling (2017) Scheduling of opportunistic containers through YARN RM. https://issues.apache.org/jira/browse/YARN-5220
  14. OrgQueue (2017) OrgQueue for easy CapacityScheduler queue configuration management. https://issues.apache.org/jira/browse/YARN-5734
  15. Placement Constraints (2017) Rich placement constraints in YARN. https://issues.apache.org/jira/browse/YARN-6592
  16. Rasley J, Karanasos K, Kandula S, Fonseca R, Vojnovic M, Rao S (2016) Efficient queue management for cluster scheduling. In: European conference on computer systems (EuroSys)
  17. Resource Profiles (2017) Extend the YARN resource model for easier resource-type management and profiles. https://issues.apache.org/jira/browse/YARN-3926
  18. Utilization-Based Scheduling (2017) Schedule containers based on utilization of currently allocated containers. https://issues.apache.org/jira/browse/YARN-1011
  19. Vavilapalli VK, Murthy AC, Douglas C, Agarwal S, Konar M, Evans R, Graves T, Lowe J, Shah H, Seth S, Saha B, Curino C, O’Malley O, Radia S, Reed B, Baldeschwieler E (2013) Apache Hadoop YARN: yet another resource negotiator. In: ACM symposium on cloud computing (SoCC)
  20. YARN Federation (2017) Enable YARN RM scale out via federation using multiple RMs. https://issues.apache.org/jira/browse/YARN-2915
  21. YARN JIRA (2017) Apache JIRA issue tracker for YARN. https://issues.apache.org/jira/browse/YARN
  22. YARN TS v2 (2017) YARN timeline service v.2. https://issues.apache.org/jira/browse/YARN-5355

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Konstantinos Karanasos (1)
  • Arun Suresh (1)
  • Chris Douglas (1)
  1. Microsoft, Washington, USA

Section editors and affiliations

  • Asterios Katsifodimos (1)
  • Pramod Bhatotia (2)
  1. Delft University of Technology, Delft, Netherlands
  2. School of Informatics, University of Edinburgh, Edinburgh, United Kingdom