Physics of Particles and Nuclei Letters, Volume 13, Issue 5, pp 647–653

Integration of PanDA Workload Management System with supercomputers

  • K. De
  • S. Jha
  • A. Klimentov
  • T. Maeno
  • R. Mashinistov
  • P. Nilsson
  • A. Novikov
  • D. Oleynik
  • S. Panitkin
  • A. Poyda
  • K. F. Read
  • E. Ryabinkin
  • A. Teslyuk
  • V. Velikhov
  • J. C. Wells
  • T. Wenaus
Computer Technologies in Physics

Abstract

The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data-driven scientific exploration. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited with the discovery of a Higgs boson. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment relies on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System (WMS) to manage the workflow for all data processing at over 140 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3+ petaFLOPS, the next LHC data-taking runs will require more resources than Grid computing can possibly provide. To meet this challenge, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integrating PanDA WMS with supercomputers in the United States, Europe and Russia (in particular with the Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF), the supercomputer at the National Research Center “Kurchatov Institute”, IT4 in Ostrava, and others). The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and for local data management, with light-weight MPI wrappers to run single-threaded workloads in parallel on Titan’s multi-core worker nodes. This implementation was tested with a variety of Monte Carlo workloads on several supercomputing platforms. We will present our current accomplishments in running PanDA WMS on supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facility’s infrastructure for High Energy and Nuclear Physics, as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.
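
The idea of a light-weight MPI wrapper around single-threaded workloads can be illustrated with a minimal sketch. This is not the authors' wrapper: it only assumes that every MPI rank starts one independent copy of a single-threaded payload in its own working directory. The function names, the `PAYLOAD_RANK` environment variable and the `rank_N` directory layout are hypothetical, and `mpi4py` is treated as an optional dependency.

```python
import os
import subprocess
import sys


def get_rank_and_size():
    """Return (rank, size) from mpi4py if it is installed, else (0, 1)."""
    try:
        from mpi4py import MPI  # optional; assumed available on the HPC system
    except ImportError:
        return 0, 1
    comm = MPI.COMM_WORLD
    return comm.Get_rank(), comm.Get_size()


def run_payload(rank, command, base_dir="."):
    """Run one single-threaded payload in a private per-rank directory.

    The rank is passed to the payload through the (hypothetical)
    PAYLOAD_RANK environment variable so that each copy can pick its
    own input or random seed. Returns the payload's exit code.
    """
    workdir = os.path.join(base_dir, f"rank_{rank}")
    os.makedirs(workdir, exist_ok=True)
    env = dict(os.environ, PAYLOAD_RANK=str(rank))
    result = subprocess.run(command, cwd=workdir, env=env)
    return result.returncode


def main():
    """Entry point: each rank launches the same command independently."""
    rank, _size = get_rank_and_size()
    # Placeholder payload; a real job would start the simulation executable.
    payload = [sys.executable, "-c", "print('payload finished')"]
    return run_payload(rank, payload)
```

Started under an MPI launcher with one rank per core, a call to `main()` on each rank would run as many independent payload copies as there are ranks, while the batch system sees a single parallel job.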

Copyright information

© Pleiades Publishing, Ltd. 2016

Authors and Affiliations

  • K. De (1)
  • S. Jha (2)
  • A. Klimentov (3, 4)
  • T. Maeno (3)
  • R. Mashinistov (4)
  • P. Nilsson (3)
  • A. Novikov (4)
  • D. Oleynik (1, 6)
  • S. Panitkin (3)
  • A. Poyda (4)
  • K. F. Read (5)
  • E. Ryabinkin (4)
  • A. Teslyuk (4)
  • V. Velikhov (4)
  • J. C. Wells (5)
  • T. Wenaus (3)

  1. University of Texas at Arlington, Arlington, USA
  2. Rutgers University, Piscataway, USA
  3. Brookhaven National Laboratory, Upton, USA
  4. National Research Center “Kurchatov Institute”, Moscow, Russia
  5. Oak Ridge National Laboratory, Oak Ridge, USA
  6. Joint Institute for Nuclear Research, Dubna, Russia
