Abstract
The operation of large US Department of Energy (DOE) research facilities, such as the DIII-D National Fusion Facility, produces complex multi-dimensional scientific datasets, both experimental and model-generated. In the future, it is envisioned that integrated data analysis coupled with large-scale high performance computing (HPC) simulations will be used to improve experimental planning and operation. In practice, the massive datasets from these simulations provide the physics basis for generating both reduced semi-analytic and machine-learning-based models. Storing both HPC simulation datasets (generated at US DOE leadership computing facilities) and experimental datasets presents significant challenges. In this paper, we present a vision for a DOE-wide data management workflow that integrates US DOE fusion facilities with leadership computing facilities. Data persistence and long-term availability beyond the duration of allocated projects are essential, particularly for verification and recalibration of artificial intelligence and machine learning (AI/ML) models. Because these datasets are often generated and shared among hundreds of users across multiple leadership computing facilities, they would benefit from cross-platform accessibility, persistent identifiers (e.g., digital object identifiers, or DOIs), and provenance tracking. The need to support different data access patterns suggests that a combination of low-cost, high-latency systems (e.g., for storing ML training sets) and high-cost, low-latency systems (e.g., for real-time, integrated machine control feedback) may be required.
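As a rough illustration of the requirements listed in the abstract, the sketch below shows one way a catalog entry could carry a persistent identifier and provenance links, and how an access-latency requirement could map to a storage tier. It is not the workflow proposed in the paper: the class names, field names, dataset names, and the one-second threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class StorageTier(Enum):
    ARCHIVE = "low-cost, high-latency"   # e.g. ML training sets
    FAST = "high-cost, low-latency"      # e.g. real-time control feedback


@dataclass
class DatasetRecord:
    """Hypothetical catalog entry; all field names are illustrative."""
    name: str
    origin: str                      # "experiment" or "simulation"
    max_latency_s: float             # longest acceptable access latency
    doi: Optional[str] = None        # persistent identifier, once assigned
    # Provenance: names of the datasets this one was derived from.
    derived_from: List[str] = field(default_factory=list)


def choose_tier(record: DatasetRecord, fast_threshold_s: float = 1.0) -> StorageTier:
    """Use the fast tier only when the consumer needs low latency.

    The threshold is an assumed value, not one taken from the paper.
    """
    if record.max_latency_s <= fast_threshold_s:
        return StorageTier.FAST
    return StorageTier.ARCHIVE


# Hypothetical example records.
training_set = DatasetRecord(
    name="turbulence-training-set",
    origin="simulation",
    max_latency_s=3600.0,
    derived_from=["gyrokinetic-run-001"],
)
control_feed = DatasetRecord(
    name="control-feedback-stream",
    origin="experiment",
    max_latency_s=0.01,
)

print(choose_tier(training_set).name)   # ARCHIVE
print(choose_tier(control_feed).name)   # FAST
```

Keeping the latency requirement on the record, rather than hard-coding a tier, means the placement rule can change later without rewriting the catalog.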
Acknowledgments
This work was supported by the U.S. Department of Energy under awards DE-FC02-04ER54698, DE-FG02-95ER54309, DE-FC02-06ER54873, and DE-SC0017992. Some material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Fusion Energy Sciences, using the DIII-D National Fusion Facility, a DOE Office of Science user facility, under Award(s) DE-FC02-04ER54698. Computing resources were also provided by the National Energy Research Scientific Computing Center, which is an Office of Science User Facility supported under Contract DE-AC02-05CH11231. An award of computer time was also provided by the INCITE program. This research used resources of the Oak Ridge Leadership Computing Facility, which is an Office of Science User Facility supported under Contract DE-AC05-00OR22725. This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Smith, S. et al. (2022). A Vision for Coupling Operation of US Fusion Facilities with HPC Systems and the Implications for Workflows and Data Management. In: Doug, K., Al, G., Pophale, S., Liu, H., Parete-Koon, S. (eds) Accelerating Science and Engineering Discoveries Through Integrated Research Infrastructure for Experiment, Big Data, Modeling and Simulation. SMC 2022. Communications in Computer and Information Science, vol 1690. Springer, Cham. https://doi.org/10.1007/978-3-031-23606-8_6
DOI: https://doi.org/10.1007/978-3-031-23606-8_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-23605-1
Online ISBN: 978-3-031-23606-8
eBook Packages: Computer Science, Computer Science (R0)