Abstract
As HPC (high-performance computing) enters the exascale era, improvements in computing performance and power efficiency have become increasingly important. Building on our previous work on enabling large-scale earthquake simulations on Sunway TaihuLight, we explore further ways to improve the application through a fully-customized hardware design on reconfigurable FPGA (field-programmable gate array) devices. We investigate the feasibility and potential benefits of a complete fixed-point design. We first perform a coarse-resolution simulation to analyze the representation range and precision needed to capture both the total energy and the energy distribution of variables over space and time. We then derive a complete fixed-point design that identifies a suitable bitwidth for each major category of variables and tracks the changing range of their values through a dynamic scaling scheme. Finally, we use the optimized fixed-point design to run a case of the Wenchuan earthquake to demonstrate the potential of supporting large-scale scientific simulations on FPGA devices. The results demonstrate that an 18-bit fixed-point design already provides a description of the seismic events in the Wenchuan scenario that is almost identical to that of a single-precision floating-point version. The design delivers sustained performance equivalent to 13.1 Intel Xeon Gold 6154 18-core CPUs or 2.10 Sunway 260-core processors, with performance per watt (power efficiency) improved by 15.3 and 3.72 times compared with the Intel Xeon Gold 6154 18-core CPUs and the Sunway 260-core processors, respectively.
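To make the idea of a fixed-point representation with dynamic scaling more concrete, the following minimal sketch quantizes a block of values to 18-bit signed integers using a power-of-two scale chosen from the block's peak magnitude, then compares the energy (sum of squares) before and after. It is an illustration under stated assumptions, not the design described in the paper: the function names (`choose_shift`, `quantize`, `dequantize`, `energy`), the per-block power-of-two scaling rule, and the synthetic decaying signal are all hypothetical choices made here for the example.

```python
import math


def choose_shift(values, bits=18):
    """Pick a power-of-two exponent so the peak magnitude fits in `bits` signed bits.

    Assumption: one shared scale per block of values (a simple form of
    dynamic scaling); the paper's actual scheme may differ.
    """
    peak = max(abs(v) for v in values)
    if peak == 0.0:
        return 0
    max_int = 2 ** (bits - 1) - 1
    # Largest shift such that peak * 2**shift stays within the integer range.
    return math.floor(math.log2(max_int / peak))


def quantize(values, bits=18):
    """Convert floats to signed fixed-point integers plus the shift that was used."""
    shift = choose_shift(values, bits)
    max_int = 2 ** (bits - 1) - 1
    scale = 2.0 ** shift
    # Round to nearest integer and saturate as a safety net against overflow.
    ints = [max(-max_int, min(max_int, round(v * scale))) for v in values]
    return ints, shift


def dequantize(ints, shift):
    """Map fixed-point integers back to floats using the stored shift."""
    scale = 2.0 ** shift
    return [i / scale for i in ints]


def energy(values):
    """Total energy of a block, taken here simply as the sum of squares."""
    return sum(v * v for v in values)


if __name__ == "__main__":
    # Synthetic decaying oscillation standing in for a wavefield variable.
    samples = [1e-3 * math.sin(0.1 * k) * math.exp(-0.01 * k) for k in range(200)]
    ints, shift = quantize(samples, bits=18)
    restored = dequantize(ints, shift)
    rel_err = abs(energy(restored) - energy(samples)) / energy(samples)
    print(f"shift = {shift}, relative energy error = {rel_err:.3e}")
```

Recomputing the shift whenever the peak magnitude of a variable changes lets a narrow integer format follow values whose range varies over space and time, which is the general motivation for a dynamic scaling scheme in this setting.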
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant Nos. 51761135015, U1839206), the Center for High Performance Computing and System Simulation, Pilot National Laboratory for Marine Science and Technology (Qingdao), Intel, Maxeler, Xilinx, and the United Kingdom EPSRC (Grant Nos. EP/L016796/1, EP/N031768/1, EP/P010040/1, EP/S030069/1). We would also like to thank Dr. Bastiaan Willem Kwaadgras and Dr. Pavel Burovskiy from Maxeler Technologies, Dr. Wenqiang ZHANG from University of Science and Technology of China, and Prof. Wei ZHANG and Prof. Xiaofei CHEN from Southern University of Science and Technology for their support.
Cite this article
Chen, B., Fu, H., Luk, W. et al. A fully-customized dataflow engine for 3D earthquake simulation with a complex topography. Sci. China Inf. Sci. 65, 152103 (2022). https://doi.org/10.1007/s11432-020-2976-5
DOI: https://doi.org/10.1007/s11432-020-2976-5