Abstract
The cross-correlation function appears in many fields that handle time-series data, and accelerating its computation is essential given the large amounts of data accumulated in recent years. The cross-correlation function can be calculated as a matrix-matrix product, so a significant speed-up can be expected from Tensor Cores, the matrix-matrix product acceleration units of the latest NVIDIA Graphics Processing Units (GPUs). In this research, we target a new precision data type called TensorFloat-32, which is available in the Ampere architecture. We develop a fast calculation method that takes into account the characteristics of the cross-correlation function and of Tensor Cores. In a performance measurement assuming seismic interferometry with actual data, our method achieved a very high performance of 53.56 TFLOPS, which is 5.97 times faster than cuBLAS, a widely used linear algebra library on NVIDIA GPUs. In addition, the accuracy of the calculation result is sufficiently high compared to the 64-bit floating-point calculation, indicating the applicability of Tensor Core operations using TensorFloat-32 to scientific calculations. Our proposed method is expected to make it possible to utilize large amounts of data more effectively in many fields.
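As an illustration of the matrix-product formulation mentioned in the abstract, the following NumPy sketch stacks sliding windows of a signal into a matrix, so that the cross-correlation of several templates at all lags becomes a single matrix-matrix product. The function names `to_tf32` and `cross_correlation_matmul` are hypothetical, the data are random, and the TensorFloat-32 rounding (10 mantissa bits on the inputs, float32 accumulation) is only emulated in software on the CPU. It is a minimal sketch under these assumptions, not the authors' Tensor Core implementation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def to_tf32(a):
    """Emulate TF32 input rounding: keep 10 of float32's 23 mantissa bits.

    Assumption: simple round-half-up on the bit pattern, valid for finite
    values only. Real hardware behavior may differ in edge cases.
    """
    bits = np.ascontiguousarray(a, dtype=np.float32).view(np.uint32)
    # Add half of the dropped range, then clear the 13 low mantissa bits.
    rounded = (bits + np.uint32(0x1000)) & np.uint32(0xFFFFE000)
    return rounded.view(np.float32)


def cross_correlation_matmul(templates, signal, emulate_tf32=False):
    """Cross-correlate each template (one per row) with signal at all valid lags.

    Row k of `windows` holds signal[k : k + m], so the product
    templates @ windows.T has entry (i, k) = sum_n templates[i, n] * signal[n + k].
    """
    windows = sliding_window_view(signal, templates.shape[1])
    if emulate_tf32:
        # Inputs rounded to TF32-like precision, product accumulated in float32.
        return to_tf32(templates) @ to_tf32(windows).T
    return templates.astype(np.float64) @ windows.astype(np.float64).T


# Hypothetical test data: 8 templates of length 64 against 4096 samples.
rng = np.random.default_rng(0)
signal = rng.standard_normal(4096)
templates = rng.standard_normal((8, 64))

ref = cross_correlation_matmul(templates, signal)
approx = cross_correlation_matmul(templates, signal, emulate_tf32=True)
rel_err = np.max(np.abs(approx - ref)) / np.max(np.abs(ref))
print(f"max error relative to peak: {rel_err:.2e}")
```

Materializing the window matrix is the simplest way to show the idea, but it duplicates the signal many times in memory. A GPU implementation would more plausibly build the windows on the fly from a single copy of the signal.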
Acknowledgment
We acknowledge support from the Japan Society for the Promotion of Science (18H05239).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kikuchi, Y., Fujita, K., Ichimura, T., Hori, M., Maddegedara, L. (2022). Calculation of Cross-correlation Function Accelerated by Tensor Cores with TensorFloat-32 Precision on Ampere GPU. In: Groen, D., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds) Computational Science – ICCS 2022. ICCS 2022. Lecture Notes in Computer Science, vol 13351. Springer, Cham. https://doi.org/10.1007/978-3-031-08754-7_37
DOI: https://doi.org/10.1007/978-3-031-08754-7_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08753-0
Online ISBN: 978-3-031-08754-7
eBook Packages: Computer Science, Computer Science (R0)