1 Introduction

The oil and gas industry relies extensively on seismic data as a critical input for understanding and visualizing the structure beneath the earth's surface and the seabed [1]. Seismic data are normally stored and encapsulated in SEG-Y format files, a global standard [2]. Figure 1 shows the complex structure of the SEG-Y format used to store seismic data in a file: trace headers of 240 bytes each alternate with the actual trace data. SEG-Y, or seismic, data usually scale to petabytes when written to files on disk. This compellingly increases the need for high-performance computing (HPC), or super-computing, systems to perform large-scale IO processing, across the oil and gas production industry as well as the research community.

Fig. 1

Structure of SEG-Y file format
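The alternating header/trace layout can be expressed as simple byte-offset arithmetic. The sketch below assumes the standard SEG-Y layout (a 3200-byte textual file header plus a 400-byte binary file header before the traces, with 4-byte samples); these constants are illustrative of the format, not taken from ExSeisDat code.

```python
# Sketch of SEG-Y byte-offset arithmetic, assuming the standard layout:
# a 3200-byte textual header and a 400-byte binary header, followed by
# alternating 240-byte trace headers and trace data (4-byte samples).
FILE_HEADER_BYTES = 3200 + 400   # textual + binary file headers
TRACE_HEADER_BYTES = 240
SAMPLE_BYTES = 4

def trace_header_offset(i, samples_per_trace):
    """Byte offset of the i-th (0-based) trace header within the file."""
    trace_block = TRACE_HEADER_BYTES + samples_per_trace * SAMPLE_BYTES
    return FILE_HEADER_BYTES + i * trace_block

def trace_data_offset(i, samples_per_trace):
    """Byte offset of the i-th trace's sample data."""
    return trace_header_offset(i, samples_per_trace) + TRACE_HEADER_BYTES

def file_size(number_of_traces, samples_per_trace):
    """Total SEG-Y file size for the given trace geometry."""
    trace_block = TRACE_HEADER_BYTES + samples_per_trace * SAMPLE_BYTES
    return FILE_HEADER_BYTES + number_of_traces * trace_block
```

Because trace data sit at these alternating offsets rather than in one contiguous block, aligned access is awkward, which is the difficulty noted below.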

Modern HPC clusters are normally well equipped to perform exascale operations in a significantly time-efficient manner. They usually contain a large number of computing nodes and parallel storage disks connected via a fast network infrastructure [3, 4]. Figure 2 shows a very basic structure of an HPC system. On the software side, a programming paradigm is also required to exploit the cluster's potential by writing and executing parallel applications; in this case, the standard Message Passing Interface (MPI) library serves the purpose [5]. The library also contains the application programming interface (API) for carrying out parallel IO processing tasks by communicating with multiple storage disks. These disks are normally controlled and managed by a parallel file system (PFS). In this research, the Lustre file system (LFS) is the PFS running on the targeted machine's disks [6].

Fig. 2

The basic view of HPC cluster system

To process SEG-Y file data, the Extreme-Scale Seismic Data (ExSeisDat) library has already been developed on top of the existing MPI-IO APIs [7, 8]. ExSeisDat contains its own parallel IO library, namely PIOL, and a Workflow library to perform seismic-data-related functionality. In spite of this parallel processing library, program execution performance degrades: the underlying MPI processes are restricted by LFS protocols from accessing a disk at the same instant in parallel. This normally occurs because data alignment is not applied or considered by users, although alignment does not guarantee maximum possible bandwidth performance either [9,10,11,12,13]. It is also tricky to apply data alignment to a SEG-Y file, as trace data are placed at alternating positions rather than consecutively; consequently, alignment is not considered in ExSeisDat functionality. This means the parallel IO bandwidth performance of ExSeisDat when processing seismic data critically relies on certain configuration parameters related to MPI, LFS and the pattern used to access data in the SEG-Y file: for example, the number of running MPI processes, the Lustre stripe count of the parallel disks, and random or contiguous access patterns. The user is therefore left to find and tune the parameter settings that improve bandwidth performance over the currently existing settings.

As our aim is to free the ExSeisDat user from the overhead of manually tuning these parameters, we propose an auto-tuning approach based on the predicted maximum IO bandwidth. This research extends our previous contribution on bandwidth prediction for SEG-Y IO and file sorting operations [9]. The first key contribution of this paper is an auto-tuning design strategy for SEG-Y IO and file sorting operations, based on bandwidth prediction over the related parameters, to optimize performance. Recent studies have shown notable advantages of predictive IO bandwidth or performance modeling through machine learning (ML) for auto-tuning parameters in different HPC problem scenarios [14,15,16,17,18]. Consequently, we chose the artificial neural network (ANN) ML technique for predictive modeling [19, 20] on the collected SEG-Y IO and sorting benchmark profiling data. The studies in [21,22,23] motivated our implementation of the ANN ML process. This has been carried out using the Python PyTorch package, which achieves significant precision in predicting outputs. Once the ANNs are trained, they are validated and applied in the auto-tuning design over the default configuration test settings.

We have tested a range of ANNs with varying numbers of nodes in the hidden layers to observe the impact on the prediction accuracy of the ANN models, which is our second key contribution. A runtime cost analysis of ANN prediction is also presented, as prediction is a core component in selecting new parameter values.

Additionally, a statistical analysis of the bandwidth values resulting from the default and auto-tuned test executions is the third key contribution. This analysis has also been conducted over the same range of ANNs with varying hidden-layer nodes, as stated earlier, in order to identify the most suitable ANN for auto-tuning the SEG-Y IO and file sorting operations.

Our research mainly emphasizes the design of an auto-tuning strategy over SEG-Y IO and file sorting configurations driven by ANN bandwidth prediction. Furthermore, it focuses on the statistical analysis of the varying overall bandwidth performance improvements across the range of ANNs.

As is evident from the results and analysis, the ANN-based auto-tuning design contributes a notable increase in SEG-Y IO and file sorting bandwidth performance. This paper is structured as follows: related work in Sect. 2, design and implementation in Sect. 3, experimental result analysis in Sect. 4, discussion in Sect. 5 and conclusion in Sect. 6.

2 Related work

ML-based predictive modeling has been an important element in previous and recent studies for handling IO performance degradation. Performance prediction is evaluated on HPC clusters before further steps such as auto-tuning specific configuration parameters. These studies have been a great source of motivation for us to adopt a prediction-based auto-tuning strategy for optimizing SEG-Y IO and file sorting performance across the LFS-managed disks of a super-computing cluster.

The work presented in [9] predicts the IO bandwidth of SEG-Y IO and file sorting configurations using ANNs and reports the prediction accuracy. However, those ANNs had only one hidden-layer setting: 256 nodes in the first hidden layer (h1) and 128 nodes in the second hidden layer (h2). The SEG-Y IO READ and WRITE prediction accuracies were 96.5% and 88.1%, respectively, whereas the SEG-Y file sorting READ and WRITE predictions yielded 77% and 80% accuracy, respectively.

The study in [14] shows that parameters such as the number of IO threads, CPU frequency and the IO scheduler impact HPC IO throughput. Using these parameters, the IO pattern behavior is determined via predictive extrapolation and interpolation modeling approaches, realized in a data analytic framework that supported exascale modeling experiments. The performance evaluation was based on computing prediction accuracies on unseen configurations of the test system. Subsequently, a mapping between Bayesian Treed Gaussian Process variability and varying regression techniques was used to optimize the system configurations. This leveraged parameter selection through statistical insights and HPC variability management.

The research in [15] handles parallel IO requests by scheduling them adaptively, tuning the time window of the executing workload within the HPC system. The adaptive scheduler is constructed through reinforcement learning. In the performance evaluation, 88% precision is achieved for runtime parameter selection once the access pattern is observed and classified. This classification is determined by deep neural networks within the first few minutes of system operation. Afterward, the system optimizes and improves its IO performance for the rest of its lifetime, as endorsed by the literature. The key aspect of this study is that it is more dynamic than other techniques, as it involves no training step.

In [17], random forest regression has been used as the ML technique for predictive bandwidth modeling of the collective MPI-WRITE operation. The accuracy of the predictions was notably high, ranging between approximately 82 and 99%, depending on the maximum depth setting. As per the literature, the training and testing datasets were significantly smaller than ours. With further variation in the data, the prediction accuracy could drop as the data volume increases.

The work in [16] optimizes parallel IO performance for HDF5 format data files. Testing was carried out on various HPC systems running LFS and the General Parallel File System (GPFS) as the parallel file systems (PFSs). In this study, auto-tuning based on IO predictions played a vital role. The predictive IO modeling was done with a nonlinear regression ML algorithm using data from IOR and other benchmarks on LFS, and yielded a notable gain in parallel IO performance by auto-tuning the newly selected parameters. In comparison, the work in [18] also demonstrates IO predictive modeling using IOR benchmarks on LFS; however, the ML approach used for training is Gaussian process regression.

The approach demonstrated in [21] predicts file access times on storage disks managed by LFS. A series of benchmarks was run to profile file access times for use in training ML models, with ANNs as the ML technique. The ANNs produced around 30% lower average prediction error than linear ML modeling techniques. The evaluation of distributed file access times was carried out with respect to the parameters used to access the file. The study also revealed that file access times are normally determined by the order of magnitude of particular IO paths.

Furthermore, other studies have explored MPI application performance optimization via ML predictive modeling followed by parameter auto-tuning; however, they do not consider the IO processing part of a parallel program [24, 25].

In contrast with existing recent research, our work stresses optimizing HPC IO performance for SEG-Y READ/WRITE and file sorting through parameter auto-tuning based on predictive IO bandwidth modeling over a range of ANNs. This provides different bandwidth performance statistics over the variation in hidden-layer nodes.

3 Design and implementation

3.1 Research methodology

This work has been carried out with a common sequence of steps for the SEG-Y IO and file sorting operations separately. The steps comprise: (1) identification of the key tunable and non-tunable configuration parameters, (2) re-execution of the SEG-Y IO and file sorting benchmarks to generate profiling data, (3) training/testing of 6 ANNs for each of the SEG-Y READ/WRITE and file sorting READ/WRITE operations (24 ANNs in total), (4) applying the auto-tuning design strategy on the basis of the generated ANNs and (5) prediction accuracy evaluation and statistical analysis of the bandwidth performance results against all ANNs. Figure 3 shows the main components of the research flow.

Fig. 3

Research flow

3.2 Key parameters identified

The SEG-Y IO and file sorting benchmarks are executed after identification of the key configuration parameters, to read/write and sort trace data in a file, respectively. The benchmark profiling datasets comprise both training and testing sets, for ANN model learning and prediction, respectively. The configuration parameters have also been used as the input features to the ANN models.

Tables 1 and 2 show the complete separate lists of the related key parameters for the SEG-Y IO and file sorting operations and their corresponding value settings. A complete and separate list for each operation type is generated prior to benchmarking, containing all possible configuration settings according to the tables. A single configuration setting in the SEG-Y IO list can be presented as follows:

$$\begin{aligned}&config\_SEG\text{- }Y\text{- }IO \nonumber \\&\quad = [MPI\;nodes=8, processes\;per\;node=4,\nonumber \\&\quad stripe\;count=4, stripe\;size=1MiB,\nonumber \\&\quad number\;of\;traces=512,samples\;per\;trace=256,\nonumber \\&\quad file\;access\;pattern=random] \end{aligned}$$
(1)

Similarly, a single configuration setting in the SEG-Y file sorting list can be presented as follows:

$$\begin{aligned}&config\_SEG\text{- }Y\text{- }SORT = [MPI\;nodes=8, processes\;per\;node=4,\nonumber \\&\quad stripe\;count=4, stripe\;size=1MiB,\nonumber \\&\quad number\;of\;traces=512,samples\;per\;trace=256,\nonumber \\&\quad unsorted\;order=reverse] \end{aligned}$$
(2)

These configuration settings in the lists are used to execute both the SEG-Y READ and WRITE benchmarks, as well as the sorting benchmarks. The 3\(^{rd}\) column in Tables 1 and 2 indicates whether a particular parameter is tunable, which is explained further later.
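The enumeration of all possible configuration settings can be sketched as a Cartesian product over the parameter value ranges. The value lists below are placeholders for illustration, not the actual contents of Tables 1 and 2.

```python
import itertools

# Illustrative sketch of building a full configuration list prior to
# benchmarking; the value ranges are placeholders, not Table 1's values.
param_values = {
    "mpi_nodes":           [2, 4, 8],
    "processes_per_node":  [2, 4],
    "stripe_count":        [1, 2, 4],
    "stripe_size_mib":     [1, 2],
    "number_of_traces":    [512, 1024],
    "samples_per_trace":   [256, 512],
    "file_access_pattern": ["contiguous", "random"],
}

keys = list(param_values)
config_segy_io_list = [
    dict(zip(keys, combo))
    for combo in itertools.product(*param_values.values())
]
```

Each element of `config_segy_io_list` is one configuration setting of the form shown in Eq. (1); the sorting list is built the same way with `unsorted_order` in place of `file_access_pattern`.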

Table 1 SEG-Y IO benchmarks configuration parameters and their values
Table 2 SEG-Y file sorting benchmarks configuration parameters and their values

3.3 Generating SEG-Y IO and file sorting benchmark results data

The procedure to execute the SEG-Y IO and file sorting benchmarks and collect the bandwidth profiling data is specified in Listing 1, defined as Generate_SEG-Y_Benchmarks(). This method takes as input all the parameter value settings specified in Tables 1 and 2. It first generates two separate lists of all possible combinations of SEG-Y IO configurations and SEG-Y file sorting configurations, respectively, on Lines 2 to 3. Each combination in either list is a set of single configuration setting values as stated in Sect. 3.2. Lines 4 to 6 are two nested loops that iterate over both lists and their configuration sets.

In each loop iteration, for each configuration setting, the pre-benchmark settings are applied prior to the main benchmark execution: the LFS stripe settings, file generation and the Darshan settings. From Lines 13 to 17, the LFS stripe settings are applied to an empty file, which is then generated to be read/written or sorted with the specific access pattern or sorting order, respectively. Subsequently, on Line 19, the Darshan setting is applied, which simply sets the path of the Darshan file to be generated after the main benchmark execution, in order to keep the IO bandwidth profiling data. Note that Darshan is an HPC IO characterization tool for parallel IO bandwidth profiling [26]. The LFS and Darshan pre-benchmark settings are also used during the auto-tuning process for the test cases, discussed in a later section.
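The two pre-benchmark settings above can be sketched as follows. The `lfs setstripe` flags are the standard Lustre ones; the Darshan log-path environment variable name and the helper names are assumptions for illustration (Darshan's log location is normally fixed at build time unless it is configured to honor an environment variable).

```python
import os

# Sketch of the pre-benchmark settings: a Lustre striping command for an
# empty target file, and a Darshan log-path override. DARSHAN_LOGPATH and
# the helper names are illustrative assumptions, not ExSeisDat code.
def lfs_setstripe_cmd(path, stripe_count, stripe_size):
    """Build (without executing) the striping command for a new file."""
    return ["lfs", "setstripe",
            "-c", str(stripe_count),   # number of OSTs to stripe across
            "-S", str(stripe_size),    # stripe size unit, e.g. "1M"
            path]

def darshan_env(log_dir):
    """Environment overrides directing Darshan to write its log here."""
    env = dict(os.environ)
    env["DARSHAN_LOGPATH"] = log_dir
    return env
```

The command list would be passed to something like `subprocess.run()` on the cluster; here it is only constructed, since striping requires a live Lustre mount.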

For the SEG-Y IO operations, the SEG-Y file is generated in uniform order with the LFS stripe settings. Before benchmark execution, the file is striped across a number of Lustre Object Storage Targets (OSTs) given by the stripe count value. The distribution of the SEG-Y file over the parallel OST disks follows a certain stripe size unit value in a round-robin fashion [6].

The SEG-Y IO benchmarks read or write the trace data, whose volume is determined by the number of traces and samples per trace within a file. The MPI processes read or write their respective parts of the trace data in either a contiguous or a random access pattern on Line 22. However, the file must first be generated in uniform order on Line 19. The uniform order is the ascending order of trace data according to the source-x coordinate in the trace header [2, 7]. With the possible value settings mentioned in Table 1, the parameter configurations yield a total of 20480 benchmark executions, 10240 for each of the READ and WRITE operations.

The SEG-Y file sorting benchmarks are somewhat more complex than the basic IO executions. An unsorted input file is given to the MPI processes, which sort it into an output file in ascending order with respect to the source-x trace header coordinate. In both the contiguous READ and contiguous WRITE cases, an output file is ultimately written to disk either non-contiguously or contiguously. When the file is read contiguously, it is written to disk non-contiguously in sorted order; this is the contiguous READ case. In the contiguous WRITE case, the file is read non-contiguously and written to disk contiguously in sorted order.

Initially, an input SEG-Y file must be created with respect to the corresponding unsorted order values mentioned in Table 2. The uniform order means the trace data are sorted in ascending order with respect to the source-x coordinate from the trace header, as mentioned earlier [2]. By contrast, the reverse order is the descending order of trace data, and the random order is an arbitrary order of trace data.

During the creation of an input SEG-Y file with any of the stated unsorted orders, the file is distributed across the OSTs according to the Lustre striping parameters applied as the pre-benchmark LFS settings, from Lines 13 to 17. This is similar to the SEG-Y IO benchmarks with respect to the stripe count and stripe size file distribution settings. During the execution of a benchmark on Line 22, in the contiguous READ case the MPI processes read their chunk of unsorted traces contiguously into local memory. The traces are then reordered using a mapping from the unsorted input local trace index to the sorted output global trace index with respect to the source-x coordinate. Subsequently, this sorted index map is used to write the reordered (sorted) trace data non-contiguously into the output SEG-Y file. In the contiguous WRITE case, the trace indices are sorted beforehand using the same unsorted-local-to-sorted-global index mapping. Each MPI process therefore reads traces from the file non-contiguously using that sorted index mapping. Once the traces are reordered by each process, they are written contiguously into the SEG-Y output file. In both scenarios, the output file is also Lustre-striped across the OSTs using the same stripe count and stripe size values used to generate and distribute the input SEG-Y file. This happens when a configuration setting's benchmark is executed and an output file is produced.
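The unsorted-to-sorted index mapping described above can be illustrated in a few lines; the helper name is hypothetical and the sketch ignores the distribution of traces across MPI processes.

```python
# Illustration of the index mapping used by the sorting benchmarks: given
# each input trace's source-x header coordinate, build a map from input
# trace index to its global position in the sorted output.
def sorted_index_map(source_x):
    """source_x[i] is the source-x coordinate of input trace i; returns
    out_index such that out_index[i] is trace i's position in the output
    file when sorted ascending by source-x."""
    order = sorted(range(len(source_x)), key=lambda i: source_x[i])
    out_index = [0] * len(source_x)
    for rank, i in enumerate(order):
        out_index[i] = rank
    return out_index
```

For a reverse-ordered input such as source-x coordinates `[40, 30, 20, 10]`, the map sends trace 0 to output position 3, trace 1 to position 2, and so on; contiguous READ uses this map to place writes, while contiguous WRITE uses it to place reads.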

To summarize, the trace data in the input and output SEG-Y files are generated with respect to the specified unsorted order, number of traces, samples per trace and Lustre stripe value settings. Considering all the possible value settings mentioned in Table 2, there are 30720 parameter configurations, or benchmarks, to run in total, 15360 for each of the contiguous READ and WRITE operations.

The initial LFS settings are applied to the target files to test the IO or sorting bandwidth performance for a specific configuration setting's benchmark, governing the distribution of the file on the parallel disks (OSTs). The parallel IO bandwidth during each benchmark execution is measured by the Darshan tool. The benchmarked file is deleted after execution completes to release the cache, as a post-benchmark setting on Line 29.

The parallel IO bandwidth profiling results are appended to the end of a YAML file after being parsed from the generated Darshan file, on Lines 25 and 26. This is done for each benchmark parameter configuration execution, and the results are kept as a dictionary-style object per benchmark. Figure 4 shows the IO bandwidth profiling from all executed SEG-Y IO and sorting benchmark results.

Listing 1: Generate_SEG-Y_Benchmarks()
Fig. 4

SEG-Y benchmarks results

3.4 Predictive IO bandwidth modeling with ANNs

As mentioned earlier, to achieve our goals we have used an ML process, the ANN technique, for predictive IO modeling to estimate bandwidth values. This supports the auto-tuning process in the next step. The training and testing sets for the ML procedure are 80% and 20% of the benchmark results, respectively, for each SEG-Y operation type, as mentioned in Table 3. The recent related work in [17] addresses a problem closely comparable with the SEG-Y operations in this paper: some of the feature parameters are the same, and its train/test ratio is almost 80%/20%. Therefore, we have used a similar ratio to split the benchmark results into training and testing datasets.

The ANNs in this research are generated according to the structure description in Tables 4 and 5 and the corresponding hyper-parameter values in Table 6. As this is done for both the SEG-Y IO and file sorting READ/WRITE operations, 24 ANN models are trained and tested in total. The ANN input layer nodes map to the input configuration parameters, as described in Table 5, which are essential to model training and testing.

The training of all ANN models for both SEG-Y IO and sorting operations has been carried out using the method ANN_Model_Training() specified in Listing 2. Its arguments are: h1, the number of nodes in the 1\(^{st}\) hidden layer; h2, the number of nodes in the 2\(^{nd}\) hidden layer; dp, the dropout ratio for the hidden layers during training; and wd, the weight decay, as mentioned on Line 2. On Line 5 it sets the values representing the number of nodes in all the layers of the ANN. The ANN model is then initialized on Lines 9 to 14, with the dropout ratios and the activation function ReLU(). ReLU() is the rectified linear unit activation function, which produces the output passed from layer to layer at the nodes [27].

Afterward, on Line 16, the criterion to compute the loss between actual and predicted values is defined as the mean squared error (MSE) [28]. Subsequently, on Line 18, the optimizer is defined as Adam(), a gradient descent method for reaching the global minimum. When the gradient is close to zero, i.e., near the global minimum, a model is able to make precise predictions. Adam() requires the learning rate and weight decay values, which govern the speed at which the gradient converges to the global minimum. Therefore, a learning rate of 0.002 for all SEG-Y operation models and the operation-specific weight decay values mentioned in Table 6 are applied to train each model in an adequate time.

Lines 20 and 21 set the X and y matrices containing the training set configuration settings and their corresponding profiled IO bandwidth values, respectively. These settings and values are scaled between 0 and 1 using MaxAbsScaler [29]. Lines 23 to 34 then run the main training loop up to the MAX_limit iterations, which is set in accordance with the hyper-parameters; this determines how quickly a model can converge to the global minimum gradient and in how many iterations at most. In each iteration, on Line 25, the model predicts the bandwidth values for all configuration settings in X via a feed-forward propagation pass and saves them in y_predictions. On Line 27, the loss between the actual and predicted bandwidth values is computed using the criterion already defined as MSE. Afterward, on Line 29, the Adam optimizer zeroes the gradients, and the loss is propagated backward on Line 31. Subsequently, Lines 33 and 34 run the optimizer step function and the model train function, respectively, to update the weights among nodes from layer to layer.
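The training procedure described above can be condensed into the following PyTorch sketch. The layer sizes, MAX_limit value and tensor shapes are illustrative placeholders, and the network structure (two hidden layers with dropout) follows the description in Tables 4 and 6 only loosely; this is not the actual Listing 2 code.

```python
import torch
import torch.nn as nn

# Condensed sketch of the Listing 2 training procedure: a two-hidden-layer
# feed-forward regressor trained with MSE loss and the Adam optimizer.
# Layer sizes and max_limit are illustrative placeholders.
def train_ann(X, y, h1=256, h2=128, dp=0.2, wd=1e-4, max_limit=200):
    model = nn.Sequential(
        nn.Linear(X.shape[1], h1), nn.ReLU(), nn.Dropout(dp),
        nn.Linear(h1, h2), nn.ReLU(), nn.Dropout(dp),
        nn.Linear(h2, 1),                 # single output: bandwidth value
    )
    criterion = nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=0.002,
                                 weight_decay=wd)
    model.train()
    for _ in range(max_limit):
        y_predictions = model(X)          # feed-forward propagation pass
        loss = criterion(y_predictions, y)
        optimizer.zero_grad()             # reset accumulated gradients
        loss.backward()                   # propagate the loss backward
        optimizer.step()                  # update weights between layers
    torch.save(model.state_dict(), "ANN_Model.pt")
    return model
```

Here X and y would be the MaxAbsScaler-scaled training matrices; the saved "ANN_Model.pt" corresponds to the per-operation model files mentioned below.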

When the iterations are completed, the trained ANN model is saved in "ANN_Model.pt", as mentioned on the last Line 37 of this method. Since all 24 models have different settings and purposes, their actual file names differ from each other to avoid any clash. The prediction evaluation results for all the ANN models are presented and discussed in the Experimental Result Analysis section.

Table 3 Training and testing sets partitions from benchmarks results
Table 4 ANNs structure configuration
Table 5 Mapping of ANNs input/output nodes to configuration parameters for SEG-Y IO and file sort bandwidth modeling
Table 6 Hyper-parameters for SEG-Y IO and file sorting prediction models
Listing 2: ANN_Model_Training()

3.5 Full design strategy for auto-tuning parameters

The generated and saved ANNs are the basis for auto-tuning parameters in a running SEG-Y application prior to the IO or sorting operations. It is important to note that only a proportion of the configuration parameters are tunable, depending on the SEG-Y operation type and its given non-tunable parameter values. The SEG-Y file sorting operations, however, share the same two tunable parameters for both contiguous READ and WRITE, because both operations ultimately write a sorted file to the disks.

In the 3\(^{rd}\) column of Tables 1 and 2, Yes means the parameter is tunable and No means non-tunable, depending on the READ/WRITE operation. The common non-tunable parameters are the number of MPI nodes, MPI processes per node, number of traces and samples per trace. The MPI nodes and processes cannot be changed once the SEG-Y MPI application has started, and the formation of the trace data cannot be tuned either, being a user requirement. For the SEG-Y READ case, the only option is to switch the file access pattern between contiguous read and random read. For the SEG-Y WRITE operation, the stripe count, stripe size and file access pattern settings can all be tuned. In the SEG-Y READ case, the file is already striped across the networked parallel Lustre OSTs (disks) with a specific stripe count and stripe size; changing these values at runtime cannot change the file's striping arrangement on the disks, and rewriting the file with new stripe values just to read it would be an additional overhead. For the SEG-Y WRITE operation, by contrast, the tuned stripe values can impact the bandwidth during execution.

SEG-Y file sorting by contiguous READ/WRITE operations has only two tunable parameters: the stripe count and stripe size values. In SEG-Y sorting, the unsorted order is a non-tunable parameter in addition to the common non-tunable parameters above. Changing its value does not impact the bandwidth either, as it was only used to generate a SEG-Y file with a particular unsorted order before the actual benchmark execution. The stripe values can be tuned in both contiguous READ and WRITE operations because both eventually write a sorted file; therefore, the bandwidth of both operations is expected to be impacted at runtime.

Taking the tunability of the parameters into consideration, the flow of execution comprises the following steps: (1) retrieve the existing configurations, (2) compute the bandwidth predictions for the provided configurations and the other possible tunable/non-tunable value settings, (3) select the configuration with the maximum predicted IO bandwidth and (4) change the tunable parameter settings to the selected configuration. This procedure is exhibited in detail in Listings 3 and 4.

3.5.1 New configuration settings by SEG-Y operations models

The method New_Configs() in Listing 3 returns the set of newly suggested values of the tunable configuration parameters, given the existing settings in the current_configs argument on Line 2. The new configuration predicting the maximum bandwidth among the different combinations of values is returned in max_configs at the end of this method. The method also takes a third argument, model_path, the ANN model ".pt" file specific to the SEG-Y operation type given in the second argument, op_type. For the SEG-Y IO and sorting operations stated earlier, the possible string values for op_type are: "SEGYRead", "SEGYWrite", "SortRead" and "SortWrite". The PyTorch (torch) package is imported on Line 1 for loading the specific generated ANN model on Line 5. Lines 8 to 10 get the existing settings in X and predict the current bandwidth value, taken initially as the maximum, in max_bandwidth_value. This is predicted by calling seg_y_model(X), and max_configs initially holds the current settings as the maximum bandwidth settings. Afterward, the lists of tunable and possible value settings are initialized on Lines 13 to 15 as file_access_types, stripe_counts and stripe_sizes, per the values mentioned in Tables 7 and 8.

The procedure to configure new parameter values checks all possible configurations of the tunable settings with the given non-tunable parameter values; the combination of values giving the highest predicted bandwidth is the chosen parameter setting. This is elaborated on Lines 18 to 33. The procedure covers all 4 types of SEG-Y operations as stated earlier. The full set of lists file_access_types, stripe_counts and stripe_sizes is applicable only to "SEGYWrite". The "SEGYRead" operation only requires file_access_types to find the highest possible bandwidth setting. Both SEG-Y file sorting operations, "SortRead" and "SortWrite", only require the Lustre settings stripe_counts and stripe_sizes. To achieve this, nested loops check all possible settings, while the condition statements on Lines 21 to 23 only set the values in an iteration that apply to the specific type of SEG-Y operation about to execute. Subsequently, Lines 31 to 33 contain conditions to break out of the loops for settings that are not applicable to the currently executing SEG-Y operation.

In the loop, once the values are set in X according to the SEG-Y operation type, new_bandwidth_value is computed by calling seg_y_model(X) on Line 27. Lines 28 to 30 then check whether new_bandwidth_value is greater than the current max_bandwidth_value; if so, max_bandwidth_value is updated with new_bandwidth_value and max_configs with the currently checked settings in X. By the end of the loop, max_configs contains the new configuration settings, with tuned values for the tunable parameters and unchanged values for the non-tunable parameters, predicting the maximum IO bandwidth. These new maximum bandwidth settings are returned on Line 34 to the application running that particular SEG-Y operation, whose method is represented in Listing 4. If the new configurations are exactly as they were initially, the current configuration settings already predict the highest possible IO bandwidth among the combinations of settings.
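The best-setting search described above can be sketched as a small grid search; the function and argument names are hypothetical, and a plain callable stands in for the loaded ANN model. For brevity the sketch covers only the sorting case, where the tunable parameters are the stripe settings.

```python
# Sketch of the New_Configs() search for the sorting operations: evaluate
# every combination of the tunable stripe settings with a (stub) bandwidth
# predictor and keep the best. `predict` stands in for the loaded ANN; the
# candidate lists would come from Tables 7 and 8.
def new_configs(current, predict, stripe_counts, stripe_sizes):
    max_configs = dict(current)
    max_bandwidth = predict(current)      # current setting is initial best
    for sc in stripe_counts:
        for ss in stripe_sizes:
            candidate = dict(current, stripe_count=sc, stripe_size=ss)
            bw = predict(candidate)
            if bw > max_bandwidth:        # keep the best-predicted setting
                max_bandwidth, max_configs = bw, candidate
    return max_configs
```

With a toy predictor that favors larger stripe counts, for example, the search returns the candidate with the largest stripe count; if no candidate beats the current setting, the current configuration is returned unchanged, matching the behavior described above.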

Listing 3: New_Configs()

3.5.2 Auto-tuning SEG-Y operations with new configuration settings

This section gives an overview of the auto-tuning performed within the application before executing a SEG-Y operation. Listing 4 presents the method AutoTuneAndRunSegYOp(), which auto-tunes the settings on the basis of the newly chosen settings returned by the method New_Configs() in Listing 3. Its arguments are: the current parameter settings in current_configs, the SEG-Y operation type in op, the target file path in t_file and the ANN model ".pt" file path in model_path.

Initially, the ExSeisDat Parallel IO Library (PIOL) is included, followed by acquiring its namespace, on Lines 1 and 2, respectively. Then, in the scope of this method, the ExSeisDat environment is first initialized on Line 6. Secondly, the current MPI process number is retrieved into the variable rank on Line 9. On Lines 12 and 13, if the current rank is 0, meaning the current process is the head MPI process, it calls the New_Configs() method defined in Listing 3 to get the new configuration settings predicting maximum bandwidth. These settings are returned in max_configs, of struct type max_configs_type. Meanwhile, all other MPI process ranks wait on Line 16, via the MPI_Barrier() method, until rank 0 has finished getting the new settings. Afterward, on Line 17, the rank-0 process sends the new settings to all other ranks, which receive them in their corresponding max_configs objects; this is executed by calling the MPI_Bcast() method. The methods MPI_Barrier() and MPI_Bcast() are explained in [8].

Once all ranks are updated with the new configuration settings, the method checks the currently required SEG-Y operation type and applies tuning accordingly. Lines 19 and 20 check whether the current operation is one of the SEG-Y IO operations, "SEGYWrite" or "SEGYRead"; if so, all processes update their file_access_pattern so that the SEG-Y file is read or written according to the access pattern set. Subsequently, Lines 22 to 24 check whether the current rank is 0 and the operation type is either "SEGYWrite" or one of the file sorting operations, "SortWrite" or "SortRead". If so, rank 0 removes the previous target file specified in t_file and applies the new Lustre settings, specified in max_configs.stripe_count and max_configs.stripe_size, to an empty target file. Meanwhile, all other processes again wait on Line 27 until the rank-0 process has finished applying the new Lustre stripe settings. Afterward, all processes run the remaining section of the application to complete the task of the particular SEG-Y operation type with the newly tuned configuration settings. This achieves the goal of auto-tuning the parameters at runtime, prior to IO execution.
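The Lustre re-striping step on Lines 22 to 24 can be sketched as follows. This is a Python illustration, not the C++/PIOL code of Listing 4: the helper names and the example file path are hypothetical, while `lfs setstripe -c <count> -S <size> <file>` is the standard Lustre command. Since a file's stripe layout is fixed at creation time, the old target file must be removed before the new layout is applied to an empty one.

```python
import os
import subprocess

def lfs_setstripe_cmd(t_file, stripe_count, stripe_size):
    """Build the standard Lustre command that rank 0 would run (via
    subprocess.run) to create an empty file with the tuned stripe layout."""
    return ["lfs", "setstripe", "-c", str(stripe_count), "-S", str(stripe_size), t_file]

def retune_target_file(t_file, max_configs, run=subprocess.run):
    """Sketch: remove the old target file, then recreate it with the tuned
    stripe settings (max_configs is a dict here purely for illustration)."""
    if os.path.exists(t_file):
        os.remove(t_file)  # stripe layout cannot be changed on an existing file
    run(lfs_setstripe_cmd(t_file,
                          max_configs["stripe_count"],
                          max_configs["stripe_size"]))

# Hypothetical target path, for illustration only.
cmd = lfs_setstripe_cmd("/lustre/scratch/out.segy",
                        stripe_count=8, stripe_size=4 * 1024 * 1024)
```

Only rank 0 performs this step, which is why the other ranks wait at the barrier on Line 27 before proceeding with the operation.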

On the completion of one execution of a SEG-Y operation through the application method in Listing 4, the new IO bandwidth value is captured via Darshan profiling, as stated earlier and elaborated by the method in Listing 5, explained in the next section.

figure d

3.5.3 Statistical data process upon executing auto-tuned SEG-Y operations

A statistical analysis for performance evaluation requires an experimental code setup. This involves computing specific metrics that give a realistic picture of our approach's performance. In this scenario, we execute the default configuration settings and the auto-tuned settings to compare their IO bandwidths. This comparison is carried out by computing the statistical metrics and the overall percentage improvement in bandwidth. Tables 7 and 8 present the default configuration settings test cases to execute and auto-tune for SEG-Y IO and file sorting, respectively. The IO bandwidths collected for these default settings and their corresponding auto-tuned settings form the basis of the statistical analysis.

In this section, we present an experimental Python code setup to run auto-tuning benchmarks on the test cases stated above for all SEG-Y operations. This procedure is elaborated in the Listing 5 method Statistics_Collection_Process(). Lines 1 to 3 import the required Python packages: torch (PyTorch), numpy and statistics. The arguments of this method are: X, a list of all possible test configuration settings for a specific SEG-Y operation from Table 7 or 8; N, the total number of test cases in a test set; op, the type of a specific SEG-Y operation as stated earlier; H1, the number of nodes in hidden layer 1 of an ANN; H2, the number of nodes in hidden layer 2 of an ANN; and model_path, the ANN model file path for the H1 and H2 nodes configuration.

The method begins with the main loop on Line 5, which iterates over the N default test settings. It should be noted that for the SEG-Y IO operations the value of N is 1458 for each READ and WRITE execution, whereas it is 2187 for the SEG-Y file sorting operations for each contiguous READ and WRITE execution. The value of N is computed by multiplying the numbers of value settings for each parameter specified in Table 7 or 8, excluding the IO operation parameter.
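The computation of N is a simple product over the per-parameter value counts. The counts below are a hypothetical factorization chosen only because it is consistent with the stated totals; the actual per-parameter settings lists are those in Tables 7 and 8.

```python
from math import prod

# Hypothetical per-parameter value counts, consistent with the stated totals;
# the real settings lists are given in Tables 7 and 8.
segy_io_value_counts = [2, 3, 3, 3, 3, 3, 3]   # product = 1458 test cases
sorting_value_counts = [3, 3, 3, 3, 3, 3, 3]   # product = 2187 test cases

N_io = prod(segy_io_value_counts)      # N for SEG-Y READ/WRITE test sets
N_sort = prod(sorting_value_counts)    # N for SEG-Y file sorting test sets
```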

On Lines 6 and 7, the two lists Def_bandwidth and Tuned_bandwidth are initialized to hold the bandwidth values of a single default configuration execution and of its auto-tuned execution, respectively. Line 8 sets the variable repetitions, which specifies the number of times the default and auto-tuned settings are executed; their bandwidth values are then saved in the corresponding lists stated earlier. Since a configuration's repetitions can run any number of times, a nested inner loop is applied on Line 9, so a default and an auto-tuned setting run repeatedly according to the repetitions specified on Line 8, which is 2 in our case. This is done to ensure that the bandwidths corresponding to the settings are approximately the same when they are repeated and later averaged, because the bandwidth can occasionally drop due to interference on the OSTs from other user programs.

In the inner loop after Line 9, the IO or sorting operation is first executed with the default settings on Lines 11 to 16. Line 11 prepares the required settings for LFS and Darshan, setting their target file names and paths for an operation in the variables t_file and d_file, respectively. The files are named according to the specific SEG-Y operation op, the numbers of hidden layer nodes H1 and H2, and the Lustre stripe setting values of the \(i^{th}\) configuration in X. Line 13 then generates the file according to the current configuration's LFS stripe settings. Subsequently, the IO or sorting operation with the default configuration settings is executed on Line 14. The bandwidth value of the current configuration setting is profiled into the Darshan file path set in d_file, which is then parsed and appended to Def_bandwidth on Line 15. The target Darshan file and the operation file with LFS settings are deleted to free the cache.

Once an operation with the default settings has executed, it is re-executed by auto-tuning those default settings on Lines 19 to 24. The steps to execute IO or sorting operations with auto-tuned configurations are similar to those for the default settings. However, the first difference is on Line 22, which calls the AutoTuneAndRunSegYOp method described in Listing 4 to auto-tune the configurations and execute the SEG-Y operation. The second difference is on Line 23, where the bandwidth of the auto-tuned SEG-Y IO or sorting operation is appended to the Tuned_bandwidth list.

After completing the inner iterations of a configuration setting, its default and tuned bandwidths are averaged and appended to the main lists Old_bandwidths and New_bandwidths on Lines 26 and 27, respectively. These lists hold the default and tuned bandwidth values for the N configuration settings test cases. Line 28 checks whether the current configuration's tuned (new) bandwidth is greater than its default (old) bandwidth; if so, the improvement count C_IM is incremented.

After the outer main loop completes, the mean default and tuned bandwidths are computed by averaging the values of the Old_bandwidths and New_bandwidths lists on Lines 30 and 31, respectively. The overall percentage improvement in bandwidth is then computed on Line 32. The percentage of test cases with improved bandwidths is computed in per_IMP_cases on Line 34. Afterward, on Lines 35 to 44, the remaining statistics are computed, which include the maximum, minimum, median, standard deviation and variance values of both the default and tuned bandwidths, separately.
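The closing statistics of Statistics_Collection_Process() can be sketched as below. The list and counter names (Old_bandwidths, New_bandwidths, C_IM, per_IMP_cases) follow the listing as described above, but the exact formula for the overall percentage improvement is an assumption, and the helper name summarize is hypothetical.

```python
import statistics

def summarize(Old_bandwidths, New_bandwidths):
    """Sketch: overall percentage bandwidth improvement, share of improved
    test cases, and the descriptive metrics for both bandwidth lists."""
    mean_old = statistics.mean(Old_bandwidths)
    mean_new = statistics.mean(New_bandwidths)
    per_improvement = (mean_new - mean_old) / mean_old * 100.0   # assumed formula
    C_IM = sum(1 for old, new in zip(Old_bandwidths, New_bandwidths) if new > old)
    per_IMP_cases = C_IM / len(Old_bandwidths) * 100.0
    stats = {}
    for name, values in (("default", Old_bandwidths), ("tuned", New_bandwidths)):
        stats[name] = {
            "max": max(values), "min": min(values),
            "median": statistics.median(values),
            "stdev": statistics.stdev(values),
            "variance": statistics.variance(values),
        }
    return per_improvement, per_IMP_cases, stats

# Toy bandwidth lists (MiB/s), for illustration only.
imp, cases, stats = summarize([100.0, 200.0, 300.0], [250.0, 150.0, 500.0])
```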

The whole procedure explained above using Listings 3, 4 and 5 completes the auto-tuning design for optimizing SEG-Y operations in an HPC cluster based on the ANNs' predictions. The statistical analysis of the auto-tuning results is presented in the next section.

Table 7 Default SEG-Y IO configurations test cases to auto-tune
Table 8 Default SEG-Y sorting configurations test cases to auto-tune
figure e

4 Experimental results analysis

The SEG-Y IO and file sorting test cases for auto-tuning have been executed on 4 to 16 nodes of the KAY cluster of ICHEC, for performance optimization analysis. Each node consists of 2× 20-core 2.4 GHz Intel Xeon Gold 6148 (Skylake) processors [30]. The Lustre OSTs (disks) are utilized in the range of 2 to 16. The ML processes for the ANN models have been carried out on one of the GPU nodes of KAY, equipped with an NVIDIA Tesla V100 16 GB PCIe (Volta architecture) card, via the Tensor memory construct in PyTorch [22, 31]. The results mainly comprise two parts: ANNs Performance Analysis and Auto-tuning Results Analysis.

4.1 ANNs performance analysis

Prediction accuracy has been one of the metrics used to compare the performance of ML models in a couple of previous studies [17, 21, 23]. Therefore, it is used in this research, along with an analysis of how it is affected by varying the hidden layers' nodes configuration. In this section, we show the impact of changing the hidden layer nodes on the bandwidth prediction accuracy of the ANN models for all SEG-Y operations.

For testing the models, the IO bandwidth data from 20% of the benchmark results, held out as the testing set, are re-scaled to their original values. Each model is loaded back into memory in turn to predict and re-scale the bandwidth values. The prediction accuracy is then measured for all models using the following equations:

$$ \begin{gathered} {\text{Let}}\;X = {\text{testing}}\;{\text{set}}\;{\text{which}}\;{\text{is}}\;{\text{the}}\; 20\% \;{\text{of}}\;{\text{complete}}\;{\text{benchmarks}}\;{\text{data}}\;{\text{set}} \hfill \\ {\text{Let}}\;y = {\text{actual}}\;{\text{bandwidth}}\;{\text{values}}\;{\text{set}}\; {\text{against}}\;X\;{\text{configuration}}\;{\text{settings}} \hfill \\ r = {\text{model}}(X)\;{\text{gives}}\;{\text{all}}\;{\text{the}}\;{\text{predicted}}\;{\text{values}} \hfill \\ {\text{Prediction}}\;{\text{Accuracy}} = 100.0 - \frac{1}{n}\sum_{i = 1}^{n}\Bigg| \frac{{(y_{i} - r_{i} )}}{{y_{i} }}\Bigg| \times 100.0\% \hfill \\ \end{gathered} $$

where y and r are the sets of n actual and predicted bandwidth values, respectively, and i indexes the \(i^{th}\) row of X, y and r. Therefore, \(y_i\) is the \(i^{th}\) actual bandwidth value and \(r_i\) is the \(i^{th}\) predicted bandwidth value, computed by running model() on \(X_i\), the \(i^{th}\) set of configuration parameter values.
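The accuracy definition above (100 minus the mean absolute percentage error) can be written directly in Python. The toy y and r values below are for illustration only; in the experiments, y comes from the 20% testing set and r from model(X) after re-scaling.

```python
def prediction_accuracy(y, r):
    """Prediction accuracy as defined above: 100 minus the mean absolute
    percentage error between actual (y) and predicted (r) bandwidths."""
    n = len(y)
    mape = sum(abs((y_i - r_i) / y_i) for y_i, r_i in zip(y, r)) / n * 100.0
    return 100.0 - mape

# Toy values: errors of 10%, 5% and 0% give a mean error of 5%.
acc = prediction_accuracy([100.0, 200.0, 400.0], [90.0, 210.0, 400.0])
```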

4.1.1 SEG-Y IO ANNs prediction evaluation

Figure 5 shows how the prediction accuracy changes with an increasing number of nodes in the ANNs' hidden layers, denoted as (h1, h2). For SEG-Y READ operation predictions, the accuracy is as low as roughly 39.5% with the minimum nodes configuration (8, 4). The accuracy increases significantly, up to 86.5%, when the nodes are merely doubled to (16, 8). Afterward, it increases only slightly and then remains almost constant around \(\approx \)95% until the (256, 128) nodes configuration. In short, it is safe to conclude that hidden layers with (32, 16) nodes or more should be ideal as far as predictions alone are concerned. Model selection can be made more specific later, after analyzing the auto-tuning results.

For SEG-Y WRITE operation predictions, the accuracy starts from a minimum of 65.5% for the (8, 4) nodes ANN. It increases by \(\approx \)3%, reaching around 68.2%, when the nodes configuration is doubled to (16, 8). Switching to (32, 16) nodes gives a notable increase to 83.9% accuracy. Afterward, it gradually increases from \(\approx \)87% to 89% as the nodes configuration changes from (64, 32) to (256, 128). From the line gradient, the accuracy can be expected to become constant beyond the (256, 128) configuration. In this scenario, the most suitable ANN for SEG-Y WRITE predictions is currently the one with (256, 128) nodes, as it yields the maximum accuracy value. However, the most suitable ANN for auto-tuning and optimizing SEG-Y WRITE operations will become clearer after analyzing the auto-tuning results.

The SEG-Y READ bandwidth prediction accuracy ranges from approximately 39.5 to 96%, whereas for the WRITE operation it ranges from 65.5 to 89.1%, as mentioned in Table 9. Figures 6 and 7 depict the bandwidth predictions against the actual values for all 12 ANN models for the SEG-Y IO READ/WRITE operations. Each graph in the figures represents a specific benchmark, operation type, and hidden layers nodes configuration used in an ANN model. It can be noticed that the predictions of ANNs with fewer hidden layer nodes are less likely to follow the actual changing bandwidth pattern than those of ANNs with more hidden layer nodes, owing to the decreased prediction accuracy with fewer hidden layer nodes.

Fig. 5
figure 5

SEG-Y IO ANNs prediction accuracy

Table 9 SEG-Y IO prediction accuracy values
Fig. 6
figure 6

SEG-Y IO READ predictions

Fig. 7
figure 7

SEG-Y IO WRITE predictions

4.1.2 SEG-Y file sorting ANNs prediction evaluation

Figure 8 shows how the SEG-Y file sorting bandwidth prediction accuracy changes with an increasing number of hidden layer nodes. For SEG-Y file sorting with contiguous READ operation predictions, where the WRITE operation is non-contiguous, the accuracy starts at \(\approx \)75% with the smallest nodes configuration (8, 4). It then increases slightly, by 3%, to reach almost 78% when the nodes double to (16, 8). Afterward, there are notable increases to \(\approx \)87% and 90% when switching to (32, 16) and (64, 32) nodes, respectively. Onward, the accuracy stays almost constant around 90% up to the (256, 128) nodes configuration. It is clearly visible that any ANN model with (64, 32) nodes or more should be ideal as far as predictions alone are concerned. However, model selection can be made more specific after analyzing the auto-tuning results.

For SEG-Y file sorting by contiguous WRITE operation predictions, where reading the unsorted file is non-contiguous, the accuracy starts from a minimum of around 50% for the (8, 4) nodes ANN. It increases significantly up to the (64, 32) nodes configuration, from almost 70 to 90%. Afterward, it remains around 90% up to the last (256, 128) configuration of hidden layer nodes. Again, ANN models with (64, 32) nodes or more are clearly ideal as far as predictions alone are concerned. The specific model selection will be possible after the analysis of auto-tuning the SEG-Y file sorting operations.

The SEG-Y sorting by contiguous READ operation prediction accuracy ranges from 75.3 to 90.6%, whereas for the contiguous WRITE operation it ranges from 50 to 92.2%, as mentioned in Table 10. Figures 9 and 10 depict the bandwidth predictions against the actual values for all 12 ANN models for the SEG-Y file sorting operations. Each graph in the figures represents a specific benchmark, operation type, and hidden layers nodes configuration used in an ANN model. It can again be noticed that the predictions of ANNs with fewer hidden layer nodes are less likely to follow the actual changing bandwidth pattern than those of ANNs with more hidden layer nodes, again owing to the decreased prediction accuracy with fewer hidden layer nodes.

Fig. 8
figure 8

SEG-Y file sorting ANNs prediction accuracy

Table 10 SEG-Y file sorting prediction accuracy values
Fig. 9
figure 9

SEG-Y sorting contiguous READ predictions

Fig. 10
figure 10

SEG-Y sorting contiguous WRITE predictions

4.1.3 Runtime cost analysis for new configurations selection

As the method in Listing 3 performs a brute-force search, checking all possible combinations to select the new configuration settings for auto-tuning, its runtime cost is also analyzed. Line 29 in Listing 3 runs the ANN feed forward propagation pass to predict a value using model(X), over several iterations depending on the operation type. In our ANNs, there are three sub-passes in a single forward propagation pass: (1) input layer to hidden layer 1, (2) hidden layer 1 to hidden layer 2 and finally (3) hidden layer 2 to output layer. Considering the ANN with the maximum hidden layer nodes (256 and 128), Table 12 presents the workings of these three sub-passes. The table also details the numbers of multiplication (Mul.) and addition (Add.) computations involved in a whole single pass. The first sub-pass runs \(\approx \)3584 multiplication and addition computations overall. Similarly, the second sub-pass runs \(\approx \)65536 computations. This is the most expensive sub-pass in terms of memory and CPU consumption, because the hidden layers and their weight matrices have the most nodes and the largest memory allocation in a single pass. Finally, the third sub-pass runs 128 multiplications and 128 additions, making 256 computations in total. Adding up all the sub-passes gives a total of \(\approx \)69376 computations in a single ANN feed forward propagation pass to predict a bandwidth value.
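These counts follow from counting one multiplication plus one addition per weight in each fully connected sub-pass. The sketch below reproduces the stated figures under the assumption of 7 input features, a value inferred from the \(\approx \)3584 first-sub-pass count rather than stated in the text; biases and activation functions are ignored, matching the approximate totals.

```python
def forward_pass_ops(n_in, h1, h2, n_out=1):
    """Multiply-add counts per sub-pass of a fully connected forward pass:
    one multiplication and one addition per weight (2 ops per weight)."""
    sub1 = 2 * n_in * h1      # input layer -> hidden layer 1
    sub2 = 2 * h1 * h2        # hidden layer 1 -> hidden layer 2 (dominant cost)
    sub3 = 2 * h2 * n_out     # hidden layer 2 -> output layer
    return sub1, sub2, sub3, sub1 + sub2 + sub3

# Assuming 7 input features for the largest (256, 128) model.
s1, s2, s3, total = forward_pass_ops(n_in=7, h1=256, h2=128)
```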

Keeping these computations in view, the new parameter values selection for the SEG-Y READ operation runs the forward propagation pass 2 times, making \(\approx \)138,752 total executed instructions. The new parameter selection for the SEG-Y WRITE operation runs the feed forward propagation pass 40 (2\(\times \)4\(\times \)5) times, making \(\approx \)2,775,040 computations in total. In the case of SEG-Y file sorting with contiguous READ/WRITE, the new parameter values selection for both operations runs the forward propagation pass 20 (4\(\times \)5) times, making a total of \(\approx \)1,387,520 computations.

How quickly the CPU can run these millions of computations depends on the memory usage. Table 11 describes the layout of the matrices and their memory usage in a forward propagation pass. The matrices for an ANN model are created with the default float32 data type, which means each location of a matrix contains a 32-bit or 4-byte floating point number. As Table 11 shows, 7 matrices in total are used in a single pass. Adding up all the bytes of the 7 matrices gives \(\approx \)140320 bytes, or \(\approx \)137 KiB, of total memory required by an ANN model. If a model were generated with 64-bit or 8-byte double precision values, the required memory would double to \(\approx \)274 KiB. In either case, this amount of memory easily fits into the CPU cache. Since the matrices are conveniently cacheable, the execution of millions of instructions takes a negligible execution time of \(\approx 10^{-4}\) seconds. This runtime, tested several times on a KAY [30] compute node, includes the loading time of the required libraries. However, the first program load can take around 30 seconds; afterward, it runs in under a second, as just mentioned.
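The byte total can be reproduced by summing the element counts of the 7 matrices. The layout below (input vector, three weight matrices, three bias vectors, with 7 assumed input features) is a reconstruction consistent with the \(\approx \)140320-byte total reported for the (256, 128) model, not a layout taken verbatim from Table 11.

```python
def ann_memory_bytes(n_in, h1, h2, n_out=1, bytes_per_value=4):
    """Approximate memory of the 7 matrices touched in one forward pass,
    assuming float32 (4 bytes) by default. Layout is a reconstruction."""
    sizes = [
        n_in,                # input vector
        n_in * h1, h1,       # layer-1 weights and biases
        h1 * h2, h2,         # layer-2 weights and biases
        h2 * n_out, n_out,   # output weights and bias
    ]
    return sum(sizes) * bytes_per_value

total_f32 = ann_memory_bytes(n_in=7, h1=256, h2=128)                     # ~137 KiB
total_f64 = ann_memory_bytes(n_in=7, h1=256, h2=128, bytes_per_value=8)  # ~274 KiB
```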

For simplicity and convenience, the code logic for parameter selection in Listing 3 is written as a Python script executed using the PyTorch module. This interfaces with the caller MPI program in C++, which applies the new configuration settings and runs the SEG-Y operation, as stated in Listing 4. Alternatively, the parameter selection could be embedded in a single MPI program using the C++ version of the PyTorch library. This would be more complex than a Python script; however, it could be faster, as it removes the first-time program loading step.

Table 11 Matrices detail and memory space in a feed forward propagation pass of an ANN
Table 12 Sub-passes of feed forward propagation pass for a bandwidth prediction

4.2 Auto-tuning results analysis

In this section, we present a series of auto-tuning results showing the improvements in bandwidth values, with a statistical analysis for all the ANNs used. As stated earlier, to choose a specific ANN for optimizing a SEG-Y operation, we need to analyze the auto-tuning results for all ANNs separately. Table 13 presents the formal definitions of all the statistical metrics used in the experiment and referred to later in the tables.

The results tables presented in this section are for SEG-Y READ, WRITE and their combined IO. Similarly, tables are also presented for SEG-Y file sorting by contiguous READ, by contiguous WRITE, and for their combined sorting bandwidth results. The statistics of the combined IO and sorting results are computed by merging the lists of bandwidth values of the corresponding READ and WRITE operations. This has been carried out for all the ANNs used, through the programmed code logic. However, the combined READ/WRITE prediction accuracy has been computed by the following equation:

$$\begin{aligned} Combined\;Accuracy=\frac{A_r + A_w}{2}, \end{aligned}$$

where \(A_r\) and \(A_w\) are prediction accuracy percentage values of READ and WRITE ANN models, respectively.

It should be noted that the metrics presenting bandwidth values are in the unit of MiB/s, with the multiple \(10^{x}\) represented as "ex" in the tables. For example, if the value 2.54 is written in a cell and its corresponding row has a metric marked with (e5 MiB/s) in the leftmost column, then the value is \(2.54 \times 10^{5}\) MiB/s, or equivalently 2.54e5 MiB/s.

Table 13 Description of statistical metrics used

4.2.1 SEG-Y READ Auto-tuning results

In this section, we present the auto-tuning improvement results for the SEG-Y READ operation test cases, as mentioned in Table 14. According to the results, our auto-tuning design has improved the majority of the test cases. The number of improved test cases ranges from 65.2% to 87.2%, and for most of the 6 ANNs the improved test cases are above 81%. The overall bandwidth improvement ranges from 35.3% to 97.3%. The worst case is the ANN with the (16, 8) hidden layers nodes configuration, which shows the lowest improved test cases and overall bandwidth improvement, despite its significant prediction accuracy. Excluding this case, the majority of the ANNs show a bandwidth improvement above 95%, while the rest show 84.6% and 87.8%. It can also be seen that the overall tuned mean, maximum, minimum and median bandwidth values are significantly greater than their default counterparts, although the standard deviation and variance values are broadly comparable for most of the ANNs. It is evident from these results that auto-tuning the SEG-Y READ operation has significantly optimized the bandwidth performance. This is further supported by Figure 11, which plots the auto-tuned bandwidth values against the default values for each hidden layers (h1, h2) nodes configuration of the ANN models.

It is visible from Table 14 that the improvements do not consistently increase with an increasing number of hidden layer nodes or prediction accuracy. However, the most suitable ANN for the SEG-Y READ operation scenario is apparently the one with (64, 32) hidden layers nodes, which shows the maximum of 87.2% improved test cases and an optimized bandwidth performance of 97.3%. Therefore, this specific hidden layers nodes configuration can be chosen by the system to auto-tune and execute SEG-Y READ operations.

Table 14 SEG-Y READ improvements
Fig. 11
figure 11

SEG-Y IO READ improvements

4.2.2 SEG-Y WRITE auto-tuning results

In this section, we present the auto-tuning improvement results for the SEG-Y WRITE operation test cases, as mentioned in Table 15. According to the results, our auto-tuning design has improved the majority of the test cases. The number of improved test cases ranges from 86.1 to 96.7%, and for most of the 6 ANNs the improved test cases are above 90%. The overall bandwidth improvement ranges from 475.5 to 602.6%, which indicates a very significant optimization of the SEG-Y WRITE operation. The worst case is the ANN with the (128, 64) hidden layers nodes configuration, which shows the lowest improved test cases and overall bandwidth improvement, despite having the second highest prediction accuracy among all the ANN configurations. Excluding this case, the majority of the ANNs show a bandwidth improvement above 580%, while the rest show 556.0 and 558.2%. It can also be seen that the overall tuned mean, maximum, median, standard deviation and variance of the bandwidth values are significantly greater than their default counterparts, although the minimum bandwidth values are broadly comparable for most of the ANNs. It is evident from these results that auto-tuning the SEG-Y WRITE operation has very significantly increased the bandwidth performance. This is further supported by Figure 12, which plots the auto-tuned bandwidth values against the default values for each hidden layers (h1, h2) nodes configuration used in the ANN models.

It is visible from Table 15 that the improvements do not consistently increase with an increasing number of hidden layer nodes or prediction accuracy. However, the most suitable ANN for the SEG-Y WRITE operation scenario is apparently the one with (256, 128) hidden layers nodes, the maximum in this scenario. It shows the maximum of 96.7% improved test cases and an optimized bandwidth performance of 602.6%. Therefore, this specific hidden layers nodes configuration can be chosen by the system to auto-tune and execute SEG-Y WRITE operations. The second ideal case can be the ANN with (8, 4) hidden layers nodes, which shows a closely comparable bandwidth improvement of 602.4%, despite having the lowest prediction accuracy and the fewest hidden nodes.

Table 15 SEG-Y WRITE improvements
Fig. 12
figure 12

SEG-Y IO WRITE improvements

4.2.3 Combined SEG-Y IO auto-tuning results

In this section, we present the combined auto-tuning improvement results for the SEG-Y IO operation test cases. In HPC cluster systems, applications often run both READ and WRITE operations, in sequence or in parallel. Therefore, it is worth noting the combined READ/WRITE improvements, as mentioned in Table 16. According to these results, the combined prediction accuracy increases with the hidden layer nodes in the ANN model, ranging from 52.5 to 92.5%. The improved test cases do not increase consistently but range from 79.7 to 91.7%, with the (256, 128) nodes showing the maximum improvement; for most of the ANNs, the improved test cases are above 85%. The overall bandwidth improvement ranges from 50.9 to 108.8%, which is significant, with the (64, 32) nodes showing the maximum percentage improvement. The worst case is the ANN with the (16, 8) hidden layers nodes configuration, which shows the lowest improved test cases and overall bandwidth improvement, despite its significant combined prediction accuracy. Excluding this case, all other ANNs show a bandwidth improvement above 99.0%. It can also be seen that the overall tuned mean, maximum, median, standard deviation and variance of the bandwidth values are significantly greater than their default counterparts, although the minimum bandwidth values are broadly comparable for most of the ANNs. The combined auto-tuning results clearly indicate that the bandwidth performance of the SEG-Y IO operations has been significantly optimized.

To choose the most suitable ANN for a series of SEG-Y IO READ/WRITE operations in the HPC cluster, the one with (64, 32) hidden layers nodes is apparently the best, as it shows the maximum overall SEG-Y IO bandwidth improvement of 108.8%. Therefore, this specific hidden layers nodes configuration can be chosen by the system to auto-tune and execute a series of SEG-Y IO operations in an application. The second ideal case can be the ANN with (32, 16) hidden layers nodes, as it shows a closely comparable bandwidth improvement of 108.7%, with a slightly greater percentage of improved cases.

Table 16 Combined SEG-Y IO improvements

4.2.4 SEG-Y sorting via contiguous READ auto-tuning results

In this section, we present the auto-tuning improvement results for the SEG-Y file sorting via contiguous READ operation test cases, as mentioned in Table 17. According to the results, our auto-tuning design has improved the majority of the test cases. The number of improved test cases ranges from 74.0 to 91.7%, and for most of the 6 ANNs the improved test cases are above 76%. The overall bandwidth improvement ranges from 130.9 to 283.9%, which indicates a very significant optimization of this SEG-Y sorting operation. The worst case is the ANN with the (8, 4) hidden layers nodes configuration, which shows the lowest improved test cases and overall bandwidth improvement. Excluding this case, the majority of the ANNs show a bandwidth improvement above 158%, while the rest show 141.6 and 156.6%. It can also be seen that the overall tuned mean, maximum, median, standard deviation and variance of the bandwidth values are significantly greater than their default counterparts; however, the minimum bandwidth values are somewhat lower in all the ANN cases. It is evident from these results that auto-tuning the SEG-Y sorting with contiguous READ operation has very significantly increased the bandwidth performance. This is further supported by Figure 13, which plots the auto-tuned bandwidth values against the default values for each hidden layers (h1, h2) nodes configuration used in the ANN models.

It is visible from Table 17 that the improvements do not consistently increase with an increasing number of hidden layer nodes or prediction accuracy. However, the most suitable ANN for the SEG-Y sorting via contiguous READ operation scenario is apparently the one with the maximum (256, 128) hidden layers nodes, which shows the maximum of 91.7% improved test cases and an optimized bandwidth performance of 283.9%. Therefore, this specific hidden layers nodes configuration can be chosen by the system to auto-tune and execute SEG-Y sorting via contiguous READ operations.

Table 17 SEG-Y file sorting with contiguous READ Improvements
Fig. 13
figure 13

SEG-Y file sorting Improvements via contiguous READ

4.2.5 SEG-Y file sorting via contiguous WRITE auto-tuning results

In this section, we present the auto-tuning improvement results for the SEG-Y file sorting with contiguous WRITE operation test cases, as mentioned in Table 18. According to the results, our auto-tuning design has improved the majority of the test cases. The number of improved test cases ranges from 67.2% to 92.2%, and for most of the 6 ANNs the improved test cases are above 81%. The overall bandwidth improvement ranges from 76.3% to 221.8%, which indicates a very significant optimization of SEG-Y sorting with the contiguous WRITE operation. Surprisingly, the model with the fewest hidden layer nodes, (8, 4), shows the maximum of about 92.2% improved test cases compared to the other models. The worst case is the ANN with the (64, 32) hidden layers nodes configuration, which shows the lowest improved test cases and overall bandwidth improvement, despite its significant prediction accuracy. Excluding this case, the majority of the ANNs show a bandwidth improvement above 213%. It can also be seen that the overall tuned mean, maximum, median, standard deviation and variance of the bandwidth values are significantly greater than their default counterparts, although the minimum bandwidth values are broadly comparable. It is evident from these results that auto-tuning the SEG-Y file sorting via contiguous WRITE operation has very significantly increased the bandwidth performance. This is further supported by Figure 14, which plots the auto-tuned bandwidth values against the default values for each hidden layers (h1, h2) nodes configuration used in the ANN models.

It is visible from Table 18 that the improvements do not consistently increase with an increasing number of hidden layer nodes or prediction accuracy. However, the most suitable ANN for the SEG-Y sorting with contiguous WRITE operation scenario is apparently the one with the maximum (256, 128) hidden layers nodes, which shows the second highest 91.4% of improved test cases but the maximum optimized bandwidth performance of 221.8%. Therefore, this specific hidden layers nodes configuration can be chosen by the system to auto-tune and execute SEG-Y file sorting via contiguous WRITE operations.

Table 18 SEG-Y file sorting with contiguous WRITE improvements
Fig. 14

SEG-Y file sorting improvements via contiguous WRITE

4.2.6 Combined SEG-Y file sorting auto-tuning results

In this section, we present the combined auto-tuning improvement results for the SEG-Y file sorting test cases. A user of the ExSeisDat library can run a series of SEG-Y file sorting operations, with both contiguous READ and contiguous WRITE, in sequence or in parallel on an HPC cluster. It is therefore again worth examining the combined file sorting improvements, listed in Table 19. According to these results, the combined prediction accuracy increases with the number of hidden-layer nodes in the ANN model, ranging from 62.7% to 91.4%. The proportion of improved test cases does not increase consistently but ranges from 71.9% to 91.6%, with the (256,128) configuration showing the maximum improvement; in most ANNs, it is above 81%. The overall bandwidth improvement ranges from 98.4% to 237.4%, which is very significant, and the (256,128) configuration again shows the maximum. The worst-performing ANN is the one with the (64,32) hidden-layer configuration, which has the lowest proportion of improved test cases and the lowest overall bandwidth improvement, despite its high combined prediction accuracy. Excluding this case, all other ANNs show a bandwidth improvement above 196.0%. The tuned mean, maximum, median, standard deviation and variance of the bandwidth values are also significantly greater than their default counterparts, although the minimum bandwidth values are somewhat lower for all the ANNs. Overall, the combined auto-tuning results clearly show that the SEG-Y file sorting operations have been significantly optimized in bandwidth performance.

For the series of SEG-Y file sorting operations on the HPC cluster, the most suitable ANN is apparently the one with the largest (256,128) hidden-layer configuration, since it shows the maximum overall SEG-Y IO bandwidth improvement of 237.4% and improves 91.6% of the test cases. Therefore, this hidden-layer nodes configuration in particular can be chosen by the system to auto-tune and execute the series of SEG-Y IO operations in an application.

Table 19 Combined SEG-Y file sorting improvements

5 Discussion

To summarize the work done in this research, the results presented in the preceding section range from the prediction accuracy evaluation to the auto-tuning analysis. This was carried out for each individual SEG-Y IO and file sorting operation type in ExSeisDat on an HPC cluster, subsequently leading to combined statistics. The numbers show the impact of varying the ANN configuration on both prediction accuracy and auto-tuning results, which makes it possible to choose a specific ANN hidden-layer nodes configuration that yields the maximum possible bandwidth performance for SEG-Y data processing. The auto-tuning involves certain parameters responsible for predicting the bandwidth performance. However, the tunable parameters can only be adjusted at runtime depending on the type of SEG-Y data operation being executed, i.e., the file access pattern, file striping configuration, etc. Consequently, the relevant tunable parameters and their value settings are identified separately for each type of SEG-Y operation.

As past research shows, performance prediction and auto-tuning over various parameters have been key elements in improving IO [14,15,16,17]. Since some parameters can be tuned prior to IO execution, predictive modeling for SEG-Y data processing has been one of the crucial requirements in our scenario, toward the auto-tuning design. Consequently, the best ML approach to this predictive modeling has been ANNs, owing to their significant forecasting capability as reported in several studies [19, 21, 23, 32].

Previously, in [9], the main parameters related to MPI-IO, LFS and the SEG-Y file format were identified to generate benchmark bandwidth profiling data. These data were later used for the predictive modeling of ANNs with (256,128) hidden-layer nodes. In this research, however, the benchmark profiling has been re-executed on the identified tunable and non-tunable configurations in Tables 1 and 2. Subsequently, the ANNs listed in Table 4 have been trained with various smaller hidden-layer node configurations, in order to predict the bandwidth of all the SEG-Y IO and file sorting operations using all of these ANNs.

The ANNs have different prediction accuracy values for the SEG-Y bandwidth data, depending on their hidden-layer nodes (h1,h2) configuration. The constructed auto-tuning strategy, using the ANN predictions, is then applied to the default test case settings of the SEG-Y operations, as given in Tables 7 and 8. This has been done to collect statistical bandwidth performance data over the default settings and the corresponding auto-tuned settings for all the ANNs, and subsequently to observe the impact of the (h1,h2) configuration on the bandwidth improvements. The procedure for collecting the statistical data comprises three main functionalities, namely New_Configs(), AutoTuneAndRunSegYOp() and Statistics_Collection_Process(). They are indicated in Fig. 3 as part of the flow of this research, and are also elaborated in Listings 3, 4 and 5.
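The division of labor between these three routines can be sketched as follows. The actual definitions are given in Listings 3–5; the function bodies below are illustrative stand-ins (the dictionary-based configurations, the toy bandwidth model, and the stripe-count parameter are assumptions for the sketch, not the paper's implementation):

```python
def new_configs(predict_bw, candidates):
    """Stand-in for New_Configs(): choose the candidate tunable-parameter
    setting with the highest ANN-predicted bandwidth."""
    return max(candidates, key=predict_bw)

def auto_tune_and_run_segy_op(predict_bw, candidates, run_op):
    """Stand-in for AutoTuneAndRunSegYOp(): pick a tuned configuration,
    then execute the SEG-Y operation with it."""
    return run_op(new_configs(predict_bw, candidates))

def statistics_collection_process(test_cases, predict_bw, candidates, run_op):
    """Stand-in for Statistics_Collection_Process(): record default vs
    auto-tuned bandwidth for every default test case setting."""
    stats = []
    for default_cfg in test_cases:
        default_bw = run_op(default_cfg)            # run with default settings
        tuned_bw = auto_tune_and_run_segy_op(predict_bw, candidates, run_op)
        stats.append((default_bw, tuned_bw))
    return stats

# Toy usage: pretend bandwidth grows with the Lustre stripe count.
candidates = [{"stripe_count": c} for c in (1, 4, 8, 16)]
run_op = lambda cfg: 10.0 * cfg["stripe_count"]     # pretend measured MB/s
predict_bw = lambda cfg: cfg["stripe_count"]        # pretend ANN prediction
stats = statistics_collection_process([{"stripe_count": 1}],
                                      predict_bw, candidates, run_op)
print(stats)   # [(default_bw, tuned_bw)] per test case
```

The key design point reflected here is that the ANN prediction drives the parameter choice before the operation runs, while the statistics collection wraps both the default and the tuned executions so the two bandwidths are directly comparable per test case.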

In the experimental results, the impact of the (h1,h2) nodes configuration on prediction accuracy is first discussed for both the SEG-Y IO and the file sorting ANNs, as shown in Figs. 5 and 8. It can be observed that prediction accuracy increases notably from the (8,4) to the (32,16) configuration, and afterward increases only gradually or stays almost constant up to the (256,128) configuration. This is the general pattern for the predictive modeling of both types of SEG-Y operation. This is followed by a runtime cost analysis of the bandwidth prediction by the ANN, which is computed by the feed-forward propagation pass. It is observed that the parameter selection using the New_Configs() method mostly takes negligible time, on the order of \(10^{-4}\) seconds.
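The runtime cost in question is that of a single forward pass through a network with two hidden layers of (h1,h2) nodes. A minimal sketch of such a pass (the ReLU activation, random weights, and six-feature input are assumptions for illustration, not the paper's trained model; an optimized implementation would be faster than this pure-Python version):

```python
import random
import time

def dense(x, weights, biases, activation):
    """One fully connected layer: y = activation(W.x + b)."""
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

relu = lambda v: max(0.0, v)
identity = lambda v: v

def predict_bandwidth(x, params):
    """Feed-forward pass: input -> h1 nodes -> h2 nodes -> 1 output node."""
    (w1, b1), (w2, b2), (w3, b3) = params
    h1 = dense(x, w1, b1, relu)
    h2 = dense(h1, w2, b2, relu)
    return dense(h2, w3, b3, identity)[0]

def rand_layer(n_out, n_in):
    """Random weights and zero biases for an n_in -> n_out layer."""
    return ([[random.uniform(-0.1, 0.1) for _ in range(n_in)]
             for _ in range(n_out)], [0.0] * n_out)

# A (256,128) network over a hypothetical 6-parameter input vector.
random.seed(0)
params = [rand_layer(256, 6), rand_layer(128, 256), rand_layer(1, 128)]
x = [0.5] * 6

t0 = time.perf_counter()
bw = predict_bandwidth(x, params)
elapsed = time.perf_counter() - t0
print(f"predicted bandwidth: {bw:.4f}, pass took {elapsed:.6f} s")
```

Even at the largest (256,128) configuration, the pass is a few tens of thousands of multiply-adds, which is why the per-candidate prediction cost inside New_Configs() stays negligible relative to the IO operation being tuned.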

Next, the improvements in overall bandwidth performance achieved by the auto-tuning of all the SEG-Y operations are analyzed. The SEG-Y READ, WRITE and combined IO auto-tuning yield maximum overall bandwidth improvements of 97.3%, 602.6% and 108.8%, respectively. Similarly, SEG-Y file sorting via contiguous READ, via contiguous WRITE, and their combined results yield maximum overall bandwidth improvements of 283.9%, 221.8% and 237.8%, respectively. In addition, we have discussed which ANN is the most appropriate for each type of SEG-Y operation. In general, this research has demonstrated the advantage of adapting the configuration settings at runtime on the basis of ML predictions.

6 Conclusion

This research examined the improvements gained by auto-tuning the SEG-Y IO and file sorting operations of ExSeisDat, based on IO bandwidth predictions by ANNs. The paper has demonstrated an adaptable and efficient IO optimization technique as the key requirement, extending the work presented in [9]. The approach aligns with recent research addressing the challenge of poor IO performance in HPC, which is mostly carried out via ML prediction over certain parameters and their subsequent auto-tuning in different scenarios [14,15,16,17,18, 21, 32,33,34,35]. The three key contributions of this research are: 1) the auto-tuning design logic for SEG-Y IO and file sorting operations based on ANN bandwidth performance prediction, 2) the analysis of the impact of the hidden-layer nodes configuration on prediction accuracy, and 3) the statistical analysis of default and auto-tuned configuration test settings over the specified ANNs. As the results show, the maximum overall bandwidth improved by 97.3%, 602.6% and 108.8% for SEG-Y READ, WRITE and combined IO, respectively. For SEG-Y file sorting via contiguous READ, via contiguous WRITE, and their combined results, the overall bandwidth performance improved by up to 283.9%, 221.8% and 237.8%, respectively. This research has thus demonstrated the overall advantage of an adaptive, ANN-prediction-based technique for optimizing SEG-Y operations. As the results make evident, our contributions yield meaningful benefits for SEG-Y (seismic) data IO and sorting optimization, paving the way for efficient runtime IO throughput for ExSeisDat library-based applications on an HPC cluster.

This work can be further extended by applying reinforcement learning to the ANNs and the auto-tuning design, which is not part of this research. This could enable optimization of the SEG-Y operations at runtime without the prior, time-consuming steps of benchmark profiling and ML model training. Furthermore, different performance measures could be analyzed while varying the hidden-layer nodes configuration of the ANNs.