Abstract
Extracting seismic velocities from recorded seismic data requires converting the shot gathers to midpoint gathers, calculating the velocity spectrum, and picking the velocity values. In this paper, we propose to use graph theory to extract the seismic velocity values directly from the midpoint gathers. We use spectral data of a weighted graph model to approximate the seismic velocity. We develop a regression model to predict the seismic velocity from the largest eigenvalue of the graph representing the physical system. The approach is tested on synthetic seismic data that represent a typical near-surface geological situation. The method predicted the seismic velocity of the second layer with 99.4% accuracy.
1 Introduction
Seismic velocity of subsurface layers controls wave propagation across them. Figure 1 shows the geometry of a typical seismic survey across a simple 3-layer model. Although rays emanate from the source along all directions, rays crossing layer boundaries (interfaces) must satisfy Snell's law:
\[\frac{\sin \theta _1}{V_1} = \frac{\sin \theta _2}{V_2}, \qquad (1)\]
where \(\theta _1\) and \(\theta _2\) are the angles of the incident and transmitted rays, respectively, relative to the normal to the interface, while \(V_1\) and \(V_2\) are the velocities of the media in which the incident and transmitted rays propagate, respectively (Sheriff and Geldart 1995).
It can be shown geometrically that the two-way travel time (T) of a ray reflected off any interface is related to the offset (X) between the source and receiver on the surface through the following hyperbolic relation:
\[T^2 = T_0^2 + \frac{X^2}{V^2}, \qquad (2)\]
where \(T_0 = 2H/V\) is the zero-offset two-way time and H and V are the depth and velocity, respectively, from the surface to the interface (Sheriff and Geldart 1995). The topmost layer is generally accessible for direct velocity measurement and satisfies the hyperbolic assumption of equation (2). However, deeper layers generally do not satisfy equation (2) due to ray bending. Velocities of deeper layers are found conventionally by assuming small velocity contrasts between adjacent layers so that the hyperbolic assumption can still be used despite the introduction of small errors. The most commonly used methods for velocity determination are the \(T^2\)-\(X^2\) method and the velocity spectrum method.
The \(T^2\)-\(X^2\) method depends on fitting a line to the \(T^2\)-\(X^2\) data of each reflection and estimating the stacking velocity (\(V_S\)) from the surface to the reflecting interface. The layer's interval velocity can be calculated from the stacking velocities and times at the top and bottom interfaces of the layer using the Dix (1955) formula. However, this method is generally not suitable for seismic exploration data because it requires picking of the travel times. Automated picking may be inaccurate due to the noise commonly present in seismic data, and manual picking by experienced geophysicists is impractical because it is highly time consuming (Yilmaz 2001).
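As an illustration of the line fit described above, the following minimal sketch regresses \(T^2\) on \(X^2\) for a single reflection and reads the velocity from the slope and \(T_0\) from the intercept, following equation (2). The function name `t2x2_velocity` and the noise-free picks are hypothetical and serve only to show the arithmetic; real picks would be noisy.

```python
import numpy as np

def t2x2_velocity(offsets, times):
    """Fit T^2 = T0^2 + X^2 / V^2 as a straight line in the (X^2, T^2) plane."""
    x2 = np.asarray(offsets, dtype=float) ** 2
    t2 = np.asarray(times, dtype=float) ** 2
    # Degree-1 least-squares fit: slope = 1 / V^2, intercept = T0^2.
    slope, intercept = np.polyfit(x2, t2, 1)
    return 1.0 / np.sqrt(slope), np.sqrt(intercept)

# Hypothetical noise-free picks for a reflector at H = 30 m with V = 500 m/s.
H, V = 30.0, 500.0
offsets = np.array([20.0, 40.0, 60.0, 80.0, 100.0])
times = np.sqrt((2.0 * H / V) ** 2 + (offsets / V) ** 2)

v_stack, t0 = t2x2_velocity(offsets, times)
print(f"stacking velocity = {v_stack:.1f} m/s, T0 = {t0:.3f} s")
```

With noisy picks the same fit still applies, but the slope (and hence the velocity) inherits the picking errors, which is the weakness noted above.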
The velocity spectrum method transforms the data in a common midpoint (CMP) gather from the \(T\)-\(X\) domain to the \(T_0\)-\(V_S\) domain. Without picking, a range of stacking velocities is fitted to the \(T\)-\(X\) data and a goodness-of-fit parameter (e.g., semblance) is calculated at each \(T_0\). The stacking velocity corresponding to the largest semblance value indicates a reflection from an interface at this \(T_0\) with this stacking velocity (Taner and Koehler 1969). Once stacking velocities to the top and bottom interfaces of each layer are estimated, the Dix formula can be used to calculate the layer's velocity. Although this method avoids picking, it requires the hyperbolic assumption, which might not be satisfied, especially in near-surface layers. Near-surface layers generally exhibit high vertical velocity contrast due to weathering, causing rays to bend sharply when crossing an interface. Therefore, the hyperbolic assumption generally fails and more accurate methods that honor ray bending are required.
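The Dix conversion mentioned above can be sketched in a few lines. The example assumes that the stacking velocity is well approximated by the root-mean-square (RMS) velocity, which is the usual small-offset assumption; the function name and the two-layer numbers (500 m/s over 1500 m/s, thicknesses 30 m and 50 m) are illustrative choices.

```python
import math

def dix_interval_velocity(v_top, t_top, v_bottom, t_bottom):
    """Interval velocity of a layer from the stacking (RMS) velocities and
    zero-offset two-way times at its top and bottom interfaces (Dix formula)."""
    numerator = v_bottom ** 2 * t_bottom - v_top ** 2 * t_top
    return math.sqrt(numerator / (t_bottom - t_top))

# Illustrative two-layer case: 500 m/s over 1500 m/s, thicknesses 30 m and 50 m.
v1, h1 = 500.0, 30.0
v2, h2 = 1500.0, 50.0
t1 = 2.0 * h1 / v1             # zero-offset two-way time to the first interface
dt2 = 2.0 * h2 / v2            # two-way time spent inside the second layer
t2 = t1 + dt2
# RMS velocity down to the second interface (assumed equal to the stacking velocity).
v_rms2 = math.sqrt((v1 ** 2 * t1 + v2 ** 2 * dt2) / t2)

v_int = dix_interval_velocity(v1, t1, v_rms2, t2)
print(f"interval velocity of layer 2 = {v_int:.1f} m/s")
```

The conversion is exact only when the input velocities are true RMS velocities; stacking velocities measured under strong ray bending deviate from them, which is the limitation discussed above.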
In an alternative approach, physical systems of the form in Fig. 1 can be represented by mathematical objects called graphs, from which properties of the underlying model can be studied using tools of graph theory. One main goal in spectral graph theory is to deduce principal properties and structure of a graph from its spectrum. See for example (Biggs et al. 1976; Tompa 1980), where eigenvalues of graphs were associated with the stability of chemical molecules. Integration of graph theory with stress testing methodologies to assess the resilience of transportation network topology under the influence of environmental hazards was discussed in Aydin et al. (2018).
The use of concepts of graph theory in the study of seismology has been discussed in a number of works. McBrearty et al. (2019) use graph clusters to study the link between arrivals, earthquakes, and source locations. Graph theory was proposed for efficient seismic ray tracing by Moser (1989), where seismic point sources are represented as vertices in a graph and connections between the points are assigned weights representing the travel time of the seismic wave along the connection. In Ferreira et al. (2019), weighted undirected graphs were discussed as instruments to aid seismic interpretation. Tools from graph theory have also been used to identify salt dome boundaries in seismic data and to carry out seismic performance assessment of transport systems using graph theoretical concepts of the underlying network by Khayer et al. (2022) and Malekloo et al. (2022), respectively. For a historical survey on the applications of graph theoretical tools in geosciences, the reader may also consult the review article by Phillips et al. (2015).
In this paper, we develop a method to estimate seismic velocities using concepts of graph theory based on some spectral data of the underlying earth model. The relationships between seismic velocities and eigenvalues of weighted graphs representing the underlying earth models are explored. We develop a regression model for predicting seismic velocities from the largest eigenvalues of graphs of the underlying earth models. The use of the proposed regression model for predicting velocities of previously unseen data is demonstrated and the percentage error of the predictions is discussed as a metric for the model.
This paper is organized as follows. Some preliminary concepts from graph theory are discussed in Sect. 2. In Sect. 3 we discuss the graph theoretic abstraction of the underlying earth model. The data generation approach and regression model are presented in Sect. 4. In Sect. 5 we test the proposed method on synthetic seismic data and some concluding remarks are given at the end of the paper.
2 Preliminaries
In this section, we recall some important concepts and terminology which are of fundamental use in our mathematical model. More details on the notions presented here can be found in Brouwer and Haemers (2011), Chung (1997), and Trudeau (2017).
Definition 1
A graph G is a triple consisting of a vertex set V(G), an edge set E(G), and a relation that associates with each edge two vertices (not necessarily distinct) called its endpoints.
Two vertices connected by an edge are said to be adjacent and they are referred to as neighbors. Let a and b be any two vertices of a graph. If a and b are neighbors, we use the notation \(a \sim b\) to indicate the adjacency of a and b. In addition, the edge between vertices a and b is denoted ab. A graph is said to be directed if all or some of its edges have directions. Such graphs are referred to as directed graphs (digraphs). A graph with no directed edge is said to be undirected.
Definition 2
A weighted graph is one in which every edge is assigned a numerical value called weight.
Definition 3
A loop is an edge whose endpoints are equal. Multiple edges are edges having the same pair of endpoints.
A graph with no loops or multiple edges is called a simple graph. In this paper, we consider simple undirected weighted graphs. For simplicity we shall suppress the use of some of these terminologies and simply refer to our graph models as weighted graphs. Next we discuss two important matrix representations of graphs as well as their spectra.
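Let G be a graph with n vertices. The adjacency matrix A(G) and weighted adjacency matrix W(G) are \(n\times n\) matrices whose entries are specified, respectively, as
\[a_{ij} = \begin{cases} 1, & \text{if } i \sim j, \\ 0, & \text{otherwise,} \end{cases} \qquad (3)\]
and
\[w_{ij} = \begin{cases} n_{ij}, & \text{if } i \sim j, \\ 0, & \text{otherwise,} \end{cases} \qquad (4)\]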
where \(n_{ij}\) is the numerical weight assigned to the edge connecting the vertices represented by rows i and j.
Definition 4
The eigenvalues of a graph are the roots of the characteristic polynomial of its adjacency matrix. All the distinct eigenvalues and their multiplicities give the spectrum of the graph.
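Thus, if a graph has k distinct eigenvalues \(\lambda _i\), \(i = 1, 2, \ldots , k\), each with multiplicity \(m_i\), then the spectrum of the graph is given by
\[\operatorname{Spec}(G) = \begin{pmatrix} \lambda _1 & \lambda _2 & \cdots & \lambda _k \\ m_1 & m_2 & \cdots & m_k \end{pmatrix}. \qquad (5)\]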
Definition 5
The spectral radius of a square matrix is the largest absolute value of its eigenvalues.
For a graph, we can talk about the spectral radius of its matrix representations, for instance, spectral radius of the adjacency or of the weighted adjacency matrix. However, in this paper we shall focus mainly on the weighted adjacency matrix. As such, all through the paper the term spectral radius will be reserved for the weighted adjacency matrix.
To illustrate some of the above notions, consider the graph G in Fig. 2. The vertex and edge sets are \(\{\text {a, b, c, d}\}\) and \(\{\text {ab, ac, ad}\}\), respectively. So we say vertex \(\text {a}\) is a neighbor to vertices \(\text {b}\), \(\text {c}\) and \(\text {d}\), because it has an edge to each of them. The edges \(\text {ab}\), \(\text {ac}\), \(\text {ad}\) are assigned weights 5, 2, 2.5 units, respectively. The edges are undirected and the graph has no loops and no multiple edges.
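Following the definitions of the adjacency matrix, A(G), and weighted adjacency matrix, W(G), given above, and ordering the vertices as a, b, c, d, we have
\[A(G) = \begin{pmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}\]
and
\[W(G) = \begin{pmatrix} 0 & 5 & 2 & 2.5 \\ 5 & 0 & 0 & 0 \\ 2 & 0 & 0 & 0 \\ 2.5 & 0 & 0 & 0 \end{pmatrix}.\]
The eigenvalues of A(G) are \([-1.73, 1.73, 0, 0]\) and those of W(G) are \([-5.94, 5.94, 0, 0]\). Thus, their respective spectra are
\[\operatorname{Spec}(A(G)) = \begin{pmatrix} -1.73 & 0 & 1.73 \\ 1 & 2 & 1 \end{pmatrix}, \qquad \operatorname{Spec}(W(G)) = \begin{pmatrix} -5.94 & 0 & 5.94 \\ 1 & 2 & 1 \end{pmatrix},\]
and the spectral radius of the graph is 5.94.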
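The example can be checked numerically. The short sketch below builds the weighted adjacency matrix of the four-vertex graph of Fig. 2 (vertex order a, b, c, d is our assumption), derives the unweighted adjacency matrix from it, and computes both sets of eigenvalues with NumPy's symmetric eigenvalue routine.

```python
import numpy as np

# Weighted adjacency matrix of the example graph, vertices ordered a, b, c, d.
W = np.array([
    [0.0, 5.0, 2.0, 2.5],
    [5.0, 0.0, 0.0, 0.0],
    [2.0, 0.0, 0.0, 0.0],
    [2.5, 0.0, 0.0, 0.0],
])
# The unweighted adjacency matrix has a 1 wherever an edge (nonzero weight) exists.
A = (W > 0).astype(float)

# eigvalsh is appropriate because both matrices are real and symmetric;
# it returns the eigenvalues in ascending order.
eig_A = np.linalg.eigvalsh(A)
eig_W = np.linalg.eigvalsh(W)
spectral_radius = np.max(np.abs(eig_W))

print("eigenvalues of A(G):", np.round(eig_A, 2))
print("eigenvalues of W(G):", np.round(eig_W, 2))
print("spectral radius of W(G):", round(float(spectral_radius), 2))
```

For this star-shaped graph the nonzero eigenvalues of W(G) are \(\pm \sqrt{5^2 + 2^2 + 2.5^2} = \pm \sqrt{35.25}\), so the spectral radius grows with the edge weights, that is, with the ray-segment lengths in the seismic setting.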
3 The graph theory approach for seismic velocity determination
The system under consideration consists of seismic sources which transmit signals through reflection and transmission points located on interfaces between subsurface layers. The signals are recorded at the ground surface by receivers (i.e., geophones). This can be thought of as a set of points being connected by lines (i.e., rays).
In this section, we describe the mathematical modeling of the physical problem using weighted graphs. We represent the sources, receivers, and reflection/transmission points as vertices and the signals (rays) between them are represented as edges. Thus, there is an edge between two vertices if and only if there is a ray between them. In the physical model, the distance between any two vertices is controlled by the seismic velocity. As such, our model is built to incorporate the physical distance between any two connected vertices by assigning it as a weight to their corresponding edge. Thus, the resulting model is an undirected weighted graph. Figure 2 can be considered as the graph model of a seismic network with four vertices. Vertex \(\text {a}\) is 5 (distance units) away from \(\text {b}\), 2 (distance units) away from \(\text {c}\) and 2.5 (distance units) away from \(\text {d}\), as shown by the weights of the connecting edges. The absence of an edge between any two of the vertices \(\text {b}\), \(\text {c}\) and \(\text {d}\) indicates that there is no direct ray between them.
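The construction described in this section can be sketched as follows: vertices carry (offset, depth) coordinates, each ray segment is an edge, and the edge weight is the Euclidean length of the segment. The helper name `weighted_adjacency` and the three-vertex layout are hypothetical and chosen only to keep the example small.

```python
import numpy as np

def weighted_adjacency(coords, edges):
    """Symmetric weighted adjacency matrix whose entries are the physical
    distances between vertices joined by a ray segment (zero elsewhere)."""
    n = len(coords)
    W = np.zeros((n, n))
    for i, j in edges:
        p = np.asarray(coords[i], dtype=float)
        q = np.asarray(coords[j], dtype=float)
        W[i, j] = W[j, i] = np.linalg.norm(p - q)  # undirected edge
    return W

# Hypothetical layout in metres, given as (offset, depth):
# vertex 0 = source, vertex 1 = reflection point, vertex 2 = receiver.
coords = [(0.0, 0.0), (30.0, 40.0), (60.0, 0.0)]
edges = [(0, 1), (1, 2)]

W = weighted_adjacency(coords, edges)
rho = np.max(np.abs(np.linalg.eigvalsh(W)))
print(W)
print("spectral radius:", round(float(rho), 2))
```

Because the positions of the transmission points depend on the layer velocities through Snell's law, the edge weights, and therefore the spectral radius, change when the velocities change.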
4 Data generation and linear regression model
We now give a description of how the data used to build our model was generated, after which we discuss the predictive model itself.
4.1 Data generation
In the physical model we rearrange the seismic data into a common midpoint (CMP) gather as shown in Fig. 1. Here, we are searching for a raypath that is initiated at the source point (S), reflected at the midpoint (m), and received at the receiver point (R). This raypath must satisfy Snell's law (Eq. (1)). That is, we are looking for the incident angle (\(\theta _1\)) that satisfies the CMP raypath and, at the same time, honors Snell's law. Equation (6) is a rearrangement of Eq. (1) to find the value of \(\theta _1\):
\[\theta _1 = \sin ^{-1}\left( \frac{V_1}{V_2} \sin \theta _2 \right) . \qquad (6)\]
As shown in Eq. (6), the value of \(\theta _1\), which defines the shape of the raypath, is a function of both \(V_1\) and \(V_2\).
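One way to carry out the search for a raypath that honors Snell's law is a bisection on the position of the transmission point. The sketch below is our own illustration rather than the procedure used to generate the paper's data: it assumes flat layers, a reflection at the base of the second layer directly beneath the midpoint, and a hypothetical function name `transmission_offset`.

```python
import math

def transmission_offset(half_offset, h1, h2, v1, v2, tol=1e-10):
    """Horizontal distance from the source to the point where the downgoing ray
    crosses the first interface, for a reflection at the base of layer 2
    located `half_offset` away from the source."""
    def snell_mismatch(x1):
        sin1 = x1 / math.hypot(x1, h1)                            # sin(theta_1)
        sin2 = (half_offset - x1) / math.hypot(half_offset - x1, h2)  # sin(theta_2)
        return sin1 / v1 - sin2 / v2                              # zero when Snell's law holds

    # The mismatch is negative at x1 = 0, positive at x1 = half_offset and
    # increases monotonically in between, so bisection finds the unique root.
    lo, hi = 0.0, half_offset
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if snell_mismatch(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Values taken from the validation model: V1 = 500 m/s, V2 = 1500 m/s,
# H1 = 30 m, H2 = 50 m, with a source-to-midpoint distance of 250 m.
x1 = transmission_offset(250.0, 30.0, 50.0, 500.0, 1500.0)
theta1 = math.atan2(x1, 30.0)
theta2 = math.atan2(250.0 - x1, 50.0)
print(f"x1 = {x1:.2f} m, theta1 = {math.degrees(theta1):.2f} deg, "
      f"theta2 = {math.degrees(theta2):.2f} deg")
```

The upgoing half of the raypath is the mirror image of the downgoing half about the midpoint, so solving for one transmission point fixes the whole path.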
Our major interest is to generate data which can facilitate the prediction/calculation of the seismic velocity \(V_2\) of the second (deeper) layer. For this purpose, we consider a simulation of the physical system to generate synthetic values of the velocity. In the simulation, a value of the velocity, \(V_1\), of the first (upper) layer is assumed. In addition, the distances between sources and receivers as well as the layer thicknesses are assigned. Then, a range of values for \(V_2\) is assumed. These values start at \(V_1 + 10\%\,V_1\), as commonly expected in near-surface layers (e.g., dry over saturated soils, weathered layer over bedrock, etc.), and increase uniformly up to multiples of \(V_1\). These velocities are then considered as reference values which we later attempt to predict.
To achieve our goal of building a predictive model for the seismic velocity \(V_2\), we also need at least one prediction variable (predictor). For this purpose, we turn to the weighted graph model used to abstract the physical system being simulated. In the graph model, the weight of any edge corresponds to the physical distance between the two vertices along the ray represented by the edge. The model is constructed as described in Sect. 3. Following the construction of the weighted graph, we investigate spectral properties of its weighted adjacency matrix.
4.2 The regression model
We begin with a brief description of our data exploration activities. To identify a suitable regression model for estimating the velocities, we first explore the relationship between the values of \(V_2\) and the spectral data (the different eigenvalues and their multiplicities) of the graphs corresponding to the earth models for which data was generated in Sect. 4.1. As expected, it is observed that the spectral radius has the most significant relationship with \(V_2\). Figure 3 shows that there is a polynomial relationship between the velocity at the second layer (\(V_2\)) and the spectral radius of the graphs representing the underlying earth models.
Having identified the spectral radius as a possible predictor for the seismic velocity \(V_2\), we need to find a suitable predictive model. Given that we have an estimation problem with numerical predictor and response variable, we are motivated to study the possibility of a regression relationship. Our exploratory analysis revealed that a quartic polynomial regression gives better data fitting and generalization (i.e., estimation of data not previously used to train the regression model) than its quadratic and cubic counterparts. Consequently, we propose the quartic polynomial regression for computing an estimate \({\hat{v}}_2\) of \(V_2\):
\[{\hat{v}}_2 = \beta _0 + \beta _1 x + \beta _2 x^2 + \beta _3 x^3 + \beta _4 x^4, \qquad (7)\]
where x denotes the spectral radius of the weighted graph representation of the underlying earth model and \(\beta _0, \beta _1, \beta _2, \beta _3\) and \(\beta _4\) are the model parameters to be learned from available data.
The problem of finding the values of the parameters \(\beta _0, \beta _1, \beta _2, \beta _3\) and \(\beta _4\) is a discrete inverse problem for which we need some input and output data x and \(V_2\), respectively. Using the approach discussed in Sect. 4.1, two hundred and ninety one (291) models were generated containing ray path matrices and their corresponding values of \(V_2\) with \(V_1 = 500\ \text {m/s}\), among other information. In particular, the models generated have the values of \(V_2\) (in \(\text {m/s}\)) from \(V_2 = 550\) to \(V_2 = 2000\) with a \(5\ \text {m/s}\) increment, that is \(\{550, 555, 560, \ldots , 2000\}\). Seventy percent (selected randomly) of these generated data are used for training the model while the remaining thirty percent are used for testing purposes.
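The velocity grid and the random 70/30 partition described above can be reproduced in outline as follows. The random seed and the rounding of the training-set size are our own choices; the partition actually used in the paper is the one distributed with the dataset.

```python
import numpy as np

# V2 values from 550 m/s to 2000 m/s in 5 m/s steps (V1 is fixed at 500 m/s).
v2_values = np.arange(550, 2001, 5)

# Random 70/30 split of the model indices; the seed is an arbitrary choice
# made here only so that the sketch is repeatable.
rng = np.random.default_rng(0)
shuffled = rng.permutation(len(v2_values))
n_train = int(round(0.7 * len(v2_values)))
train_idx, test_idx = shuffled[:n_train], shuffled[n_train:]

print(len(v2_values), "models:", len(train_idx), "training,", len(test_idx), "testing")
```

Splitting by index rather than by value keeps each \(V_2\) paired with the spectral radius of its own earth model when both arrays are subset with `train_idx` and `test_idx`.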
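To recover the model parameters in the regression model (7) from the training dataset, we form an equation for each entry in the dataset by putting the value of x and the actual value of \(V_2\) in the regression equation. If we have a dataset of size l, then we end up with a system of equations given in the matrix form
\[X \beta = v, \qquad (8)\]
where
\[X = \begin{pmatrix} 1 & x_1 & x_1^2 & x_1^3 & x_1^4 \\ 1 & x_2 & x_2^2 & x_2^3 & x_2^4 \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & x_l & x_l^2 & x_l^3 & x_l^4 \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta _0 \\ \beta _1 \\ \beta _2 \\ \beta _3 \\ \beta _4 \end{pmatrix}, \quad v = \begin{pmatrix} v_{2,1} \\ v_{2,2} \\ \vdots \\ v_{2,l} \end{pmatrix},\]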
\(x_i\) is the ith entry of x (the ith spectral radius) in the training dataset and \(v_{2,i}\) is the ith value of \(V_2\) in the training dataset. Solving the above system by least squares, we get the best possible approximation of the model parameters for fitting the given data. The values recovered are
The whole dataset as well as the training and testing partitions can be found on GitHub; see the data availability statement for the URL. In addition, there are two zipped folders, one containing the generated models in .mat format and the other containing the extracted ray path matrices and velocities in Excel format.
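The least-squares recovery of the parameters of the regression model (7) can be sketched with NumPy. The data below are a made-up stand-in (a known quartic evaluated on an arbitrary range of x) used only to show that the fit returns the coefficients that generated them; they are not the paper's spectral radii or velocities, and the function names are hypothetical.

```python
import numpy as np

def fit_quartic(x, v2):
    """Least-squares estimate of (beta_0, ..., beta_4) in the quartic model."""
    # Design matrix with columns 1, x, x^2, x^3, x^4.
    X = np.vander(np.asarray(x, dtype=float), N=5, increasing=True)
    beta, *_ = np.linalg.lstsq(X, np.asarray(v2, dtype=float), rcond=None)
    return beta

def predict_quartic(beta, x):
    """Evaluate the fitted quartic at the predictor values x."""
    X = np.vander(np.asarray(x, dtype=float), N=5, increasing=True)
    return X @ beta

# Hypothetical stand-in data: a known quartic sampled without noise.
true_beta = np.array([100.0, 50.0, -20.0, 5.0, 1.0])
x_train = np.linspace(1.0, 3.0, 50)
v2_train = predict_quartic(true_beta, x_train)

beta_hat = fit_quartic(x_train, v2_train)
print(np.round(beta_hat, 6))
```

When the predictor spans a wide numerical range, the powers of x can make the design matrix poorly conditioned; rescaling x before fitting is a common remedy, although whether this was needed for the paper's data is not stated.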
In our computations the percentage error (residual), denoted \(P_e\), in using the regression model (7) to estimate the velocity \(V_2\) is computed by
\[P_e = \frac{|V_2 - {\hat{v}}_2|}{V_2} \times 100\%. \qquad (9)\]
In Fig. 4, we present the results obtained when the regression equation (7) is applied to the data used for training the model (learning the parameters of the regression model). The plot on the left shows the matching of the target and estimated values while the plot on the right shows the percentage error (9). It is seen from the figures that the estimates obtained from the regression model match the true values of \(V_2\) in the training data and the observed percentage error is largely smaller than \(4\%\).
In Fig. 5, we present the results obtained by applying the regression model to another dataset, different from the one used to learn the parameters of the model. Such a dataset is called test data. It is seen that the model also performs well in estimating the target (\(V_2\)) in the previously unseen data. The plot on the left shows that the estimates obtained from the model match the target values reasonably well, and the plot on the right reveals that the percentage error in the estimation is also largely below \(4\%\). These observations highlight the ability of the model to generalize, which supports its usefulness for future predictions.
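The error metric used in these comparisons can be computed elementwise over a set of targets and estimates. The sketch assumes that the percentage error is the absolute difference relative to the true value; the arrays are hypothetical numbers chosen for illustration, not results from the paper.

```python
import numpy as np

def percentage_error(v_true, v_est):
    """Absolute percentage error of each estimate relative to its true value."""
    v_true = np.asarray(v_true, dtype=float)
    v_est = np.asarray(v_est, dtype=float)
    return 100.0 * np.abs(v_true - v_est) / v_true

# Hypothetical targets and estimates in m/s.
v_true = np.array([600.0, 1000.0, 1500.0])
v_est = np.array([612.0, 990.0, 1509.0])

pe = percentage_error(v_true, v_est)
print(np.round(pe, 2))              # per-model percentage error
print("all below 4%:", bool(np.all(pe < 4.0)))
```

Reporting the maximum or a high percentile of this quantity over the test partition gives a single number that summarizes how well the model generalizes.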
5 Model validation
A synthetic seismic dataset is generated to test the proposed method, where the true velocity model consists of 3 layers. The velocities of the 1st, 2nd, and 3rd layers are 500 \(\text {m}/\text {s}\), 1500 \(\text {m}/\text {s}\), and 2200 \(\text {m}/\text {s}\), respectively, while the thicknesses of the first and second layers are 30 \(\text {m}\) and 50 \(\text {m}\), respectively. Three common shot gathers are generated at offsets 0, 100, and 200 \(\text {m}\) located at the ground surface. Then the CMP gather located at offset 250 \(\text {m}\) is extracted, using the three receivers at offsets 300, 400, and 500 \(\text {m}\) (Fig. 6).
To process the data, we calculated both \(V_1\) and the thickness of the first layer from the recorded shot gathers, then we applied the graph theory approach to the raypath model. Here, source points \(\text {S}_1\) to \(\text {S}_3\), receiver points \(\text {R}_1\) to \(\text {R}_3\), transmission points \(\text {r}_1\) to \(\text {r}_6\), and reflection point M are considered as vertices, while the raypaths \(\text {S}_1\)-\(\text {r}_1\), \(\text {r}_1\)-M, M-\(\text {r}_6\), etc. are considered as edges. The locations of the vertices \(\text {S}_1\) to \(\text {S}_3\) and \(\text {R}_1\) to \(\text {R}_3\) and the offset of M are fixed, while the locations of the vertices \(\text {r}_1\) to \(\text {r}_6\) and the depth of M depend on the velocities \(V_1\) and \(V_2\) and thicknesses \(\text {H}_1\) and \(\text {H}_2\). Since we usually know the value of \(V_1\) from the direct-wave data, graph theory will be used to find \(V_2\).
In a real situation, the locations of \(\text {r}_1\) to \(\text {r}_6\) are unknown, so we assume all possible locations, taking into consideration the following:
(1) Offset values of \(\text {r}_1\) to \(\text {r}_3\) are less than the offset value of M.
(2) Offset values of \(\text {r}_4\) to \(\text {r}_6\) are larger than the offset value of M.
(3) \(\text {r}_1\) to \(\text {r}_6\) are arranged in ascending offset (i.e., the offset of \(\text {r}_2\) is larger than the offset of \(\text {r}_1\), etc.).
Following this procedure, we generated a set of \(V_2\) values using the graph theory approach by calculating the spectral data for each attempted model and using our regression model (7) to estimate the corresponding \(V_2\) value. We find the correct \(V_2\) value by calculating the total travel times from \(\text {S}_1\) to \(\text {R}_1\), \(\text {S}_2\) to \(\text {R}_2\), and \(\text {S}_3\) to \(\text {R}_3\) for each attempted model. The calculated times are then compared to their corresponding \(\text {S}_1\) to \(\text {R}_1\), \(\text {S}_2\) to \(\text {R}_2\), and \(\text {S}_3\) to \(\text {R}_3\) travel times measured from the data. The model with the minimum error between the calculated and measured travel times corresponds to the correct \(V_2\) value.
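The selection step, choosing the candidate whose calculated travel times best match the measured ones, can be illustrated with a simplified stand-in. Instead of enumerating transmission-point locations and applying the regression model (7), the sketch below scans candidate \(V_2\) values directly and traces the Snell-consistent ray for each. It assumes flat layers, a known \(\text {H}_2\), source-receiver separations of 100, 300 and 500 m inferred from the acquisition geometry, and "measured" times simulated with the same forward model; all function names are hypothetical.

```python
import math

def reflection_time(offset, h1, h2, v1, v2, tol=1e-10):
    """Two-way time of the ray reflected at the base of layer 2 beneath the
    midpoint, with the transmission point found by bisection on Snell's law."""
    half = 0.5 * offset

    def mismatch(x1):
        sin1 = x1 / math.hypot(x1, h1)
        sin2 = (half - x1) / math.hypot(half - x1, h2)
        return sin1 / v1 - sin2 / v2

    lo, hi = 0.0, half
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mismatch(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    x1 = 0.5 * (lo + hi)
    one_way = math.hypot(x1, h1) / v1 + math.hypot(half - x1, h2) / v2
    return 2.0 * one_way

def best_v2(candidates, offsets, measured, h1, h2, v1):
    """Candidate V2 with the smallest squared travel-time misfit."""
    def misfit(v2):
        return sum((reflection_time(x, h1, h2, v1, v2) - t) ** 2
                   for x, t in zip(offsets, measured))
    return min(candidates, key=misfit)

# Geometry of the validation model; the offsets are an inferred assumption.
offsets = [100.0, 300.0, 500.0]
h1, h2, v1 = 30.0, 50.0, 500.0
# Stand-in for travel times picked from data: simulated with V2 = 1500 m/s.
measured = [reflection_time(x, h1, h2, v1, 1500.0) for x in offsets]

candidates = [550.0 + 5.0 * k for k in range(291)]   # 550, 555, ..., 2000 m/s
v2_hat = best_v2(candidates, offsets, measured, h1, h2, v1)
print("selected V2 =", v2_hat, "m/s")
```

With real data the measured times carry picking and modeling errors, so the minimum misfit is nonzero and the selected value need not coincide exactly with the true velocity.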
To validate the proposed technique, we generated three common shot gathers representing \(\text {S}_1\), \(\text {S}_2\), and \(\text {S}_3\) shown in Fig. 6. The velocity model shown in Fig. 6 is used as the true velocity model. Figure 7 shows one shot gather example of the data generated using a finite-difference solution to the acoustic wave equation. Figure 8 shows the common midpoint gather extracted from the three generated shot gathers to test the graph theory approach. The value of \(V_2\) calculated using the graph theory approach is 1509 \(\text {m}/\text {s}\) while the true value is 1500 \(\text {m}/\text {s}\), which is an error of only 0.6%. This result reflects the accuracy of the proposed graph theory approach for finding seismic velocities in similar near-surface settings.
6 Conclusion
We presented a novel approach for estimating seismic velocities in common near-surface settings using graph theory. The proposed approach starts by defining the locations of sources, receivers, transmission and reflection points as vertices and ray segments between them as edges. Moreover, the coordinates (depths and offsets) of these vertices are used to calculate weights for the edges. This construction results in an undirected weighted graph model for which spectral data can be calculated. Then we performed some exploratory data analysis on a common near-surface setting to uncover possible relationships between the spectral data and seismic velocities. Based on empirical evidence, it was observed that the largest eigenvalues are the most suitable predictors for the seismic velocity. In fact, the velocity \(V_2\) was observed to be approximately a quartic polynomial of these eigenvalues. Furthermore, we tested the proposed method on synthetic seismic data involving a simple near-surface velocity model. The proposed method was able to predict the velocity of the second layer with a high accuracy of \(99.4\%\), which supports its suitability for estimating subsurface seismic velocities in such settings.
Data availability
The datasets generated and analyzed in this study are accessible via GitHub: https://github.com/IbrahimSarumi/Seismic_VelocityEigenvalues.
References
Aydin, N.Y., Duzgun, H.S., Wenzel, F., Heinimann, H.R.: Integration of stress testing with graph theory to assess the resilience of urban road networks under seismic hazards. Nat. Hazards 91, 37–68 (2018)
Biggs, N.L., Lloyd, E.K., Wilson, R.J.: Graph Theory 1736–1936. Clarendon Press, Oxford (1976)
Brouwer, A.E., Haemers, W.H.: Spectra of Graphs. Springer, New York (2011)
Chung, F.R.K.: Spectral Graph Theory. American Mathematical Society, Rhode Island (1997)
Dix, C.H.: Seismic velocities from surface measurements. Geophysics 20(1), 68–86 (1955)
Ferreira, R.S., Brazil, E.V., Silva, R., Cerqueira, R.: Seismic graph analysis to aid seismic interpretation. Interpretation 7(3), SE81–SE92 (2019)
Khayer, K., Roshandel-Kahoo, A., Soleimani-Monfared, M., Kavoosi, K.: Combination of seismic attributes using graph-based methods to identify the salt dome boundary. J. Petrol. Sci. Eng. 215, 110625 (2022)
Malekloo, A., Ozer, E., Ramadan, W.: Bridge network seismic risk assessment using ShakeMap/HAZUS with dynamic traffic modeling. Infrastructures 7(10), 131 (2022)
McBrearty, I.W., Gomberg, J., Delorey, A.A., Johnson, P.A.: Earthquake arrival association with backprojection and graph theory. Bull. Seismol. Soc. Am. 109(6), 2510–2531 (2019)
Moser, T.J.: Efficient seismic ray tracing using graph theory. SEG Technical Program Expanded Abstracts, pp. 1106–1108 (1989)
Phillips, J.D., Schwanghart, W., Heckmann, T.: Graph theory in the geosciences. Earth Sci. Rev. 143, 147–160 (2015)
Sheriff, R., Geldart, L.: Exploration Seismology. Cambridge University Press, Cambridge (1995)
Taner, M.T., Koehler, F.: Velocity spectra-digital computer derivation applications of velocity functions. Geophysics 34(6), 859–881 (1969)
Tompa, M.: Time-space tradeoffs for computing functions, using connectivity properties of their circuits. J. Comput. Syst. Sci. 20(2), 118–132 (1980)
Trudeau, R.J.: Introduction to Graph Theory. Parker Pub. Co., West Nyack (2017)
Yilmaz, O.: Seismic Data Analysis: Processing, Inversion, and Interpretation of Seismic Data. Society of Exploration Geophysicists, Tulsa (2001)
Acknowledgements
We thank the reviewer for the valuable suggestions and comments.
Funding
The authors appreciate funding of this research by the College of Petroleum Engineering and Geosciences in King Fahd University of Petroleum and Minerals through startup grant number SF 18060.
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Alfuraidan, M.R., Al-Shuhail, A., Hanafy, S.M. et al. Approximation of seismic velocities from the spectrum of weighted graphs. Int J Geomath 14, 5 (2023). https://doi.org/10.1007/s13137-023-00214-z
Keywords
 Seismic velocity
 Seismic migration
 Weighted graph
 Graph spectral radius
Mathematics Subject Classification
 86-08
 86A15
 05C22
 05C50
 05C90