Efficient 2D multiple attenuation using SRME with curvelet-domain subtraction

Surface Related Multiple Elimination (SRME) usually suffers the issue of either over-attenuation that damages the primaries or under-attenuation that leaves strong residual multiples. This dilemma happens commonly when SRME is combined with least-squares subtraction. Here we introduce a more sophisticated subtraction approach that facilitates better separation of multiples from primaries. Curvelet-domain subtraction transforms both the data and the multiple model into the curvelet domain, where different frequency bands (scales) and event directions (orientations) are represented by a finite number of curvelet coefficients. When combined with adaptive subtraction in the time–space domain, this method can handle model prediction errors to achieve effective subtraction. We demonstrate this method on two 2D surveys from the TAiwan Integrated GEodynamics Research (TAIGER) project. With a careful parameter determination flow, our result shows curvelet-domain subtraction outperforms least-squares subtraction in all geological settings. We also present one failed case where specific geological condition hinders proper multiple subtraction. We further demonstrate that even for data acquired with short cables, curvelet-domain subtraction can still provide better results than least-squares subtraction. We recommend this method as the standard processing flow for multi-channel seismic data.


Introduction
Multiple attenuation is a critical step in marine seismic data processing. Successful multiple elimination can bring significant uplift in the quality of the final seismic stack. In general, multiple attenuation comprises two steps: multiple prediction and adaptive subtraction. Surface Related Multiple Elimination (SRME) (Verschuur et al. 1992) combined with L2-norm based least-squares subtraction has been a widely-adopted processing flow. This flow removes multiples effectively, but can sometimes lead to over attenuation due to crossing or overlapping of primary and multiple events (hereby primaries and multiples) (Hung and Wu 2015). Therefore, compromise has to be made between primary preservation and multiple attenuation.
For a better separation between primaries and multiples, Herrmann et al. (2007) first proposed a multiple subtraction approach in the curvelet domain following the development of fast discrete curvelet transform by Candes et al. (2006). Curvelets are small wave-like signals with multi-scale and multi-directional characteristics. With these characteristics, curvelet transform provides an appropriate representation of the band-limited reflection seismic data with optimal sparsity and efficient reconstruction capability (Hennenfent and Herrmann 2006). Existing curvelet-domain subtraction methods can be categorized into non-adaptive (Herrmann et al. 2007;Saab et al. 2007) and adaptive implementations (Herrmann et al. 2008;Neelamani et al. 2010;Hung and Wu 2015). Adaptive methods allow a certain prediction errors in the multiple models and therefore usually outperforms nonadaptive methods, though at the cost of higher computation time. So far most examples of curvelet-domain subtraction in published literature are from industrial applications. To promote this method in marine geology research, its performance and efficiency on data acquired through academic 1 Page 2 of 13 research vessels (mainly single-cable acquisition with short offsets) needs to be investigated.
In this study we aim to test the curvelet-domain subtraction method on 2D research data acquired nearby offshore Taiwan. In particular, we would like to examine the performance of this approach over different geological regimes. For a comprehensive understanding of this method, we organize this paper as follows. In Methodology, we first provide basic introduction for SRME and conventional least-squares subtraction method, followed by detailed methodology for the curvelet-domain subtraction method. In Field Data Application, we carry out tests to demonstrate the influence of key parameters including starting scales and sparsity constraints. With the optimized choice of parameters, we present the comparison of subtraction results between least-squares method and curvelet-domain method for selected geological regimes. In Discussion, we discuss the limitation of this method and the applicability on short-cable data, followed by Conclusion. Our goal is for geologists and geophysicists in the academia to have a comprehensive view of the performance and limits of this approach over research-level data.

Multiple modeling
Surface-related multiples are the dominant multiples in most seismic data. The term "surface" indicates multiples with at least one downward reflection at the water surface. Based on the work of Verschuur et al. (1992), these multiples can be estimated by convolving common-shot gathers and commonreceiver gathers to mimic the multiple reflections. This datadriven method requires no information about either the water bottom or the subsurface structures, and works better for multi-channel seismic data with large offsets. In such modeling, frequency, phase and arrival-time differences from the recorded data may arise due to the acquisition geometry such as finite cable length, cable feathering and spatial sampling (Naidu et al. 2013), and/or complex seafloor topography. The topography factor may severely impact the modeling accuracy especially for 2D surveys. Hence, to subtract multiple models from the data properly, adaptation is always needed.

Least-squares subtraction
One of the most common image subtraction approaches is least-squares subtraction. This approach determines a Wiener matching filter f LS in a least-squares sense by minimizing the following objective function (Berkhout and Verschuur 1997): where D is the recorded field data and M is the multiple model. The matching filter is then applied back to M to minimize the total energy in the data after. Least-squares subtraction hinges on the assumption that primaries are orthogonal or uncorrelated to multiples, and hence the post-subtraction image should have the minimal energy. This assumption, however, is usually not satisfied in the field data due to the crossing or overlapping between the multiples and primaries.
The outcome of such similarity is either over-attenuation that hurts the primaries or under-attenuation that leaves strong residual multiple energy.

Curvelet-domain subtraction
Curvelet transform is a higher dimensional generalization of wavelet transform to deal with edge-related image processing problems. Wavelet transform has known to be inefficient for curve geometry due to its isotropic characteristics. Curvelet transform on the other hand introduces an ensemble of scale-orientation coefficients that allows a better representation of curves and curve discontinuities of various directions. Currently the most common digital implementation is the Fast Discrete Curvelet Transform, available in the software package CurveLab (http:// www. curve let. org/). This implementation decomposes the field data D(t, x) in a two-dimensional space R 2 into the sum of multiple curvelet coefficients d(j, k, l) described by index j (scale, frequency bands), l (orientation) and k (spatial position) ( Fig. 1) (Candès et al. 2006): where ℂ is the curvelet transform operator, and j,k,l (t, x) is the curvele basis at location k of a particular scale j and orientation l. In the T-X (time-space) domain, the curvelet bases mimic segments of zero-phased seismic signals with two main side lobes ( Fig. 1b-e). When the scale number increases, the frequency content in the curvelet basis also increases. In the curvelet domain, each curvelet coefficient d (j, k, l) can be viewed as a 2D image d jl k x , k y (such as the light blue rectangles in Fig. 2). The value at each k x , k y pixel represents the strength of the curvelet d jl at a particular position in the T-X domain, such as the example in Fig. 1e. For more details about the numerical formulation of curvelet basis j,k,l (t, x) , the reader may refer to Candès et al. (2006).  To apply curvelet transform on multiple subtraction, we consider the following objective function for adaptive curvelet-domain subtraction (Hung and Wu 2015) where p and m denote the primaries and multiples in the curvelet domain and p = d − m , ℂ −1 denotes the inverse curvelet transform, f LS is jointly obtained from (1) and is the weight for L2-norm term determined by the data noise level (Hung and Wu 2015). w 1 and w 2 are the weighs for the L1-norm terms determined by initial coefficient amplitudes, defined as (Hung and Wu 2015) where 1 and 2 are two constants for sparsity constraints. Larger values indicate higher weights for the amplitude of curvelet coefficients and hence suppress the contribution from those with smaller amplitudes. The best estimates p and m are solved by an iterative thresholding algorithm for linear inverse problems with sparsity constraints (Daubechies et al. 2004).
To account for model inaccuracies and noises at different scales, Eq. (3) can be adapted separately at each scale, forming a frequency-regularized adaptive curvelet domain subtraction (Hung and Wu 2015). The objective function (3) therefore becomes where n is the total number of scales, and j s is the starting scale above which we wish to carry out the adaptation and subtraction. This expression allows us to define a scale above which the predicted multiples start to show similar patterns as the observed multiples in the curvelet domain (Fig. 2). This is the key advantage of curvelet-domain subtraction: the treatment of multiples can be naturally frequency-dependent. Note that the choice of total number of scales and the number of directional wedges (Fig. 1a) is arbitrary, although the length and width of each wedge still have to obey the parabolic scaling law (Candès et al. 2006). Finer partitioning in scales allows a more sophisticated differentiation of primary and multiple frequencies. Finer partitioning in wedges allows a better separation of features with minor dip angle differences. These differentiations are at the cost of longer processing time. (

Field data application
We demonstrate the performance of curvelet-domain subtraction by comparing its application result with that from least-squares subtraction. We use two 2D datasets from offshore southwestern Taiwan, MGL0905-8 and MGL0905-10, acquired during the TAiwan Integrated GEodynamics Research (TAIGER) project ). These two surveys were collected by R/V Marcus G. Langseth in 2009 with a 6-km-long streamer of 468 channels, 12.5-m receiver interval and 50-m shot interval. Listening time between each shot is 15 s, and sampling rate is 2 ms. These two lines cover water depth from 675 to 3,000 m (roughly equivalent to 900 ms to 4,000 ms in two way travel time), from upper continental slope to deep sea abyssal in the northern South China Sea (Fig. 3). Previous processing results using SRME with conventional least-squares subtraction for other lines in the TAIGER project can be found in Lester and McIntosh (2012). Geological interpretations of the TAIGER dataset have also been reported in Yeh et al. We pre-process the data in the CGG geovation system with the following steps: low-cut filtering, swell/linear/ interference noise attenuation, debubble, zero-phasing, and receiver deghosting (broadband processing). Before Fig. 3 Location of MGL0905-8 and MGL0905-10 surveys from the TAIGER project. Inset: The basic tectonic setting of offshore southwestern Taiwan modeling for multiples, we interpolate one additional shot gather in between two existing shots in order to increase the spatial sampling and to avoid spatial aliasing in the common receiver domain. The shot gathers and common receiver gathers are then convolved to form the multiple model.
Least-squares subtraction is executed in the common-shot domain (usually so to avoid harsh subtraction). We test a few different window sizes for filter design. In general, the smaller the design window size, the more likely the subtraction touches the primaries (Fig. 4). Finally we choose a filter design window of 180 ms × 21 traces for MGL0905-8 and 256 ms × 21 traces for MGL0905-10 based on visual judgement of a balance between primary retention and multiple attenuation in the common-channel domain. Curveletdomain subtraction is executed in the common-channel domain for better geometric separation between primaries and multiples. We first transform both the data and multiple models into 10 curvelet scales. We then carry out tests to examine the effect of different starting scales ( j s ) and sparsity constraints ( i ).
After multiple subtraction, we carry out pre-migration stacking with the NMO velocity and a 35-deg angle mute. Although the stacks are not migrated, they offer an easy and quick view for us to examine the effect of different subtraction approaches. Below we first present details of the tests (Test of key parameters), followed by comparison of subtraction results (Comparison of multiple subtraction results). We organize the results based on their geological settings, from anticlines within the accretionary wedge, to sediment pack on the passive continental slope, and finally to basement structures in the abyssal region.

Test of key parameters
We carry out tests with the combination of two key parameters, the starting scale ( j s ) for subtraction and the sparsity constraints ( i ). Although both 1 and 2 control the sparsity of curvelet coefficients, we find in our tests that 1 has a dominant effect on the resulting primaries, whereas 2 has only little or marginal impacts. Larger 1 value up-weights the high-amplitude coefficients and down-weights the lowamplitude coefficients for the post-adpated multiples models (Eq. (4)). It also gives noise-attenuation effect similar to applying a higher amplitude threshold on the curvelet coefficients (Gorszczyk et al. 2014). The starting scale j s indicates the scale at which the subtraction starts to take effect, while the coefficients with scales smaller than j s remain unchanged.
Among different parameter combinations, we see sections overly cleaned and dominated by low frequency at small j s and large 1 (Fig. 5c). It means the coefficients at scales larger than 6 are all used for subtraction with a strong sparsity constraint. Its frequency spectrum shows a peak between 5 and 20 Hz, with frequencies above 20 Hz suppressed (Fig. 5d). On the other hand, when j s is large and 1 is small, the section still contains obvious multiples (Fig. 5i). It means the starting scale of 8 bypasses some frequency bands where multiple signals reside. In this case, raising the 1 value does not help much with the demultiple efficacy ( Fig. 5i-k). At the parameter set we picked [ j s = 7 , 1 = 10 1 ] (Fig. 5f), we find the multiples relatively cleaned, and there is no significant loss of energies at medium-to-high frequency bands (Fig. 5h). Actually, the set [ j s = 6, 1 = 10 1 ] (Fig. 5b) was once considered, but the tiny frequency notch it creates around 20-25 Hz makes it less preferred (Fig. 5d). The parameter set [ j s = 7 , 1 = 10 1 ] is thus a conservative choice, which leaves room for residual multiple attenuation such as Radon in following steps.

Anticlines within the accretionary wedge
The eastern end of both MGL0905-8 and MGL0905-10 cuts across the accretionary wedge associated with the subduction/onset collision of the South China Sea plate underneath/with the Philippine Sea Plate (Shyu et al. 2005;Eakin et al. 2014;McIntosh et al. 2014) (Fig. 3). At the accretionary wedge, the rugged seafloor geometry and the associated multiples usually obscure the deeper image of the anticlines (Figs. 6a and 7a). After applying least-squares multiple subtraction (Figs. 6b and 7b), we still observe a substantial amount of residual multiple energy, forming a shade-like feature that shrouds the actual events underneath. In comparison, curvelet-domain subtraction allows a natural continuum of reflector amplitude in the entire section (Figs. 6c and 7c). The energy of the primaries is also better preserved in the curveletdomain subtraction. Now we can distinguish different anticlinal features from the two survey lines. In Fig. 6c, strong primaries are observed nearby the anticline, but the core region contains no coherent signals with strong amplitude. In Fig. 7c, strong events are identifiable within the core region of the anticline. These events intersect with the first multiples at an angle (Fig. 7a), and hence we interpret them as real events instead of multiples of the shallow layers right below the seafloor.
We can also examine these two subtraction results by looking at curvelet coefficients in the frequency plane (F-K plane) (lower panel of Figs. 6 and 7). Least-square subtraction retards energies in nearly all scales and all orientations, while curvelet-domain subtraction has the advantage of selective treatment of different scales and orientations. Amplitude of the primaries is therefore better preserved. Figure 8 shows a section that cuts across the continental slope around the northeastern margin of South China Sea. The seafloor also shows a rugged geometry due to erosion by submarine canyons. Least-squares subtraction removes the multiples successfully, but the energy of the primaries also reduces significantly due to the subtraction (Fig. 8b). Curvelet-domain subtraction again preserves more of the primary energy and allows structural interpretation (Fig. 8c  and d). Now we can identify the gently-deformed parallel layers between 5 and 7 s, as well as some rounded, more Test results for different starting scales ( j s ) and sparsity constraints ( 1 ). The histogram at the end of each row compares the spectrums from sections with the same j s but different 1 values. See Fig. 7c for location of this test section deformed features below. We can even observe onlap relationship between the parallel events and the top of the rounded features. At around 8-9 s, the second multiples are also successfully removed. Figure 9 shows a section across the abyssal plain in northern South China Sea. The seafloor is smooth and flat down at this deep-sea area. Below the smooth and parallel strata, we can observe basement structures starting from 5 s. The deeper part of the basement (below 7 s) however remains unclear after applying least-squares subtraction. In curvelet-domain subtraction, we can observe some strong reflectors after removing the multiples. In order to clarify whether these are true geological features, we mark the time segments between the seafloor and the basement structures ( Δt 1 and Δt 2 in Fig. 9a). We shift the time segments downward so that the top of the segments align with the beginning of the first multiples ( Δt � 1 and Δt � 2 ). On the left side of this section, the bottom of Δt � 1 segment matches the multiple of the basement structure. In leastsquares subtraction, the energy of this feature is attenuated, although some residual energies leave a shade-like feature. In curvelet-domain subtraction, this multiple is properly removed without much residual energy left. On the right side of this section, the bottom of Δt � 2 does not reach the strong reflectors below. After curvelet-domain subtraction, these strong reflectors appear continuous from left to right, suggesting the existence of deep structures underneath the basements at 9-10 s.

Limitation of the method
Based on the examples we presented, SRME with curveletdomain subtraction seems successful in most places, yet this method still has its own limitation. Figure 10 shows a section where neither least-squares subtraction nor curvelet-domain subtraction can successfully separate the multiples from the primaries (Fig. 10b -c). This section is close to the easternmost end of MGL0905-10 (Fig. 3). Even after applying harsh attenuation parameters, residual multiples can still be identified in both subtraction results but with different frequency contents (Fig. 10e and f).
We investigate the causes of this failure of SRME demultiple approach. The first plausible reason is acquisition geometry, in particular cable feathering and hence out-ofplane energy during acquisition. The largest cable feathering of MGL0905-10 is ~ 2-km at the easternmost end of the survey (Fig. 11b). Nevertheless, the section in Fig. 10 actually has relatively small feathering (< 100 m). In addition, the example in Fig. 7 has a stronger feathering (up to 500 m) than that in Fig. 10, yet its SRME demultiple result is still satisfactory. Therefore we think cable feathering may not be the main cause for this SRME failure.
Another possible explanation is the shallower water depth as compared to the example in Fig. 7. However, the water depth here is mostly deeper than 1 s (750 m), and the water bottom reflection is still present. The presence of distinct water bottom reflections fulfills the basic requirements of multiple prediction in SRME, which is different from the case of a shallow-water survey with water depth less than 200 m. Actually, the SRME multiple model mimics the data multiples reasonably well (Fig. 10d vs. a). This similarity implies that even with a larger topography slope angle (9.3°) within this section than that in Fig. 7 (5.7°-5.9°), the datadriven multiple prediction method can still function properly.
The discussion above leads to the last possible explanation: the primaries and the multiples may share similar frequency contents and dipping angles within this section of the survey. The situation may have been complicated by layers of strong reflectivity in the shallow section (Fig. 10a), which reflect back most of the seismic energies and leave little to reach the greater depth. In this case, harsher leastsquares subtraction would only lead to a more transparent deeper section. Harsher curvelet-domain subtraction has the potential of separating the low-frequency primaries from the high-frequency multiples, although low-frequency multiples with dipping angles similar to the primaries may still linger and confound the geological interpretation.

Applicability on short-offset data
As mentioned in Introduction, one of the main purposes of this study is to evaluate the performance of curveletdomain subtraction for data acquired with research vessels. The TAIGER dataset is unique in that the survey is acquired by a 6-km-long cable, which is rare for most academic research-level datasets. With that in mind, we carry out an additional SRME and curvelet-domain subtraction test with a data subset of the first 0.6-km offsets from the MGL0905-10 survey. During the test, we re-model the multiples with the data subset, re-run the subtraction and the stack. There are two major differences between this 0.6-km-offset data subset and a real short-cable acquisition. One is the velocity model: here we still adopt the NMO velocity picked based on the full 6-km-offset data. We did not re-pick the NMO velocity for the data subset in order to leave out the difference caused by different velocity models. The second major difference would be the source: short-cable acquisitions are usually accompanied with energy sources of higher frequencies. High-frequency sources yield data of higher resolutions, and hence the focus is usually at the shallow depth. We therefore caution that the best starting scales ( j s ) and sparsity constraint ( 1 ) in a real short-cable survey could be different from what we choose here ( j s = 7 , 1 = 10 1 ) and should be re-tested.
The result shows that even with the maximum offset of only 0.6 km, SRME combined with curvelet-domain subtraction can still outperforms least-squares subtraction. The general picture of post-subtraction stack between the 6-kmoffset and 0.6-km-offset result looks similar, with slightly more residual multiples in the latter (Fig. 12c and h). We caution that some residual multiples may appear coherent and mingle with actual events in the short-cable data.
Geologists should be careful of this caveat when making interpretations.
We investigate the source of this difference between the long-cable and short-cable demultiple results. We compare the multiple models and subtraction results channel-bychannel all the way to the 0.6-km maximum offset for the short-cable case (e.g., Fig. 12d, e vs. i, j), and find that the difference in the common-channel domain (where subtraction takes place) is minor ( Fig. 12e and j). We therefore conclude that instead of SRME performance at different cable lengths, the difference in the stacks should result from both the number of traces and the maximum offset of the traces used in stacking. When NMO-corrected with a proper velocity, the primaries will be flattened while the residual multiples still appear hyperbolic; the separation of travel time between primaries and residual multiples increases with offset. By stacking more long-offset traces, the influence of residual multiples will be down-weighted, resulting in better demultiple performance.

Conclusion
In this study we give a detailed introduction about the SRME demultiple method with curvelet-domain subtraction. Developed to handle edge-related features properly, curvelet transform handles seismic data properly by separating data into different frequency bands (scales) and directions (orientations). By investigating the curvelet coefficients of the input data and the multiple model, we can identify and subtract multiples at only a few scales and orientations. This sparsity allows better attenuation of multiples and retention of primaries compared to conventional least-squares subtraction. When combined with matching filters in T-X domain, curvelet-domain subtraction can also handle a certain level of model prediction errors.
We test curvelet-domain subtraction on two 2D surveys acquired during the TAIGER project. After a proper selection flow for key parameters (starting scale and sparsity constraint), the tests show that under different geological settings, curvelet-domain subtraction outperforms conventional least-squares subtraction. The limitation of this method is in the condition where primaries and multiples share similar frequency contents and dipping angles, and/or shallow layers of strong reflectivity impede energy penetration and cause dim primaries at depth.
Although curvelet-domain subtraction has been widely adopted in industrial seismic data processing, it has been relatively uncommon for data acquired by academic vessels due to various reasons. One of them is the relatively short cable length adopted by most research projects, which may be considered a problem for multiple modeling and subtraction. Our results show that even with 0.6-km maximum offset, curvelet-domain subtraction can still generate better results than least-squares subtraction. The processing time  Fig. 6-9. e and f Stack image after multiple subtraction with harsh parameters. The design window for LS subtraction is 64 ms × 11 traces. Note low-frequency events become more distinct in (f), although some low-frequency multiples still exist and may confound the interpretation. Blue arrows in (a) mark the top of shallow layers with strong reflectivity As the demand for computing resources is not extremely high, we recommend SRME with curvelet-domain subtraction as a standard processing flow for most multi-channel 2D seismic data. Comparison of SRME result between stacks processed with the maximum offset of a-e 6 km and f-i 0.6 km. This example section is the same as Fig. 7 but extends to 7 s. Blue arrows and light blue circles indicate major differences between the two CD subtraction stacks. Channel 34 is equivalent to offset = 0.6 km. M1 first multiples, M2 second multiples