Geostatistical modelling of multibeam backscatter for full-coverage seabed sediment maps

Extensive seabed sediment mapping is highly relevant to describe marine ecosystems and to quantify the distribution and extent of benthic habitats. Compared to traditional mapping methods, primarily based on bed sampling, multibeam echo sounding (MBES) is a time-efficient tool to acquire high-resolution bathymetric and backscatter data over large areas. We use a Bayesian method for unsupervised acoustic sediment classification (ASC) of MBES backscatter data of the Cleaver Bank, Netherlands Continental Shelf. On these sparsely distributed backscatter datasets, we tested and evaluated different Kriging algorithms, showing that Ordinary Kriging results in a reliable map. We introduce a new approach to classify interpolated MBES backscatter based on the Bayesian method for producing full-coverage sediment maps. Comparison to a traditional sediment map and in situ measurements shows that this approach resolves lateral heterogeneities (kilometers). When evaluating the high-resolution sediment map obtained from the Bayesian method, based on the actual backscatter, mapping laterally heterogeneous sediments significantly improved (meters). In order to create the optimal sediment map, we aimed to integrate ASC into existing maps, which, however, requires quantified spatial uncertainties of both considered maps. Finally, the low discrimination power of MBES backscatter for coarse sediments is highlighted as a shortcoming of current ASC mapping.


Introduction
Increasing human activities in the marine environment, such as fisheries or dredging, affect seabed habitats worldwide (Kaiser et al., 2003;Halpern et al., 2008;Foden et al., 2010;Erftemeijer et al., 2012;Korpinen et al., 2013). The impacts depend on the magnitude and frequency of the human activities, and vary with the marine ecosystem in which they occur (Halpern et al., 2015). To assess the anthropogenic impact on the seabed ecosystem and to develop suitable management strategies, it is necessary to identify the spatial variability of benthic habitats. Driving factors of the distribution and abundance of benthic communities, and hence important for habitat mapping, are the seabed substrate and grain-size distribution (Teske & Wooldridge, 2003). Marine in situ measurement techniques (e.g. grabs, cores and underwater video footage) reveal detailed information of the sediment properties and generate, in general, an accurate representation of the local seabed. However, the density and coverage of bed sampling are not always sufficient to represent the sediment heterogeneity on the required spatial scales.
In the last decade, different research fields have advanced remote sensing methods to overcome the limitations in spatial and temporal information of the seabed. One of the most rapidly developing research field is acoustic seabed classification (ASC), where acoustic remote sensing techniques, such as side-scan sonar (SSS), singlebeam echosounding (SBES) and multibeam echosounding (MBES), are used to investigate the seabed (e.g. Hughes Clarke et al., 1996;Anderson et al., 2008;Brown et al., 2011;Diesing et al., 2014;Lurton & Lamarche, 2015;Lamarche & Lurton, 2018). Extensive experimental and theoretical research about acoustic scattering shows that the acoustic echoes from the water-sediment interface contain information about the seabed (Jackson & Richardson, 2007). Accounting for the effect of environmental conditions (e.g. absorption) and sonar settings (e.g. signal strength, beam pattern), the backscatter strength can be retrieved from the acoustic echo received by the MBES. The backscatter strength is dependent on the acoustic frequency used, angle of incidence on the seabed and seabed properties, allowing for sediment characterisation based on acoustic measurements (Jackson & Richardson, 2007;Anderson et al., 2008). In particular, the MBES is a powerful tool due to the simultaneous acquisition of both backscatter and bathymetry data. MBES measurements are very time efficient compared to in situ measurements and are of high spatial resolution (in the order of decimetres to tens of meters, depending on the system configuration and on the water depth) (Anderson et al., 2008). The acoustic backscatter, bathymetry, and second-order features (e.g. slope, rugosity and standard deviation) are used either individually or in combination for the discrimination of the seabed characteristics and the production of sediment or habitat maps (Brown et al., 2011).
Despite the variety of acoustic classification studies, there is still a lack of seabed sediment maps for a large part of the world oceans. The reasons are manifold, for example: (i) lack of time and budget for MBES surveys, (ii) inclement weather, (iii) solely acquisition of bathymetry without storage of backscatter, (iv) lack of discrimination power in the signal close to nadir, (v) inaccessible marine environments due to hazards, navigation restrictions, renewable energy devices or marine conservation requirements. This indicates the need for suitable interpolation and classification techniques for sparsely distributed MBES backscatter data to generate full-coverage sediment maps. In addition, techniques able to combine multiple sources of information about the seabed from primary or secondary sources, e.g. samples, bathymetry or historical sediment maps, are highly important.
The European Union established the European Marine Observation and Data Network (EMODnet) to process marine data from disparate sources according to international standards and to eventually make the products freely available to marine data users. Data layers are generated with respect to Bathymetry, Geology, Seabed Habitats, Chemistry, Biology, Physics and Human Activities. EMODnet Geology provides, among other, datasets with information on the seabed substrate of the European marine areas (EMODnet, 2016). The seabed substrate maps are generated by harmonizing existing national maps (Väänänen et al., 2007), by interpolating existing sampling data or by expert knowledge in combination with the above approaches. The geological services of Norway, the United Kingdom and Ireland established a multidisciplinary national mapping program including acoustic measurements and ground truthing to produce seabed substrate maps based on ASC. They are collaborating to advance methods and practice in seabed mapping through sharing of knowledge and methods. The seabed sediment maps of the Netherlands Continental Shelf (NCS) are in general, besides a few local studies, based on traditional geological mapping where grab samples and shallow cores in conjunction with bathymetry and seismic information are used (Harrison et al., 1987;Jeffery et al., 1988;Lindeboom et al., 2008). Still, a few large-scale seabed mapping studies, including the NCS and based on machine learning techniques or geostatistical interpolation methods, were carried out. Stephens & Diesing (2015) used a Random Forest algorithm to predict seabed sediment composition in combination with grain-size data and several environmental predictors (e.g. bathymetry or current velocity with a resolution of 0.5-12 km) in the North Sea. Wilson et al. (2018) extended their approach to produce sediment compostion maps of the entire north-west European Shelf (including the NCS) by applying the Random Forest algorithm in regions with sparse data coverage and combining the results with the output of spatial interpolation in regions with extensive data coverage. Verfaillie et al. (2006) assessed different methods [e.g. Kriging with external Drift (KED)] for the interpolation of the median grain size of the sand fraction at the Belgian Continental Shelf. Maljers & Gunnink (2007) extended their KED approach by interpolating the full grain-size distribution curves allowing for the extraction of NCS maps of all separate fractions as well as for the derivations, for example, the sand median (D s 50). Bockelmann et al. (2018) produced fullcoverage maps of the mud content and the sediment median grain-size (D50) for the entire North Sea using among other KED. All of these studies used bathymetry as external-drift variable. A recent comprehensive study reflecting Dutch efforts to improve seabed sediment maps of the NCS using ASC was carried out by Snellen et al. (2018). They employed sparsely distributed 300 kHz MBES backscatter data acquired in five surveys from 2013 to 2015 at the Cleaver Bank, NCS, to investigate the repeatability of MBES backscatter-based sediment classification. They used an unsupervised ASC method, called the Bayesian bed-classification method , to classify the surficial sediments.
With the present-day urge for large-scale mapping, and to understand and identify the spatial and temporal distribution of benthic organisms, it is essential to consider the small-scale information and the broad environmental setting as well (Hawkins et al., 1993). For this, it is necessary to understand the benefits of ASC methods and integrate them into the available traditional seabed sediment mapping methods, as it is already done for some areas in the EMODnet framework. In this contribution, we address this integration for the Cleaver Bank area (NCS). We use the sparsely distributed MBES backscatter data collected during surveys from 2013 to 2015 and bed samples taken in the period from 2000 to 2015. Hence, the aims of this study are defined as follows: The first aim is to assess the capability of geostatistical modelling, i.e. Kriging interpolation techniques, to overcome and evaluate two issues that can hinder full-coverage acoustical mapping: (a) the MBES data can be restricted to inconsistent survey lines due to time and budget limitations or weather conditions, and (b) the backscatter data acquired at nadir have limited discrimination power and therefore cannot be used for ASC, resulting in gaps within the sonar swath. Aim 1 thus focusses on the potential of creating a full-coverage map based on sparse MBES backscatter data. Therefore, different Kriging techniques are applied to the sparsely distributed MBES backscatter data and a new approach is introduced to classify interpolated MBES backscatter data based on the Bayesian bed-classification method.
The second aim is to compare the sediment maps obtained from ASC, based on solely MBES backscatter, and Kriging interpolated MBES backscatter, respectively, with the existing seabed sediment map in the Cleaver Bank area, which is mainly based on traditional ground-truth data. Aim 2 thus focusses on issues that are of importance when integrating the ASC maps with traditional geological maps.
An important issue that we additionally address is the current physical limitation of ASC, concerning its ability to discriminate between the coarser sediment types.

Geological setting
The study area, the Cleaver Bank, is located about halfway between the United Kingdom and the Netherlands. This area covers about 1.5% of the entire NCS. Water depths at the Cleaver Bank range from 25 to 50 m Lowest Astronomical Tide (LAT) except for in the deep channel (Botney Cut) that crosses the Cleaver Bank from north-west to south-east, where water depths reach 70 m LAT (Fig. 2). This subaqueous palaeochannel is a glacial valley from the Weichselian glaciation incised into Pleistocene deposits and was partially filled in with sandy muds prior to the early-Holocene marine transgression (Botney Cut Formation), probably deposited in a glacio-lacustrine environment (Cameron et al., 1986). The infill is overlain by more recent Holocene marine sediments, varying from mud to sandy gravel with a layer thickness of 1 to 12 m. In the area surrounding the Botney Cut, Late-Weichselian glacial deposits are mapped as the Boulder Bank Formation, consisting of a blanket till of gravelly sandy clay. Where the Holocene marine deposits overlie the Boulder Bank Formation, these deposits are less than 2 m thick and comprise clean sand and sandy gravel (Fig. 1b) (Cameron et al., 1986). The Cleaver Bank is the largest area with coarse sediments on the NCS, with up to 30% being covered with gravel ( Fig. 1a) (Lindeboom et al., 2008). The abundance of different sediment types from muddy to rocky bottoms causes a high benthic biodiversity (Lindeboom et al., 2009;Coblentz et al., 2015).

Current seabed sediment maps (NCS)
Currently used seabed substrate maps of the NCS are available at the EMODnet-Geology data portal (EMODnet, 2016), the Interreg IIIb project MESH (Mapping European Seabed Habitats) (JNCC, 2018) or provided by Maljers & Gunnink (2007) and Stephens & Diesing (2015). Figure 1a shows the seabed sediment map from the entire North Sea with a nominal scale of 1000 m by 1000 m in which the sediments are classified into five sediment classes. The sediment classes are based on a simplified reclassification of the Folk scheme (Folk, 1954). Figure 1b displays a sediment map of the Cleaver Bank area with a finer nominal scale of 250 m by 250 m and a classification into 14 Folk classes, although only seven classes are present in the present area (EMODnet, 2016). This sediment map is based on grab samples and shallow cores where the sediment boundaries are manually refined with bathymetry and seismic information (Harrison et al., 1987;Jeffery et al., 1988). Both maps are downloaded from the EMODnet

Acoustic classification map based on MBES backscatter
Recently, the Cleaver Bank was acoustically mapped using MBES backscatter data acquired from 2013 to 2015 by Snellen et al. (2018) (see Fig. 2). They employed the Bayesian method, in which backscatter values averaged over the backscatter time series samples (scatter pixels) within a beam footprint are used. The method accounts for the intrinsic variability of the backscatter strength by assuming that the measured backscatter per beam resulting from a number of discrete seabed types corresponds to a sum of Gaussian distributions, where each Gaussian corresponds to a distinct seabed type ). By applying the Bayesian decision rule, the so-called acoustic classes are obtained from the individual Gaussian distributions. The technique considers the backscatter strength per beam (or incident angle) separately. The optimal number of acoustic classes is estimated by utilizing the outer, more discriminative beams and applying the goodness of fit criterion (v 2 ). Seven acoustic classes were identified (Fig. 2). Since the MBES survey track lines are separated by up to 1500 m and the backscatter data around nadir cannot be used for sediment classification, the final ASC map contains large data gaps (Fig. 2). The Bayesian method was developed by

In situ measurements
The grab sample dataset, as used for the current research but not used for the maps of  Folk (1954) scheme. The modified Folk scheme is used, where the threshold for slightly gravelly classes is 1%. Three grab samples are classified as gravelly muddy sand (2) and slightly gravelly sand (1). These are neglected during the following analysis because two individual samples are not sufficient to perform a correlation analysis and a validation test. In addition, one grab sample was considered untrustworthy due to inaccurate positioning. The remaining 100 grab samples are divided into three sets.
(1) 46 grab samples  Fig. 2 Acoustic map of the Cleaver Bank with a resolution of 3 m by 3 m obtained with ASC, using the Bayesian bedclassification method. Grab sample symbols assign each sample to the following set: assignment of sediment types to acoustic classes (square), validation test for ASC (triangle) and interpolated ASC (circle). The background bathymetry is displayed in a grey gradient. Location of the Cleaver Bank is indicated by the red rectangle in Fig. 1a. Black rectangles (a), (b) and (c) will be referred to later in the paper. The figure is taken from Snellen et al. (2018) and the layout is slightly modified (square symbol) are used for the assignment of sediment types to acoustic classes; these samples represent an arbitrary selection of 70% of the samples located on or very close (\ 25 m) to the survey track lines; (2) 23 grab samples (triangle symbol) are used for validating this assignment; these samples represent an arbitrary selection of 30% of the samples located on or very close (\ 25 m) to the survey track lines; (3) 31 grab samples (circle symbol) are used for assessing the predictive performance of Kriging and the EMODnet map; these grab samples are located between track lines with a distance to a track line of 25 m to 750 m (Fig. 2). In addition, video footage was collected during the MBES surveys in 2013 and 2015. The cameras were equipped with two parallel-orientated sizing lasers to scale the observations at the seabed. The video footage is qualitatively analysed and subjectively labelled with respect to the Folk classification, since a more quantitative analysis is hampered by the unsteady height of the camera system above the seafloor and the varying particle suspension affecting the visibility.

Kriging and Cokriging
Kriging is a geostatistical interpolation technique used to predict surfaces from a limited amount of sample data and to assess the uncertainty of these predictions (Krige, 1951). In this paper, we use Kriging to predict MBES backscatter for locations without acoustic measurements. The general equation iŝ whereẐ is the predicted value at an unsampled location s 0 ; Z is a measured value at the sampled location x i ; n is the number of considered samples within the interpolation neighbourhood, and k i is the weight assigned to the measured sample i for predict-ingẐ. A particular strength of the Kriging method is that k i is not only calculated using the distance from the measured to the predicted location but also accounts for the spatial arrangement of the measured data points. A variogram analysis is the first step for obtaining the Kriging weights k i . The so-called semivariogram cðhÞ represents the average variance between the observations separated by a certain distance, and describes the structure of the spatial variability of the investigated variable (Chambers et al., 2000). The semivariogram is calculated as (Journel & Huijbregts, 1978) cðhÞ ¼ 1 2NðhÞ where N is the number of pairs of sample points separated by the distance h. The data from this distance interval h are binned into lag classes. The size and the number of the lags are chosen according to the study area. However, the semivariogram will in general not provide information for all possible distances. Therefore, it is necessary to fit a semivariance model (e.g. spherical, exponential and stable) to the semivariogram. The type of the model is selected based on the nature of the data. For creating the semivariogram, we follow roughly the rule of thumb where the product of lag size and number of lags should be about half the largest distance among all points (Verfaillie et al., 2006). For the dataset considered, this means that the semivariogram is created using a lag size of 1000 m and the number of lags is set to 20. Finally, the Kriging weights k i are determined by solving a set of equations (Kriging system) including the knowledge of the semivariance model (Goovaerts, 1997). In geostatistical modelling, sampled data are considered as the result of a random process. Consequently, the predictions are always associated with some probability or uncertainty. To this end, the Kriging system is expressed as follows: whereẐ 1 is the estimated variable at location s 0 , decomposed into the deterministic trend l 1 and the random, auto-correlated error e 1 at location s 0 . The different Kriging methods are variations on Eq. 3. Ordinary Kriging (OK) estimates l 1 for each interpolation neighbourhood separately, where it is assumed to be locally constant. Simple Kriging (SK) assumes l 1 to be known for the entire area, where it is assumed to be globally constant. By contrast, Universal Kriging (UK) describes l 1 with a deterministic function. Cokriging allows to incorporate secondary variables, in order to improve the predictions. For example, bathymetry may be used as additional information in sediment mapping (e.g. Verfaillie et al., 2006). Hereto, a second equation is needed for the integration of a second variablê where l 2 is a second unknown constant in case of Ordinary Cokriging (OCK). Two random errors e 1 and e 2 are now used and for each of these values both an autocorrelation and cross-correlation have to be calculated. OCK tries to predict Z 1 s 0 ð Þ in the same way as OK, but uses the additional information of the covariate Z 2 s 0 ð Þ. For the Simple, Ordinary and Universal Kriging and Cokriging, we used the geostatistical toolbox of ArcMap10.3.

Validation of Kriging interpolation
Three measures, i.e. the prediction standard error (PSE), the root mean-square estimation error (RMSE) and the root mean-square standardized estimation error (RMSSE), are used to evaluate the prediction and the corresponding uncertainties of the interpolation.
The uncertainty of the Kriging prediction is given by the PSE rðs 0 Þ (smaller values indicate better predictions). This value is obtained from the solution of the Kriging system. It is defined as the standard deviation of the differences between the true and the estimated value. For instance, if the data are normally distributed, the true value falls within the interval of the estimated values (±) PSE with a probability of 95%. The RMSE and RMSSE are calculated by a cross-validation where one data sample is removed and the remaining data samples are used to estimate the removed data sample. The RMSE indicates how well the algorithm predicts the observed values. The output value has the same unit as the observation. The RMSE is written as follows: where D is the number of all samples used for the interpolation. The lower the RMSE value, the better the prediction accuracy is. To assess the reliability of the uncertainty, the RMSSE is used. Thereby, each estimated error is divided by its prediction standard error rðx i Þ at the sampled location RMSSE should be close to one if the prediction standard errors are valid. If the RMSSE is greater than one, the variability in predictions is underestimated. If RMSSE error is less than one, variability in predictions is overestimated.

Classification of interpolated MBES backscatter based on the Bayesian method
The workflow to produce a full-coverage sediment map from sparsely distributed MBES backscatter data is visualized in Fig. 3. The first step is the interpolation of the measured MBES backscatter data to retrieve a full-coverage backscatter map using Kriging. This is carried out for a specific beam angle and for each survey separately. The boundaries of the acoustic classes defined by the Bayesian method per beam angle and survey are used to obtain acoustic class maps from the interpolated backscatter data (see step 2, Fig. 3). A detailed description on how the acoustic class boundaries are defined is given in . The seabed slope in the Cleaver Bank area is relatively small (\ 5°); therefore, the difference between the beam angle and the actual incident angle can be neglected and thus we use beam angles in our classification. Considering each survey separately provides acoustic class maps independent of e.g. acoustic-instrument stability or sonar settings. The interpolated ASC maps obtained from different surveys are merged by evaluating the PSE (see step 3, Fig. 3). For each grid cell, the interpolated results exhibiting a lower PSE are used for the merged ASC map. Finally, the classification results obtained from the application of the Bayesian method to the actual MBES backscatter measurements are compared to grab samples (see step 4, Fig. 3). Based on their correlation, the assignment of sediment types to acoustic classes is carried out. In the following sections, the interpolated and subsequently merged acoustic class map will be termed interpolated ASC map.
Quantitative comparison of ASC and current sediment maps with in situ measurements Finally, to test the reliability of the sediment maps and to perform a quantitative comparison, error matrices are used. The error matrix provides different measures of accuracy, i.e. overall and individual accuracies and a Kappa coefficient of agreement (Cohen, 1960). The overall and individual accuracies describe the portion of correctly assigned sediment types by considering all samples and only samples of a specific sediment type, respectively. The Kappa coefficient accounts, in addition, for the likelihood of coincidental agreement.

Geostatistical modelling
The Kriging methods are applied to the MBES backscatter data acquired in 2013 and 2014/2015 separately. The backscatter data retrieved from the beam angle of 48°(± 1°, 47°-49°) at starboard-side are used for the interpolation. These beam angles have high discrimination power and appeared to have low noise. An optimal fit to the semivariance of the observed data is obtained by using the exponential and stable semivariance model for Kriging and Cokriging, respectively. Cokriging uses full-coverage bathymetry data with a grid cell size of 100 m as a secondary variable. In order to find a suitable interpolation method for the generation of full-coverage sediment maps of the MBES data, we compare the results of OK, SK, UK as well as OCK, SCK and UCK. From the RMSE and RMSSE values, it is found that the Cokriging methods perform slightly better than the Kriging. The different Kriging techniques Ordinary, Simple and Universal perform almost equally well with respect to the RMSE and RMSSE values (Table 1).
To further investigate the slightly better performance of the Cokriging methods, the correlation between bathymetry and backscatter for the 2013 and 2014/2015 data is shown in Fig. 4. The Pearson Correlation Coefficient, a standard measure of linear bi-variate correlation (i.e. between two variables), indicates a very weak correlation for the 2013 data (R = 0.16) and a medium correlation for the 2014/2015 data (R = 0.61) between bathymetry and backscatter. The varying correlation between the datasets is caused by the fact that the datasets cover different seabed areas. It shows that the correlation between bathymetry and backscatter strongly depends on the location in the study area. In the study of Asli & Marcotte (1995), a better performance for SCK and OCK over SK and OK is observed for a correlation coefficient [ 0.4. That would indicate an improved performance by incorporating bathymetry only for the 2014/2015 datasets. However, these observations are not reflected in the performance test using the RMSE and RMSSE. There is no significant difference between the OK and OCK performance from the 2013 to the 2014/2015 dataset observed.
To get insight into the added value of bathymetry data in predicting backscatter in between the track lines, which is not captured by the RMSE and RMSEE that only consider a single eliminated measurement, we removed a full track line from the dataset. In Fig. 2, the removed track line is indicated by the black rectangle (a). The Pearson correlation coefficient between predictions and actual values is 0.74 and 0.75 for both OK and OCK, respectively. This indicates no significant improvement by incorporating bathymetry as a secondary variable for predicting the backscatter values in between the track lines. Considering these results, the computational time and that the backscatter data can be seen as a variable without a constant mean, we selected OK as the most suitable interpolation method for MBES backscatter in our study area.
The interpolated backscatter map for the 2013 and 2014/2015 data using OK and the MBES backscatter from all track lines is visualized in Fig. 5a and d. The uncertainty of the OK predictions is represented by the PSE in Fig. 5c and f. The uncertainty map shows that the most reliable predictions are achieved on the survey track lines (PSE = 0 dB) and that uncertainty increases with distance to the MBES track lines (PSE = 2-3 dB). Locations close to multiple or crossing survey lines show lower uncertainties, demonstrating more reliable predictions caused by an increase of MBES data. PSE values above 3 dB in these maps show untrustworthy predictions, due to the total absence of data and resulting data artefacts, such as those found at the eastern border of the 2014/2015 data.

Generation of the map presenting acoustic classes
The outcomes from the Bayesian method are used to produce an acoustic class map from the interpolated backscatter data (see step 2, Fig. 3). The boundaries of the seven acoustic classes for the backscatter data of the 48°beam angle, which are used to classify the backscatter data into distinct classes, are shown in Table 2. The classification of the backscatter data into classes based on the Bayesian method allows for the merging of the results from 2013 and 2014/2015 to a single map. The production of the single maps is based on the PSE, which are shown in Fig. 5c and f (see step 3, Fig. 3). The individual acoustic class maps obtained from the 2013 and 2014/2015 datasets are displayed in Fig. 5b and e, respectively. The merged acoustic class map is shown in Fig. 6a. A number of small stripe    Table 2) To test the influence of different beam angles and different datasets on the ASC, the acoustic class boundaries for the 54°(± 1°, 53°-55°) beam angle are listed in Table 2 as well. Table 2 shows that the boundaries differ per beam angle and per dataset. It demonstrates the importance of applying the interpolation and acoustic classification to each beam angle and dataset separately. To test the validity of our approach, where only the backscatter data from a specific beam angle range are used and the datasets are considered separately, the interpolated and merged ASC map of the 54°beam angle is visualized in Fig. 6b. Both maps show the same pattern (Fig. 6a, b) and the lower classes (2-4, green to yellow) correspond well. However, only in areas represented by high acoustic classes (5-7, orange to purple) a difference of up to one acoustic class is visible (yellow in Fig. 6c), where the acoustic classes of the 54°T beams are generally lower than those of the 48°beams. Also, these differences are more prominent for the 2014-2015 data (Fig. 6c). Factors contributing to the difference are both the interpolation of the backscatter values and the intrinsic probability of misclassification due to the overlap of backscatter corresponding to the acoustic classes. In addition, different beam angles might sense different substrate/sediment classes on the seabed. In the next section, the focus will be on sediment type instead of acoustic class where the results are compared with in situ measurements.
Conversion from acoustic class to sediment map As a first step towards integrating the ASC results into sediment maps (in general displaying Folk class), the acoustic classes have to be converted to Folk classification scheme.
The approach taken uses the original ASC results (not interpolated), since the seabed area represented by grab samples is much closer to the spatial resolution of the original ASC results (3 m by 3 m) compared to the interpolated ASC results (100 m by 100 m). Figure 7 displays the relationship between acoustic classes and 46 grab samples (acoustic class at the grab sample location is determined by counting the most frequently occurring ASC class within a radius of 25 m around the grab sample). The order of Folk classes attempts to represent increasing mean grain sizes. A general correspondence between increasing acoustic class to increasing mean grain size is observed. Acoustic class 1 is not sampled and cannot be assigned to a sediment type (labelled with 'uN'). Acoustic class 2, 4 and 7 correspond mainly to sandy mud, sand and sandy gravel, respectively. All other acoustic classes indicate some ambiguity in the relationship between sediment type and acoustic class e.g. where an individual acoustic class is assigned to two or three different Folk classes or where one Folk class is represented by several acoustic classes overlapping with another Folk class representation. These results are slightly different from the assignment of Folk class to acoustic class by Snellen et al. (2018), using the same ASC results but 77 grab samples instead of only 46. They assigned sand only to acoustic class 4 compared to classes 3 and 4 here.
The final assignment of Folk class to acoustic class is shown in Table 3. Using the 23 independent grab samples to validate the assignment of Folk classes to acoustic classes results in an overall accuracy of 83% and a Kappa coefficient of 0.55, indicating good agreement with the grab samples. It is noted that the performance to discriminate between individual Folk classes is lower for the coarser sediments (sand to gravelly sand) compared to the finer sediment (sandy mud to muddy sand).
To further test the assignment of sediment type to ASC results, the results are qualitatively compared to the video footage. Here, it is visualized on a representative example in Fig. 8. There is a good overall qualitative agreement between acoustic classification and video footage (Fig. 8) Fig. 7 Correlation plot between acoustic class and sediment type (Folk, 1954) using the 46 grab samples and the acoustic classes obtained from the Bayesian classification method Table 3 Assignment of Folk class to acoustic class Sediment type 'uN' sM sandy mud mS muddy sand S sand gS gravelly sand msG muddy sandy gravel sG sandy gravel Acoustic class 1 2 3 3-4 5-6 5-6 6-7 Acoustic classes are obtained from applying the Bayes classification method, as previously described by Snellen et al. (2018) acoustic classification and the video footage indicate an increase in gravel content from location 1-3. Based on the visual inspection, the assignment to sand for video imagery 1, to gS or msG for video imagery 2 and to gS, msG or sG for the video imagery 3 is plausible, even if the mud content is hardly recognizable and a distinction between mud and sand is not feasible based on the video footage. Based on the validation test and the verification with video footage, it can be concluded that the assignment of Folk classes to acoustic class is reliable, but, as seen in Table 3, not unique. The approach (Table 3) is used to convert the interpolated ASC results to a full-coverage sediment map (Fig. 9a).

Comparison of the interpolated ASC and traditional sediment map
In this section, we will qualitatively assess the accuracy of the interpolated ASC (Fig. 9) and EMODnet sediment map (Fig. 1b) by using the remaining 31 independent grab samples. Table 4 presents the results of comparing the 31 grab samples with the two sediment maps. The OK achieves an overall accuracy of 65% and a Kappa coefficient of 0.50. This indicates fair agreement between the samples and the map. These values are similar for those obtained by the EMODnet map (65% and 0.54). When regarding the individual accuracy for both maps, the Botney Cut (sM) is resolved equally well (both 100%). The EMODnet fine-scale map shows The location of that area is marked by a black rectangle (b) in Fig. 2. The video footages are marked by blue triangles. White area indicates no data for ASC especially good agreement with the areas indicated by sand in the samples, whereas the interpolated ASC map indicates better agreement for mS and the coarser sediment samples. The latter is partly due to the necessity of assigning multiple types to a single acoustic class to account for the ambiguity in the relation of sediment type to acoustic class. This approach comes along with a decrease in discrimination power between the individual sediment types. As a final step in this section, we quantitatively compare the interpolated ASC and EMODnet sediment maps (Fig. 9b). The comparison reveals an overall agreement of 51%. As in the previous assessment (Table 4), the comparison map shows that for the Botney Cut in particular, a very good agreement is found, where both maps indicate sM which also matches with the grab samples. However, in other parts of the maps where sM was assigned in the ASC map, differences occur where sediments were in general assigned to sand in the EMODnet map (Fig. 1b). Further disagreements are linked to the larger sediment heterogeneity in the ASC map in the gravelly sand and sandy gravel classes, and may thus actually be an important improvement (see ''Discussion''). These types of differences should be accounted for when integrating the ASC maps into the existing sediment maps of the NCS and are discussed in the following sections.
(a) (b) Fig. 9 a Full-coverage sediment map based on interpolated ASC map (Fig. 6a) and 46 grab samples (Table 3). b Map comparison between full-coverage sediment map (Fig. 9a) and the existing EMODnet seabed sediment map (Fig. 1b). ''No data'' implies that either the Folk sediment type is not available in one of the maps or an acoustic class is unassigned in the interpolated ASC map

Relationship between backscatter values and grain-size fractions
In the previous section, we aimed to reduce the ambiguity related to the assignment of acoustic class to sediment type by assigning multiple sediment types to an acoustic class (adapted from Snellen et al., 2018). To further investigate this issue, the relationship between actual backscatter value and sediment properties is considered. In Snellen et al. (2018), the ambiguity was analysed with respect to the relationship between backscatter and the median grain size. In this study, we investigate the relationship between backscatter and individual grain-size fractions to identify grain sizes causing the ambiguity with respect to the used system wavelength. For this, we use the grain-size distributions of the 77 grab samples which are located on or close (\ 25 m) to the MBES track lines. The backscatter values are averaged over a maximum radius of 25 m around the grab sample. To take into account imperfect sonar calibrations and the angular dependency of backscatter, we used normalized backscatter values between -1 and 1 where the angular dependency is eliminated (Gavrilov & Parnum, 2010). Figure 10 shows the normalized backscatter values as a function of three measures of grain-size fractions (fine, medium, coarse). The first measure considers grains smaller than 0.5 mm and thus smaller than the acoustic wavelength of the MBES (5 mm). The second measure focusses on grain sizes roughly around the acoustic wavelength from 0.5 to 16 mm (here called the medium fraction). The last measure considers the percentage of grains larger than 16 mm. Figure 10a shows a negative correlation between backscatter values and the percentage of the fine fraction (\ 0.5 mm). Figure 10b illustrates a positive correlation between backscatter and percentage of the medium (0.5-16 mm) fraction. This shows that an increase in the amount of grains similar to the acoustic wavelength is positively correlated to backscatter.  Individual accuracy 100% 50% 50% 25% 50% 88% Overall accuracy 65% Kappa coefficient 0.50 However, considering Fig. 10c, a slight decrease in backscatter values with increasing percentage of grain sizes coarser than 16 mm is observed. This observation is contrary to the common assumption that coarser sediments are more likely to result in higher backscatter strength (Goff et al., 2004;De Falco et al., 2010). It indicates that in particular the amount of very coarse grains (coarser than 16 mm using an MBES with a frequency of 300 kHz) might induce the ambiguity. In addition, it shows that there is no one-to-one relationship between grain size and backscatter for the entire grain-size spectrum.

Geostatistical modelling
The performance tests and validation with independent sediment samples in this study show that geostatistical modelling, such as Kriging or Cokriging, is a suitable tool to fill the data gaps in MBES backscatter measurements to achieve full-coverage maps (Fig. 9a). The performance of the OK, SK and UK techniques does not differ significantly. Regarding UK, the reason might be that the backscatter data do not have a global trend. In that regard, UK fits a constant value to the data, the same as OK, instead of a deterministic polynomial function (Eq. 3). The relatively small difference between the SK and OK performance indicates that the choice of the stationarity of the deterministic trend (global or local) does not have a strong impact (see ''Materials and method''). We did not use anisotropic Kriging, since the backscatter data are not directed in any way. In general, the values of the error matrices indicate that the Kriging results are reliable, considering that other studies using geostatistical modelling and supervised ASC achieved similar overall accuracies ranging from 65 to 80% and Kappa coefficients between 0.3 and 0.6 .
Although several studies have shown that the use of bathymetry in geostatistical modelling improves the prediction of seabed sediment distribution (Verfaillie et al., 2006;Maljers & Gunnink, 2007;Lark et al., 2015), importing bathymetry as a second variable did not improve the prediction performance in our study on ASC bed classification, where a correlation analysis reveals a low correlation between backscatter and bathymetry. Similar results were found in Bockelmann et al. (2018), where the evaluation of the map confidence indicated no significant improvement by using KED (incorporating bathymetry as a secondary variable) compared to OK. Even though bathymetry, hydrodynamics and sedimentary processes control the sediment distribution in the marine environment, preexisting coarser sediments (e.g. due to a glacial history) do not always comply with present-day marine sedimentation processes. This implies that the successful application of multivariate geostatistical modelling depends on the local conditions of the study area. To account for complex geological processes, which has influenced the current sediment distributions on the seabed, we may need other interpolation methods allowing the use of arbitrarily complex forms of regression, for example Regression-Kriging (Bockelmann et al., 2018;Thill, 2018).
In the present study, only the backscatter data from a specific, highly discriminative, beam angle range are considered for obtaining an interpolated full-coverage map. The need for this approach is explained by Table 2, where a large difference in backscatter values of the acoustic class boundaries between the different datasets (i.e. 2013-2014 and 2015) and beam angles (i.e. 48°and 54°) is seen. This is due to the facts that (1) the backscatter is dependent on the incident angle (Jackson & Richardson, 2007), (2) the datasets from 2013 to 2014 and 2015 cover a different seabed area but also that, (3) e.g. acoustic-instrument stability, environmental conditions or survey methods might have changed between the surveys and post-processing was not able to account for these factors. Alternatively to account for the first issue, the angular effect can be removed allowing the use of the full range of backscatter measurements for the interpolation. However, this would hamper the comparison between acoustic classes determined from different uncalibrated MBES backscatter datasets because the relative backscatter values are not directly comparable.

Comparison of the ASC, interpolated ASC and traditional sediment map
In the quantitative assessment, the different performances of the interpolated ASC and EMODnet sediment maps were not apparent. To further investigate the differences between the ASC, the interpolated ASC and EMODnet maps, we qualitatively compare a certain area of the Cleaver Bank in more detail, visualized in Fig. 11. It is shown that the OK method is capable of mapping more detailed seabed sediment heterogeneity (orders of km's) than the EMODnet map. Both maps show a different pattern for the coarser sediments in that area (Fig. 11a, b). The easterly and south-westerly located grab samples indicating msG or sG are in agreement with the interpolated ASC map (except two samples) but do not agree with the EMODnet map. A closer examination in Fig. 11c demonstrates that the Bayesian method is able to resolve fine-scale spatial heterogeneities (orders of tens of meters) indicated by the agreement of the sG bed sample with the revealed gravelly patch located within a relatively homogeneous area of muddy sand (green). This coarse sediment patch is resolved neither in the interpolated ASC nor in the traditional EMODnet sediment map.
The success in predicting seabed sediments via interpolation methods depends amongst other things on the relation between the spatial distribution of the measurements and the seabed heterogeneity. The larger the seabed heterogeneity, the higher the measurement density needs to be. MBES backscatter provides a high and regularly spaced sample density. However, the sample density can be partly considered as artificially high because the measurements are clustered along the track lines. Clustering of samples can even yield to a reduced accuracy of Kriging methods (Zimmerman et al. 1999). However, the regularly spaced MBES backscatter measurements have the advantages that without any a priori knowledge, sediment heterogeneities are sampled in a more systematic way. This is in contrast to interpolation based only on sediment samples. They are often a collection from different sources and therefore often randomly distributed and appear in clusters. This results in a varying spatial resolution of the interpolated map and increases the likelihood of omitting certain sediment patches. The seabed heterogeneity varies within the marine environment and the sediment type due to the physical processes of sediment transport and deposition. For example, Buscombe et al. (2017) showed that, in a river environment, typical length scale or distance over which sediment types occur was larger for finer sediments than coarse sediments, and that immobile sediments had the shortest length scales. In our study area, these observations are visible in Fig. 2. Fine sediments, indicated by low acoustic classes (low backscatter values), are homogeneous and occur in extensive areas, e.g. Botney Cut, NW and SE of the Cleaver Bank (see Fig. 2), whereas areas with high acoustic classes appear to be more heterogeneous. The remaining coarse and less mobile glacigenic sediments are sparsely distributed, redistributed or partly overlain by more recent marine sediments, possibly yielding to the patchy pattern of the coarse sediments. Geostatistical methods account for the above mentioned spatial dependency of sediments by modelling the spatial distribution of the observations (Juan et al., 2011). Consequently, the probability to map sediment heterogeneities in between the adjacent MBES track lines is increased, indicating the advantage over deterministic interpolation methods. In general, the prediction of the fine sediments in the Cleaver Bank area can be considered as more reliable compared to the coarser sediments.

Integration of different sediment maps
A quantitative comparison between the interpolated ASC and the EMODnet sediment map reveals an overall agreement of 51% (Fig. 9b). In particular, it is important to deal with the areas of disagreement to be able to successfully integrate ASC results into existing sediment maps. In order to do so the strengths and weaknesses of the different methods need to be addressed. Table 5 presents a qualitative overview of the uncertainties related to sediment maps based on in situ measurements and ASC. The interpolation of in situ measurements results mostly in high uncertainties due to the limited amount of samples, whereas MBES backscatter provides large data coverage enabling interpolation with low uncertainties. The assignment of sediment type to in situ measurements can be very precise. However, a low uncertainty still exists caused for example by a potential washing out of the fine fraction or the inability of the sampling equipment to retrieve the coarse fraction. Due to the physics of acoustic scattering, a clear assignment of backscatter values or acoustic classes to sediment type  Fig. 10. In addition, the uncertainty corresponding to the assignment of sediment type to in situ measurements has to be added to the ASC as well because these sediment types are used for the assignment to the acoustic class.
For a full integration of both maps, the uncertainties need to be spatially quantified. A first step is taken in this contribution where we found 17% of global uncertainty (equal to 83% overall accuracy) along the track lines for the assignment of sediment type to acoustic class based on 23 independent samples. This value is a first estimate of the contribution belonging to the lower right box (Table 5). This uncertainty is relatively low indicating a reliable assignment. One explanation is that in the assignment of sediment type to acoustic class the uncertainty is at least partly accounted for by assigning multiple sediment types (Folk class) to one acoustic class. The 83% overall accuracy indicates that this is a way to reduce the uncertainty related to the assignment of sediment properties to acoustic data. However, it also implies a lower discrimination between different sediment types. Figure 5c and f indicate an uncertainty of up to 2.5 dB (PSE) between the track lines. Using the PSE and acoustic class boundaries (Table 2) for the 2013 and 2014/2015 dataset, the average prediction uncertainty for the acoustic class map is estimated. On average, there is an uncertainty of 19% for the entire survey area that the acoustic classes are expected to deviate by one acoustic class. This is an estimate of the contribution of the lower left box in Table 5. It is not possible to quantify the uncertainties of the EMODnet map (in situ measurements) due to lack of information (upper left box, Table 5).
Due to the lack of local uncertainties, here, a focus is on the differences between the interpolated ASC map and EMODnet map as a first step towards integrating the maps. Figure 12 shows the areas where the two maps correspond and differ. For the latter, we defined three types of areas, indicated by (a) the zones where sM is classified on the ASC map and S on the EMODnet map (a1, a2 and a3), (b) differences due to heterogeneity of the seabed sediments, (c) zones where a larger extent of the distributions of coarser sediments are identified on the ASC map (c1, c2 and c3).
In area (a1), no samples are available in this study, so classification is based on backscatter (see Fig. 2). Jeffery et al. (1988) highlight that the one sample used for the traditional map is more than 1 km away. In area (a3), the ASC classification of sM is contradicted with all ground-truth samples (Fig. 2), revealing sand (S), which corresponds to the EMODnet map (Table 4 confirms the higher individual accuracy for sand in the EMODnet map). Still, the acoustic classification clearly indicates lower backscatter values in the acoustically measured data. For area (a2), not only this study indicates lower backscatter and low acoustic classes but also a study carried out in 2004 and based on a SBES campaign reveals similar results (Snellen et al., 2011). This implies that the surface properties may differ from sediments in the grab samples. For Table 5 Qualitative overview of uncertainties in the production of sediment maps

Interpolation Assignment
In situ measurements High [-] (Harrison et al., 1987;Jeffery et al., 1988;Verfaillie et al., 2006;Maljers & Gunnink, 2007) Low [-] Acoustic classification Low Moderate 18.9% = 1 acoustic class difference 0.1% C 2 acoustic class difference (this paper) 17%, j = 0.64 (this paper) 20-35%, 0.3 \ j \ 0.6  In the literature, no values are readily available for the interpolation step in case of seabed cores/samples, i.e. traditional mapping (upper left box) or for the uncertainty due to the sampling methods (upper right box). For the sediment classification based on acoustic measurements, values in this paper show good performance in both the assignment of sediments to classes (lower right box) and the interpolation techniques (lower left box). The symbol [-] indicates that no uncertainty value was found in literature example, fluid mud may affect acoustic backscatter (Williams et al., 2009 In area (b), the ASC map represents high spatial heterogeneity of coarse-grained sediments, supported by sediment samples. The individual accuracy for msG and sG in Table 4 and the example in Fig. 11 corroborate these findings. Although the number of samples in the traditional map, on which the EMODnet map is based, seems sufficient in the east of area (b), the main sG area in the western part of area (b) (olive in Fig. 12b) is largely void of samples (Jeffery et al., 1988).
In area (c1), both maps indicate coarse sediments. However, the area is positioned slightly differently in the two maps and the ASC identifies a larger area of coarse sediments, which are corroborated by samples (see Fig. 11a, b). For (c2) coarse sediments were not identified in the EMODnet map (sM/mS), whereas these were identified in the ASC map. In that area, two samples indicating coarse sediments plead for the ASC map. For area (c3), the coarse sediment region is less extensive in the EMODnet map, in which parts were labelled as S, where the ASC map classified gS/ msG/sG. These coarser sediments were corroborated Longitude Longitude (a) (b) Fig. 12 Comparison maps displaying sediments in areas where discrepancies exist between a interpolated ASC and b EMODnet sediment map 123 by 6 out of 6 samples (Fig. 2, see also Table 4 for msG and sG where ASC performs better). In general, it can be concluded that for the areas where the two maps differ, the ASC maps represent spatial heterogeneous areas better, but more samples are needed to further assess the differences.
In case more knowledge on uncertainties of produced seabed maps are available, regions of high accuracy and low uncertainties could be determined. These regions can be further used as training and validation datasets for supervised and unsupervised classification methods applied to newly acquired MBES datasets. Lark et al. (2015) proposed the geostatistical linear mixed model (LMM) as an approach to incorporate new available MBES data into existing sediment maps. This procedure would lead to an update of seabed maps using datasets that allow higher spatial resolutions. In this approach, the information contained in the existing maps (e.g. EMODnet) would be considered as fixed (categorical) variables and the new datasets (e.g. backscatter) are considered as continuous covariates to generate an updated map with higher spatial resolution.

Current limitation of ASC based on MBES backscatter
Despite the major advantages of ASC in sediment mapping, at its current stage backscatter-based ASC is hampered by its restricted ability to assign sediment type (e.g. Folk class) to a distinct acoustic class, in particular, for coarser sediments (S to sG).
It is shown that specific grain-size fractions affect the backscatter differently in this study and no one-toone relationship between grain size and backscatter for the entire grain-size spectrum exists (Fig. 10). Therefore, for example, sediments with a high percentage of large grains ([ 16 mm) might result in the same backscatter values as sediment containing mainly small grains (\ 0.5 mm).
These insights are useful in understanding the relationship between backscatter and specific sediment properties and subsequently to generate an appropriate classification scheme for the assignment of sediment properties to acoustic class (backscatter). However, we need to consider that the backscatter strength is dependent on other seabed properties such as sediment bulk density, seafloor roughness, volume heterogeneity, discrete scatterers and sediment layering (Jackson & Richardson, 2007;Williams et al., 2009;Ivakin, 2012) even if several studies have shown a strong relationship to mean grain size, grainsize distribution or shell or gravel content for a specific study area (Goff et al., 2000;Collier & Brown, 2005;Simons et al., 2007;Medialdea et al., 2008;De Falco et al., 2010).
By investigating the influence of mean grain size for well-sorted sediments on backscatter strength, under ideal laboratory conditions, Ivakin & Sessarego (2007) observed a transition from positive to negative correlation where the fraction between mean grain size and acoustic wavelength exceeds 1. The authors hypothesized that the decrease might be explained with the appearance of a different dominating scattering regime. Buscombe et al. (2017) described this situation as a mixed Rayleigh-geometric regime where a transition from Rayleigh to geometric scattering occurs. With increasing acoustic frequency the sediment (with a specific grain size) cannot be described as a continuous medium anymore and the individual grains have to be considered as discrete scatterers. This demonstrates the strong influence of the transmitted frequency on the relationship between backscatter and the presence of coarse grains. Similar observations were found by Eleftherakis et al. (2014). Buscombe et al. (2017) postulated that the lack of discriminatory power in the gravel fraction was due to the transitional scattering regime. However, the organic content [e.g. remaining seagrass fragments (Kostylev et al., 2003;De Falco et al. 2010)], or smallscale topography (scales comparable or smaller than the signal footprint, and therefore below the resolution of the beam footprint) have an influence on the backscatter as well and might affect the results (Lurton et al., 2017).
This knowledge is crucial for efforts aiming to increase the discrimination performance. Recent studies have demonstrated that multispectral backscatter data indeed reveal that areas, showing the same acoustic responses for one frequency, differ in the acoustic response of a second frequency (Hughes Clarke, 2015;Brown et al., 2017). This implies that the use of different frequencies might be an appropriate approach to improve seabed sediment classification (Hughes Clarke, 2015) and resolve ambiguities.

Conclusions
This study introduces an approach to interpolate backscatter data available along MBES trajectories with large track spacing and their classification into distinct acoustic classes based on the Bayesian bedclassification method. It is shown that the Kriging techniques Ordinary, Universal and Simple perform almost equally well, and that due to a lack of a global trend with the depth, incorporating bathymetry as a second variable in Cokriging did not improve the interpolated map. The acoustic class map is converted to a sediment map displaying Folk classes by using grab samples. When quantitatively comparing the resulting interpolated ASC Folk class map with sediment samples, an overall accuracy of 65% was obtained; a similar value was obtained when evaluating a more traditional sediment map with the same samples (from EMODnet). However, a qualitative comparison pleads for a better performance of the interpolated ASC map. This is indicated by more precisely mapped seabed sediment heterogeneities in an order of km's. The evaluation of the high-resolution sediment map obtained from the Bayesian method, which employs the actual backscatter, demonstrates that mapping laterally heterogeneous sediments is improved up to an order of meters.
Considering the increasing use of ASC efforts and the large amount of existing traditional sediment maps, we aimed to integrate ASC into existing sediment maps, in order to optimize sediment mapping. A manual integration reveals that the knowledge of the spatial distributions of the uncertainties of each map is of high importance. The uncertainties in the interpolated full-coverage ASC sediment map are due to both the Kriging interpolation of the MBES backscatter data and the assignment of sediment to acoustic class. For the EMODnet map, the uncertainties are not known. A successful automatic and objective integration would lead to improved marine seabed sediment maps with higher spatial resolution and quantified uncertainties. The improved sediment maps advance marine habitat mapping since sediment types are one of the main driving factors for the distribution of benthic organisms.
The study reiterates the shortcoming of the ASC method, caused by the fact that no one-to-one relationship exists between grain size and backscatter for the entire grain-size spectrum in our study area. This ambiguity reduces the discrimination between sediments and therefore needs to be accounted for during the integration. The dependency of the relationship between grain size and backscatter on acoustic frequency has been described in the literature and provides a way to mitigate the effect of the ambiguity on the acoustic classification. Multispectral backscatter classification is therefore considered as an important future research area to optimize ASC methods.