How well can near infrared reflectance spectroscopy (NIRS) measure sediment organic matter in multiple lakes?

Loss-on-ignition (LOI) is the most widely used measure of organic matter in lake sediments, a variable related to both climate and land-use change. The main drawback for conventional measurement methods is the processing time and hence high labor costs associated with high-resolution analyses. On the other hand, broad-based near infrared reflectance spectroscopy (NIRS) is a time and cost efficient method to measure organic carbon and organic matter content in lacustrine sediments once predictive methods are developed. NIRS-based predictive models are most robust when applied to sediments with properties that are already included in the calibration dataset. To test the potential for a broad applicability of NIRS models in samples foreign to the calibration model using linear corrections, sediment cores from six lakes (537 samples, LOI range 1.03–85%) were used as reference samples to develop a predictive model. The applicability of the model was assessed by sequentially removing one lake from the reference dataset, developing a new model and then validating it against the removed lake. Results indicated that NIRS has a high predictive power (RMSEP < 4.79) for LOI with the need for intercept and slope correction for new cores measured by NIRS. For studies involving many samples, NIRS is a cost and time-efficient method to estimate LOI on a range of lake sediments with only linear bias adjustments for different records.


Introduction
Near infrared reflectance spectroscopy (NIRS) is increasingly included in soil studies to reduce costs and processing time (Chodak 2008). NIRS is a non-destructive method in which a light beam with known spectral properties is directed at a sample and the reflected light is measured in the visible and near infrared region of the light spectrum (approx. 350 nm to 2500 nm) (Kaye 1954, 1955). Measured reflectance values are related to the organic bonds between molecules in a sample, making it feasible to calibrate multivariate predictive models to measure organic compounds in sediment analyses. The main strength of NIRS compared to traditional physical and chemical analyses of sediments is that a single sample can be used to simultaneously analyze a range of different chemical or structural parameters (Cozzolino and Morón 2003) and, since the method is non-destructive, the sample can then be used for other analyses.
The content of organic matter in sediments is widely used as a stratigraphic property and, for example, as an indicator of ecosystem productivity in lake sediment analyses, as well as for assessing climate and environmental changes (Björck et al. 1991; Nesje and Dahl 2001; Birks and Birks 2006). The principal analytical method, loss on ignition (LOI), which has been used for several decades, is an accurate method to measure soil and sediment organic matter (LOI at 550°C) and CaCO3 content (LOI at 950°C) (Dean 1974; Heiri et al. 2001); although inexpensive in materials, it is expensive in staff time. Alternative methods using wet oxidation (Walkley-Black method; Heanes 1984) are more expensive and time consuming even when automated, and may over-estimate the actual organic carbon content (Wang et al. 2012).
NIRS has been successfully applied in previous studies to measure the physical and chemical composition of lake sediments, including LOI and geochemical elements such as C, N and P (Malley et al. 1999, 2000; Inagaki et al. 2012). The ability of this method to measure different proxies from a single analysis reduces sampling and laboratory analysis costs and increases time efficiency (Nduwamungu et al. 2009). However, the use of NIRS has been limited to the cores (or lakes) for which the calibration models have been developed (Rosén et al. 2010). This restriction is a strong drawback for a wider application of the method. It has been assumed that site-specific variation in sediment chemical composition or stratigraphy will negatively affect model applicability, even rendering the model unusable in some situations. However, there is a growing literature on the applicability of spectrometric tools to samples that are outside the calibrated population (Rosén et al. 2011; Meyer-Jacob et al. 2017), although new measurements will generally require some correction factors based on a subset of samples from the new core (Roggo et al. 2007). Spectroscopic tools for measuring sediment properties have developed markedly and give very good results, although the transferability of models remains a challenge (Table 1).
Here we present LOI predictive models based on NIRS scans of six contrasting lacustrine sediment cores from boreal to high Arctic locations to (1) assess the method's accuracy and feasibility, and (2) test whether NIRS models can be transferred to new samples with only a simple linear correction as adjustment.

Lake sediments
A total of 537 lacustrine sediment samples were analysed from six geographically distinct lakes across the boreal to high Arctic zone (Table 2). For three of the lakes, the description of the cores has already been published: Skartjørna in Svalbard (Alsos et al. 2016), Uhca Rohči at Varanger Peninsula in Northern Norway (Clarke et al. 2019), and Bolshoye Schuchye in the Polar Urals (Svendsen et al. 2018). In two cases, we sampled new cores from lakes for which earlier lithological studies exist: Gauptjern in Troms (Jensen and Vorren 2008) and Øvre Æråsvatnet in Nordland (Alm 1993). For one lake, Uhca Rohči 1, the lithology is provided in supplementary material (ESM 1).

NIRS measurements
Approximately 1 g of wet sediment was dried at 50°C for 24 h, ground in a mortar and scanned with a portable spectrometer (Fieldspec 3, ASD Inc, Boulder, CO). Spectra were measured as reflectance in the 350-2500 nm spectral range with a resolution of 1.4 nm in the 350-1050 nm region and 2 nm in the 1000-2500 nm region, and automatically interpolated to 1-nm resolution. Each sample was scanned five times in a black polyacetal sample holder, rotating and mixing the sample between scans to incorporate all spectral variability for each sample. The repeated spectra were averaged into a single spectrum per sample. On average, approximately 500 samples could be scanned with the NIRS method per week. Statistical analyses were performed in R 3.2.2 (R Development Core Team 2014) using partial least squares regression (Martens and Naes 1989) as implemented in the "pls" package (Mevik and Wehrens 2007). Data pre-processing tools (i.e. derivatives, smoothing and spectra standardization) were applied from the "prospectr" package (Stevens and Ramirez-Lopez 2014).
The following data transformations were tested for the model development: centering, scaling, smoothing based on moving averages, standard normal variate and 1st and 2nd order derivatives (Stevens and Ramirez-Lopez 2014).
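As an illustration of three of these transformations, the following is a minimal Python sketch rather than the R code used in the study. The function names and the reflectance values are hypothetical, the standard normal variate is assumed to use the sample standard deviation, and the plain first difference stands in for the windowed polynomial derivatives that a dedicated package would provide.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum on its own mean and
    scale it by its own sample standard deviation (assumed convention)."""
    centre = mean(spectrum)
    spread = stdev(spectrum)
    if spread == 0:
        raise ValueError("a flat spectrum cannot be scaled")
    return [(value - centre) / spread for value in spectrum]


def moving_average(spectrum, window=3):
    """Smooth with a moving average over `window` adjacent bands.
    The output is shorter than the input by window - 1 points."""
    if window < 1 or window > len(spectrum):
        raise ValueError("window must be between 1 and the spectrum length")
    return [mean(spectrum[i:i + window])
            for i in range(len(spectrum) - window + 1)]


def first_difference(spectrum):
    """Crude first derivative: difference between neighbouring bands."""
    return [right - left for left, right in zip(spectrum, spectrum[1:])]


# Made-up reflectance values for five adjacent bands (illustration only).
reflectance = [0.10, 0.12, 0.15, 0.19, 0.24]
smoothed = moving_average(reflectance, window=3)
transformed = first_difference(snv(smoothed))
```

The order of the steps (smoothing, then standardization, then differentiation) is only one possible choice; in practice each combination would be compared through cross-validation.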
Seven individual models were developed for LOI. The first model included all the samples from the six sediment cores (n = 537): calibration and validation sets were created using the Kennard-Stone algorithm (Kennard and Stone 1969) to ensure adequate spectral variability in both the calibration and validation sets, with 85% of the samples assigned to the calibration dataset and 15% to the validation dataset. The remaining six models were developed by removing one lake from the calibration dataset and validating the resulting model against this record (Table 2). All the models were internally cross-validated with 10-fold cross-validation.
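The two splitting schemes can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation used in the study: the function names are hypothetical, distances are plain Euclidean distances on whatever feature vectors are supplied, and the Kennard-Stone rule is written in its simplest form (start from the two most distant samples, then repeatedly add the sample farthest from those already chosen).

```python
from math import dist


def kennard_stone(samples, n_select):
    """Return the indices of n_select samples chosen by the Kennard-Stone rule."""
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")
    # Seed the selection with the two most distant samples.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist(samples[pair[0]], samples[pair[1]]),
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    # Repeatedly add the sample whose nearest selected neighbour is farthest away.
    while len(selected) < n_select:
        candidate = max(
            remaining,
            key=lambda i: min(dist(samples[i], samples[s]) for s in selected),
        )
        selected.append(candidate)
        remaining.remove(candidate)
    return selected


def leave_one_lake_out(lake_labels):
    """Yield (held-out lake, calibration indices, validation indices)."""
    for lake in sorted(set(lake_labels)):
        calibration = [i for i, label in enumerate(lake_labels) if label != lake]
        validation = [i for i, label in enumerate(lake_labels) if label == lake]
        yield lake, calibration, validation


# Toy one-dimensional "spectra" (illustration only).
toy_samples = [(0.0,), (1.0,), (2.0,), (10.0,)]
calibration_idx = kennard_stone(toy_samples, 3)
validation_idx = [i for i in range(len(toy_samples)) if i not in calibration_idx]
```

For an 85/15 split, `n_select` would be set to roughly 85% of the sample count, with the unselected samples forming the validation set. The pairwise search is quadratic in the number of samples, which is acceptable for a few hundred spectra.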
The most parsimonious models were selected based on a high coefficient of determination (R²) for a given number of latent variables (k) and a low root mean squared error of the cross-validation (RMSECV), which assesses the error between NIRS-measured and reference values. Finally, each calibration model was tested against its respective validation set: the coefficient of determination (R²), root mean squared error of the predictions (RMSEP), bias (systematic error) and the intercept and slope of the linear fit of the predictions were calculated to assess the applicability of the model.
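The validation statistics named above can be written out as a short Python sketch. The function names are hypothetical and two conventions are assumptions rather than statements from the text: bias is taken as the mean of predicted minus reference values, and R² is computed as the squared Pearson correlation between predicted and reference values.

```python
from math import sqrt


def rmsep(reference, predicted):
    """Root mean squared error of prediction."""
    n = len(reference)
    return sqrt(sum((p - r) ** 2 for r, p in zip(reference, predicted)) / n)


def bias(reference, predicted):
    """Systematic error, assumed here to be mean(predicted - reference)."""
    n = len(reference)
    return sum(p - r for r, p in zip(reference, predicted)) / n


def r_squared(reference, predicted):
    """Squared Pearson correlation between predicted and reference values."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_p = sum(predicted) / n
    s_rp = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted))
    s_rr = sum((r - mean_r) ** 2 for r in reference)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    return s_rp ** 2 / (s_rr * s_pp)


# Made-up LOI values (%): predictions are exactly half the reference values.
reference_loi = [2.0, 4.0, 6.0, 8.0]
predicted_loi = [1.0, 2.0, 3.0, 4.0]
stats = {
    "RMSEP": rmsep(reference_loi, predicted_loi),
    "bias": bias(reference_loi, predicted_loi),
    "R2": r_squared(reference_loi, predicted_loi),
}
```

In this toy case R² is perfect while RMSEP and bias are large, which is the situation in which a purely linear correction is sufficient.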
Intercept and slope of the linear fit were used to apply corrections based on the equation y = mx + b, where y is the corrected LOI value, m is the slope, b is the intercept and x is the NIRS-measured raw value.
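A minimal Python sketch of this correction step is given below, assuming that the slope and intercept are obtained by ordinary least squares of reference LOI on raw NIRS predictions for a subset of samples, and that the resulting line y = mx + b is then applied to every raw prediction from the same core. The function names and numbers are hypothetical.

```python
def fit_linear_correction(raw_subset, reference_subset):
    """Least-squares slope m and intercept b of reference LOI on raw NIRS values."""
    n = len(raw_subset)
    if n < 2 or n != len(reference_subset):
        raise ValueError("need at least two paired values")
    mean_x = sum(raw_subset) / n
    mean_y = sum(reference_subset) / n
    s_xy = sum((x - mean_x) * (y - mean_y)
               for x, y in zip(raw_subset, reference_subset))
    s_xx = sum((x - mean_x) ** 2 for x in raw_subset)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def apply_correction(raw_values, slope, intercept):
    """Corrected LOI: y = m * x + b for every raw NIRS value x."""
    return [slope * x + intercept for x in raw_values]


# Made-up example: two samples with reference LOI correct a whole core.
raw_core = [1.0, 2.0, 3.0, 4.0, 5.0]
m, b = fit_linear_correction([1.0, 5.0], [3.0, 11.0])
corrected_core = apply_correction(raw_core, m, b)
```

Two reference points are used here only to keep the arithmetic visible; a real correction would rest on many more paired measurements.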

Results
LOI values ranged from very low to very high (1.03% to 85%), with a high overlap observed in the range of values between different lakes (Table 2). The dataset covers a large variation in sediment composition, with samples ranging from organic-rich gyttja to predominantly minerogenic silts and clay with and without a significant CaCO3 content (Table 2). The full model including all the samples from the six sediment cores performed best (Fig. 1), with good fit in both the calibration and internal validation datasets (Table 3). The remaining models showed good model performance on the calibrations (Fig. 2) but needed intercept and slope correction on the external validations (Fig. 3, Table 3). In addition, the model excluding Gauptjern (Fig. 3d) showed a curvilinear response in the highest LOI values, most likely due to high CaCO3 content (Jensen and Vorren 2008). Once corrected for intercept and slope, RMSEP values were reduced to levels similar to those of the full model (Table 3), showing that intercept and slope corrections result in highly precise estimates of sediment LOI on samples that do not belong to the reference population.
The samples belonging to the Polar Urals and Skartjørna showed a poor coefficient of determination on the validation set (R² of 0.22 and 0.17, respectively) (Fig. 3e, f), although RMSEP values were similar to the other models after correction for intercept and slope (RMSEP 1.37 and 4.26, respectively).
Table note (model summary): SNV = standard normal variate; the numbers after the derivative give the differentiation order, polynomial order and window size, respectively; k = number of latent variables; R² = coefficient of determination between model-fitted and reference LOI measurements; RMSECV and RMSEP = root mean squared error of the cross-validation and of the prediction, respectively; bias = systematic error between reference and NIRS-measured values; Intercept and Slope = coefficients of the linear fit between the reference and NIRS-measured values.

Discussion
The high predictive ability of LOI based on NIRS reported here extends previous studies showing NIRS as a reliable, non-destructive method for measuring lake sediment properties, both for within-site samples (Stenberg et al. 2010; Gholizadeh et al. 2013) and for samples from contrasting sites that are not part of the calibration model (Rosén and Persson 2006; Rosén et al. 2011; Meyer-Jacob et al. 2017). A single NIRS spectrum can be used to simultaneously determine several sediment properties with no increased costs: one operation for sample preparation and scan is enough to estimate several variables, if predictive models are established (Malley et al. 2000; Cozzolino and Morón 2006). Cost per sample is further reduced with a greater number of predictive models available: sediment samples scanned in previous studies can retrospectively be analysed through their NIRS spectra and a new dimension added into the data gathering even after the samples have been scanned and analyzed with destructive methods. An added advantage is that, due to the small amount of sediment required to acquire a spectrum (approximately 1 g), high-resolution down-core analyses can be performed. While it is expected that NIRS predictions are most accurate when limited to samples with similar properties to those included in the calibration dataset (Foley et al. 1998; Chodak 2008), our study shows that comprehensive models including enough variability in stratigraphy or geographical origin can be applicable to other samples. Thus, we suggest that this is a first step towards creating worldwide lacustrine spectral libraries to develop true global models applied to lacustrine sediments (Viscarra Rossel 2009; Rosén et al. 2011; Stenberg and Viscarra Rossel 2016).
When sequentially removing a lake from the calibration set and then testing the model against it, we found a linear bias in most of the lakes. After performing the linear corrections, we obtained highly accurate estimates (R² > 0.82, RMSEP < 7.5, except for the Polar Urals and Skartjørna samples), comparable to those of the model including all the lakes (R² = 0.98, RMSECV = 3.3). The Polar Urals and Skartjørna samples showed a poor coefficient of determination, but similar RMSEP values to the other models after intercept and slope corrections. Given the low variability in the LOI values in these two cores, the apparent model performance seems suboptimal: such cases, with consistently low values along the core, will raise concerns about the applicability of the model. However, the raw NIRS predictions reveal patterns similar to the measured LOI values along the core (Fig. 4). Although apparent performance may seem worse, the general LOI pattern along the core is correctly detected, and the RMSEP shows that the low R² values are an effect of the small LOI range in these two cores.
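The link between a small LOI range and a low R² can be made explicit with a short derivation. It assumes that the corrected prediction is the least-squares line fitted to the same validation samples on which RMSEP and R² are evaluated; the symbols (y_i for reference LOI, x_i for raw NIRS values, sigma_y for the spread of reference LOI in the core) are introduced here for illustration.

```latex
\begin{align}
\hat{y}_i &= m x_i + b, \\
\mathrm{RMSEP}^2 &= \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2, \qquad
\sigma_y^2 = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2, \\
R^2 &= 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}
     = 1 - \frac{\mathrm{RMSEP}^2}{\sigma_y^2}.
\end{align}
```

Under this assumption, two cores with the same RMSEP can have very different R² values: where the reference LOI barely varies, sigma_y approaches RMSEP and R² approaches zero, even though the absolute prediction error is unchanged.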
The prediction for Gauptjern (Fig. 3d) showed a non-linear fit towards high LOI values. This is expected when core properties are outside the predictive ability of the model, owing to parameters such as chemical composition (e.g. CaCO3 content) or particle size distribution (Barthes et al. 2006). The use of a validation subset for each core along the gradient of chemical/physical properties of the sediment (e.g. organic matter, carbon content) is a safe way to identify such issues. Once the problematic region has been detected, it can be corrected for linear bias, or the core (or core section) can be analyzed with traditional measurements and incorporated into the model. Even raw (uncorrected) LOI predictions from NIRS models already provide a first assessment of the LOI patterns down the core (Fig. 4). This helps to focus more intensive analyses based on other approaches on core regions with interesting patterns such as steep changes in LOI, or unusual values. To establish a robust validation of NIRS models in the future, McTiernan et al. (1998) recommend a minimum sample size of 20 samples from a single lake to develop a predictive model. However, we suggest scanning a larger number of samples (a minimum of 40 to 50) for each lake, to ensure that the intrinsic properties of each sediment origin are represented. Robust models require several hundred samples representing a large range of LOI: once the predictive model is established, we suggest measuring a subset of samples (approx. 20% of the original number of samples) using the classical LOI methodology, selecting samples according to their NIRS-predicted value to ensure a proper spread along the LOI values. In addition, core sections with new properties (stratigraphy, grain size, etc.) need to be more intensively sampled and included in the model afterwards. This will result in accurate correction factors (i.e. intercept and slope) for high-quality NIRS-inferred LOI estimates (Table 3). This process needs to be done for each new sediment core analysed by NIRS. It is expected that incorporating these validation samples into the model will improve the model robustness and increase the applicability for different samples from different lakes, resulting in a synergistic effect that will reduce the future need for intercept and slope corrections.
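One way to pick such a validation subset is sketched below in Python. This is an illustrative assumption about how "a proper spread along the LOI values" could be achieved, not a procedure specified in the text: samples are ranked by their NIRS-predicted LOI and picked at evenly spaced ranks, so that the lowest and highest predictions are always included. The function name and the example values are hypothetical.

```python
def select_validation_subset(predicted_loi, fraction=0.2):
    """Return indices of samples to measure with classical LOI,
    spread evenly across the ranked NIRS-predicted values."""
    n = len(predicted_loi)
    if n < 2:
        raise ValueError("need at least two samples")
    # At least two samples, so that a slope and an intercept can be fitted.
    n_pick = min(n, max(2, round(fraction * n)))
    # Sample indices ordered from lowest to highest predicted LOI.
    order = sorted(range(n), key=lambda i: predicted_loi[i])
    # Evenly spaced ranks from the minimum (rank 0) to the maximum (rank n - 1).
    ranks = [i * (n - 1) // (n_pick - 1) for i in range(n_pick)]
    return [order[rank] for rank in ranks]


# Made-up predicted LOI values (%) for ten samples down a core.
predicted = [5, 1, 9, 3, 7, 2, 8, 4, 6, 10]
to_measure = select_validation_subset(predicted, fraction=0.2)
```

Core sections with new properties would still need to be added by hand, since ranking by predicted LOI alone does not capture changes in stratigraphy or grain size.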

Conclusions
This study shows that NIRS can be used to estimate LOI in a wide variety of lake sediment types from six geographically distinct lakes with only minimal calibration samples. NIRS therefore has the potential to become a new standard procedure in lacustrine sediment research for the simultaneous and high resolution measurement of several sediment properties with only a 20% subset of validation samples needed to be analyzed with traditional methods in order to adjust the NIRS measured values, thus saving time and costs in sediment analyses. Such transferable models are especially valuable when large sets of samples (i.e. several cores from a lake, or long cores) are to be analyzed. Future work should focus on adding samples with different stratigraphic properties and geographical regions to build a more robust library.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Data availability
The dataset used to develop the models presented in this article is available at https://doi.org/10.18710/OJC4TH.