Evaluation of the Near Infrared Spectroscopy (NIRS) to predict chemical composition in Ulva ohnoi

Green algae of the genus Ulva have been identified as suitable organisms for biomass production and good candidates for the development of seaweed blue-biotech industries. The chemical composition of the algae fluctuates during growth, largely in response to environmental factors, which makes rapid phenotyping protocols necessary. In this work, the efficacy of Near Infrared Spectroscopy (NIRS) to predict dry matter, mineral fraction (MF), nitrogen (N), and carbon (C) contents directly from wet untreated samples and from dried samples was studied. Partial least squares (PLS) models were built from spectra recorded on 80 samples to predict dry matter, and on 44 samples to predict carbon, nitrogen and mineral fraction on a wet weight (ww) and dry weight (dw) basis. NIR models developed from spectra acquired on wet samples had good accuracy (R² > 0.9) for the prediction of N (on a ww and dw basis) and C (on a ww basis). Models with lower R² scores were obtained for dry matter (R² = 0.610) and MF (R² = 0.506-0.693). The models developed to predict carbon and nitrogen directly on wet, untreated samples show that NIRS is a valuable tool for determining these parameters rapidly and at low cost, supporting decisions about the optimal harvesting time.


Introduction
Interest in non-animal protein sources has increased over the last decade, and seaweeds have emerged as potential candidates (Pereira 2011). Seaweeds also have high potential as a source of bioactive compounds for human and animal consumption or pharmaceutical applications (Barrington et al. 2009; Holdt and Kraan 2011; Peña-Rodriguez et al. 2011; Rasyid 2017; Gomez-Zavaglia et al. 2019).
The biochemical composition of wild-harvested seaweeds in general, and their protein content in particular, depends on species, harvesting location, and season (Pedersen et al. 2010; Pereira et al. 2012; Vasconcelos et al. 2022). Fluctuations in the composition of harvested seaweeds are a bottleneck for the development of the seaweed industry, which needs a stable supply with a stable composition.
The green algae of the genus Ulva have been identified as suitable organisms for biomass production and as good candidates for the development of seaweed blue-biotech industries. Moreover, the high growth rates of Ulva ohnoi in temperate waters and its nitrogen and phosphorus biofiltration capacity make it a suitable candidate for land-based cultivation (Lawton et al. 2013; Mata et al. 2016). When cultured in tanks, cultivation conditions (nutrients and light) affect growth rates and biochemical composition (Neori et al. 1991; Harrison and Hurd 2001; Msuya and Neori 2008; Angell et al. 2014; Oca et al. 2019; Toth et al. 2020); hence, cultivation conditions could be adapted from a food perspective (biochemical composition) or from a growth perspective (biomass yield) (Toth et al. 2020). Fast characterization of the biochemical composition would help to determine the best harvesting time to ensure specific seaweed properties.
Determining the biochemical composition of seaweed requires time-consuming and laborious methods, expensive equipment and trained personnel, and in general involves hazardous compounds. Near infrared spectroscopy (NIRS) is commonly used for routine analysis in the food and feed industries (Manley and Williams 2021; Beć et al. 2022), and offers advantages such as speed and versatility, allowing the determination of multiple parameters at once in different types of samples (powders, pellets, liquids, etc.) with little or no preparation. NIRS has also been shown to be a valuable tool for measuring protein and other compounds in brown (Horn et al. 1999; Hay et al. 2010; Yang et al. 2021; Campbell et al. 2022; Cao et al. 2022) and red seaweeds (Tadmor et al. 2022).
In the present study near infrared spectroscopy (NIRS) was used to predict dry matter, mineral fraction, nitrogen, and carbon content in wet and dry samples of the green seaweed U. ohnoi.

Ulva ohnoi
Ulva ohnoi was collected at the Ebro Delta from the bioremediation ponds of an aquaculture facility in Sant Carles de la Ràpita, Spain (latitude, 40.62 N; longitude, 0.66 E). The species was genetically identified by DNA extraction and PCR amplification of the chloroplast rbcL gene following the protocol described in Hayden et al. (2003) with the primers used by Manhart (1994). It was maintained at the Aquaculture Laboratories of the Universitat Politècnica de Catalunya in Castelldefels (Spain) in indoor tanks fed with water from a Solea senegalensis recirculation aquaculture system (RAS) equipped with a biological and mechanical filter, water temperature control and aeration. Ulva ohnoi was cultivated in three circular tanks (28 cm water depth, 64 cm diameter) with bottom aeration to tumble the seaweeds. Seaweed tanks were illuminated by LED light sources (with photon flux densities on the water surface ranging from 160 to 1000 µmol photons m⁻² s⁻¹). The seaweed stocking densities (0.8 to 3.0 kg m⁻²) and the photon flux densities on the water surface were combined in order to obtain fronds with different ranges of dry matter, nitrogen and carbon content.

Seaweed sampling and analysis
Over the course of the study (between February 2022 and May 2022), 80 samples of U. ohnoi were used. During the study the temperature range in the tanks was 12.5-19.5 °C, pH 7.6-9.5, total ammonia nitrogen 0.01-0.20 mg L⁻¹, N-NO₃ 25-38 mg L⁻¹, P-PO₄ 0.7-4 mg L⁻¹ and alkalinity 120-250 mg CaCO₃ L⁻¹. NIR spectra were acquired for wet (ws) and dry (ds) samples. Dry matter (DM) was determined in all samples, and 44 out of the 80 samples were analyzed for mineral fraction (MF), carbon (C), and nitrogen (N) content.

Seaweed sampling and preparation
Seaweed fronds were randomly taken from the tanks, and the same protocol was always followed to obtain wet and dry samples (ws and ds): excess water was removed in a mechanical spinner and fronds were rinsed (first with fresh water and then with distilled water) to remove epiphytes and salt; then, excess water was removed again with the mechanical spinner (ws). NIR spectra of the wet samples (ws) were acquired and the samples were immediately weighed, obtaining the wet weight (ww) for dry matter determination. Samples were oven-dried at 60 °C for 48 h, weighed (dry weight: dw), ground in a mill (finely minced), and the NIR spectra of the dry samples (ds) were acquired. Samples were stored in closed containers and kept in a dark and dry area until the chemical analysis.

Reference method analysis
Mineral fraction (MF) was quantified gravimetrically after incineration of the dry samples (around 1.5 g) for 14 h at 550 °C in a muffle furnace (Lahaye and Jegou 1993). Carbon (C) and nitrogen (N) were analyzed using a CHN elemental analyser EA-CE 1108 (Thermo Fisher Scientific) with sulphanilamide as a standard.
MF, C, and N were expressed on both a wet weight and a dry weight basis (ww and dw).
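The relation between the two bases can be illustrated with a minimal Python sketch. It assumes that dry matter is computed as dry weight over wet weight, and that a content on a ww basis equals the dw-basis content multiplied by DM/100; the text does not spell out the conversion, so both formulas are assumptions. The function names and sample values are hypothetical.

```python
def dry_matter_percent(wet_weight_g: float, dry_weight_g: float) -> float:
    """Dry matter (DM) as a percentage of wet weight (assumed definition)."""
    if wet_weight_g <= 0:
        raise ValueError("wet weight must be positive")
    return 100.0 * dry_weight_g / wet_weight_g


def dw_to_ww_basis(content_dw_percent: float, dm_percent: float) -> float:
    """Convert a content in % of dry weight into % of wet weight (assumed conversion)."""
    return content_dw_percent * dm_percent / 100.0


# Hypothetical sample: 50 g after spinning, 7.5 g after oven-drying, 4.0 % N (dw).
dm = dry_matter_percent(50.0, 7.5)
n_ww = dw_to_ww_basis(4.0, dm)
print(f"DM = {dm:.1f} %, N = {n_ww:.2f} % ww")
```

Under these assumptions, any variable expressed on a ww basis carries the DM signal with it, which is worth keeping in mind when reading correlations between DM and ww-based contents.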

Spectra acquisition
Since water contributes strongly to the NIR region, spectra were acquired on both wet (ws) and dry (ds) samples: one after the mechanical spinning (ws) and the other after oven-drying and grinding (ds) (see Sect. "Seaweed sampling and preparation"). In both cases, NIR spectra were acquired by diffuse reflectance, using a FT-NIR Antaris II (Thermo Fisher), equipped with a spinner module. Spectra were obtained as the average of 3 technical replicates acquired in the range between 10,000 and 4000 cm⁻¹, with a resolution of 8 cm⁻¹ and 32 scans.

Statistical analysis
Chemical variables were analyzed to ensure that significant variation was captured among the samples. All analyses were performed using R software (v.3.6.1; R Core Team, Austria). Descriptive statistics (maximum, minimum, average) were calculated for each variable in order to describe the range of variation. The fit to a normal distribution was assessed by means of the Shapiro-Wilk test ("ggpubr" package). The correlation between all the variables was calculated with the Pearson correlation coefficient. Significant correlations (p < 0.05) were plotted in a correlation matrix using the "corrplot" package. Graphics were produced using the "ggplot2" package.
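As an illustration of the correlation step, the Pearson coefficient can be sketched in a few lines of Python. This is not the authors' R code: the function name and the data values below are hypothetical, and the significance test (p < 0.05) is left out.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values: dry matter (%) and carbon on a ww basis (%).
dm_values = [11.0, 13.5, 15.0, 17.2, 18.9]
c_ww_values = [3.2, 4.0, 4.4, 5.3, 5.8]
print(f"r = {pearson_r(dm_values, c_ww_values):.3f}")
```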

Multivariate model development
The whole spectral pre-treatment and data modeling process was performed with PLS Toolbox v8.9.1 software (Eigenvector Research, Inc.). Different pre-treatments and their combinations were tested: Standard Normal Variate (SNV) and Savitzky-Golay 1st (SG-1D) and 2nd (SG-2D) derivatives. Before model development, samples were divided into two sets: a calibration set, for model construction, and an external validation or prediction set, to assess the model's predictive ability. These sets were automatically selected using the Onion algorithm, assigning 66% of samples to calibration and the remaining 34% to external validation. After the automatic selection, the sets were manually checked for each parameter and modified if necessary, so that all the samples with extreme reference values fell in the calibration set.
To develop the predictive models for the studied parameters, the Partial Least Squares (PLS) regression algorithm was used. It relates the spectra of all samples to their reference data while considerably reducing the number of variables. The new variables, known as latent variables (LVs), are spectral vectors directly related to the target parameter, which are then used for the prediction of new samples. The choice of the optimal number of LVs is critical. If the number of LVs selected is too low, some important spectral information of the samples may not be sufficiently represented, while if it is too high, spectral noise will be added to the model, increasing the risk of overfitting. In the present work, for each model, the number of LVs was established according to the decrease in root mean square error (RMSE) and the increase in accumulated variance and R² obtained with each LV relative to the previous one (in calibration and cross-validation).
Cross-validation is an internal validation applied to the calibration samples. It was applied automatically by the software, using the Venetian Blinds algorithm and splitting the samples into ten groups. Outliers of the model were discarded according to their Hotelling's T² and Q residual values, and the differences between measured and predicted values during the construction of the PLS models. For each parameter and sample presentation (ws or ds), the best model was selected according to the highest determination coefficient (R²) and lowest root mean square error (RMSE), both in the calibration (C) and prediction (P) sets of samples. The predictive ability of the final models can be assessed by means of their reference-predicted scatterplots and the related statistics obtained in prediction (mainly R² and RMSEP).
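Two of the steps described above, SNV pre-treatment and the Venetian Blinds split, are simple enough to sketch in plain Python. This is an illustrative sketch, not the PLS Toolbox implementation: it assumes that SNV scales each spectrum by its own sample standard deviation (n - 1 denominator) and that each blind is one sample thick. The function names and input values are hypothetical.

```python
import math


def snv(spectrum):
    """Standard Normal Variate: centre and scale one spectrum by its own mean and SD."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]


def venetian_blinds(n_samples, n_splits):
    """Assign sample indices to folds in an interleaved pattern.

    Fold k holds samples k, k + n_splits, k + 2 * n_splits, ...
    (assumes one sample per blind).
    """
    return [list(range(k, n_samples, n_splits)) for k in range(n_splits)]


# Hypothetical three-point "spectrum" and a seven-sample calibration set.
print(snv([1.0, 2.0, 3.0]))
print(venetian_blinds(7, 3))
```

With ten groups, as used in the study, the call would be `venetian_blinds(n_calibration_samples, 10)`. Each fold is left out once while the model is fitted on the remaining samples.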

Chemical parameters
Eighty samples of Ulva ohnoi were used to determine DM, and 44 of the 80 to determine N, C, and MF. Since a good range of data from reference analyses is critical for developing robust calibration models, the distribution of individual values for each variable was analyzed. Figure 1 shows the range of values (max and min) and the mean obtained for each parameter. Individual values for DM, N (ww, dw), and C (ww) were well distributed along the range of variation, despite an isolated sample in the low part of the range in the case of N (dw) (Fig. 1). Individual values for C (dw) and MF (ww, dw) were concentrated around the mean (normal distribution). The Shapiro-Wilk test for normality showed that C (dw) and MF (ww, dw) fitted a normal distribution (p > 0.01), while the remaining variables were not normally distributed (DM showed a distribution skewed towards high values, while N (ww, dw) and C (ww) showed nonsymmetric and bimodal distributions).
Bivariate correlation analysis showed a significant positive correlation between DM and C and N contents (Fig. 2). Correlations between DM and N and C estimated on a ww basis were higher (r = 0.97 and r = 0.98, respectively) than on a dw basis (r = 0.75 and r = 0.55, respectively). MF on a ww basis showed a positive correlation with DM (r = 0.86), N (r = 0.83) and C (r = 0.82) (Fig. 2), i.e. samples with higher DM also accumulate more MF. However, on a dw basis MF content was negatively correlated with N (dw, r = -0.42) and C (dw, r = -0.51).

NIR spectra versus chemical composition
Figure 3 shows the raw and pretreated (SNV pretreatment) spectra measured from wet and dried samples. Spectra of wet samples differed from those of dried samples, showing broad bands around 5000 cm⁻¹ and 7000 cm⁻¹ (Fig. 3a). SNV corrected scattering effects (Fig. 3b), and SG-1D treatment (not shown) led to spectra with more defined peaks, although the absorbance of water remained very high in wet samples.
PLS models to estimate chemical variables were developed using different spectral pretreatments (SNV, SG-1D, SNV + SG-1D) on the entire spectral range (Table 1). The best pretreatment for each variable was selected based on the goodness of the resulting models. The performance of the models varied for each compound and for the two sample presentations (ws and ds: wet and dried samples). R² for the calibration models ranged from 0.365 to 0.952. Models developed from NIR spectra acquired on wet samples (ws) showed high R² for N (ww R² = 0.915, dw R² = 0.904) and C (ww R² = 0.941), moderate R² for C (dw R² = 0.763) and DM (R² = 0.773), and poor R² for MF (ww R² = 0.642, dw R² = 0.365) (Table 1). Models using NIR spectra acquired on dried samples (ds) were very similar for N (ww R² = 0.952, dw R² = 0.895) and C (ww R² = 0.912, dw R² = 0.865), but differed widely for MF, where models developed from dry samples had better quality (ww R² = 0.852, dw R² = 0.636).
In general, prediction results (R²) were slightly lower than, but similar to, the calibration ones (Table 1). RMSEP values were, for most of the variables, equivalent to the corresponding RMSEC values, indicating good robustness of the NIR models.
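The prediction statistics listed in Table 1 (RMSEP, SEP, SDref and RPD as the ratio SD/SEP) can be computed from reference and predicted values as in the following Python sketch. It assumes the common convention in which SEP is the bias-corrected standard deviation of the prediction errors with an n - 1 denominator; the software used in the study may define it slightly differently. The function name and the data are hypothetical.

```python
import math


def prediction_statistics(reference, predicted):
    """RMSEP, bias, SEP, SDref and RPD for one external validation set."""
    n = len(reference)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    errors = [p - r for r, p in zip(reference, predicted)]
    bias = sum(errors) / n
    # RMSEP includes the bias; SEP removes it (assumed convention).
    rmsep = math.sqrt(sum(e * e for e in errors) / n)
    sep = math.sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))
    mean_ref = sum(reference) / n
    sd_ref = math.sqrt(sum((r - mean_ref) ** 2 for r in reference) / (n - 1))
    return {"RMSEP": rmsep, "bias": bias, "SEP": sep,
            "SDref": sd_ref, "RPD": sd_ref / sep}


# Hypothetical reference and predicted nitrogen contents (% dw).
stats = prediction_statistics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.1, 3.9])
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

A small gap between RMSEC and RMSEP, as reported above, suggests that a model is not fitted to noise in the calibration set, while RPD relates the prediction error to the spread of the reference values.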

Chemical parameters
The range of variation of the parameters for the samples analyzed in our study is representative of the phenotypic diversity that can be found in cultured Ulva spp. (Msuya and Neori 2008; Al-Hafedh et al. 2012; Angell et al. 2014; Mata et al. 2016). Chemical variables DM, C, and N showed non-normal distributions, with a uniform distribution of the reference values along the working ranges, which is favourable for the development of NIR models, according to Williams (2001). With regard to the correlations among variables, we identified significant and positive correlations between all the chemical variables expressed on a ww basis, whereas on a dw basis we identified a negative correlation between the inorganic (MF) and organic (C, N) fractions of the DM. The positive correlation between DM and C and N on a dry and wet weight basis is due to the fact that C is the main constituent of DM (C represents 27.2-35.8% dw) and N is an important element in the synthesis of biomass, despite being present at a lower concentration (N represents 3.2-4.7% dw) (Fig. 1). The negative correlation on a dry weight basis between MF and N (r = -0.42) and C (r = -0.51) indicates that the inorganic fraction of the DM (MF) diminishes in samples with a high organic fraction (N, C).

NIR spectra versus chemical composition
The broad bands observed around 5000 and 7000 cm⁻¹ in the raw spectra of wet samples (Fig. 3) are due to the combination bands and 1st overtone of O-H bond vibrations in water molecules. These bands were expected, since water contributes strongly to NIR spectra and was the major constituent of wet samples, with a content ranging between 80.9 and 89.8% (Fig. 1).
NIR models developed from spectra acquired on wet samples had good accuracy for the prediction of N (on a ww and dw basis) and C (on a ww basis), and thus can be used to characterize Ulva ohnoi samples directly after harvesting. Models with lower R² scores were obtained for DM and MF. The low R² scores for MF may be due to the lack of discriminating spectral bands specific to this parameter. Overall, our results show that NIR can be used to predict N (on a ww and dw basis) and C (on a ww basis) contents with good accuracy (R² > 0.9) from recently harvested seaweed samples (without requiring any treatment of the sample) (Fig. 4a-c), and to roughly estimate DM and C (on a dw basis) contents (R² > 0.75) (Fig. 4d-e). To the authors' knowledge, this is the first full report on the efficacy of NIR to characterize the chemical composition of U. ohnoi. Previous works on brown and red algae yielded similar results to those reported in this study (Hay et al. 2010; Campbell et al. 2022; Tadmor et al. 2022). For instance, Hay et al. (2010) reported similar R² values for predicting C and N in Sargassum flavicans, and also observed a better fit of the NIR models for N than for C. More recently Campbell et al. (2022), working with 4 different brown seaweed species, also reported accurate models to predict the chemical composition, obtaining higher R² values than those obtained in our work, specifically for MF. The preparation of the samples and the wider range of variation for MF studied by Campbell et al. (2022) may be the reason for this difference. These two studies used freeze-dried and lyophilized samples to develop the models, thus avoiding the signal noise introduced by water in the NIR spectra.

Conclusions
As one of the main benefits of NIR technology is the rapid characterization of samples with minimal sample processing time, the present research aimed to evaluate the potential of NIR applied to both wet and dried samples of the macroalga Ulva ohnoi.
The effectiveness of NIR to predict dry matter, nitrogen, carbon and mineral fraction in Ulva ohnoi samples has been explored, either directly on wet (recently harvested) or on dried samples. Results show that NIR models to predict nitrogen and carbon on a wet weight basis directly on wet samples had similar robustness to those built on dried samples (R² > 0.9), while lower accuracies were obtained for MF and dry matter. Obtaining these parameters (carbon and nitrogen) in wet samples allows rapid determinations, avoiding the sample processing needed to obtain ground dry matter.
Rapid and low-cost phenotyping of the chemical composition of Ulva ohnoi is a crucial step towards the quality monitoring of wild and/or cultivated algae and the decision of optimal harvesting time.

Fig. 1 Chemical composition of Ulva ohnoi samples used in this study: Violin plots showing the distribution of the samples along the range of variation for each parameter on a dry and wet weight basis

Fig. 2 Bivariate correlations between parameters (only significant correlations are shown, p < 0.05). For each correlation, the Pearson coefficient is presented. The color scale indicates the correlation degree between parameters (blue, positive correlation; red, negative correlation)

Fig. 3 NIR reflectance spectra of seaweed samples: (a) raw spectra from wet and dried samples; (b) SNV preprocessed spectra from wet and dried samples. Y axis represents arbitrary units

Fig. 4 Reference-predicted scatterplots for the chemical variables estimated with NIR in the wet samples.In each graph, the calibration (black dots) and validation (grey dots) sets are shown, as well as the

Table 1
Calibration and prediction statistics obtained for the models developed with wet samples (ws) and dried samples (ds). All variables, as well as RMSEC/P, are expressed in %. SNV: standard normal variate; SG-1D: Savitzky-Golay first-order derivative; LVs: number of latent variables; N: number of samples; R²: coefficient of determination of calibration and prediction sets; RMSEC: root mean square error of calibration; SDref: standard deviation of reference values; RMSEP: root mean square error of prediction; SEP: standard error of prediction; RPD: ratio SD/SEP