Assessment of microalgae species, biomass, and distribution from spectral images using a convolution neural network

Effective monitoring of microalgae growth is crucial for environmental observation, while the applications of this monitoring could also be expanded to commercial and research-focused microalgae cultivation. Currently, the distinctive optical properties of different microalgae groups are targeted for monitoring. Since different microalgae can grow together, their spectral signals are mixed with ambient properties, making estimations of species biomasses a challenging task. In this study, we cultured five different microalgae and monitored their growth with a mobile spectral imager in three separate experiments. We trained and validated a one-dimensional convolution neural network by introducing absorbance spectra of the cultured microalgae and simulated pairwise mixtures of them. We then tested the model with samples of microalgae (monocultures and their pairwise mixtures) that were not part of the training or validation data. The convolution neural network classified microalgae accurately in the monocultures (test accuracy = 95%, SD = 4) and in the pairwise mixtures (test accuracy = 100%, SD = 0). Median prediction errors for biomasses were 17% (mean = 22%, SD = 18) for the monocultures and 17% (mean 24%, SD = 28) for the pairwise mixtures. As the spectral camera produced spatial information of the imaged target, we also demonstrated here the spatial distribution of microalgae biomass by applying the model across 5 × 5 pixel areas of the spectral images. The results of this study encourage the application of a one-dimensional convolution neural network to solve classification, regression, and distribution problems related to microalgae observation, simultaneously.


Introduction
Microalgae are a cornerstone of the global carbon cycle (Thornton 2012), yet they may also form harmful algal blooms in natural waters (Sellner et al. 2003). In addition to their ecological importance, microalgae are the subject of active research in the production of bio-based compounds, food, clean water, and energy (Devadas et al. 2021; Kusmayadi et al. 2021; Yadav et al. 2021). For these reasons, accurate assessment of microalgae growth is of interest for microalgae biotechnology as well as for environmental monitoring. Microalgae are appealing targets for observation by optical techniques due to their inherent optical features, especially their photosynthetic pigments. Spectroscopic methods are used in remote sensing and also in many laboratory applications to monitor microalgae (e.g. Havlik et al. 2013, 2022; Murphy et al. 2014).
In addition to spectroscopy-based approaches, other high-throughput methods have also been developed to assess and monitor microalgae. Traditionally, carotenoid pigment composition of microalgae together with Chemtax software (Mackey et al. 1996) and more recently fatty acids composition together with Bayesian modelling (Strandberg et al. 2015) have been used to identify different microalgae at the class level. Moreover, molecular biology tools, such as primer-independent metatranscriptomic analysis, have been used to identify microalgae from freshwaters (Vuorio et al. 2020). However, biochemical methods require time and expensive equipment, meaning their availabilities are limited. Instead, spectral imagers could offer cost-efficient in situ monitoring in a variety of volumetric scales.
As microalgae typically grow in suspension, their optical signals are mixed with those of the growth environment and possibly other microalgae and other organisms such as bacteria and protozoa. The sizes and structures of the microalgae cells and population density also affect the propagation of light (Bricaud et al. 1988; Bernard et al. 2009; Fujiwara et al. 2011). Thus, the key challenges in the use of spectroscopic monitoring for microalgae include the identification and quantification of different microalgae when one observation may contain signals from several targets. This is important when using point sensors, such as spectroradiometers or spectrophotometers, or spectral imagers. A spectral imager produces a stack of images taken on several wavebands. In the case of a hyperspectral imager, images are produced on more than 100 wavebands. As opposed to point sensors, a spectral imager also yields spatial information of the imaged target.
Different approaches to unmix the mixed algal signals have been used in previous studies. For example, Hunter et al. (2008) experimented with pseudo-communities consisting of different microalgae colour groups. They used derivative transformations of the reflectance spectra recorded with a spectroradiometer to accentuate the differences in spectral signal caused by the different colour groups. They also used spectral reflectance and derivative reflectance indices to resolve pigment concentrations. Mehrubeoglu et al. (2014) used non-negative linear least squares (NNLS) for unmixing of absorbance spectra to solve proportions of microalgae from pairwise mixtures imaged with a spectral imager. The model was able to solve percentages of the original samples in the mixtures, but algal biomass was not directly determined. As emphasized in previous studies (Bricaud et al. 2007; Mehrubeoglu et al. 2014), the unmixing of spectral signals of microalgae can be a complex task because the properties of microalgae cells (their structures, sizes, and shapes) and the concentration and arrangement of their pigments introduce nonlinearity to the propagation of light in the microalgae suspension. However, in addition to physics-based modelling of optical properties of microalgae, machine learning models that solve non-linear problems with high efficiency have recently attracted researchers' attention.
Artificial neural networks are machine learning algorithms that predict parameters from a given input based on the data that they have been trained with. Convolution neural networks (CNN) are a versatile tool due to the algorithm's capability to extract features from the training data. The principle of CNN is that a given dataset is treated with convolution kernels, that is, with filters that remove irrelevant features. The convoluted data then act as the input to a neural network. Conceptually, a neural network consists of nodes organized in layers, and outputs computed in the nodes of previous layer propagate to the next layer, and finally to the output of the model. Pant et al. (2020), Yadav et al. (2020), and Otálora et al. (2021) reported promising results from the identification of microalgae species from photomicrographs using a CNN. Bricaud et al. (2007) used a multilayer perceptron (MLP), that is, a neural network without the convolution treatment, to resolve chlorophyll concentrations and microalgae size classes from satellite images. Medina et al. (2017) compared CNN and a MLP classifier in the detection of algae in videoframes recorded for underwater pipeline inspection. They found that a CNN performed better than a MLP in the task, although both methods yielded high (> 95%) test classification accuracies. A CNN is generally a popular model to apply on image data, but it can also be used for spectral data, in which case a one-dimensional convolution neural network (1D CNN) is used. In addition to classification as tested in the previous studies, CNN could also be used to solve regression problems, such as microalgae biomasses from absorbance spectra. Recently, Maier et al. (2021) applied a 1D CNN to resolve chlorophyll concentration in satellite images.
In this study, we tested a 1D CNN to assess the biomass of microalgae belonging to different colour groups (bluegreen, i.e. cyanobacteria, brown, and green algae). We tested unmixing of the species from samples of monocultures and pairwise mixtures imaged with a hyperspectral imager. We hypothesized that the trained and validated 1D CNN resolves microalgae species and biomass from monocultures and pairwise mixtures with good test accuracies and low test errors. We also aimed to demonstrate the spatial distribution of biomass by applying the 1D CNN across a spectral image.

Algae cultures
This study consisted of three separate culturings of microalgae purchased from the Culture Collection of Algae at the University of Cologne (CCAC), Germany. In the first two culturings (hereafter Culturing I and Culturing II), the following strains were used: CCAC 3504 B Microcystis sp., CCAC 2944 B Synechococcus sp., CCAC 0064 Cryptomonas ovata, CCAC 0102 B Peridinium cinctum, and CCAC 3524 B Desmodesmus maximus. In the third separate culturing (Culturing III), the same cyanobacteria and green algae strains were used (CCAC 3504 B Microcystis sp., CCAC 2944 B Synechococcus sp., and CCAC 3524 B Desmodesmus maximus), but the Cryptomonas and Peridinium strains were not included.
In culturings I and II, three replicate cultures of each alga were grown in 250-mL cell culture flasks using Waris-H medium (McFadden and Melkonian 1986), a 14:10 light:dark cycle and fluorescent lamps providing 70-104 µmol photons m⁻² s⁻¹ irradiance, measured by a quantum sensor (HiPoint, Taiwan) at the level of the culture flask caps. The datasets of spectral images of monocultures generated from culturings I and II were also used in our previous study (Salmi et al. 2021), which describes a vegetation index-based arrangement to monitor the growth of microalgae during their exponential phase. The duration of culturings I and II was 35 to 49 days.
In culturing III, algae were cultured in 250-mL cell culture flasks with a modified WC medium (Guillard and Lorenzen 1972, Online Resource 1). The culture medium was prepared with a phosphate concentration of either 20 µg L⁻¹ or 80 µg L⁻¹ to induce variation in growth through phosphorus availability. Culturing III was performed in cell culturing cabins at 18 °C and 23 °C with a 12:12 light:dark cycle and fluorescent lamps providing 91-132 µmol photons m⁻² s⁻¹ irradiance, measured by a quantum sensor (HiPoint, Taiwan) at the level of the bottoms of the culture flasks. Three replicates of each alga were cultured at each temperature and phosphate concentration. The duration of culturing III was 18 days.

Spectral imaging and biomass assessment
Algae growth in culturings I and II was monitored with a spectral imager by sampling each replicate at least once a week prior to the stationary phase of their growth. In culturing III, one replicate of each phosphate treatment in both temperatures was sampled for spectral imaging at least once before the stationary phase of the growth. In culturing III, the growth of each replicate was also monitored with a flow cytometer (Guava easyCyte, Merck Millipore, USA) two times a week to observe when the cultures reached the stationary phase.
For the spectral imaging, a sample (volume 2 mL) was pipetted on a 24-well plate and imaged in transmitted light with a Specim IQ spectral imager (Specim, Finland). The well plate was placed on a diffusor (Dolan-Jenner, USA) illuminated by a broadband halogen lamp (Fiber-Lite, DC-950, Dolan-Jenner, Boxborough, MA, USA). The illuminated diffusor plate was used as a white reference. Specim IQ used its internal dark reference in producing transmittance images. Specim IQ records 204 wavebands between 400 and 1000 nm with 7 nm FWHM. However, only 150 wavebands between 420 and 850 nm were included in this study because the edges of the recorded waveband range contained more noise. The imaging arrangement was described in more detail in Salmi et al. (2021). In this study, transmittance (T) images were converted to absorbance (A) images according to Eq. (1):
$$A = -\log T \qquad (1)$$
After imaging the monocultures, mixed algal communities were formed by mixing two different strains belonging to the same or different microalgae groups. The total volume of a mixed sample was 2 mL, and the mixtures were imaged on the 24-well plates similarly to the monocultures.
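The conversion in Eq. (1) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the base-10 logarithm, the function name `transmittance_to_absorbance`, and the small clipping constant `eps` are assumptions added here (the source writes only "log").

```python
import numpy as np

def transmittance_to_absorbance(t, eps=1e-6):
    """Convert transmittance to absorbance, A = -log10(T).

    Assumptions (not from the source): base-10 logarithm, and clipping
    at `eps` so that zero-valued pixels do not produce infinities.
    """
    t = np.clip(np.asarray(t, dtype=float), eps, None)
    return -np.log10(t)

# Synthetic stand-in for a transmittance cube: rows x columns x wavebands
cube_t = np.full((4, 4, 150), 0.5)
cube_a = transmittance_to_absorbance(cube_t)
print(cube_a.shape, round(float(cube_a[0, 0, 0]), 4))
```

The conversion is applied element-wise, so the same function works for a single spectrum or a whole image cube.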
Immediately after imaging, biomass of the monocultures was assessed from the samples that were left after the formation of the mixtures. Biomasses were assessed using an electronic cell counter (Casy, Omni Life Sciences, Germany) that yields cell biovolumes based on pulse areal modulation that a passing particle causes to a detector. Biovolumes were converted to wet biomasses by assuming the cells to be isopycnal with water. The expected biomass of each strain in a pairwise mixture was calculated by multiplying the biomass measured from the original monoculture sample by the fraction at which that monoculture was mixed into the sample.

Training and validation data augmentation
A region of interest (ROI) was extracted from the spectral image of each sample by cropping a 50 × 50 pixel area of the sample well. The ground truth was one biomass value for each sample well. To efficiently train and validate the machine learning models, the ROIs were first split into smaller 10 × 10 pixel subsamples. This yielded 25 subsample images from each 50 × 50 pixel ROI. The mean spectra of these subsamples were used to augment the dataset further by simulating combinations of microalgae spectra in mixtures. Before the simulation, mean spectra that had 751 nm/676 nm < 1.04 were omitted because they were too dilute for the camera system to detect. The NIR/Red index was chosen here because it turned out to be a good biomass estimator for different species in our earlier study with the same camera system (Salmi et al. 2021) as well as in other studies (Xue and Su 2017). Figure 1 shows the number of original spectral images of each alga that were used for subsampling and data augmentation after the too dilute samples were omitted.
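The subsampling and the dilution filter described above can be sketched as follows. The function names, the evenly spaced `wavelengths` grid, and the nearest-band lookup are illustrative assumptions. The source also does not state whether the 751 nm/676 nm ratio is computed on transmittance or absorbance values, so the mask function simply takes whichever representation the index is defined on.

```python
import numpy as np

def subsample_mean_spectra(roi, block=10):
    """Split an (H, W, B) ROI into block x block tiles and return the
    mean spectrum of each tile as an (n_tiles, B) array."""
    h, w, b = roi.shape
    if h % block or w % block:
        raise ValueError("ROI size must be a multiple of the block size")
    tiles = roi.reshape(h // block, block, w // block, block, b)
    return tiles.mean(axis=(1, 3)).reshape(-1, b)

def detectable_mask(spectra, wavelengths, nir=751.0, red=676.0, threshold=1.04):
    """Boolean mask of spectra whose NIR/Red ratio reaches the threshold.
    The bands nearest to `nir` and `red` are used (an assumption)."""
    i_nir = int(np.argmin(np.abs(wavelengths - nir)))
    i_red = int(np.argmin(np.abs(wavelengths - red)))
    return spectra[:, i_nir] / spectra[:, i_red] >= threshold

# Synthetic 50 x 50 pixel ROI with 150 wavebands (values are arbitrary)
wavelengths = np.linspace(420.0, 850.0, 150)   # hypothetical band centres
rng = np.random.default_rng(0)
roi = rng.uniform(0.2, 0.9, size=(50, 50, 150))
mean_spectra = subsample_mean_spectra(roi)      # 25 mean spectra
kept = mean_spectra[detectable_mask(mean_spectra, wavelengths)]
print(mean_spectra.shape, kept.shape[1])
```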
Two absorbance spectra $x_i$ and $x_j$ at a time were randomly selected from the dataset of mean spectra of 2,564 subsamples to form 100,000 simulated microalgae mixtures (Fig. 1). As a result of the simulation, we had a set $X_{\mathrm{sim}}$ of linear mixtures of augmented spectra (Eq. 2), where $i, j \in [0, 2563]$ and $x_{\mathrm{sim}} \subset X_{\mathrm{sim}}$. The corresponding simulated biomass vector $m_{\mathrm{sim}} \in \mathbb{R}^{6}$ for all species and media was treated similarly (Eq. 3) to form the simulated ground truths $M_{\mathrm{sim}}$ for the simulated mixtures. Of the 2,564 subsamples, 600 contained only the culture medium from different imaging days. This meant that some of the simulated mixtures consisted of culture medium.
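A simulation of this kind might look like the sketch below. The exact form of the linear mixture is not recoverable from the text here, so the convex combination with a random weight `a` (the spectrum and the biomass vector sharing the same weight) is an assumption made for illustration, as are the function name and the synthetic inputs.

```python
import numpy as np

def simulate_mixtures(spectra, biomasses, n_mix, seed=0):
    """Draw random pairs (i, j) and mix spectra and biomass vectors linearly.

    Assumption (not from the source): x_sim = a*x_i + (1 - a)*x_j with
    a ~ U(0, 1), and the biomass vectors mixed with the same weight.
    """
    rng = np.random.default_rng(seed)
    n = spectra.shape[0]
    i = rng.integers(0, n, size=n_mix)
    j = rng.integers(0, n, size=n_mix)
    a = rng.uniform(0.0, 1.0, size=(n_mix, 1))
    x_sim = a * spectra[i] + (1.0 - a) * spectra[j]
    m_sim = a * biomasses[i] + (1.0 - a) * biomasses[j]
    return x_sim, m_sim

# Synthetic stand-ins: 20 mean spectra with 150 bands, 6-element biomass vectors
rng = np.random.default_rng(1)
spectra = rng.uniform(0.0, 1.0, size=(20, 150))
biomasses = rng.uniform(0.0, 0.3, size=(20, 6))
x_sim, m_sim = simulate_mixtures(spectra, biomasses, n_mix=1000)
print(x_sim.shape, m_sim.shape)
```

Because the same weights are applied to spectra and labels, each simulated spectrum keeps a consistent ground truth, which is what allows the mixtures to be used for supervised training.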

Training and validation
All the data management and modelling were done with Python using Jupyter notebooks, Keras library, and Tensorflow backend. Computing was done using a Nvidia Tesla V100-SXM2 16 GB GPU unit. For training and validation of the 1D CNN, the augmented dataset X sim was divided so that 80% was used for training and 20% for model validation. Different topologies for the model were tested: two or three convolution layers were tested, with the first layer containing 128 convolution filters, the second 64 or 32, and the third 32 (Table 1).
The convolution kernel size was 3, and after each convolution layer, a maximum pooling layer with pool size 2 was applied. The neural network had one or two dense layers, in addition to the output layer, which had 6 nodes. The dense layers had either 256, 128 or 64 nodes as presented in Table 1. Before the output layer, a dropout layer with a rate of 0.5 was applied to prevent overfitting of the model (Bengio et al. 2017). The activation function for each convolution and dense layer was a rectified linear unit (ReLU), and the model optimiser was a gradient-based stochastic optimiser (Adam) with a learning rate of 0.001, $\beta_1 = 0.9$, $\beta_2 = 0.999$ and $\epsilon = 10^{-7}$. Models were trained for 100 epochs using a sample batch size of 512. Training was performed against the biomass set $M_{\mathrm{sim}}$. Activation of the output layer was also ReLU because we wanted to form a regression model from spectra to species-wise biomasses. Loss was calculated as mean squared error. The performance of different model topologies (Table 1) was evaluated by calculating the root mean squared error between the expected and predicted biomasses.
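To make the layer arithmetic concrete, the following pure-Python sketch traces how the spectrum length and the trainable parameter count evolve through one such topology. It is not the training code: the chosen variant (two convolution layers with 128 and 64 filters, one 256-node dense layer) is only one of the topologies described, and the 'valid' padding with stride 1 (the Keras `Conv1D` default) is an assumption, since the padding mode is not stated.

```python
def conv1d_out(length, kernel=3):
    # Output length of a stride-1 convolution with 'valid' padding (assumed)
    return length - kernel + 1

def pool_out(length, pool=2):
    # Output length of non-overlapping max pooling (remainder is dropped)
    return length // pool

def trace_topology(n_bands=150, filters=(128, 64), dense=(256,), n_out=6, kernel=3):
    """Return (flattened feature count, trainable parameter count)."""
    length, channels, params = n_bands, 1, 0
    for f in filters:
        params += (kernel * channels + 1) * f   # weights + one bias per filter
        length = pool_out(conv1d_out(length, kernel))
        channels = f
    flat = length * channels
    width = flat
    for d in dense + (n_out,):                  # dropout adds no parameters
        params += (width + 1) * d
        width = d
    return flat, params

flat, params = trace_topology()
print(flat, params)
```

Under these assumptions the 150-band input shrinks to 148, 74, 72 and then 36 positions, and most of the parameters sit in the first dense layer, which may help explain why the dense layer width had a visible effect on the losses.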
The lowest training and validation losses and root mean squared errors were observed in models 2, 4, and 5 (Table 1), which had two convolution layers with 128 and 32, or 128 and 64, convolution filters. These topologies had 256 nodes in the first dense layer (Table 1). Models with 128 nodes in the first dense layer (models 1 and 3) had higher validation and training losses and root mean squared errors than did models 2, 4, and 5 (Table 1). Models with three convolution layers (6 and 7) had higher losses and root mean squared errors than did models 2, 4, and 5 with two convolution layers.

Testing
Test data consisted of samples of monocultures (see Online Resource 2 for the absorbance spectra) that were not included in the training and validation data and their pairwise mixtures. These samples were not subsampled; instead, mean absorbance spectra from the 50 × 50 pixel ROIs were used for testing. These samples were from the later phase of exponential growth, when the microalgae biomasses were higher, so that mixing them would not dilute them below detection limit. Altogether 26 samples of monocultures and 13 pairwise mixtures were used for tests (Fig. 2). We also applied the 1D CNN for larger images of the samples to visualize the distribution of microalgae biomasses. These images contained also structures and edges of the well plates. The spatial visualization was done by applying the 1D CNN to 5 × 5 pixel regions of an image.
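The spatial visualization step can be sketched as block-averaging followed by a batch prediction. Here `predict` is a hypothetical callable that maps an array of spectra with shape (n, bands) to biomass predictions with shape (n, 6); the stand-in used below is not a trained model and only demonstrates the data flow.

```python
import numpy as np

def biomass_map(cube, predict, block=5):
    """Apply `predict` to the mean spectrum of every block x block region.

    cube: absorbance image, shape (H, W, B). Rows and columns that do not
    fill a complete block are dropped (a simplification assumed here).
    Returns an array of shape (H // block, W // block, n_outputs).
    """
    h, w, b = cube.shape
    h2, w2 = h // block, w // block
    tiles = cube[:h2 * block, :w2 * block].reshape(h2, block, w2, block, b)
    means = tiles.mean(axis=(1, 3)).reshape(-1, b)
    out = np.asarray(predict(means))
    return out.reshape(h2, w2, -1)

def dummy_predict(spectra):
    # Stand-in for a trained model: repeats the mean absorbance six times
    return np.repeat(spectra.mean(axis=1, keepdims=True), 6, axis=1)

cube = np.full((52, 50, 150), 0.2)
bmap = biomass_map(cube, dummy_predict)
print(bmap.shape)
```

With a trained Keras model, one would presumably pass a wrapper around `model.predict` that adds the channel axis the 1D CNN expects; that wrapper depends on how the model input was defined and is not shown in the source.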
Test classification accuracy was calculated to test the performance of each model topology in resolving the microalgae species. It was calculated as the percentage of correctly assigned algal labels out of the total number of labels. Root mean squared error was calculated to evaluate the capability of each model in predicting the microalgae biomass. Percentage prediction errors were calculated to facilitate interpretability and comparability of the modelling results. The prediction errors were calculated from $B_{\mathrm{exp}}$, the expected biomass, and $B_{\mathrm{pred}}$, the biomass predicted by the model. All the tested model topologies yielded high test classification accuracies for monocultures (≥ 95%, Table 2). Models 1, 3, 4, and 5 yielded 100% test classification accuracies for pairwise mixtures. The root mean squared errors were rather similar for models 1, 3, 4, and 5 when tested for monocultures (0.026-0.028 g L⁻¹, SD = 0.001-0.004) and pairwise mixtures (0.021-0.022 g L⁻¹, SD = 0.001-0.004). Generally, the differences in the test classification accuracies and root mean squared errors were small between the topologies. However, increasing the number of convolution and dense layers deteriorated rather than improved these metrics, and the most complex tested model (Model 8) had the lowest test classification accuracy for mixtures (87%, SD = 12, Table 1).
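The two biomass metrics can be illustrated with a short sketch. The root mean squared error follows its standard definition. The percentage prediction error is assumed here to be the absolute difference relative to the expected biomass, computed only where the expected biomass is above zero; this form is a guess consistent with the text, not a formula taken from it, and the numbers below are made up.

```python
import math
import statistics

def rmse(expected, predicted):
    """Root mean squared error between two equally long sequences."""
    squared = [(e - p) ** 2 for e, p in zip(expected, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def percent_errors(expected, predicted):
    """Assumed form: |B_exp - B_pred| / B_exp * 100, for B_exp > 0 only."""
    return [abs(e - p) / e * 100.0
            for e, p in zip(expected, predicted) if e > 0]

# Made-up biomasses in g/L, for illustration only
b_exp = [0.00, 0.10, 0.20, 0.40]
b_pred = [0.01, 0.12, 0.15, 0.44]
errors = percent_errors(b_exp, b_pred)
print(round(rmse(b_exp, b_pred), 4), round(statistics.median(errors), 1))
```

Reporting the median alongside the mean, as the study does, limits the influence of a few low-biomass samples whose relative errors can be large.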
The selection of the model for presenting the results below was based on both training and validation metrics (Table 1) and test metrics (Table 2). The complexity of model topology increased from model 1 to model 7 (Tables 1-2). The simpler models (models 1 and 2) performed slightly better than the others in the tests but produced higher errors during training and validation. Model 5 produced the best training and validation metrics (Table 1); however, when the model was made more complex than the topology of model 4, the test accuracy did not improve and the error did not decrease (Table 2). Therefore, because model 4 is simpler than model 5, we chose model 4 to present the results below. Figure 3 shows the construction of model 4. The model 4 summary from the Keras library is shown in the supplementary material (Online Resource 3).

Results
The mean classification accuracy for the three separately trained replicates of model 4 was high for both test monocultures (test accuracy = 95%, SD = 4) and pairwise mixtures (test accuracy = 100%, SD = 0). Root mean squared error for the topology was 0.026 g L⁻¹ (SD = 0.001) for monocultures and 0.022 g L⁻¹ (SD = 0.002) for pairwise mixtures. For monocultures, the weighted average of classification precision was 0.95 (SD = 0.04), sensitivity was 0.95 (SD = 0.04), specificity was 0.98 (SD = 0.02), and F1 score was 0.95 (SD = 0.04). For microalgae mixtures, precision, sensitivity, specificity, and F1 score were all 1.00 (SD = 0.00). Confusion matrices (Online Resource 4), species-wise classification metrics (Online Resource 5) and ROC curves (Online Resource 6) are given in the supplementary material. As the differences between the metrics of the three separately trained replicates of model 4 were small, we chose replicate a of model 4 to display the results below (see Data availability to access the models). The correlation between expected and predicted biomasses was high for monocultures (r = 0.97, p < 0.001, Fig. 4A) and for pairwise mixtures (r = 0.96, p < 0.001, Fig. 4B). These correlations also include the samples where an alga's expected biomass was zero. Two false-positive observations with low biomasses were found in the test with monocultures (Microcystis and Synechococcus, Fig. 4A). When only expected biomasses above zero were included, the correlations between expected and predicted biomasses were still good for monocultures (r = 0.89, p < 0.001) and for pairwise mixtures (r = 0.89, p < 0.001).
Fig. 3 Visualization of model 4 (Tables 1-2) that was used to produce the results presented below
The percentage prediction errors were calculated for microalgae whose biomasses were greater than 0. Median prediction error for the biomasses in the monocultures was 17% (mean = 22%, SD = 18). Median prediction error for the biomasses in the pairwise mixtures was 17% (mean = 24%, SD = 28).
Spatial mapping visualized the distribution of biomass (Fig. 5). However, applying the model to each pixel of the spectral image resulted in a notable number of false positives (data not shown). Figure 5 shows the model applied to mean absorbances of 5 × 5 pixel areas of the spectral images. The edges of the wells of the well plate show some occasional false positives (Fig. 5).
The average biomasses calculated from the biomass maps of the pairwise mixtures correlated well with the expected, electronic cell counter-based biomass assessments (r = 0.89, p < 0.001). Here, a 10 × 10 pixel ROI in the distribution map corresponded to the 50 × 50 pixel ROI of the spectral images. Correlation between the predictions based on the mean absorbance spectrum and the predictions based on mean biomasses calculated from distribution maps was high (r = 0.98, p < 0.001). Median prediction error for biomasses in the pairwise mixtures calculated from the distribution maps (21%, mean = 23%, SD = 22, Table 3) was on the same level as the prediction error calculated from the predictions based on mean absorbance spectra in the samples (Table 3). These comparisons indicate that the 1D CNN can resolve species composition and biomasses, both from mean spectra of the sample and from smaller areas visualized as a distribution map, with reasonable variation.

Discussion
The convolution neural network tested here performed well, as expected. In previous studies, convolution neural networks have been used successfully to classify algae species from photomicrographs, based on their morphological traits. Pant et al. (2020) reported 98.45% classification accuracy for classification of seven species belonging to the Pediastrum group. Similarly, Yadav et al. (2020) achieved high classification accuracy (99.97%) for sixteen different microalgae genera. Otálora et al. (2021) trained a convolution neural network to classify microalgae from photomicrographs taken by FlowCAM from a moving liquid. They found that the model predicted the proportions of Chlorella vulgaris and Scenedesmus almeriensis with high correlation to the expected proportions (R² > 0.99). In this study, we used a convolution neural network as a multidimensional regression that simultaneously classified the species and estimated their biomasses based on their absorbance spectra.
The prediction errors of biomasses of this study were in the same range as the errors reported by Bricaud et al. (2007), who tested a multi-layer perceptron neural network to retrieve microalgae pigment concentrations from absorbance spectra. They observed approximately 17% error for chlorophyll a and 27 to 51% error for other pigments. Murphy et al. (2014) developed an RGB camera-based observation system and reported 22% and 14% prediction error for Chlorella sp. and Anabaena variabilis biomasses in monocultures, respectively. Broadly, the mean prediction errors of this study (22% and 24% for monocultures and mixtures) correspond to those reported in the previous studies. The root mean squared error observed in this study was lower than the lowest tested biomasses, and the classification of microalgae species was successful.
The best model topology needs to be determined case by case for each application. In this study, we observed that adding more than two convolution layers might have deteriorated the test classification accuracy (Table 2). Adding more than two dense layers did not improve the classification accuracy or decrease the root mean squared error either (Table 2). As reported by Medina et al. (2017) in their study of microalgae detection on videoframes, one of the advantages of a convolution neural network over a multilayer perceptron is that the CNN performs feature extraction by itself. We likely had redundant wavebands, and removing them would have made training faster. However, as the topologies that we tested were trained quickly using the GPU (1 to 2 min per 100 epochs), we did not optimize the number of wavebands. Additionally, the training time itself is of little practical concern, as the actual prediction by the trained model happens practically in real time.
In this study, the lowest tested biomasses in the pairwise mixtures were 0.05 to 0.06 g L⁻¹ (Table 3), which represent low biomasses for microalgae cultures. The biomass ratios in the pairwise mixtures of this study varied between 0.3 and 1.0. In their RGB image-based monitoring system, Murphy et al. (2014) could detect a cyanobacteria contamination in a green algae culture when their biomass ratio was 0.08, green algae biomass being 0.16 g L⁻¹. In future studies, the spectral camera system and the 1D CNN could also be used to test the limit of detection for contamination because they possibly allow lower detection limits for contamination. In this study, the neural network was capable of distinguishing between different colour groups (cyanobacteria, brown, and green algae) and even between the different species among the colour groups. The cell sizes of microalgae in this study varied from pico-sized Synechococcus to nano-sized Microcystis and to micro-sized Cryptomonas, Peridinium, and Desmodesmus. The package effect, which means that larger but sparser cells in suspension transmit more light compared to smaller but more abundant cells even if their biomasses were equal, affects the absorbance-based biomass estimates of microalgae (Bricaud et al. 1988). As the 1D CNN learns the features of each cell type, the algorithm yields good predictions for (wet) biomasses despite this variation. As the next step, more microalgae species could be incorporated into the model. Maier et al. (2021) demonstrated in resolving chlorophyll a concentration from satellite images that the 1D CNN can be successfully trained with simulated data. In our study, the training and validation included the simulated spectra and the original subsampled images.
Table 3 Tested pairwise mixtures of microalgae, their biomass predicted by the 1D CNN from mean absorbance spectra, mean biomasses calculated from the distribution maps (Fig. 5)
This way, it was possible to get a large enough dataset for training the model but also to include real variation in the training data. Although the model was trained with mean absorbance spectra originating from subsamples of 10 × 10 pixel areas, it was successfully tested with absorbance spectra from the 50 × 50 pixel areas (Figs. 4A-B) and also with absorbance spectra from smaller, 5 × 5 pixel areas (Fig. 5). Maier et al. (2021) noted that the 1D CNN was insensitive to variation caused by illumination conditions. Similarly, our dataset contained variation from different imaging days, likely caused by the temperature of the halogen lamp included in the imaging arrangement.
The biovolume estimates by the electronic cell counter, and hence the biomass estimates derived from them, likely contain variation that contributes to the variation in predictability of the model. In future studies, the target of interest could also be pigment composition and concentrations, instead of wet biomasses, as in this study. This study deviated from previous studies where convolution neural networks have identified microalgae efficiently from photographs (Yadav et al. 2020; Otálora et al. 2021) in that we used the spectral data as an input for the model. However, as microalgae in laboratory cultures might have morphological features detectable by spectral imagers, including spatial data in the model in addition to the spectral domain to form a 3D CNN could be interesting and potentially useful.

Conclusions
The results of this study showed that the 1D CNN is a powerful algorithm to identify and quantify microalgae in monocultures and in mixed samples. Therefore, the combination of a spectral imager and convolution neural network could be an efficient monitoring approach for the growth of microalgae. The 1D CNN classified microalgae accurately on the species level, even if a sample contained two species belonging to the same colour group. Additionally, the errors of the biomass estimates were moderate (median 17% for both monocultures and pairwise mixtures). Applying the 1D CNN to spectral images visualized the distribution of microalgae biomass. The properties of the algorithm demonstrated in this study could be applied and tested broadly, for example in monitoring of biomass, contaminations, pigments, or other features of interest that could be detected by an imager.