Estimation of Chlamydomonas reinhardtii biomass concentration from chord length distribution data

A novel method to estimate the concentration of Chlamydomonas reinhardtii biomass was developed. The method employs the chord length distribution information gathered by means of a focused beam reflectance probe immersed in the culture sample and processes the data through a feedforward multilayer perceptron. The multilayer perceptron architecture was systematically optimised through the application of a simulated annealing algorithm. The method developed can predict the concentration of microalgae with acceptable accuracy and, with further development, it could be implemented online to monitor the aggregation status and biomass concentration of microalgal cultures.


Introduction
The production of microalgal biomass has received a great deal of attention in recent years. Several unresolved technical and economic issues still hinder the transfer of microalgae cultivation from laboratory photobioreactors to industrial scale. Among them, harvesting constitutes a major challenge. Harvesting methods that involve flocculation of microalgal biomass, namely dissolved air flotation, bio-flocculation, flocculation-assisted sedimentation and flocculation-assisted filtration, require accurate knowledge of both the degree of aggregation and the concentration of the culture in order to apply the optimal dosage of flocculant, especially in large-scale facilities. The gravimetric determination of biomass dry weight requires considerable time and is not suitable as a monitoring method for continuous harvesting installations. Conventional indirect monitoring techniques are mostly based on associating light scattering or attenuation properties of the medium with a given biomass dry weight through an offline calibration (Reardon et al. 2013). Among these techniques, optical density (OD) measurement is the most popular. This method, however, often yields inaccurate estimates of dry biomass for several reasons. First, the variable pigment content that microalgae show during their growth cycle and under different culture conditions may distort the OD measurement (Griffiths et al. 2011). Second, the medium itself may undergo changes in turbidity, which will affect the measurement. Finally, because the measured absorbance is a function of cell size, concentration and shape, OD measurements vary with the cultivation conditions (Chioccioli et al. 2014). While the changes in the quantity of pigments per cell and in the intrinsic optical properties can be overcome by choosing a wavelength lying away from the composite one of the culture, the effects of size, shape and concentration are hardly avoidable.
The literature identifies several techniques, namely 2D fluorometry, IR spectroscopy, multiparameter flow cytometry (FC), in situ microscopy and the focused beam reflectance measurement (FBRM) probe, as having the potential to become the basis of new online monitoring methods (Höpfner et al. 2010; Reardon et al. 2013).
The work presented in this paper focuses on the evaluation of an FBRM probe as a tool for devising a method to allow the online estimation of biomass concentration in microalgal cultures. The probe was developed to monitor crystallisation (e.g. Barrett and Glennon 1999; Li et al. 2014), although it has also been used to monitor flocculation processes (e.g. Blanco et al. 1996, 2002; Jarabo et al. 2013). The FBRM probe is a device that projects a laser beam moving in a circular path into an aqueous medium and registers the backscattered beam produced when the laser path crosses a particle. The number of crossing events is directly related to the density of particles in the medium, and the duration of each event yields the corresponding chord length of the particle. The chord length distribution (CLD) constitutes a representation of the actual particle length distribution (PLD) (Li and Wilkinson 2005).
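The relation between event duration and chord length described above can be illustrated with a minimal sketch. It assumes that the chord length is simply the product of the beam scan speed and the duration of the backscatter event; the scan speed of 2 m s−1 and the event durations are hypothetical values chosen for illustration, not settings reported in this work.

```python
# Minimal sketch: chord length from the duration of a backscatter event.
# ASSUMPTION: chord = scan speed x duration. The 2 m/s scan speed and the
# durations below are hypothetical, not values taken from this study.

def chord_length_um(duration_s, scan_speed_m_per_s=2.0):
    """Return the chord length in micrometres for one crossing event."""
    return duration_s * scan_speed_m_per_s * 1e6


def counts_per_second(n_events, interval_s):
    """Return the rate of crossing events detected in a sampling interval."""
    return n_events / interval_s


durations_s = [5e-6, 10e-6, 50e-6]  # hypothetical event durations
chords_um = [chord_length_um(d) for d in durations_s]
rate = counts_per_second(n_events=12000, interval_s=10.0)
```

Under these assumptions, longer events map to longer chords, while the event rate carries the information on particle density.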
Given that the FBRM device provides information on the size of the particles in the medium, the probe takes into account the aggregation of cells. It thus avoids the loss of accuracy that affects the characterisation of suspensions through optical techniques when the concentration of particles is so high that the overlapping effect cannot be compensated, or when such suspensions are made up of large particles.
Although the FBRM probe has occasionally been used in flocculation studies of microalgal biomass (Danquah et al. 2010) or to characterise microalgal culture size distributions (Uduman et al. 2011), at the time of writing this article the literature offers only two references in which the FBRM was employed to estimate biomass concentration, in both cases of plant cell suspensions (McDonald et al. 2001; Kieran et al. 2003). In these studies, starting suspensions of plant cells were diluted several times with the culture medium and characterised with the probe at each dilution point in order to obtain a correlation between the number of particles detected per second and the concentration of biomass. Those works therefore do not take into account the fact that a given biomass concentration can be measured in cultures having completely different distributions of aggregate sizes. Figure 1 represents three schematic microalgal cultures having the same biomass concentration at different aggregation states.
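The limitation of a purely count-based correlation can be made concrete with a toy calculation. The sketch below assumes idealised aggregates containing a fixed number of identical cells; the cell dry mass of 50 pg and the total cell number are hypothetical values used only for illustration.

```python
# Toy model: the same biomass at three aggregation states.
# ASSUMPTION: identical cells, aggregates of a fixed size; the cell dry mass
# (50 pg) and the total number of cells are hypothetical.

def culture_state(total_cells, cells_per_aggregate, cell_dry_mass_pg=50.0):
    """Return (number of particles, total dry biomass in pg)."""
    n_particles = total_cells // cells_per_aggregate
    biomass_pg = total_cells * cell_dry_mass_pg
    return n_particles, biomass_pg


# Isolated cells, small aggregates and large aggregates.
states = {n: culture_state(1_000_000, n) for n in (1, 10, 100)}
```

In this idealised case the particle count falls by two orders of magnitude between the first and the third state while the biomass is unchanged, which is why a particle count alone cannot determine the biomass concentration.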
The use of the FBRM probe to estimate dry biomass concentration presents several drawbacks. First, the size distribution data gathered from microalgal cultures are characterised by a very wide range of particle chord lengths and particle detection rates. Second, the geometry of the algal aggregates departs from the spherical shape. This effect has been observed to be even more significant when the number of cells in the aggregate becomes larger, as can be seen in the micrographs included in the "Results and discussion" section of this paper (Fig. 6). When aggregation takes place, the culture presents a highly heterogeneous distribution of particle sizes and shapes, and it is possible to find large aggregates of irregular shape, small aggregates and isolated cells of nearly spherical shape. Finally, the FBRM probe data only provide a transformed representation of real particle length distributions. Under these premises, it is not possible to find an analytical or semianalytical model to estimate algal biomass concentration from CLD data, making it necessary to apply artificial intelligence tools to approach the problem.
The authors chose to model the relationship between CLD data and dry biomass using artificial neural networks (ANN), in particular feedforward multilayer perceptrons, given their proven efficacy in solving problems related to sensing and spectra interpretation in the fields of Biotechnology (Strapasson et al. 2014) and Chemical Engineering (Curteanu and Cartwright 2011; Pirdashti et al. 2013; Ali et al. 2015). ANN have occasionally been employed to estimate the biomass concentration of yeasts in fermentation (e.g. Vaněk et al. 2004; Hocalar et al. 2011) based on the concentration or production rates of the chemical species involved in the process. Therefore, the use of ANN in the estimation of microalgal biomass and the use of ANN to process CLD data for estimating biomass concentration represent an important innovation in this field.
The present study investigates the possibility of translating CLD data of Chlamydomonas reinhardtii suspensions into a biomass concentration by means of artificial neural networks.

Materials and methods
A Chlamydomonas reinhardtii strain from CCAP (CCAP No. 11/32B) was used for this study. It was cultivated in TAP medium (Gorman and Levine 1965) in shake flasks (115 rpm, 23°C and 12 h of cool white light). After a biomass concentration of around 1 g L−1 was obtained, the cultures were transferred to 5.5-L photobioreactors operated at a temperature between 23 and 25°C with aeration (2 L min−1) and pH 7.5. The pH was controlled through the automatic supply of CO2 (0.2 L min−1). Light was provided by four cool white fluorescent lights in a 12-h cycle.

Flocculant
Chitosan from crab shells (Sigma-Aldrich) was used as a flocculant. The flocculant stock solution was prepared by dissolving chitosan in a solution of 1 % glacial acetic acid in ultrapure water subject to mechanical stirring at 400 rpm for 1 h. The solution was left to settle for 1 day before use.

Dry biomass determination
The determination of dry biomass was carried out in triplicate following the method described in Beckmann et al. (2009). Each 20-mL sample of medium was centrifuged at 8400 rpm for 11 min and washed twice with ultrapure water. The microalgal pellets obtained were transferred to previously dried aluminium dishes and allowed to dry at 101.5°C for 3 h. The samples were then placed in a desiccator for 45 min and weighed afterwards. The concentration of biomass was calculated as grams of dry biomass per litre of medium.
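The gravimetric calculation can be sketched as follows. The sketch assumes that the dry mass of the pellet is obtained as the difference between the mass of the dish with the dried pellet and the mass of the empty dried dish; the masses listed are hypothetical and serve only to show the arithmetic for a triplicate.

```python
from statistics import mean, stdev

SAMPLE_VOLUME_L = 0.020  # 20-mL sample, as stated in the text


def dry_biomass_g_per_l(dish_g, dish_plus_pellet_g, volume_l=SAMPLE_VOLUME_L):
    """Return the dry biomass concentration (g/L) of one replicate."""
    return (dish_plus_pellet_g - dish_g) / volume_l


# HYPOTHETICAL masses (g): (empty dried dish, dish with dried pellet).
replicates = [(1.2000, 1.2210), (1.1950, 1.2148), (1.2100, 1.2302)]
values = [dry_biomass_g_per_l(dish, total) for dish, total in replicates]
mean_conc = mean(values)    # value reported for the culture
spread = stdev(values)      # dispersion across the triplicate
```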

Characterisation of C. reinhardtii cultures through CLD measurement
An M500L FBRM (Mettler Toledo) was employed to gather data about the distribution of particle chord length in the microalgal cultures. The FBRM is capable of performing the real-time monitoring of the number of particles in a suspension and classifies them in terms of their chord length through a dedicated software system. The software recorded the number of events detected per second and classified them according to their length in one of 90 bins of length intervals organised logarithmically and ranging from 1 to 1000 μm. The cultures were analysed in 10-s sampling intervals over a period of 6 min, which represented 36 data points. In order to represent a wide variety of concentration and aggregation states, the samples were collected at different stages of microalgal growth within the reactor. Fifteen microalgal cultures having different biomass concentrations were considered.
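The binning scheme can be reproduced approximately with a short sketch. It assumes that "organised logarithmically" means 90 bins of equal width on a logarithmic axis between 1 and 1000 μm, and it takes the geometric mean of the edges as the midpoint of each bin; both choices are assumptions, since the exact convention of the instrument software is not given here.

```python
import math

N_BINS = 90
LOWER_UM, UPPER_UM = 1.0, 1000.0


def log_bin_edges(n_bins=N_BINS, lower=LOWER_UM, upper=UPPER_UM):
    """Return n_bins + 1 edges equally spaced on a logarithmic axis."""
    ratio = (upper / lower) ** (1.0 / n_bins)
    return [lower * ratio ** i for i in range(n_bins + 1)]


edges = log_bin_edges()
# ASSUMPTION: the bin midpoint is taken as the geometric mean of its edges.
midpoints = [math.sqrt(lo * hi) for lo, hi in zip(edges[:-1], edges[1:])]

# 10-s sampling intervals over 6 min give the 36 data points per sample.
n_measurements = (6 * 60) // 10
```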
In order to obtain a wide variety of aggregation states of the cells, each culture was partially flocculated using several doses of 1 % chitosan. Both the original cultures and the flocculated ones were characterised with the FBRM probe using samples of 200 mL. The samples were placed in 250-mL beakers and mechanically stirred at 200 rpm.
The set of data collected from each algal sample (flocculated and "raw") was exported to a spreadsheet application and converted to a tabular data file. Each data file contained the data corresponding to the biomass chord length distribution of the medium sampled, i.e. a collection of vectors of 90 elements each. Of the 36 data points gathered for each suspension, the first 8 and the last 8 were discarded, which left a set of 20 consecutive measuring points (vectors) for further processing. From this phase onwards, the data were processed with the mathematical software package Matlab R2014b. The 20 vectors of each medium sampled were extracted with this software, and each one was associated with its corresponding dry biomass weight. The final dataset consisted of a matrix of 1940 vectors of 90 elements (CLD data) and a vector of 1940 elements containing the biomass concentration data. The whole dataset was randomly sorted while preserving the correspondence between CLD and biomass concentration.
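The assembly of the dataset can be sketched in a few lines. The original processing was done in Matlab; the Python sketch below is only an illustration, and the function names, the synthetic CLD vectors and the biomass values are hypothetical. The number of samples (97) is not stated in the text and is inferred here from 1940 vectors at 20 vectors per sample.

```python
import random


def trim(measurements, skip_head=8, skip_tail=8):
    """Discard the first and last measuring points of one sample."""
    return measurements[skip_head:len(measurements) - skip_tail]


def build_dataset(samples, seed=0):
    """samples: list of (list of CLD vectors, dry biomass in g/L)."""
    pairs = [(cld, biomass)
             for measurements, biomass in samples
             for cld in trim(measurements)]
    # Shuffling the pairs keeps each CLD vector tied to its biomass value.
    random.Random(seed).shuffle(pairs)
    clds = [cld for cld, _ in pairs]
    targets = [biomass for _, biomass in pairs]
    return clds, targets


# SYNTHETIC data: 97 samples (inferred), 36 vectors of 90 elements each.
samples = [([[float(i)] * 90 for _ in range(36)], 0.1 * i) for i in range(97)]
clds, targets = build_dataset(samples)
```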
In order to find the optimal perceptron architecture to fit the chord length distribution data, a simulated annealing algorithm was employed. The algorithm creates a random network architecture and evaluates it. Then, a random modification of the architecture is carried out and the network is evaluated again. If the new network produces a better estimation of the biomass concentration, it is accepted. If the new network produces worse results, it can still be accepted with a probability given by the Boltzmann probability distribution (Eq. 1), which in turn depends on an analog temperature.
$$P(\mathrm{state}) = \exp\left(-\frac{\Delta E}{k\,T}\right) \qquad (1)$$
P(state) represents the probability of accepting a solution, ΔE is the difference between the error of the current network (energy of the state) and that of the previous one, k is a constant (in real systems, Boltzmann's constant) and T is the temperature. In the present work, the cooling schedule was determined by Eq. 2.
$$T_{i+1} = \alpha\, T_i \qquad (2)$$
T_i is the temperature of the current state, T_{i+1} represents the temperature of the subsequent state and α is a reduction factor, set to 0.99905 in this work.
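The acceptance rule and the cooling schedule can be illustrated with a minimal simulated annealing loop. This is a sketch under stated assumptions, not the implementation used in this work: the initial temperature of 100 and k = 1 are assumptions (an initial temperature of 100 is roughly consistent with the analog temperature of 0.0032 reported at iteration 10,895), the neighbour move is assumed to flip one random bit of a 27-bit string, and the energy is a toy stand-in (number of bits differing from a hidden target) for the test error of a trained network. Keeping track of the best state is added for convenience.

```python
import math
import random


def anneal(initial_state, energy, neighbour, t0=100.0, alpha=0.99905,
           k=1.0, n_iter=20000, seed=0):
    """Simulated annealing with Boltzmann acceptance and geometric cooling."""
    rng = random.Random(seed)
    current, e_current = initial_state, energy(initial_state)
    best, e_best = current, e_current
    t = t0  # ASSUMPTION: initial analog temperature
    for _ in range(n_iter):
        candidate = neighbour(current, rng)
        e_candidate = energy(candidate)
        delta_e = e_candidate - e_current
        # Better states are always accepted; worse ones with P = exp(-dE / kT).
        if delta_e <= 0 or rng.random() < math.exp(-delta_e / (k * t)):
            current, e_current = candidate, e_candidate
            if e_current < e_best:
                best, e_best = current, e_current
        t *= alpha  # cooling schedule T_{i+1} = alpha * T_i
    return best, e_best


def flip_one_bit(state, rng):
    """HYPOTHETICAL neighbour move: flip one randomly chosen bit."""
    i = rng.randrange(len(state))
    return state[:i] + (1 - state[i],) + state[i + 1:]


# TOY energy: Hamming distance to a hidden 27-bit target, standing in for
# the test-set error returned by the network evaluation function.
target = tuple(random.Random(1).randint(0, 1) for _ in range(27))
toy_energy = lambda s: sum(a != b for a, b in zip(s, target))

best_state, best_energy = anneal((0,) * 27, toy_energy, flip_one_bit)
```

At high temperature nearly every modification is accepted, so the search wanders widely; as the temperature decays, the loop behaves increasingly like a greedy search around the best region found.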
To evaluate each architecture, a Matlab function was implemented and called within the main programme. This function took as input parameters the characteristics of the network, the input data and the output data. The characteristics of the network were encoded as strings of 27 bits (Table 1).
The function constructed the network and then trained, validated and tested it. The dataset was divided into three subsets to perform each procedure, employing 70 % of the data for training (1358 points), 15 % for validation (291 points) and 15 % for testing (291 points). The function returned the trained network and the mean squared error of the estimations of biomass concentration corresponding solely to the test subset. Employing only the test set to evaluate the predictive ability of the network favours those networks with a good generalisation capacity, thus avoiding overfitting. All networks were trained using the scaled conjugate gradient backpropagation method. The training runs were carried out using the CUDA (Compute Unified Device Architecture) functionalities offered by Matlab to reduce the running time by employing the parallel computing capabilities of the workstation's GPU.
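The subset sizes and the error measure can be checked with a short sketch. The helper names are hypothetical, and the sketch assumes that the already shuffled dataset is simply cut into three consecutive blocks; how the original Matlab code assigned points to each subset is not stated.

```python
def split_sizes(n_points, train_frac=0.70, val_frac=0.15):
    """Return the sizes of the training, validation and test subsets."""
    n_train = round(n_points * train_frac)
    n_val = round(n_points * val_frac)
    return n_train, n_val, n_points - n_train - n_val


def split(items, sizes):
    """ASSUMPTION: consecutive blocks of an already shuffled dataset."""
    n_train, n_val, _ = sizes
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


def mean_squared_error(measured, estimated):
    """Return the MSE, in (g/L)^2 when both inputs are in g/L."""
    return sum((m - e) ** 2 for m, e in zip(measured, estimated)) / len(measured)


sizes = split_sizes(1940)
train_idx, val_idx, test_idx = split(list(range(1940)), sizes)
```

Only the points indexed by `test_idx` would enter the error used to score an architecture, so a network that merely memorises the training subset gains no advantage.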
As mentioned above, the value of the error and the probability value calculated according to Eq. 1 served as a criterion to accept or discard the model. The algorithm was left to carry out 20,000 iterations. Figure 2 depicts the algorithm implemented to select the best neural network architecture.

Results and discussion

Biomass dry weight and chord length distributions
Dry microalgal biomass concentration ranged between 0.2 and 2.7 g L−1.
To exemplify the results obtained through the FBRM probe, three chord length distributions gathered with the device are presented in Figs. 3, 4 and 5 (0.2, 1.0 and 2.7 g dry biomass L−1). In the figures, the vertical axis represents the number of particles per second detected by the equipment and the horizontal axis the midpoint of the chord length intervals. The depth axis corresponds to each of the 20 distributions taken in the course of the sampling time. Figure 6 shows three different states (in increasingly lighter hues of grey) produced with the same C. reinhardtii culture of 1.0 g L−1 dry biomass. The cluster of lines in the nearest position corresponds to the raw culture, and the next two were obtained with the addition of 2 and 4 ppm of 1 % chitosan, respectively. It can be observed that the lines of the second and third clusters reach higher values of counts per second and that the third cluster is slightly shifted towards higher values of chord length. The increased number of particles detected in the second and third measurements can be ascribed to the fact that the flocculation process aggregates some cells that fall below the lower detection threshold of the equipment into flocs that become visible to the probe. The shift towards larger values of chord length is due to the formation of larger particles associated with a higher dosage of flocculant. Figure 7 is a photomicrograph of a flocculated culture sample. Both flocs and isolated microalgal cells are visible. Isolated cells present an approximately spherical shape while flocs are irregular.

Perceptron selection
Figure 8 shows the evolution of the mean squared error (MSE) of the multilayer perceptron during the selection process, together with the analog temperature, over the 20,000 iteration cycles. The graph shows that at the early stages of iteration the MSE oscillates widely. At iteration cycle 10,895 (analog temperature value of 0.0032), the error reaches a plateau that corresponds to an MSE of 0.03764 (g L−1)². At iteration 18,215, the error reaches its minimum, 0.0309 (g L−1)². The network having this minimum error presents the features listed in Table 2. Figure 9 shows the correlation between the estimated dry biomass concentration and the measured one for the training subset. Figure 10 shows the correlation between the estimated dry biomass concentration and the measured one for the test subset.
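To put the minimum MSE in the units of the measurement, its square root can be compared with the range of concentrations studied. The sketch below does this arithmetic and also shows one way of computing a correlation statistic between measured and estimated values. The squared Pearson coefficient is used here as an assumption, since the exact definition behind the reported R² is not stated, and the measured and estimated values in the example are hypothetical.

```python
import math


def squared_pearson(measured, estimated):
    """Return the squared Pearson correlation between two series."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_e = sum(estimated) / n
    cov = sum((m - mean_m) * (e - mean_e) for m, e in zip(measured, estimated))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov ** 2 / (var_m * var_e)


# Root of the minimum MSE reported, in g/L, and its size relative to the
# 0.2 to 2.7 g/L range of dry biomass concentrations.
rmse = math.sqrt(0.0309)
relative_to_range = rmse / (2.7 - 0.2)

# HYPOTHETICAL measured and estimated concentrations (g/L).
r2_example = squared_pearson([0.2, 1.0, 2.7], [0.3, 0.9, 2.6])
```

The root of the minimum MSE is about 0.18 g L−1, that is, roughly 7 % of the concentration range covered by the cultures.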

Conclusions
A method to estimate the dry biomass concentration of C. reinhardtii cultures based on their aggregation state and particle density was implemented. The development of the method involved selecting an adequate multilayer perceptron architecture to translate chord length distribution data gathered through an FBRM probe into microalgal biomass concentration. The optimal architecture of the perceptron was found by applying a systematic selection of the model parameters based on a simulated annealing algorithm that favours the generalisation capacity of the model. The artificial neural network selected was capable of producing very good estimations of the dry biomass concentration of C. reinhardtii cultures (R² = 0.9228 in the test set). In view of the results, it can be concluded that the method investigated could be a useful tool for the online monitoring of microalgal suspensions if the software is adapted to handle data streams so that the biomass concentration is estimated instantaneously. With such a monitoring system, it would be possible to estimate the concentration of microalgal biomass in a reliable manner without the interference of factors such as turbidity of the medium, state of aggregation or phase of growth. Such a system should be calibrated to account for the particular characteristics of the strain considered (size, aspect ratio) and the hydrodynamic conditions of the culture, since these affect the flocculation (and auto-flocculation) process and determine the probability of the particles traversing the FBRM probe window.