1 Introduction

Production of fish protein hydrolysates (FPH) from fish processing by-products using enzyme hydrolysis is an established field [1]. The major area of investigation has been the production of FPH with bioactive and functional properties. To measure the progress of a hydrolysis reaction, degree of hydrolysis (DH) has been used to quantify the proportion of peptide bonds in proteins that have been broken down by enzyme activity [2]. Much is known about the relationship between DH and different process conditions (temperature, pH, time, mixing speed, enzyme dosage, substrate concentration) [3–7], types of enzymes (Alcalase 2.4L, papain, Flavourzyme etc.), types of fish (herring, tuna, sardines, monkfish etc.) [8–11] and different sources of substrate (heads, viscera, fish frames and scales, tails and fins, fish muscles or a mixture of these) [10, 12–16].

Despite the utility of DH as a measured process variable, higher DH does not necessarily imply high protein solubilisation. During the initial stages of enzyme hydrolysis, the majority of protein is in solid form, and more than 80% of enzyme attaches to solid particles in the first 5 to 10 min of enzyme hydrolysis [17]. This results in enzyme activity being dominant on the solid–liquid interfaces. The DH that is measured during this time will be a measure of this activity. However, as protein solubilisation increases because of enzyme activity on insoluble protein, the rate of enzyme activity on the protein in solution increases, resulting in the reduction of the dominance of the solid–liquid interface reactions. Additionally, low DH has been associated with high average molecular weight peptides in FPH, while high DH is associated with low average molecular weight peptides [14, 18–20]. This means that DH only predicts the extent of overall protein hydrolysis, which has a large impact on the functional properties of the hydrolysates [9, 15, 18, 21–23], but it does not predict protein solubilisation in all cases.

Determination of the total amount of solids (yield) and protein that can be recovered from an enzyme hydrolysis process is important for equipment and process design. Currently, little is known about how total yield and protein recovery are affected by process conditions, especially process conditions that relate to mass transfer (solids concentration and mixing speed) and reaction kinetics (enzyme-to-substrate ratio).

Additionally, most of the existing literature that optimised DH and protein recovery performed single variable optimisation, where each response variable was optimised independently of the others. However, in complex unit operations with more than one response variable, independence of response variables cannot be summarily assumed, and simultaneous optimisation of multiple variables may be required. For example, enzyme hydrolysis produces a free oil phase, an emulsion phase and a sludge phase in addition to the aqueous hydrolysate phase, and the loss of protein from the desired aqueous hydrolysate phase to the emulsion and sludge phases can be significant [13, 24–29]. Since the factors that affect protein recovery in the hydrolysate phase may not be the same factors that affect protein loss to emulsion and sludge, these response variables must be optimised simultaneously to maximise protein recovery. This simultaneous optimisation of multiple variables can be achieved by different methods, but one simple and efficient method, known as desirability analysis, was developed by Derringer [30].

The aim of this investigation was to demonstrate the importance of yield and protein recovery, in addition to DH, in the multivariable optimisation of enzyme hydrolysis, as these two are key variables required for equipment sizing and process design. The specific objectives were to demonstrate how single variable optimisation produces different optimum conditions from multivariable optimisation and to illustrate the impact of solids concentration on process performance, with respect to the amount of water required to produce 1 unit mass of dry protein.

2 Materials and methods

2.1 Enzyme and substrate

Alcalase 2.4L, a commercial food grade endo-protease from Sigma-Aldrich (Sigma-Aldrich, Inc., USA), was used for the hydrolysis of sardine (Sardina pilchardus) off-cuts originating from a commercial canning operation (West Point Processors (Pty) Ltd, St Helena Bay, South Africa). Alcalase was chosen based on previous studies where it was shown to produce protein hydrolysates with better nutritional value and better functional and bioactive properties, such as high solubility, high emulsifying activity, high angiotensin I-converting enzyme (ACE) inhibiting activity and less bitter tasting powders [23, 24, 27, 31–33]. Upon delivery, samples were minced using a Trespade no 12 mincer (Tre Spade, Italy), packed into 75 g units and then stored in a freezer at −26 °C until time of use.

Proximate analysis was performed on the minced fish samples to determine protein content, lipid content, ash content and moisture content. Protein analysis was conducted according to the Kjeldahl method (AOAC Official Method 979.09) using VELP Scientifica Kjeldahl equipment.

Total lipid content was measured according to the method by Lee et al. [34], with a slight modification. A 2:1 (v/v) chloroform–methanol solution at a solvent-to-sample ratio of 10:1 was used. Five grams of representative sample was mixed with 50 mL of 2:1 chloroform–methanol solution and homogenised. The filtration stage was skipped to minimise solvent loss to evaporation. The accuracy of the method is dependent on the amount of solvent used to extract lipids, and any solvent loss would concentrate the lipids, thus giving an inaccurate result. The equation presented by Lee et al. [34] has a term to account for solvent loss, but working at such small volumes, it was difficult to accurately determine how much would be lost during the filtration process. Twenty millilitres of 0.5% sodium chloride solution was added to the homogenised sample to prevent formation of emulsion. Five millilitre aliquots of the chloroform phase were extracted and dried in a pre-weighed beaker on a hot plate set at 60 °C in a well-ventilated fume hood. Lipid content was calculated using Eq. 1, modified from Lee et al. [34].

$$Lipid\;content\;\left(\%\right)=\frac{lipid\;extracted\;(g)}{sample\;mass\;(g)}\times\frac{volume\;of\;Chloroform\;used\;(mL)}{volume\;of\;Chloroform\;aliquot\;(mL)}\times100\%$$
(1)
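As a quick check of Eq. 1, the calculation can be sketched in Python. The function name and the example masses and volumes are illustrative assumptions, not measurements from this study.

```python
def lipid_content_percent(lipid_g, sample_g, chloroform_used_ml, aliquot_ml):
    """Eq. 1: scale the lipid mass recovered from one aliquot up to the
    whole chloroform volume, then express it per unit sample mass."""
    if sample_g <= 0 or aliquot_ml <= 0:
        raise ValueError("sample mass and aliquot volume must be positive")
    return (lipid_g / sample_g) * (chloroform_used_ml / aliquot_ml) * 100.0


# Hypothetical numbers: 0.05 g of dried lipid from a 5 mL aliquot,
# 30 mL of chloroform in total, 5 g of minced sample.
example_lipid = lipid_content_percent(0.05, 5.0, 30.0, 5.0)
print(f"Lipid content: {example_lipid:.2f} %")
```

The volume ratio is the scale-up factor from aliquot to total extract, which is why any unaccounted solvent loss biases the result upwards.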

Ash analysis was performed on the dried samples by incineration at 600 °C for 3 h (AOAC Official Method 942.05) using a Nabertherm Muffle Furnace Lt 3/11/B180 L-030H1CN (Nabertherm GmbH, Germany). Moisture content was analysed using 1 g samples on a Kern DBS 60–3 moisture analyser (KERN & SOHN GmbH, Germany) set to program 1 (auto drying at 180 °C).

The proximate composition of the raw material used in this study is shown in Table 1.

Table 1 Proximate analysis results for Moroccan sardines processing by-products used as protein substrate. Mean ± standard deviation, n = 3

2.2 Experimental design

A rotatable central composite design (CCD) with four centre point replicates was used to determine the effect of mixing speed (rpm), solids concentration (% wet weight) and enzyme dosage (% v/w dry protein) on degree of hydrolysis, total dry solids yield, protein recovery, protein loss to emulsion and protein remaining in sludge after separation by centrifugation. The ranges of independent factors were chosen based on preliminary one-factor-at-a-time experiments. An alpha value of 1.6818 was employed to maintain rotatability of the CCD (Table 2).

Table 2 Levels of independent factors for central composite design. α = 1.6818 for rotatability. Temperature and pH were fixed at 60 °C and pH 7.8 based on one-factor-at-a-time preliminary experiments
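The coded layout of such a design can be sketched as follows. This is a minimal illustration that assumes a full two-level factorial core for the three factors; the function name is hypothetical, and the mapping from coded to real factor levels is left out because it depends on the levels in Table 2.

```python
from itertools import product


def rotatable_ccd(n_factors, n_centre):
    """Return (alpha, runs) for a rotatable CCD in coded units."""
    # Rotatability with a full two-level factorial core: alpha = (2^k)^(1/4).
    alpha = (2 ** n_factors) ** 0.25
    # Factorial (cube) points at -1 / +1 for every factor.
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=n_factors)]
    # Axial (star) points at -alpha / +alpha on one factor at a time.
    axial = []
    for i in range(n_factors):
        for sign in (-1.0, 1.0):
            point = [0.0] * n_factors
            point[i] = sign * alpha
            axial.append(point)
    # Replicated centre points.
    centre = [[0.0] * n_factors for _ in range(n_centre)]
    return alpha, factorial + axial + centre


alpha, runs = rotatable_ccd(n_factors=3, n_centre=4)
print(round(alpha, 4), len(runs))  # 1.6818 18
```

For three factors the rotatability condition reproduces the α = 1.6818 used here, and the layout contains eight factorial, six axial and four centre runs.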

2.3 Hydrolysis experiments

Hydrolysis experiments were carried out in a 400-mL temperature-controlled reactor, which was filled with a known mass of fish sample that gave a specific solids concentration on a wet basis. An appropriate amount of water was added to the fish sample and the mixture was homogenised, making a total volume of 150 mL. The homogenised material was allowed to acclimatise to the operating temperature of 60 °C and pH 7.8; hydrolysis temperature and pH were maintained constant throughout the experiments. These values are based on one-factor-at-a-time preliminary experiments where 60 °C and pH 7.8 gave the highest protein recovery. Initial reactor pH was adjusted using 2 M sodium hydroxide (NaOH). After the required hydrolysis conditions were reached, Alcalase was dosed at the enzyme–substrate ratios indicated in Table 2. The hydrolysis reaction was allowed to run for 120 min. During the course of the hydrolysis reaction, pH was maintained constant using 0.5 M NaOH that was automatically dosed into the reactor using a Neon PR (Kuntze Instruments GmbH, Germany) PID controller connected to a PULSAtron K4VCT1 electronic pump (PULSAFEEDER Inc., USA).

Degree of hydrolysis (DH), which is the percentage of peptide bonds broken down by enzyme activity to the total available peptide bonds in the substrate (Eq. 2), was tracked using the pH–stat method.

$$DH\;\left(\%\right)=\frac{V_{B}\cdot c_{B}}{\alpha \cdot m_{P}\cdot h_{tot}}\times 100\%$$
(2)

where VB (mL) is the volume of the base consumed, cB (mol/L) is the base concentration, α is the average degree of dissociation of the alpha-amino groups (α-NH2 groups), mP (g) is the mass of protein in the substrate and htot (mmol of peptide bonds per g of protein) is total peptide bonds in the substrate. The Camacho et al. [35] method was used to calculate the value of α (Section S1 of the supplementary material).
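The pH–stat calculation in Eq. 2 can be sketched in Python as below. The base concentration matches the 0.5 M NaOH used for pH control; the base volume, the value of α, the protein mass and the value of htot are hypothetical placeholders chosen only to exercise the formula (8.6 mmol/g is a value often quoted for fish protein, used here as an assumption).

```python
def degree_of_hydrolysis(v_base_ml, c_base_mol_per_l, alpha,
                         m_protein_g, h_tot_mmol_per_g):
    """Eq. 2: DH (%) from base consumption under the pH-stat method."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    # mL x mol/L gives mmol of base, i.e. mmol of titrated amino groups.
    base_mmol = v_base_ml * c_base_mol_per_l
    # alpha corrects for the fraction of alpha-amino groups that dissociate.
    bonds_cleaved_mmol = base_mmol / alpha
    bonds_total_mmol = m_protein_g * h_tot_mmol_per_g
    return bonds_cleaved_mmol / bonds_total_mmol * 100.0


# Hypothetical run: 20 mL of 0.5 M NaOH, alpha = 0.9, 5 g protein.
example_dh = degree_of_hydrolysis(20.0, 0.5, 0.9, 5.0, 8.6)
print(f"DH = {example_dh:.2f} %")
```

Because VB is in mL and cB in mol/L, their product is already in mmol, which matches the units of htot without further conversion.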

2.4 Material balance

Material balance calculations were carried out at the end of hydrolysis runs, to determine how protein and non-protein material in the feed was distributed between the different phases that form during hydrolysis.

Hydrolysed material was fractionated through centrifugation into three fractions: an emulsion (top layer) containing light weight lipo-protein complexes, an aqueous phase containing solubilised FPH (middle layer) and an insoluble sludge (bottom layer) containing high molecular weight lipo-protein complexes, unhydrolysed protein, bones and scales. Fractionation was achieved by centrifugation in 6 × 50 mL tubes for 20 min at 5 000 rpm (1 398 g) using a LASEC ISLXTG16.5 centrifuge (Lasec, South Africa). The oil content of the sample and the small quantities used in each experiment were not sufficient to form a free oil layer that could be separated and quantified. Oil droplets reported mainly to the emulsion phase, with some remaining in the liquid FPH after separation.

Dry solids yield, which is the total solids recovered as a percentage of the initial solids on a dry basis, was determined using the dry weight of FPH, according to Eq. 3.

$$Yield\;\left(\%\right)=\frac{mass\;of\;solids\;in\;hydrolysate\;(dry\;basis)}{mass\;of\;solids\;in\;feed\;(dry\;basis)}\times100\%$$
(3)

The protein content of each fraction from the separation process was determined to calculate protein recovery in FPH, protein loss to emulsion and protein loss to sludge, as a percentage of the protein in the initial sample before hydrolysis. The formulae for determining these variables are presented in Eq. 4, Eq. 5 and Eq. 6, respectively.

$$Protein\;recovery\;\left(\%\right)=\frac{mass\;of\;protein\;in\;hydrolysate\;(dry\;basis)}{mass\;of\;protein\;in\;feed\;(dry\;basis)}\times100\%$$
(4)
$$Protein\;loss\;to\;emulsion\;\left(\%\right)=\frac{mass\;of\;protein\;in\;emulsion\;(dry\;basis)}{mass\;of\;protein\;in\;feed\;(dry\;basis)}\times100\%$$
(5)
$$Protein\;loss\;to\;sludge\;\left(\%\right)=\frac{mass\;of\;protein\;in\;sludge\;(dry\;basis)}{mass\;of\;protein\;in\;feed\;(dry\;basis)}\times100\%$$
(6)

Protein content and moisture of each fraction recovered from the hydrolysis process were determined according to Eq. 7 and Eq. 8:

$$Protein\;content\;\left(\%\right)=\frac{mass\;of\;protein\;in\;each\;fraction\;(dry\;basis)}{mass\;of\;solids\;in\;each\;fraction\;(dry\;basis)}\times100\%$$
(7)
$$Moisture\;content\;\left(\%\right)=\frac{mass\;of\;wet\;sample-mass\;of\;dry\;sample}{mass\;of\;wet\;sample}\times100\%$$
(8)
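Eqs. 3 to 8 amount to simple bookkeeping over the feed and the three centrifuge fractions. A minimal sketch is given below; the class, the function name and all masses are hypothetical and serve only to show how the quantities relate.

```python
from dataclasses import dataclass


@dataclass
class Fraction:
    wet_mass_g: float      # mass as recovered
    dry_mass_g: float      # solids after drying
    protein_mass_g: float  # protein, dry basis

    def moisture_pct(self):
        # Eq. 8
        return (self.wet_mass_g - self.dry_mass_g) / self.wet_mass_g * 100.0

    def protein_content_pct(self):
        # Eq. 7
        return self.protein_mass_g / self.dry_mass_g * 100.0


def material_balance(feed, hydrolysate, emulsion, sludge):
    """Eqs. 3-6, all expressed relative to the feed on a dry basis."""
    def pct_of_feed_protein(fraction):
        return fraction.protein_mass_g / feed.protein_mass_g * 100.0

    return {
        "yield_pct": hydrolysate.dry_mass_g / feed.dry_mass_g * 100.0,
        "protein_recovery_pct": pct_of_feed_protein(hydrolysate),
        "protein_loss_emulsion_pct": pct_of_feed_protein(emulsion),
        "protein_loss_sludge_pct": pct_of_feed_protein(sludge),
    }


# Hypothetical masses (g) chosen so that the protein balance closes.
feed = Fraction(60.0, 20.0, 12.0)
balance = material_balance(
    feed,
    hydrolysate=Fraction(100.0, 12.0, 9.0),
    emulsion=Fraction(10.0, 3.0, 0.6),
    sludge=Fraction(30.0, 5.0, 2.4),
)
protein_closure = (balance["protein_recovery_pct"]
                   + balance["protein_loss_emulsion_pct"]
                   + balance["protein_loss_sludge_pct"])
print(balance, protein_closure)
```

Summing the three protein terms gives a closure check: a value far from 100% would point to analytical error or unaccounted losses.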

2.5 Statistical analysis

Data was analysed using Statistica v13.5.0.17 (TIBCO Software Inc., USA). Results of the experiments were fitted to a second-order regression model (Eq. 9), and the resulting model was employed to predict the response variables based on different levels of the independent variables.

$$Y={\beta }_{0}+\sum_{i=1}^{k}{\beta }_{i}{x}_{i}+\sum_{i=1}^{k}{\beta }_{ii}{x}_{i}^{2}+\sum \sum_{i<j}{\beta }_{ij}{x}_{i}{x}_{j}+\epsilon$$
(9)

Analysis of variance (ANOVA) was performed on the standardised effects to determine factors that were statistically significant in explaining the variation observed in the response variable. Factors were standardised by scaling the original factors such that the values of the low and high factor levels are transformed to −1 and +1, respectively. This makes the effect estimates standardised and comparable in size [36].

The second-order regression model was simplified by eliminating factors, one at a time, that were not statistically significant (p > 0.05), starting with the least significant factor (largest p value). The set of factors that gave the highest adjusted R2, rather than the highest normal R2, was retained in the model, regardless of whether individual factors were significant or not. This was done to correct for the overestimation of the response variable introduced by terms that are not significant in explaining the variation observed in the response variable. Adjusted R2 was used because it penalises a model for inclusion of non-significant factors. That is, normal R2 increases as new terms are added to the model, but adjusted R2 decreases when these additional factors do not significantly improve the model accuracy in predicting the response variable [37, 38].
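The penalty that adjusted R2 applies can be sketched with the standard formula. The example assumes 18 runs in total (eight factorial, six axial and four centre points), which is inferred from the design description rather than stated explicitly; the function name is illustrative.

```python
def adjusted_r2(r2, n_runs, n_terms):
    """Adjusted R^2 for a model with n_terms regressors plus an intercept."""
    dof_total = n_runs - 1
    dof_residual = n_runs - n_terms - 1
    if dof_residual <= 0:
        raise ValueError("too many terms for the number of runs")
    return 1.0 - (1.0 - r2) * dof_total / dof_residual


# R^2 reported for the DH model (Eq. 12), which has four terms
# besides the intercept; 18 runs is an assumption.
adj = adjusted_r2(0.57322, n_runs=18, n_terms=4)
print(round(adj, 4))  # 0.4419
```

Under that assumption the result agrees with the adjusted R2 reported for Eq. 12 to within rounding, and adding a term that leaves R2 unchanged lowers the adjusted value.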

Lack of fit (LOF) was used to test the goodness of fit of the second order regression model on the data. A p value > 0.05 (statistically non-significant LOF) means that a model has good accuracy in describing the data set it was fitted to, while a p value < 0.05 (statistically significant LOF) shows that a model performs poorly in predicting the response variable for the given data.

Desirability analysis was used to determine the optimum conditions for each response variable, as well as the overall optimum for all variables when optimised simultaneously. A full description of the method is found in [30]. For response variables that were to be maximised (DH, dry solids yield and protein recovery), the desirability function is set as

$$d_i\left(Y_i\right)=\left\{\begin{array}{ll}0 & if\;Y_i(X)<L_i\\\left(\frac{Y_i\left(X\right)-L_i}{T_i-L_i}\right)^s & if\;L_i\leq Y_i\left(X\right)\leq T_i\\1 & if\;Y_i(X)>T_i\end{array}\right.$$
(10)

where di(Yi) is the desirability, with a value between 0 and 1 mapped to all the possible values of Yi. di(Yi) = 0 represents the most undesirable value of the response variable and di(Yi) = 1 represents the most desirable value of the response variable Yi. Yi(X) is the value of response variable i at independent factor X. Li is the lower target (response) value, and Ti is the desired value for the response variable.

Loss of protein to emulsion and sludge was optimised using the desirability function that minimises the value of the response variable.

$$d_i\left(Y_i\right)=\left\{\begin{array}{ll}1 & if\;Y_i(X)<T_i\\\left(\frac{Y_i\left(X\right)-U_i}{T_i-U_i}\right)^t & if\;T_i\leq Y_i\left(X\right)\leq U_i\\0 & if\;Y_i(X)>U_i\end{array}\right.$$
(11)

where Ui is the upper value of the response variable. The values of s and t, which are weights that are used to determine the curve of approach towards the target value, were set equal to 1. This makes the relationship between the desirability and the response variable linear.
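The two desirability functions and their combination can be sketched as below. The overall desirability is taken as the geometric mean of the individual values, as in Derringer's approach; the bounds in the example are illustrative choices based on the observed response ranges, not the settings used in the software, and the response values are hypothetical.

```python
import math


def desirability_maximise(y, lower, target, s=1.0):
    """Larger-is-better desirability (Eq. 10)."""
    if y < lower:
        return 0.0
    if y > target:
        return 1.0
    return ((y - lower) / (target - lower)) ** s


def desirability_minimise(y, target, upper, t=1.0):
    """Smaller-is-better desirability (Eq. 11)."""
    if y < target:
        return 1.0
    if y > upper:
        return 0.0
    return ((y - upper) / (target - upper)) ** t


def overall_desirability(values):
    """Geometric mean; a single zero makes the whole setting unacceptable."""
    if any(v == 0.0 for v in values):
        return 0.0
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical responses at one factor setting, with illustrative bounds.
d_dh = desirability_maximise(23.645, lower=20.56, target=26.73)
d_recovery = desirability_maximise(83.91, lower=62.14, target=83.91)
d_sludge = desirability_minimise(21.45, target=10.50, upper=32.40)
overall = overall_desirability([d_dh, d_recovery, d_sludge])
print(d_dh, d_recovery, d_sludge, round(overall, 3))
```

With s = t = 1 each function is a straight line between its bounds, so a response halfway between them scores 0.5.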

For the mass balance, Tukey–Kramer test was used to perform pairwise comparison of means to determine the pairs that showed statistically significant differences [39, 40].

3 Results and discussion

Table 3 shows how degree of hydrolysis (DH), yield of dry solids in FPH, protein recovery and protein loss to emulsion and sludge were influenced by mixing speed, solids concentration and enzyme dosage. DH ranged from 20.56 to 26.73%, solids yield between 51.79 and 71.48%, protein recovery in the FPH ranged from 62.14 to 83.91%, while loss of protein to emulsion and sludge were 1.52–10.58% and 10.50–32.40%, respectively.

Table 3 Results of a central composite design in which the centre point was repeated four times, where the DH, yield of dry solids, protein recovery in FPH and protein loss to sludge and emulsified phases are determined based on mixing speed, solids concentration and enzyme dosage

Table 4 summarises the results from ANOVA, showing only the p values for the effect of independent factors on response variables. ANOVA was performed at 5% significance level, that is, p values less than 0.05 mean the effect is statistically significant. A statistically significant lack of fit in Table 4 means that the regression model does not fit the data well. ANOVA tables showing the effect of mixing speed, solids concentration and enzyme dosage on each response variable are provided in Tables 7–11 in Section S2 of the supplementary material.

Table 4 p values from ANOVA, showing the effect of mixing speed, solids concentration and enzyme dosage on the response variables. ANOVA carried out at 5% significance level

3.1 Degree of hydrolysis

DH ranged from 20.83 to 26.73%. The factors that explain the variation of DH in Table 3 were determined by analysis of variance (ANOVA). ANOVA showed that the linear terms of mixing speed, solids concentration and enzyme dosage all have a statistically significant effect (p < 0.05) on DH (Table 4). The interaction between mixing speed and solids concentration was statistically significant with a positive effect (Fig. 14 in supplementary material Section S3), but none of the quadratic terms or any other interaction terms were statistically significant. The LOF was statistically significant (p < 0.05), meaning that the second-order regression equation fitted to the data does not fully capture all the factors responsible for variations in DH in the particular system.

Factors that had no statistical significance on DH were sequentially removed from the second-order regression equation, in descending order of p values (i.e. starting with the quadratic term of enzyme dosage, Table 4). During the elimination process, no non-significant factor became significant, and the p values of statistically significant factors did not change when the non-significant factors were eliminated. Additionally, the LOF remained statistically significant throughout the elimination process. This means that, despite elimination of non-significant factors, a second-order regression model still does not adequately relate the effect of mixing speed, solids concentration and enzyme dosage to the response variable DH. The regression model based on the statistically significant factors in Table 4, relating the effect of mixing speed, solids concentration and enzyme dosage on DH, is shown in Eq. 12 (R2 = 0.57322, adjusted R2 = 0.44191).

$$DH \left(\%\right)=34.06468-0.05515x-0.33172y+1.09460z+0.00126xy$$
(12)

where x is mixing speed (rpm), y is solids concentration (%) and z is enzyme dosage (%).
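To see what the interaction term in Eq. 12 implies, the printed coefficients can be evaluated directly. This is a sketch of the equation as printed, evaluated at the fixed levels used in the contour plots (200 rpm, 38% solids, 3% enzyme dosage); the function names are illustrative, and given the significant lack of fit the numbers should be read as indicative only.

```python
def dh_eq12(x, y, z):
    """Predicted DH (%) from Eq. 12; x in rpm, y and z in %."""
    return (34.06468 - 0.05515 * x - 0.33172 * y
            + 1.09460 * z + 0.00126 * x * y)


def dh_slope_wrt_mixing(y):
    """Partial derivative of Eq. 12 with respect to mixing speed."""
    return -0.05515 + 0.00126 * y


dh_at_fixed_levels = dh_eq12(200.0, 38.0, 3.0)
# Solids concentration at which the mixing-speed slope changes sign.
crossover_solids = 0.05515 / 0.00126
print(round(dh_at_fixed_levels, 2), round(crossover_solids, 1))  # 23.29 43.8
```

Under Eq. 12, increasing mixing speed lowers the predicted DH below roughly 43.8% solids and raises it above that level, which is the practical meaning of the positive interaction between mixing speed and solids concentration.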

The effect of mixing speed and solids concentration on DH at the centre point enzyme dosage (3%) is shown in Fig. 1A. Low mixing speeds combined with low solids concentration result in high DH, whereas at low mixing speeds and high solids concentration there is non-homogeneous mixing and dead zones form at the bottom of the reactor where solids settle. This is in agreement with previous research [23, 25, 32, 41, 42], where low DH values were reported for high solids concentration at fixed mixing speeds. The practical significance of the relationship between mixing speed and solids concentration observed in this study, in the context of bioprocess design, is that mixer selection and design for handling high solids concentrations need to be investigated. This will entail the determination of mixing and mass transfer parameters such as power number, flow number, Reynolds number and Froude number. These parameters are functions of mixing speed and viscosity, which changes as solids concentration changes [43–46]. For economic reasons, chief among them the high cost of downstream processing related to the energy required for removing water, it is necessary to operate bioprocesses at high solids concentrations [47, 48].

Fig. 1 Contour plots for the response variable DH (%), showing the effect of (A) mixing speed and solids concentration at 3% enzyme dosage, (B) mixing speed and enzyme dosage at 38% solids concentration and (C) solids concentration and enzyme dosage at 200 rpm mixing speed

A further possible reason for the reduced DH at high solids concentration, apart from mass transfer effects, is the possible inhibition of enzyme activity by hydrolysis products, which is supported by the results in Fig. 1C. Valencia et al. showed that enzymatic hydrolysis products gradually reduce the rate of enzymatic hydrolysis as their concentration increases [49]. The rate of product formation is higher at high solids concentration and enzyme dosages, resulting in shorter times being required for enzyme inhibition by product formation to occur. Figure 1C shows that the reduced DH at high solids concentration can be countered, to some extent, by increasing enzyme dosage. However, this is discouraged at industrial scale since it significantly increases operating cost [50].

3.2 Yield of dry solids and protein recovery in hydrolysates

Different levels of mixing speed, solids concentration and enzyme dosage were used to determine the amount of solids (both protein and non-protein material) in the feed that can be recovered in the FPH phase. Yield was expressed as the percentage of the solids in the feed that were recovered in the FPH phase. This ranged from 51.79 to 68.85%, on a dry basis. The effect of the independent variables on yield of dry solids is shown in Table 4 and Fig. 15 in Section S3 of supplementary material. Only the linear and quadratic terms of solids concentration were statistically significant in explaining the variation observed in yield, whereas all other terms and the LOF were statistically insignificant.

Regression model improvement was done by sequential elimination of statistically non-significant factors in Table 4, until a maximum adjusted R2 was obtained. The resulting regression model showing the effect of mixing speed, solids concentration and enzyme dosage on yield of dry solids is shown in Eq. 13 (R2 = 0.78626, adjusted R2 = 0.6972).

$$Yield \left(\%\right)=213.3854-0.2519x-0.0006{x}^{2}-6.5586y+0.0791{y}^{2}+1.5824z$$
(13)

where x is mixing speed (rpm), y is solids concentration (%), and z is enzyme dosage (%).

Figure 2 shows that increasing solids concentration at constant enzyme dosage (Fig. 2A) decreased yield for solids concentrations from 26% up to approximately 42%; thereafter, increasing solids concentration up to 50% increased yield. Increasing solids concentration at constant mixing speed (Fig. 2C) likewise decreased yield for solids concentrations between 26 and 42%, followed by slight increases in yield for solids concentrations from 42 to 50%. Yields greater than 70% could generally be obtained when the solids concentration was below 30%. These results are in line with previous work where low substrate concentrations resulted in higher yields for enzyme-assisted reactions [49, 51, 52] and may be due to enzyme inhibition from hydrolysis products [49, 52] at high substrate concentrations, or possibly due to proteins reaching their solubility limit at the temperature and pH of the hydrolysis reaction [53].

Fig. 2 Contour plots for the response variable yield (%), showing the effect of (A) mixing speed and solids concentration at 3% enzyme dosage, (B) mixing speed and enzyme dosage at 38% solids concentration and (C) solids concentration and enzyme dosage at 200 rpm mixing speed

Mixing speed controls particle velocity in the reactor, determining whether there is particle settling or suspension depending on solids concentration. The relationship between mixing speed and enzyme dosage shown in Fig. 2B agrees with what has been reported in literature [54–56], where low mixing speed results in high yield for enzyme hydrolysis reactions. The reason for this is that there are two main regimes that control the kinetics of enzyme hydrolysis: mass transfer-controlled regime and reaction-controlled regime. A review by Gaikwad [54] showed that, at low mixing speeds, Vmax (the maximum rate of enzyme reaction) is higher while KM (Michaelis–Menten constant, representing the substrate concentration that results in half Vmax), meaning that kinetics is controlling the reaction. Increasing mixing speed significantly decreases Vmax, resulting in mass transfer effects dominating the reaction.

Protein recovery ranged from 62.14 to 83.91%, on a dry basis. As shown in Table 4, the linear term of enzyme dosage was the only factor that was statistically significant in explaining the variation observed in protein recovery. This factor had a positive effect on protein recovery (Fig. 16 in Section S3 of supplementary material), meaning that increasing enzyme dosage increased the amount of protein that was recovered in the FPH phase. This is expected, based on results reported in previous studies [22, 49, 50, 57], because Alcalase has a broad specificity and selectivity, meaning that it recognises a wide range of amino acid residues in a protein. Thus, increasing enzyme dosage increases protein recovery.

The regression model showing the effect of mixing speed, solids concentration and enzyme dosage on protein recovery in FPH is shown in Eq. 14 (R2 = 0.6580, adjusted R2 = 0.51551).

$$protein\;recovery\;\left(\%\right)=100.2179-1.6873y+3.2900z+0.0035xy-0.0416xz+0.2293yz$$
(14)

where x is mixing speed (rpm), y is solids concentration (%) and z is enzyme dosage (%).
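The positive effect of enzyme dosage can be read off Eq. 14 by evaluating the printed coefficients and their sensitivity to z. The sketch below uses the fixed levels from the contour plots (200 rpm, 38% solids, 3% enzyme dosage); function names are illustrative and the results apply only to the equation as printed.

```python
def protein_recovery_eq14(x, y, z):
    """Predicted protein recovery (%) from Eq. 14; x in rpm, y and z in %."""
    return (100.2179 - 1.6873 * y + 3.2900 * z
            + 0.0035 * x * y - 0.0416 * x * z + 0.2293 * y * z)


def recovery_slope_wrt_enzyme(x, y):
    """Partial derivative of Eq. 14 with respect to enzyme dosage."""
    return 3.2900 - 0.0416 * x + 0.2293 * y


recovery_at_fixed_levels = protein_recovery_eq14(200.0, 38.0, 3.0)
enzyme_slope = recovery_slope_wrt_enzyme(200.0, 38.0)
print(round(recovery_at_fixed_levels, 2), round(enzyme_slope, 2))  # 73.75 3.68
```

At these settings each additional percentage point of enzyme dosage adds about 3.7 percentage points of predicted protein recovery, and the slope grows with solids concentration through the yz term.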

The observation in Fig. 3C, where a simultaneous increase of enzyme dosage and solids concentration increases the protein recovery, agrees with the observations in literature [5, 6, 58–60]. This is because higher enzyme concentrations at higher solids concentrations counter the effect of enzyme inhibition caused by product formation. The other possible reason could be improved mass transfer, where enzymes and substrate have shorter diffusion distances before colliding with each other when their concentrations are both high. This was first explained by Engasser and Horvath [61], where the kinetics of an enzyme hydrolysis reaction are hindered by the poor diffusivity of enzyme to substrate in dilute systems.

Fig. 3 Contour plots for protein recovery in FPH (%), showing the effect of (A) mixing speed and solids concentration at 3% enzyme dosage, (B) mixing speed and enzyme dosage at 38% solids concentration and (C) solids concentration and enzyme dosage at 200 rpm mixing speed

3.3 Protein loss to emulsion and sludge

Separation of the hydrolysis mixture produced an emulsion phase, a sludge phase and the aqueous phase containing the soluble FPH. Both the emulsion and sludge phases contained protein, non-protein material and water. The protein in the emulsion and sludge phases is considered non-recoverable since it is either insoluble or would require significant additional processing to recover.

Protein loss to emulsion ranged from 1.52 to 15.5%, on a dry basis. Table 4 has the p values from ANOVA showing the effect of mixing speed, solids concentration and enzyme dosage on protein loss to emulsion. The quadratic term of solids concentration and the interactions between mixing speed and solids concentration, and between solids concentration and enzyme dosage, had statistically significant effects (p < 0.05) on protein loss to emulsion. The statistical interaction between mixing speed and solids concentration can be explained by the fact that emulsion formation requires mechanical energy to be supplied to the system [62]. Furthermore, previous research has shown that hydrolysed proteins have high emulsion activity (the maximum interfacial area per unit mass of protein in a stabilised solution), emulsion capacity (the maximum amount of oil that can be emulsified under specified conditions by a unit mass of protein) and emulsion stability (the capacity of a protein to form an emulsion that remains unchanged for a certain time period at a given temperature and gravitational field) [41, 57, 63–66]. These physicochemical properties also explain the effect of solids concentration and enzyme dosage. The hydrolysed protein can act as a surfactant that stabilises the emulsion [67], and factors that enhance FPH production and formation of emulsions are likely to result in increased protein loss in the emulsion phase.

The regression model that describes the loss of protein to emulsion was obtained by sequentially eliminating statistically non-significant factors in Table 4, resulting in Eq. 15 (R2 = 0.55099, adjusted R2 = 0.3639).

$$protein\;loss\;to\;emulsion\;\left(\%\right)=9.2154-0.0076{x}^{2}+0.0284{z}^{2}-0.0172xz+0.0850yz$$
(15)

where x is mixing speed (rpm), y is solids concentration (%) and z is enzyme dosage (%).

Figure 4A and B show the effect of mixing speed on protein loss to emulsion at different factor levels of solids concentration and enzyme dosage. Solids concentration moderates the expected increase in emulsion formation as mixing speed increases (Fig. 4A), which could be due to water becoming a limiting reagent as solids concentration increases. Figure 4B shows a trend that is known from previous work, where emulsion formation increases as mixing intensity, driven by mixing speed, increases. This is the principle used in the manufacture of colloidal emulsions [68–70], e.g. during the production of mayonnaise.

Fig. 4 Contour plots for protein loss to emulsion (%), showing the effect of (A) mixing speed and solids concentration at 3% enzyme dosage, (B) mixing speed and enzyme dosage at 38% solids concentration and (C) solids concentration and enzyme dosage at 200 rpm mixing speed

Increasing solids concentration and enzyme dosage at constant mixing speed increased the amount of soluble protein that was available for emulsion formation, resulting in more emulsion being formed (Eq. 15, Fig. 4C and Fig. 17 in Section S3 of supplementary material). The indirect effect of enzyme dosage is due to increased formation of small molecular weight peptides at higher enzyme dosages, which have high foaming and emulsion formation properties [41, 63, 71–74]. The small molecular weight peptides formed reduce the surface tension of water, creating a more suitable environment for oil and large molecular weight proteins and peptides to interact, resulting in the formation of a stable emulsion.

Protein loss to sludge ranged from 10.5 to 32.4%, on a dry basis. Table 4 shows that the factors that were statistically significant (p < 0.05) were the linear and quadratic terms of solids concentration, the linear term of enzyme dosage and the interactions between mixing speed and solids concentration, and between solids concentration and enzyme dosage.

Equation 16 (R² = 0.85128, adjusted R² = 0.74717) is the regression model showing the effect of mixing speed, solids concentration and enzyme dosage on protein loss to sludge.

$$Protein\;loss\;to\;sludge\;\left(\%\right)=0.2072+0.0004{x}^{2}+0.0802y+0.0408{y}^{2}+8.7675z-0.0070xy+0.0286xz-0.4732yz$$
(16)

where x is mixing speed (rpm), y is solids concentration (%), and z is enzyme dosage (%).
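As a quick illustration of how Eq. 16 is used, the sketch below evaluates it in Python at one set of conditions. The function name is hypothetical, and the sketch assumes the coefficients apply to the factors in their natural units (rpm and %). Because the published coefficients are rounded, the result is approximate.

```python
def protein_loss_to_sludge(x, y, z):
    """Evaluate Eq. 16 with the coefficients as printed in the text.

    x: mixing speed (rpm), y: solids concentration (%), z: enzyme dosage (%).
    Assumption: factors are entered in natural units, not coded levels.
    """
    return (
        0.2072
        + 0.0004 * x**2
        + 0.0802 * y
        + 0.0408 * y**2
        + 8.7675 * z
        - 0.0070 * x * y
        + 0.0286 * x * z
        - 0.4732 * y * z
    )


# Example conditions taken from the contour-plot captions:
# 200 rpm, 38% solids concentration, 3% enzyme dosage.
loss = protein_loss_to_sludge(200, 38, 3)
print(f"Predicted protein loss to sludge: {loss:.2f}%")
```

At these conditions the expression evaluates to roughly 14.5%, which lies inside the observed range of 10.5 to 32.4%.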

There are four potential sources of protein in sludge: (1) unhydrolysed muscle protein, (2) soluble protein in the water that remains in the sludge, (3) denatured protein precipitated during thermal or chemical treatment to stop the hydrolysis reaction, and (4) heavy lipo-protein complexes from emulsion formation. The presence of unhydrolysed protein in the sludge is typical of enzyme hydrolysis processes where not all substrate is accessible to enzyme or soluble during the enzymatic reaction [49, 60, 75, 76]. Different studies have shown that the denaturation temperature of fish protein generally ranges from 65 to 95 °C [77–82]. Chan et al. [82] showed that thermal treatment causes thermal aggregation, which reduces protein solubility as more hydrophobic surfaces of the protein become exposed to water. This is in agreement with Furuta et al. [77] who showed that heating fish protein at 75 °C for 1 min or at 95 °C for 1.5 min significantly reduced protein solubility. Enzyme hydrolysis reactions are typically quenched at temperatures above 80 °C for at least 15 min [8, 83–85], which is sufficient time to lose protein to thermal aggregation. Brunner [86] and Feingold [87] showed that emulsion formation results in the formation of lipo-protein complexes with different densities, with heavy lipo-protein complexes settling in the sludge during centrifugal separation, thus contributing to protein loss in the sludge.

Figure 5A shows that high mixing speeds at low solids concentration or high solids concentration at low mixing speeds resulted in high protein loss to sludge. The former could be due to protein loss caused by emulsion formation at high mixing speeds, as suggested in previous studies [86, 87]. The latter could be a result of poor mass and energy transfer when the solids concentration is high or inhibition of enzyme activity by product formation [49, 60, 75, 76]. In this study, we visually observed settling of solids at low mixing speeds (100 rpm) during enzyme hydrolysis, which would result in non-homogeneous reaction conditions and decreased mass transfer during enzyme hydrolysis.

Fig. 5

Contour plots for protein loss to sludge (%), showing the effect of A mixing speed and solids concentration at 3% enzyme dosage, B mixing speed and enzyme dosage at 38% solids concentration and C solids concentration and enzyme dosage at 200 rpm mixing speed

Enzyme dosage had a significant effect on the amount of protein that remained in the sludge, which agrees with previous work [5, 6, 58–60] and is expected, since the enzyme specifically targets protein material, solubilising it for recovery in the liquid FPH. Solids concentration was also a significant factor in determining protein loss to sludge, with a positive effect. That is, increasing solids concentration increases protein loss to sludge. Previous studies have shown that this is caused by inhibition of enzyme activity when solids concentration is high [49, 51, 52]. In theory, the effect of high solids concentration can be countered by increasing enzyme dosage, which is shown by the interaction term (Table 11, Eq. 16, Fig. 5C). However, this is discouraged at industrial scale since enzymes are a costly reagent [50, 88]. Figure 18 in Section S3 of supplementary material shows the magnitude and direction (positive or negative effect) that each independent factor had on protein loss to sludge.

3.4 Multivariable optimisation through desirability analysis

Desirability analysis was performed using the predicted values of the response variables based on Eqs. 12 to 16. The coefficients for the terms that were employed in the prediction models, together with their R² values and p values for LOF, are summarised in Table 5. The analysis was done for each response variable to determine the optimum levels of the independent variables that gave the best value for each response variable. The objective was to maximise DH, yield and protein recovery while minimising protein loss to the emulsion and sludge phases. That is, DH, yield and protein recovery were assigned a desirability score of 1, while protein loss to emulsion and sludge were given a score of 0.
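The functional form of the individual desirability functions is not stated here. A minimal sketch, assuming the linear Derringer–Suich form and using hypothetical function names, shows how a predicted response could be mapped onto a 0 to 1 score for a maximised and a minimised response. The sludge bounds are the observed range reported above (10.5 to 32.4%); the predicted value of 15% is purely illustrative.

```python
def desirability_max(value, low, high):
    """Larger-is-better score: 0 at or below `low`, 1 at or above `high`."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


def desirability_min(value, low, high):
    """Smaller-is-better score: 1 at or below `low`, 0 at or above `high`."""
    return 1.0 - desirability_max(value, low, high)


# Illustrative only: a hypothetical predicted protein loss to sludge of 15%,
# scored against the observed range of 10.5 to 32.4%.
d_sludge = desirability_min(15.0, 10.5, 32.4)
print(f"Desirability of 15% protein loss to sludge: {d_sludge:.3f}")
```

Responses to be maximised (DH, yield, protein recovery) would use `desirability_max` with their own bounds, which are not given in this section.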

Table 5 Summary of the regression coefficients for the second-order model for predicting response variables

The desirability profile for DH in Fig. 6 indicates that the estimated optimum conditions that maximise DH are a mixing speed of 100 rpm, solids concentration of 30.8% and enzyme dosage of 4.682%. However, the profile also shows that mixing speeds between 100 and 140 rpm, solids concentrations from 26 to 30.8% and enzyme dosages from 4% all fall within the 95% confidence interval of the predicted DH. This gives a range of operating conditions that result in the maximum DH. This is an important observation, since, at industrial scale, it may be impossible to operate at fixed values of independent factors for various reasons. For example, it is economical to operate at the lowest enzyme dosage that gives the best DH since the cost of FPH is significantly affected by the cost of enzyme [50, 88]. The other implication is that mass transfer effects can be controlled by adjusting mixing speed and solids concentration without significantly affecting DH.

Fig. 6

Desirability profile of predicted values of DH based on Eq. (12). Blue horizontal lines show the ± 95% confidence interval of the predicted value

Yield of solid material had a desirability score of 0.9924 at a mixing speed of 100 rpm, solids concentration of 26% and enzyme dosage of 4.682%, giving the maximum yield (Fig. 7). Desirability decreased sharply as mixing speed and solids concentration increased and as enzyme dosage decreased, with solids concentration showing the steepest gradient. Therefore, solids concentration has a significant effect on the predicted values of yield, as already found from Fig. 2. Furthermore, the solids concentration that results in the highest yield (26%) is different from the one that results in the highest DH (30.8%). As previously discussed, there is room to increase the solids concentration from 26 to 30.8% without significantly affecting the desirability of DH, but doing so will significantly reduce the desirability of yield (%). The significance of this is that, to make data-driven compromises, one must understand which of the two response variables being optimised is more important. For example, at industrial scale, some of the factors considered when selecting optimum conditions are the cost of enzymes and the cost of drying FPH. These costs are minimised by using low enzyme dosages and high solids concentration. Despite some attempts by different researchers to account for enzyme costs in hydrolysis of fish processing by-products [4, 50, 89], there is still no definitive level of enzyme dosage that is recommended. The slopes of the desirability profile in Fig. 7 show that there are narrow regions of operation for the independent variables where yield is maximised.

Fig. 7

Desirability profile of predicted values of yield (%) based on Eq. (13). Blue horizontal lines show the ± 95% confidence interval of the predicted value

The desirability profile of protein recovery (Fig. 8) has similar trends to the profile of DH, with the same optimum conditions for maximum protein recovery. DH measures the number of peptide bonds broken down during hydrolysis, while protein recovery measures the amount of protein recovered in the liquid phase. Mohr (1978) showed that protein hydrolysis occurs both at solid surfaces and in the liquid phase, with the initial reactions happening on the solid surfaces [17]. No studies have definitively characterised the reaction mechanism to determine which reaction rate is dominant over the whole course of the hydrolysis reaction, but the general understanding is that hydrolysis occurring at the solid surfaces continues to contribute more protein to the liquid phase, making DH directly proportional to protein recovery [25, 28]. Therefore, conditions that favour higher DH are also likely to favour protein recovery as hydrolysates, which this work seems to confirm.

Fig. 8

Desirability profile of predicted values of protein recovery (%) based on Eq. (14). Blue horizontal lines show the ± 95% confidence interval of the predicted value

Figure 9 shows that the two important factors to consider when the objective is to minimise emulsion formation are mixing speed and solids concentration. A desirability of 1, corresponding to the lowest amount of protein in the emulsion as a fraction of the feed protein content, is obtained at the minimum mixing speed of 100 rpm and a solids concentration of 50%. A discussion of how these two variables affect emulsion formation is presented in Sect. 3.3. Enzyme dosage appears to have no significant effect on the desirability of the response variable. The significance of this result is that lower enzyme dosages can be used when the objective is only to minimise emulsion formation.

Fig. 9

Desirability profile of predicted values of protein in emulsion (%) based on Eq. (15). Blue horizontal lines show the ± 95% confidence interval of the predicted value

The optimum levels for minimising protein loss to sludge, 300 rpm for mixing speed and 50% for solids concentration (Fig. 10), suggest that mass transfer effects, coupled with low emulsion formation due to water acting as a limiting reagent, control how much protein is lost to the sludge. Another possible effect of high mixing speed on protein loss to sludge could be mechanical breakdown of solids into finer particles, increasing the surface area exposed to enzyme attachment, which could decrease the amount of protein material that remains unhydrolysed in the sludge. However, high mixing speed increases emulsion formation (Fig. 9), and high solids concentration decreases DH, yield and protein recovery.

Fig. 10

Desirability profile of predicted values of protein in sludge (%) based on Eq. (16). Blue horizontal lines show the ± 95% confidence interval of the predicted value

When all response variables were given equal importance and optimised simultaneously, the optimum levels of the independent variables were 220 rpm, 26% solids concentration and 4.682% enzyme dosage, with an overall desirability of 0.8581 (Fig. 11). The relatively low overall desirability score means that there is at least one pair of response variables whose optimal levels of the independent variables do not intersect. For example, for the pair DH and protein in emulsion, the optimum solids concentration for DH is at the low end of the range (30.8% solids), whereas the optimum solids concentration for limiting protein loss to emulsion is at the high extreme (50%).

Fig. 11

Desirability profile showing the overall desirability obtained from a geometric mean of the individual desirabilities of the five response variables. Blue horizontal lines show the ± 95% confidence interval of the predicted value
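The geometric-mean combination described in this caption can be sketched as follows. The function name and the five example scores are hypothetical, chosen only to show the behaviour; they are not the individual desirabilities obtained in this study.

```python
import math


def overall_desirability(scores):
    """Unweighted geometric mean of individual desirability scores in [0, 1]."""
    if not scores:
        raise ValueError("at least one desirability score is required")
    if any(d < 0.0 or d > 1.0 for d in scores):
        raise ValueError("desirability scores must lie between 0 and 1")
    # A single fully undesirable response drives the overall score to zero.
    if any(d == 0.0 for d in scores):
        return 0.0
    return math.exp(sum(math.log(d) for d in scores) / len(scores))


# Hypothetical scores for five equally weighted responses.
balanced = overall_desirability([0.9, 0.9, 0.9, 0.9, 0.9])
one_poor = overall_desirability([1.0, 1.0, 1.0, 1.0, 0.4])
print(f"balanced: {balanced:.3f}, one poor response: {one_poor:.3f}")
```

Because the mean is geometric rather than arithmetic, one poorly satisfied response lowers the overall score more than it would in a simple average, which is why conflicting optima between responses show up as a reduced overall desirability.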

3.5 Material balance

Additional enzyme hydrolysis runs were carried out at the conditions shown in Table 6. DH, dry solids yield, protein recovery, protein content and moisture content were evaluated for each set of variables. Hydrolysis conditions for (a) represent the optimum conditions for maximised DH and dry solid yield. Set (b) is a set of conditions that maximises protein recovery, while (c) is a set of conditions that simultaneously maximises DH, dry solid yield and protein recovery while minimising protein loss to emulsion and sludge. Condition (d) was included based on the recommendation of Balan et al. [47] and Janssen et al. [48] that high solids concentrations improve the economic feasibility of bioprocesses.

Table 6 Enzyme hydrolysis conditions used to evaluate the effect of mixing speed, solids concentration and enzyme dosage on degree of hydrolysis and material balance in the FPH, emulsion and sludge fractions. DH is mean ± standard deviation

Figure 12 shows how yield of total solids (A) and protein recovery (B), on a dry basis, varied in the three recovered phases at different process conditions. For FPH, there was a significant decrease in yield and protein recovery when the solids concentration was increased from 26 to 50% (a and d), while at the same time, there was an increase in the percentage of dry total solids and protein that remained in the sludge. This agrees with the literature, where higher solids concentrations result in poor enzyme hydrolysis because of enzyme inhibition by both substrate and product as well as poor mass transfer [49, 51, 52, 90].

Fig. 12

Yield of dry solids (A) and recovery of dry protein (B) in the fractions obtained after centrifugal separation of hydrolysed material. Enzyme hydrolysis conditions for (a) 100 rpm mixing speed, 26% solids concentration and 4.682% enzyme dosage; (b) 140 rpm mixing speed, 30.86% solids concentration and 4% enzyme dosage; (c) 220 rpm mixing speed, 26% solids concentration and 4% enzyme dosage; and (d) 100 rpm mixing speed, 50% solids concentration and 4.682% enzyme dosage. Different roman numerals (i, ii, iii) show statistically significant differences between means using Tukey–Kramer test, n = 3. Error bars are standard deviation

Protein content, on a dry basis, and moisture content were also considered in this study. Although there was no statistically significant difference in protein content of the FPH recovered (Fig. 13A), the moisture content (Fig. 13B) of the FPH obtained from the hydrolysis process with 50% solids concentration was significantly lower than that of the FPH obtained from hydrolysis at optimum conditions (c). This implies that higher solids concentration will result in less water being removed during downstream processing. However, the economic implications of this are unclear as there are currently no techno-economic evaluation studies on enzyme hydrolysis that show the sensitivity of the process to solids concentration. Such studies are needed to determine the solids concentration at which enzyme hydrolysis processes that use fish processing by-products become economically viable. The only available economic feasibility study of producing FPH at industrial scale is [89], which used conceptual process simulation software for a microwave-assisted process using 26% solids concentration.

Fig. 13

Protein (A) and moisture (B) in the fractions obtained after separating the hydrolysed material. Enzyme hydrolysis conditions for (a) 100 rpm mixing speed, 26% solids concentration and 4.682% enzyme dosage; (b) 140 rpm mixing speed, 30.86% solids concentration and 4% enzyme dosage; (c) 220 rpm mixing speed, 26% solids concentration and 4% enzyme dosage; and (d) 100 rpm mixing speed, 50% solids concentration and 4.682% enzyme dosage. Different roman numerals (i, ii, iii) show statistically significant differences between means using Tukey–Kramer test, n = 3. Error bars are standard deviation

The results shown in Fig. 12 and Fig. 13 were used to perform an overall mass balance, determining the amount of fish processing by-products and water required to produce 1 unit mass of dry protein. Although results presented in the previous sections have shown that low solids concentration will result in significantly higher dry solids yield, protein recovery and protein content in the FPH, this comes at the cost of using a significantly larger amount of water, as shown in Table 7. The implication of this in process design is twofold: (1) a larger reactor volume is required for enzyme hydrolysis, and (2) there is an increase in downstream processing cost for handling and removing large volumes of water to produce a dry product. This is supported by studies carried out by Balan et al. [47] and Janssen et al. [48], who showed that bioprocesses mostly only become commercially viable at relatively high solids concentrations.

Table 7 Mass balance showing the mass of dry solids and water required for producing 1 unit mass of dry protein hydrolysates under different enzyme hydrolysis conditions
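The arithmetic behind a balance of this kind can be sketched as follows. This is not the authors' calculation: the function name, the input values, and the definition of solids concentration as dry solids per total slurry mass are assumptions made for illustration. Protein recovery is held constant across the two cases only to isolate the effect of solids concentration on water demand.

```python
def feed_per_unit_dry_protein(solids_fraction, feed_protein_fraction, protein_recovery):
    """Return (dry_solids, water) needed per unit mass of dry protein in the FPH.

    solids_fraction: dry solids / total slurry mass (assumed definition)
    feed_protein_fraction: protein / dry solids in the by-products
    protein_recovery: fraction of feed protein recovered in the FPH
    """
    for name, value in (
        ("solids_fraction", solids_fraction),
        ("feed_protein_fraction", feed_protein_fraction),
        ("protein_recovery", protein_recovery),
    ):
        if not 0.0 < value <= 1.0:
            raise ValueError(f"{name} must be in (0, 1]")
    # Dry solids that must be fed so that one unit of protein ends up in the FPH.
    dry_solids = 1.0 / (feed_protein_fraction * protein_recovery)
    # Total water in the slurry (by-product moisture plus added water).
    water = dry_solids * (1.0 - solids_fraction) / solids_fraction
    return dry_solids, water


# Hypothetical inputs: 60% protein in dry by-products, 50% protein recovery.
for solids in (0.26, 0.50):
    dry, water = feed_per_unit_dry_protein(solids, 0.60, 0.50)
    print(f"{solids:.0%} solids: {dry:.2f} dry solids, {water:.2f} water per unit protein")
```

Under these assumed inputs, moving from 50% to 26% solids nearly triples the water that has to be handled per unit of dry protein, which is the trade-off the table quantifies with measured recoveries.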

Despite statistically insignificant differences in water content of the FPH phase between 26 and 30.86% (Fig. 13B), any subtle differences that do exist may be significant from a technical and economic perspective.

4 Conclusion

Enzyme hydrolysis experiments carried out in this study have shown that response variables are not affected in the same way when independent variables are manipulated. It was observed that operating at low mixing speeds and low solids concentration results in increased protein loss to sludge while operating at high mixing speeds and low solids concentration results in high protein loss to emulsion. This shows that individually optimising one response variable at a time will negatively affect the response of other variables that may be important to process operation.

This study demonstrates the importance of multivariable optimisation for data-driven decision-making in selecting the best operating conditions when at least two response variables are important in process optimisation. The simultaneous optimisation of the five response variables (DH, yield (%), protein recovery (%), protein loss to emulsion (%) and protein loss to sludge (%)) yielded a medium mixing speed of 220 rpm, a low solids concentration of 26% and a high enzyme dosage of 4.682% as the optimum levels for the independent variables. This demonstrates that multivariable optimisation produces an optimal solution which is a compromise when at least one of the independent factors does not have intersecting optimal regions for two responses optimised simultaneously.

Although the optimisation experiments showed that low solids concentration results in high solids yield and protein recovery, downstream processing may be affected significantly as this requires large volumes of water to be removed from the protein hydrolysates to make dry powders. Apart from downstream processing implications, low solids concentrations will also have a negative impact on equipment size for the hydrolysis process. It is therefore recommended that a holistic approach to process optimisation be adopted, where both upstream and downstream processes are simultaneously considered.