1 Introduction

Rehydration is a complex process aimed at restoring, through contact with water, the properties the raw material had before it was pretreated and dried. Three processes take place during rehydration: absorption of water by the tissues of the dried material, the accompanying increase in its mass and volume, and the leaching of water-soluble substances such as sugars, acids, vitamins and minerals from the rehydrated material [1, 2]. The progress of these processes depends on the features of the raw material and on the conditions under which the drying and the preceding pretreatments take place [3]. The course of rehydration therefore reflects the changes that occurred in the raw material tissue as a result of pretreatment, drying and the rehydration itself [4]. Such changes are the reason why the dried product fails to regain the features of the raw material after rehydration, which demonstrates the irreversibility of the drying process [5].

Many studies of the rehydration process have focused on determining rehydration indices that quantify the degree of reconstitution of the dried material and on describing the process with empirical formulae [6, 7]. Other works have been oriented towards the optimisation of parameters in selected thermal treatment technologies [8, 9] and towards studying changes in the tissue structure [10, 11], chemical composition [12] and colour [13, 14].

Complex and highly nonlinear phenomena take place during drying and rehydration [15]. It is therefore difficult to estimate the relationships between the inputs and outputs of this complex system using classical mathematical approaches.

Intelligent techniques such as artificial neural networks (ANNs) show a high learning ability and are capable of identifying such systems [16]. The ability of neural networks to learn from repeated exposure to system characteristics has made them a popular choice for many applications, including drying technology [17, 18].

In the literature, several papers deal with modelling heat and mass transfer kinetics [19], drying characteristics [20, 21], approximation of the moisture content [22] and the quality of apple tissue [23]. ANN models were also developed for predicting some physicochemical properties of apple tissue during thin-layer hot air drying [24]. Recently, several authors have applied neural networks to modelling heat and mass transfer during the rehydration process [25, 26]. A simple ANN model was used to predict the shrinkage and estimate the rehydration capacity of dried cooked rice [27] and dehydrated carrots [28]. A comprehensive review devoted to the application of ANNs in drying technology is given in [29].

Determination of the best ANN topology for estimating the colour change is usually carried out by a trial-and-error procedure, which is very time-consuming [30]. Optimisation of the neural network parameters requires that a large number of different topologies be constructed, trained and tested. However, there is no general rule for selecting the values of the ANN parameters; the choice also depends on the complexity of the modelled system.

In recent times, many researchers have used response surface methodology (RSM) [31–33] and genetic algorithms (GAs) [34, 35] to optimise the ANN topology.

A genetic algorithm is a biologically inspired optimisation technique [36]. Recently, GAs have gained popularity as robust optimisation tools for multi-modal nonlinear problems, and they can exploit ANN models as their fitness function. In the food industry, a combined ANN–GA system was used to control the fruit storage process. The authors [37, 38] indicated the need to apply a hybrid system for the selection of input parameters (temperature, density) and output parameters (colour, mass loss, hardness) to improve the quality of the stored fruit, and noted that the combined system of ANNs and GAs is superior to the traditional computational techniques used in problems related to agriculture. An ANN–GA system was successfully used for optimising thermal conditions for conduction-heated foods [39, 40]. Recently, ANN and GA approaches in drying technology have been described in [41–43].

It can be concluded from the literature review that coupling these two methods has many benefits for finding the globally optimal neural network topology and improving the model performance [33, 34].

The objective of this work is to use integrated RSM and GA methods to optimise the topology of a neural network for predicting the colour change in rehydrated apple cubes.

2 Materials and methods

2.1 Material

High-quality Ligol apples were bought at the local market. They were washed in water, cut into cubes with dimensions of 10 × 10 × 10 mm and dried on the same day. The initial moisture content of the samples amounted to ca. 85% w.b. (5.66 d.b.).

2.2 Drying equipment and experiments

The drying experiments were carried out in a dryer constructed in our laboratory. Details of the drying equipment and of the drying procedure can be found in [44]. The laboratory dryer was run for about 1 h, and when steady conditions were achieved the samples were placed on a tray. The drying process lasted until the mass of the sample became constant. The drying experiments were performed at three levels of drying air temperature (50, 60 and 70 °C) and two levels of air flow velocity (0.5 and 2 m/s). The final moisture content of the dried apples amounted to ca. 9% w.b. (0.098 d.b.). The dry matter of the solid was determined according to AOAC standards [45]. The dried material obtained under the given conditions from three independent experiments was mixed and stored in a tightly sealed container for about one week at 20 °C, after which samples were taken for further studies. The container with the dried material was kept in a cupboard, so the dried apples were not exposed to sunlight.

2.3 Rehydration procedure

The apple cubes were immersed in distilled water at four temperatures (20, 45, 70 and 95 °C) using an ELP 12 water bath (LABOPLAY, Bytom, Poland). The initial mass of each dried sample subjected to rehydration was 10 g, and the dried sample mass-to-medium mass ratio at the beginning of rehydration was 1:20. The rehydration lasted from 6 h (at 20 °C) to 2 h (at 95 °C). After removal from the water, the samples were dried on filter paper. The medium was not stirred during the rehydration process, and its temperature was kept constant.

2.4 Colour determination

Colour images of fresh and rehydrated apple were acquired using a flatbed scanner (Canon CanoScan 5600F). The device was equipped with a 6-line colour CCD sensor, a fluorescent lamp and a 48-bit input/output interface (16 bits for each RGB channel). Images with a resolution of 300 dpi were acquired in the sRGB colour space and then saved in BMP format as matrices with dimensions of 2552 × 3508 pixels. During the scanning process, all tools for automatic image enhancement were disabled. The apple cubes were randomly positioned on the scanner platen. For the fresh apple and for each type of dehydrated cubes (various drying conditions: drying temperature and drying air velocity, and various rehydration temperatures and times), 30 images were acquired. The images were then transformed to the CIEXYZ colour space [45, 46]. The nonlinear transformation of CIEXYZ to CIEL*a*b* coordinates was done relative to illuminant D50 and the 10° observer according to the CIE standard, using the values 94.811, 100 and 107.32 as the reference white for the X, Y and Z coordinates, respectively [47]. Chroma (C*) and hue (h*) of the CIEL*C*h* colour space were calculated according to Schanda [48]. The original digital image of raw apple cubes and the preprocessed image of apple cubes extracted from the image background are shown in Fig. 1.

Fig. 1 Original digital image of raw apple cubes (a), preprocessed image extracted from the image background (b) and the image split into R (c), G (d) and B (e) channels
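The conversion from CIEXYZ to CIEL*a*b* and the calculation of chroma and hue can be illustrated with a minimal MATLAB sketch. It assumes the D50/10° reference white values quoted above (94.811, 100, 107.32) and a single pixel with hypothetical tristimulus values; the scanner-specific sRGB-to-XYZ step and the image preprocessing are not shown.

```matlab
% Minimal sketch: CIEXYZ -> CIEL*a*b* -> chroma C* and hue angle h*.
% Reference white for illuminant D50, 10 deg observer (values from the text).
Xn = 94.811; Yn = 100; Zn = 107.32;

% CIE nonlinearity used in the XYZ -> L*a*b* transformation
f = @(t) (t > (6/29)^3).*t.^(1/3) + (t <= (6/29)^3).*(t/(3*(6/29)^2) + 4/29);

% Tristimulus values of one pixel (hypothetical numbers)
X = 38.2; Y = 41.5; Z = 30.1;

fx = f(X/Xn); fy = f(Y/Yn); fz = f(Z/Zn);
L = 116*fy - 16;             % lightness L*
a = 500*(fx - fy);           % a*
b = 200*(fy - fz);           % b*

C = sqrt(a^2 + b^2);         % chroma C*
h = atan2(b, a)*180/pi;      % hue angle h* in degrees
if h < 0, h = h + 360; end   % keep the hue angle in [0, 360)
```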

3 Quality of rehydrated product

It was assumed that the quality of the rehydrated product is defined by means of its colour change. The colour of a food product can be considered a very important quality factor because it plays a decisive role in consumer acceptability. The colour change (C ch) was calculated according to the formula given in [49]

$$C_{\text{ch}} = \sqrt {\left( {\frac{{\Delta L^{*} }}{{K_{L} S_{L} }}} \right)^{2} + \left( {\frac{{\Delta C^{*} }}{{K_{C} S_{C} }}} \right)^{2} + \left( {\frac{{\Delta H^{*} }}{{K_{H} S_{H} }}} \right)^{2} }$$
(1)

where S L, S C, S H denote the weighting functions adjusting for the internal non-uniformity of the CIEL*a*b* space and may be obtained using Eqs. (2)–(4)

$$S_{L} = 1$$
(2)
$$S_{C} = 1 + 0.045 \cdot C^{*}$$
(3)
$$S_{H} = 1 + 0.015 \cdot C^{*}$$
(4)

The parameters K L, K C, K H express the deviation from the reference conditions and are equal to 1 under reference conditions [50]. Parameters ΔL*, ΔC*, ΔH* denote the differences between the tested sample (T) and the standard (S) in terms of lightness, chroma and hue, respectively, and are determined according to formulae (5)–(7)

$$\Delta L^{*} = L_{T}^{*} - L_{S}^{*}$$
(5)
$$\Delta C^{*} = C_{T}^{*} - C_{S}^{*}$$
(6)
$$\Delta H^{*} = 2\sqrt {C_{T}^{*} \cdot C_{S}^{*} } \cdot \sin \left( {\frac{{\Delta h^{*} }}{2}} \right)$$
(7)
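For illustration, a minimal MATLAB sketch of Eqs. (1)–(7) is given below. It assumes that the L*, C*, h* coordinates of the tested sample and the standard are already available (the numbers used are hypothetical), that K_L = K_C = K_H = 1 (reference conditions) and that the chroma C* in Eqs. (3) and (4) is taken from the standard.

```matlab
% Minimal sketch of Eqs. (1)-(7): colour change C_ch between a tested
% sample (T) and the standard (S), given their L*, C*, h* coordinates.
LT = 72.4; CT = 21.3; hT = 95.0;   % hypothetical rehydrated sample
LS = 78.1; CS = 18.6; hS = 99.5;   % hypothetical standard (fresh apple)

KL = 1; KC = 1; KH = 1;            % reference conditions

dL = LT - LS;                      % Eq. (5)
dC = CT - CS;                      % Eq. (6)
dh = hT - hS;                      % hue-angle difference in degrees
dH = 2*sqrt(CT*CS)*sind(dh/2);     % Eq. (7)

SL = 1;                            % Eq. (2)
SC = 1 + 0.045*CS;                 % Eq. (3), C* taken from the standard
SH = 1 + 0.015*CS;                 % Eq. (4)

Cch = sqrt((dL/(KL*SL))^2 + (dC/(KC*SC))^2 + (dH/(KH*SH))^2);   % Eq. (1)
```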

4 Neural networks

4.1 Design of ANN architecture

A multilayer feed-forward (MLFF) backpropagation (BP) neural model was developed (Fig. 2) for predicting the colour change in apple cubes during the drying and rehydration processes. The following variables were taken as the input parameters of the model: drying air temperature, drying air velocity, temperature of distilled water and rehydration time. The network output variable was the colour change. Since the single output variable (colour change) depended on four exogenous input variables, one neuron was used in the output layer and four neurons in the input layer. It was reported in earlier works [32, 33] that a network with one hidden layer and the hyperbolic tangent sigmoid (tansig) transfer function is commonly used for forecasting in practice [51, 52]. Therefore, a single hidden layer with the tansig transfer function was considered for optimisation in this study. The tansig function is given by Eq. (8)

Fig. 2 Schematic neural network architecture

$${\text{tansig}}\left( n \right) = \frac{2}{{1 + { \exp }\left( { - 2n} \right)}} - 1$$
(8)

Moreover, a linear (purelin) transfer function was selected for the output layer in the simulation process.
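A minimal sketch of the resulting forward pass is given below; the weights are random placeholders rather than the trained network, and the 13 hidden neurons anticipate the optimised value reported later.

```matlab
% Minimal sketch of the 4-n-1 feed-forward pass: tansig hidden layer
% (Eq. 8) and a linear output neuron. Weights are random placeholders.
nInputs = 4;                        % drying air temperature, air velocity,
                                    % rehydration temperature and time
nHidden = 13;                       % hidden neurons (optimised value)

W1 = rand(nHidden, nInputs) - 0.5;  b1 = rand(nHidden, 1) - 0.5;
W2 = rand(1, nHidden) - 0.5;        b2 = rand - 0.5;

tansig_fun = @(n) 2./(1 + exp(-2*n)) - 1;     % Eq. (8)

x = [0.4; 0.7; 0.2; 0.5];           % one normalised input vector
hiddenOut = tansig_fun(W1*x + b1);  % hidden-layer response
yColour   = W2*hiddenOut + b2;      % linear output: predicted colour change
```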

4.2 Data preprocessing

In order to produce the most efficient training, the data should be normalised before training. This also helps in analysing the network response after training is completed. To this end, 189 cases were chosen from our experiments. The chosen cases were randomly divided into the following sets: 133 samples (≈70% of the cases) for training, 28 samples (≈15%) for validation and 28 samples (≈15%) for testing. The second data set was used for evaluating the performance of the network during training, while the third was used for estimating the predictive ability of the developed model [51]. The data were normalised between 0.1 and 0.9 in the following way [53]

$$x_{{{\text{normalized}} \;{\text{value}} }} = 0.1 + 0.8\left( {\frac{{y_{{{\text{actual}}\;{\text{value}}}} - y_{{{\text{minimum}}\;{\text{value}}}} }}{{y_{{{\text{maximum}}\;{\text{value}}}} - y_{{{\text{minimum}}\;{\text{value}}}} }}} \right)$$
(9)
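A minimal sketch of Eq. (9) applied to one variable (the rehydration times used here are hypothetical):

```matlab
% Minimal sketch of Eq. (9): linear scaling of a variable to [0.1, 0.9].
normalise = @(y) 0.1 + 0.8*(y - min(y))./(max(y) - min(y));

rehydrationTime = [10 30 60 120 240 360];   % hypothetical values, min
xNorm = normalise(rehydrationTime);         % normalised values in [0.1, 0.9]
```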

4.3 Training methods

In MLP networks, the MSE can be minimised by various methods, including Levenberg–Marquardt (LM), gradient descent (GD) and conjugate gradient (CG). MLPs are as a rule trained using the error backpropagation (BP) algorithm, a general method for the iterative solution of weights and biases. BP uses the GD technique, which is very slow at a small learning rate and has poor convergence properties. Various modifications aimed at speeding up BP have been applied, for instance a momentum term or a variable learning rate. Finally, the gradient descent with momentum (GDM) algorithm was chosen for training the networks. It avoids local minima, speeds up learning and stabilises convergence [33, 34]. Moreover, GDM allows a network to respond to the local gradient and to ignore small features in the error surface [51].
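The GDM weight update can be sketched as follows; the weights and the gradient are placeholders, and some implementations additionally scale the gradient term by (1 − mc).

```matlab
% Minimal sketch of one GDM weight update. W and gradW are placeholders
% for the current weights and the error gradient from backpropagation.
lr = 0.33;                    % learning rate
mc = 0.89;                    % momentum constant

W      = rand(13, 4) - 0.5;   % current hidden-layer weights
gradW  = rand(13, 4) - 0.5;   % dMSE/dW (placeholder values)
dWprev = zeros(13, 4);        % weight change from the previous step

dW = mc*dWprev - lr*gradW;    % momentum-weighted descent step
W  = W + dW;                  % updated weights
dWprev = dW;                  % stored for the next iteration
```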

4.4 Training parameters

In our training process, the number of neurons in the hidden layer, the number of epochs, the learning rate and the momentum coefficient are parameters that can affect the network simulation efficiency. However, GDM depends mainly on two training parameters: the learning rate (lr) and the momentum constant (mc). The first determines the time needed to find the minimum in the weight space. Too high an lr leads to an increase in the magnitude of the oscillations of the MSE. Too small an lr causes smaller steps to be taken in the weight space; in this case learning becomes slower and the capability of the network to escape from local minima in the error surface becomes lower. The mc defines the amount of momentum. An mc of 1 results in a network that is totally insensitive to the local gradient and consequently does not learn properly. Too high an mc causes the adaptation to diverge and gives unusable weights. Too small an mc is responsible for a long learning time [32].

The number of neurons in the hidden layer is decisive for network performance. Too few hidden neurons prevent the ANN from adapting to the process being modelled, whereas too many cause the system to memorise errors [51]. Moreover, too many neurons do not propagate errors back efficiently [33] and therefore worsen the ability of the neural network to learn. Similar problems are encountered when selecting the number of epochs. Too few epochs limit the ability of the network to model the process, while too many can lead to overtraining of the network and to increasing errors.

Therefore, determining the optimum values of the parameters affecting the ANN is an important task, and appropriate ranges should be chosen. The following numerical variables were chosen for ANN optimisation: number of neurons in the hidden layer, lr, mc, number of epochs and number of training runs. The response sought was the best validation performance (MSE). BP uses a GD technique whose stability depends on lr; a small lr leads to very stable GD. In MATLAB 7.0, the default values of lr and mc are 0.01 and 0.9, respectively. Accordingly, lr was varied from 0.01 to 0.4 and mc between 0.1 and 0.9. Similarly, the number of neurons in the hidden layer (2–16), the number of training epochs (300–5000) and the number of training runs (3–7) were chosen. The ranges of the input variables of the ANN model are shown in Table 1.

Table 1 Limits of the input variables in the neural network model

4.5 Performance evaluation

After the optimal ANN topology has been found, the next step is to measure its performance. The performance of the designed ANN was estimated on the basis of the coefficient of determination (R 2), the mean square error (MSE) and the mean absolute error (MAE) [33]. These parameters were determined using Eqs. (10)–(12)

$$R^{2} = 1 - \frac{{\mathop \sum \nolimits_{i = 1}^{N} \left( {x_{pi} - x_{di} } \right)^{2} }}{{\mathop \sum \nolimits_{i = 1}^{N} \left( {x_{di} - \bar{x}} \right)^{2} }}$$
(10)
$${\text{MSE}} = \frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \left( {x_{pi} - x_{di} } \right)^{2}$$
(11)
$${\text{MAE}} = \frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \left| {x_{pi} - x_{di} } \right|$$
(12)

where x pi is the network (predicted) output for observation i, x di is the experimental (actual) output for observation i, \(\bar{x}\) is the average value of the experimental output, and N is the number of data points. MSE quantifies the differences between the values implied by the estimator and the estimated quantity; a value of MSE close to 0 indicates that the network can be considered satisfactory. R 2 indicates the goodness of the model fit; if R 2 = 1, the regression line fits the data perfectly. MAE shows how close the predictions are to the actual outcomes; a value of MAE close to 0 indicates small prediction errors.
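A minimal sketch of Eqs. (10)–(12) for a short placeholder vector of predicted and experimental colour changes:

```matlab
% Minimal sketch of Eqs. (10)-(12) for predicted (xp) and experimental
% (xd) colour changes; the vectors below are placeholders.
xp = [0.12 0.35 0.48 0.22 0.61];
xd = [0.10 0.33 0.52 0.25 0.58];

N    = numel(xd);
xbar = mean(xd);                                     % mean experimental output

R2  = 1 - sum((xp - xd).^2)/sum((xd - xbar).^2);     % Eq. (10)
MSE = sum((xp - xd).^2)/N;                           % Eq. (11)
MAE = sum(abs(xp - xd))/N;                           % Eq. (12)
```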

5 Hybrid intelligent system

Figure 3 provides a schematic diagram of the simulation system. The proposed hybrid RSM–ANN–GA system is described briefly below. This system includes the following steps:

Fig. 3 Block diagram of the simulation process

  • Step 1 Collection of the data set

  • Step 2 RSM designs the experiment and builds a fitness function

  • Step 3 GA optimises ANN architecture

  • Step 4 ANN predicts the colour change

The algorithm proceeded with its iterations until the specified performance criterion was satisfied. More details of the applied RSM and GA algorithms are given in Sects. 5.1 and 5.2.

5.1 Response surface method

The optimum architecture of the ANN was determined on the basis of runs designed by response surface methodology. A face-centred central composite design (CCD) of five numerical factors (number of neurons in the hidden layer, learning rate, momentum constant, number of training epochs and number of training runs), each at three levels, was selected. The experimental design matrix (see Table 2) consisted of 50 sets of conditions, comprising a full replication of the five-factor factorial design (32 points), 10 star points and 8 centre points. The upper and lower limits of the parameters were coded as +1 and −1, respectively.

Table 2 Mean square error (MSE) results obtained with various neural network configurations

Fifty different configurations of the proposed ANN (Table 2) were designed using the Design-Expert (DOE) software and trained, with the MSE on the validation data set taken as the response. The first stage of this methodology is to find a suitable approximation of the true functional relationship between the response and the set of independent variables [54, 55]. The response variable was transformed with the natural logarithm, which makes its distribution closer to the normal distribution and improves the fit of the model to the data. The experimental results of the CCD were fitted with a second-order polynomial equation using a multiple regression technique. Equation (13) represents the quadratic model used for predicting the optimal point

$$Y = \beta_{0} + \mathop \sum \limits_{i = 1}^{k} \beta_{i} x_{i} + \mathop \sum \limits_{i = 1}^{k} \beta_{ii} x_{i}^{2} + \mathop \sum \limits_{i = 1}^{k - 1} \mathop \sum \limits_{j = i + 1}^{k} \beta_{ij} x_{i} x_{j} + \varepsilon_{ij}$$
(13)

where Y is the response (MSE), \(\beta_{0} , \beta_{i} , \beta_{ii} ,\) and β ij are regression coefficients (intercept, linear, quadratic and interaction, respectively), x i and x j are the independent variables, k is the number of factors, and ɛ ij is the error observed in the response.
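The fitting step can be sketched in MATLAB as an ordinary least-squares regression of ln(MSE) on the coded factors; the design matrix and responses below are random placeholders standing in for the 50 CCD runs of Table 2.

```matlab
% Minimal sketch: least-squares fit of the second-order model, Eq. (13).
% Xc holds the coded factor levels (-1..+1) of the CCD runs and y the
% ln(MSE) responses; both are random placeholders standing in for Table 2.
nRuns = 50;  k = 5;
Xc = 2*rand(nRuns, k) - 1;
y  = -4 + randn(nRuns, 1);

% Regression matrix: intercept, linear, quadratic and interaction terms
A = [ones(nRuns, 1), Xc, Xc.^2];
for i = 1:k-1
    for j = i+1:k
        A = [A, Xc(:, i).*Xc(:, j)];   % beta_ij * x_i * x_j columns
    end
end

beta = A \ y;                          % estimated regression coefficients
yhat = A*beta;                         % fitted ln(MSE) values
```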

5.2 Genetic algorithm

GAs use the evolutionary principle of survival of the best-adapted chromosomes [56]. A group of chromosomes is called a population; every population has the same size, referred to as the population size. According to some researchers [32, 33, 57], a suitable population size is about 20–30 chromosomes. However, population sizes of 50–80 have sometimes led to the best answers [58, 59]. With a large population size, the GA searches the solution space more thoroughly, thereby reducing the chance that the algorithm returns a local minimum that is not a global one. On the other hand, a large population size causes the algorithm to run more slowly [58, 60].

The main data structures in the GA toolbox are chromosomes, objective function values and fitness values. The chromosome data structure stores an entire population in a single matrix of size Nind × Lind, where Nind is the number of individuals in the population and Lind is the length of the genotypic representation of these individuals. Each row corresponds to an individual's genotype, consisting of base-n, typically binary, values

$${\text{Chrom}} = \left[ {\begin{array}{*{20}c} {g_{1,1} } & {g_{1,2} } & {g_{1,3} } & \ldots & {g_{{1,{\text{Lind}}}} } \\ {g_{2,1} } & {g_{2,2} } & {g_{2,3} } & \ldots & {g_{{2,{\text{Lind}}}} } \\ {g_{3,1} } & {g_{3,2} } & {g_{3,3} } & \ldots & {g_{{3,{\text{Lind}}}} } \\ \vdots & \vdots & \vdots & {} & \vdots \\ {g_{{{\text{Nind}},1}} } & {g_{{{\text{Nind}},2}} } & {g_{{{\text{Nind}},3}} } & \ldots & {g_{{{\text{Nind}},{\text{Lind}}}} } \\ \end{array} } \right]\begin{array}{*{20}c} {{\text{individual}}\;1} \\ {{\text{individual}}\;2} \\ {{\text{individual}}\;3} \\ \vdots \\ {{\text{individual}}\;{\text{Nind}}} \\ \end{array}$$

Such a data representation does not impose a particular chromosome structure, requiring only that all chromosomes be of equal length. Thus, structured populations or populations with varying genotypic bases may be used in the GA toolbox, provided that a suitable decoding function, mapping chromosomes onto phenotypes, is employed.

The decision variables (phenotypes) in the GA are obtained by applying a mapping from the chromosome representation onto the decision variable space. Here, each string contained in the chromosome structure decodes to a row vector of order Nvar, according to the number of dimensions in the search space and corresponding to the decision variable vector. The decision variables are stored in a numerical matrix of size Nind × Nvar. Again, each row corresponds to a particular individual's phenotype. An example of the phenotype data structure is given below, where bin2real represents an arbitrary decoding function, possibly from the GA toolbox, mapping genotypes onto phenotypes.

$$\begin{aligned} {\text{Phen}} & = {\text{bin2real}}\left( {\text{Chrom}} \right)\;\% \;{\text{map}}\;{\text{genotype}}\;{\text{to}}\;{\text{phenotype}} \\ {\text{Phen}} & = \left[ {\begin{array}{*{20}c} {x_{1,1} } & {x_{1,2} } & {x_{1,3} } & \ldots & {x_{{1,{\text{Nvar}}}} } \\ {x_{2,1} } & {x_{2,2} } & {x_{2,3} } & \ldots & {x_{{2,{\text{Nvar}}}} } \\ {x_{3,1} } & {x_{3,2} } & {x_{3,3} } & \ldots & {x_{{3,{\text{Nvar}}}} } \\ \vdots & \vdots & \vdots & {} & \vdots \\ {x_{{{\text{Nind}},1}} } & {x_{{{\text{Nind}},2}} } & {x_{{{\text{Nind}},3}} } & \ldots & {x_{{{\text{Nind}},{\text{Nvar}}}} } \\ \end{array} } \right]\;\begin{array}{*{20}c} {{\text{individual}}\;1} \\ {{\text{individual}}\;2} \\ {{\text{individual}}\;3} \\ \vdots \\ {{\text{individual}}\;{\text{Nind}}} \\ \end{array} \\ \end{aligned}$$
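A minimal sketch of such a decoding (in the spirit of bin2real) is shown below; it assumes binary genotypes, a fixed number of bits per variable and a linear mapping onto the variable ranges of Table 1, none of which is prescribed in this exact form by the toolbox.

```matlab
% Minimal sketch of a genotype-to-phenotype decoding (bin2real-like).
% Each row of Chrom is a binary string; consecutive groups of nBits
% decode to one decision variable, scaled linearly onto [lb, ub].
Nind = 4;  Nvar = 5;  nBits = 10;
Chrom = round(rand(Nind, Nvar*nBits));      % random binary population

lb = [ 2 0.01 0.1  300 3];                  % lower bounds (Table 1 ranges)
ub = [16 0.40 0.9 5000 7];                  % upper bounds

weights = 2.^(nBits-1:-1:0);                % binary place values
Phen = zeros(Nind, Nvar);
for v = 1:Nvar
    bits = Chrom(:, (v-1)*nBits + (1:nBits));
    ints = bits*weights';                   % decoded integers 0..2^nBits-1
    Phen(:, v) = lb(v) + (ub(v) - lb(v))*ints/(2^nBits - 1);
end
```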

An objective function is used to evaluate the performance of the phenotypes in the problem domain. Objective function values can be scalar or, in the case of multi-objective problems, vectorial. These values are not necessarily the same as the fitness values. Objective function values are stored in a numerical matrix of size Nind × Nobj, where Nobj is the number of objectives. Each row corresponds to a particular individual's objective vector.

$$\begin{aligned} {\text{Objv}} & = {\text{OBJFUN}}\left( {\text{Phen}} \right)\;\% \;{\text{Objective}}\;{\text{Function}} \\ {\text{Objv}} & = \left[ {\begin{array}{*{20}c} {y_{1,1} } & {y_{1,2} } & {y_{1,3} } & \ldots & {y_{{1,{\text{Nobj}}}} } \\ {y_{2,1} } & {y_{2,2} } & {y_{2,3} } & \ldots & {y_{{2,{\text{Nobj}}}} } \\ {y_{3,1} } & {y_{3,2} } & {y_{3,3} } & \ldots & {y_{{3,{\text{Nobj}}}} } \\ \vdots & \vdots & \vdots & {} & \vdots \\ {y_{{{\text{Nind}},1}} } & {y_{{{\text{Nind}},2}} } & {y_{{{\text{Nind}},3}} } & \ldots & {y_{{{\text{Nind}},{\text{Nobj}}}} } \\ \end{array} } \right]\begin{array}{*{20}c} {{\text{individual}}\;1} \\ {{\text{individual}}\;2} \\ {{\text{individual}}\;3} \\ \vdots \\ {{\text{individual}}\;{\text{Nind}}} \\ \end{array} \\ \end{aligned}$$

Fitness values are derived from the objective function values through a scaling or ranking function. Fitnesses are non-negative scalars and are stored in a column vector of length Nind, an example of which is shown below. Here, ranking is an arbitrary fitness (ranking) function [61].

$${\text{Fitn}} = {\text{ranking}}\left( {\text{ObjV}} \right) = \left[ {\begin{array}{*{20}c} {f_{1} } \\ {f_{2} } \\ {f_{3} } \\ \vdots \\ {f_{\text{Nind}} } \\ \end{array} } \right]\begin{array}{*{20}c} {{\text{individual}}\;1} \\ {{\text{individual}}\;2} \\ {{\text{individual}}\;3} \\ \vdots \\ {{\text{individual}}\;{\text{Nind}}} \\ \end{array}$$

The general steps of a genetic algorithm are presented in Fig. 4. The algorithm encodes a possible solution to a particular problem on a simple chromosome string and applies specified operators to the chromosomes in order to preserve critical information and produce a new population, with the aim of generating strings that map to high function values [36]. The main GA operators are selection, crossover and mutation (see Table 3). Roulette wheel selection was used in this study: it simulates a roulette wheel in which the area of each segment is proportional to its expectation, and the GA then uses a random number to select one of the sections with a probability equal to its area. The next main operator is crossover, which combines two parent chromosomes to form a child for the next generation. A single-point crossover function was applied in this study: it chooses a random integer n between 1 and the number of variables, selects the vector entries numbered less than or equal to n from the first parent, selects the genes numbered greater than n from the second parent, and concatenates these entries to form the child.

Fig. 4 General structure of the genetic algorithm

Table 3 Genetic algorithm operators

For example, if p1 and p2 are the parents

$$\begin{aligned} p1 \, & = \, \left[ {{\text{a}}\;{\text{b}}\;{\text{c}}\;{\text{d}}\;{\text{e}}\;{\text{f}}\;{\text{g}}\;{\text{h}}} \right] \\ p2 \, & = \, \left[ {1 \, 2 \, 3 \, 4 \, 5 \, 6 \, 7 \, 8} \right] \\ \end{aligned}$$

and the crossover point is 3, the function returns the following child

$${\text{child}} = \left[ {{\text{a}}\;{\text{b}}\;{\text{c}}\;4\;5\;6\;7\;8} \right].$$

The next GA operator is mutation. This operator makes small random changes in the individuals of the population, which provide genetic diversity and enable the GA to search a broader space. Uniform mutation was applied in the simulation process. In this case, the GA selects a fraction of the vector entries of a chromosome for mutation, where each entry has a probability of being mutated equal to the mutation rate. The algorithm then replaces each selected entry by a random number selected uniformly from the range of that entry.
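The three operators can be sketched in plain MATLAB as follows; the population, fitness values and mutation rate are placeholders, and this is an illustration rather than the toolbox implementation used in the study.

```matlab
% Minimal sketches of the three GA operators described above.
Nind = 6;  Lind = 8;  mutRate = 0.01;
Chrom = round(rand(Nind, Lind));           % binary population (placeholder)
Fitn  = rand(Nind, 1);                     % non-negative fitness values

% Roulette wheel selection: parent indices drawn with probability
% proportional to fitness
prob = cumsum(Fitn/sum(Fitn));
parent1 = find(rand <= prob, 1, 'first');
parent2 = find(rand <= prob, 1, 'first');

% Single-point crossover: genes 1..n from parent 1, the rest from parent 2
n = ceil(rand*(Lind - 1));                 % random crossover point
child = [Chrom(parent1, 1:n), Chrom(parent2, n+1:end)];

% Uniform mutation: each gene is mutated with probability mutRate and
% replaced by a random value from its range (here the binary range {0,1})
mask = rand(1, Lind) < mutRate;
child(mask) = round(rand(1, sum(mask)));
```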

The simulation process took 58 s. The computer simulations were conducted on a computer with the following specification: Intel Core i5-2400S processor (2.50 GHz), 6 GB of memory, and commercially available ANN software, MATLAB 7.0 [56].

6 Results

6.1 Statistical test results

The MSE results on the validation data set are given in Table 2. In this study, a cubic model (CM) was chosen to correlate the neural network effective factors with the response ln(MSE). This model was selected owing to its high R 2 and non-significant lack of fit. The selected CM was then reduced to a modified cubic model (MCM) on the basis of low P values. The results of the reduced CM in the form of ANOVA are shown in Table 4. The ANOVA gives a very high F value (121.20) and a very low P value (<0.0001), which implies that the model is significant. The predicted R 2 (0.973) is very close to the adjusted R 2 (0.982); the difference is <0.2. Moreover, the adequate precision, which measures the signal-to-noise ratio (37.359), indicates an adequate signal.

Table 4 ANOVA for predicted RSM model

The results of the statistical test show that the first-order effect of the number of neurons was the most significant term in the estimation of ln(MSE), followed by the number of training epochs and lr, respectively, whereas mc and the number of training runs had no significant effect on the response. Similar results for lr and the training epochs were reported in [33, 34].

6.2 Mathematical model results

Finally, the MCM in terms of coded values is:

$$\begin{aligned} { \ln }\left( {\text{MSE}} \right) & = - 4.48 - 0.611 \cdot x_{1} - 0.463 \cdot x_{2} + 0.025 \cdot x_{3} - 0.82 \cdot x_{4} + 0.097 \cdot x_{5} \\ \quad + 0.008 \cdot x_{1} \cdot x_{2} - 0.08 \cdot x_{2} \cdot x_{3} - 0.13 \cdot x_{1} \cdot x_{4} + 0.11 \cdot x_{1} \cdot x_{5} - 0.05 \cdot x_{2} \cdot x_{3} - 0.032 \cdot x_{2} \cdot x_{4} \\ \quad + 0.08 \cdot x_{3} \cdot x_{4} + 0.069 \cdot x_{3} \cdot x_{5} + 0.17 \cdot x_{4} \cdot x_{5} + 0.37 \cdot x_{1}^{2} + 0.41 \cdot x_{2}^{2} + 0.79 \cdot x_{4}^{2} - 0.20 \cdot x_{5}^{2} \\ \quad + 0.20 \cdot x_{1} \cdot x_{2} \cdot x_{4} + 0.051 \cdot x_{1} \cdot x_{3} \cdot x_{4} - 0.27 \cdot x_{1}^{2} \cdot x_{2} + 0.36 \cdot x_{1}^{2} \cdot x_{4} - 0.36 \cdot x_{1}^{2} \cdot x_{5} \\ \end{aligned}$$
(14)

where \(x_{1}, x_{2}, x_{3}, x_{4}\) and \(x_{5}\) are the parameters defined in Table 1. This model was checked hierarchically. The above statistical estimators indicate an adequate neural model with optimal structure that can be used for prediction of the colour change in rehydrated apple cubes.
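For illustration, Eq. (14) can be evaluated as a fitness function as sketched below. The actual parameter values are coded to [−1, +1] using the ranges listed in Sect. 4.4 (Table 1), and the polynomial coefficients are transcribed from Eq. (14) as printed.

```matlab
% Minimal sketch: Eq. (14) as the GA fitness function in coded variables.
% Actual parameter values are coded to [-1, +1] using the ranges of
% Table 1 (neurons 2-16, lr 0.01-0.4, mc 0.1-0.9, epochs 300-5000,
% training runs 3-7); coefficients are transcribed from Eq. (14).
lb = [ 2 0.01 0.1  300 3];
ub = [16 0.40 0.9 5000 7];
code = @(p) 2*(p - lb)./(ub - lb) - 1;        % actual -> coded values

lnMSE = @(x) -4.48 - 0.611*x(1) - 0.463*x(2) + 0.025*x(3) - 0.82*x(4) ...
    + 0.097*x(5) + 0.008*x(1)*x(2) - 0.08*x(2)*x(3) - 0.13*x(1)*x(4) ...
    + 0.11*x(1)*x(5) - 0.05*x(2)*x(3) - 0.032*x(2)*x(4) + 0.08*x(3)*x(4) ...
    + 0.069*x(3)*x(5) + 0.17*x(4)*x(5) + 0.37*x(1)^2 + 0.41*x(2)^2 ...
    + 0.79*x(4)^2 - 0.20*x(5)^2 + 0.20*x(1)*x(2)*x(4) ...
    + 0.051*x(1)*x(3)*x(4) - 0.27*x(1)^2*x(2) + 0.36*x(1)^2*x(4) ...
    - 0.36*x(1)^2*x(5);

p = [13 0.33 0.89 3869 3];                    % optimised parameters (see Table 5)
fitnessValue = lnMSE(code(p));                % quantity minimised by the GA
```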

6.3 RSM and contour plot results

Figure 5 shows the normal probability plot of the internally studentised residuals. As can be seen from this plot, the residuals follow a straight line very closely. Moreover, the residuals scatter randomly (Fig. 6), suggesting that the variance of the original observations is constant for all responses. It can therefore be concluded from these plots that the empirical model is suitable for describing the relationships between the design variables by RSM.

Fig. 5 Normal probability of the internally studentised residuals

Fig. 6 Internally studentised residuals versus predicted response

Figure 7 shows the response surfaces (RSs) and contour plots (CPs) obtained with the Design-Expert (DOE) software. Each graph represents a combination of two factors at a time, with all other factors held at the middle level. The effect of different numbers of neurons and training epochs on ln(MSE) can be predicted from the RSs and CPs shown in Fig. 7a, b. The minimum value of MSE is found at 3000–4000 epochs and an lr of 0.25–0.35 (Fig. 7a), and a similar range of epochs was observed in relation to the number of neurons (Fig. 7b). The CPs show that, as the number of neurons increases from 2 to 16 and lr from 0.01 to 0.4, ln(MSE) decreases to −5 (see Fig. 7c).

Fig. 7 Response surface and contour plots of ln(MSE) for: a training epoch and learning rate, b training epoch and number of neurons in the hidden layer and c learning rate and number of neurons in the hidden layer

The response surface plot for the momentum constant and the learning rate is shown in Fig. 8a. The CP shows that, as lr increases from 0.01 to 0.4 and mc from 0.1 to 0.5, ln(MSE) decreases to −4.615. Figure 8b shows the contour plot in which, as the number of training epochs increases from 3000 to 4500 and mc increases to 0.5, ln(MSE) decreases to −4.75.

Fig. 8 Response surface and contour plots of ln(MSE) for: a learning rate and momentum constant and b training epoch and momentum constant

6.4 GA optimisation results

The fitness function, Eq. (14), developed from the RSM model was applied in the search for the ANN topology used to predict the colour change in rehydrated apple cubes. The GA minimised this function, ln(MSE), within the experimental ranges presented in Table 1.

As can be noticed in Fig. 9a, the optimisation terminated when the number of generations exceeded the maximum of 2000 iterations. The objective function value ln(MSE) = −5.47257 was obtained for the final points presented in Fig. 9b. Table 5 shows the optimised ANN parameters. The optimum values were as follows: number of neurons = 13, training epochs = 3869, lr = 0.33, mc = 0.89 and number of training runs = 3.

Fig. 9 Results of genetic algorithm optimisation: a converged values of drying and rehydration parameters, b convergence of fitness values

Table 5 Optimised ANN parameters

6.5 Errors of model results

Next, the ANN with the proposed topology was trained and tested. As can be seen from Fig. 10a, training stopped when the validation error increased after 1147 iterations. The result is sensible because the MSE is very small. As can be seen from the graph (Fig. 10a), the test and validation errors have similar characteristics; furthermore, no significant overfitting occurred [58]. Finally, the MSE of the optimal ANN topology was equal to 0.0072095 (see Fig. 10a).

Fig. 10 Performance goal of the network: a MSE on the validation samples, b regression graph for observed and predicted values of the colour change

Figure 10b shows the ANN regression plots between the network outputs and the targets. The R values in each case are greater than 0.95; therefore, the fit is reasonably good for all data sets. Additionally, MAE and R for the colour change in rehydrated apple cubes were estimated as 0.0259 and 0.96475 for the training, 0.0399 and 0.95243 for the testing and 0.0264 and 0.95151 for the validation data sets. Therefore, the topology with 4 inputs, 13 neurons in 1 hidden layer and 1 output (4–13–1) was applied for predicting the colour change in rehydrated apple cubes.

Comparing the results of the GA-based simulation with the results in Table 2 (ID 27), it can be seen that the MSE of the optimised ANN topology (0.0072) is smaller than the error presented in Table 2 (MSE = 0.0074). Moreover, both the number of hidden neurons and the number of epochs are smaller in the case of the ANN optimised by the genetic algorithm.

6.6 Validation model results

According to the authors [32–34], a trained neural network must have high predictive ability for new data. Therefore, 40 data sets of colour change obtained from a new experimental run (drying air temperature = 55 °C, drying air velocity = 0.52 m/s, rehydration temperature = 35 °C and rehydration time = 35 min) were used for the verification of the developed model. The regression result of testing the model with the new samples is shown in Fig. 11. It can be seen that the genetic algorithm was successfully applied to optimise the neural network topology. Moreover, the optimised ANN topology was efficient in predicting the colour change in the rehydrated apple cubes. This system can also be used for optimising the topology of neural networks describing other engineering problems.

Fig. 11 Comparison of predicted and desired output values using the optimal neural network topology

7 Conclusions

The following conclusions can be drawn from the investigations conducted in this work:

  1. An efficient hybrid intelligent approach was proposed to find the optimal topology of neural networks.

  2. The optimal ANN topology was more precise in predicting the colour change in the rehydrated apple cubes, with a low mean square error (0.0072095) and a high regression coefficient (0.96).

  3. The optimum neural model reached its minimum error when the number of hidden neurons, learning rate, momentum constant, number of epochs and number of training runs were equal to 13, 0.33, 0.89, 3869 and 3, respectively.

  4. The results of testing the model on new trials showed excellent agreement between the actual and predicted data, with a coefficient of determination equal to 0.97.

  5. This optimisation method significantly reduces the number of experiments compared with more expansive learning methods.