Application of a genetic algorithm and an artificial neural network for global prediction of the toxicity of phenols to Tetrahymena pyriformis

Habibi-Yangjeh, Aziz; Danandeh-Jenagharad, Mohammad

doi:10.1007/s00706-009-0185-8


Original Paper
Open access
Published: 13 October 2009

Volume 140, pages 1279–1288, (2009)

Monatshefte für Chemie - Chemical Monthly


Aziz Habibi-Yangjeh¹ &
Mohammad Danandeh-Jenagharad¹


Abstract

Genetic algorithm–multiparameter linear regression (GA-MLR) and genetic algorithm–artificial neural network (GA-ANN) global models have been used for prediction of the toxicity of phenols to Tetrahymena pyriformis. The data set was divided into 150 molecules for the training set, 50 molecules for the validation set, and 50 molecules for the prediction set. A large number of descriptors were calculated, and the genetic algorithm was used to select the variables that gave the best-fitting models. The six molecular descriptors selected were used as inputs for the models. The MLR model was validated using leave-one-out and leave-group-out cross-validation and an external test set. A three-layered feed-forward ANN with back-propagation of error was generated using the six molecular descriptors appearing in the MLR model. Comparison of the results obtained using the ANN model with those from the MLR model revealed the superiority of the ANN model. The root mean square errors of the training, validation, and prediction sets for the ANN model were 0.224, 0.202, and 0.224, and the corresponding correlation coefficients (r²) were 0.926, 0.943, and 0.925. The improvements arise from non-linear correlations between the toxicity of the compounds and the selected descriptors. The prediction ability of the GA-ANN global model is much better than that of previously proposed models.


Introduction

Toxicological assessment of phenolic compounds is essential for risk-assessment purposes. Compounds with a single aromatic ring substituted with a hydroxyl group (the phenols) are ubiquitous in nature and are used in many industries, including those involving textiles, leather, paper, and oil. They are also commonly used as food additives and are frequently utilized in agriculture [1]. There has therefore been great interest in assessing the toxicity of such compounds. The potential hazard of untested chemicals, a challenge confronting national and international regulatory agencies [2–5], can be measured by experimental investigations, but this approach is both expensive and time-consuming. The development of computational methods as an alternative tool for predicting the properties of chemicals has consequently been a subject of intensive study. Among computational methods, quantitative structure–activity relationships (QSAR) have found diverse applications for predicting the properties of compounds, including biological activity prediction [6], physical property prediction [7], and toxicity prediction [8, 9]. Quantitative structure–property relationship (QSPR) and QSAR models are essentially calibration models in which the independent variables are molecular descriptors that describe the structure of the molecules and the dependent variable is the property or activity of interest.

In QSAR studies, model-construction techniques such as multiple linear regression (MLR) and artificial neural networks (ANN) have been used to examine linear and nonlinear relationships between the activity of interest and the molecular descriptors. Artificial neural networks have become popular in QSPR/QSAR modeling because of their success where complex non-linear relationships exist among the data [10, 11]. An ANN is formed from artificial neurons connected by coefficients (weights), which constitute the neural structure and are organized in layers. The layers of neurons between the input and output layers are called hidden layers. Neural networks do not need an explicit formulation of the mathematical or physical relationships of the problem handled, which gives ANNs an advantage over traditional fitting methods for some chemical applications. For these reasons, ANNs have in recent years been applied to a wide variety of chemical problems [12–20]. Application of these techniques usually requires selection of variables to build well-fitting models, and genetic algorithms (GA) are now well established and widely used for variable selection [21–23]. GA are stochastic methods used to solve optimization problems defined by fitness criteria, by applying Darwin's hypothesis of evolution and genetic operators such as crossover and mutation.

QSAR models have been used to predict the toxicity of phenols [1, 24, 25]. Two approaches have been suggested in this and in similar QSAR modeling. The first is the development of “global” models, which are defined as QSAR models that cover a number of different mechanisms of action for a given toxicological endpoint. The use of the term “global model” in this study is distinct from that used to define QSAR models based on chemicals with similar modes of action allowing interspecies correlations. The second is the development of a number of “local” models, each covering a single mechanism of action present in the database [26]. Very recently, Enoch et al. [26] used a global QSAR method for prediction of the toxicity of phenols. The predictive ability of that global QSAR model is poor: the correlation coefficients (r²) of the model are 0.71 and 0.73 for the training and test sets, respectively [26].

In order to predict accurately the toxicity of these compounds, in this work genetic algorithm–multiparameter linear regression (GA-MLR) and genetic algorithm–artificial neural network (GA-ANN) global models were used to generate QSAR models between the descriptors and toxicity of 250 phenols with diverse chemical structures. The results obtained were compared with each other, with those from previous work [26], and with the experimental values.

Results and discussion

The genetic algorithm technique was used to select the most important descriptors. To find the optimum number of descriptors, models containing from one to ten descriptors were investigated.

The R² value can generally be increased by adding further predictor variables to the model, even if an added variable does not contribute to reducing the unexplained variance of the dependent variable. The use of R² therefore requires special attention. For this reason, it is better to use another statistical parameter, the adjusted R² (R²_adj), where R²_adj is defined by Eq. 1, in which n is the number of compounds and p is the number of predictor variables.

$$ R^{2}_{\text{adj}} = 1- \left( { 1- R^{2} } \right)\left( {{\frac{n - 1}{n - p - 1}}} \right) $$

(1)

R²_adj is interpreted similarly to the R² value, but it also takes the number of degrees of freedom into account. It is adjusted by dividing the residual sum of squares and the total sum of squares by their respective degrees of freedom. The R²_adj value decreases if a variable added to the equation does not reduce the unexplained variance [27]. Consequently, R²_adj is used to compare models with different numbers of predictor variables.
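As a minimal sketch of Eq. 1, the adjustment can be written as a small Python function. The function name `adjusted_r2` is an assumption made for this illustration, and the example call reuses the training-set r² of 0.747 reported for the six-descriptor GA-MLR model (150 compounds) purely as an illustrative input.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 (Eq. 1) for n observations and p predictor variables."""
    if n - p - 1 <= 0:
        raise ValueError("Eq. 1 requires n > p + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Illustrative input: r^2 = 0.747, 150 training compounds, 6 descriptors.
r2_adj = adjusted_r2(0.747, n=150, p=6)
print(round(r2_adj, 3))  # 0.736
```

With p = 0 the correction factor equals 1, so the function returns R² unchanged; each added descriptor increases the penalty applied to the unexplained variance.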

Another statistical parameter is the standard error of the estimate (s), which measures the dispersion of the observed values about the regression line. The lower the s value, the higher the reliability of the prediction. Figure 1 shows plots of R², R²_adj, and s for the training set as a function of the number of descriptors (1–10) in the models. R² and R²_adj increased, and s decreased, with increasing number of descriptors. Because models with 7–10 descriptors did not significantly improve the statistics, the optimum subset size was taken to be six descriptors.

The selected variables and the correlation matrix of the descriptors are listed in Table 1. The correlation coefficient of each pair of descriptors is less than 0.65, which indicates that the selected descriptors are not strongly intercorrelated.

Table 1 Correlation coefficient matrix of the selected descriptors


To examine the relative importance, and the contribution of each descriptor in the model, for each descriptor the value of the mean effect (MF) was calculated. This calculation was performed by use of Eq. 2.

$$ {\text{MF}}_{j} = {\frac{{\beta_{j} \sum\nolimits_{i = 1}^{i = n} {d_{ij} } }}{{\sum\nolimits_{j}^{m} {\beta_{j} \sum\nolimits_{i}^{n} {d_{ij} } } }}} $$

(2)

MF_j represents the mean effect of descriptor j, β_j is the coefficient of descriptor j, d_ij is the value of descriptor j for molecule i, m is the number of descriptors in the model, and n is the number of molecules. The MF value indicates the relative importance of a descriptor compared with the other descriptors in the model. Its sign shows the direction of variation in the toxicity values as a result of an increase (or reduction) of the descriptor values. The mean effect values are −0.043, 1.071, −0.081, 0.035, −0.004, and 0.023 for Xt, MATS1m, PJI3, Mor23u, nCs, and H-046, respectively. By interpreting the descriptors contained in the model, it is possible to gain useful chemical insights into the toxicity of phenols. For this reason, an interpretation of the QSAR results is provided below.
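The mean effect of Eq. 2 can be sketched in a few lines of Python. The function name `mean_effects` and the toy coefficient and descriptor values below are hypothetical and serve only to show the arithmetic; they are not data from this study.

```python
from typing import List, Sequence


def mean_effects(beta: Sequence[float], d: Sequence[Sequence[float]]) -> List[float]:
    """Mean effect of each descriptor (Eq. 2).

    beta[j] is the regression coefficient of descriptor j and
    d[i][j] is the value of descriptor j for molecule i.
    """
    m = len(beta)
    # beta_j * sum_i d_ij for every descriptor j
    weighted = [beta[j] * sum(row[j] for row in d) for j in range(m)]
    total = sum(weighted)
    if total == 0:
        raise ZeroDivisionError("denominator of Eq. 2 is zero")
    return [w / total for w in weighted]


# Hypothetical toy data: 3 molecules, 2 descriptors.
d = [[1.0, 2.0], [2.0, 0.0], [3.0, 2.0]]
beta = [0.5, -0.25]
mf = mean_effects(beta, d)
print(mf)  # [1.5, -0.5]
```

Because every MF_j shares the same denominator, the mean effects of a model sum to 1 by construction. The six values reported above add up to 1.001, which agrees with this property within rounding.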

The first descriptor appearing in the model is Xt (total structure connectivity index). Connectivity indices are among the most popular topological indices and are calculated from the vertex degrees of the atoms in the H-depleted molecular graph. Xt is a connectivity index that accounts simultaneously for all the atoms in the graph. The total structure connectivity index is also the square root of the simple topological index proposed for measuring molecular branching [28]. The mean effect of Xt has a negative sign, which indicates that an increase in molecular branching leads to a decrease in the pIG₅₀ value.

The second descriptor is MATS1m (Moran autocorrelation of lag 1, weighted by atomic masses), which is a 2D autocorrelation descriptor. The Moran coefficient is a distance-type autocorrelation function in which the weighting property can be any physicochemical property calculated for each atom of the molecule, for example atomic mass or polarizability. The atoms of the molecule thus represent a set of discrete points in space, and the atomic property is the function evaluated at those points. The Moran coefficient usually takes a value in the interval [−1, +1]; positive autocorrelation corresponds to positive values of the coefficient, whereas negative autocorrelation produces negative values. The physicochemical property in this case is the atomic mass. The mean effect of MATS1m is positive and larger than that of the other descriptors, which indicates that this descriptor has a significant effect on the toxicity and that the pIG₅₀ value is directly related to it. Hence, it was concluded that increasing the molecular mass increases the value of this descriptor and, in turn, the pIG₅₀ value.

The third descriptor is PJI3 (3D Petitjean shape index), which is a geometrical descriptor. The Petitjean shape index is a topological anisometry descriptor, also called a graph-theoretical shape coefficient, that is calculated from the topological radius and the topological diameter obtained from the distance matrix representing the molecular graph considered. The mean effect of PJI3 has a negative sign, which indicates that the pIG₅₀ is inversely related to this descriptor.

Mor23u is the fourth descriptor appearing in the model. It is a 3D-MoRSE descriptor. 3D-MoRSE descriptors (3D molecule representation of structures based on electron diffraction) are derived from infrared spectra simulation using a generalized scattering function [28]. This descriptor corresponds to signal 23, unweighted. The mean effect of Mor23u has a positive sign, which indicates that the pIG₅₀ is directly related to this descriptor.

The fifth descriptor is nCs, which is one of the functional group count descriptors. nCs represents the total number of secondary C(sp³) atoms. The mean effect of nCs has a negative sign, which indicates that an increase in the number of secondary C(sp³) atoms in the molecule leads to a decrease in its pIG₅₀ value.

The final descriptor of the model is H-046 (H attached to C0(sp³)). It is one of the atom-centered fragment descriptors, which describe each atom by its own atom type and by the bond types and atom types of its first neighbors. This descriptor represents the first neighbor (hydrogen) of carbon atoms. Its mean effect has a positive sign, which indicates that the pIG₅₀ is directly related to this descriptor.

In summary, it is concluded that the molecular branching, the molecular mass, the molecular shape, the number of secondary C(sp³) of molecules, and the first neighbor (hydrogen) of carbon atoms are of major importance in the toxicity of the compounds studied.

Genetic algorithm: multiparameter linear regression

We used a GA for selection of the most relevant descriptors. Multiparameter linear correlation of pIG ₅₀ values for 150 different phenolic compounds in the training set was achieved by the GA by use of the six descriptors selected, and the following equation was obtained:

$$ \begin{aligned} pIG_{50} = & - 15.05\left( { \pm 1.66} \right) - 15.77\left( { \pm 2.00} \right){\text{Xt}} + 17.84\left( { \pm 1.58} \right){\text{MATS1m}} \\ & \quad - 1.84\left( { \pm 0.31} \right){\text{PJI3}} - 1.23\left( { \pm 0.18} \right){\text{Mor23u}} - 0.12\left( { \pm 0.04} \right){\text{nCs}} \\ & \quad + 0.14\left( { \pm 0.01} \right){\text{H-046}} \\ \end{aligned} $$

(3)

The model was then used to predict pIG₅₀ values for the compounds in the validation and prediction sets. The prediction results are given in Table 2. The calculated values of pIG₅₀ for the compounds in the training, validation, and prediction sets using the GA-MLR model have also been plotted versus their experimental values (Fig. 2). The correlation coefficients, r², obtained were 0.747 for the training set, 0.721 for the validation set, and 0.516 for the prediction set. Table 3 shows the root mean square error (RMSE) and r² of the model for the total, training, validation, and prediction sets.
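For illustration, the linear model of Eq. 3 can be evaluated with a short Python function. The coefficients are those of Eq. 3; the function name `predict_pig50` and the dictionary layout are assumptions of this sketch, and the descriptor values themselves would have to come from the Dragon calculation described in the methodology.

```python
from typing import Dict

# Intercept and coefficients of Eq. 3 (standard errors omitted).
INTERCEPT = -15.05
COEFFS: Dict[str, float] = {
    "Xt": -15.77,
    "MATS1m": 17.84,
    "PJI3": -1.84,
    "Mor23u": -1.23,
    "nCs": -0.12,
    "H-046": 0.14,
}


def predict_pig50(descriptors: Dict[str, float]) -> float:
    """Evaluate the GA-MLR equation for one compound."""
    missing = set(COEFFS) - set(descriptors)
    if missing:
        raise KeyError(f"missing descriptors: {sorted(missing)}")
    return INTERCEPT + sum(COEFFS[name] * descriptors[name] for name in COEFFS)
```

Calling the function with all six descriptors set to zero returns the intercept, which is a quick check that the coefficients were entered correctly.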

Table 2 Experimental values of the toxicity of phenols to Tetrahymena pyriformis (pIG ₅₀) and the values calculated by the GA-MLR and GA-ANN global models


Table 3 Comparison of statistical data obtained by the GA-MLR and GA-ANN models for the toxicity (pIG ₅₀) of phenols


The model obtained was validated using the leave-one-out (LOO) and leave-group-out (LGO) cross-validation processes. For LOO cross-validation, a data point is removed from the set and the model is recalculated. The predicted activity for that point is then compared with its actual value. This is repeated until each data point has been omitted once. For LGO, 20% of the data points are removed from the data set and the model is refitted; the values predicted for those points are then compared with the experimental values. Again, this is repeated until each data point has been omitted once. The cross-validated correlation coefficient (Q²) was 0.620 for LGO and 0.728 for LOO. This indicates that the regression model obtained has good internal and external predictive power.
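The LOO and LGO procedures described above can be sketched as follows. The formula used for Q² is not stated here, so this sketch assumes the common definition Q² = 1 − PRESS/SS_tot. For brevity it fits a one-descriptor least-squares line instead of the six-descriptor MLR model, and it forms the groups deterministically by index rather than at random. The names `fit_line` and `q2_cross_validated` are hypothetical.

```python
from typing import List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def q2_cross_validated(x: Sequence[float], y: Sequence[float], n_groups: int) -> float:
    """Q^2 = 1 - PRESS / SS_tot; n_groups = len(x) gives leave-one-out."""
    n = len(x)
    press = 0.0
    for g in range(n_groups):
        held: List[int] = [i for i in range(n) if i % n_groups == g]
        kept: List[int] = [i for i in range(n) if i % n_groups != g]
        # Refit the model without the held-out points, then predict them.
        slope, intercept = fit_line([x[i] for i in kept], [y[i] for i in kept])
        press += sum((y[i] - (slope * x[i] + intercept)) ** 2 for i in held)
    my = sum(y) / n
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - press / ss_tot
```

Under these assumptions, `q2_cross_validated(x, y, n_groups=len(x))` corresponds to LOO and `n_groups=5` to LGO with 20% of the points left out in each round.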

Genetic algorithm–artificial neural network

To process the non-linear relationships between the toxicity and the descriptors, the ANN modeling method combined with the GA for feature selection was employed. The input vectors were the set of descriptors selected by the GA, and therefore the number of nodes in the input layer depended on the number of selected descriptors. In the GA-MLR model it is assumed that the descriptors are independent of each other and have truly additive relevance to the property under study. ANNs are particularly well suited to QSAR/QSPR models because of their ability to extract non-linear information present in the data matrix. For this reason the next step in this work was generation of the ANN model. There are no rigorous theoretical principles for choosing the proper network topology, so different structures were tested in order to obtain the optimum number of hidden neurons and training cycles [17–20]. Before training the network, the number of nodes in the hidden layer was optimized by conducting several training sessions with different numbers of hidden nodes (from 1 to 18). The root mean square errors of the training (RMSET) and validation (RMSEV) sets were obtained at various iterations for different numbers of neurons in the hidden layer, and the number of nodes giving the minimum value of RMSEV was taken as the optimum. A plot of RMSET and RMSEV versus the number of nodes in the hidden layer is shown in Fig. 3. It is clear that fifteen nodes in the hidden layer is the optimum value.

The network consists of six inputs, the same descriptors as in the GA-MLR model, and one output for pIG₅₀. An ANN with architecture 6-15-1 was therefore generated. Training of the network was stopped when the RMSEV started to increase, i.e., when overtraining begins, because overtraining causes the ANN to lose its prediction power [11]. To control overtraining during the training procedure, the values of RMSET and RMSEV were calculated and recorded to monitor the extent of learning in the various iterations. The results showed that overtraining did not occur in the optimum architecture (Fig. 4).
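The stopping rule described above can be sketched independently of the training algorithm. The function below scans a recorded sequence of RMSEV values and returns the iteration at which training would be stopped. The name `early_stopping_epoch`, the `patience` parameter, and the example curves are assumptions of this illustration; with `patience=1` the scan stops the first time RMSEV fails to decrease.

```python
from typing import Sequence


def early_stopping_epoch(rmsev: Sequence[float], patience: int = 1) -> int:
    """Index of the best validation RMSE seen before the scan stops.

    The scan stops once RMSEV has failed to improve on its best value
    for `patience` consecutive iterations.
    """
    best_epoch = 0
    best_value = rmsev[0]
    waited = 0
    for epoch in range(1, len(rmsev)):
        if rmsev[epoch] < best_value:
            best_value, best_epoch, waited = rmsev[epoch], epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch


# Hypothetical validation curve: RMSEV rises after iteration 3.
print(early_stopping_epoch([0.9, 0.6, 0.4, 0.3, 0.35, 0.5]))  # 3
```

The weights saved at the returned iteration are the ones that would be kept; a larger `patience` tolerates short-lived increases in RMSEV before stopping.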

The generated ANN was then trained using the training and validation sets for optimization of the weights and biases. To evaluate the predictive power of the generated ANN, the optimized network was used to predict the pIG₅₀ values of the compounds in the prediction set, which were not used in the modeling procedure (Table 2). The calculated values of pIG₅₀ for the compounds in the training, validation, and prediction sets using the ANN model have been plotted versus their experimental values in Fig. 5. A plot of the residuals for the calculated values of pIG₅₀ in the training, validation, and prediction sets versus their experimental values is presented in Fig. 6. As can be seen, the model did not show proportional or systematic error, because the distribution of the residuals on both sides of zero is random.

As expected, the calculated values of pIG ₅₀ are in good agreement with the experimental values. The correlation equation for all of the calculated values of pIG ₅₀ from the ANN model and the experimental values is given by Eq. 4.

$$ \begin{aligned} & pIG_{50} \left( {\text{cal}} \right) = 0.927\,pIG_{50} \left( { \exp } \right) + 0.054 \\ & \left( {r^{2} = 0.929;{\text{ RMSE}} = 0.220; \, F = 3257.523} \right) \\ \end{aligned} $$

(4)

Similarly, the correlation of pIG ₅₀ (cal) versus pIG ₅₀ (exp) values in the prediction set is given by Eq. 5.

$$ \begin{aligned} & pIG_{50} \left( {\text{cal}} \right) = 0.927\,pIG_{50} \left( { \exp } \right) + 0.079 \\ & \left( {r^{2} = 0.926;{\text{ RMSE}} = 0.224; \, F = 599.075} \right) \\ \end{aligned} $$

(5)

Table 3 compares the results obtained using the GA-MLR and GA-ANN models. The r ² and RMSE of the models for the total, training, validation, and prediction sets show the potential of the ANN model for prediction of pIG ₅₀ values of phenolic compounds using a global QSAR model. As a result, it was found that a properly selected and trained neural network could fairly represent the dependence of the toxicity of phenols on the descriptors. The optimized neural network could then simulate the complicated nonlinear relationship between pIG ₅₀ value and the descriptors. The RMSE of 0.634 for the prediction set by the GA-MLR model should be compared with the value of 0.224 by the GA-ANN model. As can be seen, the ability of the proposed model to predict the pIG ₅₀ is better than the QSAR models proposed recently [26]. It can be seen from Table 3 that although parameters appearing in the GA-MLR model are used as inputs for the generated GA-ANN model, the statistics indicate substantial improvement. These improvements are because of the non-linear correlation of the toxicity of phenols to Tetrahymena pyriformis with the selected descriptors.

Data and methodology

The data set of toxicity values (pIG₅₀, or log(1/IGC₅₀)) for the 250 phenolic compounds used for the QSAR models was taken from the literature [1]. The data set was randomly split into training, validation, and prediction sets (150, 50, and 50 compounds, respectively; Table 2). The z-matrices (molecular models) were constructed with HyperChem 7.0 and the molecular structures were optimized using the AM1 semi-empirical method [29]. The Dragon package, version 2.1, was used to calculate the theoretical descriptors [30]. For this purpose the output of the HyperChem software for each compound was fed into the Dragon program and the descriptors were calculated. As a result, a total of 1,481 theoretical descriptors were calculated for each of the 250 compounds in the data set.
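A random 150/50/50 split of this kind can be sketched with the Python standard library. The function name `split_dataset`, the use of integer compound identifiers, and the fixed seed are assumptions made for reproducibility of the illustration; the original random assignment is not recoverable from this description.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(
    items: Sequence[int],
    n_train: int = 150,
    n_val: int = 50,
    n_pred: int = 50,
    seed: int = 0,
) -> Tuple[List[int], List[int], List[int]]:
    """Randomly partition items into training, validation, and prediction sets."""
    if n_train + n_val + n_pred != len(items):
        raise ValueError("set sizes must add up to the number of items")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for a repeatable split
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    pred = shuffled[n_train + n_val:]
    return train, val, pred


train_ids, val_ids, pred_ids = split_dataset(range(250))
print(len(train_ids), len(val_ids), len(pred_ids))  # 150 50 50
```

The three sets are disjoint and together cover all 250 compounds, so no compound in the prediction set contributes to model fitting.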

The theoretical descriptors were reduced by the following procedure:

1. Descriptors that were constant were eliminated (394 descriptors); and
2. To reduce the redundancy among the descriptors, the correlations of the descriptors with each other and with the pIG₅₀ of the molecules were examined, and collinear descriptors (R > 0.9) were detected. Among each group of collinear descriptors, the one with the highest correlation with the toxicity values was retained, and the others were removed from the data matrix (703 descriptors).
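The two reduction steps above can be sketched as follows. This is one possible reading of the procedure, not necessarily the implementation used in this work: descriptors are ranked by the absolute value of their correlation with the toxicity values, and a descriptor is kept only if its absolute correlation with every descriptor already kept does not exceed the threshold. The names `pearson` and `reduce_descriptors` and the toy data are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return sab / sqrt(saa * sbb)


def reduce_descriptors(
    columns: Dict[str, List[float]], y: Sequence[float], r_max: float = 0.9
) -> List[str]:
    """Drop constant descriptors, then greedily remove collinear ones."""
    # Step 1: constant descriptors carry no information.
    varying = {k: v for k, v in columns.items() if max(v) != min(v)}
    # Step 2: prefer descriptors that correlate best with the target.
    ranked = sorted(varying, key=lambda k: abs(pearson(varying[k], y)), reverse=True)
    kept: List[str] = []
    for name in ranked:
        if all(abs(pearson(varying[name], varying[k])) <= r_max for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: B is collinear with A, C is constant.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
cols = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [2.0, 4.0, 6.0, 8.0, 10.5],
    "C": [1.0, 1.0, 1.0, 1.0, 1.0],
    "D": [1.0, -1.0, 1.0, -1.0, 1.0],
}
print(reduce_descriptors(cols, y))  # ['A', 'D']
```

In the toy example, C is removed as constant and B is removed because it is collinear with A, which correlates more strongly with the target.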

The genetic algorithm (GA)

To select the most relevant descriptors, evolution of the population was simulated [31–35]. Each individual of the population defined by a chromosome of binary values represented a subset of descriptors. The number of genes on each chromosome was equal to the number of descriptors. The population of the first generation was selected randomly. A gene took a value of 1 if its corresponding descriptor was included in the subset; otherwise, it took a value of zero. The number of genes with a value of 1 was kept relatively low to furnish a small subset of descriptors [35], that is, the probability of generating 0 for a gene was set greater (at least 60%) than that of generating 1. The operators used here were crossover and mutation. The probability of the application of these operators was varied linearly with generation renewal (0–0.1% for mutation and 60–90% for crossover). The population size was varied between 50 and 250 for different GA runs. For a typical run, the evolution of the generation was stopped when 90% of the generations took the same fitness [21]. The GA program was written in Matlab 6.5 [36].
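The GA described above was written in Matlab; the following Python sketch only illustrates the same ingredients (binary chromosomes, a first generation biased towards zeros, crossover, and mutation with linearly varied probabilities). Several details are assumptions of the sketch: tournament selection, one-point crossover, elitism, a fixed number of generations as the stopping rule, probabilities that increase rather than decrease over the run, and the toy fitness function. The names `run_ga` and `toy_fitness` are hypothetical.

```python
import random
from typing import Callable, List, Tuple

Chromosome = List[int]


def run_ga(
    n_genes: int,
    fitness: Callable[[Chromosome], float],
    pop_size: int = 50,
    n_generations: int = 40,
    p_zero: float = 0.6,
    seed: int = 0,
) -> Tuple[Chromosome, List[float]]:
    """Return the best chromosome and the best fitness per generation."""
    rng = random.Random(seed)
    # First generation: each gene is 0 with probability p_zero (small subsets).
    pop = [[0 if rng.random() < p_zero else 1 for _ in range(n_genes)]
           for _ in range(pop_size)]
    history: List[float] = []
    for g in range(n_generations):
        frac = g / max(1, n_generations - 1)
        p_cross = 0.60 + 0.30 * frac   # 60-90% crossover (assumed increasing)
        p_mut = 0.001 * frac           # 0-0.1% mutation per gene (assumed increasing)
        elite = max(pop, key=fitness)
        history.append(fitness(elite))
        new_pop = [elite[:]]           # elitism: carry the best individual over
        while len(new_pop) < pop_size:
            a = max(rng.sample(pop, 2), key=fitness)  # tournament of two
            b = max(rng.sample(pop, 2), key=fitness)
            child = a[:]
            if rng.random() < p_cross:
                cut = rng.randrange(1, n_genes)       # one-point crossover
                child = a[:cut] + b[cut:]
            child = [1 - bit if rng.random() < p_mut else bit for bit in child]
            new_pop.append(child)
        pop = new_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


def toy_fitness(chrom: Chromosome) -> float:
    """Hypothetical fitness: reward genes 0, 3, 5 and penalise subset size."""
    return sum(chrom[i] for i in (0, 3, 5)) - 0.1 * sum(chrom)


best, history = run_ga(n_genes=12, fitness=toy_fitness)
```

In a QSAR setting the fitness would instead be a statistic of the model built from the descriptors flagged with 1, for example the adjusted R² of an MLR fit. Because of the elitism step, the best fitness recorded in `history` never decreases from one generation to the next.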

The artificial neural network (ANN)

A feed-forward artificial neural network with a back-propagation (BP) of error algorithm was used to process the non-linear relationship between the selected descriptors and the toxicity (pIG ₅₀). The number of input nodes in the ANN was equal to the number of descriptors appearing in the MLR model. The ANN model is confined to a single hidden layer, because a network with more than one hidden layer would be harder to train. A three-layer network with a sigmoidal transfer function was designed. The initial weights were randomly selected between 0 and 1. Optimization of the weights and biases was carried out according to Levenberg–Marquardt algorithms for BP of error, which, although requiring far more extensive computer memory, are significantly faster than other algorithms based on gradient descent [37]. The data set was randomly divided into three groups: a training set, a validation set, and a prediction set consisting of 150, 50, and 50 molecules. The training and validation sets were used for generation of the model and the prediction set was used for evaluation of the generated model. The performances of the training, validation, and prediction of models were evaluated as the root mean square error (RMSE), which is defined by Eq. 6.

$$ {\text{RMSE}} = \sqrt {\sum\limits_{i = 1}^{N} {{\frac{{(P_{i}^{ \exp } - P_{i}^{\text{cal}} )^{2} }}{N}}} } $$

(6)

where P^exp_i and P^cal_i are the experimental values of pIG₅₀ and the values calculated with the models, respectively, and N denotes the number of data points. The residual is defined by Eq. 7.

$$ {\text{Residual}} = P_{i}^{ \exp } - P_{i}^{\text{cal}} . $$

(7)
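Eqs. 6 and 7 translate directly into code. The sketch below uses only the Python standard library; the function names `rmse` and `residuals` and the example numbers are assumptions of the illustration.

```python
from math import sqrt
from typing import List, Sequence


def rmse(p_exp: Sequence[float], p_cal: Sequence[float]) -> float:
    """Root mean square error between experimental and calculated values (Eq. 6)."""
    if len(p_exp) != len(p_cal) or len(p_exp) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sqrt(sum((e - c) ** 2 for e, c in zip(p_exp, p_cal)) / len(p_exp))


def residuals(p_exp: Sequence[float], p_cal: Sequence[float]) -> List[float]:
    """Experimental minus calculated value for each data point (Eq. 7)."""
    return [e - c for e, c in zip(p_exp, p_cal)]


# Hypothetical values: both points are off by one unit, in opposite directions.
print(rmse([1.0, 3.0], [2.0, 2.0]))       # 1.0
print(residuals([1.0, 3.0], [2.0, 2.0]))  # [-1.0, 1.0]
```

The example shows why both quantities are reported: the residuals keep the sign of each error, whereas the RMSE summarises only their magnitude.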

The processing of the data was carried out using Matlab 6.5 [38]. The neural networks were implemented using Neural Network Toolbox Ver. 4.0 for Matlab [39].
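The networks in this work were built with the Matlab Neural Network Toolbox and trained with the Levenberg–Marquardt algorithm. As a language-neutral illustration of the architecture only, the forward pass of a 6-15-1 network with a sigmoidal hidden layer and initial weights drawn between 0 and 1 can be sketched in Python. The linear output node, the dictionary layout, and the names `init_network` and `forward` are assumptions of the sketch, and no training step is shown.

```python
import math
import random
from typing import Dict, List, Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def init_network(n_in: int = 6, n_hidden: int = 15, seed: int = 0) -> Dict[str, list]:
    """Weights and biases drawn uniformly from [0, 1)."""
    rng = random.Random(seed)
    return {
        "w_hidden": [[rng.random() for _ in range(n_in)] for _ in range(n_hidden)],
        "b_hidden": [rng.random() for _ in range(n_hidden)],
        "w_out": [rng.random() for _ in range(n_hidden)],
        "b_out": [rng.random()],
    }


def forward(net: Dict[str, list], x: Sequence[float]) -> float:
    """Propagate one descriptor vector through the network."""
    hidden: List[float] = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(net["w_hidden"], net["b_hidden"])
    ]
    # Linear output node (an assumption; the output transfer function is not specified).
    return sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"][0]


net = init_network()
n_params = (sum(len(row) for row in net["w_hidden"]) + len(net["b_hidden"])
            + len(net["w_out"]) + len(net["b_out"]))
print(n_params)  # 121
```

A 6-15-1 network of this form has 6 × 15 + 15 + 15 + 1 = 121 adjustable weights and biases, which is the set of parameters the training algorithm optimizes.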

Conclusion

In this study, linear (GA-MLR) and nonlinear (GA-ANN) global QSAR models were used to construct quantitative relationships between the toxicity of phenols to Tetrahymena pyriformis and their calculated descriptors. Comparison of the results obtained by use of the GA-ANN and the GA-MLR confirmed the superiority of the GA-ANN model as a more powerful method to predict pIG ₅₀. A suitable model with high statistical quality and low prediction errors was eventually derived. Because the improvement of the results obtained by use of the non-linear model (GA-ANN) is substantial, it can be concluded there is a non-linear correlation between the descriptors and the pIG ₅₀ values of the phenols.

References

1. Cronin MTD, Aptula AO, Duffy JC, Netzeva TI, Rowe PH, Valkova IV, Schultz TW (2002) Chemosphere 49:1201
2. Zeeman M, Auer CM, Clements RG, Nabholz JV, Boethling RS (1995) SAR QSAR Environ Res 3:179
3. Walker JD (2003) J Mol Struct Theochem 622:167
4. Bradbury SP, Russom CL, Ankley GT, Schultz TW, Walker JD (2003) Environ Toxicol Chem 22:1789
5. European Commission. White Paper on a strategy for a future Community Policy for Chemicals (2001), http://europa.eu.int/comm/enterprise/reach/
6. Seierstad M, Agrafiotis DK (2006) Chem Biol Drug Des 67:284
7. Verma RP, Kurup A, Hansch C (2005) Bioorg Med Chem 13:237
8. Toropov AA, Benfenati E (2006) Bioorg Med Chem Lett 16:1941
9. Khadikar PV, Phadnis A, Shrivastava A (2002) Bioorg Med Chem 10:1181
10. Despagne F, Massart DL (1998) Analyst 123:157
11. Zupan J, Gasteiger J (1999) Neural networks in chemistry and drug design. Wiley-VCH, Germany
12. Habibi-Yangjeh A, Pourbasheer E, Danandeh-Jenagharad M (2008) Bull Korean Chem Soc 29:833
13. Meiler J, Meusinger R, Will M (2000) J Chem Inf Comput Sci 40:1169
14. Habibi-Yangjeh A, Pourbasheer E, Danandeh-Jenagharad M (2008) Monatsh Chem 139:1423
15. Habibi-Yangjeh A, Nooshyar M (2005) Phys Chem Liq 43:239
16. Tabaraki R, Khayamian T, Ensafi AA (2006) J Mol Graph Model 25:46
17. Habibi-Yangjeh A, Pourbasheer E, Danandeh-Jenagharad M (2009) Monatsh Chem 140:15
18. Habibi-Yangjeh A, Nooshyar M (2005) Bull Korean Chem Soc 26:139
19. Habibi-Yangjeh A, Danandeh-Jenagharad M, Nooshyar M (2005) Bull Korean Chem Soc 26:2007
20. Habibi-Yangjeh A, Danandeh-Jenagharad M, Nooshyar M (2006) J Mol Model 12:338
21. Depczynski U, Frost VJ, Molt K (2000) Anal Chim Acta 420:217
22. Alsberg BK, Marchand-Geneste N, King RD (2000) Chemom Intell Lab Syst 54:75
23. Jouan-Rimbaud D, Massart DL, Leardi R, Denoord OE (1995) Anal Chem 67:4295
24. Cronin MTD, Schultz TW (1996) Chemosphere 32:1453
25. Devillers J (2004) SAR QSAR Environ Res 15:237
26. Enoch SJ, Cronin MTD, Schultz TW, Madden JC (2008) Chemosphere 71:1225
27. Hansch C, Taylor J, Sammes P (1990) Comprehensive Medicinal Chemistry: The Rational Design, Mechanistic Study & Therapeutic Application of Chemical Compounds, Pergamon, New York, 6:1
28. Todeschini R, Consonni V (2000) Handbook of Molecular Descriptors. Wiley-VCH, Weinheim
29. HyperChem Release 7, HyperCube, Inc. http://www.hyper.com
30. Todeschini R. Milano Chemometrics and QSPR Group. http://www.disat.unimib.it/chm
31. Cho SJ, Hermsmeier MA (2002) J Chem Inf Comput Sci 42:927
32. Baumann K, Albert H, Korff MV (2002) J Chemom 16:339
33. Lu Q, Shen G, Yu R (2002) J Comput Chem 23:1357
34. Ahmad S, Gromiha MM (2003) J Comput Chem 24:1313
35. Deeb O, Hemmateenejad B, Jaber A, Garduno-Juarez R, Miri R (2007) Chemosphere 67:2122
36. The Mathworks Inc (2002) Genetic algorithm and direct search toolbox user’s guide. The Mathworks Inc, Massachusetts
37. Hagan MT, Menhaj M (1994) IEEE Trans Neural Netw 5:989
38. Matlab 6.5. Mathworks, 1984–2002
39. The Mathworks Inc (2002) Neural network toolbox user’s guide. The Mathworks Inc, Massachusetts


Open Access

This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

Author information

Authors and Affiliations

Department of Chemistry, Faculty of Science, University of Mohaghegh Ardabili, P.O. Box 179, Ardabil, Iran
Aziz Habibi-Yangjeh & Mohammad Danandeh-Jenagharad


Corresponding author

Correspondence to Aziz Habibi-Yangjeh.

Rights and permissions

Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.


About this article

Cite this article

Habibi-Yangjeh, A., Danandeh-Jenagharad, M. Application of a genetic algorithm and an artificial neural network for global prediction of the toxicity of phenols to Tetrahymena pyriformis . Monatsh Chem 140, 1279–1288 (2009). https://doi.org/10.1007/s00706-009-0185-8


Received: 20 January 2009
Accepted: 02 September 2009
Published: 13 October 2009
Issue Date: November 2009
DOI: https://doi.org/10.1007/s00706-009-0185-8
