Abstract
Principal component-genetic algorithm-multiparameter linear regression (PC-GA-MLR) and principal component-genetic algorithm-artificial neural network (PC-GA-ANN) models were applied to predict the acidity constant (pKa) of various nitrogen-containing compounds. The data set used in this work consisted of 282 compounds: 55 anilines, 77 amines, 82 pyridines, 14 pyrimidines, 26 imidazoles and benzimidazoles, and 28 quinolines. A large number of theoretical descriptors were calculated for each compound. The first 179 principal components (PCs) were found to explain more than 99.9% of the variance in the original data matrix. From this pool of PCs, a genetic algorithm was employed to select the best subset of extracted PCs for the PC-MLR and PC-ANN models. The models were generated using 15 PCs as variables. To evaluate the predictive power of the models, the pKa values of the 56 compounds in the prediction set were calculated. The root mean square errors (RMSE) for the PC-GA-MLR and PC-GA-ANN models are 1.4863 and 0.0750, respectively. Comparison of the results shows that the PC-GA-ANN model is superior to the PC-GA-MLR model. The mean percent deviation for the PC-GA-ANN model in the prediction set is 2.123. The improvement is attributed to the non-linear correlations between the pKa of the compounds and the PCs.
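The PC-GA-MLR workflow summarized in the abstract (extract PCs, keep those explaining 99.9% of the variance, let a genetic algorithm pick a subset, fit a linear regression, report RMSE) can be sketched roughly as follows. This is a minimal illustration on synthetic data using NumPy, not the authors' implementation: the function names (`pca_scores`, `ga_select`, and so on), the GA settings, the train/validation split, and the toy data are all assumptions made for the example, and the ANN variant is not shown.

```python
import numpy as np

def pca_scores(X, var_threshold=0.999):
    """PC scores of the fewest leading components reaching var_threshold."""
    Xc = X - X.mean(axis=0)                       # mean-center descriptors
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)    # cumulative variance ratio
    k = min(int(np.searchsorted(explained, var_threshold)) + 1, len(s))
    return U[:, :k] * s[:k]

def fit_mlr(T, y):
    """Least-squares linear regression with an intercept term."""
    A = np.column_stack([np.ones(len(T)), T])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, T):
    return coef[0] + T @ coef[1:]

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def ga_select(T_tr, y_tr, T_val, y_val, n_select, pop=30, gens=40, seed=0):
    """Toy GA (assumed settings): evolve index subsets of PCs, minimizing
    validation RMSE of the MLR model built on each subset."""
    rng = np.random.default_rng(seed)
    n_pcs = T_tr.shape[1]

    def fitness(idx):
        coef = fit_mlr(T_tr[:, idx], y_tr)
        return rmse(y_val, predict_mlr(coef, T_val[:, idx]))

    population = [rng.choice(n_pcs, n_select, replace=False) for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=fitness)
        parents = population[: pop // 2]          # keep the better half
        children = []
        while len(parents) + len(children) < pop:
            a, b = rng.choice(len(parents), 2, replace=False)
            pool = np.union1d(parents[a], parents[b])        # crossover pool
            child = rng.choice(pool, n_select, replace=False)
            if rng.random() < 0.2:                # mutation: swap in an unused PC
                unused = np.setdiff1d(np.arange(n_pcs), child)
                if unused.size:
                    child[rng.integers(n_select)] = rng.choice(unused)
            children.append(child)
        population = parents + children
    return np.sort(min(population, key=fitness))

# Synthetic stand-in for a descriptor matrix and pKa-like response (assumed data).
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 12))
T = pca_scores(X)
y = 2.0 * T[:, 0] - 1.5 * T[:, 1] + T[:, 2] + rng.normal(scale=0.05, size=100)

train, val = np.arange(0, 70), np.arange(70, 100)
selected = ga_select(T[train], y[train], T[val], y[val], n_select=3)
coef = fit_mlr(T[train][:, selected], y[train])
val_rmse = rmse(y[val], predict_mlr(coef, T[val][:, selected]))
print("selected PCs:", selected, "validation RMSE:", val_rmse)
```

In a real study the subset found by the GA would be checked on a held-out prediction set that played no part in the selection, as the abstract describes for its 56 prediction-set compounds. The same selected PCs could then be passed to a neural network in place of `fit_mlr` to obtain the non-linear counterpart.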
Acknowledgments
The authors wish to acknowledge the vice-presidency of research, University of Mohaghegh Ardabili, for financial support of this work.
Cite this article
Habibi-Yangjeh, A., Pourbasheer, E. & Danandeh-Jenagharad, M. Application of principal component-genetic algorithm-artificial neural network for prediction acidity constant of various nitrogen-containing compounds in water. Monatsh Chem 140, 15–27 (2009). https://doi.org/10.1007/s00706-008-0049-7
DOI: https://doi.org/10.1007/s00706-008-0049-7