Linear and nonlinear functions on modeling of aqueous solubility of organic compounds by two structure representation methods
Several quantitative models for predicting the aqueous solubility of organic compounds were developed from a diverse dataset of 2084 compounds using multilinear regression analysis and backpropagation neural networks. The compounds were described by two structure representation methods: (1) 18 topological descriptors; and (2) 32 radial distribution function codes representing the 3D structure of a molecule, together with eight additional descriptors. The dataset was divided into a training set and a test set using Kohonen's self-organizing neural network. The backpropagation neural network models gave good predictions: with the 18 topological descriptors, a correlation coefficient of 0.92 and a standard deviation of 0.62 were obtained for the 936 compounds in the test set; with the 3D descriptors, a correlation coefficient of 0.90 and a standard deviation of 0.73 were obtained for the 866 compounds in the test set. The models were also tested on a second dataset, and the relationship between the two datasets was examined with Kohonen's self-organizing neural network.
Abbreviations: BPG – backpropagation; KNN – Kohonen's self-organizing neural network; MLRA – multilinear regression analysis; MMP – mean molecular polarizability; RDF – radial distribution function.
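To make the second structure representation more concrete, the following is a minimal sketch of how a 32-point RDF code could be computed from 3D atom coordinates. It assumes the commonly used form g(r) = Σ_{i<j} a_i a_j exp(−B (r − r_ij)²), where r_ij is the distance between atoms i and j and a_i is an atomic property. The function name `rdf_code`, the distance range, the smoothing parameter B, and the unit atomic weights are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np


def rdf_code(coords, weights, n_points=32, r_min=1.0, r_max=12.8, smoothing=25.0):
    """Sample an RDF code at n_points distances (hypothetical helper).

    coords    : (N, 3) atom coordinates in angstroms
    weights   : (N,) atomic property a_i (assumed; e.g. 1.0 for every atom)
    smoothing : the parameter B in exp(-B * (r - r_ij)**2) (assumed value)
    """
    coords = np.asarray(coords, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Distances r at which the code is sampled.
    r = np.linspace(r_min, r_max, n_points)

    # All unique atom pairs (i < j) and their interatomic distances r_ij.
    i, j = np.triu_indices(len(coords), k=1)
    r_ij = np.linalg.norm(coords[i] - coords[j], axis=1)
    pair_weight = weights[i] * weights[j]

    # Sum one Gaussian per atom pair, centred on that pair's distance.
    gauss = np.exp(-smoothing * (r[None, :] - r_ij[:, None]) ** 2)
    g = (pair_weight[:, None] * gauss).sum(axis=0)
    return r, g


# Illustrative three-atom geometry (made-up coordinates, unit weights).
example_coords = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 2.0, 0.0]]
r_values, code = rdf_code(example_coords, [1.0, 1.0, 1.0])
```

The resulting fixed-length vector does not depend on the number of atoms, which is what allows molecules of different sizes to be fed to the same regression or neural network model.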