Abstract
Purpose. To build and test a computational model for predicting small-molecule solubility in order to improve the cost-effectiveness of selecting vendor compounds suitable for nuclear magnetic resonance (NMR) screening.
Methods. A simple recursive partitioning decision-tree classification model, based on a series of calculated topological and physical properties, was generated with “off-the-shelf” commercial software from Accelrys Inc. using a training set of 1992 compounds. The predictive ability of the decision tree was then assessed by using it to classify a test set of 2851 vendor compounds, and the resulting classification was used to guide the purchase of 686 compounds for NMR screening.
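As a rough illustration of what recursive partitioning does, the sketch below grows a small classification tree in plain Python. It is not the authors' model or the Accelrys software: the two descriptors (a cLogP-like value and a polar-surface-area-like value), the nine toy compounds, their labels, and the use of Gini impurity as the split criterion are all assumptions made for illustration only.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature_index, threshold) with the lowest weighted Gini,
    or None if no split improves on the unsplit node."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        # Candidate thresholds are midpoints between adjacent observed values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3, min_size=2):
    """Recursively partition the data; leaves are majority class labels."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(rows) < min_size or len(set(labels)) == 1:
        return majority
    split = best_split(rows, labels)
    if split is None:
        return majority
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left],
                           depth + 1, max_depth, min_size),
        "right": build_tree([r for r, _ in right], [y for _, y in right],
                            depth + 1, max_depth, min_size),
    }

def classify(tree, row):
    """Walk from the root to a leaf and return the leaf's class label."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree

# Hypothetical toy data: (cLogP-like value, polar-surface-area-like value).
# These numbers and labels are invented and are NOT taken from the paper.
TRAIN_X = [
    (0.5, 90.0), (1.2, 45.0), (1.8, 110.0), (2.1, 60.0), (2.5, 35.0),
    (3.9, 40.0), (4.5, 55.0), (5.2, 30.0), (4.2, 95.0),
]
TRAIN_Y = [
    "soluble", "soluble", "soluble", "soluble", "soluble",
    "insoluble", "insoluble", "insoluble", "soluble",
]

tree = build_tree(TRAIN_X, TRAIN_Y)
predictions = [classify(tree, x) for x in TRAIN_X]
```

On this toy set the tree first splits on the cLogP-like descriptor and then, within the lipophilic branch, on the second descriptor. That is the kind of multi-descriptor rule a single cLogP cutoff cannot express.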
Results. When the decision tree was used to guide purchasing, the percentage of “acceptable” compounds suitable for NMR screening doubled compared with the use of a simple cLogP cutoff, improving the successful selection rate from 25% to 50%.
Conclusions. A simple recursive partitioning decision tree can be used successfully to improve cost-effectiveness by reducing the wasteful purchase of vendor compounds that are unsuitable for NMR screening because of insolubility.
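The cost argument follows from simple arithmetic on the two selection rates reported in the Results (25% and 50%). The sketch below works it through. The function names and the batch size of 100 are hypothetical, and treating every purchase as an independent trial with a fixed success rate is a simplifying assumption.

```python
def purchases_per_acceptable(success_rate):
    """Expected number of compounds bought per acceptable compound obtained."""
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    return 1.0 / success_rate

def expected_unsuitable(n_purchased, success_rate):
    """Expected number of purchased compounds that turn out to be unsuitable."""
    return n_purchased * (1.0 - success_rate)

# Rates taken from the abstract; the batch size is a hypothetical round number.
CLOGP_CUTOFF_RATE = 0.25
DECISION_TREE_RATE = 0.50
BATCH_SIZE = 100

cutoff_cost = purchases_per_acceptable(CLOGP_CUTOFF_RATE)
tree_cost = purchases_per_acceptable(DECISION_TREE_RATE)
cutoff_waste = expected_unsuitable(BATCH_SIZE, CLOGP_CUTOFF_RATE)
tree_waste = expected_unsuitable(BATCH_SIZE, DECISION_TREE_RATE)
```

Under these assumptions, doubling the selection rate halves the number of purchases needed per usable compound, from four to two.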
Cite this article
Xia, X., Maliski, E., Cheetham, J. et al. Solubility Prediction by Recursive Partitioning. Pharm Res 20, 1634–1640 (2003). https://doi.org/10.1023/A:1026195503465
DOI: https://doi.org/10.1023/A:1026195503465