Solubility Prediction by Recursive Partitioning

Pharmaceutical Research

Abstract

Purpose. To build and test a computational model for predicting small molecule solubility, to improve the cost-effectiveness of the selection of vendor compounds suitable for nuclear magnetic resonance (NMR) screening.

Methods. A simple recursive partitioning (decision tree) classification model was built with “off-the-shelf” commercial software from Accelrys Inc., using a training set of 1992 compounds described by a series of calculated topological and physical properties. The predictive ability of the decision tree was then assessed by using it to classify a test set of 2851 vendor compounds, and the classification was subsequently used to guide the purchase of 686 compounds for NMR screening.
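To make the idea of recursive partitioning concrete, the following is a minimal sketch in plain Python, not the authors' model or the Accelrys software. It assumes a Gini-impurity splitting criterion, two hypothetical descriptors (a cLogP-like value and a polar-surface-area-like value), and a small invented toy data set; none of these details come from the article.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3, min_size=2):
    """Recursively partition the data; leaves hold the majority class."""
    split = None
    if depth < max_depth and len(rows) >= min_size:
        split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left],
                           depth + 1, max_depth, min_size),
        "right": build_tree([r for r, _ in right], [y for _, y in right],
                            depth + 1, max_depth, min_size),
    }

def classify(tree, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree

# Hypothetical toy data: (cLogP-like value, polar-surface-area-like value).
# Invented for illustration only; not taken from the article.
train_rows = [(1.0, 90.0), (2.0, 80.0), (2.5, 60.0), (1.5, 30.0),
              (4.0, 70.0), (4.5, 20.0), (3.5, 85.0), (2.2, 40.0)]
train_labels = ["soluble", "soluble", "soluble", "insoluble",
                "insoluble", "insoluble", "insoluble", "insoluble"]

tree = build_tree(train_rows, train_labels)
predictions = [classify(tree, r) for r in train_rows]
```

In this toy set no single descriptor cutoff separates the two classes, but two nested splits do, which is the kind of interaction a decision tree can capture and a single-property cutoff cannot.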

Results. When the decision tree was used to guide purchasing, the percentage of “acceptable” compounds suitable for NMR screening doubled compared with the use of a simple cLogP cutoff, improving the successful selection rate from 25% to 50%.
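The cost implication of the reported rates can be worked through with a short calculation. This sketch uses only the 25% and 50% selection rates stated above; the function name is illustrative, and the assumption that every purchased compound costs the same is a simplification not made in the article.

```python
def purchases_per_acceptable(success_rate):
    """Average number of compounds bought per acceptable compound obtained."""
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    return 1.0 / success_rate

clogp_cutoff_rate = 0.25    # selection rate with a simple cLogP cutoff
decision_tree_rate = 0.50   # selection rate with the decision tree

baseline = purchases_per_acceptable(clogp_cutoff_rate)   # 4 purchases per acceptable compound
guided = purchases_per_acceptable(decision_tree_rate)    # 2 purchases per acceptable compound

# Fraction of purchases wasted on unsuitable compounds under each strategy.
wasted_baseline = 1.0 - clogp_cutoff_rate
wasted_guided = 1.0 - decision_tree_rate
```

Under the equal-cost assumption, doubling the selection rate halves the number of purchases needed per acceptable compound.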

Conclusions. A simple recursive partitioning decision tree may be used to improve cost-effectiveness by reducing the wasteful purchase of vendor compounds that are unsuitable for NMR screening because of insolubility.

Author information

Correspondence to Xiaoyang Xia.

Cite this article

Xia, X., Maliski, E., Cheetham, J. et al. Solubility Prediction by Recursive Partitioning. Pharm Res 20, 1634–1640 (2003). https://doi.org/10.1023/A:1026195503465

  • DOI: https://doi.org/10.1023/A:1026195503465
