Molecular Diversity, Volume 10, Issue 3, pp 301–309

SVM approach for predicting LogP

  • Quan Liao
  • Jianhua Yao
  • Shengang Yuan
Full-length paper


The logarithm of the partition coefficient between n-octanol and water (logP) is an important parameter in drug discovery. Based on a comparison of several logP prediction models, namely Support Vector Machines (SVM), Partial Least Squares (PLS) and Multiple Linear Regression (MLR), the authors report that the SVM model performs best.
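As a brief illustration of the quantity being predicted, the sketch below computes logP directly from its definition, the base-10 logarithm of the ratio of a solute's equilibrium concentrations in n-octanol and in water. The function name `log_p` and the example concentrations are hypothetical and are not taken from the paper; the paper's models predict logP from molecular structure rather than from measured concentrations.

```python
import math


def log_p(conc_octanol: float, conc_water: float) -> float:
    """Return log10 of the octanol/water partition coefficient.

    Both concentrations must be positive and expressed in the same units.
    (Hypothetical helper for illustration; not part of the paper.)
    """
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)


# Hypothetical solute that is 100 times more concentrated in octanol than in water.
lipophilic = log_p(100.0, 1.0)

# Hypothetical solute that is 10 times more concentrated in water than in octanol.
hydrophilic = log_p(1.0, 10.0)
```

A positive value indicates a solute that prefers the octanol phase (lipophilic), and a negative value indicates one that prefers water (hydrophilic).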

Key words

LogP prediction; multiple linear regression (MLR); partial least squares (PLS); support vector machines (SVM)



Abbreviations

logP: the logarithm of the partition coefficient between n-octanol and water
SVM: support vector machines
PLS: partial least squares
MLR: multiple linear regression





Copyright information

© Springer Science + Business Media, Inc. 2006

Authors and Affiliations

  1. Department of Computer Chemistry and Chemoinformatics, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai, China
