Support Vector Machine Prediction of Drug Solubility on GPUs

  • Gaspar Cano
  • José García-Rodríguez
  • Sergio Orts-Escolano
  • Jorge Peña-García
  • Dharmendra Kumar-Yadav
  • Alfonso Pérez-Garrido
  • Horacio Pérez-Sánchez
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9044)


The current high-performance computing landscape opens up great opportunities for simulating relevant biological systems and for applications in Bioinformatics, Computational Biology and Computational Chemistry. Larger databases increase the chances of generating hits or leads, but the computational time required grows with the size of the database and with the accuracy of the Virtual Screening (VS) method and model.

In this work we discuss the benefits of using massively parallel architectures to optimize the prediction of compound solubility with computational intelligence methods such as Support Vector Machines (SVMs). The SVMs are trained on a database of known soluble and insoluble compounds, and this information is then exploited to improve VS prediction.
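
As a rough illustration of the training step described above, the following minimal sketch fits a linear SVM by batch sub-gradient descent on the regularised hinge loss. It is not the authors' implementation: it runs on the CPU only, it assumes a linear kernel, and the two-dimensional descriptors, the labels (+1 for soluble, -1 for insoluble) and the names `train_linear_svm` and `predict` are hypothetical and synthetic, chosen only to make the example self-contained.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Batch sub-gradient descent on the L2-regularised hinge loss."""
    n, dim = len(X), len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of (lam / 2) * ||w||^2
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # hinge loss is active for this sample
                for j in range(dim):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 (soluble) or -1 (insoluble) for one descriptor vector."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Synthetic toy descriptors (NOT real compound data), already centred.
descriptors = [
    (-1.0, 1.0), (-1.5, 0.5), (-0.5, 1.5), (-1.2, 1.2),  # labelled soluble
    (1.0, -1.0), (1.5, -0.5), (0.5, -1.5), (1.2, -1.2),  # labelled insoluble
]
labels = [1, 1, 1, 1, -1, -1, -1, -1]

w, b = train_linear_svm(descriptors, labels)
predictions = [predict(w, b, x) for x in descriptors]
```

In this sketch the inner loop over samples performs the same independent computation for every compound, which is the kind of work a massively parallel architecture could in principle distribute across threads; how the authors actually map SVM training onto the GPU is not stated in this abstract.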

We empirically demonstrate that GPUs are a well-suited architecture for accelerating computational intelligence methods such as SVMs, obtaining up to a 15-fold sustained speedup over the sequential counterpart.


Keywords: SVM, GPU, CUDA, Bioinformatics, Computational Biology





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Gaspar Cano (1)
  • José García-Rodríguez (1)
  • Sergio Orts-Escolano (1)
  • Jorge Peña-García (2)
  • Dharmendra Kumar-Yadav (3)
  • Alfonso Pérez-Garrido (2)
  • Horacio Pérez-Sánchez (2)

  1. Dept. of Computing Technology, University of Alicante, Alicante, Spain
  2. Bioinformatics and High Performance Computing Research Group (BIO-HPC), Computer Science Department, Catholic University of Murcia (UCAM), Murcia, Spain
  3. Department of Chemistry, University of Delhi, Delhi, India
