
Improving quantitative structure–activity relationship models using Artificial Neural Networks trained with dropout


Dropout is an Artificial Neural Network (ANN) training technique that has been shown to improve ANN performance across canonical machine learning (ML) datasets. Quantitative Structure Activity Relationship (QSAR) datasets, which relate chemical structure to biological activity in Ligand-Based Computer-Aided Drug Discovery (LB-CADD), pose unique challenges for ML techniques, such as heavily biased dataset composition and a large number of descriptors relative to the number of actives. To test the hypothesis that dropout also improves QSAR ANNs, we conduct a benchmark on nine large QSAR datasets. Use of dropout improves both enrichment false positive rate and log-scaled area under the receiver-operating characteristic curve (logAUC) by 22–46% over conventional ANN implementations. Optimal dropout rates are found to be a function of the signal-to-noise ratio of the descriptor set and relatively independent of the dataset. Dropout ANNs with 2D and 3D autocorrelation descriptors outperform conventional ANNs as well as optimized fingerprint similarity search methods.
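As an illustration of the technique named above, the following is a minimal sketch of dropout applied to one layer of activations. It assumes the common "inverted" variant, in which surviving activations are rescaled by 1 / (1 - rate) during training so that no rescaling is needed at prediction time; this is not necessarily the scheme used in the benchmark. The function name `dropout` and its parameters are hypothetical and chosen only for this example.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Inverted dropout (assumed variant, not taken from the paper).

    During training, each activation is zeroed with probability `rate`
    and the survivors are scaled by 1 / (1 - rate), which keeps the
    expected activation unchanged. Outside training, the input is
    returned as-is.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# Hypothetical usage: a layer of 10,000 unit activations, dropout rate 0.5.
rng = random.Random(0)
hidden = [1.0] * 10000
dropped = dropout(hidden, rate=0.5, rng=rng)
mean_activation = sum(dropped) / len(dropped)
unchanged = dropout(hidden, rate=0.5, rng=rng, training=False)
```

With a rate of 0.5, roughly half of the activations are zeroed and the rest are doubled, so the mean activation stays close to its original value. The dropout rate is the hyperparameter that the abstract reports as depending mainly on the descriptor set.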





Corresponding author

Correspondence to Jens Meiler.


Cite this article

Mendenhall, J., Meiler, J. Improving quantitative structure–activity relationship models using Artificial Neural Networks trained with dropout. J Comput Aided Mol Des 30, 177–189 (2016).

Keywords


  • Artificial Neural Network (ANN)
  • Dropout
  • Quantitative Structure Activity Relationship (QSAR)
  • BioChemicalLibrary (BCL)
  • Machine learning (ML)
  • Ligand-Based Computer-Aided Drug Discovery (LB-CADD)