Impact of descriptor vector scaling on the classification of drugs and nondrugs with artificial neural networks

  • Original Paper
  • Journal of Molecular Modeling

Abstract

The influence of preprocessing molecular descriptor vectors on classification performance was analyzed for drug/nondrug classification by artificial neural networks. Molecular properties were used to form descriptor vectors. Two types of neural networks were used: supervised multilayer neural nets trained with the back-propagation algorithm, and unsupervised self-organizing maps (Kohonen maps). Data were preprocessed by logistic scaling and histogram equalization. For both types of neural networks, the preprocessing step significantly improved classification compared with nonstandardized data. Classification accuracy was measured as prediction mean square error and Matthews correlation coefficient in the case of supervised learning, and quantization error in the case of unsupervised learning. The results demonstrate that appropriate data preprocessing is an essential step in solving classification tasks.

Figure Drug/nondrug classification by SOM
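
The two preprocessing steps named in the abstract can be illustrated with a minimal sketch. The exact formulas used in the paper are not part of this preview, so the definitions below are assumptions: logistic scaling is taken as a logistic function of the per-descriptor z-score, and histogram equalization as a rank transform over distinct values rescaled to [0, 1]. The function names and the toy data are hypothetical.

```python
import math
from statistics import mean, pstdev


def logistic_scale(values):
    """Map one descriptor column into (0, 1) via a logistic of its z-score."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant descriptor has no spread; map it to the midpoint.
        return [0.5 for _ in values]
    return [1.0 / (1.0 + math.exp(-(v - mu) / sigma)) for v in values]


def histogram_equalize(values):
    """Replace each value by the rank of its distinct value, scaled to [0, 1]."""
    distinct = sorted(set(values))
    if len(distinct) == 1:
        return [0.0 for _ in values]
    rank = {v: i for i, v in enumerate(distinct)}
    top = len(distinct) - 1
    return [rank[v] / top for v in values]


def scale_columns(rows, scaler):
    """Apply a per-descriptor scaler to a list of descriptor vectors."""
    columns = list(zip(*rows))
    scaled = [scaler(list(col)) for col in columns]
    return [list(row) for row in zip(*scaled)]


# Hypothetical toy data: two descriptors on very different scales.
toy = [[120.0, 0.5], [350.0, 2.0], [410.0, 3.5], [980.0, 1.0]]
logistic_toy = scale_columns(toy, logistic_scale)
equalized_toy = scale_columns(toy, histogram_equalize)
```

Both transforms are applied per descriptor, so that a property with a large numeric range does not dominate the distance computations of a SOM or the weighted sums of a feedforward network.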


Abbreviations

BP: Back-propagation algorithm
GDA: Gradient descent with adaptive learning rate
GDM: Gradient descent with momentum
HTS: High-throughput screening
LM: Levenberg–Marquardt
mcc: Matthews correlation coefficient
MFFN: Multilayer feedforward neural network
mse: Mean square error
QE: Quantization error
QSAR: Quantitative structure–activity relationship
RP: Resilient back-propagation algorithm
SOM: Self-organizing map
SVM: Support vector machine
TE: Topology error
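
The three quality measures abbreviated above as mse, mcc, and QE have standard textbook definitions, sketched below. This is an illustration under stated assumptions, not the authors' implementation: class labels are coded as 1 (drug) and 0 (nondrug), the mcc is set to 0 when its denominator vanishes, and QE is computed with Euclidean distance to the nearest SOM prototype. All function names are hypothetical.

```python
import math


def mean_square_error(targets, outputs):
    """Average squared difference between target and network output."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)


def matthews_cc(labels, predictions):
    """Matthews correlation coefficient for binary labels (1 = drug, 0 = nondrug)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def quantization_error(data, prototypes):
    """Mean Euclidean distance from each data vector to its nearest prototype."""
    return sum(min(math.dist(x, w) for w in prototypes) for x in data) / len(data)
```

The mcc ranges from -1 (every prediction wrong) to 1 (every prediction right), with 0 indicating no better than chance, which makes it less sensitive to class imbalance than plain accuracy.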

Acknowledgement

Jens Sadowski is thanked for providing us with his drug/nondrug data for this study. This work was supported by the Beilstein-Institut zur Förderung der Chemischen Wissenschaften, Frankfurt.

Author information

Corresponding author

Correspondence to Alireza Givehchi.

About this article

Cite this article

Givehchi, A., Schneider, G. Impact of descriptor vector scaling on the classification of drugs and nondrugs with artificial neural networks. J Mol Model 10, 204–211 (2004). https://doi.org/10.1007/s00894-004-0186-9

  • DOI: https://doi.org/10.1007/s00894-004-0186-9
