Associative Neural Network

  • Igor V. Tetko
Part of the Methods in Molecular Biology™ book series (MIMB, volume 458)

Abstract

An associative neural network (ASNN) is an ensemble-based method inspired by the function and structure of neural network correlations in the brain. The method operates by simulating the short- and long-term memory of neural networks. The long-term memory is represented by an ensemble of neural network weights, while the short-term memory is stored as a pool of internal neural network representations of the input patterns. This organization allows the ASNN to incorporate new data cases into short-term memory and provides high generalization ability without the need to retrain the neural network weights. The method can also be used to estimate the bias and the applicability domain of models. Applications of the ASNN in QSAR and drug design are exemplified.
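The interplay of the two memories described above can be illustrated with a minimal sketch. It is not the author's implementation: the function name `asnn_correct`, its parameters, the use of Pearson correlation between ensemble output vectors as the similarity measure, and the simple averaging of neighbor residuals are all assumptions made for illustration. The ensemble outputs are passed in as plain arrays, so no actual networks are trained here.

```python
import numpy as np


def asnn_correct(memory_outputs, memory_targets, query_outputs, k=3):
    """Correct ensemble-average predictions with nearest-neighbor residuals.

    memory_outputs: (n_memory, n_models) ensemble outputs for stored cases
    memory_targets: (n_memory,) known target values for stored cases
    query_outputs:  (n_query, n_models) ensemble outputs for new cases
    All names and the exact correction rule are illustrative assumptions.
    """
    memory_outputs = np.asarray(memory_outputs, dtype=float)
    memory_targets = np.asarray(memory_targets, dtype=float)
    query_outputs = np.asarray(query_outputs, dtype=float)

    # Long-term memory: the plain ensemble average of the model outputs.
    memory_mean = memory_outputs.mean(axis=1)
    query_mean = query_outputs.mean(axis=1)

    # Errors of the ensemble average on the stored cases.
    residuals = memory_targets - memory_mean

    def center_and_normalize(a):
        # Prepare rows so that a dot product equals the Pearson correlation.
        a = a - a.mean(axis=1, keepdims=True)
        norm = np.linalg.norm(a, axis=1, keepdims=True)
        return a / np.where(norm == 0, 1.0, norm)

    # Short-term memory: similarity is measured in the space of model outputs.
    similarity = center_and_normalize(query_outputs) @ center_and_normalize(memory_outputs).T

    # Indices of the k most similar stored cases for each query.
    nearest = np.argsort(-similarity, axis=1)[:, :k]

    # Shift each ensemble average by the mean residual of its neighbors.
    return query_mean + residuals[nearest].mean(axis=1)


if __name__ == "__main__":
    stored_outputs = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
    stored_targets = [3.0, 1.0]
    new_outputs = [[2.0, 4.0, 6.0]]
    print(asnn_correct(stored_outputs, stored_targets, new_outputs, k=1))
```

Under these assumptions, adding new cases to the short-term memory only means appending rows to `memory_outputs` and `memory_targets`; the ensemble itself is left untouched, which mirrors the idea of improving predictions without retraining the weights.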

Keywords

Ensemble networks, memory, drug design, LIBRARY mode

Abbreviations

ALOGPS

Artificial log P and log S program to predict lipophilicity and aqueous solubility [38, 39]

ASNN

Associative neural network [2, 3, 4]

BASF

Chemical company, www.basf.com.

CCNN

Cascade correlation neural network

CPU

Central processing unit

E-state

Electrotopological state indices [43, 44]

GM

Global model

kNN

k nearest neighbors

LIBRARY mode

An operational mode of the ASNN when new compounds are used to correct neural network ensemble predictions without changing neural network weights (see Eq. 6)

LM

Local model

ESE

Early stopping over the ensemble [6, 10, 12]

log D

The same as log P but for ionized compounds (usually measured at a specific pH)

log P

1-Octanol/water partition coefficient

log S

Aqueous solubility of compounds

NMR

Nuclear magnetic resonance

PHYSPROP

Physical properties database [19]

“nova” set

Set of compounds with log P values in the PHYSPROP database that do not have reported experimental values in BioByte StarList (see [38, 39])

“star” set

Set of compounds with log P values in the PHYSPROP database that have reported experimental values in BioByte StarList (see [38, 39])

QSAR

Quantitative structure-activity relationship studies

RMSE

Root mean squared error

UCI

University of California, Irvine

VCCLAB

Virtual Computational Chemistry Laboratory, www.vcclab.org [47, 48]

Notes

Acknowledgment

This study was supported by the Virtual Computational Chemistry Laboratory grant INTAS INFO-00363. I thank Philip Wong for his useful comments and remarks.

References

  1. Fuster JM (1995) Memory in the cerebral cortex. MIT Press, Cambridge, MA.
  2. Tetko IV (2001) Associative neural network. CogPrints Archive, cog00001441.
  3. Tetko IV (2002) Associative neural network. Neural Process Lett 16:187–199.
  4. Tetko IV (2002) Neural network studies, 4. Introduction to associative neural networks. J Chem Inf Comput Sci 42:717–728.
  5. Hansen LK, Salamon P (1990) Neural network ensembles. IEEE Trans Pattern Anal 12:993–1001.
  6. Tetko IV, Livingstone DJ, Luik AI (1995) Neural network studies, 1. Comparison of overfitting and overtraining. J Chem Inf Comput Sci 35:826–833.
  7. Breiman L (2001) Random forests. Machine Learning 45:5–32.
  8. Tong W, Hong H, Fang H, Xie Q, Perkins R (2003) Decision forest: combining the predictions of multiple independent decision tree models. J Chem Inf Comput Sci 43:525–531.
  9. Tetko IV, Villa AEP (1995) Unsupervised and supervised learning: cooperation toward a common goal. In: ICANN'95, international conference on artificial neural networks NEURONIMES'95, Paris. EC2 & Cie, Paris, France, pp 105–110.
  10. Tetko IV, Villa AEP (1997) Efficient partition of learning data sets for neural network training. Neural Networks 10:1361–1374.
  11. Tetko IV, Villa AEP (1997) An efficient partition of training data set improves speed and accuracy of cascade-correlation algorithm. Neural Process Lett 6:51–59.
  12. Tetko IV, Villa AEP (1997) An enhancement of generalization ability in cascade correlation algorithm by avoidance of overfitting/overtraining problem. Neural Process Lett 6:43–50.
  13. Tetko IV, Tanchuk VY (2002) Application of associative neural networks for prediction of lipophilicity in ALOGPS 2.1 program. J Chem Inf Comput Sci 42:1136–1145.
  14. Fahlman S, Lebiere C (1990) The cascade-correlation learning architecture. NIPS 2:524–532.
  15. Blake EK, Merz C (1998) UCI repository of machine learning databases, available www.ics.uci.edu/~mlearn/MLRepository.html.
  16. Schwenk H, Bengio Y (2000) Boosting neural networks. Neural Comput 12:1869–1887.
  17. Tetko IV, Tanchuk VY, Villa AE (2001) Prediction of n-octanol/water partition coefficients from PHYSPROP database using artificial neural networks and E-state indices. J Chem Inf Comput Sci 41:1407–1421.
  18. Press WH, Teukolsky SA, Vetterling WT, Flannery BP (1994) Numerical recipes in C (2nd edn). Cambridge University Press, New York, p. 998.
  19. The Physical Properties Database (PHYSPROP), Syracuse Research Corporation, available www.syrres.com, accessed December 20, 2006.
  20. Binev Y, Aires-de-Sousa J (2004) Structure-based predictions of 1H NMR chemical shifts using feed-forward neural networks. J Chem Inf Comput Sci 44:940–945.
  21. Binev Y, Corvo M, Aires-de-Sousa J (2004) The impact of available experimental data on the prediction of 1H NMR chemical shifts by neural networks. J Chem Inf Comput Sci 44:946–949.
  22. Da Costa FB, Binev Y, Gasteiger J, Aires-de-Sousa J (2004) Structure-based predictions of H-1 NMR chemical shifts of sesquiterpene lactones using neural networks. Tetrahedron Letters 45:6931–6935.
  23. Dimoglo AS, Shvets NM, Tetko IV, Livingstone DJ (2001) Electronic-topologic investigation of the structure-acetylcholinesterase inhibitor activity relationship in the series of N-benzylpiperidine derivatives. Quant Struct-Activ Rel 20:31–45.
  24. Kandemirli F, Shvets N, Kovalishyn V, Dimoglo A (2006) Combined electronic-topological and neural networks study of some hydroxysemicarbazides as potential antitumor agents. J Mol Graph Model 25:33–36.
  25. Kandemirli F, Shvets N, Unsalan S, Kucukguzel I, Rollas S, Kovalishyn V, Dimoglo A (2006) The structure-antituberculosis activity relationships study in a series of 5-(4-aminophenyl)-4-substituted-2,4-dihydro-3H-1,2,4-triazole-3-thione derivatives. A combined electronic-topological and neural networks approach. Med Chem 2:415–422.
  26. Dimoglo A, Kovalishyn V, Shvets N, Ahsen V (2005) The structure-inhibitory activity relationships study in a series of cyclooxygenase-2 inhibitors: a combined electronic-topological and neural networks approach. Mini Rev Med Chem 5:879–892.
  27. Ajmani S, Tetko IV, Livingstone DJ, Salt D (2005) A comparative study of neural network architectures for QSAR. In: Aki(Sener) E, Yalcin I (eds) QSAR and molecular modelling in rational design of bioactive molecules, Computer Aided Drug Design and Development Society in Turkey, Istanbul, pp. 183–184.
  28. Friedel CC, Jahn KH, Sommer S, Rudd S, Mewes HW, Tetko IV (2005) Support vector machines for separation of mixed plant-pathogen EST collections based on codon usage. Bioinformatics 21:1383–1388.
  29. Tetko IV, Solov'ev VP, Antonov AV, Yao X, Doucet JP, Fan B, Hoonakker F, Fourches D, Jost P, Lachiche N, Varnek A (2006) Benchmarking of linear and nonlinear approaches for quantitative structure-property relationship studies of metal complexation with ionophores. J Chem Inf Model 46:808–819.
  30. Vapnik VN (1998) Statistical learning theory. Wiley, New York.
  31. Tetko IV, Poda GI (2004) Application of ALOGPS 2.1 to predict log D distribution coefficient for Pfizer proprietary compounds. J Med Chem 47:5601–5604.
  32. Tetko IV, Bruneau P (2004) Application of ALOGPS to predict 1-octanol/water distribution coefficients, logP, and logD, of AstraZeneca in-house database. J Pharm Sci 93:3103–3110.
  33. Tetko IV, Livingstone DJ (2007) Rule-based systems to predict lipophilicity. In: Testa B, van de Waterbeemd H (eds) Comprehensive medicinal chemistry II: in silico tools in ADMET, vol. 5. Elsevier, Oxford, UK, pp 649–668.
  34. Poda GI, Tetko IV, Rohrer DC (2005) Towards predictive ADME profiling of drug candidates: lipophilicity and solubility. In: 229th American Chemical Society national meeting and exposition, ACS, San Diego, CA, p. MEDI 514.
  35. Balakin KV, Savchuk NP, Tetko IV (2006) In silico approaches to prediction of aqueous and DMSO solubility of drug-like compounds: trends, problems and solutions. Curr Med Chem 13:223–241.
  36. Wilson EK (2005) Is safe exchange of data possible? Chem Eng News 83:24–29.
  37. Tetko IV, Abagyan R, Oprea TI (2005) Surrogate data—a secure way to share corporate data. J Comput Aided Mol Des 19:749–764.
  38. Tetko IV, Tanchuk VY (2005) ALOGPS (www.vcclab.org) is a free on-line program to predict lipophilicity and aqueous solubility of chemical compounds. In: 229th American Chemical Society national meeting and exposition, ACS, San Diego, CA, pp. U608–U608.
  39. Tetko IV (2005) Encoding molecular structures as ranks of models: a new secure way for sharing chemical data and development of ADME/T models. In: 229th American Chemical Society national meeting and exposition, San Diego, CA, pp. U602–U602.
  40. Tetko IV, Bruneau P, Mewes HW, Rohrer DC, Poda GI (2006) Can we estimate the accuracy of ADMET predictions? In: 232nd American Chemical Society national meeting, San Francisco.
  41. Tetko IV, Bruneau P, Mewes HW, Rohrer DC, Poda GI (2006) Can we estimate the accuracy of ADME-Tox predictions? Drug Discov Today 11:700–707.
  42. Tetko IV (2006) Estimation of applicability domain of a model for toxicity against T. pyriformis using ALOGPS logP. In: Workshop on ranking methods, Verbania, Italy, October 2–3.
  43. Kier LB, Hall LH (1990) An electrotopological-state index for atoms in molecules. Pharmaceutical Research 7:801–807.
  44. Kier LB, Hall LH (1999) Molecular structure description: the electrotopological state. Academic Press, London, p. 245.
  45. Hall LH, Kier LB (1995) Electrotopological state indices for atom types—a novel combination of electronic, topological, and valence state information. J Chem Inf Comput Sci 35:1039–1045.
  46. Jain N, Yalkowsky SH (2001) Estimation of the aqueous solubility I: application to organic nonelectrolytes. J Pharm Sci 90:234–252.
  47. Tetko IV, Gasteiger J, Todeschini R, Mauri A, Livingstone D, Ertl P, Palyulin VA, Radchenko EV, Zefirov NS, Makarenko AS, Tanchuk VY, Prokopenko VV (2005) Virtual computational chemistry laboratory—design and description. J Comput-Aided Mol Des 19:453–463.
  48. Tetko IV (2005) Computing chemistry on the web. Drug Discov Today 10:1497–1500.
  49. Tetko IV, Villa AE, Aksenova TI, Zielinski WL, Brower J, Collantes ER, Welsh WJ (1998) Application of a pruning algorithm to optimize artificial neural networks for pharmaceutical fingerprinting. J Chem Inf Comput Sci 38:660–668.
  50. Härdle W (1990) Smoothing techniques with implementation in S. Springer-Verlag, New York.

Copyright information

© Humana Press, a part of Springer Science + Business Media, LLC 2008

Authors and Affiliations

  1. GSF – National Research Centre for Environment and Health, Institute for Bioinformatics, Germany