Journal of Computer-Aided Molecular Design, Volume 23, Issue 7, pp 419–429

Machine learning of chemical reactivity from databases of organic reactions

  • Gonçalo V. S. M. Carrera
  • Sunil Gupta
  • João Aires-de-Sousa

DOI: 10.1007/s10822-009-9275-2

Cite this article as:
Carrera, G.V.S.M., Gupta, S. & Aires-de-Sousa, J. J Comput Aided Mol Des (2009) 23: 419. doi:10.1007/s10822-009-9275-2

Abstract

Databases of chemical reactions contain knowledge about the reactivity of specific reagents. Although information is generally available explicitly only for compounds reported to react, information can also be derived about substructures that do not react in the reported reactions. Both types of information (positive and negative) can be used to train machine learning techniques to predict whether a compound reacts with a specific reagent. The whole process was implemented with two databases of reactions, one involving BuNH2 as the reagent and the other NaCNBH3. Negative information was derived using MOLMAP molecular descriptors, and classification models were developed with Random Forests, also based on MOLMAP descriptors. The MOLMAP descriptors were based exclusively on calculated physicochemical features of molecules. Correct predictions were achieved for ∼90% of independent test sets. While NaCNBH3 is a selective reducing reagent widely used in organic synthesis, BuNH2 is a nucleophile that mimics the reactivity of the lysine side chain (involved in an initiating step of the mechanism leading to skin sensitization).
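
The abstract does not spell out how a MOLMAP descriptor is built. As a rough illustration only, the sketch below assumes that each atom's property vector is assigned to the closest neuron of an already trained self-organizing map and that the descriptor is the flattened grid of per-neuron counts. The function names, the 2×2 grid, its weights, and the atom values are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: the map weights and atom properties are made-up
# values, not data or parameters from the paper.
from math import dist


def best_matching_unit(weights, vector):
    """Return (row, col) of the neuron whose weight vector is closest."""
    best, best_d = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = dist(w, vector)  # Euclidean distance
            if d < best_d:
                best, best_d = (r, c), d
    return best


def molmap_like_descriptor(weights, atom_vectors):
    """Count the atoms landing on each neuron; flatten the grid row by row."""
    n_cols = len(weights[0])
    counts = [0] * (len(weights) * n_cols)
    for v in atom_vectors:
        r, c = best_matching_unit(weights, v)
        counts[r * n_cols + c] += 1
    return counts


# Hypothetical 2x2 "trained" map over two atom-level properties.
TOY_WEIGHTS = [
    [(0.0, 0.0), (0.0, 1.0)],
    [(1.0, 0.0), (1.0, 1.0)],
]
# Hypothetical molecule described by three atoms.
toy_atoms = [(0.1, 0.1), (0.9, 0.8), (0.2, 0.0)]

descriptor = molmap_like_descriptor(TOY_WEIGHTS, toy_atoms)
print(descriptor)
```

A fixed-length vector of this kind is what a classifier such as a Random Forest could take as input, whatever the size of the molecule. The actual descriptor definition and map training used by the authors may differ from this simplification.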

Keywords

MOLMAP, Chemical reactivity, Databases, Machine learning, Electrophilicity

Abbreviations

  • MOLMAP: MOLecular maps of atom-level properties
  • BuNH2: Butylamine
  • RF: Random forest
  • VOC: Volatile organic compounds
  • QSAR: Quantitative structure activity relationship
  • OOB: Out of bag
  • SVM: Support vector machines
  • ROC: Receiver operating characteristic
  • SOM: Self organizing maps
  • HTS: High-throughput screening

Supplementary material

Supplementary material 1: 10822_2009_9275_MOESM1_ESM.pdf (PDF, 80 kb)

Copyright information

© Springer Science+Business Media B.V. 2009

Authors and Affiliations

  • Gonçalo V. S. M. Carrera (1)
  • Sunil Gupta (1)
  • João Aires-de-Sousa (1)

  1. REQUIMTE, CQFB, Departamento de Química, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Caparica, Portugal
