
Assessment of tautomer distribution using the condensed reaction graph approach

Journal of Computer-Aided Molecular Design

Abstract

We report the first direct QSPR modeling of equilibrium constants of tautomeric transformations (logK_T) in different solvents and at different temperatures, which does not require intermediate assessment of acidity (basicity) constants for all tautomeric forms. The key step of the modeling is merging the two tautomers into a single molecular graph (the “condensed reaction graph”), which makes it possible to compute molecular descriptors characterizing the entire equilibrium. Support vector regression was used to build the models. The training set consisted of 785 transformations belonging to 11 types of tautomeric reactions, with equilibrium constants measured in different solvents and at different temperatures. The models perform well both in cross-validation (Q² = 0.81, RMSE = 0.7 logK_T units) and on two external test sets. Benchmarking studies demonstrate that our models outperform results obtained with DFT B3LYP/6-311++G(d,p) and with ChemAxon Tautomerizer, which is applicable only in water at room temperature.
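The idea of merging two tautomers into one graph can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each tautomer is given as an atom-mapped bond table (a dict from a pair of atom-map numbers to a bond order), and the names `condensed_graph`, `dynamic_bonds`, and `label_counts` are hypothetical. Each edge of the merged graph carries a pair of bond orders (order in the first tautomer, order in the second), with 0 meaning the bond is absent.

```python
from collections import Counter

def condensed_graph(bonds_a, bonds_b):
    """Merge two atom-mapped bond tables into one edge-labelled graph.

    Keys are (i, j) tuples of atom-map numbers with i < j; values are bond
    orders. The result maps each edge to (order_in_a, order_in_b), using 0
    where the bond does not exist in that tautomer.
    """
    edges = set(bonds_a) | set(bonds_b)
    return {e: (bonds_a.get(e, 0), bonds_b.get(e, 0)) for e in sorted(edges)}

def dynamic_bonds(cgr):
    """Return only the edges whose bond order differs between the tautomers."""
    return {e: orders for e, orders in cgr.items() if orders[0] != orders[1]}

def label_counts(cgr):
    """Count edge labels such as '1>2'; an illustrative stand-in for descriptors."""
    return Counter(f"{a}>{b}" for a, b in cgr.values())

# Assumed example: keto-enol tautomerism of acetaldehyde, CH3-CH=O <-> CH2=CH-OH.
# Atom map: 1 = methyl C, 2 = carbonyl C, 3 = O, 4 = migrating H, 5 = spectator H.
keto = {(1, 2): 1, (2, 3): 2, (1, 4): 1, (1, 5): 1}
enol = {(1, 2): 2, (2, 3): 1, (3, 4): 1, (1, 5): 1}

cgr = condensed_graph(keto, enol)
for edge, orders in cgr.items():
    print(edge, orders)
print(dynamic_bonds(cgr))
print(label_counts(cgr))
```

In this toy example the unchanged C-H bond keeps the label (1, 1), while the four bonds involved in the proton transfer carry differing pairs. Descriptors computed from such labelled edges describe both sides of the equilibrium at once, which is the property the abstract relies on.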



Acknowledgements

This study was supported by the Russian Science Foundation, Grant No. 14-43-00024. TG thanks the IDEX UniStra program for the fellowship.

Author information

Corresponding author

Correspondence to A. Varnek.

Electronic supplementary material

The Supporting Information contains the parameters of the 10 best SVR models selected for consensus calculation, the outlier analysis, the tautomeric transformations that could not be generated using the ChemAxon tautomerization plugin, technical information about CGR creation, the SDF format for CGR storage, and the results of DFT calculations for the external datasets. The training set and the two test sets used in the modeling are available in RDF format in the Supporting Information. (DOCX 1542 KB)

Supplementary material 2. (RDF 48 KB)

Supplementary material 3. (RDF 62 KB)

Supplementary material 4. (RDF 2259 KB)

Cite this article

Gimadiev, T.R., Madzhidov, T.I., Nugmanov, R.I. et al. Assessment of tautomer distribution using the condensed reaction graph approach. J Comput Aided Mol Des 32, 401–414 (2018). https://doi.org/10.1007/s10822-018-0101-6

  • DOI: https://doi.org/10.1007/s10822-018-0101-6
