Abstract
Macroscopic pKa values were calculated for all compounds in the SAMPL6 blind prediction challenge, based on quantum chemical calculations with a continuum solvation model and a linear correction derived from a small training set. Microscopic pKa values were derived from the gas-phase free energy difference between protonated and deprotonated forms together with the Conductor-like Polarizable Continuum Solvation Model and the experimental solvation free energy of the proton. pH-dependent microstate free energies were obtained from the microscopic pKas with a maximum likelihood estimator and appropriately summed to yield macroscopic pKa values or microstate populations as a function of pH. We assessed the accuracy of three approaches to calculating the microscopic pKas: direct use of the quantum mechanical (QM) free energy differences, and correction of the direct values for shortcomings in the QM solvation model with two different linear models that we independently derived from a small training set of 38 compounds with known pKa. The predictions corrected with the linear models were much more accurate [root-mean-square error (RMSE) 2.04 and 1.95 pKa units] than the direct calculation (RMSE 3.74). Statistical measures indicate that some systematic errors remain, likely due to differences between the SAMPL6 data set and the small training set with respect to their interactions with water. Overall, the current approach provides a viable physics-based route to estimate macroscopic pKa values for novel compounds with reasonable accuracy.
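The workflow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the linear correction is assumed to be an ordinary least-squares line fitted with `numpy.polyfit`, and the macroscopic pKa is shown only for the simplest case of one protonated microstate that can lose its proton to several deprotonated microstates, where the microscopic acidity constants add. The paper's maximum likelihood estimator for microstate free energies is not reproduced here.

```python
import numpy as np

R_KCAL = 1.98720425864e-3  # gas constant in kcal/(mol K)


def pka_from_dg(dg_aq_kcal, temperature=298.15):
    """Convert an aqueous deprotonation free energy (kcal/mol) to a pKa.

    Uses pKa = dG / (R T ln 10); dG is assumed to already include the
    solvation terms and the proton solvation free energy.
    """
    return dg_aq_kcal / (R_KCAL * temperature * np.log(10.0))


def fit_linear_correction(pka_direct, pka_experiment):
    """Least-squares fit of pKa_exp ~ slope * pKa_direct + intercept.

    Assumption: a plain straight-line fit stands in for the linear
    models mentioned in the abstract.
    """
    slope, intercept = np.polyfit(pka_direct, pka_experiment, deg=1)
    return float(slope), float(intercept)


def macro_pka_from_micro(micro_pkas):
    """Macroscopic pKa for one protonated state with several deprotonated
    microstates: K_macro is the sum of the microscopic K_i."""
    micro_pkas = np.asarray(micro_pkas, dtype=float)
    return float(-np.log10(np.sum(10.0 ** (-micro_pkas))))


def rmse(predicted, reference):
    """Root-mean-square error between predictions and reference values."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((predicted - reference) ** 2)))


if __name__ == "__main__":
    # Invented numbers for illustration only; not data from the paper.
    direct = np.array([2.0, 6.5, 11.0, 15.5])
    experiment = np.array([3.1, 5.2, 7.4, 9.3])
    slope, intercept = fit_linear_correction(direct, experiment)
    corrected = slope * direct + intercept
    print("RMSE direct:   ", rmse(direct, experiment))
    print("RMSE corrected:", rmse(corrected, experiment))
    print("macro pKa:     ", macro_pka_from_micro([4.8, 5.3]))
```

In this sketch the correction is fitted and evaluated on the same toy values, so the corrected RMSE only shows the mechanics. A real assessment would fit on a training set and evaluate on separate compounds, as the abstract describes.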
Acknowledgements
Research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Number R01GM118772 (to OB). BII was supported in part by Grants ANR-10-LABX-33 (LabEx LERMIT) and ANR-14-JAMR-0002-03 (JPIAMR) from the French National Research Agency (ANR), and by a Grant DIM MAL-INF from the Région Ile-de-France.
Cite this article
Selwa, E., Kenney, I.M., Beckstein, O. et al. SAMPL6: calculation of macroscopic pKa values from ab initio quantum mechanical free energies. J Comput Aided Mol Des 32, 1203–1216 (2018). https://doi.org/10.1007/s10822-018-0138-6
DOI: https://doi.org/10.1007/s10822-018-0138-6