Journal of Computer-Aided Molecular Design, Volume 32, Issue 10, pp 1203–1216

SAMPL6: calculation of macroscopic pKa values from ab initio quantum mechanical free energies

  • Edithe Selwa
  • Ian M. Kenney
  • Oliver Beckstein
  • Bogdan I. Iorga


Macroscopic pKa values were calculated for all compounds in the SAMPL6 blind prediction challenge, based on quantum chemical calculations with a continuum solvation model and a linear correction derived from a small training set. Microscopic pKa values were derived from the gas-phase free energy difference between protonated and deprotonated forms together with the Conductor-like Polarizable Continuum Solvation Model and the experimental solvation free energy of the proton. pH-dependent microstate free energies were obtained from the microscopic pKas with a maximum likelihood estimator and appropriately summed to yield macroscopic pKa values or microstate populations as a function of pH. We assessed the accuracy of three approaches to calculating the microscopic pKas: direct use of the quantum mechanical (QM) free energy differences, and correction of the direct values for shortcomings in the QM solvation model with either of two linear models that we independently derived from a small training set of 38 compounds with known pKa values. The predictions that were corrected with the linear models were much more accurate [root-mean-square error (RMSE) 2.04 and 1.95 pKa units] than the direct calculation (RMSE 3.74 pKa units). Statistical measures indicate that some systematic errors remain, likely due to differences between the SAMPL6 data set and the small training set with respect to their interactions with water. Overall, the current approach provides a viable physics-based route to estimate macroscopic pKa values for novel compounds with reasonable accuracy.
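The workflow summarized above can be illustrated with a minimal sketch in Python. It is not the authors' implementation: all function names, the four-state example molecule, and its microscopic pKa values are hypothetical, and the maximum likelihood step is stood in for by an ordinary least-squares fit (which coincides with maximum likelihood only under the assumption of independent Gaussian errors of equal variance). Free energies of microstates are expressed in pK units at pH 0, and the coefficients of the linear correction are placeholders to be supplied by the user.

```python
import numpy as np

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def micro_pka(delta_g_kcal, temperature=298.15):
    """Microscopic pKa from the aqueous deprotonation free energy (kcal/mol)."""
    return delta_g_kcal / (R_KCAL * temperature * np.log(10.0))


def linear_correction(pka_direct, slope, intercept):
    """Apply a linear model; slope and intercept are user-supplied placeholders."""
    return slope * pka_direct + intercept


def microstate_free_energies(n_states, transitions):
    """Least-squares microstate free energies in pK units at pH 0.

    transitions: list of (i, j, pka), meaning state i loses one proton to
    give state j with microscopic pKa `pka`, i.e. g[j] - g[i] = pka.
    State 0 is pinned to g = 0 as the reference.
    """
    rows, rhs = [], []
    for i, j, pka in transitions:
        row = np.zeros(n_states)
        row[j], row[i] = 1.0, -1.0
        rows.append(row)
        rhs.append(pka)
    ref = np.zeros(n_states)  # extra equation g[0] = 0 fixes the offset
    ref[0] = 1.0
    rows.append(ref)
    rhs.append(0.0)
    g, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return g


def macro_pka(g, n_protons, n):
    """Macroscopic pKa for the step from n to n - 1 bound protons."""
    g = np.asarray(g, dtype=float)
    n_protons = np.asarray(n_protons)
    more = np.sum(10.0 ** (-g[n_protons == n]))
    fewer = np.sum(10.0 ** (-g[n_protons == n - 1]))
    return -np.log10(fewer / more)


def populations(g, n_protons, ph):
    """Microstate populations at a given pH (weights 10**(-g - n*pH))."""
    expo = -(np.asarray(g, dtype=float) + np.asarray(n_protons) * ph)
    w = 10.0 ** (expo - expo.max())  # shift exponents for numerical stability
    return w / w.sum()


# Hypothetical diprotic molecule: state 0 = H2A, states 1 and 2 = HA
# tautomers, state 3 = A. The microscopic pKa values are made up.
n_protons = np.array([2, 1, 1, 0])
transitions = [(0, 1, 4.0), (0, 2, 5.0), (1, 3, 9.0), (2, 3, 8.0)]
g = microstate_free_energies(4, transitions)
pka1 = macro_pka(g, n_protons, 2)
pka2 = macro_pka(g, n_protons, 1)
print(f"macroscopic pKa values: {pka1:.2f}, {pka2:.2f}")
```

For this made-up example the two macroscopic pKa values come out at about 3.96 and 9.04. Because the two tautomers share the first deprotonation step, the first macroscopic pKa lies slightly below the lower microscopic value of 4.0. When the microscopic pKas around a thermodynamic cycle are not exactly consistent, the least-squares fit distributes the discrepancy over all transitions.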


Keywords: pKa, pH, Quantum chemistry, SAMPL challenge



Research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Number R01GM118772 (to OB). BII was supported in part by Grants ANR-10-LABX-33 (LabEx LERMIT) and ANR-14-JAMR-0002-03 (JPIAMR) from the French National Research Agency (ANR), and by a DIM MAL-INF grant from the Région Ile-de-France.




Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Institut de Chimie des Substances Naturelles, CNRS UPR 2301, Université Paris-Saclay, Labex LERMIT, Gif-sur-Yvette, France
  2. Department of Physics, Arizona State University, Tempe, USA
  3. Center for Biological Physics, Arizona State University, Tempe, USA
