Analytical and Bioanalytical Chemistry, Volume 389, Issue 4, pp 1267–1281

Benchmarking and validating algorithms that estimate pKa values of drugs based on their molecular structures

Original Paper

DOI: 10.1007/s00216-007-1502-x

Cite this article as:
Meloun, M. & Bordovská, S. Anal Bioanal Chem (2007) 389: 1267. doi:10.1007/s00216-007-1502-x


This report reviews the current status of computational tools for predicting the pKa values of organic drug-like compounds. The REGDIA regression diagnostics algorithm in S-Plus is introduced to examine the accuracy of pKa predictions made with four updated programs: PALLAS, MARVIN, ACD/pKa and SPARC. Outlying predicted pKa values correspond to molecules that are poorly characterized by the pKa prediction program concerned. The statistical detection of outliers can fail because of masking and swamping effects. The Williams graph was selected as giving the most reliable detection of outliers. Six statistical characteristics (\( F_{\text{exp}} \), \( R^{2} \), \( R^{2}_{\text{P}} \), MEP, AIC, and \( s(e) \) in pKa units) were examined for the results obtained by applying the four selected pKa prediction algorithms to three datasets. The highest values of \( F_{\text{exp}} \), \( R^{2} \) and \( R^{2}_{\text{P}} \), the lowest values of MEP and \( s(e) \), and the most negative AIC were found with the ACD/pKa algorithm, so this algorithm achieves the best predictive power and the most accurate results. The proposed accuracy test performed by the REGDIA program can also be applied to test the accuracy of other predicted values, such as log P, log D, aqueous solubility or certain physicochemical properties of drug molecules.
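To make the six statistical characteristics and the Williams graph more concrete, the following is a minimal Python sketch, not the authors' REGDIA program (which runs in S-Plus). It assumes that predicted pKa values are regressed on experimental ones with a straight line, and it uses commonly cited textbook definitions: MEP as the mean squared leave-one-out prediction error, AIC as n ln(RSS/n) + 2m, and the predicted coefficient of determination as 1 - n·MEP/TSS. The function names `regression_diagnostics` and `williams_flags`, the cut-off `t_cut`, and the leverage limit 2m/n are assumptions made for illustration only.

```python
import math


def regression_diagnostics(pka_exp, pka_pred):
    """Fit pka_pred = b0 + b1 * pka_exp by least squares; return fit statistics.

    Assumed definitions (not taken from the abstract itself):
    MEP = mean of squared leave-one-out errors, AIC = n*ln(RSS/n) + 2m.
    """
    n, m = len(pka_exp), 2  # m = number of regression parameters (b0, b1)
    x_mean = sum(pka_exp) / n
    y_mean = sum(pka_pred) / n
    sxx = sum((x - x_mean) ** 2 for x in pka_exp)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(pka_exp, pka_pred))
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean

    # Classical residuals and leverages (diagonal of the hat matrix)
    resid = [y - (b0 + b1 * x) for x, y in zip(pka_exp, pka_pred)]
    lev = [1.0 / n + (x - x_mean) ** 2 / sxx for x in pka_exp]

    rss = sum(e * e for e in resid)
    tss = sum((y - y_mean) ** 2 for y in pka_pred)
    r2 = 1.0 - rss / tss
    mep = sum((e / (1.0 - h)) ** 2 for e, h in zip(resid, lev)) / n

    # Jackknife (externally studentized) residuals: each point is scaled by
    # the residual variance estimated with that point left out.
    jack = []
    for e, h in zip(resid, lev):
        s2_deleted = (rss - e * e / (1.0 - h)) / (n - m - 1)
        jack.append(e / math.sqrt(s2_deleted * (1.0 - h)))

    return {
        "n": n, "m": m, "b0": b0, "b1": b1,
        "F_exp": (r2 / (m - 1)) / ((1.0 - r2) / (n - m)),
        "R2": r2,
        "R2_P": 1.0 - n * mep / tss,
        "MEP": mep,
        "AIC": n * math.log(rss / n) + 2 * m,
        "s_e": math.sqrt(rss / (n - m)),
        "leverage": lev,
        "jackknife": jack,
    }


def williams_flags(diag, t_cut=3.0):
    """Classify points as in a Williams graph (jackknife residual vs leverage).

    t_cut is a hypothetical fixed limit; a t-distribution quantile with
    n - m - 1 degrees of freedom would normally be used instead.
    """
    lev_cut = 2.0 * diag["m"] / diag["n"]
    outliers = [i for i, t in enumerate(diag["jackknife"]) if abs(t) > t_cut]
    high_lev = [i for i, h in enumerate(diag["leverage"]) if h > lev_cut]
    return {"outliers": outliers, "high_leverage": high_lev}


if __name__ == "__main__":
    # Hypothetical values for illustration only, not data from the paper
    exp = [2.0, 3.5, 4.1, 5.0, 6.2, 7.4, 8.0, 9.3]
    pred = [2.1, 3.4, 4.2, 5.1, 8.2, 7.3, 8.1, 9.2]
    d = regression_diagnostics(exp, pred)
    print({k: round(d[k], 3) for k in ("F_exp", "R2", "R2_P", "MEP", "AIC", "s_e")})
    print(williams_flags(d))
```

Under these assumptions, a prediction program with better accuracy would show larger F_exp, R2 and R2_P and smaller MEP, s_e and AIC on the same dataset. Because each jackknife residual excludes its own point from the variance estimate, a single gross error is less likely to be masked by its own influence on the fit.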


Keywords

pKa prediction; pKa accuracy; Dissociation constants; Outliers; Influential points; Residuals; Goodness-of-fit; Williams graph

Copyright information

© Springer-Verlag 2007

Authors and Affiliations

  1. Department of Analytical Chemistry, Faculty of Chemical Technology, Pardubice University, Pardubice, Czech Republic
