Abstract
In an earlier study (Didziapetris R & Lanevskij K (2016). J Comput Aided Mol Des. 30:1175–1188), we collected a database of publicly available hERG inhibition data for almost 6700 drug-like molecules and built a probabilistic Gradient Boosting classifier using a minimal set of physicochemical descriptors (log P, pKa, molecular size and topology parameters). This approach favored interpretability over statistical performance but still achieved an overall classification accuracy of 75%. In the current follow-up work, we expanded the database (provided in Supplementary Information) to almost 9400 molecules and performed temporal validation of the model on a set of novel chemicals from recently published lead optimization projects. Validation showed almost no performance degradation compared to the original study. Additionally, we rebuilt the model using the Accelerated Failure Time (AFT) learning objective in XGBoost, which accepts both the quantitative and the censored data often reported in protein inhibition studies. The new model achieved a similar accuracy in discerning hERG blockers from non-blockers at a 10 µM threshold, which can be regarded as close to the performance ceiling for methods that aim to describe only non-specific ligand interactions with hERG. Unlike the classifier, however, this model outputs quantitative potency values (IC50) and is not tied to a particular classification cut-off. pIC50 values from patch-clamp measurements can be predicted with R² ≈ 0.4 and MAE < 0.5, which enables ranking of ligands by their expected potency. The employed approach can be valuable for quantitative modeling of various ADME and drug safety endpoints with a high prevalence of censored data.
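To make the handling of censored potency data more concrete, the following is a minimal sketch of how reported IC50 values with qualifiers ("=", ">", "<") might be mapped to pIC50 interval bounds and scored with a censored likelihood. It is an illustration only and not the authors' implementation: the function names, the normal error distribution and the `sigma` value are assumptions made here. It uses the Python standard library and does not call XGBoost, whose `survival:aft` objective takes analogous lower and upper label bounds.

```python
import math


def pic50_from_um(ic50_um):
    """Convert an IC50 given in micromolar to pIC50 (-log10 of molar IC50)."""
    return 6.0 - math.log10(ic50_um)


def pic50_bounds(qualifier, ic50_um):
    """Map a reported IC50 to (lower, upper) bounds on pIC50.

    '=' is an exact measurement, '>' means IC50 lies above the tested
    range (so pIC50 lies below the converted value), and '<' is the reverse.
    """
    p = pic50_from_um(ic50_um)
    if qualifier == "=":
        return (p, p)
    if qualifier == ">":
        return (-math.inf, p)
    if qualifier == "<":
        return (p, math.inf)
    raise ValueError(f"unknown qualifier: {qualifier!r}")


def _norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def censored_nll(lower, upper, pred, sigma=1.0):
    """Negative log-likelihood of one label under Normal(pred, sigma).

    Exact labels use the density; censored labels use the probability
    mass of the interval (lower, upper). sigma is an assumed noise scale.
    """
    if lower == upper:
        z = (lower - pred) / sigma
        density = math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
        return -math.log(density)
    cdf_hi = 1.0 if upper == math.inf else _norm_cdf((upper - pred) / sigma)
    cdf_lo = 0.0 if lower == -math.inf else _norm_cdf((lower - pred) / sigma)
    # Clamp to avoid log(0) for predictions far outside the interval.
    return -math.log(max(cdf_hi - cdf_lo, 1e-12))


def is_blocker(pic50, threshold_um=10.0):
    """Classify a predicted pIC50 against an IC50 cut-off in micromolar."""
    return pic50 >= pic50_from_um(threshold_um)
```

For a record reported as IC50 > 30 µM, `pic50_bounds(">", 30.0)` yields an interval that is unbounded from below, and `censored_nll` then penalizes only predictions that place pIC50 well above the converted limit. Because the model output is a continuous pIC50, `is_blocker` can be evaluated at any cut-off, not only 10 µM.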
Data availability
All datasets collected and analyzed during this study are available in supplementary information.
References
Vandenberg JI, Perry MD, Perrin MJ, Mann SA, Ke Y, Hill AP (2012) hERG K(+) Channels: structure, function, and clinical significance. Physiol Rev 92:1393–1478
Kalyaanamoorthy S, Barakat KH (2018) Development of safe drugs: the hERG challenge. Med Res Rev 38:525–555. https://doi.org/10.1002/med.21445
Garrido A, Lepailleur A, Mignani SM, Dallemagne P, Rochais C (2020) hERG Toxicity assessment: useful guidelines for drug design. Eur J Med Chem 195:112290. https://doi.org/10.1016/j.ejmech.2020.112290
Lester RM (2021) Update on ICH E14/S7B cardiac safety regulations: the expanded role of preclinical assays and the “double-negative” scenario. Clin Pharmacol Drug Dev 10:964–973. https://doi.org/10.1002/cpdd.1003
Kratz JM, Grienke U, Scheel O, Mann SA, Rollinger JM (2017) Natural products modulating the hERG channel: heartaches and hope. Nat Prod Rep 34:957–980. https://doi.org/10.1039/c7np00014f
Villoutreix BO, Taboureau O (2015) Computational investigations of hERG channel blockers: new insights and current predictive models. Adv Drug Deliv Rev 86:72–82. https://doi.org/10.1016/j.addr.2015.03.003
Krishna S, Borrel A, Huang R, Zhao J, Xia M, Kleinstreuer N (2022) High-Throughput Chemical Screening and Structure-Based Models to Predict hERG Inhibition. Biology (Basel) 11:209. https://doi.org/10.3390/biology11020209
Rácz A, Bajusz D, Miranda-Quintana RA, Héberger K (2021) Machine learning models for classification tasks related to drug safety. Mol Divers 25:1409–1424. https://doi.org/10.1007/s11030-021-10239-x
Braga RC, Alves VM, Silva MFB, Muratov E, Fourches D, Lião LM, Tropsha A, Andrade CH (2015) Pred-hERG: a novel web-accessible computational tool for predicting cardiac toxicity. Mol Inform 34:698–701. https://doi.org/10.1002/minf.201500040
Siramshetty VB, Chen Q, Devarakonda P, Preissner R (2018) The catch-22 of predicting hERG blockade using publicly accessible bioactivity data. J Chem Inform Model 58:1224–1233. https://doi.org/10.1021/acs.jcim.8b00150
Cai C, Guo P, Zhou Y, Zhou J, Wang Q, Zhang F, Fang J, Cheng F (2019) Deep learning-based prediction of drug-induced cardiotoxicity. J Chem Inform Model 59:1073–1084. https://doi.org/10.1021/acs.jcim.8b00769
Vachal P, Duffy JL, Campeau L-C, Amin RP, Mitra K, Murphy BA, Shao PP, Sinclair PJ, Ye F, Katipally R, Lu Z, Ondeyka D, Chen Y-H, Zhao K, Sun W, Tyagarajan S, Bao J, Wang S-P, Cote J, Lipardi C, Metzger D, Leung D, Hartmann G, Wollenberg GK, Liu J, Tan L, Xu Y, Chen Q, Liu G, Blaustein RO, Johns DG (2021) Invention of MK-8262, a cholesteryl ester transfer protein (CETP) inhibitor backup to anacetrapib with best-in-class properties. J Med Chem 64:13215–13258. https://doi.org/10.1021/acs.jmedchem.1c00959
van Veldhoven JPD, Campostrini G, van Gessel CJE, Ward-van Oostwaard D, Liu R, Mummery CL, Bellin M, IJzerman AP (2021) Targeting the Kv11.1 (hERG) channel with allosteric modulators. Synthesis and biological evaluation of three novel series of LUF7346 derivatives. Eur J Med Chem 212:113033. https://doi.org/10.1016/j.ejmech.2020.113033
Leung KM, Elashoff RM, Afifi AA (1997) Censoring issues in survival analysis. Annu Rev Public Health 18:83–104. https://doi.org/10.1146/annurev.publhealth.18.1.83
Wei LJ (1992) The accelerated failure time model: a useful alternative to the cox regression model in survival analysis. Stat Med 11:1871–1879. https://doi.org/10.1002/sim.4780111409
Sheridan RP (2012) Three useful dimensions for domain applicability in QSAR models using random forest. J Chem Inform Model 52:814–823. https://doi.org/10.1021/ci300004n
Sheridan RP, Wang WM, Liaw A, Ma J, Gifford EM (2016) Extreme gradient boosting as a method for quantitative structure-activity relationships. J Chem Inform Model 56:2353–2360. https://doi.org/10.1021/acs.jcim.6b00591
Didziapetris R, Lanevskij K (2016) Compilation and physicochemical classification analysis of a diverse hERG inhibition database. J Comput Aided Mol Des 30:1175–1188. https://doi.org/10.1007/s10822-016-9986-0
Chen X-L, Kang J, Rampe D (2011) Manual whole-cell patch-clamping of the HERG cardiac K+ channel. Methods Mol Biol 691:151–163. https://doi.org/10.1007/978-1-60761-849-2_9
Sorota S, Zhang X-S, Margulis M, Tucker K, Priestley T (2005) Characterization of a hERG screen using the IonWorks HT: comparison to a hERG rubidium efflux screen. Assay Drug Dev Technol 3:47–57. https://doi.org/10.1089/adt.2005.3.47
Schupp M, Park SH, Qian B, Yu W (2020) Electrophysiological studies of GABAA receptors using QPatch II, the next generation of automated patch-clamp instruments. Curr Protoc Pharmacol 89:e75. https://doi.org/10.1002/cpph.75
Murphy SM, Palmer M, Poole MF, Padegimas L, Hunady K, Danzig J, Gill S, Gill R, Ting A, Sherf B, Brunden K, Stricker-Krongrad A (2006) Evaluation of functional and binding assays in cells expressing either recombinant or endogenous hERG channel. J Pharmacol Toxicol Methods 54:42–55. https://doi.org/10.1016/j.vascn.2005.10.003
Chiu PJS, Marcoe KF, Bounds SE, Lin C-H, Feng J-J, Lin A, Cheng F-C, Crumb WJ, Mitchell R (2004) Validation of a [3H]astemizole binding assay in HEK293 cells expressing HERG K+ channels. J Pharmacol Sci 95:311–319
Raab CE, Butcher JW, Connolly TM, Karczewski J, Yu NX, Staskiewicz SJ, Liverton N, Dean DC, Melillo DG (2006) Synthesis of the first sulfur-35-labeled hERG radioligand. Bioorg Med Chem Lett 16:1692–1695. https://doi.org/10.1016/j.bmcl.2005.12.021
Ertl P, Rohde B, Selzer P (2000) Fast calculation of molecular polar surface area as a sum of fragment-based contribution and its application to the prediction of drug transport properties. J Med Chem 43:3714–3717
Veber DF, Johnson SR, Cheng H-Y, Smith BR, Ward KW, Kopple KD (2002) Molecular properties that influence the oral bioavailability of drug candidates. J Med Chem 45:2615–2623
Choi K-H, Song C, Shin D, Park S (2011) hERG channel blockade by externally applied quaternary ammonium derivatives. Biochim Biophys Acta 1808:1560–1566. https://doi.org/10.1016/j.bbamem.2011.02.008
Bridgland-Taylor MH, Hargreaves AC, Easter A, Orme A, Henthorn DC, Ding M, Davis AM, Small BG, Heapy CG, Abi-Gerges N, Persson F, Jacobson I, Sullivan M, Albertson N, Hammond TG, Sullivan E, Valentin J-P, Pollard CE (2006) Optimisation and validation of a medium-throughput electrophysiology-based hERG assay using IonWorks HT. J Pharmacol Toxicol Methods 54:189–199. https://doi.org/10.1016/j.vascn.2006.02.003
Chen T, Guestrin C (2016) XGBoost: A Scalable Tree Boosting System. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp 785–794
Barnwal A, Cho H, Hocking T (2022) Survival regression with accelerated failure time model in XGBoost. J Comput Graph Stat. https://doi.org/10.1080/10618600.2022.2067548
McFadden D (1974) Conditional logit analysis of qualitative choice behavior. In: Zarembka P (ed) Frontiers in econometrics. Academic Press, New York, pp 105–142
Cox DR, Snell EJ (1989) Analysis of binary data, 2nd edn. Chapman and Hall, London, UK
Veall MR, Zimmermann KF (1994) Evaluating Pseudo-R2’s for binary probit models. Qual Quant 28:151–164. https://doi.org/10.1007/BF01102759
Hemmert GAJ, Schons LM, Wieseke J, Schimmelpfennig H (2018) Log-likelihood-based Pseudo-R2 in logistic regression: deriving sample-sensitive benchmarks. Sociol Methods Res 47:507–531. https://doi.org/10.1177/0049124116638107
Powers D (2011) Evaluation: from precision, recall and F-measure to ROC, informedness, markedness & correlation. J Mach Learn Technol 2:37–63
Chicco D, Jurman G (2020) The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics 21:6. https://doi.org/10.1186/s12864-019-6413-7
Fawcett T (2006) An introduction to ROC analysis. Pattern Recogn Lett 27:861–874. https://doi.org/10.1016/j.patrec.2005.10.010
Viera AJ, Garrett JM (2005) Understanding interobserver agreement: the kappa statistic. Fam Med 37:360–363
Czodrowski P (2014) Count on kappa. J Comput Aided Mol Des 28:1049–1055. https://doi.org/10.1007/s10822-014-9759-6
Japertas P, Didziapetris R, Petrauskas A (2002) Fragmental methods in the design of new compounds. Applications of the advanced algorithm builder. Quant Struct-Act Relat 21:23–37
ACD/Labs (2021) Percepta v. 2021.2. Advanced chemistry development, Inc., Toronto, Ontario, Canada. https://www.acdlabs.com/products/percepta-platform/. Accessed on 22 July 2022
Bezanson J, Edelman A, Karpinski S, Shah VB (2017) Julia: a fresh approach to numerical computing. SIAM Rev 59:65–98. https://doi.org/10.1137/141000671
R Core Team (2021) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.r-project.org/. Accessed on 22 July 2022
Jones DC, Arthur B, Nagy T, Mattriks GS, Godisemo, Holy T, Noack A, Sengupta A, Darakananda D, Dunning BA, Leblanc I, Huijzer S, Fischer R, Chudzicki K, Piibeleht D, Mellnik M, Kleinschmidt A, Breloff D, Yu T, Huchette Y, Innes J, inkyu MJ, Verzani J, Pelenitsyn A, Coalson C, O’Mara C, Saba E (2021) GiovineItalia/Gadfly.jl: v1.3.4. https://zenodo.org/record/5559613
Doak BC, Kihlberg J (2017) Drug discovery beyond the rule of 5—opportunities and challenges. Expert Opin Drug Discov 12:115–119. https://doi.org/10.1080/17460441.2017.1264385
DeGoey DA, Chen H-J, Cox PB, Wendt MD (2018) Beyond the rule of 5: lessons learned from AbbVie’s drugs and compound collection. J Med Chem 61:2636–2651. https://doi.org/10.1021/acs.jmedchem.7b00717
Egbert M, Whitty A, Keserű GM, Vajda S (2019) Why some targets benefit from beyond rule of five drugs. J Med Chem 62:10005–10025. https://doi.org/10.1021/acs.jmedchem.8b01732
Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez J-C, Müller M (2011) pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 12:77. https://doi.org/10.1186/1471-2105-12-77
Jamieson C, Moir EM, Rankovic Z, Wishart G (2006) Medicinal chemistry of hERG optimizations: Highlights and hang-ups. J Med Chem 49:5029–5046. https://doi.org/10.1021/jm060379l
Aliagas I, Gobbi A, Heffron T, Lee M-L, Ortwine DF, Zak M, Khojasteh SC (2015) A probabilistic method to report predictions from a human liver microsomes stability QSAR model: a practical tool for drug discovery. J Comput Aided Mol Des 29:327–338. https://doi.org/10.1007/s10822-015-9838-3
Göller AH, Kuhnke L, Montanari F, Bonin A, Schneckener S, ter Laak A, Wichard J, Lobell M, Hillisch A (2020) Bayer’s in silico ADMET platform: a journey of machine learning over the past two decades. Drug Discov Today 25:1702–1709. https://doi.org/10.1016/j.drudis.2020.07.001
Molnar C (2022) Global model-agnostic methods. In: Interpretable machine learning, 2nd ed. https://christophm.github.io/interpretable-ml-book/global-methods.html. Accessed on 22 July 2022
Kirsch GE, Trepakova ES, Brimecombe JC, Sidach SS, Erickson HD, Kochan MC, Shyjka LM, Lacerda AE, Brown AM (2004) Variability in the measurement of hERG potassium channel inhibition: effects of temperature and stimulus pattern. J Pharmacol Toxicol Methods 50:93–101. https://doi.org/10.1016/j.vascn.2004.06.003
Rajamani S, Anderson CL, Anson BD, January CT (2002) Pharmacological rescue of human K(+) channel long-QT2 mutations: human ether-a-go-go-related gene rescue without block. Circulation 105:2830–2835. https://doi.org/10.1161/01.cir.0000019513.50928.74
Lipinski CA, Lombardo F, Dominy BW, Feeney PJ (2001) Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv Drug Deliv Rev 46:3–26
Siramshetty VB, Nguyen D-T, Martinez NJ, Southall NT, Simeonov A, Zakharov AV (2020) Critical assessment of artificial intelligence methods for prediction of hERG channel inhibition in the “Big Data” era. J Chem Inform Model 60:6007–6019. https://doi.org/10.1021/acs.jcim.0c00884
Funding
No funding was received to assist with the preparation of this manuscript.
Author information
Contributions
All authors contributed to the study conception and design. RD collected and curated literature data. KL performed statistical modeling and visualization. All authors were involved in the analysis and interpretation of the results. KL wrote the first draft of the manuscript. RD and AS reviewed the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors have no competing interests to declare that are relevant to the content of this article.
About this article
Cite this article
Lanevskij, K., Didziapetris, R. & Sazonovas, A. Physicochemical QSAR analysis of hERG inhibition revisited: towards a quantitative potency prediction. J Comput Aided Mol Des 36, 837–849 (2022). https://doi.org/10.1007/s10822-022-00483-0
DOI: https://doi.org/10.1007/s10822-022-00483-0