Applicability Domain: A Step Toward Confident Predictions and Decidability for QSAR Modeling

Kar, Supratik; Roy, Kunal; Leszczynski, Jerzy

doi:10.1007/978-1-4939-7899-1_6

Part of the book series: Methods in Molecular Biology (MIMB, volume 1800)

Abstract

In human safety assessment through quantitative structure–activity relationship (QSAR) modeling, the applicability domain (AD) plays a central role. Principle 3 of the Organisation for Economic Co-operation and Development (OECD) principles for QSAR model validation recommends that a predictive QSAR model have “a defined domain of applicability.” Studying the AD makes it possible to estimate the uncertainty in the prediction for a particular molecule from how similar it is to the training compounds used to develop the model. AD is currently an active research topic, and many methods have been designed to estimate the competence of a model and the confidence in its outcome for a given prediction task. Characterization of the interpolation space is thus significant in defining the AD. The reported AD methods are diverse and were constructed from different hypotheses and algorithms. This multiplicity of methodologies confuses end users and makes comparing the AD of different models a complex issue. In this chapter we summarize the important concepts of AD, including details of the available methods for computing the AD along with their thresholds and criteria for estimating AD through training set interpolation in the descriptor space. The ideas of transparent domain and decision domain are also discussed. To help readers determine the AD in their own projects, practical examples and available open source software tools are provided.
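As an illustration of what a threshold on training set interpolation in descriptor space can look like, the sketch below uses the widely known leverage approach with the conventional warning threshold h* = 3(p + 1)/n, where p is the number of descriptors and n the number of training compounds. This is only one common AD criterion and is not claimed to be the specific procedure described in the chapter; the function name `leverage_ad` and the random descriptor matrix are hypothetical and serve illustration only.

```python
import numpy as np


def leverage_ad(X_train, X_query):
    """Leverage-based AD check (illustrative sketch, not the chapter's code).

    Returns the leverage of each query compound, the warning threshold
    h* = 3(p + 1)/n, and a boolean mask marking queries with h <= h*.
    """
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    n, p = X_train.shape

    # Prepend an intercept column, matching a linear model with a constant term.
    Xt = np.hstack([np.ones((n, 1)), X_train])
    Xq = np.hstack([np.ones((X_query.shape[0], 1)), X_query])

    # Pseudo-inverse of X^T X; tolerates collinear descriptors.
    xtx_inv = np.linalg.pinv(Xt.T @ Xt)

    # Leverage of each query row: h_i = x_i^T (X^T X)^-1 x_i
    h = np.einsum("ij,jk,ik->i", Xq, xtx_inv, Xq)

    h_star = 3.0 * (p + 1) / n
    return h, h_star, h <= h_star


# Hypothetical data: 50 training compounds described by 3 descriptors.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 3))

# One query near the centre of the training data, one far outside it.
X_query = np.array([[0.0, 0.0, 0.0],
                    [10.0, 10.0, 10.0]])

h, h_star, inside = leverage_ad(X_train, X_query)
for h_i, ok in zip(h, inside):
    print(f"leverage={h_i:.3f}  threshold={h_star:.3f}  inside AD: {ok}")
```

A query with leverage above h* lies far from the centroid of the training descriptors, so its prediction is an extrapolation and should be treated with reduced confidence. Other AD criteria (range-based, distance-based, probability-density-based) would replace the leverage computation but keep the same pattern of a similarity measure compared against a threshold derived from the training set.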

Acknowledgments

S.K. and J.L. thank the National Science Foundation (NSF/CREST HRD-1547754, and NSF/RISE HRD-1547836) for financial support. K.R. is thankful to the UGC, New Delhi for financial assistance under the UPE II scheme.

Author information

Authors and Affiliations

Interdisciplinary Center for Nanotoxicity, Department of Chemistry and Biochemistry, Jackson State University, Jackson, MS, USA
Supratik Kar & Jerzy Leszczynski
Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
Kunal Roy

Corresponding author

Correspondence to Kunal Roy.

Editor information

Editors and Affiliations

Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari Aldo Moro, Bari, Italy
Orazio Nicolotti

About this protocol

Cite this protocol

Kar, S., Roy, K., Leszczynski, J. (2018). Applicability Domain: A Step Toward Confident Predictions and Decidability for QSAR Modeling. In: Nicolotti, O. (ed) Computational Toxicology. Methods in Molecular Biology, vol 1800. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7899-1_6

DOI: https://doi.org/10.1007/978-1-4939-7899-1_6
Published: 23 June 2018
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-7898-4
Online ISBN: 978-1-4939-7899-1
eBook Packages: Springer Protocols
