Abstract
The high cost, long timelines, effort and frequent failures involved in drug discovery and development make it difficult for researchers to discover new drugs and have prompted the need for methods that improve the productivity and efficiency of drug design. Cheminformatics is an emerging field at the interface of chemistry and computer science that supports the processing, management and analysis of large volumes of chemical information using computational methods. In this chapter, we outline the applications of cheminformatics in drug discovery, such as identification of lead compounds, virtual library generation, high-throughput screening and data mining, prediction of the biological activities of compounds and in silico ADMET prediction. We also discuss various cheminformatics approaches, including data mining, representation of chemical compounds via descriptors, similarity and substructure searching, and classification algorithms.
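As a rough illustration of the similarity searching mentioned above, the following minimal sketch ranks a small compound library against a query using the Tanimoto coefficient on binary fingerprints. The fingerprints are represented as sets of "on" bit positions; the compound names, bit positions and the helper names `tanimoto` and `rank_by_similarity` are hypothetical and chosen only for illustration, and the chapter itself does not prescribe this particular implementation.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of 'on' bits."""
    if not a and not b:
        return 0.0  # convention assumed here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_by_similarity(query: frozenset, library: dict) -> list:
    """Return (name, score) pairs sorted from most to least similar to the query."""
    scores = {name: tanimoto(query, fp) for name, fp in library.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical fingerprints: each set lists the bit positions that are switched on.
library = {
    "cmpd_A": frozenset({1, 4, 7, 9}),
    "cmpd_B": frozenset({1, 4, 8}),
    "cmpd_C": frozenset({2, 3, 5}),
}
query = frozenset({1, 4, 7})

ranked = rank_by_similarity(query, library)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

In practice the fingerprints would be generated from chemical structures by a cheminformatics toolkit rather than written by hand; the set-based representation here simply keeps the sketch free of external dependencies.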
Acknowledgements
AG is thankful to Jawaharlal Nehru University for usage of all computational facilities. AG is grateful to University Grants Commission, India for the Faculty Recharge Position. Salma Jamal acknowledges a Senior Research Fellowship from Indian Council of Medical Research (ICMR), New Delhi.
Competing Interests
The authors declare that they have no competing interests.
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Jamal, S., Grover, A. (2017). Cheminformatics Approaches in Modern Drug Discovery. In: Grover, A. (eds) Drug Design: Principles and Applications. Springer, Singapore. https://doi.org/10.1007/978-981-10-5187-6_9
DOI: https://doi.org/10.1007/978-981-10-5187-6_9
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-5186-9
Online ISBN: 978-981-10-5187-6
eBook Packages: Biomedical and Life Sciences (R0)