Abstract
This chapter reviews results of research carried out by Basak and collaborators during the past four decades in the development of novel mathematical chemodescriptors and omics-based biodescriptors as well as their applications in quantitative structure-activity relationship (QSAR) and quantitative molecular similarity analysis (QMSA) studies related to the prediction of toxicities, bioactivities, and properties of chemicals. For chemodescriptor-based QSAR and QMSA studies, we have used graph theoretical, three-dimensional (3D), and quantum chemical indices. The graph theoretic chemodescriptors fall into two major categories:
(a) Numerical invariants defined on simple molecular graphs that represent only the adjacency and distance relationships of atoms and bonds; such invariants are called topostructural (TS) indices
(b) Topological indices derived from weighted molecular graphs, called topochemical (TC) indices
Collectively, the TS and TC descriptors are known as topological indices (TIs). The set of independent variables used for modeling also includes a group of three-dimensional (3D) molecular descriptors. Semiempirical quantum chemical indices and ab initio indices computed at various levels of theory have also been used for hierarchical QSAR (HiQSAR) modeling. Results indicate that, for many of the properties, activities, and toxicities we analyzed, a TS + TC combination explains most of the variance in the data.
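As an illustration of a topostructural index, the sketch below computes the Wiener index, the sum of topological distances over all pairs of vertices in a hydrogen-suppressed molecular graph. It is a minimal example and not the software used in the studies reviewed here; the function name `wiener_index` and the adjacency-dictionary representation are assumptions made for this example.

```python
from collections import deque


def wiener_index(adjacency):
    """Sum of shortest-path distances over all unordered vertex pairs.

    `adjacency` maps each vertex to a list of its neighbors
    (an illustrative representation, assumed for this sketch).
    """
    total = 0
    for source in adjacency:
        # Breadth-first search yields topological distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total // 2  # every pair was counted once from each end


# Hydrogen-suppressed carbon skeletons of two C4H10 isomers
# (vertex labels are arbitrary).
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}

print(wiener_index(n_butane))   # 10
print(wiener_index(isobutane))  # 9
```

Because the graph carries no atom or bond weights, the index responds only to connectivity: the branched isomer receives a smaller value than the linear one. A topochemical index would additionally encode atom types or bond orders as weights.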
In the area of quantitative molecular similarity analysis (QMSA), we have used different arbitrary (user-defined) and tailored (property-specific) similarity spaces for analog selection and k-nearest neighbor (KNN)-based property estimation of chemicals from their selected analogs. Preliminary data suggest that tailored spaces outperform arbitrary spaces. Additional research is needed to test the validity of this observation. Rapid clustering of large chemical libraries can be accomplished using calculated TIs, and this approach has promise both for drug discovery and toxicology.
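The analog-based estimation described above can be sketched as follows. This is a simplified illustration under stated assumptions: the similarity space is taken to be a vector of descriptor values per chemical, similarity is measured by Euclidean distance, and the estimate is the unweighted mean over the k nearest analogs. The function name `knn_estimate` and the toy data are hypothetical and are not taken from the chapter.

```python
import math


def knn_estimate(query, library, k=3):
    """Estimate a property as the mean over the k nearest analogs.

    `library` is a list of (descriptor_vector, property_value) pairs;
    `query` is the descriptor vector of the chemical to be estimated.
    """
    if not 1 <= k <= len(library):
        raise ValueError("k must be between 1 and the library size")
    # Rank library chemicals by Euclidean distance to the query.
    ranked = sorted(library, key=lambda item: math.dist(query, item[0]))
    neighbors = ranked[:k]
    return sum(value for _, value in neighbors) / k


# Hypothetical two-dimensional similarity space with made-up property values.
library = [
    ((0.0, 0.0), 1.0),
    ((1.0, 0.0), 2.0),
    ((0.0, 1.0), 3.0),
    ((5.0, 5.0), 10.0),
]

estimate = knn_estimate((0.1, 0.1), library, k=3)
print(estimate)
```

In this picture, an arbitrary space and a tailored space differ only in which descriptors make up the vectors: the former is chosen by the user in advance, the latter is selected for relevance to the property being estimated.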
With respect to biodescriptor development, we have mainly applied techniques of statistics, chemometrics, and discrete mathematics to calculate invariants of objects associated with proteomics maps. Invariants or vectors calculated from maps of normal animals or cells, compared with those from animals or cells treated with drugs and toxicants, show that such descriptors can discriminate between maps of control biological systems and those exposed to drugs or xenobiotics. Finally, we discuss the approach of integrated QSAR (I-QSAR), in which both computed chemodescriptors and biodescriptors are used for quantitative prediction of bioactivity.
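To make the idea of a map invariant concrete, the sketch below computes one simple information-theoretic quantity, the Shannon entropy of the normalized spot-abundance distribution of a proteomics map. This invariant is chosen only for illustration; the abstract does not specify which biodescriptors were used, and the function name `abundance_entropy` and the abundance values are hypothetical.

```python
import math


def abundance_entropy(abundances):
    """Shannon entropy (in bits) of the normalized spot-abundance distribution."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    entropy = 0.0
    for a in abundances:
        if a > 0:  # spots with zero abundance contribute nothing
            p = a / total
            entropy -= p * math.log2(p)
    return entropy


# Made-up spot abundances for a control map and a treated map.
control = [10.0, 10.0, 10.0, 10.0]
treated = [37.0, 1.0, 1.0, 1.0]

print(abundance_entropy(control))  # 2.0 for four equally abundant spots
print(abundance_entropy(treated) < abundance_entropy(control))  # True
```

A single number of this kind discards spot positions entirely. In an integrated QSAR setting, invariants like this would simply be appended to the chemodescriptor vector as additional independent variables.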
Acknowledgments
I am thankful to Kanika Basak, Gregory Grunwald, Douglas Hawkins, Brian Gute, Subhabrata Majumdar, Denise Mills, Dilip K. Sinha, Ashesh Nandy, Frank Witzmann, Kevin Geiss, Krishnan Balasubramanian, Ramanathan Natarajan, Gerald J. Niemi, Alexandru T. Balaban, the late Alan Katritzky, Milan Randic, Nenad Trinajstic, Sonja Nikolic, Marjan Vracko, Marjana Novic, Xiaofeng Guo, Terry Neumann, Qianhong Zhu, the late Gilman D. Veith, Marissa Harle, Vincent R. Magnuson, Donald K. Harriss, Chandan Raychaudhury, Samar K. Ray, and Lester R. Drewes for their collaboration in my research.
© 2016 Springer India
About this chapter
Cite this chapter
Basak, S.C. (2016). Mathematical Chemodescriptors and Biodescriptors: Background and Their Applications in the Prediction of Bioactivity/Toxicity of Chemicals. In: Singh, S. (eds) Systems Biology Application in Synthetic Biology. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2809-7_10
DOI: https://doi.org/10.1007/978-81-322-2809-7_10
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2807-3
Online ISBN: 978-81-322-2809-7
eBook Packages: Biomedical and Life Sciences (R0)