Abstract
This chapter presents alvaDesc, a software tool for calculating and analyzing molecular descriptors and fingerprints.
Molecular descriptors and fingerprints play an essential role in quantitative structure-activity relationships (QSAR): they are the mathematical representation of chemicals, and they serve as the input to the data analysis methods used to build QSAR models.
The growing number of newly proposed molecular descriptors and fingerprints, and more generally the attention the scientific community pays to developing novel ways of representing chemical structures, are evidence of how relevant these representations are to the prediction of chemical properties.
Although a large number of variables is complex to handle, different types of molecular descriptors and fingerprints can highlight specific traits of molecular structures. These aspects, together with the increased availability of chemical data and of methods for data analysis, are some of the challenges that researchers face when developing QSAR models.
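As a rough illustration of what "mathematical representation of chemicals" means, the sketch below computes one classic topological descriptor, the Wiener index (the sum of shortest-path distances between all pairs of non-hydrogen atoms), for two isomers. It is not alvaDesc code and does not use the alvaDesc API; the function name `wiener_index` and the hand-written adjacency lists are assumptions made for this example only.

```python
from collections import deque


def wiener_index(adjacency):
    """Sum of shortest-path distances over all unordered atom pairs.

    `adjacency` maps each atom index to the list of its bonded neighbours
    in a hydrogen-depleted molecular graph (illustrative input format).
    """
    total = 0
    for source in adjacency:
        # Breadth-first search gives topological distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            atom = queue.popleft()
            for neighbour in adjacency[atom]:
                if neighbour not in dist:
                    dist[neighbour] = dist[atom] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
    return total // 2  # every pair was counted once from each end


# Carbon skeletons of two C4H10 isomers, written by hand for illustration.
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}

descriptors = {
    "n-butane": wiener_index(n_butane),
    "isobutane": wiener_index(isobutane),
}
print(descriptors)
```

The two isomers share a molecular formula but receive different descriptor values, which is the kind of structural trait a QSAR model can use as an input variable.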
Copyright information
© 2020 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Mauri, A. (2020). alvaDesc: A Tool to Calculate and Analyze Molecular Descriptors and Fingerprints. In: Roy, K. (eds) Ecotoxicological QSARs. Methods in Pharmacology and Toxicology. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-0150-1_32
DOI: https://doi.org/10.1007/978-1-0716-0150-1_32
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-0149-5
Online ISBN: 978-1-0716-0150-1
eBook Packages: Springer Protocols