alvaDesc: A Tool to Calculate and Analyze Molecular Descriptors and Fingerprints

  • Protocol
Ecotoxicological QSARs

Part of the book series: Methods in Pharmacology and Toxicology (MIPT)

Abstract

This chapter presents alvaDesc, a software tool for calculating and analyzing molecular descriptors and fingerprints.

Molecular descriptors and fingerprints play an essential role in quantitative structure-activity relationship (QSAR) studies: they are the mathematical representations of chemicals, and they serve as the input for the data analysis methods used to build QSAR models.

The increasing number of newly proposed molecular descriptors and fingerprints and, more generally, the attention the scientific community pays to developing novel methodologies for representing chemical structures are evidence of the relevance of these representations in the prediction of chemical properties.

Although dealing with a large number of variables is complex, different types of molecular descriptors and fingerprints can each highlight specific traits of molecular structures. These aspects, together with the increased availability of chemical data and of methods for data analysis, are among the challenges that researchers face when developing QSAR models.

Author information

Correspondence to Andrea Mauri.

Copyright information

© 2020 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Mauri, A. (2020). alvaDesc: A Tool to Calculate and Analyze Molecular Descriptors and Fingerprints. In: Roy, K. (ed.) Ecotoxicological QSARs. Methods in Pharmacology and Toxicology. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-0150-1_32

  • DOI: https://doi.org/10.1007/978-1-0716-0150-1_32

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-0149-5

  • Online ISBN: 978-1-0716-0150-1

  • eBook Packages: Springer Protocols
