Molecular Descriptors

  • Andrea Mauri
  • Viviana Consonni
  • Roberto Todeschini
Living reference work entry


The number of available chemicals is growing exponentially, yet testing their toxicological and environmental behavior is often a critical bottleneck, so alternative strategies are required. There is also a need to predict the properties of compounds that have not yet been synthesized, so that synthesis costs can be reduced by selecting only those candidates with the greatest potential to be active and nontoxic. Evaluating chemical properties without synthesis, and with less of the expensive and time-consuming laboratory testing, requires in silico models that establish a mathematical relationship between the structures of molecules and the properties of interest (quantitative structure–activity relationships, QSARs). Molecular descriptors play a fundamental role in QSAR and other in silico models because they are, formally, the numerical representation of a molecular structure. Molecular descriptors can be classified according to different criteria; one main distinction is between experimental and theoretical descriptors. This chapter describes the basis for understanding and performing molecular descriptor calculations, the different categories of theoretical descriptors, and their perspectives.
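To make the idea of a descriptor as "the numerical representation of a molecular structure" concrete, the following is a minimal sketch rather than the chapter's own procedure. It assumes a molecule is given as a hydrogen-depleted molecular graph stored as an adjacency dictionary, and it computes two well-known topological indices: the Wiener index (sum of topological distances over all atom pairs) and the Randić connectivity index (sum over bonds of the inverse square root of the product of the two vertex degrees). The function names and the two example graphs are illustrative assumptions.

```python
from collections import deque
from math import sqrt


def vertex_degrees(adj):
    """Vertex degree = number of bonded non-hydrogen neighbours of each atom."""
    return {v: len(neighbours) for v, neighbours in adj.items()}


def wiener_index(adj):
    """Sum of shortest-path (topological) distances over all unordered atom pairs."""
    total = 0
    for source in adj:
        # Breadth-first search gives shortest path lengths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
    return total // 2  # every pair was counted once from each end


def randic_index(adj):
    """Sum over bonds (u, v) of 1 / sqrt(degree(u) * degree(v))."""
    deg = vertex_degrees(adj)
    return sum(
        1.0 / sqrt(deg[u] * deg[v])
        for u in adj
        for v in adj[u]
        if u < v  # visit each bond only once
    )


# Illustrative hydrogen-depleted graphs (carbon skeletons only).
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}   # linear chain C-C-C-C
isobutane = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}  # central carbon with three neighbours

for name, graph in (("n-butane", n_butane), ("isobutane", isobutane)):
    print(name, wiener_index(graph), round(randic_index(graph), 4))
```

The two isomers have the same atoms but different graphs, so both indices take different values for them. This is the sense in which a theoretical descriptor encodes structure as a number that a QSAR model can then relate to a property.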


Molecular Descriptor; Molecular Graph; Topological Index; Connectivity Index; Vertex Degree
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer Science+Business Media Dordrecht 2016

Authors and Affiliations

  • Andrea Mauri (1)
  • Viviana Consonni (1)
  • Roberto Todeschini (1)
  1. Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
