Abstract
The identification of molecular descriptors that distinguish between different compound classes is of central importance in chemoinformatics. To aid in the identification of such discriminatory descriptors, concepts from information theory have been adapted. In an earlier study, an approach termed Differential Shannon Entropy (DSE) was introduced for descriptor profiling to detect and quantify compound database-dependent differences in the information content and value range distribution of descriptors. Because the DSE approach is intrinsically limited in its ability to select compound class-specific descriptors when data sets of very different size are compared, it has recently been extended to Mutual Information-DSE (MI-DSE). Herein, DSE, MI-DSE, and the Shannon entropy concept underlying both information-theoretic approaches are introduced and compared, and differences between their application areas are discussed.
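As a rough illustration of the entropy concept named above, the following sketch computes the Shannon entropy of a binned descriptor distribution and a DSE-style difference between two compound sets. It rests on several assumptions that the abstract does not state: descriptor values are binned into equal-width histograms over a shared value range, DSE is taken as the entropy of the pooled histogram minus the mean of the two individual entropies, and all function names and the bin count are hypothetical. MI-DSE is not sketched because its definition is not given here.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (in bits) of a histogram given as a list of bin counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")
    # Empty bins contribute nothing (p * log2(p) -> 0 as p -> 0).
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def histogram(values, lo, hi, n_bins):
    """Count values into n_bins equal-width bins spanning [lo, hi]."""
    counts = [0] * n_bins
    if hi == lo:
        # Degenerate range: every value falls into the first bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / n_bins
    for v in values:
        # Clamp so that v == hi lands in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


def differential_shannon_entropy(values_a, values_b, n_bins=4):
    """Assumed DSE form: SE(pooled) - (SE(A) + SE(B)) / 2, with shared bins."""
    lo = min(min(values_a), min(values_b))
    hi = max(max(values_a), max(values_b))
    hist_a = histogram(values_a, lo, hi, n_bins)
    hist_b = histogram(values_b, lo, hi, n_bins)
    pooled = [a + b for a, b in zip(hist_a, hist_b)]
    mean_se = (shannon_entropy(hist_a) + shannon_entropy(hist_b)) / 2
    return shannon_entropy(pooled) - mean_se


# Toy descriptor values for two hypothetical compound sets.
set_a = [0.0, 0.0, 1.0, 1.0]
set_b = [2.0, 2.0, 3.0, 3.0]
dse_disjoint = differential_shannon_entropy(set_a, set_b)
dse_identical = differential_shannon_entropy(set_a, set_a)
```

In this toy case the two sets occupy disjoint value ranges, so the pooled distribution is more spread out than either set alone and the difference is positive; for identical sets it is zero. Because the pooled histogram is built from raw counts, a much larger set would dominate it, which is consistent with the size sensitivity of DSE noted in the abstract.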
© 2012 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Wassermann, A.M., Nisius, B., Vogt, M., Bajorath, J. (2012). Information Entropic Functions for Molecular Descriptor Profiling. In: Baron, R. (eds) Computational Drug Discovery and Design. Methods in Molecular Biology, vol 819. Springer, New York, NY. https://doi.org/10.1007/978-1-61779-465-0_4
DOI: https://doi.org/10.1007/978-1-61779-465-0_4
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-61779-464-3
Online ISBN: 978-1-61779-465-0
eBook Packages: Springer Protocols