Partitioning in Binary-Transformed Chemical Descriptor Spaces
Here we describe a statistically based partitioning method called median partitioning (MP), which transforms the value distributions of molecular property descriptors into a binary classification scheme. The MP approach differs fundamentally from partitioning approaches that involve dimension reduction of chemical spaces, such as cell-based partitioning, because MP operates directly in the original, albeit simplified, chemical space. Modified versions of the MP algorithm have been implemented and successfully applied in diversity selection, compound classification, and virtual screening. These findings have demonstrated that dimension reduction techniques, although elegant in their design, are not necessarily required for effective partitioning of molecular datasets. An attractive feature of statistical partitioning approaches such as decision tree methods or MP is their computational efficiency, which is becoming an important criterion for the analysis of compound databases containing millions of molecules.
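The binary transformation described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each descriptor is split at its dataset-wide median, that values strictly above the median map to 1 and all others to 0, and that molecules sharing the same bit pattern fall into the same partition. The function name `median_partition` and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import median


def median_partition(descriptors):
    """Group molecules by their above/below-median bit pattern.

    descriptors: dict mapping a molecule id to a list of descriptor values
    (same length and order for every molecule).
    Returns a dict mapping each bit pattern (tuple of 0/1) to a list of ids.
    """
    ids = list(descriptors)
    n_desc = len(descriptors[ids[0]])
    # One median per descriptor, computed over the whole dataset.
    medians = [median(descriptors[m][j] for m in ids) for j in range(n_desc)]
    partitions = defaultdict(list)
    for m in ids:
        # Assumption: values strictly above the median map to 1, others to 0.
        bits = tuple(int(v > med) for v, med in zip(descriptors[m], medians))
        partitions[bits].append(m)
    return dict(partitions)


# Hypothetical toy data: two descriptor values per molecule.
toy = {
    "mol_a": [120.0, 1.5],
    "mol_b": [310.0, 3.2],
    "mol_c": [305.0, 0.9],
    "mol_d": [150.0, 4.0],
}
cells = median_partition(toy)
```

Under these assumptions, n descriptors yield at most 2^n partitions, and assigning a molecule to its partition takes a single comparison per descriptor, which is consistent with the computational efficiency noted in the abstract.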
Key Words: Biological activity; chemical descriptors; chemical spaces; classification methods; compound databases; decision trees; diversity selection; partitioning algorithms; space transformation; statistics; statistical medians