Generalization Rules for Binarized Descriptors

  • Jürgen Paetz
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4345)


Virtual screening of molecules is one of the hot topics in life science. Molecules are often encoded by descriptors with numerical values, which serve as a basis for finding regions where active molecules are highly enriched compared to non-active ones. In this contribution we demonstrate that a simpler binary version of a descriptor can be used for this task with similar classification performance, while saving computational and memory resources. To generate binary-valued rules for virtual screening, we used the GenIntersect algorithm, which heuristically determines common properties of the binary descriptor vectors. The results are compared to those achieved with the numerical rules of a neuro-fuzzy system.
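
The abstract does not spell out how GenIntersect works, so the following is only a minimal sketch of the general idea it names: binarize numerical descriptor vectors, take the component-wise intersection of two binary vectors as a candidate rule, and measure how strongly that rule enriches active molecules. The function names, the zero threshold, the toy data, and the enrichment-factor definition used here (share of actives among covered molecules divided by the share of actives overall) are all assumptions made for illustration and are not taken from the paper.

```python
def binarize(vector, threshold=0.0):
    """Map a numerical descriptor vector to bits (assumed rule: 1 if value > threshold)."""
    return tuple(1 if v > threshold else 0 for v in vector)


def intersect(a, b):
    """Component-wise AND: the properties shared by two binary vectors."""
    return tuple(x & y for x, y in zip(a, b))


def covers(rule, vector):
    """A rule covers a vector if every bit set in the rule is also set in the vector."""
    return all(v >= r for r, v in zip(rule, vector))


def enrichment_factor(rule, vectors, labels):
    """Share of actives among covered molecules, relative to the overall share of actives."""
    covered = [lab for vec, lab in zip(vectors, labels) if covers(rule, vec)]
    base_rate = sum(labels) / len(labels)
    if not covered or base_rate == 0:
        return 0.0
    return (sum(covered) / len(covered)) / base_rate


# Hypothetical toy data: four molecules, four descriptor dimensions.
numeric = [
    [2.0, 0.0, 1.0, 3.0],  # active
    [1.0, 0.0, 4.0, 0.0],  # active
    [0.0, 2.0, 0.0, 1.0],  # non-active
    [0.0, 1.0, 0.0, 0.0],  # non-active
]
labels = [1, 1, 0, 0]  # 1 = active, 0 = non-active

binary = [binarize(v) for v in numeric]

# Candidate rule: the properties common to the two active molecules.
rule = intersect(binary[0], binary[1])
ef = enrichment_factor(rule, binary, labels)
print(rule, ef)
```

In this toy case the rule keeps only the bits that both actives share, and it covers no non-active molecule. Storing one bit per dimension instead of a floating-point value is where the memory saving mentioned in the abstract would come from.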


Keywords (added by machine, not by the authors): Enrichment Factor; Virtual Screening; Generalization Rule; Virtual Screen; Binarized Descriptor





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jürgen Paetz (J.W. Goethe-Universität Frankfurt am Main, Frankfurt am Main, Germany)
