A Numerical Refinement Operator Based on Multi-Instance Learning

  • Erick Alphonse
  • Tobias Girschick
  • Fabian Buchwald
  • Stefan Kramer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6489)

Abstract

We present a numerical refinement operator based on multi-instance learning. In the approach, the task of handling numerical variables in a clause is delegated to statistical multi-instance learning schemes. Each clause has an associated multi-instance classification model that takes the numerical variables of the clause as input. Clauses are built in a greedy manner, where each refinement adds new numerical variables that are used in addition to the numerical variables already known to the multi-instance model. In our experiments, we tested this approach with multi-instance learners available in the Weka workbench (like MI-SVMs). These clauses are used in a boosting approach that can take advantage of the margin information, going beyond standard covering procedures or the discrete boosting of rules, as in SLIPPER. The approach is evaluated on the problem of hexose binding site prediction, a pharmacological application, and mutagenicity prediction. In two of the three applications, the task is to find configurations of points with certain properties in 3D space that characterize either a binding site or drug activity: the logical part of the clause constitutes the points with their properties, whereas the multi-instance model constrains the distances among the points. In summary, the new numerical refinement operator is interesting both theoretically, as a new synthesis of logical and statistical learning, and practically, as a new method for characterizing binding sites and pharmacophores in biochemical applications.
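To make the division of labour concrete, the following is a minimal Python sketch of how the substitutions of a clause could be turned into a multi-instance bag of distance vectors. It is an illustration under stated assumptions, not the authors' implementation: the function names (`distance_instance`, `build_bag`, `bag_score`) are hypothetical, unordered point subsets stand in for clause substitutions, and the linear max-over-instances scorer is a stand-in for a trained multi-instance learner such as an MI-SVM.

```python
from itertools import combinations
from math import dist

def distance_instance(points):
    """Pairwise distances among the points bound by one substitution."""
    return [dist(p, q) for p, q in combinations(points, 2)]

def build_bag(example_points, k):
    """Build one bag per example: one instance for each way of binding
    k of the example's points (a simplification of clause substitutions)."""
    return [distance_instance(subset) for subset in combinations(example_points, k)]

def bag_score(bag, weights, bias):
    """Toy multi-instance scorer (assumption): a bag scores as high as
    its best instance under a linear function of the distances."""
    return max(sum(w * x for w, x in zip(weights, inst)) + bias for inst in bag)

# One example with three points in 3D space (made-up coordinates).
example = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]

bag_k2 = build_bag(example, 2)  # clause with two point variables
bag_k3 = build_bag(example, 3)  # a refinement adds a third point variable

print(bag_k2)                            # three instances, one distance each
print(bag_k3)                            # one instance, three distances
print(bag_score(bag_k2, [-1.0], 4.5))    # favours short distances
```

In this sketch, refining the clause from two to three point variables widens each instance from one distance to three, which mirrors the idea that every refinement hands additional numerical variables to the multi-instance model.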

Keywords

Biochemical Application; Logic Programming; Inductive Logic Programming; Constraint Variable; Logical Part
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Anthony, S., Frisch, A.M.: Generating numerical literals during refinement. In: Lavrač, N., Džeroski, S. (eds.) ILP 1997. LNCS, vol. 1297, pp. 61–76. Springer, Heidelberg (1997)
  2. Botta, M., Piola, R.: Refining numerical constants in first order logic theories. Machine Learning 38(1/2), 109–131 (2000)
  3. Sebag, M., Rouveirol, C.: Constraint inductive logic programming. In: Advances In Inductive Logic Programming, pp. 277–294 (1996)
  4. Srinivasan, A., Camacho, R.: Numerical reasoning with an ILP system capable of lazy evaluation and customised search. Journal of Logic Programming 40(2-3), 185–213 (1999)
  5. Dietterich, T.G., Michalski, R.S.: A comparative review of selected methods for learning from examples. In: Michalski, R.S., Carbonell, J.G., Mitchell, T.M. (eds.) Machine Learning, an Artificial Intelligence Approach, vol. 1, pp. 41–81 (1983)
  6. Zucker, J.-D., Ganascia, J.-G.: Selective reformulation of examples in concept learning. In: Proc. of ICML 1994, pp. 352–360 (1994)
  7. Fensel, D., Zickwolff, M., Wiese, M.: Are substitutions the better examples? Learning complete sets of clauses with Frog. In: Proc. of ILP 1995, pp. 453–474 (1995)
  8. Srinivasan, A., Page, D., Camacho, R., King, R.D.: Quantitative pharmacophore models with inductive logic programming. Machine Learning 64(1-3), 65–90 (2006)
  9. Davis, J., Costa, V.S., Ray, S., Page, D.: An integrated approach to feature invention and model construction for drug activity prediction. In: Proc. of ICML 2007 (2007)
  10. Nock, R., Nielsen, F.: A real generalization of discrete adaboost. In: Brewka, G., Coradeschi, S., Perini, A., Traverso, P. (eds.) Proc. of ECAI 2006, vol. 141 (2006)
  11. Cohen, W.W., Singer, Y.: A simple, fast, and effective rule learner. In: Proc. of AAAI 1999, pp. 335–342 (1999)
  12. Nassif, H., Hassan, A., Sawsan, K., Keirouz, W., Page, D.: An ILP Approach to Model and Classify Hexose Binding Sites. In: Proc. of ILP 2009, vol. 78 (2009)
  13. Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., Bourne, P.E.: The Protein Data Bank. Nucleic Acids Research 28, 235–242 (2000)
  14. Srinivasan, A., Muggleton, S., King, R.D., Sternberg, M.J.E.: Mutagenesis: ILP experiments in a non-determinate biological domain. In: Proc. of ILP 1994, vol. 237, pp. 217–232 (1994)
  15. Davis, J., Santos Costa, V., Ray, S., Page, D.: Tightly integrating relational learning and multiple-instance regression for real-valued drug activity prediction. In: Proc. of ICML 2007, vol. 287 (2007)
  16. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The WEKA data mining software: An update. SIGKDD Explorations 11 (2009)

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Erick Alphonse (1)
  • Tobias Girschick (2)
  • Fabian Buchwald (2)
  • Stefan Kramer (2)

  1. Laboratoire d’Informatique de l’université Paris-Nord, Villetaneuse, France
  2. Institut für Informatik I12, Technische Universität München, Garching b. München, Germany
