Machine Learning, Volume 86, Issue 1, pp 57–88

Bridging logic and kernel machines

  • Michelangelo Diligenti
  • Marco Gori
  • Marco Maggini
  • Leonardo Rigutini


We propose a general framework for incorporating first-order logic (FOL) clauses, regarded as an abstract and partial representation of the environment, into kernel machines that learn within a semi-supervised scheme. We rely on a multi-task learning scheme in which each task is associated with a unary predicate defined on the feature space, while higher-level abstract representations consist of FOL clauses built from those predicates. We reuse the mathematical apparatus of kernel machines to solve the problem as the primal optimization of a function composed of the loss on the supervised examples, the regularization term, and a penalty term obtained by enforcing the real-valued constraints derived from the predicates. Unlike in classic kernel machines, however, the overall function to be optimized may no longer be convex, depending on the logic clauses. An important contribution is to show that, while tackling the optimization with classic numerical schemes is likely to be hopeless, a stage-based learning scheme is a viable way to attack the problem: we first learn the supervised examples until convergence is reached, and then continue by enforcing the logic clauses. Some promising experimental results are given on artificial learning tasks and on the automatic tagging of BibTeX entries, with emphasis on the comparison with plain kernel machines.
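The stage-based scheme described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: two hypothetical predicates `a` and `b` linked by the single clause ∀x a(x) ⇒ b(x), a Gaussian kernel, a squared loss, a product t-norm penalty a(x)·(1 − b(x)) with outputs clipped to [0, 1], plain gradient descent on the kernel expansion weights, and a fixed number of steps standing in for "until convergence". The weights `lam` and `lam_c` and the toy data are likewise hypothetical.

```python
import math

def gaussian_kernel(x, z, sigma=1.0):
    return math.exp(-((x - z) ** 2) / (2.0 * sigma ** 2))

def clip01(v):
    return min(1.0, max(0.0, v))

def implication_penalty(a, b):
    """Degree of violation of a => b (assumed product t-norm form): a * (1 - b)."""
    return clip01(a) * (1.0 - clip01(b))

def outputs(G, w):
    """Kernel expansion f(x_i) = sum_j G[i][j] * w[j], evaluated on all points."""
    return [sum(g * wj for g, wj in zip(row, w)) for row in G]

def mean_penalty(G, W):
    fa, fb = outputs(G, W["a"]), outputs(G, W["b"])
    return sum(implication_penalty(a, b) for a, b in zip(fa, fb)) / len(fa)

def gradient_step(G, W, labels, lam, lam_c, lr):
    n = len(G)
    f = {k: outputs(G, W[k]) for k in W}
    # d[k][i]: derivative of (supervised loss + clause penalty) w.r.t. f_k(x_i)
    d = {k: [0.0] * n for k in W}
    for k, examples in labels.items():
        for i, y in examples.items():
            d[k][i] += 2.0 * (f[k][i] - y)            # squared loss, labeled points only
    for i in range(n):                                # penalty on all points, labeled or not
        a, b = f["a"][i], f["b"][i]
        if 0.0 < a < 1.0:
            d["a"][i] += lam_c / n * (1.0 - clip01(b))
        if 0.0 < b < 1.0:
            d["b"][i] -= lam_c / n * clip01(a)
    new_W = {}
    for k in W:
        # Chain rule through f = G w, plus the gradient of lam * w^T G w, which is 2 * lam * G w
        grad = [sum(G[i][j] * d[k][i] for i in range(n)) + 2.0 * lam * f[k][j]
                for j in range(n)]
        new_W[k] = [wj - lr * gj for wj, gj in zip(W[k], grad)]
    return new_W

def train(G, W, labels, lam, lam_c, lr=0.05, steps=3000):
    for _ in range(steps):
        W = gradient_step(G, W, labels, lam, lam_c, lr)
    return W

# Toy data: four 1-D points; labels map a point index to a target in {0, 1}.
xs = [0.0, 1.0, 2.0, 3.0]
G = [[gaussian_kernel(x, z) for z in xs] for x in xs]
labels = {"a": {0: 1.0, 1: 1.0, 3: 0.0}, "b": {0: 1.0, 3: 0.0}}
W0 = {"a": [0.0] * len(xs), "b": [0.0] * len(xs)}

W1 = train(G, W0, labels, lam=0.1, lam_c=0.0)   # stage 1: supervised loss + regularization
W2 = train(G, W1, labels, lam=0.1, lam_c=2.0)   # stage 2: continue with the clause penalty on
print("mean clause violation after stage 1:", mean_penalty(G, W1))
print("mean clause violation after stage 2:", mean_penalty(G, W2))
```

Stage 1 minimizes a convex objective, so its solution gives stage 2 a sensible starting point. Once the clause penalty is switched on, the objective is in general no longer convex, which is why the starting point matters.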


Keywords: Kernel machines; First-order logic; Learning from constraints; Learning with prior knowledge; Multi-task learning; Semantic-based regularization



Copyright information

© The Author(s) 2011

Authors and Affiliations

  • Michelangelo Diligenti (1)
  • Marco Gori (1)
  • Marco Maggini (1)
  • Leonardo Rigutini (1)

  1. Dipartimento di Ingegneria dell’Informazione, Università di Siena, Siena, Italy
