Efficient and Scalable Induction of Logic Programs Using a Deductive Database System

Ferreira, Michel; Fonseca, Nuno A.; Rocha, Ricardo; Soares, Tiago

doi:10.1007/978-3-540-73847-3_22

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4455)

Abstract

Because ILP systems are typically implemented in Prolog or built on Prolog libraries, they usually store and manipulate data in Prolog's internal database. In real-world problems, however, the original data is rarely in Prolog format. It is often kept in a Relational Database Management System (RDBMS) and must be converted to a format the ILP system accepts. A more attractive approach is therefore to link the ILP system to the RDBMS and manipulate the data without converting it. This scheme is also more scalable, since the ILP system does not need to load the whole dataset into memory. In this paper we study several approaches to coupling ILP systems with RDBMSs and evaluate their impact on performance. We propose using a Deductive Database (DDB) system to transparently translate hypotheses into relational algebra expressions. Our empirical evaluation shows that a DDB can effectively reduce the execution time of ILP algorithms, and that larger problems can be handled because the data is not held in main memory.
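
To make the idea of evaluating a hypothesis inside the database concrete, here is a minimal sketch in Python with SQLite. It is not the DDB system or the translation scheme used in the paper. The function names `clause_to_sql` and `coverage`, the positional column names `c1`, `c2`, and the toy family relations are all assumptions made for illustration. The sketch turns a function-free clause into a single join query that counts how many example tuples the clause covers, so the examples never have to be loaded into the learner's memory.

```python
import sqlite3

def clause_to_sql(head, body):
    """Build a SQL query counting the head tuples covered by a clause.

    Literals are (relation, [variables]) pairs. Every relation is assumed
    (hypothetically) to be a table whose columns are named c1, c2, ...
    Relation names are pasted into the SQL text unsanitised, so this is
    only suitable for trusted, illustrative input.
    """
    literals = [head] + list(body)
    first_seen = {}              # variable -> first column that binds it
    from_parts, conditions = [], []
    for i, (relation, variables) in enumerate(literals):
        alias = f"t{i}"
        from_parts.append(f"{relation} AS {alias}")
        for pos, var in enumerate(variables, start=1):
            column = f"{alias}.c{pos}"
            if var in first_seen:
                # A repeated variable becomes an equi-join condition.
                conditions.append(f"{column} = {first_seen[var]}")
            else:
                first_seen[var] = column
    head_cols = ", ".join(f"t0.c{p}" for p in range(1, len(head[1]) + 1))
    where = f" WHERE {' AND '.join(conditions)}" if conditions else ""
    return (f"SELECT COUNT(*) FROM (SELECT DISTINCT {head_cols} "
            f"FROM {', '.join(from_parts)}{where})")

def coverage(conn, head, body):
    """Run the generated query and return the number of covered examples."""
    return conn.execute(clause_to_sql(head, body)).fetchone()[0]

# Toy data: the daughter table holds the positive examples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE daughter (c1 TEXT, c2 TEXT);
    CREATE TABLE parent   (c1 TEXT, c2 TEXT);
    CREATE TABLE female   (c1 TEXT);
    INSERT INTO daughter VALUES ('mary','ann'), ('eve','tom'), ('ann','zed');
    INSERT INTO parent   VALUES ('ann','mary'), ('ann','tom'),
                                ('tom','eve'), ('tom','ian');
    INSERT INTO female   VALUES ('mary'), ('ann'), ('eve');
""")

# Hypothesis: daughter(X, Y) :- parent(Y, X), female(X).
head = ("daughter", ["X", "Y"])
body = [("parent", ["Y", "X"]), ("female", ["X"])]
covered = coverage(conn, head, body)
print(clause_to_sql(head, body))
print("covered positive examples:", covered)
```

In this sketch each candidate clause costs one query, and the database engine chooses the join order. Whether that helps in practice depends on the coupling approach and the dataset, which is the question the paper evaluates.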

Author information

Authors and Affiliations

DCC-FC & LIACC, University of Porto, Portugal
Michel Ferreira, Nuno A. Fonseca, Ricardo Rocha & Tiago Soares

Editor information

Stephen Muggleton, Ramon Otero, Alireza Tamaddoni-Nezhad

About this paper

Cite this paper

Ferreira, M., Fonseca, N.A., Rocha, R., Soares, T. (2007). Efficient and Scalable Induction of Logic Programs Using a Deductive Database System. In: Muggleton, S., Otero, R., Tamaddoni-Nezhad, A. (eds) Inductive Logic Programming. ILP 2006. Lecture Notes in Computer Science, vol 4455. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73847-3_22

DOI: https://doi.org/10.1007/978-3-540-73847-3_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73846-6
Online ISBN: 978-3-540-73847-3
eBook Packages: Computer Science, Computer Science (R0)
