Advertisement

The VLDB Journal

, Volume 28, Issue 2, pp 147–171 | Cite as

Logical scalability and efficiency of relational learning algorithms

  • Jose PicadoEmail author
  • Arash Termehchy
  • Alan Fern
  • Parisa Ataei
Regular Paper
  • 121 Downloads

Abstract

Relational learning algorithms learn the definition of a new relation in terms of existing relations in the database. The same database may be represented under different schemas for various reasons, such as efficiency, data quality, and usability. Unfortunately, the output of current relational learning algorithms tends to vary quite substantially over the choice of schema, both in terms of learning accuracy and efficiency. We introduce the property of schema independence of relational learning algorithms, and study both the theoretical and empirical dependence of existing algorithms on the common class of (de) composition schema transformations. We show theoretically and empirically that current relational learning algorithms are generally not schema independent. We propose Castor, a relational learning algorithm that achieves schema independence.

Keywords

Relational learning Machine Learning Schema independence Database transformations Inductive logic programming 

References

  1. 1.
    Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases: The Logical Level. Addison-Wesley, Boston (1994)Google Scholar
  2. 2.
    Abouzied, A., Angluin, D., Papadimitriou, C., Hellerstein, J., Silberschatz, A.: Learning and verifying quantified boolean queries by example. In: PODS, pp. 49–60 (2013)Google Scholar
  3. 3.
    Angluin, D., Frazier, M., Pitt, L.: Learning conjunctions of Horn clauses. Mach. Learn. 9(2–3), 147–164 (1992)zbMATHGoogle Scholar
  4. 4.
    Arias, M., Khardon, R., Maloberti, J.: Learning Horn expressions with LOGAN-H. J. Mach. Learn. Res. 8, 549–587 (2007)zbMATHGoogle Scholar
  5. 5.
    Atzeni, P., Ausiello, G., Batini, C., Moscarini, M.: Inclusion and Equivalent Between Relational Database Schemata. TCS, Mumbai (1982)zbMATHGoogle Scholar
  6. 6.
    Cate, B.T., Dalmau, V., Kolaitis, P.G.: Learning schema mappings. TODS 38(4), 28:1–28:31 (2013)MathSciNetCrossRefzbMATHGoogle Scholar
  7. 7.
    Chen, Y., Goldberg, S., Wang, D.Z., Johri, S.S.: Ontological pathfinding: mining first-order knowledge from large knowledge bases. In: SIGMOD, pp. 835–846 (2016)Google Scholar
  8. 8.
    Costa, V.S., Srinivasan, A., Camacho, R., Blockeel, H., Demoen, B., Janssens, G., Struyf, J., Vandecasteele, H., Laer, W.V.: Query transformations for improving the efficiency of ILP systems. J. Mach. Learn. Res. 4, 465–491 (2003)MathSciNetzbMATHGoogle Scholar
  9. 9.
    Davis, J., Burnside, E.S., de Castro Dutra, I., Page, D., Ramakrishnan, R., Costa, V.S., Shavlik, J.W.: View learning for statistical relational learning: with an application to mammography. In: IJCAI, pp. 677–683 (2005)Google Scholar
  10. 10.
    De Raedt, L.: Logical and Relational Learning, 1st edn. Springer, Berlin (2010)zbMATHGoogle Scholar
  11. 11.
    Fagin, R.: Inverting schema mappings. TODS 32(4), 25 (2007)CrossRefzbMATHGoogle Scholar
  12. 12.
    Fagin, R., Kolaitis, P.G., Miller, R.J., Popa, L.: Data exchange: semantics and query answering. In: Calvanese, D., Lenzerini, M., Motwani, R. (eds.) ICDT, pp. 207–224. Springer, Berlin, Heidelberg (2003)Google Scholar
  13. 13.
    Fan, W., Bohannon, P.: Information preserving xml schema embedding. TODS 33(1), 1–44 (2008)CrossRefGoogle Scholar
  14. 14.
    Galárraga, L., Teflioudi, C., Hose, K., Suchanek, F.: Fast rule mining in ontological knowledge bases with AMIE+. VLDB J. 24(6), 707–730 (2015)CrossRefGoogle Scholar
  15. 15.
    Getoor, L., Machanavajjhala, A.: Entity resolution in big data. In: KDD, p. 1527 (2013)Google Scholar
  16. 16.
    Getoor, L., Taskar, B.: Introduction to Statistical Relational Learning. MIT Press, Cambridge (2007)CrossRefzbMATHGoogle Scholar
  17. 17.
    Hull, R.: Relative information capacity of simple relational database schemata. In: PODS, pp. 97–109 (1984)Google Scholar
  18. 18.
    Khardon, R.: Learning function-free Horn expressions. Mach. Learn. 37(3), 241–275 (1999)CrossRefzbMATHGoogle Scholar
  19. 19.
    Kuželka, O., Železný, F.: A restarted strategy for efficient subsumption testing. Fundam. Inf. 89(1), 95–109 (2009)zbMATHGoogle Scholar
  20. 20.
    Li, H., Chan, C.-Y., Maier, D.: Query from examples: an iterative, data-driven approach to query construction. PVLDB 8(13), 2158–2163 (2015)Google Scholar
  21. 21.
    Muggleton, S.: Inverse Entailment and Progol. New Gener. Comput. Special Issue Inductive Log. Program. 13, 245–286 (1995)CrossRefGoogle Scholar
  22. 22.
    Muggleton, S., Feng, C.: Efficient induction of logic programs. In: ALT. Springer/Ohmsha, pp. 368–381 (1990)Google Scholar
  23. 23.
    Muggleton, S., Santos, J.C.A., Tamaddoni-Nezhad, A.: Progolem: a system based on relative minimal generalisation. In: De Raedt, L. (ed.) ILP, vol. 5989. Springer, Berlin, Heidelberg (2009)Google Scholar
  24. 24.
    Picado, J., Termehchy, A., Fern, A., Ataei, P.: Schema independent relational learning. arXiv:1508.03846 (2015)
  25. 25.
    Picado, J., Termehchy, A., Fern, A., Ataei, P.: Schema independent relational learning. In: SIGMOD, pp. 929–944 (2017)Google Scholar
  26. 26.
    Quinlan, J.R.: Learning logical definitions from relations. Mach. Learn. 5, 239–266 (1990)Google Scholar
  27. 27.
    Reddy, C., Tadepalli, P.: Learning Horn definitions: theory and an application to planning. New Gener. Comput. 17, 77–98 (1998)CrossRefGoogle Scholar
  28. 28.
    Richardson, M., Domingos, P.: Markov logic networks. Mach. Learn. 62(1–2), 107–136 (2006)CrossRefGoogle Scholar
  29. 29.
    Srinivasan, A.: The Aleph manual (2007). http://www.cs.ox.ac.uk/activities/machinelearning/Aleph/aleph. Accessed 15 Oct 2018
  30. 30.
    Termehchy, A., Winslett, M., Chodpathumwan, Y.: How schema independent are schema free query interfaces? In: ICDE, pp. 645–660 (2011)Google Scholar
  31. 31.
    Zeng, Q., Patel, J.M., Page, D.: QuickFOIL: scalable inductive logic programming. PVLDB 8(3), 197–208 (2014)Google Scholar

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. 1.School of EECSOregon State UniversityCorvallisUSA

Personalised recommendations