An Investigation into the Role of Domain-Knowledge on the Use of Embeddings

  • Lovekesh Vig
  • Ashwin Srinivasan
  • Michael Bain
  • Ankit Verma
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10759)

Abstract

Computing similarity in high-dimensional vector spaces is a long-standing problem that has recently seen significant progress with the invention of the word2vec algorithm. It has usually been found that using an embedded representation results in much better performance on the task being addressed. It is not known whether embeddings can similarly improve performance with data of the kind considered by Inductive Logic Programming (ILP), in which data that appear dissimilar on the surface can be similar to each other given domain (background) knowledge. In this paper, using several ILP classification benchmarks, we investigate whether embedded representations are similarly helpful for problems where a sufficient amount of background knowledge is available. We use tasks for which we have domain expertise about the relevance of the available background knowledge, and we consider two subsets of background predicates (“sufficient” and “insufficient”). For each subset, we obtain a baseline representation consisting of Boolean-valued relational features. Next, a vector embedding specifically designed for classification is obtained. Finally, we examine the predictive performance of widely used classification methods with and without the embedded representation. With sufficient background knowledge, we find no statistical evidence of improved performance with an embedded representation. With insufficient background knowledge, our results provide empirical evidence that, for the specific case of deep networks, an embedded representation could be useful.
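
As a rough illustration of what a baseline of "Boolean-valued relational features" could look like, the sketch below turns two toy relational examples into 0/1 vectors. Everything in it is an assumption made for illustration: the molecule facts, the three features, and the helper names (`has_element`, `has_bond_type`, `to_boolean_vector`) are hypothetical and are not the features or tools used in the paper.

```python
from typing import Callable, List, Set, Tuple

# Hypothetical toy data: each example is a set of ground facts,
# e.g. ("atom", id, element) or ("bond", id1, id2, kind).
Fact = Tuple[str, ...]
Example = Set[Fact]

mol_a: Example = {
    ("atom", "a1", "c"),
    ("atom", "a2", "o"),
    ("bond", "a1", "a2", "double"),
}
mol_b: Example = {
    ("atom", "b1", "c"),
    ("atom", "b2", "c"),
    ("bond", "b1", "b2", "single"),
}


def has_element(element: str) -> Callable[[Example], bool]:
    """Feature: the example contains at least one atom of this element."""
    return lambda ex: any(f[0] == "atom" and f[2] == element for f in ex)


def has_bond_type(kind: str) -> Callable[[Example], bool]:
    """Feature: the example contains at least one bond of this kind."""
    return lambda ex: any(f[0] == "bond" and f[3] == kind for f in ex)


# Assumed feature set; in an ILP setting these would be clauses
# constructed from the background predicates.
FEATURES: List[Callable[[Example], bool]] = [
    has_element("c"),
    has_element("o"),
    has_bond_type("double"),
]


def to_boolean_vector(ex: Example) -> List[int]:
    """Evaluate every feature on the example, giving one 0/1 entry each."""
    return [int(feature(ex)) for feature in FEATURES]


def hamming(u: List[int], v: List[int]) -> int:
    """Number of positions at which two Boolean vectors differ."""
    return sum(a != b for a, b in zip(u, v))


vec_a = to_boolean_vector(mol_a)
vec_b = to_boolean_vector(mol_b)
print(vec_a, vec_b, hamming(vec_a, vec_b))
```

Vectors of this kind could be passed directly to a standard classifier, or first mapped to an embedded representation, which is the comparison the abstract describes.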

Notes

Acknowledgements

A.S. is a Visiting Professor in the Department of Computer Science, University of Oxford; and Visiting Professorial Fellow, School of CSE, UNSW Sydney. A.S. is supported by the SERB grant EMR/2016/002766.

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Lovekesh Vig (1)
  • Ashwin Srinivasan (2)
  • Michael Bain (3)
  • Ankit Verma (4)

  1. TCS Research, New Delhi, India
  2. School of Computer Science and Information Systems, BITS Pilani, Sancoale, India
  3. School of Computer Science and Engineering, UNSW, Sydney, Australia
  4. School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
