Abstract
Concept learning approaches based on refinement operators explore partially ordered solution spaces to compute concepts, which are used as binary classification models for individuals. However, the number of concepts explored by these approaches can grow to the millions for complex learning problems. This often leads to impractical runtimes. We propose to alleviate this problem by predicting the length of target concepts before the exploration of the solution space. By these means, we can prune the search space during concept learning. To achieve this goal, we compare four neural architectures and evaluate them on four benchmarks. Our evaluation results suggest that recurrent neural network architectures perform best at concept length prediction with a macro F-measure ranging from 38% to 92%. We then extend the CELOE algorithm, which learns ALC concepts, with our concept length predictor. Our extension yields the algorithm CLIP. In our experiments, CLIP is at least 7.5× faster than other state-of-the-art concept learning algorithms for ALC (including CELOE) and achieves significant improvements in the F-measure of the concepts learned on 3 out of 4 datasets. For reproducibility, we provide our implementation in the public GitHub repository at https://github.com/dice-group/LearnALCLengths.
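The pruning idea can be illustrated with a minimal sketch. It is not taken from the CLIP implementation. It assumes a toy tuple encoding of ALC concepts and a common definition of concept length (atomic concepts count 1, negation adds 1, conjunction and disjunction add 1, and a quantified restriction adds 2 for the quantifier and the role). The names `concept_length` and `prune_by_length` are hypothetical, and `max_length` stands in for the output of a trained length predictor.

```python
from typing import List, Tuple, Union

# Assumed toy encoding: an atomic concept is a string; complex concepts are
# tuples ("not", C), ("and", C, D), ("or", C, D), ("exists", r, C), ("forall", r, C).
Concept = Union[str, Tuple]


def concept_length(concept: Concept) -> int:
    """Return the length of an ALC concept under the assumed definition."""
    if isinstance(concept, str):
        return 1  # atomic concept
    op = concept[0]
    if op == "not":
        return 1 + concept_length(concept[1])
    if op in ("and", "or"):
        return 1 + concept_length(concept[1]) + concept_length(concept[2])
    if op in ("exists", "forall"):
        # one symbol for the quantifier, one for the role, plus the filler
        return 2 + concept_length(concept[2])
    raise ValueError(f"unknown constructor: {op!r}")


def prune_by_length(refinements: List[Concept], max_length: int) -> List[Concept]:
    """Keep only refinements whose length does not exceed the predicted length."""
    return [c for c in refinements if concept_length(c) <= max_length]


candidates: List[Concept] = [
    "Person",
    ("and", "Person", ("exists", "hasChild", "Person")),
    ("not", ("or", "A", ("forall", "r", "B"))),
]
kept = prune_by_length(candidates, max_length=5)
```

In a refinement-based learner, a filter of this kind would be applied to the output of the refinement operator at each expansion step, so that concepts longer than the predicted length are never added to the search tree.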
Notes
- 1.
- 2. The implementations of OCEL and ELTL in the DL-Learner framework, which we used for our experiments, fail to consider the set threshold accurately. Hence, Table 7 contains values larger than 2 min for these two algorithms.
- 3. Note that we ran OCEL with its default settings and F1 scores are not available.
Acknowledgements
This work is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 860801. This work has been supported by the German Federal Ministry of Education and Research (BMBF) within the project DAIKIRI under the grant no 01IS19085B and by the German Federal Ministry for Economic Affairs and Climate Action (BMWK) within the project RAKI under the grant no 01MD19012B. The authors gratefully acknowledge the funding of this project by computing time provided by the Paderborn Center for Parallel Computing (PC2).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kouagou, N.J., Heindorf, S., Demir, C., Ngomo, AC.N. (2022). Learning Concept Lengths Accelerates Concept Learning in ALC. In: Groth, P., et al. The Semantic Web. ESWC 2022. Lecture Notes in Computer Science, vol 13261. Springer, Cham. https://doi.org/10.1007/978-3-031-06981-9_14
DOI: https://doi.org/10.1007/978-3-031-06981-9_14
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-06980-2
Online ISBN: 978-3-031-06981-9
eBook Packages: Computer Science, Computer Science (R0)