Inducing Implicit Relations from Text Using Distantly Supervised Deep Nets

  • Michael Glass
  • Alfio Gliozzo
  • Oktie Hassanzadeh
  • Nandana Mihindukulasooriya
  • Gaetano Rossiello
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11136)

Abstract

Knowledge Base Population (KBP) is an important problem in Semantic Web research and a key requirement for the successful adoption of semantic technologies in many applications. In this paper we present Socrates, a deep learning-based solution for Automated Knowledge Base Population from Text. Socrates does not require manual annotations, which would make the solution hard to adapt to a new domain. Instead, it exploits a partially populated knowledge base and a large corpus of text documents to train a set of deep neural network models. As a result of the training process, the system learns to identify implicit relations between entities across a highly heterogeneous set of documents from various sources, making it suitable for large-scale knowledge extraction from Web documents. The main contributions of this paper are (a) a novel approach based on composite contexts to acquire implicit relations from Title Oriented Documents, and (b) an architecture for unifying relation extraction using binary, unary, and composite contexts. We provide an extensive evaluation of the system across three benchmarks with different characteristics, showing that our unified framework can consistently outperform state-of-the-art solutions. Remarkably, Socrates ranked first in both the knowledge base population and attribute validation tracks at the Semantic Web Challenge at ISWC 2017.
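
The distant supervision idea mentioned above, using a partially populated knowledge base instead of manual annotations, can be illustrated with a minimal sketch. This is not the Socrates implementation: the function name `distant_labels`, the toy triples, and the plain substring matching used to find entity mentions are all assumptions made for illustration. Each sentence that mentions two known entities becomes a training example, labeled with whatever relations the knowledge base already holds for that pair (possibly none).

```python
from collections import defaultdict


def distant_labels(kb_triples, sentences, entities):
    """Label sentences with KB relations for each ordered entity pair they mention.

    Hypothetical helper for illustration only; entity mentions are found by
    naive substring matching, which a real system would replace with
    entity detection and linking.
    """
    # Index the known relations by (subject, object) pair.
    relations = defaultdict(set)
    for subj, rel, obj in kb_triples:
        relations[(subj, obj)].add(rel)

    examples = []
    for sentence in sentences:
        mentioned = [e for e in entities if e in sentence]
        for subj in mentioned:
            for obj in mentioned:
                if subj == obj:
                    continue
                # An empty label list acts as a negative (no known relation) example.
                labels = sorted(relations.get((subj, obj), set()))
                examples.append((subj, obj, sentence, labels))
    return examples


# Toy data (assumed, not from the paper).
kb = [("Rome", "capitalOf", "Italy")]
corpus = [
    "Rome is the largest city in Italy.",
    "Rome hosted the 1960 Olympics.",
]
examples = distant_labels(kb, corpus, ["Rome", "Italy"])
for example in examples:
    print(example)
```

Note that the first sentence is labeled `capitalOf` even though it does not state that relation. This kind of label noise is inherent to distant supervision, and the trained models have to tolerate it.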

Keywords

Knowledge base population · Deep learning · Distant supervision

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Michael Glass (1)
  • Alfio Gliozzo (1)
  • Oktie Hassanzadeh (1)
  • Nandana Mihindukulasooriya (2)
  • Gaetano Rossiello (1, 3)

  1. Knowledge Induction and Reasoning Group, IBM Research AI, New York, USA
  2. Ontology Engineering Group, Universidad Politécnica de Madrid, Madrid, Spain
  3. Department of Computer Science, University of Bari, Bari, Italy
