Populating a Knowledge Base with Object-Location Relations Using Distributional Semantics

  • Valerio Basile
  • Soufian Jebbara
  • Elena Cabrio
  • Philipp Cimiano
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10024)

Abstract

This paper presents an approach to extracting knowledge from large text corpora, in particular knowledge that facilitates object manipulation by embodied intelligent systems that need to act in the world. As a first step, our goal is to extract the prototypical location of given objects from text corpora. We approach this task by calculating relatedness scores for objects and locations using techniques from distributional semantics. We empirically compare different methods for representing locations and objects as vectors in a geometric space, and we evaluate them against a crowd-sourced gold standard in which human subjects rated the prototypicality of a location given an object. By applying the proposed framework to DBpedia, we are able to build a knowledge base of 931 high-confidence object-location relations in a fully automatic fashion.

The work in this paper is partially funded by the ALOOF project (CHIST-ERA program).
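The relatedness scoring described in the abstract can be illustrated with a minimal sketch. It assumes that objects and locations are already represented as dense vectors and that relatedness is measured by cosine similarity; the abstract does not name the measure, so cosine is only one common choice. The vectors, the names `cosine` and `rank_locations`, and the example entities are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_locations(object_vec, location_vecs):
    """Return (location, score) pairs sorted by decreasing relatedness to the object."""
    scored = [(name, cosine(object_vec, vec)) for name, vec in location_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy, hand-made vectors (hypothetical; real vectors would come from a trained model).
OBJECT_VECTORS = {"spoon": [0.9, 0.1, 0.0]}
LOCATION_VECTORS = {
    "kitchen": [0.8, 0.2, 0.1],
    "garage": [0.1, 0.9, 0.2],
    "bathroom": [0.2, 0.1, 0.9],
}

ranking = rank_locations(OBJECT_VECTORS["spoon"], LOCATION_VECTORS)
for location, score in ranking:
    print(f"{location}: {score:.3f}")
```

With these toy vectors the highest-scoring location for "spoon" is "kitchen". In a setting like the one the abstract describes, such scores would be compared against human prototypicality ratings, and only pairs above some confidence threshold would be kept.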

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Valerio Basile (1)
  • Soufian Jebbara (2)
  • Elena Cabrio (1)
  • Philipp Cimiano (2)

  1. Université Côte d’Azur, Inria, CNRS, I3S, Sophia Antipolis, France
  2. Bielefeld University, Bielefeld, Germany
