Linked Disambiguated Distributional Semantic Networks
Conference paper
Abstract
We present a new hybrid lexical knowledge base that combines the contextual information of distributional models with the conciseness and precision of manually constructed lexical networks. The computation of our count-based distributional model includes the induction of word senses for single-word and multi-word terms, the disambiguation of word similarity lists, taxonomic relations extracted by patterns, and context clues for disambiguation in context. In contrast to dense vector representations, our resource is human-readable and interpretable, and can thus be easily embedded within the Semantic Web ecosystem.
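To make the notion of a count-based word similarity list concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy tokenized corpus, uses neighbouring words as context features, and scores similarity as the number of shared features; the function names `context_features` and `similarity_list`, the window size, and the scoring are all illustrative assumptions.

```python
from collections import defaultdict


def context_features(sentences, window=1):
    """Map each word to the set of words seen within `window` tokens of it."""
    features = defaultdict(set)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    features[word].add(tokens[j])
    return features


def similarity_list(target, features, top_n=5):
    """Rank other words by how many context features they share with `target`."""
    scores = {
        other: len(features[target] & feats)
        for other, feats in features.items()
        if other != target
    }
    # Sort by descending overlap, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(word, score) for word, score in ranked if score > 0][:top_n]


# Hypothetical toy corpus in which "python" occurs in two senses.
corpus = [
    "the python snake bites".split(),
    "the cobra snake bites".split(),
    "the python language compiles".split(),
    "the java language compiles".split(),
]
features = context_features(corpus)
thesaurus = similarity_list("python", features)
print(thesaurus)
```

In this toy corpus the list for "python" mixes an animal neighbour ("cobra") with a programming-language neighbour ("java"). An undisambiguated similarity list of this kind is what word sense induction and the disambiguation of similarity lists are meant to separate.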
Keywords
Lexical Knowledge Base; Word Similarity Lists; BabelNet; Sense Inventory; Distributional Thesaurus (DT)
Notes
Acknowledgments
We acknowledge the support of the Deutsche Forschungsgemeinschaft (DFG) under the JOIN-T project.
Copyright information
© Springer International Publishing AG 2016