Semantic Web Evaluation Challenge

Semantic Web Evaluation Challenges, pp 51-62

Exploiting Linked Open Data to Uncover Entity Types

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 548)


Extracting structured information from text plays a crucial role in automatic knowledge acquisition and is at the core of any knowledge representation and reasoning system. Traditional methods rely on hand-crafted rules and are limited by the performance of the linguistic pre-processing tools they depend on. More recent approaches rely on supervised learning of relations from labelled examples, which can be created manually or, in some cases, generated automatically (referred to as distant supervision). We propose a supervised method for entity typing and alignment. We argue that a rich feature space can improve extraction accuracy, and we propose to exploit Linked Open Data (LOD) for feature enrichment. We test our approach on task 2 of the Open Knowledge Extraction challenge, which covers automatic entity typing and alignment. Our approach demonstrates that combining evidence derived from LOD (e.g. DBpedia) and conventional lexical resources (e.g. WordNet) (i) improves the accuracy of the supervised induction method and (ii) enables easy matching with the Dolce+DnS Ultra Lite ontology classes.
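As a rough illustration of what LOD-based feature enrichment can look like, the sketch below builds a SPARQL query for the rdf:type values of a DBpedia resource and turns a list of returned type URIs into binary features. It is not the authors' implementation: the function names, the `lod_type=` feature prefix and the choice of rdf:type as the evidence source are assumptions made for this example, and no query is actually sent to an endpoint.

```python
from typing import Dict, Iterable

# Public DBpedia endpoint, shown for context only; this sketch never calls it.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def build_type_query(entity_uri: str) -> str:
    """Return a SPARQL query listing the rdf:type values of one resource."""
    return (
        "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n"
        f"SELECT DISTINCT ?type WHERE {{ <{entity_uri}> rdf:type ?type }}"
    )


def type_features(type_uris: Iterable[str]) -> Dict[str, int]:
    """Map type URIs to binary features keyed by each URI's local name.

    The 'lod_type=' prefix is a hypothetical naming convention.
    """
    features: Dict[str, int] = {}
    for uri in type_uris:
        # Local name: the part after the last '/' or '#'.
        local = uri.rstrip("/#").rsplit("/", 1)[-1].rsplit("#", 1)[-1]
        features[f"lod_type={local}"] = 1
    return features


if __name__ == "__main__":
    print(build_type_query("http://dbpedia.org/resource/Sheffield"))
    print(type_features([
        "http://dbpedia.org/ontology/City",
        "http://www.w3.org/2002/07/owl#Thing",
    ]))
```

Features of this kind could be concatenated with lexical features (for example, WordNet hypernyms of the head noun) before training a supervised classifier; how the evidence is actually combined is described in the paper itself, not in this sketch.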


Resource Description Framework; Name Entity Recognition; Head Noun; SPARQL Query; Link Open Data
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Part of this research has been sponsored by the EPSRC funded project LODIE: Linked Open Data for IE, EP/J019488/1.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. OAK Group, Department of Computer Science, University of Sheffield, Sheffield, UK
