LODifier: Generating Linked Data from Unstructured Text

  • Isabelle Augenstein
  • Sebastian Padó
  • Sebastian Rudolph
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7295)


The automated extraction of information from text and its transformation into a formal description is an important goal in both Semantic Web research and computational linguistics. The extracted information can be used for a variety of tasks such as ontology generation, question answering and information retrieval. LODifier is an approach that combines deep semantic analysis with named entity recognition, word sense disambiguation and controlled Semantic Web vocabularies in order to extract named entities and relations between them from text and to convert them into an RDF representation which is linked to DBpedia and WordNet. We present the architecture of our tool and discuss design decisions made. An evaluation of the tool on a story link detection task gives clear evidence of its practical potential.
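The final step described above, turning an extracted relation between two named entities into RDF linked to DBpedia, can be illustrated with a minimal sketch. It is not LODifier's implementation: the `http://example.org/sketch/` namespace, the function names, and the use of `owl:sameAs` as the linking property are assumptions made for illustration, and the input triple stands in for the output of the named entity recognition and semantic analysis stages. Only the DBpedia resource URI pattern (`http://dbpedia.org/resource/` followed by the label with underscores) is taken as given.

```python
from urllib.parse import quote

DBPEDIA = "http://dbpedia.org/resource/"
EX = "http://example.org/sketch/"  # hypothetical namespace, not used by LODifier
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"  # assumed linking property


def local_name(label: str) -> str:
    """Turn a surface label into a URI-safe local name (spaces become underscores)."""
    return quote(label.strip().replace(" ", "_"))


def ntriple(subject: str, predicate: str, obj: str) -> str:
    """Serialize one triple of URIs in N-Triples syntax."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def relation_to_ntriples(subject_label: str, relation: str, object_label: str) -> list[str]:
    """Emit three triples: two entity links to DBpedia and the relation itself.

    The inputs are assumed to come from upstream NER and relation extraction.
    """
    s = EX + "entity/" + local_name(subject_label)
    o = EX + "entity/" + local_name(object_label)
    p = EX + "relation/" + local_name(relation)
    return [
        ntriple(s, OWL_SAME_AS, DBPEDIA + local_name(subject_label)),
        ntriple(o, OWL_SAME_AS, DBPEDIA + local_name(object_label)),
        ntriple(s, p, o),
    ]


if __name__ == "__main__":
    for line in relation_to_ntriples("Barack Obama", "visit", "Berlin"):
        print(line)
```

A real system would also have to disambiguate entity labels against DBpedia and map relation words to WordNet senses, both of which this sketch leaves out.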


Resource Description Framework; Name Entity Recognition; Word Sense Disambiguation; Entity Recognition; Link Open Data
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Isabelle Augenstein (1)
  • Sebastian Padó (1)
  • Sebastian Rudolph (2)
  1. Department of Computational Linguistics, Universität Heidelberg, Germany
  2. Institute AIFB, Karlsruhe Institute of Technology, Germany
