M-ATOLL: A Framework for the Lexicalization of Ontologies in Multiple Languages

  • Sebastian Walter
  • Christina Unger
  • Philipp Cimiano
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8796)

Abstract

Many tasks in which a system needs to mediate between natural language expressions and elements of a vocabulary in an ontology or dataset require knowledge about how the elements of the vocabulary (i.e. classes, properties, and individuals) are expressed in natural language. In a multilingual setting, such knowledge is needed for each of the supported languages. In this paper we present M-ATOLL, a framework for automatically inducing ontology lexica in multiple languages on the basis of a multilingual corpus. The framework exploits a set of language-specific dependency patterns which are formalized as SPARQL queries and run over a parsed corpus. We have instantiated the system for two languages: German and English. We evaluate it in terms of precision, recall and F-measure for English and German by comparing an automatically induced lexicon to manually constructed ontology lexica for DBpedia. In particular, we investigate the contribution of each single dependency pattern and perform an analysis of the impact of different parameters.
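
The evaluation described in the abstract compares an automatically induced lexicon against a manually constructed one. A minimal sketch of that kind of comparison follows. It assumes that lexical entries can be represented as (lemma, part of speech, property) tuples and that an induced entry counts as correct only on an exact match with a gold entry. The abstract specifies neither the representation nor the matching criterion, so the function name `evaluate_lexicon` and the sample entries are hypothetical.

```python
def evaluate_lexicon(induced, gold):
    """Return (precision, recall, F-measure) of an induced lexicon.

    Assumption: entries are hashable tuples and are compared by exact
    match; the paper's actual matching criterion may differ.
    """
    induced, gold = set(induced), set(gold)
    true_positives = len(induced & gold)
    precision = true_positives / len(induced) if induced else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical entries: (lemma, part of speech, ontology property)
induced = {
    ("wife", "noun", "dbo:spouse"),
    ("marry", "verb", "dbo:spouse"),
    ("author", "noun", "dbo:birthPlace"),
}
gold = {
    ("wife", "noun", "dbo:spouse"),
    ("marry", "verb", "dbo:spouse"),
    ("husband", "noun", "dbo:spouse"),
    ("write", "verb", "dbo:author"),
}

precision, recall, f_measure = evaluate_lexicon(induced, gold)
print(f"P={precision:.2f} R={recall:.2f} F={f_measure:.2f}")
```

Treating entries as a set keeps the comparison independent of the order in which patterns produced them; a softer criterion (for example, matching on lemma only) would need a different notion of overlap.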

Keywords

Lexical Entry; Text Corpus; Computational Linguistics; SPARQL Query; Dependency Pattern
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Sebastian Walter (1)
  • Christina Unger (1)
  • Philipp Cimiano (1)
  1. Semantic Computing Group, CITEC, Bielefeld University, Germany
