YAGO: A Multilingual Knowledge Base from Wikipedia, Wordnet, and Geonames

  • Thomas Rebele
  • Fabian Suchanek
  • Johannes Hoffart
  • Joanna Biega
  • Erdal Kuzey
  • Gerhard Weikum
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9982)


YAGO is a large knowledge base that is built automatically from Wikipedia, WordNet and GeoNames. The project combines information from Wikipedias in 10 different languages into a coherent whole, thus giving the knowledge a multilingual dimension. It also attaches spatial and temporal information to many facts, and thus allows the user to query the data over space and time. YAGO focuses on extraction quality and achieves a manually evaluated precision of 95 %. In this paper, we explain how YAGO is built from its sources, how its quality is evaluated, how a user can access it, and how other projects utilize it.


Keywords: Knowledge base, Wikipedia, WordNet, GeoNames



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Thomas Rebele (1)
  • Fabian Suchanek (1)
  • Johannes Hoffart (2)
  • Joanna Biega (2)
  • Erdal Kuzey (2)
  • Gerhard Weikum (2)
  1. Télécom ParisTech, Paris, France
  2. Max Planck Institute for Informatics, Saarbrücken, Germany
