Extraction of Historical Events from Wikipedia

  • Daniel Hienert
  • Francesco Luciano
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7540)


The DBpedia project extracts structured information from Wikipedia and makes it available on the web. It gathers this information mainly from infoboxes, which hold structured data about a Wikipedia article. Much information, however, appears only in the article body and is not yet included in DBpedia. In this paper we focus on extracting historical events from Wikipedia articles that cover about 2,500 years and are available in different languages. We have extracted about 121,000 events with more than 325,000 links to DBpedia entities and provide access to this data via a Web API, a SPARQL endpoint, a Linked Data interface and a timeline application.
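
To make the idea of turning article-body text into linked events concrete, the following is a minimal sketch and not the authors' extraction pipeline. It assumes that an event appears as a wikitext bullet of the form `* [[date]] – description with [[links]]`. The names `Event`, `parse_event` and `dbpedia_uri` are hypothetical, and the title-to-URI mapping is a simplification that only replaces spaces with underscores.

```python
import re
from dataclasses import dataclass, field

# Assumed shape of one event bullet on a year page (an illustration,
# not a format taken from the paper):
#   * [[March 15]] – [[Julius Caesar]] is assassinated in [[Rome]].
LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")
BULLET = re.compile(r"^\*\s*(?P<date>\[\[[^\]]+\]\])\s*[–-]\s*(?P<text>.+)$")


@dataclass
class Event:
    year: int
    date: str
    description: str
    links: list = field(default_factory=list)


def dbpedia_uri(title):
    """Map a Wikipedia page title to a DBpedia resource URI (simplified)."""
    return "http://dbpedia.org/resource/" + title.strip().replace(" ", "_")


def link_label(match):
    """Return the visible text of a wiki link: the label if piped, else the title."""
    return match.group(2) or match.group(1)


def parse_event(line, year):
    """Parse one bullet line into an Event, or return None if it does not match."""
    m = BULLET.match(line.strip())
    if m is None:
        return None
    text = m.group("text")
    return Event(
        year=year,
        date=LINK.sub(link_label, m.group("date")),
        description=LINK.sub(link_label, text),
        links=[dbpedia_uri(l.group(1)) for l in LINK.finditer(text)],
    )
```

For example, `parse_event("* [[March 15]] – [[Julius Caesar]] is assassinated in [[Rome]].", -44)` would yield an event dated "March 15" with two entity links. A real extractor would also have to handle nested lists, date ranges, templates and language-specific page layouts.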


Keywords: Historical events, Wikipedia, DBpedia, Linked data



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. GESIS – Leibniz Institute for the Social Sciences, Cologne, Germany
