Extracting Events from Wikipedia as RDF Triples Linked to Widespread Semantic Web Datasets

  • Carlo Aliprandi
  • Francesco Ronzano
  • Andrea Marchetti
  • Maurizio Tesconi
  • Salvatore Minutoli
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6778)

Abstract

Many attempts have been made to extract structured data from Web resources, expose it as RDF triples, and interlink it with other RDF datasets, making it possible to create clouds of highly integrated Semantic Web data collections. In this paper we describe an approach to enhance the extraction of semantic content from unstructured textual documents, considering Wikipedia articles in particular and focusing on event mining. Starting from the deep parsing of a set of English Wikipedia articles, we produce a semantic annotation compliant with the Knowledge Annotation Format (KAF). We extract events from the KAF semantic annotation and then structure each event as a set of RDF triples linked to both DBpedia and WordNet. We present examples of automatically mined events and provide a general evaluation of how our approach may discover new events and link them to existing content.
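
To make the last step of the pipeline concrete, the following is a minimal sketch of how a single mined event might be structured as RDF triples that point to DBpedia resources and to a WordNet sense. It is not the authors' implementation. The `Event` class, the `http://example.org/...` namespaces, the property names (`predicate`, `agent`, `patient`) and the synset identifier are all hypothetical and chosen only for illustration; only the `http://dbpedia.org/resource/` prefix and the `rdf:type` URI are real. The sample event is hand-written, not output of the system described here.

```python
from dataclasses import dataclass
from typing import List, Tuple

Triple = Tuple[str, str, str]

DBPEDIA = "http://dbpedia.org/resource/"   # real DBpedia resource prefix
EX = "http://example.org/event/"           # hypothetical event vocabulary
WN = "http://example.org/wordnet/"         # hypothetical stand-in for WordNet synset URIs
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


@dataclass
class Event:
    """One event mined from text (illustrative fields, not the KAF schema)."""
    event_id: str
    predicate_synset: str  # sense of the event verb, linked to WordNet
    agent: str             # DBpedia resource name of the participant acting
    patient: str           # DBpedia resource name of the participant acted upon


def event_to_triples(event: Event) -> List[Triple]:
    """Structure an event as a set of (subject, predicate, object) URI triples."""
    subject = EX + event.event_id
    return [
        (subject, RDF_TYPE, EX + "Event"),
        (subject, EX + "predicate", WN + event.predicate_synset),
        (subject, EX + "agent", DBPEDIA + event.agent),
        (subject, EX + "patient", DBPEDIA + event.patient),
    ]


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize URI-only triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


if __name__ == "__main__":
    # Hand-written sample event; the synset id is a made-up placeholder.
    sample = Event("e1", "discover-verb-1", "Marie_Curie", "Radium")
    print(to_ntriples(event_to_triples(sample)))
```

Keeping the event as its own subject node, rather than a single agent-verb-patient triple, leaves room to attach further participants or a time expression to the same event later without changing the existing statements.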

Keywords

Knowledge Representation, Knowledge Extraction, Semantic Web, Natural Language Processing, Semantics

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Carlo Aliprandi (1)
  • Francesco Ronzano (2)
  • Andrea Marchetti (2)
  • Maurizio Tesconi (2)
  • Salvatore Minutoli (2)

  1. Synthema Srl, Ospedaletto (Pisa), Italy
  2. Institute of Informatics and Telematics (IIT), CNR, Pisa, Italy
