Aspects of Natural Language Processing pp 311-331

Part of the Lecture Notes in Computer Science book series (LNCS, volume 5070)

Exploring Curvature-Based Topic Development Analysis for Detecting Event Reporting Boundaries

  • Jakub Piskorski

Abstract

With the proliferation of electronic news media and an ever-growing demand for prompt and concise information, natural language processing technologies that map free text into structured data are becoming paramount. Recently, we have witnessed the emergence of publicly accessible news aggregation systems that facilitate navigation through news. This paper reports on explorations in refining a real-time news event extraction system, which runs on top of the Europe Media Monitoring news aggregation system developed at the Joint Research Centre of the European Commission. Our experiments focus on the task of detecting new events in a given news story, i.e. tagging events extracted by the core event extraction system as new. Several methods are explored, ranging from simple similarity computation over the descriptions of adjacent events to more elaborate ones based on curvature-based topic development analysis, which utilize global knowledge. The paper first describes the particularities of the real-time news event extraction processing chain. Next, to give better insight into how news stories evolve over time, some statistics on event dynamics are presented. Finally, the new event detection techniques are introduced and the results of the evaluation are given.

Keywords

event extraction, topic detection, security informatics, open source intelligence


Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Jakub Piskorski
    • 1
  1. Joint Research Centre of the European Commission, Web Mining and Intelligence of IPSC, Ispra (VA), Italy
