Aspects of Natural Language Processing pp 311-331
Exploring Curvature-Based Topic Development Analysis for Detecting Event Reporting Boundaries
In the era of proliferation of electronic news media and an ever-growing demand for prompt and concise information, natural language text processing technologies which map free texts into structured data format are becoming paramount. Recently, we have witnessed an emergence of publicly accessible news aggregation systems for facilitating navigation through news. This paper reports on some explorations of refining a real-time news event extraction system, which runs on top of the Europe Media Monitoring news aggregation system developed at the Joint Research Centre of the European Commission. Our experiments focus on the task of detecting new events in a given news story, i.e. tagging events extracted by the core event extraction system as new. Several methods ranging from simple similarity computation of event descriptions of adjacent events to more elaborate ones based on curvature-based topic development analysis which utilize global knowledge. The paper describes first the particularities of the real-time news event extraction processing chain. Next, in order to get a better insight how news stories evolve over time some statistics on event dynamics are presented. Finally, the new event detection techniques are introduced and the results of the evaluation are given.
Keywordsevent extraction topic detection security informatics open source intelligence
Unable to display preview. Download preview PDF.