In this paper we propose an incremental clustering algorithm for event detection that exploits the temporal references in the text of newspaper articles. The algorithm is applied hierarchically to a set of articles in order to discover the structure of the topics and events they describe. At the first level, documents with a high temporal-semantic similarity are clustered together into events. At the subsequent levels of the hierarchy, these events are successively clustered so that more complex events and topics can be discovered. The evaluation results show that taking the temporal references of documents into account improves the quality of the system-generated clusters, and that the overall performance of the proposed system compares favorably with that of other on-line detection systems in the literature.
Keywords
- Event Detection
- Temporal Reference
- Term Frequency
- Newspaper Article
- Similar Document
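The scheme outlined in the abstract can be illustrated with a minimal single-pass sketch. It is not the authors' implementation: the abstract does not define the temporal-semantic similarity, so the weighted sum of term-frequency cosine similarity and a date-proximity score used here is an assumption, as are the names `similarity` and `incremental_cluster` and the parameters `alpha`, `scale_days`, and `threshold`.

```python
from collections import Counter
from datetime import date
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def temporal_sim(d1: set, d2: set, scale_days: float = 7.0) -> float:
    """Assumed proximity score: decays with the smallest gap between any two dates."""
    if not d1 or not d2:
        return 0.0
    gap = min(abs((x - y).days) for x in d1 for y in d2)
    return 1.0 / (1.0 + gap / scale_days)


def similarity(a: dict, b: dict, alpha: float = 0.5) -> float:
    """Assumed temporal-semantic similarity: weighted sum of both scores."""
    return (alpha * cosine(a["terms"], b["terms"])
            + (1 - alpha) * temporal_sim(a["dates"], b["dates"]))


def incremental_cluster(items: list, threshold: float = 0.5) -> list:
    """Single pass: each item joins its most similar cluster or starts a new one."""
    clusters = []
    for item in items:
        best, best_sim = None, threshold
        for c in clusters:
            s = similarity(item, c)
            if s >= best_sim:
                best, best_sim = c, s
        if best is None:
            clusters.append({"terms": Counter(item["terms"]),
                             "dates": set(item["dates"]),
                             "members": [item]})
        else:
            # The cluster representative accumulates terms and dates of its members.
            best["terms"].update(item["terms"])
            best["dates"] |= item["dates"]
            best["members"].append(item)
    return clusters


# Hypothetical toy articles: term counts plus the dates mentioned in the text.
docs = [
    {"terms": Counter(earthquake=2, rescue=1), "dates": {date(2000, 1, 10)}},
    {"terms": Counter(earthquake=1, rescue=2), "dates": {date(2000, 1, 11)}},
    {"terms": Counter(election=2, vote=1), "dates": {date(2000, 3, 1)}},
]
events = incremental_cluster(docs, threshold=0.5)
```

Because each cluster carries the same `terms` and `dates` fields as a document, the hierarchy can be sketched by calling `incremental_cluster(events, threshold=...)` again with a looser threshold, so that events are grouped into more complex events and topics.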