Trends Analysis of Topics Based on Temporal Segmentation

  • Wei Chen
  • Parvathi Chundi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5691)

Abstract

Extracting interesting information from large unstructured document sets is a time-consuming task. In this paper, we describe an approach, based on time series segmentation, for analyzing the temporal trends of a given topic in a time-stamped document set. We consider topics containing multiple keywords and use a fuzzy-set-based method to compute a numeric value that measures the relevance of a document set to the given topic. This relevance measure is then used to assign a discrepancy score to a segmentation of the time period associated with the document set. The discrepancy score of a segmentation represents the likelihood of the topic across all segments in the segmentation. Given a user-specified value k, we then define the min difference k segmentation, which captures the k-segmentation with the maximum possible discrepancy score, and describe a dynamic-programming-based algorithm to compute it. The proposed approach is illustrated by several experiments using a subset of the TDT-Pilot Corpus data set. Our experiments show that the min difference k segmentation successfully highlights the temporal trends of a topic using k segments.
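The abstract does not define the discrepancy score, so the following is only a minimal sketch of the general technique it names: a dynamic program that finds, among all ways of cutting a time series into k contiguous segments, the one that maximizes a sum of per-segment scores. The function `best_k_segmentation`, the `segment_score` callback, and the toy score `neg_sse` (negative within-segment squared error over a made-up relevance series) are all assumptions introduced for illustration; they are not the paper's fuzzy relevance measure or its discrepancy score.

```python
from typing import Callable, List, Tuple


def best_k_segmentation(
    n: int,
    k: int,
    segment_score: Callable[[int, int], float],
) -> Tuple[float, List[Tuple[int, int]]]:
    """Split time points 0..n-1 into k contiguous segments so that the sum of
    segment_score(start, end) is maximal (end is exclusive).

    Hypothetical helper: the scoring function is supplied by the caller and
    stands in for whatever per-segment score an application defines.
    """
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")

    neg_inf = float("-inf")
    # best[j][i]: best total score for the first i points cut into j segments.
    best = [[neg_inf] * (n + 1) for _ in range(k + 1)]
    # cut[j][i]: start index of the last segment in that best solution.
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0

    for j in range(1, k + 1):
        for i in range(j, n + 1):
            for s in range(j - 1, i):
                if best[j - 1][s] == neg_inf:
                    continue
                candidate = best[j - 1][s] + segment_score(s, i)
                if candidate > best[j][i]:
                    best[j][i] = candidate
                    cut[j][i] = s

    # Walk the stored cut points backwards to recover the segments.
    segments: List[Tuple[int, int]] = []
    i = n
    for j in range(k, 0, -1):
        s = cut[j][i]
        segments.append((s, i))
        i = s
    segments.reverse()
    return best[k][n], segments


# Toy data (assumed): one topic-relevance value per time point.
relevance = [0.1, 0.1, 0.9, 0.9, 0.2, 0.2]


def neg_sse(start: int, end: int) -> float:
    """Stand-in score: negative squared error around the segment mean."""
    values = relevance[start:end]
    mean = sum(values) / len(values)
    return -sum((v - mean) ** 2 for v in values)


total, segments = best_k_segmentation(len(relevance), 3, neg_sse)
print(segments)  # [(0, 2), (2, 4), (4, 6)]
```

The triple loop evaluates the score O(k·n²) times, which is the usual cost of this kind of segmentation recurrence; swapping in a different `segment_score` changes what "best" means without touching the search.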

Keywords

Temporal Text Mining, Temporal Segmentation, Topics, Fuzzy sets

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Wei Chen
    • 1
  • Parvathi Chundi
    • 1
  1. University of Nebraska at Omaha, Omaha, NE, US
