Data Mining and Knowledge Discovery, Volume 27, Issue 3, pp 421–441

Fast sequence segmentation using log-linear models

DOI: 10.1007/s10618-012-0301-y

Cite this article as:
Tatti, N. Data Min Knowl Disc (2013) 27: 421. doi:10.1007/s10618-012-0301-y

Abstract

Sequence segmentation is a well-studied problem: given a sequence of elements, an integer K, and some measure of homogeneity, the task is to split the sequence into K contiguous segments that are maximally homogeneous. The classic approach to finding the optimal solution is a dynamic program. Unfortunately, the execution time of this program is quadratic in the length of the input sequence, which makes the algorithm slow for sequences of non-trivial length. In this paper we study segmentations whose measure of goodness is based on log-linear models, a rich family that contains many of the standard distributions. We present a theoretical result that allows us to prune many suboptimal segmentations. Using this result, we modify the standard dynamic program for 1D log-linear models and thereby reduce the computational time. We demonstrate empirically that this approach can significantly reduce the computational burden of finding the optimal segmentation.
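
For orientation, the following is a minimal sketch of the standard quadratic dynamic program that the abstract takes as its baseline; it is not the pruned variant the paper proposes. As an assumption made only for illustration, segment cost is the sum of squared deviations from the segment mean, which corresponds to a Gaussian model with fixed variance. The names `segment`, `cost`, `best`, and `back` are hypothetical and do not come from the article.

```python
from itertools import accumulate


def segment(seq, k):
    """Split seq into k contiguous segments with minimal total cost.

    Returns (total_cost, starts), where starts[m] is the index at which
    segment m begins. Cost of a segment (an assumption of this sketch) is
    the sum of squared deviations from the segment mean.
    """
    n = len(seq)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(seq)")

    # Prefix sums of x and x^2 give any segment cost in O(1).
    s1 = [0.0] + list(accumulate(float(x) for x in seq))
    s2 = [0.0] + list(accumulate(float(x) * x for x in seq))

    def cost(i, j):
        """Cost of the non-empty segment seq[i:j]."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[m][j]: minimal cost of splitting seq[:j] into m segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    # back[m][j]: start index of the last segment in that optimal split.
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0

    for m in range(1, k + 1):
        for j in range(m, n + 1):
            # Trying every start i of the last segment is what makes the
            # running time quadratic in n (O(k * n^2) overall).
            for i in range(m - 1, j):
                c = best[m - 1][i] + cost(i, j)
                if c < best[m][j]:
                    best[m][j] = c
                    back[m][j] = i

    # Walk the back-pointers from the end to recover segment starts.
    starts = []
    j = n
    for m in range(k, 0, -1):
        j = back[m][j]
        starts.append(j)
    starts.reverse()
    return best[k][n], starts
```

The innermost loop over `i` is the part a pruning result would aim to shorten: every candidate start of the last segment is examined, even those that cannot belong to an optimal segmentation.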

Keywords

Segmentation, Pruning, Change-point detection, Dynamic program

Copyright information

© The Author(s) 2013

Authors and Affiliations

  1. Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  2. Department of Computer Science, Katholieke Universiteit Leuven, Leuven, Belgium
