Data Mining and Knowledge Discovery, Volume 27, Issue 3, pp 421–441

Fast sequence segmentation using log-linear models


DOI: 10.1007/s10618-012-0301-y

Cite this article as:
Tatti, N. Data Min Knowl Disc (2013) 27: 421. doi:10.1007/s10618-012-0301-y


Sequence segmentation is a well-studied problem: given a sequence of elements, an integer K, and some measure of homogeneity, the task is to split the sequence into K contiguous segments that are maximally homogeneous. The classic approach to finding the optimal solution is a dynamic program. Unfortunately, the execution time of this program is quadratic in the length of the input sequence, which makes the algorithm slow for sequences of non-trivial length. In this paper we study segmentations whose measure of goodness is based on log-linear models, a rich family that contains many of the standard distributions. We present a theoretical result that allows us to prune many suboptimal segmentations. Using this result, we modify the standard dynamic program for 1D log-linear models and thereby reduce the computational time. We demonstrate empirically that this approach can significantly reduce the computational burden of finding the optimal segmentation.
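For orientation, the quadratic-time baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration of the standard dynamic program only, not the pruned algorithm proposed in the paper. The function name `segment` and the choice of squared error around the segment mean as the homogeneity measure are assumptions made for this example; squared error is used here as one simple stand-in for a score of the kind the abstract describes.

```python
from itertools import accumulate


def segment(seq, k):
    """Optimal split of seq into k contiguous segments (illustrative sketch).

    Assumption: homogeneity is measured by squared error around each
    segment's mean. Returns (total_cost, end_indices), where end_indices
    holds the exclusive end position of each segment.
    """
    n = len(seq)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(seq)")

    # Prefix sums of x and x^2 give the cost of any segment in O(1).
    s1 = [0.0] + list(accumulate(float(x) for x in seq))
    s2 = [0.0] + list(accumulate(float(x) * x for x in seq))

    def cost(i, j):
        # Squared error of seq[i:j] around its mean.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[m][j]: minimal cost of splitting seq[:j] into m segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            # Try every start i of the last segment: this inner loop is
            # what makes the running time quadratic in n.
            for i in range(m - 1, j):
                c = best[m - 1][i] + cost(i, j)
                if c < best[m][j]:
                    best[m][j] = c
                    back[m][j] = i

    # Walk the back-pointers to recover the segment boundaries.
    ends = []
    j = n
    for m in range(k, 0, -1):
        ends.append(j)
        j = back[m][j]
    ends.reverse()
    return best[k][n], ends
```

Calling `segment([1, 1, 1, 5, 5, 5, 9, 9], 3)` splits the sequence after positions 3 and 6. The triple loop costs O(K n^2) segment evaluations, which is the burden the paper's pruning result is meant to reduce.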


Segmentation; Pruning; Change-point detection; Dynamic program

Copyright information

© The Author(s) 2013

Authors and Affiliations

  1. Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  2. Department of Computer Science, Katholieke Universiteit Leuven, Leuven, Belgium
