Knowledge and Information Systems

Volume 5, Issue 2, pp 162–182

Necessary and Sufficient Pre-processing in Numerical Range Discretization

  • Tapio Elomaa
  • Juho Rousu

Abstract.

The time complexities of class-driven numerical range discretization algorithms depend on the number of cut point candidates. Previous analysis has shown that only a subset of all cut points, the segment borders, has to be taken into account in optimal discretization with respect to many goodness criteria. In this paper we show that inspecting segment borders alone suffices in optimizing any convex evaluation function. For strictly convex evaluation functions, inspecting all of them is also necessary, since the placement of neighboring cut points affects the optimality of a segment border. With the training set error function, which is not strictly convex, it suffices to inspect an even smaller set of cut point candidates, called alternations, when striving for an optimal partition. On the other hand, we prove that failing to check an alternation may lead to suboptimal discretization. We present a linear-time algorithm for finding all alternation points. The number of alternation points is typically much lower than the total number of cut points. In our experiments, running the discretization algorithm over the sequence of alternation points led to a significant speed-up.
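To illustrate how the candidate set shrinks, the following is a minimal sketch of segment-border extraction. It is not the authors' algorithm, and the abstract does not define segment borders; the sketch assumes the usual definition, in which examples sharing an attribute value form a bin and a segment border lies between two adjacent bins whose relative class distributions differ. The function name `segment_borders` and the choice of midpoints as cut values are hypothetical. Alternation points are not sketched, since their definition is not given here.

```python
from collections import Counter
from fractions import Fraction
from itertools import groupby


def segment_borders(values, labels):
    """Return midpoints between adjacent bins whose relative class
    distributions differ (assumed definition of a segment border)."""
    # Sort examples by attribute value only; labels need not be comparable.
    examples = sorted(zip(values, labels), key=lambda pair: pair[0])

    # One bin per distinct value, holding its exact relative class frequencies.
    bins = []
    for value, group in groupby(examples, key=lambda pair: pair[0]):
        counts = Counter(label for _, label in group)
        total = sum(counts.values())
        distribution = {c: Fraction(n, total) for c, n in counts.items()}
        bins.append((value, distribution))

    # A single linear scan over adjacent bin pairs finds the borders.
    borders = []
    for (left, left_dist), (right, right_dist) in zip(bins, bins[1:]):
        if left_dist != right_dist:
            borders.append((left + right) / 2)
    return borders


values = [1, 1, 2, 2, 3, 4, 4, 5]
labels = ["a", "b", "a", "b", "a", "a", "a", "b"]
print(segment_borders(values, labels))  # [2.5, 4.5]
```

In this toy sample there are four possible cut points (1.5, 2.5, 3.5, 4.5), but only two of them separate bins with different class distributions. After the initial sort, the scan itself is linear in the number of examples.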

Keywords:

Discretization; Numerical attributes; Optimal partitioning

Copyright information

© Springer-Verlag London Limited 2003

Authors and Affiliations

  1. Department of Computer Science, University of Helsinki, Helsinki, Finland
  2. Department of Computer Science, FIN-00014 University of Helsinki, Finland