Optimizing Feature Sets for Structured Data

  • Ulrich Rückert
  • Stefan Kramer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4701)

Abstract

Choosing a suitable feature representation for structured data is a non-trivial task due to the vast number of potential candidates. Ideally, one would like to pick a small, but informative set of structural features, each providing complementary information about the instances. We frame the search for a suitable feature set as a combinatorial optimization problem. For this purpose, we define a scoring function that favors features that are as dissimilar as possible to all other features. The score is used in a stochastic local search (SLS) procedure to maximize the diversity of a feature set. In experiments on small molecule data, we investigate the effectiveness of a forward selection approach with two different linear classification schemes.
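
The abstract does not spell out the scoring function or the search moves, so the following is only a minimal sketch of the general idea: a stochastic local search that swaps features in and out of a fixed-size set to increase a pairwise diversity score. Everything specific here is an assumption made for illustration, not the authors' method: the binary feature columns, the `dissimilarity` measure (which treats identical and complementary columns as equally redundant), the mean-pairwise `diversity` score, and the `noise` parameter for accepting worsening moves.

```python
import random


def dissimilarity(a, b):
    """Hypothetical measure: 0 for identical or complementary binary columns,
    1 when the columns agree on exactly half of the instances."""
    agree = sum(x == y for x, y in zip(a, b)) / len(a)
    return 1.0 - abs(2.0 * agree - 1.0)


def diversity(columns, subset):
    """Mean pairwise dissimilarity over the selected feature indices."""
    idx = sorted(subset)
    pairs = [(i, j) for pos, i in enumerate(idx) for j in idx[pos + 1:]]
    if not pairs:
        return 0.0
    total = sum(dissimilarity(columns[i], columns[j]) for i, j in pairs)
    return total / len(pairs)


def sls_select(columns, k, steps=1000, noise=0.1, seed=0):
    """Search for k feature indices with a high diversity score.

    Each step swaps one selected feature for one unselected feature.
    Non-worsening swaps are always accepted; worsening swaps are accepted
    with probability `noise` so the search can leave local optima.
    """
    n = len(columns)
    if k >= n:
        # Nothing to search: every feature is selected.
        everything = set(range(n))
        return everything, diversity(columns, everything)

    rng = random.Random(seed)
    current = set(rng.sample(range(n), k))
    current_score = diversity(columns, current)
    best, best_score = set(current), current_score

    for _ in range(steps):
        out_f = rng.choice(sorted(current))
        in_f = rng.choice([f for f in range(n) if f not in current])
        candidate = (current - {out_f}) | {in_f}
        cand_score = diversity(columns, candidate)
        if cand_score >= current_score or rng.random() < noise:
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = set(current), current_score

    return best, best_score


if __name__ == "__main__":
    # Toy data: each inner list is one feature's values over four instances.
    toy_columns = [
        [0, 0, 1, 1],  # feature 0
        [0, 0, 1, 1],  # feature 1, duplicate of feature 0
        [1, 1, 0, 0],  # feature 2, complement of feature 0
        [0, 1, 0, 1],  # feature 3, carries different information
    ]
    selected, score = sls_select(toy_columns, k=2, steps=200)
    print(sorted(selected), score)
```

In a setting like the one described, each column would record whether a structural feature (for example, a substructure) occurs in each molecule, and the selected set would then be passed on to a classifier.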

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Ulrich Rückert (1)
  • Stefan Kramer (1)
  1. Institut für Informatik/I12, Technische Universität München, Boltzmannstr. 3, D-85748 Garching b. München, Germany
