Optimizing Feature Sets for Structured Data

  • Ulrich Rückert
  • Stefan Kramer
Conference paper

DOI: 10.1007/978-3-540-74958-5_72

Part of the Lecture Notes in Computer Science book series (LNCS, volume 4701)
Cite this paper as:
Rückert U., Kramer S. (2007) Optimizing Feature Sets for Structured Data. In: Kok J.N., Koronacki J., Mantaras R.L., Matwin S., Mladenič D., Skowron A. (eds) Machine Learning: ECML 2007. ECML 2007. Lecture Notes in Computer Science, vol 4701. Springer, Berlin, Heidelberg

Abstract

Choosing a suitable feature representation for structured data is a non-trivial task due to the vast number of potential candidates. Ideally, one would like to pick a small, but informative set of structural features, each providing complementary information about the instances. We frame the search for a suitable feature set as a combinatorial optimization problem. For this purpose, we define a scoring function that favors features that are as dissimilar as possible to all other features. The score is used in a stochastic local search (SLS) procedure to maximize the diversity of a feature set. In experiments on small molecule data, we investigate the effectiveness of a forward selection approach with two different linear classification schemes.
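
The abstract does not spell out the scoring function or the search procedure, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes that each structural feature is a binary occurrence vector over the instances, that similarity is measured by the Jaccard coefficient, and that the diversity score is the negative sum of pairwise similarities. The names jaccard, diversity and sls_select, as well as the steps and noise parameters, are hypothetical.

```python
import random
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two binary occurrence vectors (assumed measure)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 1.0


def diversity(columns, subset):
    """Assumed score: negative sum of pairwise similarities (higher is more diverse)."""
    return -sum(jaccard(columns[i], columns[j])
                for i, j in combinations(sorted(subset), 2))


def sls_select(columns, k, steps=200, noise=0.1, seed=0):
    """Simple stochastic local search over feature subsets of size k.

    Each step swaps one selected feature for one unselected feature.
    Non-worsening swaps are always accepted; worsening swaps are accepted
    with probability `noise` to escape local optima.
    """
    n = len(columns)
    if k >= n:
        everything = set(range(n))
        return sorted(everything), diversity(columns, everything)
    rng = random.Random(seed)
    current = set(rng.sample(range(n), k))
    current_score = diversity(columns, current)
    best, best_score = set(current), current_score
    for _ in range(steps):
        removed = rng.choice(sorted(current))
        added = rng.choice([i for i in range(n) if i not in current])
        candidate = (current - {removed}) | {added}
        candidate_score = diversity(columns, candidate)
        if candidate_score >= current_score or rng.random() < noise:
            current, current_score = candidate, candidate_score
            if candidate_score > best_score:
                best, best_score = set(candidate), candidate_score
    return sorted(best), best_score


if __name__ == "__main__":
    # Toy data: rows are features, entries mark occurrence in four instances.
    toy_columns = [
        [1, 1, 0, 0],
        [1, 1, 0, 0],  # duplicate of feature 0
        [0, 0, 1, 1],
        [0, 0, 1, 0],
    ]
    print(sls_select(toy_columns, k=2))
```

On the toy data, the search should avoid pairing the two identical features, since that pair has the lowest score under the assumed measure. Any other similarity measure could be substituted for jaccard without changing the search loop.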

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Ulrich Rückert (1)
  • Stefan Kramer (1)
  1. Institut für Informatik/I12, Technische Universität München, Boltzmannstr. 3, D-85748 Garching b. München, Germany
