Soft Computing, Volume 16, Issue 5, pp 903–917

New algorithms for finding approximate frequent item sets

  • Christian Borgelt
  • Christian Braune
  • Tobias Kötter
  • Sonja Grün
Focus

DOI: 10.1007/s00500-011-0776-2

Cite this article as:
Borgelt, C., Braune, C., Kötter, T. et al. Soft Comput (2012) 16: 903. doi:10.1007/s00500-011-0776-2

Abstract

In standard frequent item set mining a transaction supports an item set only if all items in the set are present. However, in many cases this is too strict a requirement that can render it impossible to find certain relevant groups of items. By relaxing the support definition, allowing for some items of a given set to be missing from a transaction, this drawback can be amended. The resulting item sets have been called approximate, fault-tolerant or fuzzy item sets. In this paper we present two new algorithms to find such item sets: the first is an extension of item set mining based on cover similarities and computes and evaluates the subset size occurrence distribution with a scheme that is related to the Eclat algorithm. The second employs a clustering-like approach, in which the distances are derived from the item covers with distance measures for sets or binary vectors and which is initialized with a one-dimensional Sammon projection of the distance matrix. We demonstrate the benefits of our algorithms by applying them to a concept detection task on the 2008/2009 Wikipedia Selection for schools and to the neurobiological task of detecting neuron ensembles in (simulated) parallel spike trains.
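The relaxed support definition described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it only counts, by brute force, how many items of a given set each transaction contains (a subset size occurrence distribution) and derives from that a support value that tolerates a fixed number of missing items. The function names, the `max_missing` parameter, and the toy transactions are assumptions made for illustration. The Jaccard distance stands in as one possible distance measure for item covers; the abstract does not say which measures the paper uses.

```python
from collections import Counter


def subset_size_distribution(transactions, item_set):
    """For each k, count the transactions that contain exactly k items of item_set."""
    items = frozenset(item_set)
    return Counter(len(items & set(t)) for t in transactions)


def approximate_support(transactions, item_set, max_missing=1):
    """Count transactions that miss at most max_missing items of item_set.

    Hypothetical helper: max_missing=0 gives the standard (exact) support.
    """
    items = frozenset(item_set)
    dist = subset_size_distribution(transactions, items)
    return sum(n for k, n in dist.items() if len(items) - k <= max_missing)


def cover(transactions, item):
    """Indices of the transactions that contain the given item."""
    return {i for i, t in enumerate(transactions) if item in t}


def jaccard_distance(transactions, a, b):
    """Jaccard distance between the covers of items a and b (assumed measure)."""
    ca, cb = cover(transactions, a), cover(transactions, b)
    union = ca | cb
    return 1.0 - len(ca & cb) / len(union) if union else 0.0


# Toy data, invented for this example.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "d"},
    {"d"},
]

exact = approximate_support(transactions, {"a", "b", "c"}, max_missing=0)
relaxed = approximate_support(transactions, {"a", "b", "c"}, max_missing=1)
print(exact, relaxed)  # 1 3
print(jaccard_distance(transactions, "a", "b"))  # 0.5
```

In this toy data only one transaction contains all of a, b and c, so the exact support is 1, while allowing one missing item raises the support to 3. This is the effect the abstract describes: groups of items that the strict definition would miss become detectable.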

Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  • Christian Borgelt (1)
  • Christian Braune (1, 2)
  • Tobias Kötter (3)
  • Sonja Grün (4, 5)

  1. European Centre for Soft Computing, Mieres (Asturias), Spain
  2. Department of Computer Science, Otto-von-Guericke-University of Magdeburg, Magdeburg, Germany
  3. Department of Computer Science, University of Konstanz, Constance, Germany
  4. RIKEN Brain Science Institute, Wako-Shi, Japan
  5. Institute of Neuroscience and Medicine (INM-6), Research Center Jülich, Jülich, Germany