An Improved Algorithm for Mining Non-Redundant Interacting Feature Subsets

  • Chaofeng Sha
  • Jian Gong
  • Aoying Zhou
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5446)

Abstract

The application of feature subsets with high-order correlation in classification demonstrated its power in a recent study, in which non-redundant interacting feature subsets (NIFSs) are defined based on multi-information. In this paper, we re-examine the problem of finding NIFSs. We further improve the upper and lower bounds on the correlations, which can be used to prune the search space significantly. Experiments on real datasets demonstrate the efficiency and effectiveness of our approach.
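
For readers unfamiliar with multi-information, the following is a minimal sketch of how it can be estimated from data. It assumes the standard definition (the sum of the marginal entropies minus the joint entropy), which may differ in detail from the one used in the paper. The function names `entropy` and `multi_information` and the toy XOR dataset are illustrative and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(outcomes):
    """Empirical Shannon entropy, in bits, of a sequence of hashable outcomes."""
    n = len(outcomes)
    counts = Counter(outcomes)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def multi_information(data, features):
    """Sum of marginal entropies minus the joint entropy of the chosen columns.

    `data` is a list of equal-length tuples; `features` is a list of column
    indices. This is the assumed standard definition, not the paper's code.
    """
    marginals = sum(entropy([row[f] for row in data]) for f in features)
    joint = entropy([tuple(row[f] for f in features) for row in data])
    return marginals - joint


# Toy data (hypothetical): columns 0 and 1 are independent fair bits,
# and column 2 is their XOR.
xor_data = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]

pairwise = multi_information(xor_data, [0, 1])
three_way = multi_information(xor_data, [0, 1, 2])
```

In this toy dataset no pair of features shows any correlation, while the three features together do, which is the kind of high-order interaction that a subset-level measure can capture and a pairwise measure cannot.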

Keywords

Mutual Information; Feature Subset; Mining Algorithm; Real Dataset; Association Rule Mining
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Chaofeng Sha (1)
  • Jian Gong (1)
  • Aoying Zhou (2)
  1. School of Computer Science, Fudan University, China
  2. Shanghai Key Laboratory of Trustworthy Computing, ECNU, China
