Infosel++: Information Based Feature Selection C++ Library

  • Adam Kachel
  • Jacek Biesiada
  • Marcin Blachnik
  • Włodzisław Duch
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6113)

Abstract

A large package of algorithms for feature ranking and selection has been developed. Infosel++, the Information Based Feature Selection C++ Library, is a collection of classes and utilities based on probability estimation that can help developers of machine learning methods to rapidly interface feature selection algorithms, help users select an algorithm appropriate for a given task (embedding feature selection in a machine learning pipeline), and aid researchers in developing new algorithms, especially hybrid algorithms for feature selection. A few examples of such possibilities are presented.
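As a concrete illustration of the probability-estimation approach the abstract describes, the sketch below ranks discretized features by their estimated mutual information with the class label, the basic operation underlying information-based filters. It is a minimal, self-contained C++ example written for this text: the identifiers (mutual_information, features, label) are illustrative assumptions, not the Infosel++ API.

// Minimal illustrative sketch, NOT the Infosel++ API: an information-based
// ranking filter that scores each discretized feature by its estimated
// mutual information with the class label. Requires C++17 (-std=c++17).
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <iostream>
#include <map>
#include <utility>
#include <vector>

// I(X;Y) for two discrete variables, estimated from empirical
// (maximum-likelihood) probabilities:
// I = sum_{x,y} p(x,y) log2( p(x,y) / (p(x) p(y)) ).
double mutual_information(const std::vector<int>& x, const std::vector<int>& y) {
    const double n = static_cast<double>(x.size());
    std::map<int, double> px, py;
    std::map<std::pair<int, int>, double> pxy;
    for (std::size_t i = 0; i < x.size(); ++i) {
        px[x[i]] += 1.0 / n;
        py[y[i]] += 1.0 / n;
        pxy[{x[i], y[i]}] += 1.0 / n;
    }
    double mi = 0.0;
    for (const auto& [xy, p] : pxy)
        mi += p * std::log2(p / (px[xy.first] * py[xy.second]));
    return mi;
}

int main() {
    // Toy data: three discretized features over six samples, plus a class label.
    const std::vector<int> label = {0, 0, 1, 1, 0, 1};
    const std::vector<std::vector<int>> features = {
        {0, 0, 1, 1, 0, 1},  // copies the label: I = 1 bit (maximal relevance)
        {1, 0, 1, 0, 1, 0},  // nearly independent of the label: I close to 0
        {0, 0, 0, 1, 0, 1},  // partially informative
    };

    // Rank features by I(feature; class) in descending order: a simple filter.
    std::vector<std::pair<double, std::size_t>> ranking;
    for (std::size_t f = 0; f < features.size(); ++f)
        ranking.emplace_back(mutual_information(features[f], label), f);
    std::sort(ranking.rbegin(), ranking.rend());

    for (const auto& [mi, f] : ranking)
        std::cout << "feature " << f << ": I = " << mi << " bits\n";
}

Hybrid algorithms of the kind the abstract mentions typically start from such a relevance ranking and then penalize features that are redundant with those already selected, as in MIFS- or mRMR-style criteria.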

Keywords

Feature Selection · Mutual Information · Feature Subset · Feature Selection Method · Feature Selection Algorithm

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Adam Kachel (1)
  • Jacek Biesiada (1, 3)
  • Marcin Blachnik (1)
  • Włodzisław Duch (2)

  1. Electrotechnology Department, Silesian University of Technology, Krasińskiego, Poland
  2. Department of Informatics, Nicolaus Copernicus University, Toruń, Poland
  3. Division of Biomedical Informatics, Children’s Hospital Research Foundation, Ohio, USA
