An Introduction to Feature Extraction

Part of the Studies in Fuzziness and Soft Computing book series (STUDFUZZ, volume 207)


This chapter introduces the reader to the various aspects of feature extraction covered in this book. Section 1 reviews definitions and notations and proposes a unified view of the feature extraction problem. Section 2 is an overview of the methods and results presented in the book, emphasizing novel contributions. Section 3 provides the reader with an entry point into the field of feature extraction by showing small revealing examples and describing simple but effective algorithms. Finally, Section 4 introduces a more theoretical formalism and points to directions of research and open problems.
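To give a flavor of the "simple but effective algorithms" the chapter refers to, here is a minimal sketch of one of the most basic feature selection techniques: a univariate filter that ranks features by their absolute Pearson correlation with the target. This is an illustrative example, not code from the book; the function name `rank_features` and the toy data are assumptions for the sketch.

```python
import numpy as np

def rank_features(X, y):
    """Rank features by absolute Pearson correlation with the target.

    X: (n_samples, n_features) array; y: (n_samples,) target vector.
    Returns feature indices sorted from most to least correlated.
    """
    Xc = X - X.mean(axis=0)          # center each feature
    yc = y - y.mean()                # center the target
    # Correlation of each column of X with y, in one vectorized pass.
    scores = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return np.argsort(-scores)       # indices, best feature first

# Toy example: feature 0 is the target plus noise, feature 1 is pure noise.
rng = np.random.default_rng(0)
y = rng.normal(size=100)
X = np.column_stack([y + 0.1 * rng.normal(size=100), rng.normal(size=100)])
ranking = rank_features(X, y)        # feature 0 should come out on top
```

Such univariate filters are cheap and easy to interpret, but, as the book discusses at length, they ignore feature interactions, which is what motivates multivariate wrappers and embedded methods.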


Keywords: Support Vector Machine · Feature Selection · Random Forest · Feature Subset · Feature Selection Method
(These keywords were added by machine and not by the authors.)




Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  1. ClopiNet, Berkeley, USA
  2. Zürich Research Laboratory, IBM Research GmbH, Rüschlikon, Switzerland
