An Introduction to Feature Extraction
This chapter introduces the reader to the various aspects of feature extraction covered in this book. Section 1 reviews definitions and notations and proposes a unified view of the feature extraction problem. Section 2 is an overview of the methods and results presented in the book, emphasizing novel contributions. Section 3 provides the reader with an entry point into the field of feature extraction by showing small, revealing examples and describing simple but effective algorithms. Finally, Section 4 introduces a more theoretical formalism and points to directions of research and open problems.
Keywords: Support Vector Machine, Feature Selection, Random Forest, Feature Subset, Feature Selection Method