An Introduction to Feature Extraction
This chapter introduces the reader to the various aspects of feature extraction covered in this book. Section 1 reviews definitions and notations and proposes a unified view of the feature extraction problem. Section 2 is an overview of the methods and results presented in the book, emphasizing novel contributions. Section 3 provides the reader with an entry point into the field of feature extraction by showing small, revealing examples and describing simple but effective algorithms. Finally, Section 4 introduces a more theoretical formalism and points to directions of research and open problems.