Variable selection is an important topic in many types of modelling: the choice of which variables to take into account largely determines the result. This is true for every single technique discussed in this reader, be it PCA, clustering methods, classification methods, or regression. In the unsupervised approaches, uninformative variables can obscure the “real” picture, and distances between objects can become meaningless. In the supervised cases (both classification and regression), there is the danger of chance correlations with dependent variables, leading to models with low predictive power. This danger is all the more real given the very low sample-to-variable ratios of current data sets. The aim of variable selection, then, is to reduce the independent variables to those that contain relevant information, and thereby to improve statistical modelling.
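The danger of chance correlations at low sample-to-variable ratios can be illustrated with a minimal sketch on purely synthetic data. Everything here is an assumption made for illustration and is not taken from the chapter: the sample size of 20, the 5000 noise variables, and the helper name `abs_correlations`. Neither the predictors nor the response carry any signal, yet the largest absolute correlation found can only grow as more variables are screened.

```python
import numpy as np

# Synthetic, signal-free data: 20 samples, 5000 noise variables (illustrative sizes).
rng = np.random.default_rng(0)
n_samples, n_variables = 20, 5000
X = rng.standard_normal((n_samples, n_variables))
y = rng.standard_normal(n_samples)


def abs_correlations(X, y):
    """Absolute Pearson correlation of every column of X with y."""
    Xc = X - X.mean(axis=0)  # centre each variable
    yc = y - y.mean()        # centre the response
    return np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))


r = abs_correlations(X, y)

# Screen ever larger nested subsets of the same noise variables.
for k in (5, 50, 500, 5000):
    print(f"first {k:>4} noise variables: max |r| = {r[:k].max():.2f}")
```

Because the subsets are nested, the reported maximum never decreases as variables are added, although every variable is pure noise. A model built on the best-correlated columns would therefore look convincing on these 20 samples and fail on new data.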
Keywords: Variable Selection, Candidate Solution, Ridge Regression, Subset Selection, Global Optimization Method