An Investigation of the Performance of Informative Samples Preservation Methods
Instance-based learning algorithms make prediction/generalization based on the stored instances. Storing all instances of large data size applications causes huge memory requirements and slows program execution speed; it may make the prediction process impractical or even impossible. Therefore researchers have made great efforts to reduce the data size of instance-based learning algorithms by selecting informative samples. This paper has two main purposes. First, it investigates recent developments in informative sample preservation methods and identifies five representative methods for use in this study. Second, the five selected methods are implemented in a standardized input-output interface so that the programs can be used by other researchers, their performance in terms of accuracy and reduction rates are compared on ten benchmark classification problems. K-nearest neighbor is employed as the classifier in the performance comparison.
KeywordsInstance-based learning subset selection pattern selection classification algorithms
Unable to display preview. Download preview PDF.