Encyclopedia of Machine Learning

2010 Edition
Editors: Claude Sammut, Geoffrey I. Webb

Data Set

Reference work entry
DOI: https://doi.org/10.1007/978-0-387-30164-8_196

A data set is a collection of data used for a specific machine learning purpose. A training set is a data set that is given as input to a learning system, which analyzes it to learn a model. A test set, or evaluation set, is a data set used to evaluate the model that the learning system has learned. A training set may be divided further into a growing set, typically used to build an initial model, and a pruning set, typically used to simplify that model. Where the training set and the test set contain disjoint sets of data, the test set is known as a holdout set.
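The relationship between these sets can be illustrated with a minimal sketch. The function name split_data_set, the fractions, and the random shuffling are assumptions made for illustration only; the entry does not prescribe how a data set should be partitioned. The sketch holds out a test set that is disjoint from the training set, then divides the training set into a growing set and a pruning set.

```python
import random


def split_data_set(data, test_fraction=0.2, pruning_fraction=0.25, seed=0):
    """Partition a data set into training, growing, pruning, and holdout test sets.

    Hypothetical helper: the name, the default fractions, and the use of a
    seeded shuffle are illustrative assumptions, not part of the entry.
    """
    items = list(data)
    random.Random(seed).shuffle(items)  # shuffle a copy, reproducibly

    # The test set is disjoint from the training set, so it is a holdout set.
    n_test = int(len(items) * test_fraction)
    test_set = items[:n_test]
    training_set = items[n_test:]

    # The training set is divided further into a pruning set and a growing set.
    n_pruning = int(len(training_set) * pruning_fraction)
    pruning_set = training_set[:n_pruning]
    growing_set = training_set[n_pruning:]

    return {
        "training": training_set,
        "growing": growing_set,
        "pruning": pruning_set,
        "test": test_set,
    }


splits = split_data_set(range(100))
print({name: len(subset) for name, subset in splits.items()})
```

Because the growing and pruning sets are both drawn from the training set, neither overlaps the holdout test set, so the evaluation uses data the learning system has not seen.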

Copyright information

© Springer Science+Business Media, LLC 2011