Efficient induction of numerical constraints
In this paper, we address the problem of learning a hyperplane that separates data described by numeric attributes. When the data are not linearly separable, the “best” hyperplanes are those that minimize the number of misclassified examples. This topic has already been studied in many works but, as far as we know, the originality of our approach is that it is based on Linear Programming techniques, which appear to be very efficient. We propose two methods. In the first, some variables are required to be integers; this yields one of the best solutions in terms of the number of misclassified examples but, in return, solving a Linear Programming problem that includes both real-valued and integer-valued variables is not efficient. In the second method, we relax the integrality constraints on these variables. In that case we obtain only an approximate solution to the problem, but the gain in efficiency is so large that the process can be iterated, removing the misclassified examples that are farthest from the learned hyperplane, up to a given threshold. Experiments were carried out on randomly generated examples; they show that, on average, the two methods give the same rate of noise.
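The second method can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption: the relaxed problem is written as a standard slack-variable linear program (minimize the total slack s_i subject to y_i (w · x_i + b) >= 1 - s_i, s_i >= 0), the solver is SciPy's `linprog`, and the function names `fit_relaxed_hyperplane` and `iterate_with_removal`, the parameter `max_removed`, and the stopping rule are hypothetical choices made for illustration only.

```python
import numpy as np
from scipy.optimize import linprog


def fit_relaxed_hyperplane(X, y):
    """Assumed relaxed LP: minimize sum(s) subject to
    y_i * (w . x_i + b) >= 1 - s_i and s_i >= 0.
    Decision variables are ordered [w (d values), b, s (n values)]."""
    n, d = X.shape
    # Only the slack variables carry a cost.
    c = np.concatenate([np.zeros(d + 1), np.ones(n)])
    # Each constraint rewritten as -y_i * (w . x_i + b) - s_i <= -1.
    A_ub = np.hstack([-y[:, None] * X, -y[:, None], -np.eye(n)])
    b_ub = -np.ones(n)
    # w and b are free; slacks are non-negative.
    bounds = [(None, None)] * (d + 1) + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:d], res.x[d], res.x[d + 1:]


def iterate_with_removal(X, y, max_removed):
    """Hypothetical iteration: refit, then drop the misclassified example
    with the most negative signed margin, at most max_removed times.
    Assumes max_removed is smaller than the number of examples."""
    kept = np.arange(len(y))
    removed = []
    while True:
        w, b, _ = fit_relaxed_hyperplane(X[kept], y[kept])
        # Signed margins; within one fit the norm of w is fixed, so the
        # smallest margin marks the misclassified point farthest away.
        margins = y[kept] * (X[kept] @ w + b)
        if margins.min() > 1e-9 or len(removed) >= max_removed:
            return w, b, kept, removed
        worst = int(np.argmin(margins))
        removed.append(int(kept[worst]))
        kept = np.delete(kept, worst)
```

Labels are assumed to be encoded as +1 and -1 in a float array. In the first method of the abstract, the slack variables would presumably be replaced or complemented by integer-valued indicators of misclassification, which turns the problem into a mixed-integer program; that variant is not sketched here.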