Learning data discretization via convex optimization
Many mathematical modeling problems require discretizing continuous input functions into piecewise constant or piecewise linear approximations. It has been shown that choosing the lengths of the piecewise segments adaptively, based on data samples, improves the accuracy of subsequent processing such as classification. Traditional approaches are often tied to a particular classification model, which results in local greedy optimization of a criterion function. This paper proposes a technique for learning the discretization parameters jointly with the parameters of a decision function by convex optimization of the true objective. The general formulation is applicable to a wide range of learning problems. Empirical evaluation demonstrates that the proposed convex algorithms yield models with fewer parameters and with accuracy comparable to or better than that of existing methods.
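To make the two embeddings named in the abstract concrete, the following is a minimal sketch of a piecewise constant and a piecewise linear embedding of a single scalar feature. It is not the paper's algorithm: the bin edges are fixed by hand rather than learned, and the function names (`bin_index`, `pw_constant_embedding`, `pw_linear_embedding`, `linear_score`) and the example values are illustrative assumptions.

```python
from bisect import bisect_right


def bin_index(x, edges):
    """Index of the bin [edges[i], edges[i+1]) containing x, clamped to the valid range."""
    i = bisect_right(edges, x) - 1
    return min(max(i, 0), len(edges) - 2)


def pw_constant_embedding(x, edges):
    """One-hot vector with one entry per bin (piecewise constant embedding)."""
    phi = [0.0] * (len(edges) - 1)
    phi[bin_index(x, edges)] = 1.0
    return phi


def pw_linear_embedding(x, edges):
    """Vector with one entry per edge; weight is split between the two
    edges that bracket x (piecewise linear embedding)."""
    i = bin_index(x, edges)
    alpha = (x - edges[i]) / (edges[i + 1] - edges[i])
    alpha = min(max(alpha, 0.0), 1.0)  # clamp inputs outside the edge range
    phi = [0.0] * len(edges)
    phi[i] = 1.0 - alpha
    phi[i + 1] = alpha
    return phi


def linear_score(w, phi):
    """Decision function that is linear in the embedded feature."""
    return sum(wi * pi for wi, pi in zip(w, phi))


# Hypothetical, hand-picked bin edges with non-uniform segment lengths.
edges = [0.0, 1.0, 2.0, 4.0]
constant_phi = pw_constant_embedding(1.5, edges)
linear_phi = pw_linear_embedding(3.0, edges)
score = linear_score([1.0, 2.0, 3.0, 5.0], linear_phi)
```

With either embedding, a decision function that is linear in the embedded vector is piecewise constant or piecewise linear in the original feature. The abstract's point is that the segment lengths (here the spacing of `edges`) can be learned together with the weights instead of being fixed in advance.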
Keywords: Piecewise constant embedding; Piecewise linear embedding; Parameter discretization; Convex optimization; Classification; Histograms
VF was supported by Czech Science Foundation Grant 16-05872S. OF was supported by the internal CTU Funding SGS17/185/OHK3/3T/13.