Date: 10 Aug 2012
Learning monotone nonlinear models using the Choquet integral
Learning predictive models that are guaranteed to be monotone in their input variables has received increasing attention in machine learning in recent years. As a rule, ensuring monotonicity becomes harder as the flexibility, or nonlinearity, of a model increases. In this paper, we advocate the so-called Choquet integral as a tool for learning monotone nonlinear models. Although it is widely used as a flexible aggregation operator in other fields, such as multiple criteria decision making, the Choquet integral is still much less known in machine learning. Apart from combining monotonicity and flexibility in a mathematically sound and elegant manner, the Choquet integral has additional features that make it attractive from a machine learning point of view. Notably, it offers measures for quantifying the importance of individual predictor variables and the interaction between groups of variables. Analyzing the Choquet integral from a classification perspective, we provide upper and lower bounds on its VC-dimension. Moreover, as a methodological contribution, we propose a generalization of logistic regression. The basic idea of our approach, referred to as choquistic regression, is to replace the linear function of predictor variables, which is commonly used in logistic regression to model the log odds of the positive class, by the Choquet integral. First experimental results are quite promising and suggest that the combination of monotonicity and flexibility offered by the Choquet integral facilitates strong performance in practical applications.
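To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It uses the standard discrete Choquet integral with respect to a fuzzy measure (a monotone set function with value 0 on the empty set and 1 on the full set), and passes the result through a logistic link. The scale `gamma`, the threshold `beta`, the toy measure `mu`, and all function names are assumptions introduced for illustration; the abstract does not specify them. The Shapley-style importance shown is one common way to quantify the importance of a single variable under a fuzzy measure.

```python
import math
from itertools import combinations


def choquet_integral(x, mu):
    """Discrete Choquet integral of x (values in [0, 1]) w.r.t. fuzzy measure mu.

    mu maps frozensets of feature indices to [0, 1]; it is assumed to satisfy
    mu(empty) = 0, mu(all) = 1, and mu(A) <= mu(B) whenever A is a subset of B.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])  # indices, ascending by value
    total, previous = 0.0, 0.0
    for rank, i in enumerate(order):
        upper_set = frozenset(order[rank:])  # features whose value is >= x[i]
        total += (x[i] - previous) * mu[upper_set]
        previous = x[i]
    return total


def choquistic_probability(x, mu, gamma=1.0, beta=0.5):
    """Logistic link applied to the Choquet integral (gamma, beta are assumed)."""
    return 1.0 / (1.0 + math.exp(-gamma * (choquet_integral(x, mu) - beta)))


def shapley_importance(i, n, mu):
    """Shapley value of feature i: its average marginal contribution under mu."""
    others = [j for j in range(n) if j != i]
    value = 0.0
    for size in range(n):
        weight = (math.factorial(size) * math.factorial(n - size - 1)
                  / math.factorial(n))
        for subset in combinations(others, size):
            a = frozenset(subset)
            value += weight * (mu[a | {i}] - mu[a])
    return value


# Hypothetical fuzzy measure on two features with positive interaction:
# mu({0, 1}) = 1.0 exceeds mu({0}) + mu({1}) = 0.7.
mu = {
    frozenset(): 0.0,
    frozenset({0}): 0.3,
    frozenset({1}): 0.4,
    frozenset({0, 1}): 1.0,
}

utility = choquet_integral([0.2, 0.8], mu)  # about 0.44
probability = choquistic_probability([0.2, 0.8], mu, gamma=5.0, beta=0.5)
importances = [shapley_importance(i, 2, mu) for i in range(2)]
```

Because the measure is monotone, raising any input can only raise the integral and hence the predicted probability, which is the monotonicity property the abstract emphasizes. With an additive measure the integral reduces to a weighted sum, so ordinary logistic regression on such inputs is recovered as a special case.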
Editors: Dimitrios Gunopulos, Donato Malerba, and Michalis Vazirgiannis.
Volume 89, Issue 1-2, pp 183-211
- Springer US
- Choquet integral
- Monotone learning
- Nonlinear models
- Choquistic regression
- VC dimension