Learning monotone nonlinear models using the Choquet integral
 Ali Fallah Tehrani,
 Weiwei Cheng,
 Krzysztof Dembczyński,
 Eyke Hüllermeier
Abstract
The learning of predictive models that guarantee monotonicity in the input variables has received increasing attention in machine learning in recent years. As a rule, the difficulty of ensuring monotonicity increases with the flexibility, that is, the nonlinearity, of a model. In this paper, we advocate the so-called Choquet integral as a tool for learning monotone nonlinear models. While widely used as a flexible aggregation operator in different fields, such as multiple criteria decision making, the Choquet integral is still much less known in machine learning. Apart from combining monotonicity and flexibility in a mathematically sound and elegant manner, the Choquet integral has additional features that make it attractive from a machine learning point of view. Notably, it offers measures for quantifying the importance of individual predictor variables and the interaction between groups of variables. Analyzing the Choquet integral from a classification perspective, we provide upper and lower bounds on its VC dimension. Moreover, as a methodological contribution, we propose a generalization of logistic regression. The basic idea of our approach, referred to as choquistic regression, is to replace the linear function of predictor variables, which is commonly used in logistic regression to model the log odds of the positive class, by the Choquet integral. First experimental results are promising and suggest that the combination of monotonicity and flexibility offered by the Choquet integral facilitates strong performance in practical applications.
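To make the basic idea concrete, here is a minimal Python sketch of the discrete Choquet integral and of a logistic model built on top of it. It is an illustration under stated assumptions, not the authors' implementation: the function names, the dictionary representation of the fuzzy measure `mu`, and the parameters `gamma` (scaling) and `beta` (threshold) are hypothetical choices made for this example, and the measure is fixed by hand rather than learned from data.

```python
from math import exp


def choquet_integral(x, mu):
    """Discrete Choquet integral of x with respect to the fuzzy measure mu.

    x  : sequence of feature values, assumed normalized to [0, 1]
    mu : dict mapping frozensets of feature indices to [0, 1]; assumed to be
         monotone with mu[all indices] == 1 (illustrative representation)
    """
    order = sorted(range(len(x)), key=lambda i: x[i])  # indices, ascending by value
    total, previous = 0.0, 0.0
    for k, i in enumerate(order):
        upper = frozenset(order[k:])  # features whose value is at least x[i]
        total += (x[i] - previous) * mu[upper]
        previous = x[i]
    return total


def choquistic_probability(x, mu, gamma=1.0, beta=0.5):
    """Logistic link applied to the Choquet integral instead of a linear function.

    gamma (scaling) and beta (threshold) are assumed parameter names.
    """
    return 1.0 / (1.0 + exp(-gamma * (choquet_integral(x, mu) - beta)))


# Hand-picked, non-additive measure on two features (hypothetical values):
# together the features count for more than the sum of their individual weights.
mu = {
    frozenset(): 0.0,
    frozenset({0}): 0.3,
    frozenset({1}): 0.5,
    frozenset({0, 1}): 1.0,
}

score = choquet_integral((0.2, 0.8), mu)            # 0.2 * 1.0 + 0.6 * 0.5
prob = choquistic_probability((0.2, 0.8), mu, gamma=5.0, beta=0.5)
```

Because `mu` is monotone, raising any single feature value can never lower the integral, so the predicted probability is monotone in every input for positive `gamma`. If `mu` is additive, the integral reduces to a weighted sum and the model collapses to ordinary logistic regression on normalized inputs.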
 Title
 Learning monotone nonlinear models using the Choquet integral
 Journal

Machine Learning
Volume 89, Issue 1–2, pp 183–211
 Cover Date
 2012-10-01
 DOI
 10.1007/s10994-012-5318-3
 Print ISSN
 0885-6125
 Online ISSN
 1573-0565
 Publisher
 Springer US
 Keywords

 Choquet integral
 Monotone learning
 Nonlinear models
 Choquistic regression
 Classification
 VC dimension
 Authors

 Ali Fallah Tehrani ^{(1)}
 Weiwei Cheng ^{(1)}
 Krzysztof Dembczyński ^{(2)}
 Eyke Hüllermeier ^{(1)}
 Author Affiliations

 1. Department of Mathematics and Computer Science, Marburg University, Marburg, Germany
 2. Institute of Computing Science, Poznań University of Technology, Poznań, Poland