
Learning customized and optimized lists of rules with mathematical programming

  • Full Length Paper
  • Published in Mathematical Programming Computation

Abstract

We introduce a mathematical programming approach to building rule lists, which are a type of interpretable, nonlinear, and logical machine learning classifier composed of IF-THEN rules. Unlike traditional decision tree algorithms such as CART and C5.0, this method does not use greedy splitting and pruning. Instead, it aims to fully optimize a combination of accuracy and sparsity while obeying user-defined constraints. This method is useful for producing non-black-box predictive models, and it has the benefit of a clear, user-defined tradeoff between training accuracy and sparsity. The flexible framework of mathematical programming allows users to create customized models with a provable guarantee of optimality. The software reviewed as part of this submission was given the DOI (Digital Object Identifier) https://doi.org/10.5281/zenodo.1344142.


Notes

  1. The recall (or sensitivity) of a classifier is the true positive rate; the precision is the fraction of predicted positives that are true positives; the specificity is the true negative rate:

    $$\begin{aligned} \text{recall} = \text{sensitivity} &= \frac{\sum_i 1_{[(y_i=f(x_i)) \wedge (y_i=1)]}}{\sum_i 1_{[y_i=1]}}, \qquad \text{precision} = \frac{\sum_i 1_{[(y_i=f(x_i)) \wedge (y_i=1)]}}{\sum_i 1_{[f(x_i)=1]}}, \\ \text{specificity} &= \frac{\sum_i 1_{[(y_i=f(x_i)) \wedge (y_i=-1)]}}{\sum_i 1_{[y_i=-1]}}. \end{aligned}$$
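
    For concreteness, the following is a minimal Python sketch (illustrative only, not part of the ORL package) that computes these three quantities for label vectors in \(\{-1,+1\}\):

    ```python
    import numpy as np

    def classification_rates(y_true, y_pred):
        """Recall/sensitivity, precision, and specificity for labels in {-1, +1}."""
        y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
        recall = np.sum((y_pred == y_true) & (y_true == 1)) / np.sum(y_true == 1)
        precision = np.sum((y_pred == y_true) & (y_true == 1)) / np.sum(y_pred == 1)
        specificity = np.sum((y_pred == y_true) & (y_true == -1)) / np.sum(y_true == -1)
        return recall, precision, specificity

    # Example: 3 true positives in the data; the classifier recovers 2 of them
    # and makes one false positive.
    print(classification_rates(y_true=[1, 1, 1, -1, -1], y_pred=[1, -1, 1, -1, 1]))
    # -> (0.666..., 0.666..., 0.5)
    ```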


Acknowledgements

We gratefully acknowledge funding from the MIT Big Data Initiative, and the National Science Foundation under grant IIS-1053407. Thanks to Daniel Bienstock and anonymous reviewers for encouragement and for helping us to improve the readability of the manuscript.

Author information

Correspondence to Cynthia Rudin.

Appendices

Appendix A: Additional accuracy comparison experiments

Table 10 shows more detail about the experiments; specifically, it contains the numerical values of accuracy for all algorithms and all pairwise hypothesis tests. Because several of the algorithms in Table 10 have a large number of tunable parameters, it is clearly possible to tune them to obtain better performance; for instance, our method has tuning parameters that govern the number and characteristics of rules in each class, along with tuning parameters for regularization. We chose a single parameter setting for our method in the experimental comparisons to other methods, to avoid the possibility that the method performs well only because of its flexibility. Further, as Table 11 shows, for SVM with Gaussian kernels there is no single setting of the SVM parameters that is best for all datasets. That table also shows the range of values one obtains when running SVM with various parameter settings. Note in particular that SVM never achieves perfect test accuracy on the Tic Tac Toe dataset for any parameter setting we tried.
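
To illustrate the kind of parameter sweep summarized in Table 11, the sketch below runs an RBF-kernel SVM over a grid of (C, gamma) settings on a single train/test split. It uses scikit-learn as an assumed stand-in for the SVM implementation used in the paper, and the parameter grid is illustrative rather than the grid actually reported.

```python
import itertools
from sklearn.svm import SVC

def svm_rbf_sweep(X_train, y_train, X_test, y_test,
                  Cs=(0.1, 1, 10, 100), gammas=(0.01, 0.1, 1, 10)):
    """Test accuracy of an RBF-kernel SVM over a grid of (C, gamma) settings."""
    results = {}
    for C, gamma in itertools.product(Cs, gammas):
        clf = SVC(kernel="rbf", C=C, gamma=gamma).fit(X_train, y_train)
        results[(C, gamma)] = clf.score(X_test, y_test)
    # In experiments like Table 11, no single (C, gamma) pair wins on every dataset.
    return results
```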

Appendix B: CART and C5.0 have difficulty with the Tic Tac Toe dataset

Figures 5 and 7 show decision trees produced by CART and C5.0, which are not particularly interpretable. Even as we varied the parameters of CART and C5.0 across their full ranges, the algorithms were not able to detect the pattern, as shown in Figs. 6 and 8.

Fig. 5 CART classifier for the Tic Tac Toe dataset. Numbers indicate which square on the board is considered. The left branch always means “yes” and the right branch means “no”. In the leaves, “W” and “NW” stand for win or no-win for X

Fig. 6 CART training and testing classification accuracy on the Tic Tac Toe dataset for various “confidence factor” values. Performance is not close to perfect no matter what parameter value is chosen, even though a perfect solution exists

Fig. 7 C5.0 classifier for the Tic Tac Toe dataset

Fig. 8 C5.0 training and testing classification accuracy on the Tic Tac Toe dataset for various “confidence factor” values. Performance is not close to perfect no matter what parameter value is chosen, even though a perfect solution exists

Appendix C: ORL Tic Tac Toe models for other folds

The ORL models for other folds are shown in Tables 12 and 13. ORL provides correct models on all folds.

Table 12 Tic Tac Toe rules for Split 2. This model perfectly captures what it means for the X player to win in tic tac toe. If the X player gets three X’s in a row in one of 8 possible ways, the classifier predicts that the X player wins. Otherwise, the X player is predicted not to win
Table 13 Tic Tac Toe rules for Split 3. Again, this model perfectly captures what it means for the X player to win in tic tac toe
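
A rule list of this form is easy to write down explicitly: check each of the eight winning lines for X, and otherwise predict no-win. The following Python sketch (an illustration, not the package's AMPL/Matlab code; it assumes squares are numbered 1–9, left to right and top to bottom) evaluates such a list by firing the first rule whose condition is satisfied:

```python
# The eight ways X can win: three rows, three columns, two diagonals
# (squares numbered 1..9, left to right, top to bottom).
WINNING_LINES = [
    (1, 2, 3), (4, 5, 6), (7, 8, 9),   # rows
    (1, 4, 7), (2, 5, 8), (3, 6, 9),   # columns
    (1, 5, 9), (3, 5, 7),              # diagonals
]

# Each rule: (set of squares that must contain 'x', predicted label).
# The default rule at the end has an empty condition and therefore always fires.
RULE_LIST = [(set(line), 1) for line in WINNING_LINES] + [(set(), -1)]

def predict(board):
    """board maps square number (1-9) to 'x', 'o', or 'b' (blank)."""
    for squares, label in RULE_LIST:
        if all(board[s] == "x" for s in squares):
            return label  # first satisfied rule decides the prediction

# Example: X occupies the top row -> predicted win (+1)
board = {1: "x", 2: "x", 3: "x", 4: "o", 5: "o", 6: "b", 7: "b", 8: "b", 9: "b"}
print(predict(board))  # 1
```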

Appendix D: Additional Haberman experiments

In Table 14 we show the effect of varying \(C_1\) on classification accuracy for one fold of the Haberman experiment, with \(C\) fixed at 1/(number of rules) and a 2-hour maximum time limit for the solver (here, CPLEX). As long as \(C_1\) is small enough, the accuracy is not affected.

Table 14 Train/test accuracy for the Haberman dataset experiments, with \(C\) fixed at 1/(number of rules)

Appendix E: Violent crime F-scores and Gmeans

Table 15 shows numerical values for the training and test F-scores and G-means. The test values are also displayed in Fig. 3.

Table 15 Top: F-scores and G-means for the violent crime dataset (mean and standard deviation computed across folds). Each column represents an algorithm. Bottom: The same information (F-scores and G-means for each algorithm) is displayed
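
For reference, the sketch below computes the two quantities reported in Table 15 from predictions on a single fold, assuming the conventional definitions (F1 score, and G-mean taken as \(\sqrt{\text{sensitivity}\times\text{specificity}}\)); it is an illustrative re-implementation, not the evaluation code used in the paper.

```python
import numpy as np

def f_score_and_gmean(y_true, y_pred):
    """F1 score and G-mean (sqrt of sensitivity * specificity), labels in {-1, +1}."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    sensitivity = tp / np.sum(y_true == 1)
    precision = tp / max(np.sum(y_pred == 1), 1)  # guard against no predicted positives
    specificity = np.sum((y_pred == -1) & (y_true == -1)) / np.sum(y_true == -1)
    f_score = (2 * precision * sensitivity / (precision + sensitivity)
               if precision + sensitivity > 0 else 0.0)
    g_mean = np.sqrt(sensitivity * specificity)
    return f_score, g_mean
```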

Appendix F: README for ORL package

This package contains the data and code for running ORL experiments, associated with the paper Learning Customized and Optimized Lists of Rules with Mathematical Programming by Cynthia Rudin and Şeyda Ertekin.

In the GitHub repository https://github.com/SeydaErtekin/ORL, the code for the first phase of ORL (rule generation) is under the Rule_Generation directory, and the code for the second phase (ranking of the discovered rules) is under the Rule_Ranking directory. We provide two of the datasets used in our experiments, namely Haberman's Survival and TicTacToe, under the Datasets directory.

In the package, we provide two shell scripts for running experiments with the Haberman and TicTacToe datasets. The first script, run_haberman.sh, uses the Haberman sample train/test split under Datasets/processed/ and invokes the sequence of programs for generating and ranking rules, followed by displaying the ranked rules. With the default settings, the script generates the ranked rules shown in Table 5 of the paper. For TicTacToe, we use the toy ruleset under Rule_Generation/rules, so run_tictactoe.sh only runs the rule ranking and display routines. This ruleset and the corresponding results form the basis of our discussion in Sect. 3.1. Note that both scripts require Matlab and AMPL with the Gurobi solver to be installed on the local machine.

An overview of the order of execution and the dependencies of the code is given in the diagram below.

[Diagram: order of execution and dependencies of the rule generation and rule ranking code]

In this package, we also provide a sample train/test split for both datasets, as well as the rules (under the Rule_Generation/rules directory), the data input for rule ranking, and the ranked rules (under the Rule_Ranking/rules directory). The script print_ranked_rules.m can be used to view the ordered rule lists for these splits. For the Haberman's Survival dataset, the set of rules includes all rules discovered with a particular setting of the input parameters. For the TicTacToe dataset, we provide the toy ruleset (discussed in Sect. 3.1 of the paper), which is a trimmed version of all discovered rules. This toy ruleset includes eight rules for the 1 class, three rules for the 0 class, and two default rules (one for each class). The input data for TicTacToe used for ranking (under Rule_Ranking/rules/tictactoe_binary_train12_rank_input.dat) only initializes the parameters required for ranking; it does not need to precompute the values of the variables, because the number of rules is small and the optimization completes within a few seconds.

Directory structure

Datasets .csv files of the original datasets. If you would like to generate brand-new train/test splits for the datasets, you can use the script generate_rulegen_input.m to generate up to 3 train/test splits by chunking the dataset into 3 equal-sized chunks. Files for each split are suffixed with 12, 13, or 23, indicating which chunks were used for training; for example, files with the suffix 12 indicate that the first and second chunks are in the train set and chunk 3 is in the test set (an illustrative sketch of this splitting scheme appears after this listing).

Note that due to the random shuffling of the examples, any newly generated train/test splits will differ from the ones we provide and hence may yield different results. If you would like to use the existing splits for which we reported results in the paper, use the files under Datasets/processed.

Datasets/processed Directory that contains the train/test sets (files with the .txt extension) and the train sets in AMPL data format (with the .dat extension). The former files are used for performance evaluation, whereas the latter are used in rule generation.

Rule_Generation Contains the generate_rulegen_input.m script for generating the files under Datasets/processed, and the AMPL code that implements the rule generation routines. GenerateRules.sa is the main implementation of the rule generation routine, and AddRule.sa is a helper script (called from GenerateRules.sa) that writes discovered rules to the output file and adds each rule to the list of constraints so that the same rule is not discovered again in subsequent iterations. The objective and constraints for rule generation are specified in a model file called RuleGen.mod.

Rule_Generation/rules Contains the files of discovered rules for both classes in the datasets. We provide representative rules for both datasets in this directory. Files with the “one” and “zero” suffixes include rules for the one and zero classes, respectively. The file with the “all” suffix is the aggregate of both files plus the default rules for both classes.

Rule_Ranking Contains the Matlab script generate_rulerank_input.m for aggregating the rules for both classes under Rule_Generation/rules. The aggregate rules are written to Rule_Generation/rules (with the “all” suffix and .txt extension), and an AMPL-formatted version is written under the rules subdirectory. The Rule_Ranking directory also includes the AMPL code RankRules.sa, which implements the rule ranking routine, and the model file RankObj.mod.

Rule_Ranking/rules Contains the data input used for rule ranking as well as the ranking output (the \(\pi \) vector of rule heights). This directory contains the ranked rules for both datasets, obtained for different \(C\) and \(C_1\) settings. Running print_ranked_rules.m (up in the Rule_Ranking directory) prints the ranked rules for the specified dataset/experiment in human-readable form. print_accuracy.m similarly computes the accuracy on the train or test set (controlled within the code) for the specified dataset/experiment.
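
The chunk-based splitting convention described under Datasets above is straightforward to reproduce. The following Python sketch is an illustrative re-implementation of that scheme, not generate_rulegen_input.m itself:

```python
import numpy as np

def three_chunk_splits(X, y, seed=0):
    """Shuffle the data, cut it into 3 equal-sized chunks, and return the three
    train/test splits named by the chunks used for training ("12", "13", "23").
    X and y are NumPy arrays indexed along their first axis."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    chunks = np.array_split(idx, 3)  # chunks 1, 2, 3
    splits = {}
    for (a, b), test in [((0, 1), 2), ((0, 2), 1), ((1, 2), 0)]:
        train_idx = np.concatenate([chunks[a], chunks[b]])
        suffix = f"{a + 1}{b + 1}"  # e.g. "12": chunks 1 and 2 train, chunk 3 test
        splits[suffix] = ((X[train_idx], y[train_idx]),
                          (X[chunks[test]], y[chunks[test]]))
    return splits
```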


Cite this article

Rudin, C., Ertekin, Ş. Learning customized and optimized lists of rules with mathematical programming. Math. Prog. Comp. 10, 659–702 (2018). https://doi.org/10.1007/s12532-018-0143-8

