
Learning customized and optimized lists of rules with mathematical programming

  • Full Length Paper
  • Published in Mathematical Programming Computation

Abstract

We introduce a mathematical programming approach to building rule lists, which are a type of interpretable, nonlinear, and logical machine learning classifier composed of IF-THEN rules. Unlike traditional decision tree algorithms such as CART and C5.0, this method does not use greedy splitting and pruning. Instead, it aims to fully optimize a combination of accuracy and sparsity while obeying user-defined constraints. This method is useful for producing non-black-box predictive models, and it has the benefit of a clear, user-defined tradeoff between training accuracy and sparsity. The flexible framework of mathematical programming allows users to create customized models with a provable guarantee of optimality. The software reviewed as part of this submission was given the DOI (Digital Object Identifier) https://doi.org/10.5281/zenodo.1344142.


Notes

  1. The recall (or sensitivity) of a classifier is the true positive rate; the precision is the fraction of predicted positives that are true positives; the specificity is the true negative rate:

    $$\begin{aligned} \text{recall} = \text{sensitivity} &= \frac{\sum_i 1_{[(y_i=f(x_i)) \wedge (y_i=1)]}}{\sum_i 1_{[y_i=1]}}, \qquad \text{precision} = \frac{\sum_i 1_{[(y_i=f(x_i)) \wedge (y_i=1)]}}{\sum_i 1_{[f(x_i)=1]}}, \\ \text{specificity} &= \frac{\sum_i 1_{[(y_i=f(x_i)) \wedge (y_i=-1)]}}{\sum_i 1_{[y_i=-1]}}. \end{aligned}$$
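
    For concreteness, the following is a minimal Python sketch (illustrative only, not part of the ORL package) that computes these three quantities for label vectors in \(\{-1,+1\}\):

    ```python
    import numpy as np

    def classification_rates(y_true, y_pred):
        """Recall/sensitivity, precision, and specificity for labels in {-1, +1}."""
        y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
        recall = np.sum((y_pred == y_true) & (y_true == 1)) / np.sum(y_true == 1)
        precision = np.sum((y_pred == y_true) & (y_true == 1)) / np.sum(y_pred == 1)
        specificity = np.sum((y_pred == y_true) & (y_true == -1)) / np.sum(y_true == -1)
        return recall, precision, specificity

    # Example: 3 true positives in the data; the classifier recovers 2 of them
    # and makes one false positive.
    print(classification_rates(y_true=[1, 1, 1, -1, -1], y_pred=[1, -1, 1, -1, 1]))
    # -> (0.666..., 0.666..., 0.5)
    ```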


Acknowledgements

We gratefully acknowledge funding from the MIT Big Data Initiative, and the National Science Foundation under grant IIS-1053407. Thanks to Daniel Bienstock and anonymous reviewers for encouragement and for helping us to improve the readability of the manuscript.

Author information

Correspondence to Cynthia Rudin.

Appendices

Appendix A: Additional accuracy comparison experiments

Table 10 shows more detail about the experiments; specifically, it contains the numerical values of accuracy for all algorithms and all pairwise hypothesis tests. Because several of the algorithms in Table 10 have a large number of tunable parameters, it is clearly possible to tune them to obtain better performance; for instance, our method has tuning parameters that govern the number and characteristics of rules in each class, along with tuning parameters for regularization. We chose a single parameter setting for our method in the experimental comparisons to other methods, to avoid the possibility that the method performs well only because of its flexibility. Further, as Table 11 shows, for SVM with Gaussian kernels there is no single setting of the SVM parameters that is best for all datasets. That table also shows the range of values one obtains when running SVM with various parameter settings. Note in particular that SVM never achieves perfect test accuracy on the Tic Tac Toe dataset for any parameter setting we tried.
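
To illustrate the kind of parameter sweep summarized in Table 11, the sketch below runs an RBF-kernel SVM over a grid of (C, gamma) settings on a single train/test split. It uses scikit-learn as an assumed stand-in for the SVM implementation used in the paper, and the parameter grid is illustrative rather than the grid actually reported.

```python
import itertools
from sklearn.svm import SVC

def svm_rbf_sweep(X_train, y_train, X_test, y_test,
                  Cs=(0.1, 1, 10, 100), gammas=(0.01, 0.1, 1, 10)):
    """Test accuracy of an RBF-kernel SVM over a grid of (C, gamma) settings."""
    results = {}
    for C, gamma in itertools.product(Cs, gammas):
        clf = SVC(kernel="rbf", C=C, gamma=gamma).fit(X_train, y_train)
        results[(C, gamma)] = clf.score(X_test, y_test)
    # In experiments like Table 11, no single (C, gamma) pair wins on every dataset.
    return results
```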

Appendix B: CART and C5.0 have difficulty with the Tic Tac Toe dataset

Figures 5 and 7 show decision trees produced by CART and C5.0, which are not particularly interpretable. Even as we varied the parameters of CART and C5.0 across their full ranges, the algorithms were not able to detect the pattern, as shown in Figs. 6 and 8.

Fig. 5 CART classifier for the Tic Tac Toe dataset. Numbers indicate which square on the board is considered. The left branch always means “yes” and the right branch means “no”. In the leaves, “W” and “NW” stand for win or no-win for X

Fig. 6 CART training and testing classification accuracy on the Tic Tac Toe dataset for various “confidence factor” values. Performance is not close to perfect no matter what parameter value is chosen, even though a perfect solution exists

Fig. 7 C5.0 classifier for the Tic Tac Toe dataset

Fig. 8 C5.0 training and testing classification accuracy on the Tic Tac Toe dataset for various “confidence factor” values. Performance is not close to perfect no matter what parameter value is chosen, even though a perfect solution exists

Appendix C: ORL Tic Tac Toe models for other folds

The ORL models for other folds are shown in Tables 12 and 13. ORL provides correct models on all folds.

Table 12 Tic Tac Toe rules for Split 2. This model perfectly captures what it means for the X player to win in tic tac toe. If the X player gets three X’s in a row in one of 8 possible ways, the classifier predicts that the X player wins. Otherwise, the X player is predicted not to win
Table 13 Tic Tac Toe rules for Split 3. Again, this model perfectly captures what it means for the X player to win in tic tac toe
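
A rule list of this form is easy to write down explicitly: check each of the eight winning lines for X, and otherwise predict no-win. The following Python sketch (an illustration, not the package's AMPL/Matlab code; it assumes squares are numbered 1–9, left to right and top to bottom) evaluates such a list by firing the first rule whose condition is satisfied:

```python
# The eight ways X can win: three rows, three columns, two diagonals
# (squares numbered 1..9, left to right, top to bottom).
WINNING_LINES = [
    (1, 2, 3), (4, 5, 6), (7, 8, 9),   # rows
    (1, 4, 7), (2, 5, 8), (3, 6, 9),   # columns
    (1, 5, 9), (3, 5, 7),              # diagonals
]

# Each rule: (set of squares that must contain 'x', predicted label).
# The default rule at the end has an empty condition and therefore always fires.
RULE_LIST = [(set(line), 1) for line in WINNING_LINES] + [(set(), -1)]

def predict(board):
    """board maps square number (1-9) to 'x', 'o', or 'b' (blank)."""
    for squares, label in RULE_LIST:
        if all(board[s] == "x" for s in squares):
            return label  # first satisfied rule decides the prediction

# Example: X occupies the top row -> predicted win (+1)
board = {1: "x", 2: "x", 3: "x", 4: "o", 5: "o", 6: "b", 7: "b", 8: "b", 9: "b"}
print(predict(board))  # 1
```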

Appendix D: Additional Haberman experiments

In Table 14 we show the effect of varying \(C_1\) on classification accuracy for one fold of the Haberman experiment, with \(C\) fixed at 1/(number of rules) and a 2-hour maximum time limit for the solver (here, CPLEX). As long as \(C_1\) is small enough, the accuracy is not affected.

Table 14 Train/test accuracy for the Haberman dataset experiments, with \(C\) fixed at 1/(number of rules)

Appendix E: Violent crime F-scores and Gmeans

Table 15 shows numerical values for the training and test F-scores and G-means. The test values are also displayed in Fig. 3.

Table 15 Top: F-scores and G-means for the violent crime dataset (mean and standard deviation computed across folds). Each column represents an algorithm. Bottom: The same information (F-scores and G-means for each algorithm) is displayed
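
For reference, the sketch below computes the two quantities reported in Table 15 from predictions on a single fold, assuming the conventional definitions (F1 score, and G-mean taken as \(\sqrt{\text{sensitivity}\times\text{specificity}}\)); it is an illustrative re-implementation, not the evaluation code used in the paper.

```python
import numpy as np

def f_score_and_gmean(y_true, y_pred):
    """F1 score and G-mean (sqrt of sensitivity * specificity), labels in {-1, +1}."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    sensitivity = tp / np.sum(y_true == 1)
    precision = tp / max(np.sum(y_pred == 1), 1)  # guard against no predicted positives
    specificity = np.sum((y_pred == -1) & (y_true == -1)) / np.sum(y_true == -1)
    f_score = (2 * precision * sensitivity / (precision + sensitivity)
               if precision + sensitivity > 0 else 0.0)
    g_mean = np.sqrt(sensitivity * specificity)
    return f_score, g_mean
```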

Appendix F: README for ORL package

This package contains the data and code for running ORL experiments, associated with the paper Learning Customized and Optimized Lists of Rules with Mathematical Programming by Cynthia Rudin and Şeyda Ertekin.

In the GitHub repository https://github.com/SeydaErtekin/ORL, the code for the first phase of ORL (rule generation) is under the Rule_Generation directory, and the code for the second phase (ranking of the discovered rules) is under the Rule_Ranking directory. We provide two of the datasets used in our experiments, namely Haberman's Survival and TicTacToe, under the Datasets directory.

In the package, we provide two shell scripts for running experiments with the Haberman and TicTacToe datasets. The first script, run_haberman.sh, uses the Haberman sample train/test split under Datasets/processed/ and invokes the sequence of programs for generating and ranking rules, followed by displaying the ranked rules. With the default settings, the script generates the ranked rules shown in Table 5 of the paper. For TicTacToe, we use the toy ruleset under Rule_Generation/rules, so run_tictactoe.sh only runs the rule ranking and display routines. This ruleset and the corresponding results form the basis of our discussion in Sect. 3.1. Note that both scripts require Matlab and AMPL with the Gurobi solver to be installed on the local machine.

An overview of the order of execution and the dependencies of the code is given in the diagram below.

[Diagram: order of execution and dependencies of the rule generation and rule ranking code]

In this package, we also provide a sample train/test split for both datasets, as well as the rules (under the Rule_Generation/rules directory), the data input for rule ranking, and the ranked rules (under the Rule_Ranking/rules directory). The script print_ranked_rules.m can be used to view the ordered rule lists for these splits. For the Haberman's Survival dataset, the set of rules includes all rules discovered with a particular setting of the input parameters. For the TicTacToe dataset, we provide the toy ruleset (discussed in Sect. 3.1 of the paper), which is a trimmed version of all discovered rules. This toy ruleset includes eight rules for the 1 class, three rules for the 0 class, and two default rules (one for each class). The input data for TicTacToe used for ranking (under Rule_Ranking/rules/tictactoe_binary_train12_rank_input.dat) only initializes the parameters required for ranking; it does not need to precompute the values of the variables, because the number of rules is small and the optimization completes within a few seconds.

Directory structure

Datasets .csv files of the original datasets. If you would like to generate brand-new train/test splits for the datasets, you can use the script generate_rulegen_input.m to generate up to 3 train/test splits by chunking the dataset into 3 equal-sized chunks. Files for each split are suffixed with 12, 13, or 23, indicating which chunks were used for training; for example, files with the suffix 12 indicate that the first and second chunks are in the train set and chunk 3 is in the test set (an illustrative sketch of this splitting scheme appears after this listing).

Note that due to the random shuffling of the examples, any newly generated train/test splits will differ from the ones we provide and hence may yield different results. If you would like to use the existing splits for which we reported results in the paper, use the files under Datasets/processed.

Datasets/processed Directory that contains the train/test sets (files with the .txt extension) and the train sets in AMPL data format (with the .dat extension). The former files are used for performance evaluation, whereas the latter are used in rule generation.

Rule_Generation Contains the generate_rulegen_input.m script for generating the files under Datasets/processed, and the AMPL code that implements the rule generation routines. GenerateRules.sa is the main implementation of the rule generation routine, and AddRule.sa is a helper script (called from GenerateRules.sa) that writes discovered rules to the output file and adds each rule to the list of constraints so that the same rule is not discovered again in subsequent iterations. The objective and constraints for rule generation are specified in a model file called RuleGen.mod.

Rule_Generation/rules Contains the files of discovered rules for both classes in the datasets. We provide representative rules for both datasets in this directory. Files with the “one” and “zero” suffixes include rules for the one and zero classes, respectively. The file with the “all” suffix is the aggregate of both files plus the default rules for both classes.

Rule_Ranking Contains the Matlab script generate_rulerank_input.m for aggregating the rules for both classes under Rule_Generation/rules. The aggregate rules are written to Rule_Generation/rules (with the “all” suffix and .txt extension), and an AMPL-formatted version is written under the rules subdirectory. The Rule_Ranking directory also includes the AMPL code RankRules.sa, which implements the rule ranking routine, and the model file RankObj.mod.

Rule_Ranking/rules Contains the data input used for rule ranking as well as the ranking output (the \(\pi \) vector of rule heights). This directory contains the ranked rules for both datasets, obtained for different \(C\) and \(C_1\) settings. Running print_ranked_rules.m (up in the Rule_Ranking directory) prints the ranked rules for the specified dataset/experiment in human-readable form. print_accuracy.m similarly computes the accuracy on the train or test set (controlled within the code) for the specified dataset/experiment.
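
The chunk-based splitting convention described under Datasets above is straightforward to reproduce. The following Python sketch is an illustrative re-implementation of that scheme, not generate_rulegen_input.m itself:

```python
import numpy as np

def three_chunk_splits(X, y, seed=0):
    """Shuffle the data, cut it into 3 equal-sized chunks, and return the three
    train/test splits named by the chunks used for training ("12", "13", "23").
    X and y are NumPy arrays indexed along their first axis."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    chunks = np.array_split(idx, 3)  # chunks 1, 2, 3
    splits = {}
    for (a, b), test in [((0, 1), 2), ((0, 2), 1), ((1, 2), 0)]:
        train_idx = np.concatenate([chunks[a], chunks[b]])
        suffix = f"{a + 1}{b + 1}"  # e.g. "12": chunks 1 and 2 train, chunk 3 test
        splits[suffix] = ((X[train_idx], y[train_idx]),
                          (X[chunks[test]], y[chunks[test]]))
    return splits
```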


Cite this article

Rudin, C., Ertekin, Ş. Learning customized and optimized lists of rules with mathematical programming. Math. Prog. Comp. 10, 659–702 (2018). https://doi.org/10.1007/s12532-018-0143-8

