Editable machine learning models? A rule-based framework for user studies of explainability


So far, most user studies dealing with the comprehensibility of machine learning models have used questionnaires or surveys to acquire input from participants. In this article, we argue that, compared to questionnaires, the use of an adapted version of a real machine learning interface can yield a new level of insight into what attributes make a machine learning model interpretable, and why. We also argue that interpretability research needs to consider the task of humans editing the model, not least because of existing or forthcoming legal requirements on the right to human intervention. Here, we focus on rule models, as these are directly interpretable as well as editable. We introduce an extension of the EasyMiner system for generating classification and explorative models based on association rules. The presented web-based rule-editing software allows the user to perform common editing actions such as modifying a rule (adding or removing an attribute), deleting a rule, creating a new rule, and reordering rules. To observe the effect of a particular edit on predictive performance, the user can validate the rule list against a selected dataset using a scoring procedure. The system is equipped with functionality that facilitates its integration with the crowdsourcing platforms commonly used to recruit participants.
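The editing actions and the validation step described above can be sketched as a minimal rule-list structure. All names and data structures here are illustrative assumptions for exposition, not the actual EasyMiner interface:

```python
from dataclasses import dataclass, field

@dataclass
class Rule:
    antecedent: dict   # attribute -> required value
    consequent: tuple  # (class attribute, predicted value)

@dataclass
class RuleList:
    rules: list = field(default_factory=list)

    def create(self, rule):
        self.rules.append(rule)

    def delete(self, index):
        del self.rules[index]

    def modify(self, index, attribute, value=None):
        # add an attribute-value pair, or remove the attribute when value is None
        if value is None:
            self.rules[index].antecedent.pop(attribute, None)
        else:
            self.rules[index].antecedent[attribute] = value

    def reorder(self, src, dst):
        self.rules.insert(dst, self.rules.pop(src))

    def score(self, dataset):
        # first-match accuracy: each row is classified by the first rule
        # whose antecedent it satisfies; unmatched rows count as errors
        hits = 0
        for row, label in dataset:
            for rule in self.rules:
                if all(row.get(a) == v for a, v in rule.antecedent.items()):
                    hits += rule.consequent[1] == label
                    break
        return hits / len(dataset)
```

Reordering matters because classification uses the first matching rule: moving a broad default rule to the top can mask more specific rules below it, which the scoring step would reveal as a drop in accuracy.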


  1.


  2.

    It should be noted that the current version of the editor does not meet the requirements of the particular study by Muggleton et al. (2018), since some syntactical constructs necessary for the expression of ILP rules are not supported.

  3.

    In this paper, we do not consider the more expressive rules based on the GUHA method (Hájek et al. 1966) that earlier versions of EasyMiner could also process.

  4.

    The rules are sorted by confidence, support, and antecedent length, i.e., the number of attribute-value pairs in the condition of the rule. For confidence and support, higher values are better; for antecedent length, a shorter (and thus simpler) antecedent is preferred.

  5.

    Sorted by confidence, support and length as noted above.
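The ordering criterion from footnote 4 can be expressed as a single sort key (an illustrative sketch, not EasyMiner code): negate confidence and support so that higher values sort first, and use antecedent length directly so that shorter conditions win ties.

```python
# Each rule is summarised by its interest measures and its antecedent
# (the attribute-value pairs in its condition); the values are made up.
rules = [
    {"antecedent": {"a": 1, "b": 2}, "confidence": 0.90, "support": 0.10},
    {"antecedent": {"a": 1},         "confidence": 0.90, "support": 0.10},
    {"antecedent": {"c": 3},         "confidence": 0.95, "support": 0.05},
]

# Higher confidence first, ties broken by higher support,
# remaining ties by shorter (simpler) antecedent.
ranked = sorted(
    rules,
    key=lambda r: (-r["confidence"], -r["support"], len(r["antecedent"])),
)
```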


  1. Barakat N, Bradley AP (2010) Rule extraction from support vector machines: a review. Neurocomputing 74(1–3):178–190

  2. Boley H, Paschke A, Shafiq O (2010) RuleML 1.0: the overarching specification of web rules. In: International workshop on rules and rule markup languages for the semantic web, Springer, pp 162–178

  3. Brainard DH (1997) The psychophysics toolbox. Spatial Vis 10:433–436

  4. Dalmaijer ES, Mathôt S, Van der Stigchel S (2014) PyGaze: an open-source, cross-platform toolbox for minimal-effort programming of eyetracking experiments. Behav Res Methods 46(4):913–921

  5. Elkano M, Galar M, Sanz JA, Fernández A, Barrenechea E, Herrera F, Bustince H (2014) Enhancing multiclass classification in FARC-HD fuzzy classifier: on the synergy between n-dimensional overlap functions and decomposition strategies. IEEE Trans Fuzzy Syst 23(5):1562–1580

  6. Fernández-Delgado M, Cernadas E, Barro S, Amorim D (2014) Do we need hundreds of classifiers to solve real world classification problems? J Mach Learn Res 15(1):3133–3181

  7. Fürnkranz J, Kliegr T (2015) A brief overview of rule learning. In: International symposium on rules and rule markup languages for the semantic web, Springer, pp 54–69

  8. Fürnkranz J, Kliegr T (2018) The need for interpretability biases. In: International symposium on intelligent data analysis, Springer, pp 15–27, https://doi.org/10.1007/978-3-030-01768-2_2

  9. Fürnkranz J, Gamberger D, Lavrač N (2012) Foundations of rule learning. Springer, Berlin

  10. Fürnkranz J, Kliegr T, Paulheim H (2020) On cognitive preferences and the plausibility of rule-based models. Mach Learn 109:853–898

  11. Gabriel A, Paulheim H, Janssen F (2014) Learning semantically coherent rules. In: Proceedings of the 1st International Workshop on Interactions between Data Mining and Natural Language Processing co-located with The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (DMNLP@ PKDD/ECML), CEUR Workshop Proceedings, Nancy, France, pp 49–63

  12. García S, Fernández A, Luengo J, Herrera F (2009) A study of statistical techniques and performance measures for genetics-based machine learning: accuracy and interpretability. Soft Comput 13(10):959–977. https://doi.org/10.1007/s00500-008-0392-y

  13. Grice HP (1975) Logic and conversation. In: Speech Acts, Brill, pp 41–58

  14. Hájek P, Havel I, Chytil M (1966) The GUHA method of automatic hypotheses determination. Computing 1(4):293–308

  15. HLEG AI (2019) Ethics guidelines for trustworthy artificial intelligence. Retrieved from High-Level Expert Group on Artificial Intelligence (AI HLEG). https://ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai

  16. Huysmans J, Dejaeger K, Mues C, Vanthienen J, Baesens B (2011) An empirical evaluation of the comprehensibility of decision table, tree and rule based predictive models. Decis Support Syst 51(1):141–154. https://doi.org/10.1016/j.dss.2010.12.003

  17. Kliegr T, Bahník Š, Fürnkranz J (2018) A review of possible effects of cognitive biases on interpretation of rule-based machine learning models. arXiv:1804.02969

  18. Kulesza T, Burnett M, Wong WK, Stumpf S (2015) Principles of explanatory debugging to personalize interactive machine learning. In: Proceedings of the 20th International Conference on Intelligent User Interfaces, Association for Computing Machinery, New York, NY, USA, IUI’15, pp 126–137, https://doi.org/10.1145/2678025.2701399

  19. Lage I, Chen E, He J, Narayanan M, Kim B, Gershman S, Doshi-Velez F (2019) An evaluation of the human-interpretability of explanation. arXiv:1902.00006

  20. Lakkaraju H, Bach SH, Leskovec J (2016) Interpretable decision sets: a joint framework for description and prediction. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, New York, NY, USA, KDD '16, pp 1675–1684, https://doi.org/10.1145/2939672.2939874

  21. Liu B, Hsu W, Ma Y (1998) Integrating classification and association rule mining. In: Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining, AAAI Press, KDD’98, pp 80–86

  22. Michalski RS (1969) On the quasi-minimal solution of the general covering problem. In: Proceedings of the V International Symposium on Information Processing (FCIP 69) (Switching Circuits), pp 125–128

  23. Miller T (2019) Explanation in artificial intelligence: insights from the social sciences. Artif Intell 267:1–38

  24. Muggleton SH, Schmid U, Zeller C, Tamaddoni-Nezhad A, Besold T (2018) Ultra-strong machine learning: comprehensibility of programs learned with ILP. Mach Learn 107(7):1119–1140

  25. Páez A (2019) The pragmatic turn in explainable artificial intelligence (XAI). Minds and Machines pp 1–19

  26. Piltaver R, Lustrek M, Gams M, Martincic-Ipsic S (2016) What makes classification trees comprehensible? Expert Syst Appl 62:333–346. https://doi.org/10.1016/j.eswa.2016.06.009

  27. Rapp M, Mencía EL, Fürnkranz J (2019) Simplifying random forests: On the trade-off between interpretability and accuracy. arXiv:1911.04393

  28. Roig A (2017) Safeguards for the right not to be subject to a decision based solely on automated processing (article 22 GDPR). Eur J Law Technol 8(3)

  29. Schmid U, Finzel B (2020) Mutual explanations for cooperative decision making in medicine. KI-Künstliche Intelligenz pp 1–7

  30. Sorower MS, Doppa JR, Orr W, Tadepalli P, Dietterich TG, Fern XZ (2011) Inverting Grice’s maxims to learn rules from natural language extractions. In: Advances in neural information processing systems, pp 1053–1061

  31. Tomanová P, Hradil J, Sklenák V (2019) Measuring users' color preferences in CRUD operations across the globe: a new software ergonomics testing platform. Cogn Technol Work pp 1–11

  32. Towell GG, Shavlik JW (1993) Extracting refined rules from knowledge-based neural networks. Mach Learn 13(1):71–101

  33. Vojíř S, Duben PV, Kliegr T (2014) Business rule learning with interactive selection of association rules. In: Patkos T, Wyner AZ, Giurca A (eds) Proceedings of the RuleML 2014 Challenge and the RuleML 2014 Doctoral Consortium hosted by the 8th International Web Rule Symposium, Challenge+DC@RuleML 2014, Prague, Czech Republic, August 18-20, 2014, CEUR-WS.org, CEUR Workshop Proceedings, vol 1211, http://ceur-ws.org/Vol-1211/paper5.pdf

  34. Vojíř S, Zeman V, Kuchař J, Kliegr T (2018) EasyMiner.eu: web framework for interpretable machine learning based on rules and frequent itemsets. Knowl Based Syst 150:111–115. https://doi.org/10.1016/j.knosys.2018.03.006

  35. Wang T, Rudin C, Velez-Doshi F, Liu Y, Klampfl E, MacNeille P (2016) Bayesian rule sets for interpretable classification. In: 2016 IEEE 16th international conference on data mining (ICDM), IEEE, pp 1269–1274

  36. Wason PC (1960) On the failure to eliminate hypotheses in a conceptual task. Q J Exp Psychol 12(3):129–140

  37. Wason PC (1968) Reasoning about a rule. Q J Exp Psychol 20(3):273–281

  38. Yang Y, Kandogan E, Li Y, Sen P, Lasecki W (2019) A study on interaction in human-in-the-loop machine learning for text analytics. In: IUI Workshops, CEUR-WS.org, (CEUR Workshop Proceedings), vol 2327

  39. Yin M, Chen Y, Sun YA (2014) Monetary interventions in crowdsourcing task switching. In: Second AAAI Conference on Human Computation and Crowdsourcing (HCOMP), AAAI, pp 234–242

  40. Zilke JR, Mencía EL, Janssen F (2016) DeepRED–rule extraction from deep neural networks. In: International Conference on Discovery Science, Springer, pp 457–473


This research was supported by long-term institutional support of research activities and by grant IGA 33/2018 of the University of Economics, Prague. Author contributions: SV implemented the system and edited the article; TK conceived the research, wrote the article, and organised the internal user studies.

Author information



Corresponding author

Correspondence to Tomáš Kliegr.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (mp4 45093 KB)

About this article

Cite this article

Vojíř, S., Kliegr, T. Editable machine learning models? A rule-based framework for user studies of explainability. Adv Data Anal Classif 14, 785–799 (2020). https://doi.org/10.1007/s11634-020-00419-2


Keywords

  • Rule learning
  • User experiment
  • Crowdsourcing
  • Explainable artificial intelligence
  • Cognitive computing
  • Legal compliance

Mathematics Subject Classification

  • 68T30