Abstract
In this paper, we present Forest GUMP (for Generalized, Unifying Merge Process) a tool for providing tangible experience with three concepts of explanation. Besides the well-known model explanation and outcome explanation, Forest GUMP also supports class characterization, i.e., the precise characterization of all samples with the same classification. Key technology to achieve these results is algebraic aggregation, i.e., the transformation of a Random Forest into a semantically equivalent, concise white-box representation in terms of Algebraic Decision Diagrams (ADDs). The paper sketches the method and illustrates the use of Forest GUMP along an illustrative example taken from the literature. This way readers should acquire an intuition about the tool, and the way how it should be used to increase the understanding not only of the considered dataset, but also of the character of Random Forests and the ADD technology, here enriched to comprise infeasible path elimination.
Chapter PDF
Similar content being viewed by others
Keywords
References
Akers, S.B.: Binary decision diagrams. IEEE Trans. Comput. 27(6), 509–516 (1978)
Bahar, R., Frohm, E., Gaona, C., Hachtel, G., Macii, E., Pardo, A., Somenzi, F.: Algebraic decision diagrams and their applications. In: Proceedings of 1993 International Conference on Computer Aided Design (ICCAD). pp. 188–191 (1993). https://doi.org/10.1109/ICCAD.1993.580054
Breiman, L.: Random forests. Machine Learning 45(1), 5–32 (Oct 2001). https://doi.org/10.1023/A:1010933404324
Bryant, R.E.: Graph-based algorithms for boolean function manipulation. IEEE Trans. Comput. 35(8), 677–691 (1986). https://doi.org/10.1109/TC.1986.1676819
Chipman, H.A., George, E.I., McCulloch, R.E.: Making sense of a forest of trees (1999)
Deng, H.: Interpreting tree ensembles with intrees. Int. J. Data Sci. Anal. 7(4), 277–287 (2019). https://doi.org/10.1007/s41060-018-0144-8
Domingos, P.M.: Knowledge discovery via multiple models. Intell. Data Anal. 2(1-4), 187–202 (1998). https://doi.org/10.1016/S1088-467X(98)00023-7
Fisher, R.A.: The use of multiple measurements in taxonomic problems. Annals of eugenics 7(2) (1936)
Gossen, F., Margaria, T., Murtovi, A., Naujokat, S., Steffen, B.: Dsls for decision services: A tutorial introduction to language-driven engineering. In: Margaria, T., Steffen, B. (eds.) Leveraging Applications of Formal Methods, Verification and Validation. Modeling - 8th International Symposium, ISoLA 2018, Limassol, Cyprus, November 5-9, 2018, Proceedings, Part I. Lecture Notes in Computer Science, vol. 11244, pp. 546–564. Springer (2018). https://doi.org/10.1007/978-3-030-03418-4_33
Gossen, F., Margaria, T., Steffen, B.: Towards explainability in machine learning: The formal methods way. IT Prof. 22(4), 8–12 (2020). https://doi.org/10.1109/MITP.2020.3005640
Gossen, F., Margaria, T., Steffen, B.: Formal methods boost experimental performance for explainable AI. IT Prof. 23(6), 8–12 (2021). https://doi.org/10.1109/MITP.2021.3123495, https://doi.org/10.1109/MITP.2021.3123495
Gossen, F., Murtovi, A., Linden, J., Steffen, B.: The java library for algebraic decision diagrams. https://add-lib.scce.info, accessed: 2022-01-13
Gossen, F., Steffen, B.: Large random forests: Optimisation for rapid evaluation. CoRR abs/1912.10934 (2019), http://arxiv.org/abs/1912.10934
Gossen, F., Steffen, B.: Algebraic aggregation of random forests: towards explainability and rapid evaluation. International Journal on Software Tools for Technology Transfer (Sep 2021). https://doi.org/10.1007/s10009-021-00635-x
Guidotti, R., Monreale, A., Ruggieri, S., Turini, F., Giannotti, F., Pedreschi, D.: A survey of methods for explaining black box models. ACM Comput. Surv. 51(5), 93:1–93:42 (2019). https://doi.org/10.1145/3236009
Hara, S., Hayashi, K.: Making tree ensembles interpretable: A bayesian model selection approach. In: Storkey, A.J., Pérez-Cruz, F. (eds.) International Conference on Artificial Intelligence and Statistics, AISTATS 2018, 9-11 April 2018, Playa Blanca, Lanzarote, Canary Islands, Spain. Proceedings of Machine Learning Research, vol. 84, pp. 77–85. PMLR (2018), http://proceedings.mlr.press/v84/hara18a.html
Ho, T.K.: Random decision forests. In: Proceedings of 3rd International Conference on Document Analysis and Recognition. vol. 1, pp. 278–282 vol.1 (1995). https://doi.org/10.1109/ICDAR.1995.598994
Hungar, H., Steffen, B., Margaria, T.: Methods for generating selection structures, for making selections according to selection structures and for creating selection descriptions. https://patents.justia.com/patent/9141708 (Sep 2015), USPTO Patent number: 9141708
Lee, C.Y.: Representation of switching circuits by binary-decision programs. Bell System Technical Journal 38(4), 985–999 (1959)
Lou, Y., Caruana, R., Gehrke, J.: Intelligible models for classification and regression. In: Yang, Q., Agarwal, D., Pei, J. (eds.) The 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’12, Beijing, China, August 12-16, 2012. pp. 150–158. ACM (2012). https://doi.org/10.1145/2339530.2339556
de Moura, L., Bjørner, N.: Z3: An efficient smt solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) Tools and Algorithms for the Construction and Analysis of Systems. pp. 337–340. Springer Berlin Heidelberg, Berlin, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78800-3_24
Murtovi, A., Bainczyk, A., Steffen, B.: Forest gump: A tool for explanation (tacas 2022 artifact) (Nov 2021). https://doi.org/10.5281/zenodo.5733107
Quinlan, J.R.: Induction of decision trees. Mach. Learn. 1(1), 81–106 (1986)
Ribeiro, M.T., Singh, S., Guestrin, C.: "why should I trust you?": Explaining the predictions of any classifier. In: Krishnapuram, B., Shah, M., Smola, A.J., Aggarwal, C.C., Shen, D., Rastogi, R. (eds.) Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, August 13-17, 2016. pp. 1135–1144. ACM (2016). https://doi.org/10.1145/2939672.2939778
Somenzi, F.: Cudd: Cu decision diagram package release 3.0 (2015)
Steffen, B., Gossen, F., Naujokat, S., Margaria, T.: Language-Driven Engineering: From General-Purpose to Purpose-Specific Languages, pp. 311–344. Springer International Publishing, Cham (2019). https://doi.org/10.1007/978-3-319-91908-9_17
Van Assche, A., Blockeel, H.: Seeing the forest through the trees: Learning a comprehensible model from an ensemble. In: Kok, J.N., Koronacki, J., Mantaras, R.L.d., Matwin, S., Mladenič, D., Skowron, A. (eds.) Machine Learning: ECML 2007. pp. 418–429. Springer Berlin Heidelberg, Berlin, Heidelberg (2007)
Witten, I.H., Frank, E., Hall, M.A., Pal, C.J.: Data Mining, Fourth Edition: Practical Machine Learning Tools and Techniques. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 4th edn. (2016)
Zhou, Y., Hooker, G.: Interpreting models via single tree approximation (2016)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2022 The Author(s)
About this paper
Cite this paper
Murtovi, A., Bainczyk, A., Steffen, B. (2022). Forest GUMP: A Tool for Explanation. In: Fisman, D., Rosu, G. (eds) Tools and Algorithms for the Construction and Analysis of Systems. TACAS 2022. Lecture Notes in Computer Science, vol 13244. Springer, Cham. https://doi.org/10.1007/978-3-030-99527-0_17
Download citation
DOI: https://doi.org/10.1007/978-3-030-99527-0_17
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-99526-3
Online ISBN: 978-3-030-99527-0
eBook Packages: Computer ScienceComputer Science (R0)