Abstract
A major challenge in developing Machine Learning systems is the large number of candidate models that results from testing different model types, parameters, or feature subsets. The common approach of selecting the best model by a single overall metric does not necessarily find the most suitable model for a given application, since it ignores the different effects of class confusions. Expert knowledge is key to evaluating, understanding, and comparing model candidates, and hence to controlling the training process. This paper addresses the research question of how experts can be supported in evaluating, selecting, and reasoning about Machine Learning models. We propose ML-ModelExplorer, an explorative, interactive, and model-agnostic approach that utilises confusion matrices. It enables Machine Learning and domain experts to conduct a thorough and efficient evaluation of multiple models by taking overall metrics, per-class errors, and individual class confusions into account. The approach is evaluated in a user study, and a real-world case study from football (soccer) data analytics is presented.
ML-ModelExplorer and a tutorial video are available online for use with your own data sets: www.ml-and-vis.org/mex
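The abstract's point that a single overall metric can hide differences in class confusions can be illustrated with a minimal sketch. It is not taken from ML-ModelExplorer: the helper `summarize`, the two confusion matrices, and the convention that rows are true classes and columns are predicted classes are all assumptions made for illustration.

```python
import numpy as np


def summarize(cm):
    """Return overall accuracy and per-class error rates of a confusion matrix.

    Assumed convention: rows are true classes, columns are predicted classes.
    """
    cm = np.asarray(cm, dtype=float)
    # Overall accuracy: correctly classified samples over all samples.
    accuracy = np.trace(cm) / cm.sum()
    # Per-class error: share of each true class that was misclassified.
    per_class_error = 1.0 - np.diag(cm) / cm.sum(axis=1)
    return accuracy, per_class_error


# Two hypothetical 3-class models, each evaluated on 100 samples per class.
cm_a = np.array([[90, 5, 5],
                 [5, 90, 5],
                 [5, 5, 90]])
cm_b = np.array([[100, 0, 0],
                 [0, 100, 0],
                 [15, 15, 70]])

acc_a, err_a = summarize(cm_a)
acc_b, err_b = summarize(cm_b)

print("model A:", acc_a, err_a)
print("model B:", acc_b, err_b)
```

Both hypothetical models reach the same overall accuracy, yet model B concentrates all of its errors on the third class. Which of the two is preferable depends on how costly those particular confusions are in the application, which is the kind of judgement the abstract assigns to experts.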
Keywords
- Multi-class classification
- Model selection
- Feature selection
- Human-centered machine learning
- Visual analytics
Notes
- 1. Domain experts are assumed to have a basic understanding of classification problems, i.e. to understand class errors and class confusions.
- 2. ML-ModelExplorer online: www.ml-and-vis.org/mex.
- 3. ML-ModelExplorer video: https://youtu.be/IO7IWTUxK_Y.
© 2020 IFIP International Federation for Information Processing
About this paper
Cite this paper
Theissler, A., Vollert, S., Benz, P., Meerhoff, L.A., Fernandes, M. (2020). ML-ModelExplorer: An Explorative Model-Agnostic Approach to Evaluate and Compare Multi-class Classifiers. In: Holzinger, A., Kieseberg, P., Tjoa, A., Weippl, E. (eds) Machine Learning and Knowledge Extraction. CD-MAKE 2020. Lecture Notes in Computer Science, vol 12279. Springer, Cham. https://doi.org/10.1007/978-3-030-57321-8_16
DOI: https://doi.org/10.1007/978-3-030-57321-8_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-57320-1
Online ISBN: 978-3-030-57321-8
eBook Packages: Computer Science (R0)
Published in cooperation with
http://www.ifip.org/