Abstract
This paper addresses multiclass learning from two perspectives: interpreting the results of the analysis analytically, and navigating them with interactive visualization tools. It is shown that combining the Sequential Automatic Search of Subset of Classifiers (SASSC) algorithm with the interactive visualization of classification trees provided by the Klassification - Interactive Methods for Trees (KLIMT) software makes it possible to highlight important information arising from the knowledge extraction process without sacrificing the prediction accuracy of the classification method. Empirical evidence from two benchmark datasets demonstrates the advantages of using SASSC and KLIMT jointly.
Cite this article
Conversano, C. Interactive visualization in multiclass learning: integrating the SASSC algorithm with KLIMT. Comput Stat 26, 711–731 (2011). https://doi.org/10.1007/s00180-011-0255-3
DOI: https://doi.org/10.1007/s00180-011-0255-3