Autonomous Visualization

  • Khalid El-Arini
  • Andrew W. Moore
  • Ting Liu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4213)

Abstract

Many classification algorithms suffer from a lack of human interpretability. Using such classifiers to solve real-world problems often requires blind faith in the given model. In this paper we present a novel approach to classification that takes into account interpretability and visualization of the results. We attempt to efficiently discover the most relevant snapshot of the data, in the form of a two-dimensional scatter plot with easily understandable axes. We then use this plot as the basis for a classification algorithm. Furthermore, we investigate the trade-off between classification accuracy and interpretability by comparing the performance of our classifier on real data with that of several traditional classifiers. Upon evaluating our algorithm on a wide range of canonical data sets, we find that, in most cases, it is possible to obtain additional interpretability with little or no loss in classification accuracy.
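
The abstract does not give the search procedure or the classifier, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes the candidate axes are pairs of raw attributes and that each candidate scatter plot is scored by the leave-one-out accuracy of a 1-nearest-neighbour classifier restricted to those two axes. The function names (`loo_1nn_accuracy`, `best_scatter_plot`) and the toy data are hypothetical.

```python
from itertools import combinations


def loo_1nn_accuracy(points, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on 2-D points."""
    correct = 0
    for i, (xi, yi) in enumerate(points):
        best_dist, best_label = float("inf"), None
        for j, (xj, yj) in enumerate(points):
            if i == j:
                continue  # hold out the point being classified
            dist = (xi - xj) ** 2 + (yi - yj) ** 2
            if dist < best_dist:
                best_dist, best_label = dist, labels[j]
        correct += best_label == labels[i]
    return correct / len(points)


def best_scatter_plot(rows, labels, names):
    """Exhaustively score every attribute pair; return (score, x_name, y_name)."""
    best = None
    for a, b in combinations(range(len(names)), 2):
        points = [(row[a], row[b]) for row in rows]
        score = loo_1nn_accuracy(points, labels)
        if best is None or score > best[0]:
            best = (score, names[a], names[b])
    return best


# Hypothetical toy data: the class depends jointly on f0 and f2 (an XOR-like
# layout), while f1 carries no class information.
rows = [
    (0.0, 0.3, 0.0), (0.1, 0.9, 0.1), (1.0, 0.5, 1.0), (0.9, 0.1, 0.9),
    (0.0, 0.8, 1.0), (0.1, 0.2, 0.9), (1.0, 0.4, 0.0), (0.9, 0.7, 0.1),
]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]
names = ["f0", "f1", "f2"]

result = best_scatter_plot(rows, labels, names)
print(result)
```

On this toy data the search selects the (f0, f2) plot, the only pair in which the two classes separate. An exhaustive pairwise search grows quadratically with the number of attributes, so a practical system would need something more efficient; that part is not sketched here.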

Keywords

Subspace Cluster; Projection Pursuit; Arithmetic Expression; Blind Faith; American Medical Informatics Association
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Khalid El-Arini (1)
  • Andrew W. Moore (1)
  • Ting Liu (1)
  1. School of Computer Science, Carnegie Mellon University, Pittsburgh, USA
