Interactive Visual Decision Tree Classification

  • Yan Liu
  • Gavriel Salvendy
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4551)

Abstract

Data mining (DM) modeling is the process of transforming information enfolded in a dataset into a form amenable to human cognition. Most current DM tools support only automatic modeling, during which users have little interaction with the computer other than assigning some parameter values at the beginning of the process. Arbitrary selection of parameter values, however, can lead to an unproductive modeling process. Automatic modeling also downplays the key roles played by humans in current knowledge discovery systems. Classification is the process of finding models that distinguish data classes in order to predict the class of objects whose class labels are unknown. Decision trees are among the most widely used classification tools. This research proposes a novel interactive visual decision tree (IVDT) classification process. It aims to enhance users' understanding of decision tree classification and to improve the effectiveness of the process by combining the flexibility, creativity, and general knowledge of humans with the enormous storage capacity and computational power of computers. An IVDT for categorical input attributes was developed and evaluated in an experiment with twenty subjects to test three hypotheses regarding its potential advantages. The experimental results suggest that, compared to the automatic modeling process typically applied in current decision tree modeling tools, the IVDT process can improve the effectiveness of modeling by producing trees with relatively high classification accuracies and small sizes, enhance users' understanding of the algorithm, and give them greater satisfaction with the task.
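
As a rough illustration of the kind of quantity a user might inspect when choosing a split on categorical attributes, the following minimal sketch ranks candidate attributes by information gain. The abstract does not state which split criterion IVDT uses, so information gain is an assumption here, and the function names and the toy dataset are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction from splitting rows on one categorical attribute."""
    parent = entropy([row[target] for row in rows])
    # Group the class labels by the value of the candidate attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Weight each child's entropy by its share of the rows.
    weighted = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return parent - weighted


# Hypothetical toy data, not taken from the paper.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]

# Assumed criterion: pick the attribute with the largest information gain.
gains = {a: information_gain(rows, a, "play") for a in ("outlook", "windy")}
best = max(gains, key=gains.get)
```

In an interactive setting, scores like these could be shown to the user as guidance rather than applied automatically, leaving the final choice of split to human judgment.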

Keywords

visual data mining; interactive modeling; model visualization; data visualization

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Yan Liu (1)
  • Gavriel Salvendy (2)
  1. Department of Biomedical, Industrial, and Human Factors Engineering, Wright State University, Dayton, OH 45435
  2. School of Industrial Engineering, Purdue University, West Lafayette, IN 47906; Department of Industrial Engineering, Tsinghua University, Beijing 100084, P.R. China
