
Identify Error-Sensitive Patterns by Decision Tree

Conference paper
Advances in Data Mining: Applications and Theoretical Aspects (ICDM 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9165)


Abstract

When errors are inevitable in data classification, identifying the particular parts of a classification model that are more susceptible to error than others, rather than searching for the model's Achilles' heel in a casual way, may help uncover specific error-sensitive value patterns and lead to additional error-reduction measures. As an initial phase of this investigation, the study narrows the scope of the problem by focusing on decision trees as a pilot model and develops a simple, effective tagging method that digitizes the individual nodes of a binary decision tree for node-level analysis. The method links and tracks classification statistics for each node in a transparent way, facilitates the identification and examination of the potentially "weakest" nodes and error-sensitive value patterns in the tree, and thereby supports cause analysis and the development of enhancements.

This digitization method is not an attempt to re-develop or transform the existing decision tree model. Rather, it is a pragmatic node-ID formulation that crafts numeric values to reflect the tree structure and decision-making paths, extending post-classification analysis to the level of individual nodes. Initial experiments have successfully located potentially high-risk attribute and value patterns, an encouraging sign that this line of study is worth further exploration.
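The abstract does not specify the exact ID formulation, so the sketch below assumes a conventional heap-style numbering (root 1, left child 2n, right child 2n + 1), in which the binary digits of an ID after the leading 1 spell out the left/right decisions from the root. Per-node correct/error counters then let the potentially "weakest" leaves be ranked by error rate; all names here are illustrative, not the paper's own.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    node_id: int                       # numeric tag encoding the path from the root
    feature: Optional[str] = None      # split attribute (None for leaves)
    threshold: Optional[float] = None  # split point: go left if value <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[int] = None        # predicted class at a leaf
    correct: int = 0                   # classification statistics tracked per node
    errors: int = 0

def tag(node: Node, node_id: int = 1) -> None:
    """Assign heap-style IDs: root 1, left child 2n, right child 2n + 1."""
    node.node_id = node_id
    if node.left:
        tag(node.left, 2 * node_id)
    if node.right:
        tag(node.right, 2 * node_id + 1)

def classify(node: Node, x: dict, y: int):
    """Route instance x to a leaf, updating that leaf's error statistics."""
    if node.label is not None:
        hit = (node.label == y)
        node.correct += hit
        node.errors += not hit
        return node.node_id, node.label
    child = node.left if x[node.feature] <= node.threshold else node.right
    return classify(child, x, y)

def weakest_nodes(root: Node, min_count: int = 1):
    """Rank leaves by error rate to surface error-sensitive value patterns."""
    ranked, stack = [], [root]
    while stack:
        n = stack.pop()
        total = n.correct + n.errors
        if n.label is not None and total >= min_count:
            ranked.append((n.errors / total, n.node_id))
        stack += [c for c in (n.left, n.right) if c]
    return sorted(ranked, reverse=True)
```

With a one-split tree (attribute "a", threshold 0.5), tagging yields leaf IDs 2 and 3, and after classifying a few labelled instances `weakest_nodes` surfaces the leaf with the highest error rate first, mirroring the node-level analysis the abstract describes.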



Author information

Correspondence to William Wu.


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Wu, W. (2015). Identify Error-Sensitive Patterns by Decision Tree. In: Perner, P. (ed.) Advances in Data Mining: Applications and Theoretical Aspects. ICDM 2015. Lecture Notes in Computer Science, vol. 9165. Springer, Cham. https://doi.org/10.1007/978-3-319-20910-4_7


  • DOI: https://doi.org/10.1007/978-3-319-20910-4_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-20909-8

  • Online ISBN: 978-3-319-20910-4
