Contributions of Domain Knowledge and Stacked Generalization in AI-Based Classification Models

  • Weiping Wu
  • Vincent ChengSiong Lee
  • TingYean Tan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3339)

Abstract

We exploit the merits of the C4.5 decision tree classifier together with two stacking meta-learners: a back-propagation multilayer perceptron neural network and naive Bayes, respectively. The performance of these two hybrid classification schemes has been empirically tested and compared with that of the C4.5 decision tree, using two US data sets (a raw data set and a new data set incorporating domain knowledge) to predict US bank failure. Significant improvements in prediction accuracy and training efficiency have been achieved in the schemes based on the new data set. The empirical test results suggest that the proposed hybrid schemes perform marginally better in terms of the AUC criterion.
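As a rough illustration of the AUC criterion mentioned above, the sketch below computes the area under the ROC curve as the fraction of (positive, negative) instance pairs that a classifier ranks correctly, with ties counted as half. It is a minimal sketch, not the authors' evaluation code: the function name `auc` and the toy labels and scores are hypothetical and do not come from the paper.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking (Mann-Whitney form).

    labels: sequence of 0/1, where 1 marks a positive instance
            (assumed here to mean a failed bank)
    scores: classifier scores, higher meaning "more likely positive"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative instance")
    # Count correctly ordered positive/negative pairs; a tie counts as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only (not from the paper).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
print(auc(labels, scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of positive and negative instances, so a marginal difference between two schemes shows up as a small gap on this scale.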

Keywords

Domain Knowledge; True Positive Rate; Positive Instance; Negative Instance; Bank Failure
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Weiping Wu (1)
  • Vincent ChengSiong Lee (1)
  • TingYean Tan (2)
  1. School of Business Systems
  2. Department of Accounting and Finance, Monash University, Clayton, Australia
