A Novel Prototype Decision Tree Method Using Sampling Strategy

  • Bhanu Prakash Battula
  • Debnath Bhattacharyya
  • C. V. P. R. Prasad
  • Tai-hoon Kim
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9155)

Abstract

Data mining is a popular knowledge discovery technique. In data mining, decision trees are among the simplest and most powerful decision-making models. One limitation of decision trees lies in the data sources they handle: if the data given as input to a decision tree is imbalanced, the efficiency of the decision tree drops drastically. We propose a decision tree structure that mimics human learning by balancing the data source to some extent. In this paper, we propose a novel method based on a sampling strategy. Extensive experiments, using the C4.5 decision tree as the base classifier, show that the performance of our method is comparable to that of state-of-the-art methods.
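
The abstract does not say which sampling procedure the method uses. As a generic illustration of what balancing a data source before tree induction can look like, the sketch below applies plain random oversampling: minority-class rows are duplicated until every class matches the size of the largest one. The function name random_oversample, the seed parameter, and the toy data are assumptions made for this example; they are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class.

    Generic random oversampling, shown only as an illustration; it is
    not the sampling strategy proposed in the paper.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows that belong to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Draw (with replacement) the rows needed to reach the target size.
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: 6 majority ("neg") rows and 2 minority ("pos") rows.
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.9], [1.0]]
labels = ["neg"] * 6 + ["pos"] * 2

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

After the call, both classes contain six rows, and the balanced set could then be passed to any decision tree learner such as C4.5. Duplicating rows adds no new information, so it can encourage overfitting on the minority class; synthetic approaches trade that risk for others.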

Keywords

Knowledge discovery, Data mining, Classification, Decision trees, Sampling strategy

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Bhanu Prakash Battula (1)
  • Debnath Bhattacharyya (2)
  • C. V. P. R. Prasad (3)
  • Tai-hoon Kim (4, corresponding author)
  1. Department of CSE, Vignan College, Guntur, India
  2. Department of Computer Science and Engineering, Vignan’s Institute of Information Technology, Visakhapatnam, India
  3. Research Scholar, Acharya Nagarjuna University, Guntur, India
  4. Department of Convergence Security, Sungshin Women’s University, Seoul, Korea
